From Catherine.M.Moroney at jpl.nasa.gov Tue Jul 1 19:46:32 2014 From: Catherine.M.Moroney at jpl.nasa.gov (Moroney, Catherine M (398D)) Date: Tue, 1 Jul 2014 23:46:32 +0000 Subject: [Numpy-discussion] numpy.histogram not giving expected results Message-ID: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> Hello, I'm trying to calculate a 1-d histogram of a distribution that contains mostly zeros, and I'm having problems with examples where the values to be histogrammed fall exactly on the bin boundaries: For example, this gives me the expected results (entering the exact bin values): >>> data array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.05, -0.05]) >>> bins_list = numpy.array([-0.1, -0.05, 0.0, 0.05, 0.1]) >>> (counts, edges) = numpy.histogram(data, bins=bins_list) >>> counts array([ 0, 1, 10, 1]) >>> edges array([-0.1 , -0.05, 0. , 0.05, 0.1 ]) but this does not (generating the bin values via bumpy.arange): >>> bins_arange = numpy.arange(-0.1, 0.101, 0.05) >>> data array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.05, -0.05]) >>> bins_arange array([-0.1 , -0.05, 0. , 0.05, 0.1 ]) >>> (counts, edges) = numpy.histogram(data, bins=bins_arange) >>> counts array([ 0, 1, 11, 0]) I'm assuming this is due to slight rounding in the calculation of bins_arange, as compared to the manually entered values in bins_list. What is the recommended way of getting the first set of results, without having to manually enter all the values in the "bins" argument? The following also gives me unexpected results: >>> data array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.05, -0.05]) counts, edges) = numpy.histogram(data, range=(-0.1, 0.1), bins=4) >>> counts array([ 0, 1, 11, 0]) Thank you for any advice, Catherine From chris.barker at noaa.gov Tue Jul 1 20:05:50 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 1 Jul 2014 17:05:50 -0700 Subject: [Numpy-discussion] numpy.histogram not giving expected results In-Reply-To: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> References: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> Message-ID: A few thoughts: 1) don't use arange() for flaoting point numbers, use linspace(). 2) histogram1d is a floating point function, and you shouldn't expect exact results for floating point -- in particular, values exactly at the bin boundaries are likely to be "uncertain" -- not quite the right word, but you get the idea. 3) if you expect have a lot of certain specific values, say, integers, or zeros -- then you don't want your bin boundaries to be exactly at the value -- they should be between the expected values. 4) remember that histogramming is inherently sensitive to bin position anyway -- if these small bin-boundary differences matter, than you may not be using teh best approach. -HTH, -Chris > >>> data > array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , > 0. , 0.05, -0.05]) > >>> bins_list = numpy.array([-0.1, -0.05, 0.0, 0.05, 0.1]) > >>> (counts, edges) = numpy.histogram(data, bins=bins_list) > >>> counts > array([ 0, 1, 10, 1]) > >>> edges > array([-0.1 , -0.05, 0. , 0.05, 0.1 ]) > > > > but this does not (generating the bin values via bumpy.arange): > > >>> bins_arange = numpy.arange(-0.1, 0.101, 0.05) > >>> data > array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , > 0. , 0.05, -0.05]) > >>> bins_arange > array([-0.1 , -0.05, 0. 
, 0.05, 0.1 ]) > >>> (counts, edges) = numpy.histogram(data, bins=bins_arange) > >>> counts > array([ 0, 1, 11, 0]) > > I'm assuming this is due to slight rounding in the calculation of > bins_arange, > as compared to the manually entered values in bins_list. > > What is the recommended way of getting the first set of results, without > having to manually enter all the values in the "bins" argument? > > The following also gives me unexpected results: > > >>> data > array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , > 0. , 0.05, -0.05]) > counts, edges) = numpy.histogram(data, range=(-0.1, 0.1), bins=4) > >>> counts > array([ 0, 1, 11, 0]) > > > > Thank you for any advice, > > Catherine > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Wed Jul 2 03:24:44 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 2 Jul 2014 09:24:44 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi Matthew and Ralf, Has anyone managed to build working whl packages for numpy and scipy on win32 using the static mingw-w64 toolchain? -- Olivier From mszepien at gmail.com Wed Jul 2 04:07:04 2014 From: mszepien at gmail.com (Mark Szepieniec) Date: Wed, 2 Jul 2014 10:07:04 +0200 Subject: [Numpy-discussion] numpy.histogram not giving expected results In-Reply-To: References: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> Message-ID: Hi Catherine, I can't reproduce your issue with bins_list vs. bins_arange, but passing both range and number of bins to np.histogram does give the same strange behavior for me: In [16]: data = np.array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.05, -0.05]) In [17]: bins_list = np.array([-0.1, -0.05, 0.0, 0.05, 0.1]) In [18]: np.histogram(data, bins=bins_list) Out[18]: (array([ 0, 1, 10, 1]), array([-0.1 , -0.05, 0. , 0.05, 0.1 ])) In [19]: bins_arange = np.arange(-0.1, 0.101, 0.05) In [20]: np.histogram(data, bins=bins_arange) Out[20]: (array([ 0, 1, 10, 1]), array([-0.1 , -0.05, 0. , 0.05, 0.1 ])) In [21]: np.histogram(data, range=(-0.1, 0.1), bins=4) Out[21]: (array([ 0, 1, 11, 0]), array([-0.1 , -0.05, 0. , 0.05, 0.1 ])) In [22]: np.version.version Out[22]: '1.8.1' Looks like the 0.05 value of data is being binned differently in the last case, but I'm not sure why either... Mark On Wed, Jul 2, 2014 at 2:05 AM, Chris Barker wrote: > A few thoughts: > > 1) don't use arange() for flaoting point numbers, use linspace(). > > 2) histogram1d is a floating point function, and you shouldn't expect > exact results for floating point -- in particular, values exactly at the > bin boundaries are likely to be "uncertain" -- not quite the right word, > but you get the idea. > > 3) if you expect have a lot of certain specific values, say, integers, or > zeros -- then you don't want your bin boundaries to be exactly at the value > -- they should be between the expected values. 
> > 4) remember that histogramming is inherently sensitive to bin position > anyway -- if these small bin-boundary differences matter, than you may not > be using teh best approach. > > -HTH, > -Chris > > > > > > >> >>> data >> array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , >> 0. , 0.05, -0.05]) >> >>> bins_list = numpy.array([-0.1, -0.05, 0.0, 0.05, 0.1]) >> >>> (counts, edges) = numpy.histogram(data, bins=bins_list) >> >>> counts >> array([ 0, 1, 10, 1]) >> >>> edges >> array([-0.1 , -0.05, 0. , 0.05, 0.1 ]) >> >> >> >> but this does not (generating the bin values via bumpy.arange): >> >> >>> bins_arange = numpy.arange(-0.1, 0.101, 0.05) >> >>> data >> array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , >> 0. , 0.05, -0.05]) >> >>> bins_arange >> array([-0.1 , -0.05, 0. , 0.05, 0.1 ]) >> >>> (counts, edges) = numpy.histogram(data, bins=bins_arange) >> >>> counts >> array([ 0, 1, 11, 0]) >> >> I'm assuming this is due to slight rounding in the calculation of >> bins_arange, >> as compared to the manually entered values in bins_list. >> >> What is the recommended way of getting the first set of results, without >> having to manually enter all the values in the "bins" argument? >> >> The following also gives me unexpected results: >> >> >>> data >> array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , >> 0. , 0.05, -0.05]) >> counts, edges) = numpy.histogram(data, range=(-0.1, 0.1), bins=4) >> >>> counts >> array([ 0, 1, 11, 0]) >> >> >> >> Thank you for any advice, >> >> Catherine >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jul 2 04:49:07 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 2 Jul 2014 09:49:07 +0100 Subject: [Numpy-discussion] Fwd: [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments In-Reply-To: <53B33800.1030300@ferrara.linux.it> References: <53B33800.1030300@ferrara.linux.it> Message-ID: There's some discussion on python-ideas about making it possible for python indexing to accept kwargs, eg arr[1:2, foo=bar] Since numpy is a very heavy user of indexing which might benefit from this, I thought I should forward it here. If we have clear use cases for such a feature then that may strongly affect the discussion. I admit I can't actually think of any features this would enable for us though... -n ---------- Forwarded message ---------- From: "Stefano Borini" Date: 2 Jul 2014 00:17 Subject: [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments To: "python-ideas at python.org" , "Joseph Martinot-Lagarde" Cc: Dear all, after the first mailing list feedback, and further private discussion with Joseph Martinot-Lagarde, I drafted a first iteration of a PEP for keyword arguments in indexing. The document is available here. https://github.com/stefanoborini/pep-keyword/blob/master/PEP-XXX.txt The document is not in final form when it comes to specifications. 
In fact, it requires additional discussion about the best strategy to achieve the desired result. Particular attention has been devoted to present alternative implementation strategies, their pros and cons. I will examine all feedback tomorrow morning European time (in approx 10 hrs), and apply any pull requests or comments you may have. When the specification is finalized, or this community suggests that the PEP is in a form suitable for official submission despite potential open issues, I will submit it to the editor panel for further discussion, and deploy an actual implementation according to the agreed specification for a working test run. I apologize for potential mistakes in the PEP drafting and submission process, as this is my first PEP. Kind Regards, Stefano Borini _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmkleffner at gmail.com Wed Jul 2 05:36:44 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Wed, 2 Jul 2014 11:36:44 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi all, I do regulary builds for python-2.7. Due to my limited resources I didn't build for 3.3 or 3.4 right now. I didn't updated my toolchhain from february, but I do regulary builds of OpenBLAS. OpenBLAS is under heavy development right now, thanks to Werner Saar, see: https://github.com/wernsaar/OpenBLAS . A lot of bugs have been canceled out at the cost of performance, see the kernel TODO list: https://github.com/xianyi/OpenBLAS/wiki/Fixed-optimized-kernels-To-do-List . Many bugs related to Windows have been corrected. A very weird bug i.e.: https://github.com/xianyi/OpenBLAS/issues/394 and https://github.com/JuliaLang/julia/issues/5574 . I got the impression, that the Julia community (and maybe the R and octave community) is very interested getting towards a stable Windows OpenBLAS. OpenBLAS is the only free OSS optimized BLAS/Lapack solution maintained for Windows today. Atlas seems not to be maintained for Windows anymore (is this true Matthew?) somewhat older test wheels for python-2.7 can be downloaded here: see: http://figshare.com/articles/search?q=numpy&quick=1&x=0&y=0 (2014-06-10) numpy and scipy wheels for py-2.7 The scipy test suite (amd64) emits segfaults with multithreaded OpenBLAS, but is stable with single thread (see the log files). I didn't dig into this further. Win32 works with MT OpenBLAS, but has some test failures with atan2 and hypot. The is more or less the status today. I can upload new wheels linked against a recent OpenBLAS, maybe tomorrow on Binstar. Regards, Carl 2014-07-02 9:24 GMT+02:00 Olivier Grisel : > Hi Matthew and Ralf, > > Has anyone managed to build working whl packages for numpy and scipy > on win32 using the static mingw-w64 toolchain? > > -- > Olivier > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mads.ipsen at gmail.com Wed Jul 2 06:15:25 2014 From: mads.ipsen at gmail.com (Mads Ipsen) Date: Wed, 02 Jul 2014 12:15:25 +0200 Subject: [Numpy-discussion] Accessing irregular sized array data from C Message-ID: <53B3DBBD.8030202@gmail.com> Hi, If you setup an M x N array like this a = 1.0*numpy.arange(24).reshape(8,3) you can access the data from a C function like this void foo(PyObject * numpy_data) { // Get dimension and data pointer int const m = static_cast(PyArray_DIMS(numpy_data)[0]); int const n = static_cast(PyArray_DIMS(numpy_data)[1]); double * const data = (double *) PyArray_DATA(numpy_data); // Access data ... } Now, suppose I have an irregular shaped numpy array like this a1 = numpy.array([ 1.0, 2.0, 3.0]) a2 = numpy.array([-2.0, 4.0]) a3 = numpy.array([5.0]) b = numpy.array([a1,a2,a3]) How can open up the doors to the array data of b on the C-side? Best regards, Mads -- +---------------------------------------------------------+ | Mads Ipsen | +----------------------+----------------------------------+ | G?seb?ksvej 7, 4. tv | phone: +45-29716388 | | DK-2500 Valby | email: mads.ipsen at gmail.com | | Denmark | map : www.tinyurl.com/ns52fpa | +----------------------+----------------------------------+ From matthew.brett at gmail.com Wed Jul 2 06:29:07 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 2 Jul 2014 11:29:07 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, On Wed, Jul 2, 2014 at 10:36 AM, Carl Kleffner wrote: > Hi all, > > I do regulary builds for python-2.7. Due to my limited resources I didn't > build for 3.3 or 3.4 right now. I didn't updated my toolchhain from > february, but I do regulary builds of OpenBLAS. OpenBLAS is under heavy > development right now, thanks to Werner Saar, see: > https://github.com/wernsaar/OpenBLAS . > A lot of bugs have been canceled out at the cost of performance, see the > kernel TODO list: > https://github.com/xianyi/OpenBLAS/wiki/Fixed-optimized-kernels-To-do-List . > Many bugs related to Windows have been corrected. A very weird bug i.e.: > https://github.com/xianyi/OpenBLAS/issues/394 and > https://github.com/JuliaLang/julia/issues/5574 . > I got the impression, that the Julia community (and maybe the R and octave > community) is very interested getting towards a stable Windows OpenBLAS. > OpenBLAS is the only free OSS optimized BLAS/Lapack solution maintained for > Windows today. Atlas seems not to be maintained for Windows anymore (is this > true Matthew?) No, it's not true, but it's not really false either. Clint Whaley is the ATLAS maintainer and his interests are firmly in high-performance-computing so he is much more interested in exotic new chips than in Windows. But, he does aim to make the latest stable release buildable on Windows, and he's helped me do that for the latest stable, with some hope he'll continue to work on the 64-bit Windows kernels which are hobbled at the moment because of differences in the Windows / other OS 64-bit ABI. Builds here: https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds/ > somewhat older test wheels for python-2.7 can be downloaded here: > see: http://figshare.com/articles/search?q=numpy&quick=1&x=0&y=0 > (2014-06-10) numpy and scipy wheels for py-2.7 > The scipy test suite (amd64) emits segfaults with multithreaded OpenBLAS, > but is stable with single thread (see the log files). I didn't dig into this > further. 
Win32 works with MT OpenBLAS, but has some test failures with atan2 > and hypot. The is more or less the status today. I can upload new wheels > linked against a recent OpenBLAS, maybe tomorrow on Binstar. I built some 64-bit wheels against Carl's toolchain and the ATLAS above, I think they don't have any threading issues, but the scipy wheel fails one scipy test due to some very small precision differences in the mingw runtime. I think we agreed this failure wasn't important. https://nipy.bic.berkeley.edu/scipy_installers/numpy-1.8.1-cp27-none-win_amd64.whl https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.13.3-cp27-none-win_amd64.whl Cheers, Matthew From matthew.brett at gmail.com Wed Jul 2 06:37:16 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 2 Jul 2014 11:37:16 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, On Wed, Jul 2, 2014 at 11:29 AM, Matthew Brett wrote: > Hi, > > On Wed, Jul 2, 2014 at 10:36 AM, Carl Kleffner wrote: >> Hi all, >> >> I do regulary builds for python-2.7. Due to my limited resources I didn't >> build for 3.3 or 3.4 right now. I didn't updated my toolchhain from >> february, but I do regulary builds of OpenBLAS. OpenBLAS is under heavy >> development right now, thanks to Werner Saar, see: >> https://github.com/wernsaar/OpenBLAS . >> A lot of bugs have been canceled out at the cost of performance, see the >> kernel TODO list: >> https://github.com/xianyi/OpenBLAS/wiki/Fixed-optimized-kernels-To-do-List . >> Many bugs related to Windows have been corrected. A very weird bug i.e.: >> https://github.com/xianyi/OpenBLAS/issues/394 and >> https://github.com/JuliaLang/julia/issues/5574 . >> I got the impression, that the Julia community (and maybe the R and octave >> community) is very interested getting towards a stable Windows OpenBLAS. >> OpenBLAS is the only free OSS optimized BLAS/Lapack solution maintained for >> Windows today. Atlas seems not to be maintained for Windows anymore (is this >> true Matthew?) > > No, it's not true, but it's not really false either. Clint Whaley is > the ATLAS maintainer and his interests are firmly in > high-performance-computing so he is much more interested in exotic new > chips than in Windows. But, he does aim to make the latest stable > release buildable on Windows, and he's helped me do that for the > latest stable, with some hope he'll continue to work on the 64-bit > Windows kernels which are hobbled at the moment because of differences > in the Windows / other OS 64-bit ABI. Builds here: > > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds/ > >> somewhat older test wheels for python-2.7 can be downloaded here: >> see: http://figshare.com/articles/search?q=numpy&quick=1&x=0&y=0 >> (2014-06-10) numpy and scipy wheels for py-2.7 >> The scipy test suite (amd64) emits segfaults with multithreaded OpenBLAS, >> but is stable with single thread (see the log files). I didn't dig into this >> further. Win32 works with MT OpenBLAS, but has some test failures with atan2 >> and hypot. The is more or less the status today. I can upload new wheels >> linked against a recent OpenBLAS, maybe tomorrow on Binstar. > > I built some 64-bit wheels against Carl's toolchain and the ATLAS > above, I think they don't have any threading issues, but the scipy > wheel fails one scipy test due to some very small precision > differences in the mingw runtime. I think we agreed this failure > wasn't important. 
> > https://nipy.bic.berkeley.edu/scipy_installers/numpy-1.8.1-cp27-none-win_amd64.whl > https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.13.3-cp27-none-win_amd64.whl Sorry - I wasn't paying attention - you asked about 32-bit wheels. Honestly, using the same toolchain, they wouldn't be at all hard to build. One issue is that the ATLAS builds depend on SSE2. That isn't an issue for 64 bit builds because the 64-bit ABI requires SSE2, but it is an issue for 32-bit where we have no such guarantee. It looks like 99% of Windows users do have SSE2 though [1]. So I think what is required is * Build the wheels for 32-bit (easy) * Patch the wheels to check and give helpful error in absence of SSE2 (fairly easy) * Get agreement these should go up on pypi and be maintained (feedback anyone?) Cheers, Matthew [1] https://github.com/numpy/numpy/wiki/Windows-versions#sse--sse2 From jtaylor.debian at googlemail.com Wed Jul 2 06:46:36 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 2 Jul 2014 12:46:36 +0200 Subject: [Numpy-discussion] Accessing irregular sized array data from C In-Reply-To: <53B3DBBD.8030202@gmail.com> References: <53B3DBBD.8030202@gmail.com> Message-ID: On Wed, Jul 2, 2014 at 12:15 PM, Mads Ipsen wrote: > Hi, > > If you setup an M x N array like this > > a = 1.0*numpy.arange(24).reshape(8,3) > > you can access the data from a C function like this > > void foo(PyObject * numpy_data) > { > // Get dimension and data pointer > int const m = static_cast(PyArray_DIMS(numpy_data)[0]); > int const n = static_cast(PyArray_DIMS(numpy_data)[1]); > double * const data = (double *) PyArray_DATA(numpy_data); > > // Access data > ... > } > > Now, suppose I have an irregular shaped numpy array like this > > a1 = numpy.array([ 1.0, 2.0, 3.0]) > a2 = numpy.array([-2.0, 4.0]) > a3 = numpy.array([5.0]) > b = numpy.array([a1,a2,a3]) > > How can open up the doors to the array data of b on the C-side? > numpy does not directly support irregular shaped arrays (or ragged arrays). If you look at the result of your example you will see this: In [5]: b Out[5]: array([array([ 1., 2., 3.]), array([-2., 4.]), array([ 5.])], dtype=object) b has datatype object, this means it is a 1d array containing more array objects. Numpy does not directly know about the shapes or types the sub arrays. It is not necessarily homogeneous anymore, but compared to a regular python list you still have elementwise operations (if the contained python objects support them) and it can have multiple dimensions. In C you would access such an array it like this: PyArrayObject * const data = (PyArrayObject *) PyArray_DATA(numpy_data); for (i=0; i < PyArray_DIMS(numpy_data)[0]; i++) { assert(PyArray_Check(data[i])); double * const sub_data = (double *) PyArray_DATA(data[i]); } From cmkleffner at gmail.com Wed Jul 2 07:18:07 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Wed, 2 Jul 2014 13:18:07 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, The mingw-w64 based wheels (Atlas and openBLAS) are based on a patched numpy version, that hasn't been published as numpy pull for revision until now (my failure). I could try to do this tomorrow in the evening. Another important point is, that the toolchain, that is capable to compile numpy/scipy was adapted to allow for MSVC / mingw runtime compatibility and does not create any gcc/mingw runtime dependency anymore. 
OpenBLAS has one advantage over Atlas: numpy/scipy are linked dynamically against OpenBLAS. Statically linked BLAS like MKL or ATLAS creates huge python extensions and have considerable higher memory consumption compared to dynamically linkage. On the other hand correctness is more important, so ATLAS has to be preferred now. Users with non SEE processors could be provided with wheels distributed on binstar. Regards Carl 2014-07-02 12:37 GMT+02:00 Matthew Brett : > Hi, > > On Wed, Jul 2, 2014 at 11:29 AM, Matthew Brett > wrote: > > Hi, > > > > On Wed, Jul 2, 2014 at 10:36 AM, Carl Kleffner > wrote: > >> Hi all, > >> > >> I do regulary builds for python-2.7. Due to my limited resources I > didn't > >> build for 3.3 or 3.4 right now. I didn't updated my toolchhain from > >> february, but I do regulary builds of OpenBLAS. OpenBLAS is under heavy > >> development right now, thanks to Werner Saar, see: > >> https://github.com/wernsaar/OpenBLAS . > >> A lot of bugs have been canceled out at the cost of performance, see the > >> kernel TODO list: > >> > https://github.com/xianyi/OpenBLAS/wiki/Fixed-optimized-kernels-To-do-List > . > >> Many bugs related to Windows have been corrected. A very weird bug i.e.: > >> https://github.com/xianyi/OpenBLAS/issues/394 and > >> https://github.com/JuliaLang/julia/issues/5574 . > >> I got the impression, that the Julia community (and maybe the R and > octave > >> community) is very interested getting towards a stable Windows OpenBLAS. > >> OpenBLAS is the only free OSS optimized BLAS/Lapack solution maintained > for > >> Windows today. Atlas seems not to be maintained for Windows anymore (is > this > >> true Matthew?) > > > > No, it's not true, but it's not really false either. Clint Whaley is > > the ATLAS maintainer and his interests are firmly in > > high-performance-computing so he is much more interested in exotic new > > chips than in Windows. But, he does aim to make the latest stable > > release buildable on Windows, and he's helped me do that for the > > latest stable, with some hope he'll continue to work on the 64-bit > > Windows kernels which are hobbled at the moment because of differences > > in the Windows / other OS 64-bit ABI. Builds here: > > > > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds/ > > > >> somewhat older test wheels for python-2.7 can be downloaded here: > >> see: http://figshare.com/articles/search?q=numpy&quick=1&x=0&y=0 > >> (2014-06-10) numpy and scipy wheels for py-2.7 > >> The scipy test suite (amd64) emits segfaults with multithreaded > OpenBLAS, > >> but is stable with single thread (see the log files). I didn't dig into > this > >> further. Win32 works with MT OpenBLAS, but has some test failures with > atan2 > >> and hypot. The is more or less the status today. I can upload new wheels > >> linked against a recent OpenBLAS, maybe tomorrow on Binstar. > > > > I built some 64-bit wheels against Carl's toolchain and the ATLAS > > above, I think they don't have any threading issues, but the scipy > > wheel fails one scipy test due to some very small precision > > differences in the mingw runtime. I think we agreed this failure > > wasn't important. > > > > > https://nipy.bic.berkeley.edu/scipy_installers/numpy-1.8.1-cp27-none-win_amd64.whl > > > https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.13.3-cp27-none-win_amd64.whl > > Sorry - I wasn't paying attention - you asked about 32-bit wheels. > Honestly, using the same toolchain, they wouldn't be at all hard to > build. 
> > One issue is that the ATLAS builds depend on SSE2. That isn't an > issue for 64 bit builds because the 64-bit ABI requires SSE2, but it > is an issue for 32-bit where we have no such guarantee. It looks like > 99% of Windows users do have SSE2 though [1]. So I think what is > required is > > * Build the wheels for 32-bit (easy) > * Patch the wheels to check and give helpful error in absence of SSE2 > (fairly easy) > * Get agreement these should go up on pypi and be maintained (feedback > anyone?) > > Cheers, > > Matthew > > [1] https://github.com/numpy/numpy/wiki/Windows-versions#sse--sse2 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Jul 2 07:35:07 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 2 Jul 2014 12:35:07 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, On Wed, Jul 2, 2014 at 12:18 PM, Carl Kleffner wrote: > Hi, > > The mingw-w64 based wheels (Atlas and openBLAS) are based on a patched numpy > version, that hasn't been published as numpy pull for revision until now (my > failure). I could try to do this tomorrow in the evening. That would be really good. I'll try and help with review if I can. > Another important > point is, that the toolchain, that is capable to compile numpy/scipy was > adapted to allow for MSVC / mingw runtime compatibility and does not create > any gcc/mingw runtime dependency anymore. > > OpenBLAS has one advantage over Atlas: numpy/scipy are linked dynamically > against OpenBLAS. Statically linked BLAS like MKL or ATLAS creates huge > python extensions and have considerable higher memory consumption compared > to dynamically linkage. On the other hand correctness is more important, so > ATLAS has to be preferred now. Do you have any index of what the memory cost is? If it's in the order of 20M presumably that won't have much practical impact? > Users with non SEE processors could be provided with wheels distributed on > binstar. The last plan we seemed to have was to continue making the 'superpack' exe installers which contain no-SSE, SSE2 and SSE3 builds where the installer selects which one to install at runtime. The warning from the wheel would point to these installers as the backup option. If we did want to produce alternative wheels, I guess a specific static https directory would be easiest; otherwise the user would get the odd effect that they'd get a hobbled wheel by default when installing from binstar (assuming they did in fact have SSE2). I mean, this pip install -f https://somewhere.org/no_sse_wheels --no-index numpy seems to make more sense as an alternative install command for non-SSE, than this: pip install -i http://binstar.org numpy because in the former case, you can see what is special about the command. 
Cheers, Matthew From mads.ipsen at gmail.com Wed Jul 2 07:44:45 2014 From: mads.ipsen at gmail.com (Mads Ipsen) Date: Wed, 02 Jul 2014 13:44:45 +0200 Subject: [Numpy-discussion] Accessing irregular sized array data from C In-Reply-To: References: <53B3DBBD.8030202@gmail.com> Message-ID: <53B3F0AD.70700@gmail.com> On 02/07/14 12:46, Julian Taylor wrote: > On Wed, Jul 2, 2014 at 12:15 PM, Mads Ipsen wrote: >> Hi, >> >> If you setup an M x N array like this >> >> a = 1.0*numpy.arange(24).reshape(8,3) >> >> you can access the data from a C function like this >> >> void foo(PyObject * numpy_data) >> { >> // Get dimension and data pointer >> int const m = static_cast(PyArray_DIMS(numpy_data)[0]); >> int const n = static_cast(PyArray_DIMS(numpy_data)[1]); >> double * const data = (double *) PyArray_DATA(numpy_data); >> >> // Access data >> ... >> } >> >> Now, suppose I have an irregular shaped numpy array like this >> >> a1 = numpy.array([ 1.0, 2.0, 3.0]) >> a2 = numpy.array([-2.0, 4.0]) >> a3 = numpy.array([5.0]) >> b = numpy.array([a1,a2,a3]) >> >> How can open up the doors to the array data of b on the C-side? >> > > numpy does not directly support irregular shaped arrays (or ragged arrays). > If you look at the result of your example you will see this: > In [5]: b > Out[5]: array([array([ 1., 2., 3.]), array([-2., 4.]), array([ > 5.])], dtype=object) > > b has datatype object, this means it is a 1d array containing more > array objects. Numpy does not directly know about the shapes or types > the sub arrays. It is not necessarily homogeneous anymore, but > compared to a regular python list you still have elementwise > operations (if the contained python objects support them) and it can > have multiple dimensions. > > In C you would access such an array it like this: > > PyArrayObject * const data = (PyArrayObject *) PyArray_DATA(numpy_data); > for (i=0; i < PyArray_DIMS(numpy_data)[0]; i++) { > assert(PyArray_Check(data[i])); > double * const sub_data = (double *) PyArray_DATA(data[i]); > } > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Thanks - that'll get me going! Best, Mads -- +---------------------------------------------------------+ | Mads Ipsen | +----------------------+----------------------------------+ | G?seb?ksvej 7, 4. tv | phone: +45-29716388 | | DK-2500 Valby | email: mads.ipsen at gmail.com | | Denmark | map : www.tinyurl.com/ns52fpa | +----------------------+----------------------------------+ From olivier.grisel at ensta.org Wed Jul 2 07:47:27 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 2 Jul 2014 13:47:27 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi Carl, All the items you suggest would be very appreciated. Don't hesitate to ping me if you need me to test new packages. Also the sklearn project has a free Rackspace Cloud account that Matthew is already using to make travis upload OSX wheels for the master branch of various scipy stack projects. Rackspace cloud can also be used to start windows VMs if needed. Please tell me if you want a some user credentials and API key. 
Myself I use the Rackspace Cloud account to build sklearn wheels following those instructions: https://github.com/scikit-learn/scikit-learn/wiki/How-to-make-a-release#building-windows-binary-packages We are using msvc express (but only for 32bit Python) right now. I have yet to try to build sklearn with your mingw-w64 static toolchain. Rackspace granted us $2000 worth of cloud resource per month (e.g. bandwith and VM time) so there is plenty of resource left to help with upstream projects such as numpy and scipy. Best, -- Olivier From cmkleffner at gmail.com Wed Jul 2 09:24:13 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Wed, 2 Jul 2014 15:24:13 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, personally I don't have a preference of Binstar over somewhere.org. More important is that one has to agree where to find the binaries. Binstar has the concept of channels and allow wheels. So one could provide a channel for NOSSE and more channels for other specialized builds: ATLAS/OpenBLAS/RefBLAS, SSE4/AVX and so on. A generic binary should be build with generic optimizing GCC switches and SSE2 per default. I propose to provide generic binaries for PYPI instead of superbinaries. and specialized binaries on Binstar or somewhere else. Just thinking two or three steps ahead. Regards Carl 2014-07-02 13:35 GMT+02:00 Matthew Brett : > Hi, > > On Wed, Jul 2, 2014 at 12:18 PM, Carl Kleffner > wrote: > > Hi, > > > > The mingw-w64 based wheels (Atlas and openBLAS) are based on a patched > numpy > > version, that hasn't been published as numpy pull for revision until now > (my > > failure). I could try to do this tomorrow in the evening. > > That would be really good. I'll try and help with review if I can. > > > Another important > > point is, that the toolchain, that is capable to compile numpy/scipy was > > adapted to allow for MSVC / mingw runtime compatibility and does not > create > > any gcc/mingw runtime dependency anymore. > > > > OpenBLAS has one advantage over Atlas: numpy/scipy are linked dynamically > > against OpenBLAS. Statically linked BLAS like MKL or ATLAS creates huge > > python extensions and have considerable higher memory consumption > compared > > to dynamically linkage. On the other hand correctness is more important, > so > > ATLAS has to be preferred now. > > Do you have any index of what the memory cost is? If it's in the > order of 20M presumably that won't have much practical impact? > > > Users with non SEE processors could be provided with wheels distributed > on > > binstar. > > The last plan we seemed to have was to continue making the 'superpack' > exe installers which contain no-SSE, SSE2 and SSE3 builds where the > installer selects which one to install at runtime. The warning from > the wheel would point to these installers as the backup option. > > If we did want to produce alternative wheels, I guess a specific > static https directory would be easiest; otherwise the user would get > the odd effect that they'd get a hobbled wheel by default when > installing from binstar (assuming they did in fact have SSE2). I > mean, this > > pip install -f https://somewhere.org/no_sse_wheels --no-index numpy > > seems to make more sense as an alternative install command for > non-SSE, than this: > > pip install -i http://binstar.org numpy > > because in the former case, you can see what is special about the command. 
> > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Jul 2 09:36:57 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 2 Jul 2014 14:36:57 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, On Wed, Jul 2, 2014 at 2:24 PM, Carl Kleffner wrote: > Hi, > > personally I don't have a preference of Binstar over somewhere.org. More > important is that one has to agree where to find the binaries. Binstar has > the concept of channels and allow wheels. So one could provide a channel for > NOSSE and more channels for other specialized builds: > ATLAS/OpenBLAS/RefBLAS, SSE4/AVX and so on. Having a noSSE channel would make sense. > A generic binary should be build with generic optimizing GCC switches and > SSE2 per default. I propose to provide generic binaries for PYPI instead of > superbinaries. and specialized binaries on Binstar or somewhere else. The exe superbinary installers can also go on pypi without causing confusion to pip at least, but it would be good to have wheels as well. > Just thinking two or three steps ahead. It's good to have a plan :) Cheers, Matthew From mszepien at gmail.com Wed Jul 2 10:57:29 2014 From: mszepien at gmail.com (Mark Szepieniec) Date: Wed, 2 Jul 2014 16:57:29 +0200 Subject: [Numpy-discussion] numpy.histogram not giving expected results In-Reply-To: References: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> Message-ID: Looks this could be a float32 vs float64 problem: In [19]: data32 = np.array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.05, -0.05], dtype=np.float32) In [20]: data64 = np.array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.05, -0.05], dtype=np.float64) In [21]: bins32 = np.arange(-0.1, 0.101, 0.05, dtype=np.float32) In [22]: bins64 = np.arange(-0.1, 0.101, 0.05, dtype=np.float64) In [23]: np.histogram(data32, bins32) Out[23]: (array([ 0, 1, 10, 1]), array([-0.1 , -0.05, 0. , 0.05, 0.1 ], dtype=float32)) In [24]: np.histogram(data32, bins64) Out[24]: (array([ 1, 0, 10, 1]), array([-0.1 , -0.05, 0. , 0.05, 0.1 ])) In [25]: np.histogram(data64, bins32) Out[25]: (array([ 0, 1, 11, 0]), array([-0.1 , -0.05, 0. , 0.05, 0.1 ], dtype=float32)) In [26]: np.histogram(data64, bins64) Out[26]: (array([ 0, 1, 10, 1]), array([-0.1 , -0.05, 0. , 0.05, 0.1 ])) I guess users always be very careful when mixing floating point types, but should numpy prevent (or warn) the user from doing so in this case? On Wed, Jul 2, 2014 at 10:07 AM, Mark Szepieniec wrote: > Hi Catherine, > > I can't reproduce your issue with bins_list vs. bins_arange, but passing > both range and number of bins to np.histogram does give the same strange > behavior for me: > > In [16]: data = np.array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , > 0. , 0. , > 0. , 0.05, -0.05]) > > In [17]: bins_list = np.array([-0.1, -0.05, 0.0, 0.05, 0.1]) > > In [18]: np.histogram(data, bins=bins_list) > Out[18]: (array([ 0, 1, 10, 1]), array([-0.1 , -0.05, 0. , 0.05, 0.1 > ])) > > In [19]: bins_arange = np.arange(-0.1, 0.101, 0.05) > > In [20]: np.histogram(data, bins=bins_arange) > Out[20]: (array([ 0, 1, 10, 1]), array([-0.1 , -0.05, 0. 
, 0.05, 0.1 > ])) > > In [21]: np.histogram(data, range=(-0.1, 0.1), bins=4) > Out[21]: (array([ 0, 1, 11, 0]), array([-0.1 , -0.05, 0. , 0.05, 0.1 > ])) > > In [22]: np.version.version > Out[22]: '1.8.1' > > Looks like the 0.05 value of data is being binned differently in the last > case, but I'm not sure why either... > > Mark > > > On Wed, Jul 2, 2014 at 2:05 AM, Chris Barker > wrote: > >> A few thoughts: >> >> 1) don't use arange() for flaoting point numbers, use linspace(). >> >> 2) histogram1d is a floating point function, and you shouldn't expect >> exact results for floating point -- in particular, values exactly at the >> bin boundaries are likely to be "uncertain" -- not quite the right word, >> but you get the idea. >> >> 3) if you expect have a lot of certain specific values, say, integers, or >> zeros -- then you don't want your bin boundaries to be exactly at the value >> -- they should be between the expected values. >> >> 4) remember that histogramming is inherently sensitive to bin position >> anyway -- if these small bin-boundary differences matter, than you may not >> be using teh best approach. >> >> -HTH, >> -Chris >> >> >> >> >> >> >>> >>> data >>> array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , >>> 0. , 0.05, -0.05]) >>> >>> bins_list = numpy.array([-0.1, -0.05, 0.0, 0.05, 0.1]) >>> >>> (counts, edges) = numpy.histogram(data, bins=bins_list) >>> >>> counts >>> array([ 0, 1, 10, 1]) >>> >>> edges >>> array([-0.1 , -0.05, 0. , 0.05, 0.1 ]) >>> >>> >>> >>> but this does not (generating the bin values via bumpy.arange): >>> >>> >>> bins_arange = numpy.arange(-0.1, 0.101, 0.05) >>> >>> data >>> array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , >>> 0. , 0.05, -0.05]) >>> >>> bins_arange >>> array([-0.1 , -0.05, 0. , 0.05, 0.1 ]) >>> >>> (counts, edges) = numpy.histogram(data, bins=bins_arange) >>> >>> counts >>> array([ 0, 1, 11, 0]) >>> >>> I'm assuming this is due to slight rounding in the calculation of >>> bins_arange, >>> as compared to the manually entered values in bins_list. >>> >>> What is the recommended way of getting the first set of results, without >>> having to manually enter all the values in the "bins" argument? >>> >>> The following also gives me unexpected results: >>> >>> >>> data >>> array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , >>> 0. , 0.05, -0.05]) >>> counts, edges) = numpy.histogram(data, range=(-0.1, 0.1), bins=4) >>> >>> counts >>> array([ 0, 1, 11, 0]) >>> >>> >>> >>> Thank you for any advice, >>> >>> Catherine >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Wed Jul 2 13:24:53 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 2 Jul 2014 10:24:53 -0700 Subject: [Numpy-discussion] Accessing irregular sized array data from C In-Reply-To: References: <53B3DBBD.8030202@gmail.com> Message-ID: On Wed, Jul 2, 2014 at 3:46 AM, Julian Taylor wrote: > numpy does not directly support irregular shaped arrays (or ragged arrays). > If you look at the result of your example you will see this: > In [5]: b > Out[5]: array([array([ 1., 2., 3.]), array([-2., 4.]), array([ > 5.])], dtype=object) > > b has datatype object, this means it is a 1d array containing more > array objects. Numpy does not directly know about the shapes or types > the sub arrays. It is not necessarily homogeneous anymore, but > compared to a regular python list you still have elementwise > operations (if the contained python objects support them) and it can > have multiple dimensions. > All true, but a few notes: 1) you probably want to look at Cython for making this sort of thing easier. 2) a numpy-based ragged array implementation might make sense as well. You essentially store the data in a rank-1 shaped numpy array, and provide custom indexing to get the "rows" out. This would allow you to have all the data in a single memory block available to C (or Cython), so that you could fully optimize indexing and access, and have a data structure that makes sense in pure C. I've enclosed a start of such a class (I honestly can't remember how far I got with it, but it was at least useful for one project of mine.) HTH, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ragged_array.py Type: text/x-python-script Size: 4305 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_ragged_array.py Type: text/x-python-script Size: 3068 bytes Desc: not available URL: From chris.barker at noaa.gov Wed Jul 2 13:29:17 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 2 Jul 2014 10:29:17 -0700 Subject: [Numpy-discussion] numpy.histogram not giving expected results In-Reply-To: References: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> Message-ID: On Wed, Jul 2, 2014 at 7:57 AM, Mark Szepieniec wrote: > Looks this could be a float32 vs float64 problem: > that would explain it. > I guess users always be very careful when mixing floating point types, but > should numpy prevent (or warn) the user from doing so in this case? > I don't think so -- this "uncertainty" is very much the nature of histogramming, particularly with floating point values -- you should expect to get different results with different data precisions. As you should for ANY floating point computation. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed...
URL: From chris.barker at noaa.gov Wed Jul 2 13:34:40 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 2 Jul 2014 10:34:40 -0700 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: On Wed, Jul 2, 2014 at 3:37 AM, Matthew Brett wrote: > It looks like > 99% of Windows users do have SSE2 though [1]. So I think what is > required is > > * Build the wheels for 32-bit (easy) > * Patch the wheels to check and give helpful error in absence of SSE2 > (fairly easy) > * Get agreement these should go up on pypi and be maintained (feedback > anyone?) > +Inf It would benefit the community a LOT to have binary wheels up on PyPi, and the very small number of failures due to old hardware will be no big deal, as long as the users get a meaningful message, rather than a hard crash. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Jul 2 13:36:25 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 02 Jul 2014 19:36:25 +0200 Subject: [Numpy-discussion] numpy.histogram not giving expected results In-Reply-To: References: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> Message-ID: <53B44319.4000007@googlemail.com> On 02.07.2014 19:29, Chris Barker wrote: > On Wed, Jul 2, 2014 at 7:57 AM, Mark Szepieniec > wrote: > > Looks this could be a float32 vs float64 problem: > > > that would explain it. > > > I guess users always be very careful when mixing floating point > types, but should numpy prevent (or warn) the user from doing so in > this case? > > > I don't think so -- this "uncertainty" is very much the nature of > histogramming, particularly with floating point values -- you should > expect to get different results with different data precisions. As you > should for ANY floating point computation. > we recently fixed a float32/float64 issue in histogram. https://github.com/numpy/numpy/issues/4799 I think it boils down to the use of round() in histogram which is not so great in python as its based on decimals not significant figures (so it does nothing for float32 values > 1e7). Though this one seems different as it still occurs in git master. From jtaylor.debian at googlemail.com Wed Jul 2 13:38:08 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 02 Jul 2014 19:38:08 +0200 Subject: [Numpy-discussion] Accessing irregular sized array data from C In-Reply-To: <53B3F0AD.70700@gmail.com> References: <53B3DBBD.8030202@gmail.com> <53B3F0AD.70700@gmail.com> Message-ID: <53B44380.4030901@googlemail.com> On 02.07.2014 13:44, Mads Ipsen wrote: > > > On 02/07/14 12:46, Julian Taylor wrote: >> On Wed, Jul 2, 2014 at 12:15 PM, Mads Ipsen wrote: >>> Hi, >>> >>> If you setup an M x N array like this >>> >>> a = 1.0*numpy.arange(24).reshape(8,3) >>> >>> you can access the data from a C function like this >>> >>> void foo(PyObject * numpy_data) >>> { >>> // Get dimension and data pointer >>> int const m = static_cast(PyArray_DIMS(numpy_data)[0]); >>> int const n = static_cast(PyArray_DIMS(numpy_data)[1]); >>> double * const data = (double *) PyArray_DATA(numpy_data); >>> >>> // Access data >>> ... 
>>> } >>> >>> Now, suppose I have an irregular shaped numpy array like this >>> >>> a1 = numpy.array([ 1.0, 2.0, 3.0]) >>> a2 = numpy.array([-2.0, 4.0]) >>> a3 = numpy.array([5.0]) >>> b = numpy.array([a1,a2,a3]) >>> >>> How can open up the doors to the array data of b on the C-side? >>> >> >> numpy does not directly support irregular shaped arrays (or ragged arrays). >> If you look at the result of your example you will see this: >> In [5]: b >> Out[5]: array([array([ 1., 2., 3.]), array([-2., 4.]), array([ >> 5.])], dtype=object) >> >> b has datatype object, this means it is a 1d array containing more >> array objects. Numpy does not directly know about the shapes or types >> the sub arrays. It is not necessarily homogeneous anymore, but >> compared to a regular python list you still have elementwise >> operations (if the contained python objects support them) and it can >> have multiple dimensions. >> >> In C you would access such an array it like this: >> >> PyArrayObject * const data = (PyArrayObject *) PyArray_DATA(numpy_data); >> for (i=0; i < PyArray_DIMS(numpy_data)[0]; i++) { >> assert(PyArray_Check(data[i])); >> double * const sub_data = (double *) PyArray_DATA(data[i]); >> } >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > Thanks - that'll get me going! > another thing, don't use int as the index to the array, use npy_intp which is large enough to also index arrays > 4GB if the platform supports it. Also note that object arrays are not very well optimized in numpy, so numerous operations can be slow. From chris.barker at noaa.gov Wed Jul 2 13:55:41 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 2 Jul 2014 10:55:41 -0700 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: On Wed, Jul 2, 2014 at 6:36 AM, Matthew Brett wrote: > > Having a noSSE channel would make sense. > > Indeed -- the default (i.e. what you get with pip install numpy) should be SSE2 -- I'd much rather have a few folks with old hardware have to go through some hoops than have most people get something that is "much slower than MATLAB". > The exe superbinary installers can also go on pypi without causing > confusion to pip at least, but it would be good to have wheels as > well. > it doesn't hurt to have them, but we really need to get Windows away from the exe installers into the pip / virtualenv / etc world. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Jul 2 14:01:23 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 2 Jul 2014 11:01:23 -0700 Subject: [Numpy-discussion] Fwd: [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments In-Reply-To: References: <53B33800.1030300@ferrara.linux.it> Message-ID: NumPy doesn't have named axes, but perhaps it should. See, for example, Fernando Perez's datarray prototype (https://github.com/fperez/datarray) or my project, xray (https://github.com/xray/xray). Syntactical support for indexing an axis by name would make using named axes much more readable. For example, compare: gridValues[x=3, y=5, z=0:8] = 0 vs.
gridValues.set_items(dict(x=3, y=5, z=slice(0, 8)), 0) This is case 2 in the draft PEP. I am less sure about the other cases. For some of these, such as get with a default, using a function call is a perfectly fine substitute. Best, Stephan On Wed, Jul 2, 2014 at 1:49 AM, Nathaniel Smith wrote: > There's some discussion on python-ideas about making it possible for > python indexing to accept kwargs, eg > > arr[1:2, foo=bar] > > Since numpy is a very heavy user of indexing which might benefit from > this, I thought I should forward it here. If we have clear use cases for > such a feature then that may strongly affect the discussion. > > I admit I can't actually think of any features this would enable for us > though... > > -n > ---------- Forwarded message ---------- > From: "Stefano Borini" > Date: 2 Jul 2014 00:17 > Subject: [Python-ideas] PEP pre-draft: Support for indexing with keyword > arguments > To: "python-ideas at python.org" , "Joseph > Martinot-Lagarde" > Cc: > > Dear all, > > after the first mailing list feedback, and further private discussion with > Joseph Martinot-Lagarde, I drafted a first iteration of a PEP for keyword > arguments in indexing. The document is available here. > > https://github.com/stefanoborini/pep-keyword/blob/master/PEP-XXX.txt > > The document is not in final form when it comes to specifications. In > fact, it requires additional discussion about the best strategy to achieve > the desired result. Particular attention has been devoted to present > alternative implementation strategies, their pros and cons. I will examine > all feedback tomorrow morning European time (in approx 10 hrs), and apply > any pull requests or comments you may have. > > When the specification is finalized, or this community suggests that the > PEP is in a form suitable for official submission despite potential open > issues, I will submit it to the editor panel for further discussion, and > deploy an actual implementation according to the agreed specification for a > working test run. > > I apologize for potential mistakes in the PEP drafting and submission > process, as this is my first PEP. > > Kind Regards, > > Stefano Borini > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Jul 2 14:04:42 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 2 Jul 2014 11:04:42 -0700 Subject: [Numpy-discussion] numpy.histogram not giving expected results In-Reply-To: <53B44319.4000007@googlemail.com> References: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> <53B44319.4000007@googlemail.com> Message-ID: On Wed, Jul 2, 2014 at 10:36 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: we recently fixed a float32/float64 issue in histogram. > https://github.com/numpy/numpy/issues/4799 It's a good idea to keep the edges in the same dtype as the input data, it will make for fewer surprises, but I'm not sure that it's necessarily any more "correct". A value within an eps of a bin could arbitrarily end up on either side -- that's simply the nature of floating point. 
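A minimal sketch of the edge effect described above, distilled from the interpreter sessions earlier in the thread (the counts in the comments are the ones Mark and Catherine reported; the exact split can vary across numpy versions and platforms):

import numpy as np

# The nearest float32 to 0.05 is a slightly different number from the
# nearest float64, so a value sitting exactly on a bin boundary can land
# on either side of it depending on how the edges were produced.
print(repr(float(np.float32(0.05))))   # 0.05000000074505806 -- just above 0.05
print(repr(float(np.float64(0.05))))   # 0.05

data = np.array([0.0] * 10 + [0.05, -0.05])            # float64 values
bins_f64 = np.array([-0.1, -0.05, 0.0, 0.05, 0.1])     # float64 edges
bins_f32 = np.arange(-0.1, 0.101, 0.05, dtype=np.float32)

# With float64 edges, 0.05 equals the edge and falls into the [0.05, 0.1] bin.
print(np.histogram(data, bins=bins_f64)[0])   # [ 0  1 10  1]
# With float32 edges, 0.05 sits just below the upcast edge and drops a bin lower.
print(np.histogram(data, bins=bins_f32)[0])   # [ 0  1 11  0]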
> I think it boils down to the use of round() in histogram which is not so > great in python as its based on decimals not significant figures (so it > does nothing for float32 values > 1e7). > Using decimals rather than sig-figs is a problem regardless of precision, and isn't that the same problem with C libmath round() ? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Jul 2 14:17:53 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 02 Jul 2014 20:17:53 +0200 Subject: [Numpy-discussion] numpy.histogram not giving expected results In-Reply-To: References: <19894208-1D97-461B-86EB-CF4394176CEE@jpl.nasa.gov> <53B44319.4000007@googlemail.com> Message-ID: <53B44CD1.8000107@googlemail.com> On 02.07.2014 20:04, Chris Barker wrote: > On Wed, Jul 2, 2014 at 10:36 AM, Julian Taylor > > > wrote: > > we recently fixed a float32/float64 issue in histogram. > https://github.com/numpy/numpy/issues/4799 > > > It's a good idea to keep the edges in the same dtype as the input data, > it will make for fewer surprises, but I'm not sure that it's necessarily > any more "correct". A value within an eps of a bin could arbitrarily end > up on either side -- that's simply the nature of floating point. > > > > I think it boils down to the use of round() in histogram which is not so > great in python as its based on decimals not significant figures (so it > does nothing for float32 values > 1e7). > > > Using decimals rather than sig-figs is a problem regardless of > precision, and isn't that the same problem with C libmath round() ? > C round just rounds to the nearest integer and the result is still a float. numpy/python is different and implements round as round(d * 10**decimal) / 10**decimal From sturla.molden at gmail.com Wed Jul 2 15:12:17 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 2 Jul 2014 19:12:17 +0000 (UTC) Subject: [Numpy-discussion] Accessing irregular sized array data from C References: <53B3DBBD.8030202@gmail.com> <53B3F0AD.70700@gmail.com> <53B44380.4030901@googlemail.com> Message-ID: <801964251426020918.364158sturla.molden-gmail.com@news.gmane.org> Julian Taylor wrote: > another thing, don't use int as the index to the array, use npy_intp > which is large enough to also index arrays > 4GB if the platform > supports it. With double* a 32-bit int can index 16 GB, a 32-bit unsigned int can index 32 GB. With char* a 32-bit int can only index 2 GB. Sturla From njs at pobox.com Wed Jul 2 15:16:54 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 2 Jul 2014 20:16:54 +0100 Subject: [Numpy-discussion] Accessing irregular sized array data from C In-Reply-To: <801964251426020918.364158sturla.molden-gmail.com@news.gmane.org> References: <53B3DBBD.8030202@gmail.com> <53B3F0AD.70700@gmail.com> <53B44380.4030901@googlemail.com> <801964251426020918.364158sturla.molden-gmail.com@news.gmane.org> Message-ID: On 2 Jul 2014 20:12, "Sturla Molden" wrote: > > Julian Taylor wrote: > > > another thing, don't use int as the index to the array, use npy_intp > > which is large enough to also index arrays > 4GB if the platform > > supports it. > > With double* a 32-bit int can index 16 GB, a 32-bit unsigned int can index > 32 GB. > > With char* a 32-bit int can only index 2 GB. 
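(As a quick aside, spelling out that arithmetic: the reachable range scales with the element size, because such an index counts elements rather than bytes.)

>>> 2**31 * 8 / 2.**30   # signed 32-bit int over double*
16.0
>>> 2**32 * 8 / 2.**30   # unsigned 32-bit int over double*
32.0
>>> 2**31 * 1 / 2.**30   # signed 32-bit int over char*
2.0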
Per dimension, if we're talking about addressing. Numpy internally does all index/stride calculations in units of bytes, though, so if accessing the data array directly and using strides, the only reliable approach is to use intp or equivalent. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Wed Jul 2 15:20:52 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 2 Jul 2014 19:20:52 +0000 (UTC) Subject: [Numpy-discussion] Accessing irregular sized array data from C References: <53B3DBBD.8030202@gmail.com> Message-ID: <268090784426021268.412829sturla.molden-gmail.com@news.gmane.org> Chris Barker wrote: > 2) a numpy-based ragged array implementation might make sense as well. You > essentially store the data in a rank-1 shaped numpy array, and provide > custom indexing to get the "rows" out. This would allow you to have all the > data in a single memory block available to C (or Cython), so that you could > fully optimize indexing and access, and have a data structure that makes > sense in pure C. If the sub-arrays are contiguous, an ndarray of ndarrays is not inherently slower in C than the common double** idiom. As with double** the performance depends on iterating along the contiguous sub-arrays in the innermost loop. From the Python side it will be more hurtful, yes, but not when working with the NumPy C API. Sturla From sturla.molden at gmail.com Wed Jul 2 15:33:20 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 2 Jul 2014 19:33:20 +0000 (UTC) Subject: [Numpy-discussion] Accessing irregular sized array data from C References: <53B3DBBD.8030202@gmail.com> <53B3F0AD.70700@gmail.com> <53B44380.4030901@googlemail.com> <801964251426020918.364158sturla.molden-gmail.com@news.gmane.org> Message-ID: <1406948660426021705.905843sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > Numpy internally does all index/stride calculations in units of bytes, > though, so if accessing the data array directly and using strides, the only > reliable approach is to use intp or equivalent. If we use PyArray_STRIDES we should use npy_intp, yes, because we are computing the address directly from a char*. It depends on how much we know about the array in advance. Also a C standard pedant would point out we can only assume an int will be at least 16 bit, and we should use long to make sure it is at least 32 bit. Sturla From fperez.net at gmail.com Wed Jul 2 22:17:19 2014 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 2 Jul 2014 19:17:19 -0700 Subject: [Numpy-discussion] Fwd: [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments In-Reply-To: References: <53B33800.1030300@ferrara.linux.it> Message-ID: Added to the py3 BoF ideas page: https://github.com/ipython/ipython/wiki/Sprints:-SciPy2014-Py3-BoF Thanks for this heads-up! On Wed, Jul 2, 2014 at 11:01 AM, Stephan Hoyer wrote: > NumPy doesn't have named axes, but perhaps it should. See, for example, > Fernando Perez's datarray prototype (https://github.com/fperez/datarray) > or my project, xray (https://github.com/xray/xray). > > Syntactical support for indexing an axis by name would make using named > axes much more readable. For example, compare: > > gridValues[x=3, y=5, z=0:8] = 0 > > vs. > > gridValues.set_items(dict(x=3, y=5, z=slice(0, 8)), 0) > > This is case 2 in the draft PEP. > > I am less sure about the other cases. For some of these, such as get with > a default, using a function call is a perfectly fine substitute.
> > Best, > Stephan > > > > > On Wed, Jul 2, 2014 at 1:49 AM, Nathaniel Smith wrote: > >> There's some discussion on python-ideas about making it possible for >> python indexing to accept kwargs, eg >> >> arr[1:2, foo=bar] >> >> Since numpy is a very heavy user of indexing which might benefit from >> this, I thought I should forward it here. If we have clear use cases for >> such a feature then that may strongly affect the discussion. >> >> I admit I can't actually think of any features this would enable for us >> though... >> >> -n >> ---------- Forwarded message ---------- >> From: "Stefano Borini" >> Date: 2 Jul 2014 00:17 >> Subject: [Python-ideas] PEP pre-draft: Support for indexing with keyword >> arguments >> To: "python-ideas at python.org" , "Joseph >> Martinot-Lagarde" >> Cc: >> >> Dear all, >> >> after the first mailing list feedback, and further private discussion >> with Joseph Martinot-Lagarde, I drafted a first iteration of a PEP for >> keyword arguments in indexing. The document is available here. >> >> https://github.com/stefanoborini/pep-keyword/blob/master/PEP-XXX.txt >> >> The document is not in final form when it comes to specifications. In >> fact, it requires additional discussion about the best strategy to achieve >> the desired result. Particular attention has been devoted to present >> alternative implementation strategies, their pros and cons. I will examine >> all feedback tomorrow morning European time (in approx 10 hrs), and apply >> any pull requests or comments you may have. >> >> When the specification is finalized, or this community suggests that the >> PEP is in a form suitable for official submission despite potential open >> issues, I will submit it to the editor panel for further discussion, and >> deploy an actual implementation according to the agreed specification for a >> working test run. >> >> I apologize for potential mistakes in the PEP drafting and submission >> process, as this is my first PEP. >> >> Kind Regards, >> >> Stefano Borini >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Wed Jul 2 23:56:17 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 03 Jul 2014 05:56:17 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: On 02/07/14 19:55, Chris Barker wrote: > > Indeed -- the default (i.e what you get with pip install numpy) should > be SSE2 -- I":d much rather have a few folks with old hardware have to > go through some hoops that n have most people get something that is > "much slower than MATLAB". I think we should use SSE3 as default. It is already ten years old. 
Most users (99.999 %) who want binary wheels have an SSE3 capable CPU. According to Wikipedia: AMD: Athlon 64 (since Venice Stepping E3 and San Diego Stepping E4) Athlon 64 X2 Athlon 64 FX (since San Diego Stepping E4) Opteron (since Stepping E4) Sempron (since Palermo, Stepping E3) Phenom Phenom II Athlon II Turion 64 Turion 64 X2 Turion X2 Turion X2 Ultra Turion II X2 Mobile Turion II X2 Ultra APU FX Series Intel: Celeron D Celeron (starting with Core microarchitecture) Pentium 4 (since Prescott) Pentium D Pentium Extreme Edition (but NOT Pentium 4 Extreme Edition) Pentium Dual-Core Pentium (starting with Core microarchitecture) Core Xeon (since Nocona) Atom If you have a Pentium II, you can build your own NumPy... Sturla From jtaylor.debian at googlemail.com Thu Jul 3 03:42:41 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 03 Jul 2014 09:42:41 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: <53B50971.7040407@googlemail.com> On 03.07.2014 05:56, Sturla Molden wrote: > On 02/07/14 19:55, Chris Barker wrote: > >> >> Indeed -- the default (i.e. what you get with pip install numpy) should >> be SSE2 -- I'd much rather have a few folks with old hardware have to >> go through some hoops than have most people get something that is >> "much slower than MATLAB". > > > I think we should use SSE3 as default. It is already ten years old. Most > users (99.999 %) who want binary wheels have an SSE3 capable CPU. > While it is true that pretty much all cpus currently around have it, there is no technical requirement for even new cpus to have SSE3. Compared to SSE2 you do not have to implement it to sell a compatible 64 bit cpu. Not even the new x32 ABI requires it. In practice I think we could easily get away with using SSE3 as the default, but I would still like to see if it makes any performance difference in benchmarks. In my experience (which is exclusively on pre-haswell machines) the horizontal operations it offers tend to be slower than other solutions. From m.hulsman at tudelft.nl Thu Jul 3 04:51:31 2014 From: m.hulsman at tudelft.nl (Marc Hulsman) Date: Thu, 03 Jul 2014 10:51:31 +0200 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? Message-ID: <53B51993.7080207@tudelft.nl> Hello, In my application I use nested, sometimes variable-length lists, e.g. [[1,2], [1,2,3], ...]. These can also become double nested, etc. up to arbitrary complexity. I would like to use numpy indexing on the outer list, i.e. I want to create: array([[1, 2], [1, 2, 3]], dtype=object) However, because numpy likes to 'walk' through the nested lists, this becomes rather slow when the nested lists are large, e.g. k = [range(i) for i in range(10000)] %timeit numpy.array(k) 1 loops, best of 3: 2.11 s per loop Compared to shorter lists, e.g.: k2 = [range(numpy.random.randint(0,10)) for i in range(10000)] %timeit numpy.array(k2) 100 loops, best of 3: 2.7 ms per loop As I know beforehand that numpy does not have to descend into these objects, I would just like to create a 1-dimensional array.
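That 'walking' is essentially dimension and dtype discovery: numpy inspects every nested element before deciding whether the input forms a regular n-d array or has to fall back to a 1-d object array. A minimal sketch of the two outcomes:

>>> import numpy as np
>>> np.array([[1, 2], [3, 4]]).shape                   # equal lengths: numpy descends and builds a 2-d array
(2, 2)
>>> np.array([[1, 2], [1, 2, 3]], dtype=object).shape  # ragged: a 1-d array holding the lists as objects
(2,)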
I thought about using fromiter, but this fails with: ValueError: cannot create object arrays from iterator A second approach I tried is to create an empty array, and then fill it: x = numpy.empty(len(k), dtype=object) %timeit x[:] = k 1000 loops, best of 3: 220 µs per loop This already works much, much better, but the loop still takes time to 'descend' into the objects if they have a fixed size, e.g.: k3 = [[range(10) for i in range(100)] for i in range(10000)] %timeit x[:] = k3 10 loops, best of 3: 45.6 ms per loop A python loop is in these cases even faster: %timeit for pos, e in enumerate(k3): x[pos] = e 1000 loops, best of 3: 1.02 ms per loop This piece of code is quite time-critical in my application, and I observe slowdowns due to this behaviour. My question therefore is whether there is a fast way to simply convert a list into a 1-dimensional object array, without each object being descended into. More generally, if I create an array with numpy.array(k), would it be possible to indicate that it should search only 1, 2, ... nested levels deep into k? Thanks for any advice, Marc From pablopg at computer.org Thu Jul 3 05:14:31 2014 From: pablopg at computer.org (=?UTF-8?B?UGFibG8gUMOpcmV6IEdhcmPDrWE=?=) Date: Thu, 3 Jul 2014 11:14:31 +0200 Subject: [Numpy-discussion] Numpy and debug symbols Message-ID: Hello, I'm a newcomer and I have a question I did not manage to solve yet, I posted it into these two stack-overflow entries: http://stackoverflow.com/questions/24529811/compiling-numpy-for-windows-python-2-7-7 http://stackoverflow.com/questions/24548485/using-numpy-on-an-embedded-python-interpreter-using-vs2008-under-windows-7 Thank you very much in advance! -- Pablo Pérez García -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Jul 3 05:22:57 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 3 Jul 2014 11:22:57 +0200 Subject: [Numpy-discussion] Numpy and debug symbols In-Reply-To: References: Message-ID: On Thu, Jul 3, 2014 at 11:14 AM, Pablo Pérez García wrote: > Hello, I'm a newcomer and I have a question I did not manage to solve yet, I > posted it into these two stack-overflow entries: > > http://stackoverflow.com/questions/24529811/compiling-numpy-for-windows-python-2-7-7 > > http://stackoverflow.com/questions/24548485/using-numpy-on-an-embedded-python-interpreter-using-vs2008-under-windows-7 > I don't know how it works on windows, but on linux/mac, in order to import debug builds of binary extensions you need to use a debug build of python, which is a different runtime. I guess on windows you either have to download a special installer with the debug build or build it yourself (configure --with-pydebug) From jtaylor.debian at googlemail.com Thu Jul 3 05:30:33 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 3 Jul 2014 11:30:33 +0200 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? In-Reply-To: <53B51993.7080207@tudelft.nl> References: <53B51993.7080207@tudelft.nl> Message-ID: numpy descends into the lists even if you request an object dtype, as it treats object arrays containing nested lists of equal size as n-dimensional: np.array([[1,2], [3,4]], dtype=object).ndim 2 I don't think we have a constructor that limits the maximum dimension, only one for the minimum dimension. I guess we could add one, e.g.
np.array(nested_list, dtype=object, ndmax=1) But I'm not sure if it's really worth it; can't you somehow move the array construction out of your tight loops? From jtaylor.debian at googlemail.com Thu Jul 3 05:43:20 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 3 Jul 2014 11:43:20 +0200 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? In-Reply-To: References: <53B51993.7080207@tudelft.nl> Message-ID: On Thu, Jul 3, 2014 at 11:30 AM, Julian Taylor wrote: > numpy descends into the lists even if you request an object dtype, as it > treats object arrays containing nested lists of equal size as > n-dimensional: > > np.array([[1,2], [3,4]], dtype=object).ndim > 2 > > I don't think we have a constructor that limits the maximum dimension, > only one for the minimum dimension. > I guess we could add one, e.g. np.array(nested_list, dtype=object, ndmax=1) > But I'm not sure if it's really worth it; can't you somehow move the > array construction out of your tight loops? On second thought, I guess adding a short circuit to the dimension discovery on mismatching list lengths with object dtype should solve the issue too. A bit more information on the use case would still be useful: why do you need to use numpy arrays for this in the first place? From matthew.brett at gmail.com Thu Jul 3 06:06:39 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 3 Jul 2014 11:06:39 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, On Thu, Jul 3, 2014 at 4:56 AM, Sturla Molden wrote: > On 02/07/14 19:55, Chris Barker wrote: > >> >> Indeed -- the default (i.e. what you get with pip install numpy) should >> be SSE2 -- I'd much rather have a few folks with old hardware have to >> go through some hoops than have most people get something that is >> "much slower than MATLAB". > > > I think we should use SSE3 as default. It is already ten years old. Most > users (99.999 %) who want binary wheels have an SSE3 capable CPU. The 99% for SSE2 comes from the Firefox crash reports, where the large majority are for very recent Firefox downloads. If you can identify SSE3 machines from the reported CPU string (as the Firefox people did for SSE2), please do have a look and see if you can get a count for SSE3 in the Firefox crash reports; if it's close to 99% that would make a strong argument: https://github.com/numpy/numpy/wiki/Windows-versions#sse--sse2 https://gist.github.com/matthew-brett/9cb5274f7451a3eb8fc0 Cheers, Matthew From cmkleffner at gmail.com Thu Jul 3 06:33:35 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Thu, 3 Jul 2014 12:33:35 +0200 Subject: [Numpy-discussion] Numpy and debug symbols In-Reply-To: References: Message-ID: Hi, to trace this error, you can try to run your program with the dependency walker http://www.dependencywalker.com/ . In the menu there is a profiling option. With 'Start profiling' you get messages of all accesses to DLLs and Python extensions. Most likely a DLL is not found. Be aware: for 64-bit development you need a dedicated zip-file for the dependency walker.
Regards Carl 2014-07-03 11:22 GMT+02:00 Julian Taylor : > On Thu, Jul 3, 2014 at 11:14 AM, Pablo Pérez García > wrote: > > Hello, I'm a newcomer and I have a question I did not manage to solve > yet, I > > posted it into these two stack-overflow entries: > > > > > http://stackoverflow.com/questions/24529811/compiling-numpy-for-windows-python-2-7-7 > > > > > http://stackoverflow.com/questions/24548485/using-numpy-on-an-embedded-python-interpreter-using-vs2008-under-windows-7 > > > > I don't know how it works on windows, but on linux/mac, in order to > import debug builds of binary extensions you need to use a debug build > of python, which is a different runtime. I guess on windows you either > have to download a special installer with the debug build or build it > yourself (configure --with-pydebug) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Jul 3 06:46:23 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 3 Jul 2014 11:46:23 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: I guess this one's mainly for Carl: On Thu, Jul 3, 2014 at 11:06 AM, Matthew Brett wrote: > Hi, > > On Thu, Jul 3, 2014 at 4:56 AM, Sturla Molden wrote: >> On 02/07/14 19:55, Chris Barker wrote: >> >>> >>> Indeed -- the default (i.e. what you get with pip install numpy) should >>> be SSE2 -- I'd much rather have a few folks with old hardware have to >>> go through some hoops than have most people get something that is >>> "much slower than MATLAB". >> >> >> I think we should use SSE3 as default. It is already ten years old. Most >> users (99.999 %) who want binary wheels have an SSE3 capable CPU. > > The 99% for SSE2 comes from the Firefox crash reports, where the large > majority are for very recent Firefox downloads. > > If you can identify SSE3 machines from the reported CPU string (as the > Firefox people did for SSE2), please do have a look and see if you can > get a count for SSE3 in the Firefox crash reports; if it's close to > 99% that would make a strong argument: > > https://github.com/numpy/numpy/wiki/Windows-versions#sse--sse2 > https://gist.github.com/matthew-brett/9cb5274f7451a3eb8fc0 Jonathan Helmus recently pointed out https://ci.appveyor.com in a discussion on the scikit-image mailing list. The scikit-image team are trying to get builds and tests working there. The configuration file allows arbitrary cmd and powershell commands to be executed in a clean Windows virtual machine. Do you think it would be possible to get the wheel builds working on something like that? That would be a big step forward, just because the current procedure is rather fiddly, even if not very difficult. Any news on the pull request to numpy? Waiting eagerly :) Cheers, Matthew From pablopg at computer.org Thu Jul 3 06:51:35 2014 From: pablopg at computer.org (=?UTF-8?B?UGFibG8gUMOpcmV6IEdhcmPDrWE=?=) Date: Thu, 3 Jul 2014 12:51:35 +0200 Subject: [Numpy-discussion] Numpy and debug symbols In-Reply-To: References: Message-ID: Hello, I was able to run Dependency Walker and I noticed that in Debug mode the following libraries are not loaded: "MULTIARRAY.PYD", "UMATH.PYD" Also, in debug mode Python27_D is loaded and in release mode Python27, which sounds good to me...
but for some reason debug mode cannot load the necessary dependencies. I attach both files. By the way, I like this community! 2014-07-03 12:33 GMT+02:00 Carl Kleffner : > Hi, > > to trace this error, you can try to run your program with the dependency > walker http://www.dependencywalker.com/ . In the menu there is a > profiling option. With 'Start profiling' you get messages of all accesses > to DLLs and Python extensions. Most likely a DLL is not found. > Be aware: for 64-bit development you need a dedicated zip-file for the > dependency walker. > > Regards > > Carl > > > 2014-07-03 11:22 GMT+02:00 Julian Taylor : > > On Thu, Jul 3, 2014 at 11:14 AM, Pablo Pérez García >> wrote: >> > Hello, I'm a newcomer and I have a question I did not manage to solve >> yet, I >> > posted it into these two stack-overflow entries: >> > >> > >> http://stackoverflow.com/questions/24529811/compiling-numpy-for-windows-python-2-7-7 >> > >> > >> http://stackoverflow.com/questions/24548485/using-numpy-on-an-embedded-python-interpreter-using-vs2008-under-windows-7 >> > >> >> I don't know how it works on windows, but on linux/mac, in order to >> import debug builds of binary extensions you need to use a debug build >> of python, which is a different runtime. I guess on windows you either >> have to download a special installer with the debug build or build it >> yourself (configure --with-pydebug) >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Pablo Pérez García -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- #Dependency Walker for DEBUG. Warning: At least one delay-load dependency module was not found. Warning: At least one module has an unresolved import due to a missing export function in a delay-load dependent module. -------------------------------------------------------------------------------- Starting profile on 03/07/2014 at 12:41:42 Options Selected: Simulate ShellExecute by inserting any App Paths directories into the PATH environment variable. Log DllMain calls for process attach and process detach messages. Hook the process to gather more detailed dependency information. Log LoadLibrary function calls. Log GetProcAddress function calls. Log debug output messages. Automatically open and profile child processes. -------------------------------------------------------------------------------- Started "DEMO_FOR_PYTHON.EXE" (process 0xBD4) at address 0x00DD0000. Successfully hooked module. Loaded "NTDLL.DLL" at address 0x77D50000. Successfully hooked module. Loaded "KERNEL32.DLL" at address 0x75910000. Successfully hooked module. Loaded "KERNELBASE.DLL" at address 0x77900000. Successfully hooked module. DllMain(0x77900000, DLL_PROCESS_ATTACH, 0x00000000) in "KERNELBASE.DLL" called. DllMain(0x77900000, DLL_PROCESS_ATTACH, 0x00000000) in "KERNELBASE.DLL" returned 1 (0x1). DllMain(0x75910000, DLL_PROCESS_ATTACH, 0x00000000) in "KERNEL32.DLL" called. DllMain(0x75910000, DLL_PROCESS_ATTACH, 0x00000000) in "KERNEL32.DLL" returned 1 (0x1). Injected "DEPENDS.DLL" at address 0x08370000. DllMain(0x08370000, DLL_PROCESS_ATTACH, 0x00000000) in "DEPENDS.DLL" called.
DllMain(0x08370000, DLL_PROCESS_ATTACH, 0x00000000) in "DEPENDS.DLL" returned 1 (0x1). Loaded "MSVCR90D.DLL" at address 0x634F0000. Successfully hooked module. Loaded "PYTHON27_D.DLL" at address 0x1E000000. Successfully hooked module. Loaded "USER32.DLL" at address 0x76560000. Successfully hooked module. Loaded "GDI32.DLL" at address 0x75A60000. Successfully hooked module. Loaded "LPK.DLL" at address 0x764F0000. Successfully hooked module. Loaded "USP10.DLL" at address 0x77850000. Successfully hooked module. Loaded "MSVCRT.DLL" at address 0x75830000. Successfully hooked module. Loaded "ADVAPI32.DLL" at address 0x75DB0000. Successfully hooked module. Loaded "SECHOST.DLL" at address 0x760D0000. Successfully hooked module. Loaded "RPCRT4.DLL" at address 0x75B80000. Successfully hooked module. Loaded "SSPICLI.DLL" at address 0x75750000. Successfully hooked module. Loaded "CRYPTBASE.DLL" at address 0x75740000. Successfully hooked module. Loaded "SHELL32.DLL" at address 0x76790000. Successfully hooked module. Loaded "SHLWAPI.DLL" at address 0x76490000. Successfully hooked module. Entrypoint reached. All implicit modules have been loaded. DllMain(0x634F0000, DLL_PROCESS_ATTACH, 0x002DF840) in "MSVCR90D.DLL" called. GetProcAddress(0x75910000 [KERNEL32.DLL], "FlsAlloc") called from "MSVCR90D.DLL" at address 0x6352E339 and returned 0x75924EF3. GetProcAddress(0x75910000 [KERNEL32.DLL], "FlsGetValue") called from "MSVCR90D.DLL" at address 0x6352E34D and returned 0x75921252. GetProcAddress(0x75910000 [KERNEL32.DLL], "FlsSetValue") called from "MSVCR90D.DLL" at address 0x6352E361 and returned 0x759241D0. GetProcAddress(0x75910000 [KERNEL32.DLL], "FlsFree") called from "MSVCR90D.DLL" at address 0x6352E375 and returned 0x7592355F. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E5DB and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E5F3 and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "IsProcessorFeaturePresent") called from "MSVCR90D.DLL" at address 0x635E5A0B and returned 0x759251FD. GetProcAddress(0x75910000 [KERNEL32.DLL], "FindActCtxSectionStringW") called from "MSVCR90D.DLL" at address 0x6352CA3A and returned 0x7592A6D8. 
DllMain(0x634F0000, DLL_PROCESS_ATTACH, 0x002DF840) in "MSVCR90D.DLL" returned 1 (0x1). DllMain(0x75830000, DLL_PROCESS_ATTACH, 0x002DF840) in "MSVCRT.DLL" called. DllMain(0x75830000, DLL_PROCESS_ATTACH, 0x002DF840) in "MSVCRT.DLL" returned 1 (0x1). DllMain(0x77850000, DLL_PROCESS_ATTACH, 0x002DF840) in "USP10.DLL" called. LoadLibraryA("gdi32.dll") called from "USP10.DLL" at address 0x77866020. LoadLibraryA("gdi32.dll") returned 0x75A60000. GetProcAddress(0x75A60000 [GDI32.DLL], "GetCharABCWidthsI") called from "USP10.DLL" at address 0x77866055 and returned 0x75A799A3. DllMain(0x77850000, DLL_PROCESS_ATTACH, 0x002DF840) in "USP10.DLL" returned 1 (0x1). DllMain(0x764F0000, DLL_PROCESS_ATTACH, 0x002DF840) in "LPK.DLL" called. DllMain(0x764F0000, DLL_PROCESS_ATTACH, 0x002DF840) in "LPK.DLL" returned 1 (0x1). DllMain(0x75A60000, DLL_PROCESS_ATTACH, 0x002DF840) in "GDI32.DLL" called. DllMain(0x75A60000, DLL_PROCESS_ATTACH, 0x002DF840) in "GDI32.DLL" returned 1 (0x1). DllMain(0x75740000, DLL_PROCESS_ATTACH, 0x002DF840) in "CRYPTBASE.DLL" called. DllMain(0x75740000, DLL_PROCESS_ATTACH, 0x002DF840) in "CRYPTBASE.DLL" returned 1 (0x1). DllMain(0x75750000, DLL_PROCESS_ATTACH, 0x002DF840) in "SSPICLI.DLL" called. DllMain(0x75750000, DLL_PROCESS_ATTACH, 0x002DF840) in "SSPICLI.DLL" returned 1 (0x1). DllMain(0x75B80000, DLL_PROCESS_ATTACH, 0x002DF840) in "RPCRT4.DLL" called. DllMain(0x75B80000, DLL_PROCESS_ATTACH, 0x002DF840) in "RPCRT4.DLL" returned 1975101185 (0x75B9A701). DllMain(0x760D0000, DLL_PROCESS_ATTACH, 0x002DF840) in "SECHOST.DLL" called. DllMain(0x760D0000, DLL_PROCESS_ATTACH, 0x002DF840) in "SECHOST.DLL" returned 1 (0x1). DllMain(0x75DB0000, DLL_PROCESS_ATTACH, 0x002DF840) in "ADVAPI32.DLL" called. DllMain(0x75DB0000, DLL_PROCESS_ATTACH, 0x002DF840) in "ADVAPI32.DLL" returned 1 (0x1). DllMain(0x76560000, DLL_PROCESS_ATTACH, 0x002DF840) in "USER32.DLL" called. LoadLibraryW("C:\Windows\system32\IMM32.DLL") called from "USER32.DLL" at address 0x7657CF0E. Loaded "IMM32.DLL" at address 0x76500000. Successfully hooked module. Loaded "MSCTF.DLL" at address 0x75FF0000. Successfully hooked module. DllMain(0x75FF0000, DLL_PROCESS_ATTACH, 0x00000000) in "MSCTF.DLL" called. DllMain(0x75FF0000, DLL_PROCESS_ATTACH, 0x00000000) in "MSCTF.DLL" returned 1 (0x1). DllMain(0x76500000, DLL_PROCESS_ATTACH, 0x00000000) in "IMM32.DLL" called. GetProcAddress(0x76500000 [IMM32.DLL], "ImmWINNLSEnableIME") called from "USER32.DLL" at address 0x7657C312 and returned 0x7651F637. GetProcAddress(0x76500000 [IMM32.DLL], "ImmWINNLSGetEnableStatus") called from "USER32.DLL" at address 0x7657C327 and returned 0x7651F65E. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSendIMEMessageExW") called from "USER32.DLL" at address 0x7657C33C and returned 0x7651F8EC. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSendIMEMessageExA") called from "USER32.DLL" at address 0x7657C351 and returned 0x7651F907. GetProcAddress(0x76500000 [IMM32.DLL], "ImmIMPGetIMEW") called from "USER32.DLL" at address 0x7657C366 and returned 0x7651FB65. GetProcAddress(0x76500000 [IMM32.DLL], "ImmIMPGetIMEA") called from "USER32.DLL" at address 0x7657C37B and returned 0x7651FB99. GetProcAddress(0x76500000 [IMM32.DLL], "ImmIMPQueryIMEW") called from "USER32.DLL" at address 0x7657C390 and returned 0x7651F9CA. GetProcAddress(0x76500000 [IMM32.DLL], "ImmIMPQueryIMEA") called from "USER32.DLL" at address 0x7657C3A5 and returned 0x7651FAD6. GetProcAddress(0x76500000 [IMM32.DLL], "ImmIMPSetIMEW") called from "USER32.DLL" at address 0x7657C3BA and returned 0x7651F746. 
GetProcAddress(0x76500000 [IMM32.DLL], "ImmIMPSetIMEA") called from "USER32.DLL" at address 0x7657C3CF and returned 0x7651F86E. GetProcAddress(0x76500000 [IMM32.DLL], "ImmAssociateContext") called from "USER32.DLL" at address 0x7657C3E4 and returned 0x76513540. GetProcAddress(0x76500000 [IMM32.DLL], "ImmEscapeA") called from "USER32.DLL" at address 0x7657C3F9 and returned 0x76519327. GetProcAddress(0x76500000 [IMM32.DLL], "ImmEscapeW") called from "USER32.DLL" at address 0x7657C40E and returned 0x765195A9. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetCompositionStringA") called from "USER32.DLL" at address 0x7657C423 and returned 0x76517A37. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetCompositionStringW") called from "USER32.DLL" at address 0x7657C438 and returned 0x7651420C. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetCompositionWindow") called from "USER32.DLL" at address 0x7657C44D and returned 0x76512E79. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetContext") called from "USER32.DLL" at address 0x7657C462 and returned 0x76512084. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetDefaultIMEWnd") called from "USER32.DLL" at address 0x7657C477 and returned 0x76511F9D. GetProcAddress(0x76500000 [IMM32.DLL], "ImmIsIME") called from "USER32.DLL" at address 0x7657C48C and returned 0x76512FC7. GetProcAddress(0x76500000 [IMM32.DLL], "ImmReleaseContext") called from "USER32.DLL" at address 0x7657C4A1 and returned 0x765121A2. GetProcAddress(0x76500000 [IMM32.DLL], "ImmRegisterClient") called from "USER32.DLL" at address 0x7657C4B6 and returned 0x76511346. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetCompositionFontW") called from "USER32.DLL" at address 0x7657C4CB and returned 0x765168C8. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetCompositionFontA") called from "USER32.DLL" at address 0x7657C4E0 and returned 0x7651682C. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetCompositionFontW") called from "USER32.DLL" at address 0x7657C4F5 and returned 0x76513938. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetCompositionFontA") called from "USER32.DLL" at address 0x7657C50A and returned 0x76516964. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetCompositionWindow") called from "USER32.DLL" at address 0x7657C51F and returned 0x765138AA. GetProcAddress(0x76500000 [IMM32.DLL], "ImmNotifyIME") called from "USER32.DLL" at address 0x7657C534 and returned 0x76513C6C. GetProcAddress(0x76500000 [IMM32.DLL], "ImmLockIMC") called from "USER32.DLL" at address 0x7657C549 and returned 0x76511E7D. GetProcAddress(0x76500000 [IMM32.DLL], "ImmUnlockIMC") called from "USER32.DLL" at address 0x7657C55E and returned 0x76511E95. GetProcAddress(0x76500000 [IMM32.DLL], "ImmLoadIME") called from "USER32.DLL" at address 0x7657C573 and returned 0x7651197A. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetOpenStatus") called from "USER32.DLL" at address 0x7657C588 and returned 0x76513FF3. GetProcAddress(0x76500000 [IMM32.DLL], "ImmFreeLayout") called from "USER32.DLL" at address 0x7657C59D and returned 0x765197EF. GetProcAddress(0x76500000 [IMM32.DLL], "ImmActivateLayout") called from "USER32.DLL" at address 0x7657C5B2 and returned 0x76518DF5. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetCandidateWindow") called from "USER32.DLL" at address 0x7657C5C7 and returned 0x76512EBC. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetCandidateWindow") called from "USER32.DLL" at address 0x7657C5DC and returned 0x76513E02. 
GetProcAddress(0x76500000 [IMM32.DLL], "ImmConfigureIMEW") called from "USER32.DLL" at address 0x7657C5F1 and returned 0x7651913F. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetConversionStatus") called from "USER32.DLL" at address 0x7657C606 and returned 0x765124E9. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetConversionStatus") called from "USER32.DLL" at address 0x7657C61B and returned 0x76513EE6. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetStatusWindowPos") called from "USER32.DLL" at address 0x7657C630 and returned 0x76516A7C. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetImeInfoEx") called from "USER32.DLL" at address 0x7657C645 and returned 0x765114D8. GetProcAddress(0x76500000 [IMM32.DLL], "ImmLockImeDpi") called from "USER32.DLL" at address 0x7657C65A and returned 0x76512025. GetProcAddress(0x76500000 [IMM32.DLL], "ImmUnlockImeDpi") called from "USER32.DLL" at address 0x7657C66F and returned 0x76511FD8. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetOpenStatus") called from "USER32.DLL" at address 0x7657C684 and returned 0x76513DCF. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetActiveContext") called from "USER32.DLL" at address 0x7657C699 and returned 0x76512246. GetProcAddress(0x76500000 [IMM32.DLL], "ImmTranslateMessage") called from "USER32.DLL" at address 0x7657C6AE and returned 0x7651F27F. GetProcAddress(0x76500000 [IMM32.DLL], "ImmLoadLayout") called from "USER32.DLL" at address 0x7657C6C3 and returned 0x76519E79. GetProcAddress(0x76500000 [IMM32.DLL], "ImmProcessKey") called from "USER32.DLL" at address 0x7657C6D8 and returned 0x76513A3C. GetProcAddress(0x76500000 [IMM32.DLL], "ImmPutImeMenuItemsIntoMappedFile") called from "USER32.DLL" at address 0x7657C6ED and returned 0x76524E96. GetProcAddress(0x76500000 [IMM32.DLL], "ImmGetProperty") called from "USER32.DLL" at address 0x7657C702 and returned 0x76513BB8. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetCompositionStringA") called from "USER32.DLL" at address 0x7657C717 and returned 0x765183C2. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSetCompositionStringW") called from "USER32.DLL" at address 0x7657C72C and returned 0x765183E9. GetProcAddress(0x76500000 [IMM32.DLL], "ImmEnumInputContext") called from "USER32.DLL" at address 0x7657C741 and returned 0x765131DD. GetProcAddress(0x76500000 [IMM32.DLL], "ImmSystemHandler") called from "USER32.DLL" at address 0x7657C756 and returned 0x7651B1CF. GetProcAddress(0x76500000 [IMM32.DLL], "CtfImmTIMActivate") called from "USER32.DLL" at address 0x7657C767 and returned 0x76511888. GetProcAddress(0x76500000 [IMM32.DLL], "CtfImmRestoreToolbarWnd") called from "USER32.DLL" at address 0x7657C778 and returned 0x76525114. GetProcAddress(0x76500000 [IMM32.DLL], "CtfImmHideToolbarWnd") called from "USER32.DLL" at address 0x7657C789 and returned 0x7652514B. GetProcAddress(0x76500000 [IMM32.DLL], "CtfImmDispatchDefImeMessage") called from "USER32.DLL" at address 0x7657C79A and returned 0x7651163C. GetProcAddress(0x76500000 [IMM32.DLL], "CtfImmNotify") called from "USER32.DLL" at address 0x7657C7AB and returned 0x765115D0. GetProcAddress(0x76500000 [IMM32.DLL], "CtfImmSetDefaultRemoteKeyboardLayout") called from "USER32.DLL" at address 0x7657C7BC and returned 0x765253CC. GetProcAddress(0x76500000 [IMM32.DLL], "CtfImmGetCompatibleKeyboardLayout") called from "USER32.DLL" at address 0x7657C7CD and returned 0x765253DC. DllMain(0x76500000, DLL_PROCESS_ATTACH, 0x00000000) in "IMM32.DLL" returned 1 (0x1). LoadLibraryW("C:\Windows\system32\IMM32.DLL") returned 0x76500000. 
GetProcAddress(0x764F0000 [LPK.DLL], "LpkTabbedTextOut") called from "GDI32.DLL" at address 0x75A76970 and returned 0x764F48A0. GetProcAddress(0x764F0000 [LPK.DLL], "LpkPSMTextOut") called from "GDI32.DLL" at address 0x75A7697B and returned 0x764F1430. GetProcAddress(0x764F0000 [LPK.DLL], "LpkDrawTextEx") called from "GDI32.DLL" at address 0x75A76986 and returned 0x764F13D0. GetProcAddress(0x764F0000 [LPK.DLL], "LpkEditControl") called from "GDI32.DLL" at address 0x75A76991 and returned 0x764F7000. DllMain(0x76560000, DLL_PROCESS_ATTACH, 0x002DF840) in "USER32.DLL" returned 1 (0x1). DllMain(0x76490000, DLL_PROCESS_ATTACH, 0x002DF840) in "SHLWAPI.DLL" called. DllMain(0x76490000, DLL_PROCESS_ATTACH, 0x002DF840) in "SHLWAPI.DLL" returned 1 (0x1). DllMain(0x76790000, DLL_PROCESS_ATTACH, 0x002DF840) in "SHELL32.DLL" called. DllMain(0x76790000, DLL_PROCESS_ATTACH, 0x002DF840) in "SHELL32.DLL" returned 1 (0x1). DllMain(0x1E000000, DLL_PROCESS_ATTACH, 0x002DF840) in "PYTHON27_D.DLL" called. GetProcAddress(0x75910000 [KERNEL32.DLL], "GetCurrentActCtx") called from "PYTHON27_D.DLL" at address 0x1E180A17 and returned 0x7593D521. GetProcAddress(0x75910000 [KERNEL32.DLL], "ActivateActCtx") called from "PYTHON27_D.DLL" at address 0x1E180A34 and returned 0x75925458. GetProcAddress(0x75910000 [KERNEL32.DLL], "DeactivateActCtx") called from "PYTHON27_D.DLL" at address 0x1E180A48 and returned 0x75925424. GetProcAddress(0x75910000 [KERNEL32.DLL], "AddRefActCtx") called from "PYTHON27_D.DLL" at address 0x1E180A5C and returned 0x7593D510. GetProcAddress(0x75910000 [KERNEL32.DLL], "ReleaseActCtx") called from "PYTHON27_D.DLL" at address 0x1E180A70 and returned 0x75925489. DllMain(0x1E000000, DLL_PROCESS_ATTACH, 0x002DF840) in "PYTHON27_D.DLL" returned 1 (0x1). DllMain(0x76500000, DLL_PROCESS_DETACH, 0x00000001) in "IMM32.DLL" called. DllMain(0x76500000, DLL_PROCESS_DETACH, 0x00000001) in "IMM32.DLL" returned 1 (0x1). DllMain(0x75FF0000, DLL_PROCESS_DETACH, 0x00000001) in "MSCTF.DLL" called. DllMain(0x75FF0000, DLL_PROCESS_DETACH, 0x00000001) in "MSCTF.DLL" returned 1 (0x1). DllMain(0x1E000000, DLL_PROCESS_DETACH, 0x00000001) in "PYTHON27_D.DLL" called. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. 
GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90D.DLL" at address 0x6352E1DC and returned 0x77D89DD5. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90D.DLL" at address 0x6352E0DC and returned 0x77D9107B. DllMain(0x1E000000, DLL_PROCESS_DETACH, 0x00000001) in "PYTHON27_D.DLL" returned 1 (0x1). DllMain(0x76790000, DLL_PROCESS_DETACH, 0x00000001) in "SHELL32.DLL" called. DllMain(0x76790000, DLL_PROCESS_DETACH, 0x00000001) in "SHELL32.DLL" returned 1 (0x1). DllMain(0x76490000, DLL_PROCESS_DETACH, 0x00000001) in "SHLWAPI.DLL" called. DllMain(0x76490000, DLL_PROCESS_DETACH, 0x00000001) in "SHLWAPI.DLL" returned 1 (0x1). DllMain(0x76560000, DLL_PROCESS_DETACH, 0x00000001) in "USER32.DLL" called. DllMain(0x76560000, DLL_PROCESS_DETACH, 0x00000001) in "USER32.DLL" returned 1 (0x1). DllMain(0x75DB0000, DLL_PROCESS_DETACH, 0x00000001) in "ADVAPI32.DLL" called. DllMain(0x75DB0000, DLL_PROCESS_DETACH, 0x00000001) in "ADVAPI32.DLL" returned 1 (0x1). DllMain(0x760D0000, DLL_PROCESS_DETACH, 0x00000001) in "SECHOST.DLL" called. DllMain(0x760D0000, DLL_PROCESS_DETACH, 0x00000001) in "SECHOST.DLL" returned 1 (0x1). DllMain(0x75B80000, DLL_PROCESS_DETACH, 0x00000001) in "RPCRT4.DLL" called. DllMain(0x75B80000, DLL_PROCESS_DETACH, 0x00000001) in "RPCRT4.DLL" returned 1 (0x1). DllMain(0x75750000, DLL_PROCESS_DETACH, 0x00000001) in "SSPICLI.DLL" called. DllMain(0x75750000, DLL_PROCESS_DETACH, 0x00000001) in "SSPICLI.DLL" returned 1 (0x1). DllMain(0x75740000, DLL_PROCESS_DETACH, 0x00000001) in "CRYPTBASE.DLL" called. DllMain(0x75740000, DLL_PROCESS_DETACH, 0x00000001) in "CRYPTBASE.DLL" returned 1 (0x1). DllMain(0x75A60000, DLL_PROCESS_DETACH, 0x00000001) in "GDI32.DLL" called. DllMain(0x75A60000, DLL_PROCESS_DETACH, 0x00000001) in "GDI32.DLL" returned 1 (0x1). DllMain(0x764F0000, DLL_PROCESS_DETACH, 0x00000001) in "LPK.DLL" called. DllMain(0x764F0000, DLL_PROCESS_DETACH, 0x00000001) in "LPK.DLL" returned 1 (0x1). DllMain(0x77850000, DLL_PROCESS_DETACH, 0x00000001) in "USP10.DLL" called. DllMain(0x77850000, DLL_PROCESS_DETACH, 0x00000001) in "USP10.DLL" returned 1 (0x1). DllMain(0x75830000, DLL_PROCESS_DETACH, 0x00000001) in "MSVCRT.DLL" called. DllMain(0x75830000, DLL_PROCESS_DETACH, 0x00000001) in "MSVCRT.DLL" returned 1 (0x1). DllMain(0x634F0000, DLL_PROCESS_DETACH, 0x00000001) in "MSVCR90D.DLL" called. DllMain(0x634F0000, DLL_PROCESS_DETACH, 0x00000001) in "MSVCR90D.DLL" returned 1 (0x1). DllMain(0x08370000, DLL_PROCESS_DETACH, 0x00000001) in "DEPENDS.DLL" called. DllMain(0x08370000, DLL_PROCESS_DETACH, 0x00000001) in "DEPENDS.DLL" returned 1 (0x1). DllMain(0x75910000, DLL_PROCESS_DETACH, 0x00000001) in "KERNEL32.DLL" called. DllMain(0x75910000, DLL_PROCESS_DETACH, 0x00000001) in "KERNEL32.DLL" returned 1 (0x1). DllMain(0x77900000, DLL_PROCESS_DETACH, 0x00000001) in "KERNELBASE.DLL" called. DllMain(0x77900000, DLL_PROCESS_DETACH, 0x00000001) in "KERNELBASE.DLL" returned 1 (0x1). Exited "DEMO_FOR_PYTHON.EXE" (process 0xBD4) with code 0 (0x0). -------------- next part -------------- Warning: At least one delay-load dependency module was not found. Warning: At least one module has an unresolved import due to a missing export function in a delay-load dependent module. 
-------------------------------------------------------------------------------- Starting profile on 03/07/2014 at 12:38:08 Options Selected: Simulate ShellExecute by inserting any App Paths directories into the PATH environment variable. Log DllMain calls for process attach and process detach messages. Hook the process to gather more detailed dependency information. Log LoadLibrary function calls. Log GetProcAddress function calls. Log debug output messages. Automatically open and profile child processes. -------------------------------------------------------------------------------- Started "DEMO_FOR_PYTHON.EXE" (process 0x1FE0) at address 0x00220000. Successfully hooked module. Loaded "NTDLL.DLL" at address 0x77D50000. Successfully hooked module. Loaded "KERNEL32.DLL" at address 0x75910000. Successfully hooked module. Loaded "KERNELBASE.DLL" at address 0x77900000. Successfully hooked module. DllMain(0x77900000, DLL_PROCESS_ATTACH, 0x00000000) in "KERNELBASE.DLL" called. DllMain(0x77900000, DLL_PROCESS_ATTACH, 0x00000000) in "KERNELBASE.DLL" returned 1 (0x1). DllMain(0x75910000, DLL_PROCESS_ATTACH, 0x00000000) in "KERNEL32.DLL" called. DllMain(0x75910000, DLL_PROCESS_ATTACH, 0x00000000) in "KERNEL32.DLL" returned 1 (0x1). Injected "DEPENDS.DLL" at address 0x08370000. DllMain(0x08370000, DLL_PROCESS_ATTACH, 0x00000000) in "DEPENDS.DLL" called. DllMain(0x08370000, DLL_PROCESS_ATTACH, 0x00000000) in "DEPENDS.DLL" returned 1 (0x1). Loaded "MSVCR90.DLL" at address 0x74160000. Successfully hooked module. Loaded "PYTHON27.DLL" at address 0x1E000000. Successfully hooked module. Loaded "USER32.DLL" at address 0x76560000. Successfully hooked module. Loaded "GDI32.DLL" at address 0x75A60000. Successfully hooked module. Loaded "LPK.DLL" at address 0x764F0000. Successfully hooked module. Loaded "USP10.DLL" at address 0x77850000. Successfully hooked module. Loaded "MSVCRT.DLL" at address 0x75830000. Successfully hooked module. Loaded "ADVAPI32.DLL" at address 0x75DB0000. Successfully hooked module. Loaded "SECHOST.DLL" at address 0x760D0000. Successfully hooked module. Loaded "RPCRT4.DLL" at address 0x75B80000. Successfully hooked module. Loaded "SSPICLI.DLL" at address 0x75750000. Successfully hooked module. Loaded "CRYPTBASE.DLL" at address 0x75740000. Successfully hooked module. Loaded "SHELL32.DLL" at address 0x76790000. Successfully hooked module. Loaded "SHLWAPI.DLL" at address 0x76490000. Successfully hooked module. Entrypoint reached. All implicit modules have been loaded. DllMain(0x74160000, DLL_PROCESS_ATTACH, 0x0045FAE4) in "MSVCR90.DLL" called. GetProcAddress(0x75910000 [KERNEL32.DLL], "FlsAlloc") called from "MSVCR90.DLL" at address 0x74183ACC and returned 0x75924EF3. GetProcAddress(0x75910000 [KERNEL32.DLL], "FlsGetValue") called from "MSVCR90.DLL" at address 0x74183AD9 and returned 0x75921252. GetProcAddress(0x75910000 [KERNEL32.DLL], "FlsSetValue") called from "MSVCR90.DLL" at address 0x74183AE6 and returned 0x759241D0. GetProcAddress(0x75910000 [KERNEL32.DLL], "FlsFree") called from "MSVCR90.DLL" at address 0x74183AF3 and returned 0x7592355F. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90.DLL" at address 0x741835E2 and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90.DLL" at address 0x741835E2 and returned 0x77D9107B. GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90.DLL" at address 0x741835E2 and returned 0x77D9107B. 
GetProcAddress(0x75910000 [KERNEL32.DLL], "EncodePointer") called from "MSVCR90.DLL" at address 0x741835E2 and returned 0x77D9107B.
GetProcAddress(0x75910000 [KERNEL32.DLL], "DecodePointer") called from "MSVCR90.DLL" at address 0x74183667 and returned 0x77D89DD5.
[... repeated EncodePointer/DecodePointer lookups and per-function import resolutions (USP10, LPK Lpk*, IMM32 Imm*/CtfImm*) omitted ...]
DllMain(0x74160000, DLL_PROCESS_ATTACH, 0x0045FAE4) in "MSVCR90.DLL" returned 1 (0x1).
[... MSVCRT.DLL, USP10.DLL, LPK.DLL, GDI32.DLL, CRYPTBASE.DLL, SSPICLI.DLL, RPCRT4.DLL, SECHOST.DLL, ADVAPI32.DLL, USER32.DLL (loading IMM32.DLL and MSCTF.DLL), SHLWAPI.DLL and SHELL32.DLL all receive DLL_PROCESS_ATTACH and return 1 ...]
DllMain(0x1E000000, DLL_PROCESS_ATTACH, 0x0045FAE4) in "PYTHON27.DLL" returned 1 (0x1).
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\core\multiarray.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x10000000.
GetProcAddress(0x10000000 [MULTIARRAY.PYD], "initmultiarray") called from "PYTHON27.DLL" at address 0x1E0F98BF and returned 0x100915D0.
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\core\umath.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x023A0000.
GetProcAddress(0x023A0000 [UMATH.PYD], "initumath") called from "PYTHON27.DLL" at address 0x1E0F98BF and returned 0x023A7CE0.
Loaded "LIBIOMP5MD.DLL" at address 0x028F0000. Successfully hooked module.
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\core\_dotblas.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x02400000.
GetProcAddress(0x02400000 [_DOTBLAS.PYD], "init_dotblas") called from "PYTHON27.DLL" at address 0x1E0F98BF and returned 0x02403560.
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\core\scalarmath.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x00330000.
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\lib\_compiled_base.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x00150000.
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\linalg\lapack_lite.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x02E60000.
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\linalg\_umath_linalg.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x03780000.
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\fft\fftpack_lite.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x00520000.
LoadLibraryExA("C:\Python27\lib\site-packages\numpy\random\mtrand.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x048E0000.
Loaded "CRYPTSP.DLL" at address 0x70AF0000. Successfully hooked module.
Loaded "RSAENH.DLL" at address 0x70A70000. Successfully hooked module.
LoadLibraryExA("C:\Python27\DLLs\_ctypes.pyd", 0x00000000, LOAD_WITH_ALTERED_SEARCH_PATH) returned 0x1D1A0000.
Loaded "OLE32.DLL" at address 0x774B0000. Successfully hooked module.
Loaded "OLEAUT32.DLL" at address 0x75AF0000. Successfully hooked module.
GetProcAddress(0x1D1A0000 [_CTYPES.PYD], "init_ctypes") called from "PYTHON27.DLL" at address 0x1E0F98BF and returned 0x1D1A7130.
[... process detach begins; _CTYPES.PYD, OLEAUT32.DLL, OLE32.DLL, RSAENH.DLL, CRYPTSP.DLL, MTRAND.PYD, FFTPACK_LITE.PYD, _UMATH_LINALG.PYD, LAPACK_LITE.PYD, _COMPILED_BASE.PYD, SCALARMATH.PYD, _DOTBLAS.PYD, LIBIOMP5MD.DLL, UMATH.PYD, MULTIARRAY.PYD, IMM32.DLL, MSCTF.DLL, PYTHON27.DLL and SHELL32.DLL each receive DLL_PROCESS_DETACH and return 1; the interleaved EncodePointer/DecodePointer lookups are omitted ...]
[... the remaining system DLLs (SHLWAPI.DLL, USER32.DLL, ADVAPI32.DLL, SECHOST.DLL, RPCRT4.DLL, SSPICLI.DLL, CRYPTBASE.DLL, GDI32.DLL, LPK.DLL, USP10.DLL, MSVCRT.DLL, MSVCR90.DLL, DEPENDS.DLL, KERNEL32.DLL, KERNELBASE.DLL) receive DLL_PROCESS_DETACH and return 1 ...]
Exited "DEMO_FOR_PYTHON.EXE" (process 0x1FE0) with code 0 (0x0).

From cmkleffner at gmail.com Thu Jul 3 07:17:31 2014
From: cmkleffner at gmail.com (Carl Kleffner)
Date: Thu, 3 Jul 2014 13:17:31 +0200
Subject: [Numpy-discussion] Numpy and debug symbols
In-Reply-To: References: Message-ID:

Hi,

numpy extensions are linked against python27.dll. I have no idea if it works to copy python27.dll side by side to python27_d.dll (I guess not). But you can try it anyway. The clean way is to get or compile a debug numpy version linked against python27_d.dll.

Regards

Carl

2014-07-03 12:51 GMT+02:00 Pablo Pérez García :

> Hello,
>
> I was able to run Dependency Walker and I noticed that in Debug mode the
> following type of libraries are not loaded:
>
> "MULTIARRAY.PYD", "UMATH.PYD"
>
> Also in debug mode Python27_D is loaded and in release mode Python27 which
> sounds good to me... but for some reason debug mode cannot load necessary
> dependencies.
>
> I attach both files.
>
> By the way, I like this community!
> > > > 2014-07-03 12:33 GMT+02:00 Carl Kleffner : > > Hi, >> >> to trace this error, you can try to run your programm with the dependency >> walker http://www.dependencywalker.com/ . In the menu there is a >> profiling option. With 'Start profiling' you get messages of all accesses >> to DLLs and Python extensions. Most likely a DLL is not found. >> Be aware: for 64bit development you need a dedicated zip-file for the >> dependency walker. >> >> Regards >> >> Carl >> >> >> 2014-07-03 11:22 GMT+02:00 Julian Taylor : >> >> On Thu, Jul 3, 2014 at 11:14 AM, Pablo P?rez Garc?a >>> wrote: >>> > Hello, I'm a newcomer and I have a question I did not manage to solve >>> yet, I >>> > posted it into these two stack-overflow entries: >>> > >>> > >>> http://stackoverflow.com/questions/24529811/compiling-numpy-for-windows-python-2-7-7 >>> > >>> > >>> http://stackoverflow.com/questions/24548485/using-numpy-on-an-embedded-python-interpreter-using-vs2008-under-windows-7 >>> > >>> >>> I don't know how it works on windows but on linux/mac in order to >>> import debug builds of binary extensions you need to use debug build >>> of python which is a different runtime. I guess on windows you either >>> have to download a special installer with the debug build or build it >>> yourself (configure --with-pydebug) >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Pablo P?rez Garc?a > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmkleffner at gmail.com Thu Jul 3 07:51:56 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Thu, 3 Jul 2014 13:51:56 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi Matthew, I can make it in the late evening (MEZ timezone), so you have to wait a bit ... I also will try to create new numpy/scipy wheels. I now have the latest OpenBLAS version ready. Olivier gaves me access to rackspace. I wil try it out on the weekend. Regards Carl 2014-07-03 12:46 GMT+02:00 Matthew Brett : > I guess this one's mainly for Carl: > > On Thu, Jul 3, 2014 at 11:06 AM, Matthew Brett > wrote: > > Hi, > > > > On Thu, Jul 3, 2014 at 4:56 AM, Sturla Molden > wrote: > >> On 02/07/14 19:55, Chris Barker wrote: > >> > >>> > >>> Indeed -- the default (i.e what you get with pip install numpy) should > >>> be SSE2 -- I":d much rather have a few folks with old hardware have to > >>> go through some hoops that n have most people get something that is > >>> "much slower than MATLAB". > >> > >> > >> I think we should use SSE3 as default. It is already ten years old. Most > >> users (99.999 %) who want binary wheels have an SSE3 capable CPU. > > > > The 99% for SSE2 comes from the Firefox crash reports, where the large > > majority are for very recent Firefox downloads. 
> > > > If you can identify SSE3 machines from the reported CPU string (as the > > Firefox people did for SSE2), please do have a look a see if you can > > get a count for SSE3 in the Firefox crash reports; if it's close to > > 99% that would make a strong argument: > > > > https://github.com/numpy/numpy/wiki/Windows-versions#sse--sse2 > > https://gist.github.com/matthew-brett/9cb5274f7451a3eb8fc0 > > Jonathan Helmus recently pointed out https://ci.appveyor.com in a > discussion on the scikit-image mailing list. The scikit-image team > are trying to get builds and tests working there. The configuration > file allows arbitrary cmd and powershell commands executed in a clean > Windows virtual machine. Do you think it would be possible to get the > wheel builds working on something like that? That would be a big step > forward, just because the current procedure is rather fiddly, even if > not very difficult. > > Any news on the pull request to numpy? Waiting eagerly :) > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.hulsman at tudelft.nl Thu Jul 3 08:36:17 2014 From: m.hulsman at tudelft.nl (Marc Hulsman) Date: Thu, 03 Jul 2014 14:36:17 +0200 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? In-Reply-To: References: <53B51993.7080207@tudelft.nl> Message-ID: <53B54E41.8090309@tudelft.nl> On 07/03/2014 11:43 AM, Julian Taylor wrote: > On second though I guess adding a short circuit to the dimension > discovery on mismatching list length with object type should solve the > issue too. A bit more information on the use case would still be > useful, why do you need to use numpy arrays for this in the first place? I use numpy as the base for a prototype data handling language (which matches dimensions not on position as in numpy, but by identity). This allows SQL like operations on complex data structures. The code has to be generic, to handle the corner cases. Numpy is used as it provides the fast indicing/ufuncs. Input is often formatted using regular Python constructs. This input data is 'unpacked' to a certain depth, which means that it is converted to numpy arrays, to allow for generic query operations. This can however go wrong. Say that we have nested variable length lists, what sometimes happens is that part of the data has (by chance) only fixed length nested lists, while another part has variable length nested lists. If we then unpack, numpy will for the first case construct a multi-dimensional array, while for the second case it will construct a single-dimensional array of nested lists. If we then want to e.g. concatenate this data using a generic operation, it will have trouble to handle the mix of multi-dimensional and 1-dimensional arrays. The code becomes quite a bit simpler if I know at forehand that I can expect just e.g. a 1-dimensional array. This is maybe somewhat of a corner case :) However, I was still wondering why, when assigning x[:] = k, k is still 'descended into' further than needed given the limited dimension of x. This seems unnecessary? Also, it is also not really clear to me why fromiter does not work using object dtypes. A solution for these two more general problems would already help me a lot. 
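For concreteness, a minimal sketch of the two behaviours described above, run against the NumPy 1.8-era releases discussed in this thread (the exact error messages are an assumption and vary between versions):

import numpy as np

# Pre-allocated 1-d object array: the "unpacked" container discussed above.
x = np.empty(2, dtype=object)

# Ragged sublists assign fine: each one is stored as a plain Python object.
x[:] = [[1, 2], [3, 4, 5]]

# Equal-length sublists are descended into first, so the right-hand side
# becomes a (2, 2) array that cannot be broadcast into shape (2,).
try:
    x[:] = [[1, 2], [3, 4]]
except ValueError as err:
    print("slice assignment:", err)

# fromiter rejects object dtypes outright on these versions.
try:
    np.fromiter(iter([[1, 2], [3, 4]]), dtype=object)
except (TypeError, ValueError) as err:
    print("fromiter:", err)

# The workaround that always yields a 1-d object array: element-wise assignment.
for i, item in enumerate([[1, 2], [3, 4]]):
    x[i] = item
print(x.shape, x[0], x[1])   # (2,) [1, 2] [3, 4]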
The generic solution of adding an nmaxdim parameter to numpy.array would of course be even more ideal :) From sebastian at sipsolutions.net Thu Jul 3 08:44:19 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 03 Jul 2014 14:44:19 +0200 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? In-Reply-To: <53B54E41.8090309@tudelft.nl> References: <53B51993.7080207@tudelft.nl> <53B54E41.8090309@tudelft.nl> Message-ID: <1404391459.13834.8.camel@sebastian-t440> On Do, 2014-07-03 at 14:36 +0200, Marc Hulsman wrote: > On 07/03/2014 11:43 AM, Julian Taylor wrote: > > On second though I guess adding a short circuit to the dimension > > discovery on mismatching list length with object type should solve the > > issue too. A bit more information on the use case would still be > > useful, why do you need to use numpy arrays for this in the first place? > > I use numpy as the base for a prototype data handling language (which > matches dimensions not on position as in numpy, but by identity). > This allows SQL like operations on complex data structures. The code has > to be generic, to handle the corner cases. Numpy is used as it > provides the fast indicing/ufuncs. > > Input is often formatted using regular Python constructs. This input > data is 'unpacked' to a certain depth, which means > that it is converted to numpy arrays, to allow for generic query > operations. > > This can however go wrong. Say that we have nested variable length > lists, what sometimes happens is that part of the data has > (by chance) only fixed length nested lists, while another part has > variable length nested lists. If we then unpack, numpy will for > the first case construct a multi-dimensional array, while for the second > case it will construct a single-dimensional > array of nested lists. If we then want to e.g. concatenate this data > using a generic operation, it will have trouble to handle the mix of > multi-dimensional and 1-dimensional arrays. The code becomes quite a > bit simpler if I know at forehand that I can expect just e.g. > a 1-dimensional array. > > This is maybe somewhat of a corner case :) However, I was still > wondering why, when assigning x[:] = k, k is still 'descended into' > further than needed given the limited dimension of x. This seems > unnecessary? Also, it is also not really clear to me why fromiter > does not work using object dtypes. A solution for these two more general > problems would already help me a lot. True and true. I don't see a problem with fromiter being more general, just someone has to sit down and add new error checks/cleanup stuff for the object case. The assignment could probably also be optimized, not sure how hard that is, I would expect it isn't that hard. As usually, someone just needs to find time and the interest to actually do it ;). - Sebastian > > The generic solution of adding an nmaxdim parameter to numpy.array would > of course be even more ideal :) > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sturla.molden at gmail.com Thu Jul 3 09:27:24 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 3 Jul 2014 13:27:24 +0000 (UTC) Subject: [Numpy-discussion] [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments References: Message-ID: <295992745426086730.054539sturla.molden-gmail.com@news.gmane.org> Pandas might have more use for this than NumPy. Database interfaces might also have use for this. Sturla Nathaniel Smith wrote: > There's some discussion on python-ideas about making it possible for python > indexing to accept kwargs, eg > > arr[1:2, foo=bar] > > Since numpy is a very heavy user of indexing which might benefit from this, > I thought I should forward it here. If we have clear use cases for such a > feature then that may strongly affect the discussion. > > I admit I can't actually think of any features this would enable for us > though... > > -n > ---------- Forwarded message ---------- > From: "Stefano Borini" > Date: 2 Jul 2014 00:17 > Subject: [Python-ideas] PEP pre-draft: Support for indexing with keyword > arguments > To: "python-ideas at python.org" , "Joseph > Martinot-Lagarde" > Cc: > > Dear all, > > after the first mailing list feedback, and further private discussion with > Joseph Martinot-Lagarde, I drafted a first iteration of a PEP for keyword > arguments in indexing. The document is available here. > > href="https://github.com/stefanoborini/pep-keyword/blob/master/PEP-XXX.txt">https://github.com/stefanoborini/pep-keyword/blob/master/PEP-XXX.txt > > The document is not in final form when it comes to specifications. In fact, > it requires additional discussion about the best strategy to achieve the > desired result. Particular attention has been devoted to present > alternative implementation strategies, their pros and cons. I will examine > all feedback tomorrow morning European time (in approx 10 hrs), and apply > any pull requests or comments you may have. > > When the specification is finalized, or this community suggests that the > PEP is in a form suitable for official submission despite potential open > issues, I will submit it to the editor panel for further discussion, and > deploy an actual implementation according to the agreed specification for a > working test run. > > I apologize for potential mistakes in the PEP drafting and submission > process, as this is my first PEP. > > Kind Regards, > > Stefano Borini > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > href="https://mail.python.org/mailman/listinfo/python-ideas">https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: href="http://python.org/psf/codeofconduct/">http://python.org/psf/codeofconduct/ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > href="http://mail.scipy.org/mailman/listinfo/numpy-discussion">http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Thu Jul 3 10:43:51 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 3 Jul 2014 15:43:51 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, On Thu, Jul 3, 2014 at 12:51 PM, Carl Kleffner wrote: > Hi Matthew, > > I can make it in the late evening (MEZ timezone), so you have to wait a bit > ... 
I also will try to create new numpy/scipy wheels. I now have the latest > OpenBLAS version ready. Olivier gaves me access to rackspace. I wil try it > out on the weekend. Great - thanks a lot, Matthew From chris.barker at noaa.gov Thu Jul 3 11:59:00 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 3 Jul 2014 08:59:00 -0700 Subject: [Numpy-discussion] Teaching Scipy BoF at SciPy Message-ID: HI Folks, I will be hosting a "Teaching the SciPy Stack" BoF at SciPy this year: https://conference.scipy.org/scipy2014/schedule/presentation/1762/ (Actually, I proposed it for the conference, but would be more than happy to have other folks join me in facilitating, hosting, etc.) I've put up a Wiki page to collect ideas for topics. Please take a look and add your $0.02: https://github.com/numpy/numpy/wiki/TeachingSciPy-BoF-at-Scipy-2014 See you there, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ted.sandler at gmail.com Thu Jul 3 13:17:10 2014 From: ted.sandler at gmail.com (Ted Sandler) Date: Thu, 3 Jul 2014 10:17:10 -0700 Subject: [Numpy-discussion] parsing dtype descriptors Message-ID: Hi all, is there a spec or grammar for valid values of numpy dtype descriptor strings? I am writing code to parse ".npy" files from Java and want to be able to handle the range of ndarray descriptor strings. I came across this code: dtype = numpy.dtype(d['descr']) at line 267 in format.py: https://github.com/numpy/numpy/blob/master/numpy/lib/format.py However, I can't seem to find where it's implemented. Help is appreciated. Thanks! Ted -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Jul 3 13:30:01 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 3 Jul 2014 10:30:01 -0700 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? In-Reply-To: <53B54E41.8090309@tudelft.nl> References: <53B51993.7080207@tudelft.nl> <53B54E41.8090309@tudelft.nl> Message-ID: On Thu, Jul 3, 2014 at 5:36 AM, Marc Hulsman wrote: > This can however go wrong. Say that we have nested variable length > lists, what sometimes happens is that part of the data has > (by chance) only fixed length nested lists, while another part has > variable length nested lists. If we then unpack, numpy will for > the first case construct a multi-dimensional array, while for the second > case it will construct a single-dimensional > array of nested lists. If we then want to e.g. concatenate this data > using a generic operation, it will have trouble to handle the mix of > multi-dimensional and 1-dimensional arrays. The code becomes quite a > bit simpler if I know at forehand that I can expect just e.g. > a 1-dimensional array. > Pandas has a couple of awkward work-arounds to do just that (creating object arrays). Might be worth taking a look: https://github.com/pydata/pandas/blob/master/pandas/lib.pyx#L315 https://github.com/pydata/pandas/blob/master/pandas/core/common.py#L2124 Cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed... 
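[As a concrete illustration of the workaround being discussed for forcing a 1-d object array, a sketch rather than pandas' actual code; the helper name to_object_1d is made up for the example. Preallocating the object array and filling it element by element gives a 1-d result whether or not the nested lists happen to have equal lengths.]

    import numpy as np

    def to_object_1d(seq):
        # Illustrative helper: numpy never gets a chance to discover extra
        # dimensions, because each element is assigned into an object slot.
        out = np.empty(len(seq), dtype=object)
        for i, item in enumerate(seq):
            out[i] = item
        return out

    to_object_1d([[1, 2], [3, 4]]).shape      # (2,), even for the "rectangular" input
    to_object_1d([[1, 2], [3, 4, 5]]).shape   # (2,)
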
URL: From valentin at haenel.co Thu Jul 3 15:35:06 2014 From: valentin at haenel.co (Valentin Haenel) Date: Thu, 3 Jul 2014 21:35:06 +0200 Subject: [Numpy-discussion] parsing dtype descriptors In-Reply-To: References: Message-ID: <20140703193506.GA25653@kudu.in-berlin.de> Dear Ted, * Ted Sandler [2014-07-03]: > Hi all, is there a spec or grammar for valid values of numpy dtype > descriptor strings? > > I am writing code to parse ".npy" files from Java and want to be able to > handle the range of ndarray descriptor strings. I came across this code: > > dtype = numpy.dtype(d['descr']) > > at line 267 in format.py: > > https://github.com/numpy/numpy/blob/master/numpy/lib/format.py > > However, I can't seem to find where it's implemented. Not sure exactly, what you are looking for, but maybe the following helps: https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L210 best, V- From ted.sandler at gmail.com Thu Jul 3 17:53:51 2014 From: ted.sandler at gmail.com (Ted Sandler) Date: Thu, 3 Jul 2014 14:53:51 -0700 Subject: [Numpy-discussion] parsing dtype descriptors In-Reply-To: <20140703193506.GA25653@kudu.in-berlin.de> References: <20140703193506.GA25653@kudu.in-berlin.de> Message-ID: Thanks. No, it's not what I'm looking for. I'm looking for the code that parses the string "f8' '=f4' 'float32' '>c16' ... Ideally, I want the exhaustive list of valid input strings that describe standard ndarrays (i.e. ndarrays with simple entries as opposed to records or subarrays). Lacking an exhaustive list or spec, I'd like the source code that does the parsing for them. This stackoverflow post is worth looking at: http://stackoverflow.com/questions/13997087/what-are-the-available-datatypes-for-dtype-with-numpys-loadtxt-an-genfromtxt Thanks again, Ted On Thu, Jul 3, 2014 at 12:35 PM, Valentin Haenel wrote: > Dear Ted, > > * Ted Sandler [2014-07-03]: > > Hi all, is there a spec or grammar for valid values of numpy dtype > > descriptor strings? > > > > I am writing code to parse ".npy" files from Java and want to be able to > > handle the range of ndarray descriptor strings. I came across this code: > > > > dtype = numpy.dtype(d['descr']) > > > > at line 267 in format.py: > > > > https://github.com/numpy/numpy/blob/master/numpy/lib/format.py > > > > However, I can't seem to find where it's implemented. > > Not sure exactly, what you are looking for, but maybe the following > helps: > > https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L210 > > best, > > V- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 3 18:54:46 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Jul 2014 16:54:46 -0600 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? In-Reply-To: References: <53B51993.7080207@tudelft.nl> Message-ID: On Thu, Jul 3, 2014 at 3:30 AM, Julian Taylor wrote: > numpy descends into the lists even if you request a object dtype as it > treats object arrays containing nested lists of equal size as > ndimensional: > > np.array([[1,2], [3,4]], dtype=object).ndim > 2 > > I don't think we have a constructor that limits the maximum dimension, > only one the minimum dimension. > There was discussion of such some years ago specifically for the object case. I think it would be useful. 
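[Returning to the descriptor strings asked about earlier in this thread: for simple, non-record dtypes the 'descr' field is just a byte-order character, a type-kind character and an item size in bytes, and it round-trips through numpy.dtype. A short illustrative session, assuming a little-endian machine:]

    import numpy as np

    for spec in ['>f8', '=f4', 'float32', '>c16', '<i4', '|b1']:
        dt = np.dtype(spec)
        print("%-8s -> %s  kind=%s  itemsize=%d" % (spec, dt.str, dt.kind, dt.itemsize))

    # e.g. 'float32' -> '<f4' on a little-endian machine; for simple dtypes
    # dt.str is the canonical form written into the .npy header's 'descr' field.
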
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Fri Jul 4 02:03:12 2014 From: toddrjen at gmail.com (Todd) Date: Fri, 4 Jul 2014 08:03:12 +0200 Subject: [Numpy-discussion] Fwd: [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments In-Reply-To: References: <53B33800.1030300@ferrara.linux.it> Message-ID: On Jul 2, 2014 10:49 AM, "Nathaniel Smith" wrote: > > I admit I can't actually think of any features this would enable for us though... Could it be useful for structured arrays? -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Jul 4 04:39:33 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 04 Jul 2014 10:39:33 +0200 Subject: [Numpy-discussion] Fwd: [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments In-Reply-To: References: <53B33800.1030300@ferrara.linux.it> Message-ID: <1404463173.2714.4.camel@sebastian-t440> On Fr, 2014-07-04 at 08:03 +0200, Todd wrote: > > On Jul 2, 2014 10:49 AM, "Nathaniel Smith" wrote: > > > > I admit I can't actually think of any features this would enable for > us though... > > Could it be useful for structured arrays? Not sure how. The named columns seem like a decent point to me. For toggling indexing options, I wonder if usually function calls or temporary object construction (at least for numpy) ala: arr.ox[...] arr.indx(option)[...] are not better in any case. - Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From valentin at haenel.co Fri Jul 4 04:53:09 2014 From: valentin at haenel.co (Valentin Haenel) Date: Fri, 4 Jul 2014 10:53:09 +0200 Subject: [Numpy-discussion] parsing dtype descriptors In-Reply-To: References: <20140703193506.GA25653@kudu.in-berlin.de> Message-ID: <20140704085309.GB30233@kudu.in-berlin.de> Dear Ted, * Ted Sandler [2014-07-03]: > Thanks. No, it's not what I'm looking for. > > I'm looking for the code that parses the string " header's descriptor: > > {'descr': ' > There are many different descriptor strings, e.g.: > > '>f8' > '=f4' > 'float32' > '>c16' > ... > > Ideally, I want the exhaustive list of valid input strings that describe > standard ndarrays (i.e. ndarrays with simple entries as opposed to records > or subarrays). Lacking an exhaustive list or spec, I'd like the source code > that does the parsing for them. This stackoverflow post is worth looking > at: > > > http://stackoverflow.com/questions/13997087/what-are-the-available-datatypes-for-dtype-with-numpys-loadtxt-an-genfromtxt The only thing I could find in this direction, was: http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html But since you have mentioned the stackoverflow post, I presume you have already discovered this page. best, V- From robert.kern at gmail.com Fri Jul 4 04:53:36 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 4 Jul 2014 09:53:36 +0100 Subject: [Numpy-discussion] parsing dtype descriptors In-Reply-To: References: <20140703193506.GA25653@kudu.in-berlin.de> Message-ID: On Thu, Jul 3, 2014 at 10:53 PM, Ted Sandler wrote: > Thanks. No, it's not what I'm looking for. 
> > I'm looking for the code that parses the string " header's descriptor: > > {'descr': ' > There are many different descriptor strings, e.g.: > > '>f8' > '=f4' > 'float32' > '>c16' > ... > > Ideally, I want the exhaustive list of valid input strings that describe > standard ndarrays (i.e. ndarrays with simple entries as opposed to records > or subarrays). Lacking an exhaustive list or spec, I'd like the source code > that does the parsing for them. https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/descriptor.c#L1321 https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/conversion_utils.c#L1000 https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/ndarraytypes.h#L97 -- Robert Kern From valentin at haenel.co Fri Jul 4 09:49:54 2014 From: valentin at haenel.co (Valentin Haenel) Date: Fri, 4 Jul 2014 15:49:54 +0200 Subject: [Numpy-discussion] About the npz format In-Reply-To: <53515EE1.4080101@googlemail.com> References: <535030C4.9020700@googlemail.com> <20140417202635.GB22624@kudu.in-berlin.de> <20140417205627.GA4192@kudu.in-berlin.de> <20140418162927.GB1837@kudu.in-berlin.de> <53515EE1.4080101@googlemail.com> Message-ID: <20140704134954.GB31861@kudu.in-berlin.de> sorry, for the top-post, but should we add this as an issue on the github tracker? I'd like to revisit it this summer. V- * Julian Taylor [2014-04-18]: > On 18.04.2014 18:29, Valentin Haenel wrote: > > Hi, > > > > * Valentin Haenel [2014-04-17]: > >> * Valentin Haenel [2014-04-17]: > >>> * Julian Taylor [2014-04-17]: > >>>> On 17.04.2014 21:30, onefire wrote: > >>>>> Thanks for the suggestion. I did profile the program before, just not > >>>>> using Python. > >>>> > >>>> one problem of npz is that the zipfile module does not support streaming > >>>> data in (or if it does now we aren't using it). > >>>> So numpy writes the file uncompressed to disk and then zips it which is > >>>> horrible for performance and disk usage. > >>> > >>> As a workaround may also be possible to write the temporary NPY files to > >>> cStringIO instances and then use ``ZipFile.writestr`` with the > >>> ``getvalue()`` of the cStringIO object. However that approach may > >>> require some memory. In python 2.7, for each array: one copy inside the > >>> cStringIO instance and then another copy of when calling getvalue on the > >>> cString, I believe. > >> > >> There is a proof-of-concept implementation here: > >> > >> https://github.com/esc/numpy/compare/feature;npz_no_temp_file > > > > Anybody interested in me fixing this up (unit tests, API, etc..) for > > inclusion? > > > > I wonder if it would be better to instead use a fifo to avoid the memory > doubling. Windows probably hasn't got them (exposed via python) but one > can slap a platform check in front. > attached a proof of concept without proper error handling (which is > unfortunately the tricky part) > >From 472b4c0a44804b65d0774147010ec7a931a1c52d Mon Sep 17 00:00:00 2001 > From: Julian Taylor > Date: Thu, 17 Apr 2014 23:01:47 +0200 > Subject: [PATCH] use a pipe for savez > > --- > numpy/lib/npyio.py | 25 +++++++++++-------------- > 1 file changed, 11 insertions(+), 14 deletions(-) > > diff --git a/numpy/lib/npyio.py b/numpy/lib/npyio.py > index 98b4b6e..baafa9d 100644 > --- a/numpy/lib/npyio.py > +++ b/numpy/lib/npyio.py > @@ -585,22 +585,19 @@ def _savez(file, args, kwds, compress): > zipf = zipfile_factory(file, mode="w", compression=compression) > > # Stage arrays in a temporary file on disk, before writing to zip. 
> - fd, tmpfile = tempfile.mkstemp(suffix='-numpy.npy') > - os.close(fd) > - try: > + import threading > + with tempfile.TemporaryDirectory() as td: > + fifoname = os.path.join(td, "fifo") > + os.mkfifo(fifoname) > for key, val in namedict.items(): > fname = key + '.npy' > - fid = open(tmpfile, 'wb') > - try: > - format.write_array(fid, np.asanyarray(val)) > - fid.close() > - fid = None > - zipf.write(tmpfile, arcname=fname) > - finally: > - if fid: > - fid.close() > - finally: > - os.remove(tmpfile) > + def mywrite(pipe, val): > + with open(pipe, "wb") as wpipe: > + format.write_array(wpipe, np.asanyarray(val)) > + t = threading.Thread(target=mywrite, args=(fifoname, val)) > + t.start() > + zipf.write(fifoname, arcname=fname) > + t.join() > > zipf.close() > > -- > 1.9.1 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pelson.pub at gmail.com Fri Jul 4 10:01:52 2014 From: pelson.pub at gmail.com (Phil Elson) Date: Fri, 4 Jul 2014 15:01:52 +0100 Subject: [Numpy-discussion] Teaching Scipy BoF at SciPy In-Reply-To: References: Message-ID: Nice idea. Just a repository of courses would be a great first step. For example, I know Jake Vanderplas's course at https://github.com/jakevdp/2013_fall_ASTR599 is useful, and I have a few introduction (3hr) courses at https://github.com/SciTools/courses. On 3 July 2014 16:59, Chris Barker wrote: > HI Folks, > > I will be hosting a "Teaching the SciPy Stack" BoF at SciPy this year: > > https://conference.scipy.org/scipy2014/schedule/presentation/1762/ > > (Actually, I proposed it for the conference, but would be more than happy > to have other folks join me in facilitating, hosting, etc.) > > I've put up a Wiki page to collect ideas for topics. Please take a look > and add your $0.02: > > https://github.com/numpy/numpy/wiki/TeachingSciPy-BoF-at-Scipy-2014 > > See you there, > > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.hulsman at tudelft.nl Fri Jul 4 11:32:41 2014 From: m.hulsman at tudelft.nl (Marc Hulsman) Date: Fri, 04 Jul 2014 17:32:41 +0200 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? In-Reply-To: <1404391459.13834.8.camel@sebastian-t440> References: <53B51993.7080207@tudelft.nl> <53B54E41.8090309@tudelft.nl> <1404391459.13834.8.camel@sebastian-t440> Message-ID: <53B6C919.4010806@tudelft.nl> On 07/03/2014 02:44 PM, Sebastian Berg wrote: > True and true. I don't see a problem with fromiter being more general, > just someone has to sit down and add new error checks/cleanup stuff > for the object case. The assignment could probably also be optimized, > not sure how hard that is, I would expect it isn't that hard. As > usually, someone just needs to find time and the interest to actually > do it ;). - Sebastian I looked at the code of FromIter below. /* * We would need to alter the memory RENEW code to decrement any * reference counts before throwing away any memory. 
*/ if (PyDataType_REFCHK(dtype)) { PyErr_SetString(PyExc_ValueError, "cannot create object arrays from iterator"); goto done; } However, the memory renew code (which just reallocs the memory to increase the array size) uses a simple realloc. It seems to me that it is not necessary to adapt reference counts in this case (as the incref from the new memory compensates the decref from the memory that is removed)? For the addition of elements to the array, everything seems to be ok anyway, as setitem is used, which does the incref already. So I think it should be possible to just remove this check? I did not yet look at the assignment issue, had some difficulty finding the correct place in the code, does does anyone have any pointers were to look? >> The generic solution of adding an nmaxdim parameter to numpy.array would >> of course be even more ideal :) >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Fri Jul 4 15:42:41 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2014 13:42:41 -0600 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 Message-ID: Sebastian Seberg has fixed one class of test failures due to the indexing changes in numpy 1.9.0b1. There are some remaining errors, and in the case of the Matplotlib failures, they look to me to be Matplotlib bugs. The 2-d arrays that cause the error are returned by the overloaded _interpolate_single_key function in CubicTriInterpolator that is documented in the base class to return a 1-d array, whereas the actual dimensions are of the form (n, 1). The question is, what is the best work around here for these sorts errors? Can we afford to break Matplotlib and other packages on account of a bug that was previously accepted by Numpy? Perhaps a FutureWarning rather than an error would be more appropriate at this point, and that modification would be easy to make. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 4 16:00:06 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2014 14:00:06 -0600 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris wrote: > Sebastian Seberg has fixed one class of test failures due to the indexing > changes in numpy 1.9.0b1. There are some remaining errors, and in the case > of the Matplotlib failures, they look to me to be Matplotlib bugs. The 2-d > arrays that cause the error are returned by the overloaded > _interpolate_single_key function in CubicTriInterpolator that is > documented in the base class to return a 1-d array, whereas the actual > dimensions are of the form (n, 1). The question is, what is the best work > around here for these sorts errors? Can we afford to break Matplotlib and > other packages on account of a bug that was previously accepted by Numpy? > Perhaps a FutureWarning rather than an error would be more appropriate at > this point, and that modification would be easy to make. > > Thoughts? 
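[To make the class of failure concrete, a constructed sketch rather than matplotlib's actual code path: under the stricter 1.9 rules an (n, 1) value array no longer broadcasts to the 1-d result of a boolean or fancy index, while an explicit ravel() is unambiguous on old and new releases alike.]

    import numpy as np

    out = np.zeros(4)
    mask = np.array([True, False, True, True])
    vals = np.ones((3, 1))        # documented as 1-d, actually shaped (n, 1)

    # out[mask] = vals            # raises a shape-mismatch ValueError under the
    #                             # 1.9 rules: (3, 1) vs indexing result (3,)
    out[mask] = vals.ravel()      # explicit 1-d view, works everywhere
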
> > I'll add that all of the remaining test failures, with the possible exception of the Tables errors, look like bugs to me. The Tables errors result from the fact that in fancy indexing assignment into 1-d array the right hand side used to be repeated until sufficient values for the assignment were available. Not sure what to do about that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Jul 4 16:02:29 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 4 Jul 2014 22:02:29 +0200 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris wrote: > > > > On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Sebastian Seberg has fixed one class of test failures due to the indexing >> changes in numpy 1.9.0b1. There are some remaining errors, and in the case >> of the Matplotlib failures, they look to me to be Matplotlib bugs. The 2-d >> arrays that cause the error are returned by the overloaded >> _interpolate_single_key function in CubicTriInterpolator that is >> documented in the base class to return a 1-d array, whereas the actual >> dimensions are of the form (n, 1). The question is, what is the best >> work around here for these sorts errors? Can we afford to break Matplotlib >> and other packages on account of a bug that was previously accepted by >> Numpy? >> > It depends how bad the break is, but in principle I'd say that breaking Matplotlib is not OK. > Perhaps a FutureWarning rather than an error would be more appropriate at >> this point, and that modification would be easy to make. >> > Sounds like a good idea then. Ralf > >> Thoughts? >> >> > I'll add that all of the remaining test failures, with the possible > exception of the Tables errors, look like bugs to me. The Tables errors > result from the fact that in fancy indexing assignment into 1-d array the > right hand side used to be repeated until sufficient values for the > assignment were available. Not sure what to do about that. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jul 4 16:09:45 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jul 2014 21:09:45 +0100 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers wrote: > > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris > wrote: >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris >> wrote: >>> >>> Sebastian Seberg has fixed one class of test failures due to the indexing >>> changes in numpy 1.9.0b1. There are some remaining errors, and in the case >>> of the Matplotlib failures, they look to me to be Matplotlib bugs. The 2-d >>> arrays that cause the error are returned by the overloaded >>> _interpolate_single_key function in CubicTriInterpolator that is documented >>> in the base class to return a 1-d array, whereas the actual dimensions are >>> of the form (n, 1). The question is, what is the best work around here for >>> these sorts errors? Can we afford to break Matplotlib and other packages on >>> account of a bug that was previously accepted by Numpy? 
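[A sketch of the Tables-style breakage mentioned above, as a constructed example rather than PyTables code: older releases silently repeated a too-short right-hand side in fancy-index assignment, whereas 1.9 raises, so the old behaviour has to be spelled out explicitly.]

    import numpy as np

    a = np.zeros(6)
    idx = np.arange(6)
    vals = np.array([1.0, 2.0])

    # a[idx] = vals                       # numpy < 1.9 tiled vals to fill all six
    #                                     # slots; 1.9 raises a shape-mismatch error
    a[idx] = np.resize(vals, idx.shape)   # explicit tiling: [1, 2, 1, 2, 1, 2]
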
> > > It depends how bad the break is, but in principle I'd say that breaking > Matplotlib is not OK. I agree. If it's easy to hack around it and issue a warning for now, and doesn't have other negative consequences, then IMO we should give matplotlib a release or so worth of grace period to fix things. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From charlesr.harris at gmail.com Fri Jul 4 16:33:02 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2014 14:33:02 -0600 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith wrote: > On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers > wrote: > > > > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris > > wrote: > >> > >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris > >> wrote: > >>> > >>> Sebastian Seberg has fixed one class of test failures due to the > indexing > >>> changes in numpy 1.9.0b1. There are some remaining errors, and in the > case > >>> of the Matplotlib failures, they look to me to be Matplotlib bugs. The > 2-d > >>> arrays that cause the error are returned by the overloaded > >>> _interpolate_single_key function in CubicTriInterpolator that is > documented > >>> in the base class to return a 1-d array, whereas the actual dimensions > are > >>> of the form (n, 1). The question is, what is the best work around here > for > >>> these sorts errors? Can we afford to break Matplotlib and other > packages on > >>> account of a bug that was previously accepted by Numpy? > > > > > > It depends how bad the break is, but in principle I'd say that breaking > > Matplotlib is not OK. > > I agree. If it's easy to hack around it and issue a warning for now, > and doesn't have other negative consequences, then IMO we should give > matplotlib a release or so worth of grace period to fix things. > Here is another example, from skimage. ====================================================================== ERROR: test_join.test_relabel_sequential_offset1 ---------------------------------------------------------------------- Traceback (most recent call last): File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in runTest self.test(*self.arg) File "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", line 30, in test_relabel_sequential_offset1 ar_relab, fw, inv = relabel_sequential(ar) File "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", line 127, in relabel_sequential forward_map[labels0] = np.arange(offset, offset + len(labels0) + 1) ValueError: shape mismatch: value array of shape (6,) could not be broadcast to indexing result of shape (5,) Which is pretty clearly a coding error. Unfortunately, the error is in the package rather than the test. The only easy way to fix all of these sorts of things is to revert the indexing changes, and I'm loathe to do that. Grrr... Chuck > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
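[The skimage-style variant from the traceback above, reduced to a few lines; this is a sketch of the same pattern, not scikit-image code, and the real fix belongs in that package. The right-hand side is one element longer than the indexing result, which older numpy tolerated and 1.9 rejects.]

    import numpy as np

    forward_map = np.zeros(10, dtype=np.intp)
    labels0 = np.array([1, 3, 4, 7, 9])   # 5 labels
    offset = 1

    # forward_map[labels0] = np.arange(offset, offset + len(labels0) + 1)
    #   -> ValueError under 1.9: value array of shape (6,) vs indexing result (5,)
    forward_map[labels0] = np.arange(offset, offset + len(labels0))   # lengths match
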
URL: From njs at pobox.com Fri Jul 4 16:41:46 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jul 2014 21:41:46 +0100 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris wrote: > > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith wrote: >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers >> wrote: >> > >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris >> > wrote: >> >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris >> >> wrote: >> >>> >> >>> Sebastian Seberg has fixed one class of test failures due to the >> >>> indexing >> >>> changes in numpy 1.9.0b1. There are some remaining errors, and in the >> >>> case >> >>> of the Matplotlib failures, they look to me to be Matplotlib bugs. The >> >>> 2-d >> >>> arrays that cause the error are returned by the overloaded >> >>> _interpolate_single_key function in CubicTriInterpolator that is >> >>> documented >> >>> in the base class to return a 1-d array, whereas the actual dimensions >> >>> are >> >>> of the form (n, 1). The question is, what is the best work around here >> >>> for >> >>> these sorts errors? Can we afford to break Matplotlib and other >> >>> packages on >> >>> account of a bug that was previously accepted by Numpy? >> > >> > >> > It depends how bad the break is, but in principle I'd say that breaking >> > Matplotlib is not OK. >> >> I agree. If it's easy to hack around it and issue a warning for now, >> and doesn't have other negative consequences, then IMO we should give >> matplotlib a release or so worth of grace period to fix things. > > > Here is another example, from skimage. > > ====================================================================== > ERROR: test_join.test_relabel_sequential_offset1 > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in > runTest > self.test(*self.arg) > File > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", > line 30, in test_relabel_sequential_offset1 > ar_relab, fw, inv = relabel_sequential(ar) > File "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", > line 127, in relabel_sequential > forward_map[labels0] = np.arange(offset, offset + len(labels0) + 1) > ValueError: shape mismatch: value array of shape (6,) could not be broadcast > to indexing result of shape (5,) > > Which is pretty clearly a coding error. Unfortunately, the error is in the > package rather than the test. > > The only easy way to fix all of these sorts of things is to revert the > indexing changes, and I'm loathe to do that. Grrr... Ugh, that's pretty bad :-/. Do you really think we can't use a band-aid over the new indexing code, though? -n -- Nathaniel J. 
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From charlesr.harris at gmail.com Fri Jul 4 16:48:39 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2014 14:48:39 -0600 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith wrote: > On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris > wrote: > > > > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith wrote: > >> > >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers > >> wrote: > >> > > >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris > >> > wrote: > >> >> > >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris > >> >> wrote: > >> >>> > >> >>> Sebastian Seberg has fixed one class of test failures due to the > >> >>> indexing > >> >>> changes in numpy 1.9.0b1. There are some remaining errors, and in > the > >> >>> case > >> >>> of the Matplotlib failures, they look to me to be Matplotlib bugs. > The > >> >>> 2-d > >> >>> arrays that cause the error are returned by the overloaded > >> >>> _interpolate_single_key function in CubicTriInterpolator that is > >> >>> documented > >> >>> in the base class to return a 1-d array, whereas the actual > dimensions > >> >>> are > >> >>> of the form (n, 1). The question is, what is the best work around > here > >> >>> for > >> >>> these sorts errors? Can we afford to break Matplotlib and other > >> >>> packages on > >> >>> account of a bug that was previously accepted by Numpy? > >> > > >> > > >> > It depends how bad the break is, but in principle I'd say that > breaking > >> > Matplotlib is not OK. > >> > >> I agree. If it's easy to hack around it and issue a warning for now, > >> and doesn't have other negative consequences, then IMO we should give > >> matplotlib a release or so worth of grace period to fix things. > > > > > > Here is another example, from skimage. > > > > ====================================================================== > > ERROR: test_join.test_relabel_sequential_offset1 > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in > > runTest > > self.test(*self.arg) > > File > > > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", > > line 30, in test_relabel_sequential_offset1 > > ar_relab, fw, inv = relabel_sequential(ar) > > File "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", > > line 127, in relabel_sequential > > forward_map[labels0] = np.arange(offset, offset + len(labels0) + 1) > > ValueError: shape mismatch: value array of shape (6,) could not be > broadcast > > to indexing result of shape (5,) > > > > Which is pretty clearly a coding error. Unfortunately, the error is in > the > > package rather than the test. > > > > The only easy way to fix all of these sorts of things is to revert the > > indexing changes, and I'm loathe to do that. Grrr... > > Ugh, that's pretty bad :-/. Do you really think we can't use a > band-aid over the new indexing code, though? > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with the code, so it may take a while... Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Fri Jul 4 17:15:01 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jul 2014 22:15:01 +0100 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 9:48 PM, Charles R Harris wrote: > > On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith wrote: >> >> On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris >> wrote: >> > >> > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith wrote: >> >> >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers >> >> wrote: >> >> > >> >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris >> >> > wrote: >> >> >> >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris >> >> >> wrote: >> >> >>> >> >> >>> Sebastian Seberg has fixed one class of test failures due to the >> >> >>> indexing >> >> >>> changes in numpy 1.9.0b1. There are some remaining errors, and in >> >> >>> the >> >> >>> case >> >> >>> of the Matplotlib failures, they look to me to be Matplotlib bugs. >> >> >>> The >> >> >>> 2-d >> >> >>> arrays that cause the error are returned by the overloaded >> >> >>> _interpolate_single_key function in CubicTriInterpolator that is >> >> >>> documented >> >> >>> in the base class to return a 1-d array, whereas the actual >> >> >>> dimensions >> >> >>> are >> >> >>> of the form (n, 1). The question is, what is the best work around >> >> >>> here >> >> >>> for >> >> >>> these sorts errors? Can we afford to break Matplotlib and other >> >> >>> packages on >> >> >>> account of a bug that was previously accepted by Numpy? >> >> > >> >> > >> >> > It depends how bad the break is, but in principle I'd say that >> >> > breaking >> >> > Matplotlib is not OK. >> >> >> >> I agree. If it's easy to hack around it and issue a warning for now, >> >> and doesn't have other negative consequences, then IMO we should give >> >> matplotlib a release or so worth of grace period to fix things. >> > >> > >> > Here is another example, from skimage. >> > >> > ====================================================================== >> > ERROR: test_join.test_relabel_sequential_offset1 >> > ---------------------------------------------------------------------- >> > Traceback (most recent call last): >> > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in >> > runTest >> > self.test(*self.arg) >> > File >> > >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", >> > line 30, in test_relabel_sequential_offset1 >> > ar_relab, fw, inv = relabel_sequential(ar) >> > File >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", >> > line 127, in relabel_sequential >> > forward_map[labels0] = np.arange(offset, offset + len(labels0) + 1) >> > ValueError: shape mismatch: value array of shape (6,) could not be >> > broadcast >> > to indexing result of shape (5,) >> > >> > Which is pretty clearly a coding error. Unfortunately, the error is in >> > the >> > package rather than the test. >> > >> > The only easy way to fix all of these sorts of things is to revert the >> > indexing changes, and I'm loathe to do that. Grrr... >> >> Ugh, that's pretty bad :-/. Do you really think we can't use a >> band-aid over the new indexing code, though? > > > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with the > code, so it may take a while... Fair enough! 
I guess that if what are (arguably) bugs in matplotlib and scikit-image are holding up the numpy release, then it's worth CC'ing their mailing lists in case someone feels like volunteering to fix it... ;-). -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From charlesr.harris at gmail.com Fri Jul 4 17:31:55 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2014 15:31:55 -0600 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 3:15 PM, Nathaniel Smith wrote: > On Fri, Jul 4, 2014 at 9:48 PM, Charles R Harris > wrote: > > > > On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith wrote: > >> > >> On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris > >> wrote: > >> > > >> > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith > wrote: > >> >> > >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers > > >> >> wrote: > >> >> > > >> >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris > >> >> > wrote: > >> >> >> > >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris > >> >> >> wrote: > >> >> >>> > >> >> >>> Sebastian Seberg has fixed one class of test failures due to the > >> >> >>> indexing > >> >> >>> changes in numpy 1.9.0b1. There are some remaining errors, and > in > >> >> >>> the > >> >> >>> case > >> >> >>> of the Matplotlib failures, they look to me to be Matplotlib > bugs. > >> >> >>> The > >> >> >>> 2-d > >> >> >>> arrays that cause the error are returned by the overloaded > >> >> >>> _interpolate_single_key function in CubicTriInterpolator that is > >> >> >>> documented > >> >> >>> in the base class to return a 1-d array, whereas the actual > >> >> >>> dimensions > >> >> >>> are > >> >> >>> of the form (n, 1). The question is, what is the best work around > >> >> >>> here > >> >> >>> for > >> >> >>> these sorts errors? Can we afford to break Matplotlib and other > >> >> >>> packages on > >> >> >>> account of a bug that was previously accepted by Numpy? > >> >> > > >> >> > > >> >> > It depends how bad the break is, but in principle I'd say that > >> >> > breaking > >> >> > Matplotlib is not OK. > >> >> > >> >> I agree. If it's easy to hack around it and issue a warning for now, > >> >> and doesn't have other negative consequences, then IMO we should give > >> >> matplotlib a release or so worth of grace period to fix things. > >> > > >> > > >> > Here is another example, from skimage. > >> > > >> > ====================================================================== > >> > ERROR: test_join.test_relabel_sequential_offset1 > >> > ---------------------------------------------------------------------- > >> > Traceback (most recent call last): > >> > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in > >> > runTest > >> > self.test(*self.arg) > >> > File > >> > > >> > > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", > >> > line 30, in test_relabel_sequential_offset1 > >> > ar_relab, fw, inv = relabel_sequential(ar) > >> > File > >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", > >> > line 127, in relabel_sequential > >> > forward_map[labels0] = np.arange(offset, offset + len(labels0) + > 1) > >> > ValueError: shape mismatch: value array of shape (6,) could not be > >> > broadcast > >> > to indexing result of shape (5,) > >> > > >> > Which is pretty clearly a coding error. Unfortunately, the error is in > >> > the > >> > package rather than the test. 
> >> > > >> > The only easy way to fix all of these sorts of things is to revert the > >> > indexing changes, and I'm loathe to do that. Grrr... > >> > >> Ugh, that's pretty bad :-/. Do you really think we can't use a > >> band-aid over the new indexing code, though? > > > > > > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with the > > code, so it may take a while... > > Fair enough! > > I guess that if what are (arguably) bugs in matplotlib and > scikit-image are holding up the numpy release, then it's worth CC'ing > their mailing lists in case someone feels like volunteering to fix > it... ;-). > I can do that ;) Doesn't help with the release though unless we want to document the errors in the release notes and tell folks to wait on the next release of the packages. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jul 4 17:33:13 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jul 2014 22:33:13 +0100 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 10:31 PM, Charles R Harris wrote: > > On Fri, Jul 4, 2014 at 3:15 PM, Nathaniel Smith wrote: >> >> On Fri, Jul 4, 2014 at 9:48 PM, Charles R Harris >> wrote: >> > >> > On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith wrote: >> >> >> >> On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris >> >> wrote: >> >> > >> >> > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith >> >> > wrote: >> >> >> >> >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers >> >> >> >> >> >> wrote: >> >> >> > >> >> >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris >> >> >> > wrote: >> >> >> >> >> >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris >> >> >> >> wrote: >> >> >> >>> >> >> >> >>> Sebastian Seberg has fixed one class of test failures due to the >> >> >> >>> indexing >> >> >> >>> changes in numpy 1.9.0b1. There are some remaining errors, and >> >> >> >>> in >> >> >> >>> the >> >> >> >>> case >> >> >> >>> of the Matplotlib failures, they look to me to be Matplotlib >> >> >> >>> bugs. >> >> >> >>> The >> >> >> >>> 2-d >> >> >> >>> arrays that cause the error are returned by the overloaded >> >> >> >>> _interpolate_single_key function in CubicTriInterpolator that is >> >> >> >>> documented >> >> >> >>> in the base class to return a 1-d array, whereas the actual >> >> >> >>> dimensions >> >> >> >>> are >> >> >> >>> of the form (n, 1). The question is, what is the best work >> >> >> >>> around >> >> >> >>> here >> >> >> >>> for >> >> >> >>> these sorts errors? Can we afford to break Matplotlib and other >> >> >> >>> packages on >> >> >> >>> account of a bug that was previously accepted by Numpy? >> >> >> > >> >> >> > >> >> >> > It depends how bad the break is, but in principle I'd say that >> >> >> > breaking >> >> >> > Matplotlib is not OK. >> >> >> >> >> >> I agree. If it's easy to hack around it and issue a warning for now, >> >> >> and doesn't have other negative consequences, then IMO we should >> >> >> give >> >> >> matplotlib a release or so worth of grace period to fix things. >> >> > >> >> > >> >> > Here is another example, from skimage. 
>> >> > >> >> > >> >> > ====================================================================== >> >> > ERROR: test_join.test_relabel_sequential_offset1 >> >> > >> >> > ---------------------------------------------------------------------- >> >> > Traceback (most recent call last): >> >> > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in >> >> > runTest >> >> > self.test(*self.arg) >> >> > File >> >> > >> >> > >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", >> >> > line 30, in test_relabel_sequential_offset1 >> >> > ar_relab, fw, inv = relabel_sequential(ar) >> >> > File >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", >> >> > line 127, in relabel_sequential >> >> > forward_map[labels0] = np.arange(offset, offset + len(labels0) + >> >> > 1) >> >> > ValueError: shape mismatch: value array of shape (6,) could not be >> >> > broadcast >> >> > to indexing result of shape (5,) >> >> > >> >> > Which is pretty clearly a coding error. Unfortunately, the error is >> >> > in >> >> > the >> >> > package rather than the test. >> >> > >> >> > The only easy way to fix all of these sorts of things is to revert >> >> > the >> >> > indexing changes, and I'm loathe to do that. Grrr... >> >> >> >> Ugh, that's pretty bad :-/. Do you really think we can't use a >> >> band-aid over the new indexing code, though? >> > >> > >> > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with >> > the >> > code, so it may take a while... >> >> Fair enough! >> >> I guess that if what are (arguably) bugs in matplotlib and >> scikit-image are holding up the numpy release, then it's worth CC'ing >> their mailing lists in case someone feels like volunteering to fix >> it... ;-). > > I can do that ;) Doesn't help with the release though unless we want to > document the errors in the release notes and tell folks to wait on the next > release of the packages. Oh, I meant, in case they want to fix numpy so that their packages don't break :-). -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From charlesr.harris at gmail.com Fri Jul 4 19:07:22 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2014 17:07:22 -0600 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 3:33 PM, Nathaniel Smith wrote: > On Fri, Jul 4, 2014 at 10:31 PM, Charles R Harris > wrote: > > > > On Fri, Jul 4, 2014 at 3:15 PM, Nathaniel Smith wrote: > >> > >> On Fri, Jul 4, 2014 at 9:48 PM, Charles R Harris > >> wrote: > >> > > >> > On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith > wrote: > >> >> > >> >> On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris > >> >> wrote: > >> >> > > >> >> > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith > >> >> > wrote: > >> >> >> > >> >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers > >> >> >> > >> >> >> wrote: > >> >> >> > > >> >> >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris > >> >> >> > wrote: > >> >> >> >> > >> >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris > >> >> >> >> wrote: > >> >> >> >>> > >> >> >> >>> Sebastian Seberg has fixed one class of test failures due to > the > >> >> >> >>> indexing > >> >> >> >>> changes in numpy 1.9.0b1. There are some remaining errors, > and > >> >> >> >>> in > >> >> >> >>> the > >> >> >> >>> case > >> >> >> >>> of the Matplotlib failures, they look to me to be Matplotlib > >> >> >> >>> bugs. 
> >> >> >> >>> The > >> >> >> >>> 2-d > >> >> >> >>> arrays that cause the error are returned by the overloaded > >> >> >> >>> _interpolate_single_key function in CubicTriInterpolator that > is > >> >> >> >>> documented > >> >> >> >>> in the base class to return a 1-d array, whereas the actual > >> >> >> >>> dimensions > >> >> >> >>> are > >> >> >> >>> of the form (n, 1). The question is, what is the best work > >> >> >> >>> around > >> >> >> >>> here > >> >> >> >>> for > >> >> >> >>> these sorts errors? Can we afford to break Matplotlib and > other > >> >> >> >>> packages on > >> >> >> >>> account of a bug that was previously accepted by Numpy? > >> >> >> > > >> >> >> > > >> >> >> > It depends how bad the break is, but in principle I'd say that > >> >> >> > breaking > >> >> >> > Matplotlib is not OK. > >> >> >> > >> >> >> I agree. If it's easy to hack around it and issue a warning for > now, > >> >> >> and doesn't have other negative consequences, then IMO we should > >> >> >> give > >> >> >> matplotlib a release or so worth of grace period to fix things. > >> >> > > >> >> > > >> >> > Here is another example, from skimage. > >> >> > > >> >> > > >> >> > > ====================================================================== > >> >> > ERROR: test_join.test_relabel_sequential_offset1 > >> >> > > >> >> > > ---------------------------------------------------------------------- > >> >> > Traceback (most recent call last): > >> >> > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, > in > >> >> > runTest > >> >> > self.test(*self.arg) > >> >> > File > >> >> > > >> >> > > >> >> > > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", > >> >> > line 30, in test_relabel_sequential_offset1 > >> >> > ar_relab, fw, inv = relabel_sequential(ar) > >> >> > File > >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", > >> >> > line 127, in relabel_sequential > >> >> > forward_map[labels0] = np.arange(offset, offset + len(labels0) > + > >> >> > 1) > >> >> > ValueError: shape mismatch: value array of shape (6,) could not be > >> >> > broadcast > >> >> > to indexing result of shape (5,) > >> >> > > >> >> > Which is pretty clearly a coding error. Unfortunately, the error is > >> >> > in > >> >> > the > >> >> > package rather than the test. > >> >> > > >> >> > The only easy way to fix all of these sorts of things is to revert > >> >> > the > >> >> > indexing changes, and I'm loathe to do that. Grrr... > >> >> > >> >> Ugh, that's pretty bad :-/. Do you really think we can't use a > >> >> band-aid over the new indexing code, though? > >> > > >> > > >> > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with > >> > the > >> > code, so it may take a while... > >> > >> Fair enough! > >> > >> I guess that if what are (arguably) bugs in matplotlib and > >> scikit-image are holding up the numpy release, then it's worth CC'ing > >> their mailing lists in case someone feels like volunteering to fix > >> it... ;-). > > > > I can do that ;) Doesn't help with the release though unless we want to > > document the errors in the release notes and tell folks to wait on the > next > > release of the packages. > > Oh, I meant, in case they want to fix numpy so that their packages > don't break :-). > > I've filed issues with all the affected projects. Here is the current status. matplotlib -- Reported, being fixed, should be in 1.4 in a few days. skimage -- Reported. scikit-learn -- Reported. tables -- Reported. statsmodels -- Reported, fixed in master. 
bottleneck -- Reported. IIRC, kwgoodman already knew of the changes. pyfits -- Reported to astropy. milk -- Reported. pandas -- Reportedly fixed in master. If the issues are fixed in matplotlib and pandas I'd be inclined to release as is with a mention of versions in the release notes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Fri Jul 4 19:14:06 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 4 Jul 2014 19:14:06 -0400 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: <190AABD2-4BF4-46AB-BCB6-9E8A2BCE00E7@gmail.com> ok from pandas we test with numpy master on Travis (which does pick up things!) thanks > On Jul 4, 2014, at 7:07 PM, Charles R Harris wrote: > > > > >> On Fri, Jul 4, 2014 at 3:33 PM, Nathaniel Smith wrote: >> On Fri, Jul 4, 2014 at 10:31 PM, Charles R Harris >> wrote: >> > >> > On Fri, Jul 4, 2014 at 3:15 PM, Nathaniel Smith wrote: >> >> >> >> On Fri, Jul 4, 2014 at 9:48 PM, Charles R Harris >> >> wrote: >> >> > >> >> > On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith wrote: >> >> >> >> >> >> On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris >> >> >> wrote: >> >> >> > >> >> >> > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith >> >> >> > wrote: >> >> >> >> >> >> >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers >> >> >> >> >> >> >> >> wrote: >> >> >> >> > >> >> >> >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris >> >> >> >> > wrote: >> >> >> >> >> >> >> >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris >> >> >> >> >> wrote: >> >> >> >> >>> >> >> >> >> >>> Sebastian Seberg has fixed one class of test failures due to the >> >> >> >> >>> indexing >> >> >> >> >>> changes in numpy 1.9.0b1. There are some remaining errors, and >> >> >> >> >>> in >> >> >> >> >>> the >> >> >> >> >>> case >> >> >> >> >>> of the Matplotlib failures, they look to me to be Matplotlib >> >> >> >> >>> bugs. >> >> >> >> >>> The >> >> >> >> >>> 2-d >> >> >> >> >>> arrays that cause the error are returned by the overloaded >> >> >> >> >>> _interpolate_single_key function in CubicTriInterpolator that is >> >> >> >> >>> documented >> >> >> >> >>> in the base class to return a 1-d array, whereas the actual >> >> >> >> >>> dimensions >> >> >> >> >>> are >> >> >> >> >>> of the form (n, 1). The question is, what is the best work >> >> >> >> >>> around >> >> >> >> >>> here >> >> >> >> >>> for >> >> >> >> >>> these sorts errors? Can we afford to break Matplotlib and other >> >> >> >> >>> packages on >> >> >> >> >>> account of a bug that was previously accepted by Numpy? >> >> >> >> > >> >> >> >> > >> >> >> >> > It depends how bad the break is, but in principle I'd say that >> >> >> >> > breaking >> >> >> >> > Matplotlib is not OK. >> >> >> >> >> >> >> >> I agree. If it's easy to hack around it and issue a warning for now, >> >> >> >> and doesn't have other negative consequences, then IMO we should >> >> >> >> give >> >> >> >> matplotlib a release or so worth of grace period to fix things. >> >> >> > >> >> >> > >> >> >> > Here is another example, from skimage. 
>> >> >> > >> >> >> > >> >> >> > ====================================================================== >> >> >> > ERROR: test_join.test_relabel_sequential_offset1 >> >> >> > >> >> >> > ---------------------------------------------------------------------- >> >> >> > Traceback (most recent call last): >> >> >> > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in >> >> >> > runTest >> >> >> > self.test(*self.arg) >> >> >> > File >> >> >> > >> >> >> > >> >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", >> >> >> > line 30, in test_relabel_sequential_offset1 >> >> >> > ar_relab, fw, inv = relabel_sequential(ar) >> >> >> > File >> >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", >> >> >> > line 127, in relabel_sequential >> >> >> > forward_map[labels0] = np.arange(offset, offset + len(labels0) + >> >> >> > 1) >> >> >> > ValueError: shape mismatch: value array of shape (6,) could not be >> >> >> > broadcast >> >> >> > to indexing result of shape (5,) >> >> >> > >> >> >> > Which is pretty clearly a coding error. Unfortunately, the error is >> >> >> > in >> >> >> > the >> >> >> > package rather than the test. >> >> >> > >> >> >> > The only easy way to fix all of these sorts of things is to revert >> >> >> > the >> >> >> > indexing changes, and I'm loathe to do that. Grrr... >> >> >> >> >> >> Ugh, that's pretty bad :-/. Do you really think we can't use a >> >> >> band-aid over the new indexing code, though? >> >> > >> >> > >> >> > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with >> >> > the >> >> > code, so it may take a while... >> >> >> >> Fair enough! >> >> >> >> I guess that if what are (arguably) bugs in matplotlib and >> >> scikit-image are holding up the numpy release, then it's worth CC'ing >> >> their mailing lists in case someone feels like volunteering to fix >> >> it... ;-). >> > >> > I can do that ;) Doesn't help with the release though unless we want to >> > document the errors in the release notes and tell folks to wait on the next >> > release of the packages. >> >> Oh, I meant, in case they want to fix numpy so that their packages >> don't break :-). > > I've filed issues with all the affected projects. Here is the current status. > > matplotlib -- Reported, being fixed, should be in 1.4 in a few days. > skimage -- Reported. > scikit-learn -- Reported. > tables -- Reported. > statsmodels -- Reported, fixed in master. > bottleneck -- Reported. IIRC, kwgoodman already knew of the changes. > pyfits -- Reported to astropy. > milk -- Reported. > pandas -- Reportedly fixed in master. > > If the issues are fixed in matplotlib and pandas I'd be inclined to release as is with a mention of versions in the release notes. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Fri Jul 4 19:41:17 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jul 2014 00:41:17 +0100 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On 5 Jul 2014 00:07, "Charles R Harris" wrote: > > > > > On Fri, Jul 4, 2014 at 3:33 PM, Nathaniel Smith wrote: >> >> On Fri, Jul 4, 2014 at 10:31 PM, Charles R Harris >> wrote: >> > >> > On Fri, Jul 4, 2014 at 3:15 PM, Nathaniel Smith wrote: >> >> >> >> On Fri, Jul 4, 2014 at 9:48 PM, Charles R Harris >> >> wrote: >> >> > >> >> > On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith wrote: >> >> >> >> >> >> On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris >> >> >> wrote: >> >> >> > >> >> >> > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith >> >> >> > wrote: >> >> >> >> >> >> >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers >> >> >> >> >> >> >> >> wrote: >> >> >> >> > >> >> >> >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris >> >> >> >> > wrote: >> >> >> >> >> >> >> >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris >> >> >> >> >> wrote: >> >> >> >> >>> >> >> >> >> >>> Sebastian Seberg has fixed one class of test failures due to the >> >> >> >> >>> indexing >> >> >> >> >>> changes in numpy 1.9.0b1. There are some remaining errors, and >> >> >> >> >>> in >> >> >> >> >>> the >> >> >> >> >>> case >> >> >> >> >>> of the Matplotlib failures, they look to me to be Matplotlib >> >> >> >> >>> bugs. >> >> >> >> >>> The >> >> >> >> >>> 2-d >> >> >> >> >>> arrays that cause the error are returned by the overloaded >> >> >> >> >>> _interpolate_single_key function in CubicTriInterpolator that is >> >> >> >> >>> documented >> >> >> >> >>> in the base class to return a 1-d array, whereas the actual >> >> >> >> >>> dimensions >> >> >> >> >>> are >> >> >> >> >>> of the form (n, 1). The question is, what is the best work >> >> >> >> >>> around >> >> >> >> >>> here >> >> >> >> >>> for >> >> >> >> >>> these sorts errors? Can we afford to break Matplotlib and other >> >> >> >> >>> packages on >> >> >> >> >>> account of a bug that was previously accepted by Numpy? >> >> >> >> > >> >> >> >> > >> >> >> >> > It depends how bad the break is, but in principle I'd say that >> >> >> >> > breaking >> >> >> >> > Matplotlib is not OK. >> >> >> >> >> >> >> >> I agree. If it's easy to hack around it and issue a warning for now, >> >> >> >> and doesn't have other negative consequences, then IMO we should >> >> >> >> give >> >> >> >> matplotlib a release or so worth of grace period to fix things. >> >> >> > >> >> >> > >> >> >> > Here is another example, from skimage. 
>> >> >> > >> >> >> > >> >> >> > ====================================================================== >> >> >> > ERROR: test_join.test_relabel_sequential_offset1 >> >> >> > >> >> >> > ---------------------------------------------------------------------- >> >> >> > Traceback (most recent call last): >> >> >> > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in >> >> >> > runTest >> >> >> > self.test(*self.arg) >> >> >> > File >> >> >> > >> >> >> > >> >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", >> >> >> > line 30, in test_relabel_sequential_offset1 >> >> >> > ar_relab, fw, inv = relabel_sequential(ar) >> >> >> > File >> >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", >> >> >> > line 127, in relabel_sequential >> >> >> > forward_map[labels0] = np.arange(offset, offset + len(labels0) + >> >> >> > 1) >> >> >> > ValueError: shape mismatch: value array of shape (6,) could not be >> >> >> > broadcast >> >> >> > to indexing result of shape (5,) >> >> >> > >> >> >> > Which is pretty clearly a coding error. Unfortunately, the error is >> >> >> > in >> >> >> > the >> >> >> > package rather than the test. >> >> >> > >> >> >> > The only easy way to fix all of these sorts of things is to revert >> >> >> > the >> >> >> > indexing changes, and I'm loathe to do that. Grrr... >> >> >> >> >> >> Ugh, that's pretty bad :-/. Do you really think we can't use a >> >> >> band-aid over the new indexing code, though? >> >> > >> >> > >> >> > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with >> >> > the >> >> > code, so it may take a while... >> >> >> >> Fair enough! >> >> >> >> I guess that if what are (arguably) bugs in matplotlib and >> >> scikit-image are holding up the numpy release, then it's worth CC'ing >> >> their mailing lists in case someone feels like volunteering to fix >> >> it... ;-). >> > >> > I can do that ;) Doesn't help with the release though unless we want to >> > document the errors in the release notes and tell folks to wait on the next >> > release of the packages. >> >> Oh, I meant, in case they want to fix numpy so that their packages >> don't break :-). >> > > I've filed issues with all the affected projects. Here is the current status. > > matplotlib -- Reported, being fixed, should be in 1.4 in a few days. > skimage -- Reported. > scikit-learn -- Reported. > tables -- Reported. > statsmodels -- Reported, fixed in master. > bottleneck -- Reported. IIRC, kwgoodman already knew of the changes. > pyfits -- Reported to astropy. > milk -- Reported. > pandas -- Reportedly fixed in master. That is a massive pile of affected projects :-(. My worry is that if all these projects we know about are broken, then how many other codebases that we aren't testing are also broken? > If the issues are fixed in matplotlib and pandas I'd be inclined to release as is with a mention of versions in the release notes. Even if it's fixed in pandas master, how long until it's in user's hands? -n > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jeffreback at gmail.com Fri Jul 4 19:43:28 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 4 Jul 2014 19:43:28 -0400 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: <2DCD0E92-DC55-4662-BAA2-184FEAEEE471@gmail.com> pandas 0.14.1 scheduled for end of next week (was waiting to see schedule for numpy 1.9) but works either way > On Jul 4, 2014, at 7:41 PM, Nathaniel Smith wrote: > > On 5 Jul 2014 00:07, "Charles R Harris" wrote: > > > > > > > > > > On Fri, Jul 4, 2014 at 3:33 PM, Nathaniel Smith wrote: > >> > >> On Fri, Jul 4, 2014 at 10:31 PM, Charles R Harris > >> wrote: > >> > > >> > On Fri, Jul 4, 2014 at 3:15 PM, Nathaniel Smith wrote: > >> >> > >> >> On Fri, Jul 4, 2014 at 9:48 PM, Charles R Harris > >> >> wrote: > >> >> > > >> >> > On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith wrote: > >> >> >> > >> >> >> On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris > >> >> >> wrote: > >> >> >> > > >> >> >> > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith > >> >> >> > wrote: > >> >> >> >> > >> >> >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers > >> >> >> >> > >> >> >> >> wrote: > >> >> >> >> > > >> >> >> >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris > >> >> >> >> > wrote: > >> >> >> >> >> > >> >> >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris > >> >> >> >> >> wrote: > >> >> >> >> >>> > >> >> >> >> >>> Sebastian Seberg has fixed one class of test failures due to the > >> >> >> >> >>> indexing > >> >> >> >> >>> changes in numpy 1.9.0b1. There are some remaining errors, and > >> >> >> >> >>> in > >> >> >> >> >>> the > >> >> >> >> >>> case > >> >> >> >> >>> of the Matplotlib failures, they look to me to be Matplotlib > >> >> >> >> >>> bugs. > >> >> >> >> >>> The > >> >> >> >> >>> 2-d > >> >> >> >> >>> arrays that cause the error are returned by the overloaded > >> >> >> >> >>> _interpolate_single_key function in CubicTriInterpolator that is > >> >> >> >> >>> documented > >> >> >> >> >>> in the base class to return a 1-d array, whereas the actual > >> >> >> >> >>> dimensions > >> >> >> >> >>> are > >> >> >> >> >>> of the form (n, 1). The question is, what is the best work > >> >> >> >> >>> around > >> >> >> >> >>> here > >> >> >> >> >>> for > >> >> >> >> >>> these sorts errors? Can we afford to break Matplotlib and other > >> >> >> >> >>> packages on > >> >> >> >> >>> account of a bug that was previously accepted by Numpy? > >> >> >> >> > > >> >> >> >> > > >> >> >> >> > It depends how bad the break is, but in principle I'd say that > >> >> >> >> > breaking > >> >> >> >> > Matplotlib is not OK. > >> >> >> >> > >> >> >> >> I agree. If it's easy to hack around it and issue a warning for now, > >> >> >> >> and doesn't have other negative consequences, then IMO we should > >> >> >> >> give > >> >> >> >> matplotlib a release or so worth of grace period to fix things. > >> >> >> > > >> >> >> > > >> >> >> > Here is another example, from skimage. 
> >> >> >> > > >> >> >> > > >> >> >> > ====================================================================== > >> >> >> > ERROR: test_join.test_relabel_sequential_offset1 > >> >> >> > > >> >> >> > ---------------------------------------------------------------------- > >> >> >> > Traceback (most recent call last): > >> >> >> > File "X:\Python27-x64\lib\site-packages\nose\case.py", line 197, in > >> >> >> > runTest > >> >> >> > self.test(*self.arg) > >> >> >> > File > >> >> >> > > >> >> >> > > >> >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", > >> >> >> > line 30, in test_relabel_sequential_offset1 > >> >> >> > ar_relab, fw, inv = relabel_sequential(ar) > >> >> >> > File > >> >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", > >> >> >> > line 127, in relabel_sequential > >> >> >> > forward_map[labels0] = np.arange(offset, offset + len(labels0) + > >> >> >> > 1) > >> >> >> > ValueError: shape mismatch: value array of shape (6,) could not be > >> >> >> > broadcast > >> >> >> > to indexing result of shape (5,) > >> >> >> > > >> >> >> > Which is pretty clearly a coding error. Unfortunately, the error is > >> >> >> > in > >> >> >> > the > >> >> >> > package rather than the test. > >> >> >> > > >> >> >> > The only easy way to fix all of these sorts of things is to revert > >> >> >> > the > >> >> >> > indexing changes, and I'm loathe to do that. Grrr... > >> >> >> > >> >> >> Ugh, that's pretty bad :-/. Do you really think we can't use a > >> >> >> band-aid over the new indexing code, though? > >> >> > > >> >> > > >> >> > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with > >> >> > the > >> >> > code, so it may take a while... > >> >> > >> >> Fair enough! > >> >> > >> >> I guess that if what are (arguably) bugs in matplotlib and > >> >> scikit-image are holding up the numpy release, then it's worth CC'ing > >> >> their mailing lists in case someone feels like volunteering to fix > >> >> it... ;-). > >> > > >> > I can do that ;) Doesn't help with the release though unless we want to > >> > document the errors in the release notes and tell folks to wait on the next > >> > release of the packages. > >> > >> Oh, I meant, in case they want to fix numpy so that their packages > >> don't break :-). > >> > > > > I've filed issues with all the affected projects. Here is the current status. > > > > matplotlib -- Reported, being fixed, should be in 1.4 in a few days. > > skimage -- Reported. > > scikit-learn -- Reported. > > tables -- Reported. > > statsmodels -- Reported, fixed in master. > > bottleneck -- Reported. IIRC, kwgoodman already knew of the changes. > > pyfits -- Reported to astropy. > > milk -- Reported. > > pandas -- Reportedly fixed in master. > > That is a massive pile of affected projects :-(. > > My worry is that if all these projects we know about are broken, then how many other codebases that we aren't testing are also broken? > > > If the issues are fixed in matplotlib and pandas I'd be inclined to release as is with a mention of versions in the release notes. > > Even if it's fixed in pandas master, how long until it's in user's hands? 
> > -n > > > Chuck > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 4 22:25:45 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2014 20:25:45 -0600 Subject: [Numpy-discussion] Remove bento from numpy Message-ID: Ralf likes the speed of bento, but it is not currently maintained and does not properly build numpy with all the optimizations added by Julian. I find the usual setup.py method fast enough and it has the advantage that all the numpy developers can deal with it. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 4 22:56:18 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2014 20:56:18 -0600 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Fri, Jul 4, 2014 at 5:07 PM, Charles R Harris wrote: > > > > On Fri, Jul 4, 2014 at 3:33 PM, Nathaniel Smith wrote: > >> On Fri, Jul 4, 2014 at 10:31 PM, Charles R Harris >> wrote: >> > >> > On Fri, Jul 4, 2014 at 3:15 PM, Nathaniel Smith wrote: >> >> >> >> On Fri, Jul 4, 2014 at 9:48 PM, Charles R Harris >> >> wrote: >> >> > >> >> > On Fri, Jul 4, 2014 at 2:41 PM, Nathaniel Smith >> wrote: >> >> >> >> >> >> On Fri, Jul 4, 2014 at 9:33 PM, Charles R Harris >> >> >> wrote: >> >> >> > >> >> >> > On Fri, Jul 4, 2014 at 2:09 PM, Nathaniel Smith >> >> >> > wrote: >> >> >> >> >> >> >> >> On Fri, Jul 4, 2014 at 9:02 PM, Ralf Gommers >> >> >> >> >> >> >> >> wrote: >> >> >> >> > >> >> >> >> > On Fri, Jul 4, 2014 at 10:00 PM, Charles R Harris >> >> >> >> > wrote: >> >> >> >> >> >> >> >> >> >> On Fri, Jul 4, 2014 at 1:42 PM, Charles R Harris >> >> >> >> >> wrote: >> >> >> >> >>> >> >> >> >> >>> Sebastian Seberg has fixed one class of test failures due to >> the >> >> >> >> >>> indexing >> >> >> >> >>> changes in numpy 1.9.0b1. There are some remaining errors, >> and >> >> >> >> >>> in >> >> >> >> >>> the >> >> >> >> >>> case >> >> >> >> >>> of the Matplotlib failures, they look to me to be Matplotlib >> >> >> >> >>> bugs. >> >> >> >> >>> The >> >> >> >> >>> 2-d >> >> >> >> >>> arrays that cause the error are returned by the overloaded >> >> >> >> >>> _interpolate_single_key function in CubicTriInterpolator >> that is >> >> >> >> >>> documented >> >> >> >> >>> in the base class to return a 1-d array, whereas the actual >> >> >> >> >>> dimensions >> >> >> >> >>> are >> >> >> >> >>> of the form (n, 1). The question is, what is the best work >> >> >> >> >>> around >> >> >> >> >>> here >> >> >> >> >>> for >> >> >> >> >>> these sorts errors? Can we afford to break Matplotlib and >> other >> >> >> >> >>> packages on >> >> >> >> >>> account of a bug that was previously accepted by Numpy? >> >> >> >> > >> >> >> >> > >> >> >> >> > It depends how bad the break is, but in principle I'd say that >> >> >> >> > breaking >> >> >> >> > Matplotlib is not OK. >> >> >> >> >> >> >> >> I agree. 
If it's easy to hack around it and issue a warning for >> now, >> >> >> >> and doesn't have other negative consequences, then IMO we should >> >> >> >> give >> >> >> >> matplotlib a release or so worth of grace period to fix things. >> >> >> > >> >> >> > >> >> >> > Here is another example, from skimage. >> >> >> > >> >> >> > >> >> >> > >> ====================================================================== >> >> >> > ERROR: test_join.test_relabel_sequential_offset1 >> >> >> > >> >> >> > >> ---------------------------------------------------------------------- >> >> >> > Traceback (most recent call last): >> >> >> > File "X:\Python27-x64\lib\site-packages\nose\case.py", line >> 197, in >> >> >> > runTest >> >> >> > self.test(*self.arg) >> >> >> > File >> >> >> > >> >> >> > >> >> >> > >> "X:\Python27-x64\lib\site-packages\skimage\segmentation\tests\test_join.py", >> >> >> > line 30, in test_relabel_sequential_offset1 >> >> >> > ar_relab, fw, inv = relabel_sequential(ar) >> >> >> > File >> >> >> > "X:\Python27-x64\lib\site-packages\skimage\segmentation\_join.py", >> >> >> > line 127, in relabel_sequential >> >> >> > forward_map[labels0] = np.arange(offset, offset + >> len(labels0) + >> >> >> > 1) >> >> >> > ValueError: shape mismatch: value array of shape (6,) could not be >> >> >> > broadcast >> >> >> > to indexing result of shape (5,) >> >> >> > >> >> >> > Which is pretty clearly a coding error. Unfortunately, the error >> is >> >> >> > in >> >> >> > the >> >> >> > package rather than the test. >> >> >> > >> >> >> > The only easy way to fix all of these sorts of things is to revert >> >> >> > the >> >> >> > indexing changes, and I'm loathe to do that. Grrr... >> >> >> >> >> >> Ugh, that's pretty bad :-/. Do you really think we can't use a >> >> >> band-aid over the new indexing code, though? >> >> > >> >> > >> >> > Yeah, we can. But Sebastian doesn't have time and I'm unfamiliar with >> >> > the >> >> > code, so it may take a while... >> >> >> >> Fair enough! >> >> >> >> I guess that if what are (arguably) bugs in matplotlib and >> >> scikit-image are holding up the numpy release, then it's worth CC'ing >> >> their mailing lists in case someone feels like volunteering to fix >> >> it... ;-). >> > >> > I can do that ;) Doesn't help with the release though unless we want to >> > document the errors in the release notes and tell folks to wait on the >> next >> > release of the packages. >> >> Oh, I meant, in case they want to fix numpy so that their packages >> don't break :-). >> >> > I've filed issues with all the affected projects. Here is the current > status. > > matplotlib -- Reported, being fixed, should be in 1.4 in a few days. > skimage -- Reported. > scikit-learn -- Reported. > tables -- Reported. > statsmodels -- Reported, fixed in master. > bottleneck -- Reported. IIRC, kwgoodman already knew of the changes. > pyfits -- Reported to astropy. > milk -- Reported. > pandas -- Reportedly fixed in master. > skimage is now fixed in master. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Jul 5 04:13:46 2014 From: cournape at gmail.com (David Cournapeau) Date: Sat, 5 Jul 2014 17:13:46 +0900 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris wrote: > Ralf likes the speed of bento, but it is not currently maintained > What exactly is not maintained ? David -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Sat Jul 5 04:14:21 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 5 Jul 2014 10:14:21 +0200 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 1:41 AM, Nathaniel Smith wrote: > On 5 Jul 2014 00:07, "Charles R Harris" wrote: > > I've filed issues with all the affected projects. Here is the current > status. > > > > matplotlib -- Reported, being fixed, should be in 1.4 in a few days. > > skimage -- Reported. > > scikit-learn -- Reported. > > tables -- Reported. > > statsmodels -- Reported, fixed in master. > > bottleneck -- Reported. IIRC, kwgoodman already knew of the changes. > > pyfits -- Reported to astropy. > > milk -- Reported. > > pandas -- Reportedly fixed in master. > > That is a massive pile of affected projects :-(. > > My worry is that if all these projects we know about are broken, then how > many other codebases that we aren't testing are also broken? > Same worry here. If a major change in numpy breaks ~half of the projects that make up a typical scipy stack, that change should not be made without at least one release that emits warnings first. We would have caught this much earlier had we had something like https://github.com/matthew-brett/scipy-stack-osx-testing. Maybe a good idea to have that as a separate repo in the numpy org, add a few more projects to it, and then regularly run numpy master (or a PR) against the latest releases of those projects. Ralf > If the issues are fixed in matplotlib and pandas I'd be inclined to > release as is with a mention of versions in the release notes. > > Even if it's fixed in pandas master, how long until it's in user's hands? > > -n > > > Chuck > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Jul 5 04:23:26 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 5 Jul 2014 10:23:26 +0200 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau wrote: > > > > On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Ralf likes the speed of bento, but it is not currently maintained >> > > What exactly is not maintained ? > The issue is that Julian made some slightly nontrivial changes to core/setup.py and didn't want to update core/bscript. No one else has taken the time either to make those changes. That didn't bother me enough yet to go fix it, because they're all optional features and using Bento builds works just fine at the moment (and is part of the Travis CI test runs, so it'll keep working). I don't think the above is a good reason to remove Bento support. The much faster builds alone are a good reason to keep it. 
And the assertion that all numpy devs understand numpy.distutils is more than a little questionable:) Ralf > > David > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Jul 5 04:44:38 2014 From: cournape at gmail.com (David Cournapeau) Date: Sat, 5 Jul 2014 17:44:38 +0900 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 5:23 PM, Ralf Gommers wrote: > > > > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau > wrote: > >> >> >> >> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> Ralf likes the speed of bento, but it is not currently maintained >>> >> >> What exactly is not maintained ? >> > > The issue is that Julian made some slightly nontrivial changes to > core/setup.py and didn't want to update core/bscript. No one else has taken > the time either to make those changes. That didn't bother me enough yet to > go fix it, because they're all optional features and using Bento builds > works just fine at the moment (and is part of the Travis CI test runs, so > it'll keep working). > What are those changes ? David -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Jul 5 05:02:14 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 5 Jul 2014 11:02:14 +0200 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 10:44 AM, David Cournapeau wrote: > > > > On Sat, Jul 5, 2014 at 5:23 PM, Ralf Gommers > wrote: > >> >> >> >> On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau >> wrote: >> >>> >>> >>> >>> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> Ralf likes the speed of bento, but it is not currently maintained >>>> >>> >>> What exactly is not maintained ? >>> >> >> The issue is that Julian made some slightly nontrivial changes to >> core/setup.py and didn't want to update core/bscript. No one else has taken >> the time either to make those changes. That didn't bother me enough yet to >> go fix it, because they're all optional features and using Bento builds >> works just fine at the moment (and is part of the Travis CI test runs, so >> it'll keep working). >> > > What are those changes? > Comment in bscript: # TODO: add OPTIONAL_HEADERS, OPTIONAL_INTRINSICS and # OPTIONAL_GCC_ATTRIBUTES (see setup.py and gh-3766). These are # performance optimizations for GCC. Plus the changes in https://github.com/numpy/numpy/pull/4692, that apparently weren't documented in bscript as TODO. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jtaylor.debian at googlemail.com Sat Jul 5 07:32:49 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 05 Jul 2014 13:32:49 +0200 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: <53B7E261.3010801@googlemail.com> On 05.07.2014 11:02, Ralf Gommers wrote: > > > > On Sat, Jul 5, 2014 at 10:44 AM, David Cournapeau > wrote: > > > > > On Sat, Jul 5, 2014 at 5:23 PM, Ralf Gommers > wrote: > > > > > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau > > wrote: > > > > > On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris > > wrote: > > Ralf likes the speed of bento, but it is not currently > maintained > > > What exactly is not maintained ? > > > The issue is that Julian made some slightly nontrivial changes > to core/setup.py and didn't want to update core/bscript. No one > else has taken the time either to make those changes. That > didn't bother me enough yet to go fix it, because they're all > optional features and using Bento builds works just fine at the > moment (and is part of the Travis CI test runs, so it'll keep > working). > > > What are those changes? > > > Comment in bscript: > > # TODO: add OPTIONAL_HEADERS, OPTIONAL_INTRINSICS and > # OPTIONAL_GCC_ATTRIBUTES (see setup.py and gh-3766). These are > # performance optimizations for GCC. > > Plus the changes in https://github.com/numpy/numpy/pull/4692, that > apparently weren't documented in bscript as TODO. > + bento builds in debug mode which is could be slower because I sprinkled asserts in lots of places From njs at pobox.com Sat Jul 5 07:54:06 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jul 2014 12:54:06 +0100 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On 5 Jul 2014 09:23, "Ralf Gommers" wrote: > > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau wrote: >> >> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: >>> >>> Ralf likes the speed of bento, but it is not currently maintained >> >> >> What exactly is not maintained ? > > > The issue is that Julian made some slightly nontrivial changes to core/setup.py and didn't want to update core/bscript. No one else has taken the time either to make those changes. That didn't bother me enough yet to go fix it, because they're all optional features and using Bento builds works just fine at the moment (and is part of the Travis CI test runs, so it'll keep working). Perhaps a compromise would be to declare it officially unsupported and remove it from Travis CI, while leaving the files in place to be used on an at-your-own-risk basis? As long as it's in Travis, the default is that anyone who breaks it has to fix it. If it's not in Travis, then the default is that the people (person?) who use bento are responsible for keeping it working for their needs. > I don't think the above is a good reason to remove Bento support. The much faster builds alone are a good reason to keep it. And the assertion that all numpy devs understand numpy.distutils is more than a little questionable:) They surely don't. But thousands of people use setup.py, and one or two use bento. Yet supporting both requires twice as much energy and attention as supporting just one. We've probably spent more person-hours talking about this, documenting the missing bscript bits, etc. than you've saved on those fast builds. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Sat Jul 5 09:32:03 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 5 Jul 2014 15:32:03 +0200 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 1:54 PM, Nathaniel Smith wrote: > On 5 Jul 2014 09:23, "Ralf Gommers" wrote: > > > > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau > wrote: > >> > >> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >>> > >>> Ralf likes the speed of bento, but it is not currently maintained > >> > >> > >> What exactly is not maintained ? > > > > > > The issue is that Julian made some slightly nontrivial changes to > core/setup.py and didn't want to update core/bscript. No one else has taken > the time either to make those changes. That didn't bother me enough yet to > go fix it, because they're all optional features and using Bento builds > works just fine at the moment (and is part of the Travis CI test runs, so > it'll keep working). > > Perhaps a compromise would be to declare it officially unsupported and > remove it from Travis CI, while leaving the files in place to be used on an > at-your-own-risk basis? As long as it's in Travis, the default is that > anyone who breaks it has to fix it. If it's not in Travis, then the default > is that the people (person?) who use bento are responsible for keeping it > working for their needs. > -1 that just means that simple changes like adding a new extension will not get made before PRs get merged, and bento support will be in a broken state much more often. > > I don't think the above is a good reason to remove Bento support. The > much faster builds alone are a good reason to keep it. And the assertion > that all numpy devs understand numpy.distutils is more than a little > questionable:) > > They surely don't. But thousands of people use setup.py, and one or two > use bento. > I'm getting a little tired of these assertions. It's clear that David and I use it. A cursory search on Github reveals that Stefan, Fabian, Jonas and @aksarkar do (or did) as well: https://github.com/scipy/scipy/commit/74d823b3 https://github.com/numpy/numpy/issues/2993 https://github.com/numpy/numpy/pull/3606 https://github.com/numpy/numpy/issues/3889 For every user you can measure there's usually a number of users that you don't hear about. > Yet supporting both requires twice as much energy and attention as > supporting just one. > That's of course not true. For most changes the differences in where and how to update the build systems are small. Only for unusual changes like Julian patches to make use of optional GCC features, Bento and distutils may require very different changes. > We've probably spent more person-hours talking about this, documenting the > missing bscript bits, etc. than you've saved on those fast builds. > Then maybe stop talking about it:) Besides the fast builds, which is only one example of why I like Bento better, there's also the fundamental question of what we do with build tools in the long term. It's clear that distutils is a dead end. All the PEPs related to packaging move in the direction of supporting tools like Bento better. If in the future we need significant new features in our build tool, Bento is a much better base to build on than numpy.distutils. It's unfortunate that at the moment there's no one that works on improving our build situation, but that is what it is. Removing Bento support is a step in the wrong direction imho. 
Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Jul 5 10:17:27 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jul 2014 15:17:27 +0100 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 2:32 PM, Ralf Gommers wrote: > > On Sat, Jul 5, 2014 at 1:54 PM, Nathaniel Smith wrote: >> >> On 5 Jul 2014 09:23, "Ralf Gommers" wrote: >> > >> > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau >> > wrote: >> >> >> >> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris >> >> wrote: >> >>> >> >>> Ralf likes the speed of bento, but it is not currently maintained >> >> >> >> >> >> What exactly is not maintained ? >> > >> > >> > The issue is that Julian made some slightly nontrivial changes to >> > core/setup.py and didn't want to update core/bscript. No one else has taken >> > the time either to make those changes. That didn't bother me enough yet to >> > go fix it, because they're all optional features and using Bento builds >> > works just fine at the moment (and is part of the Travis CI test runs, so >> > it'll keep working). >> >> Perhaps a compromise would be to declare it officially unsupported and >> remove it from Travis CI, while leaving the files in place to be used on an >> at-your-own-risk basis? As long as it's in Travis, the default is that >> anyone who breaks it has to fix it. If it's not in Travis, then the default >> is that the people (person?) who use bento are responsible for keeping it >> working for their needs. > > -1 that just means that simple changes like adding a new extension will not > get made before PRs get merged, and bento support will be in a broken state > much more often. Yes, and then the handful of people who care about this would fix it or not. Your -1 is attempting to veto other people's *not* paying attention to this build system. I... don't think -1's work that way :-( >> > I don't think the above is a good reason to remove Bento support. The >> > much faster builds alone are a good reason to keep it. And the assertion >> > that all numpy devs understand numpy.distutils is more than a little >> > questionable:) >> >> They surely don't. But thousands of people use setup.py, and one or two >> use bento. > > I'm getting a little tired of these assertions. It's clear that David and I > use it. A cursory search on Github reveals that Stefan, Fabian, Jonas and > @aksarkar do (or did) as well: > https://github.com/scipy/scipy/commit/74d823b3 > https://github.com/numpy/numpy/issues/2993 > https://github.com/numpy/numpy/pull/3606 > https://github.com/numpy/numpy/issues/3889 > For every user you can measure there's usually a number of users that you > don't hear about. I apologize for forgetting before that you do use Bento, but these patches you're finding don't really change the overall picture. Let's assume that there are 100 people using Bento, who would be slightly inconvenienced if they had to use setup.py instead, or got stuck patching the bento build themselves to keep it working. 100 is probably an order of magnitude too high, but whatever. OTOH numpy has almost 7 million downloads on PyPI+sf.net, of which approximately every one used setup.py one way or another, plus all the people get it from alternative channels like distros, which also AFAIK universally use setup.py. Software development is all about trade-offs. 
Time that numpy developers spend messing about with bento to benefit those hundred users is time that could instead be spent on improvements that benefit many orders of magnitudes more users. Why do you want us to spend our time producing x units of value when we could instead be producing 100*x units of value for the same effort? >> Yet supporting both requires twice as much energy and attention as >> supporting just one. > > That's of course not true. For most changes the differences in where and how > to update the build systems are small. Only for unusual changes like Julian > patches to make use of optional GCC features, Bento and distutils may > require very different changes. >> >> We've probably spent more person-hours talking about this, documenting the >> missing bscript bits, etc. than you've saved on those fast builds. > > Then maybe stop talking about it:) > > Besides the fast builds, which is only one example of why I like Bento > better, there's also the fundamental question of what we do with build tools > in the long term. It's clear that distutils is a dead end. All the PEPs > related to packaging move in the direction of supporting tools like Bento > better. If in the future we need significant new features in our build tool, > Bento is a much better base to build on than numpy.distutils. It's > unfortunate that at the moment there's no one that works on improving our > build situation, but that is what it is. Removing Bento support is a step in > the wrong direction imho. "We must do something! This is something!" Bento is pre-alpha software whose last upstream commit was in July 2013. It's own CI tests have been failing since Feb. 2013, almost a year and a half ago. Bento build support was added to numpy in early 2011, and 3.5 years later it still hasn't convinced most of the core team that it provides any value at all, yet it continues to take up time and attention. Maybe bento will revive and take over the new python packaging world! Maybe not. Maybe something else will. I don't see how our support for it will really affect these outcomes in any way. And I especially don't see why it's important to spend time *now* on keeping bento working, just in case it becomes useful *later*. If it proves valuable later, we can always fix our bscripts then. They won't dissolve irrecoverably out of history no matter what we do. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From cournape at gmail.com Sat Jul 5 10:21:24 2014 From: cournape at gmail.com (David Cournapeau) Date: Sat, 5 Jul 2014 23:21:24 +0900 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith wrote: > On Sat, Jul 5, 2014 at 2:32 PM, Ralf Gommers > wrote: > > > > On Sat, Jul 5, 2014 at 1:54 PM, Nathaniel Smith wrote: > >> > >> On 5 Jul 2014 09:23, "Ralf Gommers" wrote: > >> > > >> > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau > > >> > wrote: > >> >> > >> >> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris > >> >> wrote: > >> >>> > >> >>> Ralf likes the speed of bento, but it is not currently maintained > >> >> > >> >> > >> >> What exactly is not maintained ? > >> > > >> > > >> > The issue is that Julian made some slightly nontrivial changes to > >> > core/setup.py and didn't want to update core/bscript. No one else has > taken > >> > the time either to make those changes. 
That didn't bother me enough > yet to > >> > go fix it, because they're all optional features and using Bento > builds > >> > works just fine at the moment (and is part of the Travis CI test > runs, so > >> > it'll keep working). > >> > >> Perhaps a compromise would be to declare it officially unsupported and > >> remove it from Travis CI, while leaving the files in place to be used > on an > >> at-your-own-risk basis? As long as it's in Travis, the default is that > >> anyone who breaks it has to fix it. If it's not in Travis, then the > default > >> is that the people (person?) who use bento are responsible for keeping > it > >> working for their needs. > > > > -1 that just means that simple changes like adding a new extension will > not > > get made before PRs get merged, and bento support will be in a broken > state > > much more often. > > Yes, and then the handful of people who care about this would fix it > or not. Your -1 is attempting to veto other people's *not* paying > attention to this build system. I... don't think -1's work that way > :-( > > >> > I don't think the above is a good reason to remove Bento support. The > >> > much faster builds alone are a good reason to keep it. And the > assertion > >> > that all numpy devs understand numpy.distutils is more than a little > >> > questionable:) > >> > >> They surely don't. But thousands of people use setup.py, and one or two > >> use bento. > > > > I'm getting a little tired of these assertions. It's clear that David > and I > > use it. A cursory search on Github reveals that Stefan, Fabian, Jonas and > > @aksarkar do (or did) as well: > > https://github.com/scipy/scipy/commit/74d823b3 > > https://github.com/numpy/numpy/issues/2993 > > https://github.com/numpy/numpy/pull/3606 > > https://github.com/numpy/numpy/issues/3889 > > For every user you can measure there's usually a number of users that you > > don't hear about. > > I apologize for forgetting before that you do use Bento, but these > patches you're finding don't really change the overall picture. Let's > assume that there are 100 people using Bento, who would be slightly > inconvenienced if they had to use setup.py instead, or got stuck > patching the bento build themselves to keep it working. 100 is > probably an order of magnitude too high, but whatever. OTOH numpy has > almost 7 million downloads on PyPI+sf.net, of which approximately > every one used setup.py one way or another, plus all the people get it > from alternative channels like distros, which also AFAIK universally > use setup.py. Software development is all about trade-offs. Time that > numpy developers spend messing about with bento to benefit those > hundred users is time that could instead be spent on improvements that > benefit many orders of magnitudes more users. Why do you want us to > spend our time producing x units of value when we could instead be > producing 100*x units of value for the same effort? > > >> Yet supporting both requires twice as much energy and attention as > >> supporting just one. > > > > That's of course not true. For most changes the differences in where and > how > > to update the build systems are small. Only for unusual changes like > Julian > > patches to make use of optional GCC features, Bento and distutils may > > require very different changes. > >> > >> We've probably spent more person-hours talking about this, documenting > the > >> missing bscript bits, etc. than you've saved on those fast builds. 
> > > > Then maybe stop talking about it:) > > > > Besides the fast builds, which is only one example of why I like Bento > > better, there's also the fundamental question of what we do with build > tools > > in the long term. It's clear that distutils is a dead end. All the PEPs > > related to packaging move in the direction of supporting tools like Bento > > better. If in the future we need significant new features in our build > tool, > > Bento is a much better base to build on than numpy.distutils. It's > > unfortunate that at the moment there's no one that works on improving our > > build situation, but that is what it is. Removing Bento support is a > step in > > the wrong direction imho. > > "We must do something! This is something!" > > Bento is pre-alpha software whose last upstream commit was in July > 2013. It's own CI tests have been failing since Feb. 2013, almost a > year and a half ago. Bento build support was added to numpy in early > 2011, and 3.5 years later it still hasn't convinced most of the core > team that it provides any value at all, yet it continues to take up > time and attention. > > Maybe bento will revive and take over the new python packaging world! > Maybe not. Maybe something else will. I don't see how our support for > it will really affect these outcomes in any way. And I especially > don't see why it's important to spend time *now* on keeping bento > working, just in case it becomes useful *later*. But it is working right now, so that argument is moot. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sat Jul 5 10:28:16 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 5 Jul 2014 15:28:16 +0100 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau wrote: > > > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith wrote: >> >> On Sat, Jul 5, 2014 at 2:32 PM, Ralf Gommers >> wrote: >> > >> > On Sat, Jul 5, 2014 at 1:54 PM, Nathaniel Smith wrote: >> >> >> >> On 5 Jul 2014 09:23, "Ralf Gommers" wrote: >> >> > >> >> > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau >> >> > >> >> > wrote: >> >> >> >> >> >> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris >> >> >> wrote: >> >> >>> >> >> >>> Ralf likes the speed of bento, but it is not currently maintained >> >> >> >> >> >> >> >> >> What exactly is not maintained ? >> >> > >> >> > >> >> > The issue is that Julian made some slightly nontrivial changes to >> >> > core/setup.py and didn't want to update core/bscript. No one else has >> >> > taken >> >> > the time either to make those changes. That didn't bother me enough >> >> > yet to >> >> > go fix it, because they're all optional features and using Bento >> >> > builds >> >> > works just fine at the moment (and is part of the Travis CI test >> >> > runs, so >> >> > it'll keep working). >> >> >> >> Perhaps a compromise would be to declare it officially unsupported and >> >> remove it from Travis CI, while leaving the files in place to be used >> >> on an >> >> at-your-own-risk basis? As long as it's in Travis, the default is that >> >> anyone who breaks it has to fix it. If it's not in Travis, then the >> >> default >> >> is that the people (person?) who use bento are responsible for keeping >> >> it >> >> working for their needs. 
>> > >> > -1 that just means that simple changes like adding a new extension will >> > not >> > get made before PRs get merged, and bento support will be in a broken >> > state >> > much more often. >> >> Yes, and then the handful of people who care about this would fix it >> or not. Your -1 is attempting to veto other people's *not* paying >> attention to this build system. I... don't think -1's work that way >> :-( >> >> >> > I don't think the above is a good reason to remove Bento support. The >> >> > much faster builds alone are a good reason to keep it. And the >> >> > assertion >> >> > that all numpy devs understand numpy.distutils is more than a little >> >> > questionable:) >> >> >> >> They surely don't. But thousands of people use setup.py, and one or two >> >> use bento. >> > >> > I'm getting a little tired of these assertions. It's clear that David >> > and I >> > use it. A cursory search on Github reveals that Stefan, Fabian, Jonas >> > and >> > @aksarkar do (or did) as well: >> > https://github.com/scipy/scipy/commit/74d823b3 >> > https://github.com/numpy/numpy/issues/2993 >> > https://github.com/numpy/numpy/pull/3606 >> > https://github.com/numpy/numpy/issues/3889 >> > For every user you can measure there's usually a number of users that >> > you >> > don't hear about. >> >> I apologize for forgetting before that you do use Bento, but these >> patches you're finding don't really change the overall picture. Let's >> assume that there are 100 people using Bento, who would be slightly >> inconvenienced if they had to use setup.py instead, or got stuck >> patching the bento build themselves to keep it working. 100 is >> probably an order of magnitude too high, but whatever. OTOH numpy has >> almost 7 million downloads on PyPI+sf.net, of which approximately >> every one used setup.py one way or another, plus all the people get it >> from alternative channels like distros, which also AFAIK universally >> use setup.py. Software development is all about trade-offs. Time that >> numpy developers spend messing about with bento to benefit those >> hundred users is time that could instead be spent on improvements that >> benefit many orders of magnitudes more users. Why do you want us to >> spend our time producing x units of value when we could instead be >> producing 100*x units of value for the same effort? >> >> >> Yet supporting both requires twice as much energy and attention as >> >> supporting just one. >> > >> > That's of course not true. For most changes the differences in where and >> > how >> > to update the build systems are small. Only for unusual changes like >> > Julian >> > patches to make use of optional GCC features, Bento and distutils may >> > require very different changes. >> >> >> >> We've probably spent more person-hours talking about this, documenting >> >> the >> >> missing bscript bits, etc. than you've saved on those fast builds. >> > >> > Then maybe stop talking about it:) >> > >> > Besides the fast builds, which is only one example of why I like Bento >> > better, there's also the fundamental question of what we do with build >> > tools >> > in the long term. It's clear that distutils is a dead end. All the PEPs >> > related to packaging move in the direction of supporting tools like >> > Bento >> > better. If in the future we need significant new features in our build >> > tool, >> > Bento is a much better base to build on than numpy.distutils. 
It's >> > unfortunate that at the moment there's no one that works on improving >> > our >> > build situation, but that is what it is. Removing Bento support is a >> > step in >> > the wrong direction imho. >> >> "We must do something! This is something!" >> >> Bento is pre-alpha software whose last upstream commit was in July >> 2013. It's own CI tests have been failing since Feb. 2013, almost a >> year and a half ago. Bento build support was added to numpy in early >> 2011, and 3.5 years later it still hasn't convinced most of the core >> team that it provides any value at all, yet it continues to take up >> time and attention. >> >> Maybe bento will revive and take over the new python packaging world! >> Maybe not. Maybe something else will. I don't see how our support for >> it will really affect these outcomes in any way. And I especially >> don't see why it's important to spend time *now* on keeping bento >> working, just in case it becomes useful *later*. > > > But it is working right now, so that argument is moot. Why don't we wait until there is a significant problem with getting the Bento builds to work, and revisit then. Cheers, Matthew From charlesr.harris at gmail.com Sat Jul 5 10:51:56 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 5 Jul 2014 08:51:56 -0600 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 8:28 AM, Matthew Brett wrote: > On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau > wrote: > > > > > > > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith wrote: > >> > >> On Sat, Jul 5, 2014 at 2:32 PM, Ralf Gommers > >> wrote: > >> > > >> > On Sat, Jul 5, 2014 at 1:54 PM, Nathaniel Smith > wrote: > >> >> > >> >> On 5 Jul 2014 09:23, "Ralf Gommers" wrote: > >> >> > > >> >> > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau > >> >> > > >> >> > wrote: > >> >> >> > >> >> >> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris > >> >> >> wrote: > >> >> >>> > >> >> >>> Ralf likes the speed of bento, but it is not currently maintained > >> >> >> > >> >> >> > >> >> >> What exactly is not maintained ? > >> >> > > >> >> > > >> >> > The issue is that Julian made some slightly nontrivial changes to > >> >> > core/setup.py and didn't want to update core/bscript. No one else > has > >> >> > taken > >> >> > the time either to make those changes. That didn't bother me enough > >> >> > yet to > >> >> > go fix it, because they're all optional features and using Bento > >> >> > builds > >> >> > works just fine at the moment (and is part of the Travis CI test > >> >> > runs, so > >> >> > it'll keep working). > >> >> > >> >> Perhaps a compromise would be to declare it officially unsupported > and > >> >> remove it from Travis CI, while leaving the files in place to be used > >> >> on an > >> >> at-your-own-risk basis? As long as it's in Travis, the default is > that > >> >> anyone who breaks it has to fix it. If it's not in Travis, then the > >> >> default > >> >> is that the people (person?) who use bento are responsible for > keeping > >> >> it > >> >> working for their needs. > >> > > >> > -1 that just means that simple changes like adding a new extension > will > >> > not > >> > get made before PRs get merged, and bento support will be in a broken > >> > state > >> > much more often. > >> > >> Yes, and then the handful of people who care about this would fix it > >> or not. Your -1 is attempting to veto other people's *not* paying > >> attention to this build system. I... 
don't think -1's work that way > >> :-( > >> > >> >> > I don't think the above is a good reason to remove Bento support. > The > >> >> > much faster builds alone are a good reason to keep it. And the > >> >> > assertion > >> >> > that all numpy devs understand numpy.distutils is more than a > little > >> >> > questionable:) > >> >> > >> >> They surely don't. But thousands of people use setup.py, and one or > two > >> >> use bento. > >> > > >> > I'm getting a little tired of these assertions. It's clear that David > >> > and I > >> > use it. A cursory search on Github reveals that Stefan, Fabian, Jonas > >> > and > >> > @aksarkar do (or did) as well: > >> > https://github.com/scipy/scipy/commit/74d823b3 > >> > https://github.com/numpy/numpy/issues/2993 > >> > https://github.com/numpy/numpy/pull/3606 > >> > https://github.com/numpy/numpy/issues/3889 > >> > For every user you can measure there's usually a number of users that > >> > you > >> > don't hear about. > >> > >> I apologize for forgetting before that you do use Bento, but these > >> patches you're finding don't really change the overall picture. Let's > >> assume that there are 100 people using Bento, who would be slightly > >> inconvenienced if they had to use setup.py instead, or got stuck > >> patching the bento build themselves to keep it working. 100 is > >> probably an order of magnitude too high, but whatever. OTOH numpy has > >> almost 7 million downloads on PyPI+sf.net, of which approximately > >> every one used setup.py one way or another, plus all the people get it > >> from alternative channels like distros, which also AFAIK universally > >> use setup.py. Software development is all about trade-offs. Time that > >> numpy developers spend messing about with bento to benefit those > >> hundred users is time that could instead be spent on improvements that > >> benefit many orders of magnitudes more users. Why do you want us to > >> spend our time producing x units of value when we could instead be > >> producing 100*x units of value for the same effort? > >> > >> >> Yet supporting both requires twice as much energy and attention as > >> >> supporting just one. > >> > > >> > That's of course not true. For most changes the differences in where > and > >> > how > >> > to update the build systems are small. Only for unusual changes like > >> > Julian > >> > patches to make use of optional GCC features, Bento and distutils may > >> > require very different changes. > >> >> > >> >> We've probably spent more person-hours talking about this, > documenting > >> >> the > >> >> missing bscript bits, etc. than you've saved on those fast builds. > >> > > >> > Then maybe stop talking about it:) > >> > > >> > Besides the fast builds, which is only one example of why I like Bento > >> > better, there's also the fundamental question of what we do with build > >> > tools > >> > in the long term. It's clear that distutils is a dead end. All the > PEPs > >> > related to packaging move in the direction of supporting tools like > >> > Bento > >> > better. If in the future we need significant new features in our build > >> > tool, > >> > Bento is a much better base to build on than numpy.distutils. It's > >> > unfortunate that at the moment there's no one that works on improving > >> > our > >> > build situation, but that is what it is. Removing Bento support is a > >> > step in > >> > the wrong direction imho. > >> > >> "We must do something! This is something!" 
> >> > >> Bento is pre-alpha software whose last upstream commit was in July > >> 2013. It's own CI tests have been failing since Feb. 2013, almost a > >> year and a half ago. Bento build support was added to numpy in early > >> 2011, and 3.5 years later it still hasn't convinced most of the core > >> team that it provides any value at all, yet it continues to take up > >> time and attention. > >> > >> Maybe bento will revive and take over the new python packaging world! > >> Maybe not. Maybe something else will. I don't see how our support for > >> it will really affect these outcomes in any way. And I especially > >> don't see why it's important to spend time *now* on keeping bento > >> working, just in case it becomes useful *later*. > > > > > > But it is working right now, so that argument is moot. > > Why don't we wait until there is a significant problem with getting > the Bento builds to work, and revisit then. > > My feeling is that it is deceptive, as most folks who might use bento won't know that some optimizations are missing from the result. David, I have pinged you a number of times about getting the numpy bento build updated. The fact that bento builds numpy without failing is not the same as bento building numpy in the best way. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Jul 5 11:05:25 2014 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Jul 2014 00:05:25 +0900 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 11:51 PM, Charles R Harris wrote: > > > > On Sat, Jul 5, 2014 at 8:28 AM, Matthew Brett > wrote: > >> On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau >> wrote: >> > >> > >> > >> > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith wrote: >> >> >> >> On Sat, Jul 5, 2014 at 2:32 PM, Ralf Gommers >> >> wrote: >> >> > >> >> > On Sat, Jul 5, 2014 at 1:54 PM, Nathaniel Smith >> wrote: >> >> >> >> >> >> On 5 Jul 2014 09:23, "Ralf Gommers" wrote: >> >> >> > >> >> >> > On Sat, Jul 5, 2014 at 10:13 AM, David Cournapeau >> >> >> > >> >> >> > wrote: >> >> >> >> >> >> >> >> On Sat, Jul 5, 2014 at 11:25 AM, Charles R Harris >> >> >> >> wrote: >> >> >> >>> >> >> >> >>> Ralf likes the speed of bento, but it is not currently >> maintained >> >> >> >> >> >> >> >> >> >> >> >> What exactly is not maintained ? >> >> >> > >> >> >> > >> >> >> > The issue is that Julian made some slightly nontrivial changes to >> >> >> > core/setup.py and didn't want to update core/bscript. No one else >> has >> >> >> > taken >> >> >> > the time either to make those changes. That didn't bother me >> enough >> >> >> > yet to >> >> >> > go fix it, because they're all optional features and using Bento >> >> >> > builds >> >> >> > works just fine at the moment (and is part of the Travis CI test >> >> >> > runs, so >> >> >> > it'll keep working). >> >> >> >> >> >> Perhaps a compromise would be to declare it officially unsupported >> and >> >> >> remove it from Travis CI, while leaving the files in place to be >> used >> >> >> on an >> >> >> at-your-own-risk basis? As long as it's in Travis, the default is >> that >> >> >> anyone who breaks it has to fix it. If it's not in Travis, then the >> >> >> default >> >> >> is that the people (person?) who use bento are responsible for >> keeping >> >> >> it >> >> >> working for their needs. 
>> >> > >> >> > -1 that just means that simple changes like adding a new extension >> will >> >> > not >> >> > get made before PRs get merged, and bento support will be in a broken >> >> > state >> >> > much more often. >> >> >> >> Yes, and then the handful of people who care about this would fix it >> >> or not. Your -1 is attempting to veto other people's *not* paying >> >> attention to this build system. I... don't think -1's work that way >> >> :-( >> >> >> >> >> > I don't think the above is a good reason to remove Bento support. >> The >> >> >> > much faster builds alone are a good reason to keep it. And the >> >> >> > assertion >> >> >> > that all numpy devs understand numpy.distutils is more than a >> little >> >> >> > questionable:) >> >> >> >> >> >> They surely don't. But thousands of people use setup.py, and one or >> two >> >> >> use bento. >> >> > >> >> > I'm getting a little tired of these assertions. It's clear that David >> >> > and I >> >> > use it. A cursory search on Github reveals that Stefan, Fabian, Jonas >> >> > and >> >> > @aksarkar do (or did) as well: >> >> > https://github.com/scipy/scipy/commit/74d823b3 >> >> > https://github.com/numpy/numpy/issues/2993 >> >> > https://github.com/numpy/numpy/pull/3606 >> >> > https://github.com/numpy/numpy/issues/3889 >> >> > For every user you can measure there's usually a number of users that >> >> > you >> >> > don't hear about. >> >> >> >> I apologize for forgetting before that you do use Bento, but these >> >> patches you're finding don't really change the overall picture. Let's >> >> assume that there are 100 people using Bento, who would be slightly >> >> inconvenienced if they had to use setup.py instead, or got stuck >> >> patching the bento build themselves to keep it working. 100 is >> >> probably an order of magnitude too high, but whatever. OTOH numpy has >> >> almost 7 million downloads on PyPI+sf.net, of which approximately >> >> every one used setup.py one way or another, plus all the people get it >> >> from alternative channels like distros, which also AFAIK universally >> >> use setup.py. Software development is all about trade-offs. Time that >> >> numpy developers spend messing about with bento to benefit those >> >> hundred users is time that could instead be spent on improvements that >> >> benefit many orders of magnitudes more users. Why do you want us to >> >> spend our time producing x units of value when we could instead be >> >> producing 100*x units of value for the same effort? >> >> >> >> >> Yet supporting both requires twice as much energy and attention as >> >> >> supporting just one. >> >> > >> >> > That's of course not true. For most changes the differences in where >> and >> >> > how >> >> > to update the build systems are small. Only for unusual changes like >> >> > Julian >> >> > patches to make use of optional GCC features, Bento and distutils may >> >> > require very different changes. >> >> >> >> >> >> We've probably spent more person-hours talking about this, >> documenting >> >> >> the >> >> >> missing bscript bits, etc. than you've saved on those fast builds. >> >> > >> >> > Then maybe stop talking about it:) >> >> > >> >> > Besides the fast builds, which is only one example of why I like >> Bento >> >> > better, there's also the fundamental question of what we do with >> build >> >> > tools >> >> > in the long term. It's clear that distutils is a dead end. All the >> PEPs >> >> > related to packaging move in the direction of supporting tools like >> >> > Bento >> >> > better. 
If in the future we need significant new features in our >> build >> >> > tool, >> >> > Bento is a much better base to build on than numpy.distutils. It's >> >> > unfortunate that at the moment there's no one that works on improving >> >> > our >> >> > build situation, but that is what it is. Removing Bento support is a >> >> > step in >> >> > the wrong direction imho. >> >> >> >> "We must do something! This is something!" >> >> >> >> Bento is pre-alpha software whose last upstream commit was in July >> >> 2013. It's own CI tests have been failing since Feb. 2013, almost a >> >> year and a half ago. Bento build support was added to numpy in early >> >> 2011, and 3.5 years later it still hasn't convinced most of the core >> >> team that it provides any value at all, yet it continues to take up >> >> time and attention. >> >> >> >> Maybe bento will revive and take over the new python packaging world! >> >> Maybe not. Maybe something else will. I don't see how our support for >> >> it will really affect these outcomes in any way. And I especially >> >> don't see why it's important to spend time *now* on keeping bento >> >> working, just in case it becomes useful *later*. >> > >> > >> > But it is working right now, so that argument is moot. >> >> Why don't we wait until there is a significant problem with getting >> the Bento builds to work, and revisit then. >> >> > My feeling is that it is deceptive, as most folks who might use bento > won't know that some optimizations are missing from the result. > > David, I have pinged you a number of times about getting the numpy bento > build updated. The fact that bento builds numpy without failing is not the > same as bento building numpy in the best way. > Fair enough, let me look at it now, looks fairly trivial to fix David -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sat Jul 5 11:11:03 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 05 Jul 2014 17:11:03 +0200 Subject: [Numpy-discussion] Fast way to convert (nested) list to numpy object array? In-Reply-To: <53B6C919.4010806@tudelft.nl> References: <53B51993.7080207@tudelft.nl> <53B54E41.8090309@tudelft.nl> <1404391459.13834.8.camel@sebastian-t440> <53B6C919.4010806@tudelft.nl> Message-ID: <1404573063.3423.5.camel@sebastian-t440> On Fr, 2014-07-04 at 17:32 +0200, Marc Hulsman wrote: > On 07/03/2014 02:44 PM, Sebastian Berg wrote: > > True and true. I don't see a problem with fromiter being more general, > > just someone has to sit down and add new error checks/cleanup stuff > > for the object case. The assignment could probably also be optimized, > > not sure how hard that is, I would expect it isn't that hard. As > > usually, someone just needs to find time and the interest to actually > > do it ;). - Sebastian > > I looked at the code of FromIter below. > > /* > * We would need to alter the memory RENEW code to decrement any > * reference counts before throwing away any memory. > */ > if (PyDataType_REFCHK(dtype)) { > PyErr_SetString(PyExc_ValueError, > "cannot create object arrays from iterator"); > goto done; > } > > > However, the memory renew code (which just reallocs the memory to > increase the array size) uses > a simple realloc. It seems to me that it is not necessary to adapt > reference counts in this case (as the incref > from the new memory compensates the decref from the memory that is > removed)? 
For the addition of elements > to the array, everything seems to be ok anyway, as setitem is used, > which does the incref already. > So I think it should be possible to just remove this check? > Yes and no. I agree that the comment was just being overly careful, since the renew will copy the pointers without calling Py_INCREF. However, you *will* need to add new error cleanup logic in case the iterator throws an error, or you run out of memory. Since then you need to decref everything again. > I did not yet look at the assignment issue, had some difficulty finding > the correct place in the code, does does > anyone have any pointers were to look? > This is handled by PyArray_CopyObject in arrayobject.c. The actual logic is probably done by PyArray_GetArrayParamsFromObject in ctors.c, that is a public function, so my guess is, you would have to create a new one which allows passing in a maximum ndim and then make the old one call that one with NPY_MAXDIMS (or whatever it was) - Sebastian > > > > >> The generic solution of adding an nmaxdim parameter to numpy.array would > >> of course be even more ideal :) > >> > >> > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Sat Jul 5 11:24:36 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 05 Jul 2014 17:24:36 +0200 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: Message-ID: <1404573876.3423.9.camel@sebastian-t440> On Sa, 2014-07-05 at 00:41 +0100, Nathaniel Smith wrote: > On 5 Jul 2014 00:07, "Charles R Harris" > > That is a massive pile of affected projects :-(. > > My worry is that if all these projects we know about are broken, then > how many other codebases that we aren't testing are also broken? > Yeah, I would imagine quite a few might be. It isn't that I guess many used the "feature" deliberately, but it is easy to just code it and assume that the code is correct since it works. So I think I will just need to fix it. The pull request *should* already do this with a band aid-solution, by just falling back to the old funky stuff if there is a failure. If someone is good with python exception handling and string formatting in C, please feel free to have a look ;). - Sebastian > > If the issues are fixed in matplotlib and pandas I'd be inclined to > release as is with a mention of versions in the release notes. > > Even if it's fixed in pandas master, how long until it's in user's > hands? 
> > -n > > > Chuck > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Sat Jul 5 12:38:02 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jul 2014 17:38:02 +0100 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau wrote: > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith wrote: >> >> Maybe bento will revive and take over the new python packaging world! >> Maybe not. Maybe something else will. I don't see how our support for >> it will really affect these outcomes in any way. And I especially >> don't see why it's important to spend time *now* on keeping bento >> working, just in case it becomes useful *later*. > > But it is working right now, so that argument is moot. My suggestion was that we should drop the rule that a patch has to keep bento working to be merged. We're talking about future breakages and future effort. The fact that it's working now doesn't say anything about whether it's worth continuing to invest time in it. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From cournape at gmail.com Sat Jul 5 12:40:17 2014 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Jul 2014 01:40:17 +0900 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: The efforts are on average less demanding than this discussion. We are talking about adding entries to a list in most cases... Also, while adding the optimization support for bento, I've noticed that a lot of the related distutils code is broken, and does not work as expected on at least OS X + clang. David On Sun, Jul 6, 2014 at 1:38 AM, Nathaniel Smith wrote: > On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau > wrote: > > > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith wrote: > >> > >> Maybe bento will revive and take over the new python packaging world! > >> Maybe not. Maybe something else will. I don't see how our support for > >> it will really affect these outcomes in any way. And I especially > >> don't see why it's important to spend time *now* on keeping bento > >> working, just in case it becomes useful *later*. > > > > But it is working right now, so that argument is moot. > > My suggestion was that we should drop the rule that a patch has to > keep bento working to be merged. We're talking about future breakages > and future effort. The fact that it's working now doesn't say anything > about whether it's worth continuing to invest time in it. > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jtaylor.debian at googlemail.com Sat Jul 5 12:55:04 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 05 Jul 2014 18:55:04 +0200 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: <53B82DE8.7090905@googlemail.com> On 05.07.2014 18:40, David Cournapeau wrote: > The efforts are on average less demanding than this discussion. We are > talking about adding entries to a list in most cases... > > Also, while adding the optimization support for bento, I've noticed that > a lot of the related distutils code is broken, and does not work as > expected on at least OS X + clang. It just spits out a lot of warnings but they are harmless. We could remove them by using with -Werror=attribute for the conftests if it really bothers someone. Or do you mean something else? > > David > > > On Sun, Jul 6, 2014 at 1:38 AM, Nathaniel Smith > wrote: > > On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau > wrote: > > > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith > wrote: > >> > >> Maybe bento will revive and take over the new python packaging world! > >> Maybe not. Maybe something else will. I don't see how our support for > >> it will really affect these outcomes in any way. And I especially > >> don't see why it's important to spend time *now* on keeping bento > >> working, just in case it becomes useful *later*. > > > > But it is working right now, so that argument is moot. > > My suggestion was that we should drop the rule that a patch has to > keep bento working to be merged. We're talking about future breakages > and future effort. The fact that it's working now doesn't say anything > about whether it's worth continuing to invest time in it. > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cournape at gmail.com Sat Jul 5 13:11:30 2014 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Jul 2014 02:11:30 +0900 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: <53B82DE8.7090905@googlemail.com> References: <53B82DE8.7090905@googlemail.com> Message-ID: On Sun, Jul 6, 2014 at 1:55 AM, Julian Taylor wrote: > On 05.07.2014 18:40, David Cournapeau wrote: > > The efforts are on average less demanding than this discussion. We are > > talking about adding entries to a list in most cases... > > > > Also, while adding the optimization support for bento, I've noticed that > > a lot of the related distutils code is broken, and does not work as > > expected on at least OS X + clang. > > It just spits out a lot of warnings but they are harmless. > Adding lots of warnings are not harmless as they render the compiler warning system near useless (too many false alarms). I will fix the checks for both distutils and bento (using the autoconf macros setup, which should be more reliable than what we use for builtin and __attribute__-related checks) David > We could remove them by using with -Werror=attribute for the conftests > if it really bothers someone. > Or do you mean something else? 
> > > > > David > > > > > > On Sun, Jul 6, 2014 at 1:38 AM, Nathaniel Smith > > wrote: > > > > On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau > > wrote: > > > > > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith > > wrote: > > >> > > >> Maybe bento will revive and take over the new python packaging > world! > > >> Maybe not. Maybe something else will. I don't see how our support > for > > >> it will really affect these outcomes in any way. And I especially > > >> don't see why it's important to spend time *now* on keeping bento > > >> working, just in case it becomes useful *later*. > > > > > > But it is working right now, so that argument is moot. > > > > My suggestion was that we should drop the rule that a patch has to > > keep bento working to be merged. We're talking about future breakages > > and future effort. The fact that it's working now doesn't say > anything > > about whether it's worth continuing to invest time in it. > > > > -- > > Nathaniel J. Smith > > Postdoctoral researcher - Informatics - University of Edinburgh > > http://vorpus.org > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Sat Jul 5 13:24:55 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 05 Jul 2014 19:24:55 +0200 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: <53B82DE8.7090905@googlemail.com> Message-ID: <53B834E7.4060109@googlemail.com> On 05.07.2014 19:11, David Cournapeau wrote: > On Sun, Jul 6, 2014 at 1:55 AM, Julian Taylor > > > wrote: > > On 05.07.2014 18:40, David Cournapeau wrote: > > The efforts are on average less demanding than this discussion. We are > > talking about adding entries to a list in most cases... > > > > Also, while adding the optimization support for bento, I've > noticed that > > a lot of the related distutils code is broken, and does not work as > > expected on at least OS X + clang. > > It just spits out a lot of warnings but they are harmless. > > > Adding lots of warnings are not harmless as they render the compiler > warning system near useless (too many false alarms). > true but until now we haven't received a single complaint nor fixes so probably not many developers are actually using macs/clang to work on numpy C code. But I do agree its bad and I have fixing that on my todo list, I didn't get around to it yet. > I will fix the checks for both distutils and bento (using the autoconf > macros setup, which should be more reliable than what we use for builtin > and __attribute__-related checks) > > David > > > We could remove them by using with -Werror=attribute for the conftests > if it really bothers someone. > Or do you mean something else? 
> > > > > David > > > > > > On Sun, Jul 6, 2014 at 1:38 AM, Nathaniel Smith > > >> wrote: > > > > On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau > > > >> wrote: > > > > > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith > > > >> wrote: > > >> > > >> Maybe bento will revive and take over the new python > packaging world! > > >> Maybe not. Maybe something else will. I don't see how our > support for > > >> it will really affect these outcomes in any way. And I > especially > > >> don't see why it's important to spend time *now* on keeping > bento > > >> working, just in case it becomes useful *later*. > > > > > > But it is working right now, so that argument is moot. > > > > My suggestion was that we should drop the rule that a patch has to > > keep bento working to be merged. We're talking about future > breakages > > and future effort. The fact that it's working now doesn't say > anything > > about whether it's worth continuing to invest time in it. > > > > -- > > Nathaniel J. Smith > > Postdoctoral researcher - Informatics - University of Edinburgh > > http://vorpus.org > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cournape at gmail.com Sat Jul 5 13:28:14 2014 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Jul 2014 02:28:14 +0900 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: <53B834E7.4060109@googlemail.com> References: <53B82DE8.7090905@googlemail.com> <53B834E7.4060109@googlemail.com> Message-ID: On Sun, Jul 6, 2014 at 2:24 AM, Julian Taylor wrote: > On 05.07.2014 19:11, David Cournapeau wrote: > > On Sun, Jul 6, 2014 at 1:55 AM, Julian Taylor > > > > > wrote: > > > > On 05.07.2014 18:40, David Cournapeau wrote: > > > The efforts are on average less demanding than this discussion. We > are > > > talking about adding entries to a list in most cases... > > > > > > Also, while adding the optimization support for bento, I've > > noticed that > > > a lot of the related distutils code is broken, and does not work as > > > expected on at least OS X + clang. > > > > It just spits out a lot of warnings but they are harmless. > > > > > > Adding lots of warnings are not harmless as they render the compiler > > warning system near useless (too many false alarms). > > > > true but until now we haven't received a single complaint nor fixes so > probably not many developers are actually using macs/clang to work on > numpy C code. > Not many people are working on numpy C code period :) FWIW, clang is now the standard OS X compiler since Maverick (Apple in all its wisdom made gcc an alias to clang...). David > But I do agree its bad and I have fixing that on my todo list, I didn't > get around to it yet. 
> > > I will fix the checks for both distutils and bento (using the autoconf > > macros setup, which should be more reliable than what we use for builtin > > and __attribute__-related checks) > > > > David > > > > > > We could remove them by using with -Werror=attribute for the > conftests > > if it really bothers someone. > > Or do you mean something else? > > > > > > > > David > > > > > > > > > On Sun, Jul 6, 2014 at 1:38 AM, Nathaniel Smith > > > > >> wrote: > > > > > > On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau > > > > > >> > wrote: > > > > > > > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith > > > > > >> wrote: > > > >> > > > >> Maybe bento will revive and take over the new python > > packaging world! > > > >> Maybe not. Maybe something else will. I don't see how our > > support for > > > >> it will really affect these outcomes in any way. And I > > especially > > > >> don't see why it's important to spend time *now* on keeping > > bento > > > >> working, just in case it becomes useful *later*. > > > > > > > > But it is working right now, so that argument is moot. > > > > > > My suggestion was that we should drop the rule that a patch > has to > > > keep bento working to be merged. We're talking about future > > breakages > > > and future effort. The fact that it's working now doesn't say > > anything > > > about whether it's worth continuing to invest time in it. > > > > > > -- > > > Nathaniel J. Smith > > > Postdoctoral researcher - Informatics - University of Edinburgh > > > http://vorpus.org > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > NumPy-Discussion at scipy.org>> > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Sat Jul 5 15:41:18 2014 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Sat, 5 Jul 2014 12:41:18 -0700 Subject: [Numpy-discussion] Teaching Scipy BoF at SciPy In-Reply-To: References: Message-ID: <5468590778974524027@unknownmsgid> On Jul 4, 2014, at 7:02 AM, Phil Elson wrote: Nice idea. Just a repository of courses would be a great first step. Yup -- or really even a curated page of links and refrrences. Maybe we can get first draft of such a thing put together during the BoF. Feel free to add this idea to the Wiki :-) I hope you can come to the BoF, I know there are a number of others at the same time. -CHB For example, I know Jake Vanderplas's course at https://github.com/jakevdp/2013_fall_ASTR599 is useful, and I have a few introduction (3hr) courses at https://github.com/SciTools/courses. 
On 3 July 2014 16:59, Chris Barker wrote: > HI Folks, > > I will be hosting a "Teaching the SciPy Stack" BoF at SciPy this year: > > https://conference.scipy.org/scipy2014/schedule/presentation/1762/ > > (Actually, I proposed it for the conference, but would be more than happy > to have other folks join me in facilitating, hosting, etc.) > > I've put up a Wiki page to collect ideas for topics. Please take a look > and add your $0.02: > > https://github.com/numpy/numpy/wiki/TeachingSciPy-BoF-at-Scipy-2014 > > See you there, > > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Jul 5 18:42:41 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 6 Jul 2014 00:42:41 +0200 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 4:17 PM, Nathaniel Smith wrote: > On Sat, Jul 5, 2014 at 2:32 PM, Ralf Gommers > wrote: > > > > On Sat, Jul 5, 2014 at 1:54 PM, Nathaniel Smith wrote: > >> > >> On 5 Jul 2014 09:23, "Ralf Gommers" wrote: > >> Perhaps a compromise would be to declare it officially unsupported and > >> remove it from Travis CI, while leaving the files in place to be used > on an > >> at-your-own-risk basis? As long as it's in Travis, the default is that > >> anyone who breaks it has to fix it. If it's not in Travis, then the > default > >> is that the people (person?) who use bento are responsible for keeping > it > >> working for their needs. > > > > -1 that just means that simple changes like adding a new extension will > not > > get made before PRs get merged, and bento support will be in a broken > state > > much more often. > > Yes, and then the handful of people who care about this would fix it > or not. What next, we give Alan Isaac commit rights and then it's OK to break numpy.matrix when that's convenient? > Your -1 is attempting to veto other people's *not* paying > attention to this build system. I... don't think -1's work that way > :-( > You're proposing it'll be OK for others to break stuff that the people before them put quite some effort into implementing. I damn well have the right to give that a -1. David is fixing the few existing problems now, so there should be zero issues here. You're deliberately mischaracterizing the situation (pre-alpha, lot of effort, etc.), so I'm not going to bother responding to the rest, I'm annoyed enough as is. Ralf P.S. if anyone wants to spend some productive energy on the build situation, MSVC 2010 support for Python 3.x would be nice: https://github.com/numpy/numpy/issues/4245 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alan.isaac at gmail.com Sat Jul 5 19:13:25 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sat, 05 Jul 2014 19:13:25 -0400 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: <53B88695.3020802@gmail.com> On 7/5/2014 6:42 PM, Ralf Gommers wrote: > What next, we give Alan Isaac commit rights and then it's OK to break numpy.matrix when that's convenient? I always wondered what I would do with commit rights ... Alan From ben.root at ou.edu Sat Jul 5 22:13:18 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sat, 5 Jul 2014 22:13:18 -0400 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: <1404573876.3423.9.camel@sebastian-t440> References: <1404573876.3423.9.camel@sebastian-t440> Message-ID: Drats... I actually know those two topics... and I might have free time tomorrow afternoon at SciPy. Maybe I could take a peek at it? Ben On Sat, Jul 5, 2014 at 11:24 AM, Sebastian Berg wrote: > On Sa, 2014-07-05 at 00:41 +0100, Nathaniel Smith wrote: > > On 5 Jul 2014 00:07, "Charles R Harris" > > > > > > That is a massive pile of affected projects :-(. > > > > My worry is that if all these projects we know about are broken, then > > how many other codebases that we aren't testing are also broken? > > > > Yeah, I would imagine quite a few might be. It isn't that I guess many > used the "feature" deliberately, but it is easy to just code it and > assume that the code is correct since it works. So I think I will just > need to fix it. The pull request *should* already do this with a band > aid-solution, by just falling back to the old funky stuff if there is a > failure. If someone is good with python exception handling and string > formatting in C, please feel free to have a look ;). > > - Sebastian > > > > If the issues are fixed in matplotlib and pandas I'd be inclined to > > release as is with a mention of versions in the release notes. > > > > Even if it's fixed in pandas master, how long until it's in user's > > hands? > > > > -n > > > > > Chuck > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bryanv at continuum.io Sun Jul 6 01:12:08 2014 From: bryanv at continuum.io (Bryan Van de Ven) Date: Sun, 6 Jul 2014 01:12:08 -0400 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: <439ACFB7-AE6F-40CA-8483-BC837BE10520@continuum.io> Speaking as someone who started but then stopped dabbling in the NumPy C core, having to think about two build system is a huge turn-off. Getting into the NumPy C code is hard enough without having to worry about multiple build systems. 
Bryan On Jul 5, 2014, at 6:42 PM, Ralf Gommers wrote: > > > > On Sat, Jul 5, 2014 at 4:17 PM, Nathaniel Smith wrote: > On Sat, Jul 5, 2014 at 2:32 PM, Ralf Gommers wrote: > > > > On Sat, Jul 5, 2014 at 1:54 PM, Nathaniel Smith wrote: > >> > >> On 5 Jul 2014 09:23, "Ralf Gommers" wrote: > >> Perhaps a compromise would be to declare it officially unsupported and > >> remove it from Travis CI, while leaving the files in place to be used on an > >> at-your-own-risk basis? As long as it's in Travis, the default is that > >> anyone who breaks it has to fix it. If it's not in Travis, then the default > >> is that the people (person?) who use bento are responsible for keeping it > >> working for their needs. > > > > -1 that just means that simple changes like adding a new extension will not > > get made before PRs get merged, and bento support will be in a broken state > > much more often. > > Yes, and then the handful of people who care about this would fix it > or not. > > What next, we give Alan Isaac commit rights and then it's OK to break numpy.matrix when that's convenient? > > Your -1 is attempting to veto other people's *not* paying > attention to this build system. I... don't think -1's work that way > :-( > > You're proposing it'll be OK for others to break stuff that the people before them put quite some effort into implementing. I damn well have the right to give that a -1. > > David is fixing the few existing problems now, so there should be zero issues here. You're deliberately mischaracterizing the situation (pre-alpha, lot of effort, etc.), so I'm not going to bother responding to the rest, I'm annoyed enough as is. > > Ralf > > P.S. if anyone wants to spend some productive energy on the build situation, MSVC 2010 support for Python 3.x would be nice: https://github.com/numpy/numpy/issues/4245 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Sun Jul 6 02:30:34 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 06 Jul 2014 08:30:34 +0200 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: <1404573876.3423.9.camel@sebastian-t440> Message-ID: <1404628234.12836.1.camel@sebastian-t440> On Sa, 2014-07-05 at 22:13 -0400, Benjamin Root wrote: > Drats... I actually know those two topics... and I might have free > time tomorrow afternoon at SciPy. Maybe I could take a peek at it? > Maybe if you have time. It is just the attempt_1d_fallback function in the pull request https://github.com/numpy/numpy/pull/4804 This is called only after the normal indexing code gave an exception already and maybe we can make the warnings more informative. - Sebastian > > Ben > > > > On Sat, Jul 5, 2014 at 11:24 AM, Sebastian Berg > wrote: > On Sa, 2014-07-05 at 00:41 +0100, Nathaniel Smith wrote: > > On 5 Jul 2014 00:07, "Charles R Harris" > > > > > > > > That is a massive pile of affected projects :-(. > > > > My worry is that if all these projects we know about are > broken, then > > how many other codebases that we aren't testing are also > broken? > > > > > Yeah, I would imagine quite a few might be. It isn't that I > guess many > used the "feature" deliberately, but it is easy to just code > it and > assume that the code is correct since it works. So I think I > will just > need to fix it. 
The pull request *should* already do this with > a band > aid-solution, by just falling back to the old funky stuff if > there is a > failure. If someone is good with python exception handling and > string > formatting in C, please feel free to have a look ;). > > - Sebastian > > > > If the issues are fixed in matplotlib and pandas I'd be > inclined to > > release as is with a mention of versions in the release > notes. > > > > Even if it's fixed in pandas master, how long until it's in > user's > > hands? > > > > -n > > > > > Chuck > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sturla.molden at gmail.com Sun Jul 6 03:52:53 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 6 Jul 2014 07:52:53 +0000 (UTC) Subject: [Numpy-discussion] About the npz format References: <20140704134954.GB31861@kudu.in-berlin.de> Message-ID: <940174683426325795.593859sturla.molden-gmail.com@news.gmane.org> There is no os.mkfifo on Windows. Sturla Valentin Haenel wrote: > sorry, for the top-post, but should we add this as an issue on the > github tracker? I'd like to revisit it this summer. > > V- > > * Julian Taylor [2014-04-18]: >> On 18.04.2014 18:29, Valentin Haenel wrote: >>> Hi, >>> >>> * Valentin Haenel [2014-04-17]: >>>> * Valentin Haenel [2014-04-17]: >>>>> * Julian Taylor [2014-04-17]: >>>>>> On 17.04.2014 21:30, onefire wrote: >>>>>>> Thanks for the suggestion. I did profile the program before, just not >>>>>>> using Python. >>>>>> >>>>>> one problem of npz is that the zipfile module does not support streaming >>>>>> data in (or if it does now we aren't using it). >>>>>> So numpy writes the file uncompressed to disk and then zips it which is >>>>>> horrible for performance and disk usage. >>>>> >>>>> As a workaround may also be possible to write the temporary NPY files to >>>>> cStringIO instances and then use ``ZipFile.writestr`` with the >>>>> ``getvalue()`` of the cStringIO object. However that approach may >>>>> require some memory. In python 2.7, for each array: one copy inside the >>>>> cStringIO instance and then another copy of when calling getvalue on the >>>>> cString, I believe. >>>> >>>> There is a proof-of-concept implementation here: >>>> >>>> https://github.com/esc/numpy/compare/feature;npz_no_temp_file >>> >>> Anybody interested in me fixing this up (unit tests, API, etc..) for >>> inclusion? >>> >> >> I wonder if it would be better to instead use a fifo to avoid the memory >> doubling. Windows probably hasn't got them (exposed via python) but one >> can slap a platform check in front. 
>> attached a proof of concept without proper error handling (which is >> unfortunately the tricky part) > >>> From 472b4c0a44804b65d0774147010ec7a931a1c52d Mon Sep 17 00:00:00 2001 >> From: Julian Taylor >> Date: Thu, 17 Apr 2014 23:01:47 +0200 >> Subject: [PATCH] use a pipe for savez >> >> --- >> numpy/lib/npyio.py | 25 +++++++++++-------------- >> 1 file changed, 11 insertions(+), 14 deletions(-) >> >> diff --git a/numpy/lib/npyio.py b/numpy/lib/npyio.py >> index 98b4b6e..baafa9d 100644 >> --- a/numpy/lib/npyio.py >> +++ b/numpy/lib/npyio.py >> @@ -585,22 +585,19 @@ def _savez(file, args, kwds, compress): >> zipf = zipfile_factory(file, mode="w", compression=compression) >> >> # Stage arrays in a temporary file on disk, before writing to zip. >> - fd, tmpfile = tempfile.mkstemp(suffix='-numpy.npy') >> - os.close(fd) >> - try: >> + import threading >> + with tempfile.TemporaryDirectory() as td: >> + fifoname = os.path.join(td, "fifo") >> + os.mkfifo(fifoname) >> for key, val in namedict.items(): >> fname = key + '.npy' >> - fid = open(tmpfile, 'wb') >> - try: >> - format.write_array(fid, np.asanyarray(val)) >> - fid.close() >> - fid = None >> - zipf.write(tmpfile, arcname=fname) >> - finally: >> - if fid: >> - fid.close() >> - finally: >> - os.remove(tmpfile) >> + def mywrite(pipe, val): >> + with open(pipe, "wb") as wpipe: >> + format.write_array(wpipe, np.asanyarray(val)) >> + t = threading.Thread(target=mywrite, args=(fifoname, val)) >> + t.start() >> + zipf.write(fifoname, arcname=fname) >> + t.join() >> >> zipf.close() >> >> -- >> 1.9.1 >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion From sturla.molden at gmail.com Sun Jul 6 04:35:55 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 6 Jul 2014 08:35:55 +0000 (UTC) Subject: [Numpy-discussion] [Python-ideas] PEP pre-draft: Support for indexing with keyword arguments References: <1404463173.2714.4.camel@sebastian-t440> Message-ID: <1324974989426327476.518333sturla.molden-gmail.com@news.gmane.org> Sebastian Berg wrote: >> Could it be useful for structured arrays? > > Not sure how. The named columns seem like a decent point to me. NumPy is naming the fields, not the axes, so it might be more useful for Pandas than NumPy. For example if we have an image with r,g,b data, NumPy would not name a 'color' axis with indexes 'r', 'g' and 'b'. But conceptually image[i:m, j:n, field='r'] could be faster than image[i:m, j:n]['r'] or image['r'][i:m, j:n], and perhaps also slightly more readable. I am note sure about nested dtypes in record arrays though... If the possibility of keyword indexing are supported in Python, there is nothing that prevents this Pandas like extension to NumPy arrays: image[i:m, j:n, color='r'] it would require an extension of the current dtype descriptors, in order to tell NumPy among which fields the keyword "color" would select, but it shouldn't be undoable. 
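For concreteness, a minimal sketch of the field-plus-slice access being discussed, using today's structured-array syntax; the dtype and field names here are invented for illustration, and the keyword form appears only as a comment because it is not implemented in NumPy:

import numpy as np

# toy "image" as a structured array with named color fields (illustrative only)
image = np.zeros((4, 4), dtype=[('r', 'f8'), ('g', 'f8'), ('b', 'f8')])
image['r'] = 1.0

# what works today: pick the field, then slice, or slice, then pick the field
red_block = image['r'][1:3, 0:2]
same_block = image[1:3, 0:2]['r']
assert np.array_equal(red_block, same_block)

# the proposal sketched above would allow something like
#     image[1:3, 0:2, color='r']
# which is not valid syntax for NumPy arrays today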
Sturla From cournape at gmail.com Sun Jul 6 05:02:08 2014 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Jul 2014 18:02:08 +0900 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: <53B834E7.4060109@googlemail.com> References: <53B82DE8.7090905@googlemail.com> <53B834E7.4060109@googlemail.com> Message-ID: On Sun, Jul 6, 2014 at 2:24 AM, Julian Taylor wrote: > On 05.07.2014 19:11, David Cournapeau wrote: > > On Sun, Jul 6, 2014 at 1:55 AM, Julian Taylor > > > > > wrote: > > > > On 05.07.2014 18:40, David Cournapeau wrote: > > > The efforts are on average less demanding than this discussion. We > are > > > talking about adding entries to a list in most cases... > > > > > > Also, while adding the optimization support for bento, I've > > noticed that > > > a lot of the related distutils code is broken, and does not work as > > > expected on at least OS X + clang. > > > > It just spits out a lot of warnings but they are harmless. > > > > > > Adding lots of warnings are not harmless as they render the compiler > > warning system near useless (too many false alarms). > > > > true but until now we haven't received a single complaint nor fixes so > probably not many developers are actually using macs/clang to work on > numpy C code. > But I do agree its bad and I have fixing that on my todo list, I didn't > get around to it yet. > Here is an attempt: https://github.com/numpy/numpy/pull/4842 It uses a vile hack, but I did not see any other simple way. It fixes the warnings on osx, once travis-ci confirms the tests pass ok on linux, I will test it on msvc. David > > > I will fix the checks for both distutils and bento (using the autoconf > > macros setup, which should be more reliable than what we use for builtin > > and __attribute__-related checks) > > > > David > > > > > > We could remove them by using with -Werror=attribute for the > conftests > > if it really bothers someone. > > Or do you mean something else? > > > > > > > > David > > > > > > > > > On Sun, Jul 6, 2014 at 1:38 AM, Nathaniel Smith > > > > >> wrote: > > > > > > On Sat, Jul 5, 2014 at 3:21 PM, David Cournapeau > > > > > >> > wrote: > > > > > > > > On Sat, Jul 5, 2014 at 11:17 PM, Nathaniel Smith > > > > > >> wrote: > > > >> > > > >> Maybe bento will revive and take over the new python > > packaging world! > > > >> Maybe not. Maybe something else will. I don't see how our > > support for > > > >> it will really affect these outcomes in any way. And I > > especially > > > >> don't see why it's important to spend time *now* on keeping > > bento > > > >> working, just in case it becomes useful *later*. > > > > > > > > But it is working right now, so that argument is moot. > > > > > > My suggestion was that we should drop the rule that a patch > has to > > > keep bento working to be merged. We're talking about future > > breakages > > > and future effort. The fact that it's working now doesn't say > > anything > > > about whether it's worth continuing to invest time in it. > > > > > > -- > > > Nathaniel J. 
Smith > > > Postdoctoral researcher - Informatics - University of Edinburgh > > > http://vorpus.org > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > NumPy-Discussion at scipy.org>> > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sun Jul 6 13:54:27 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 6 Jul 2014 13:54:27 -0400 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: <1404628234.12836.1.camel@sebastian-t440> References: <1404573876.3423.9.camel@sebastian-t440> <1404628234.12836.1.camel@sebastian-t440> Message-ID: I see that a solution has already been found and merged. Are there any remaining issues for matplotlib to resolve? On Sun, Jul 6, 2014 at 2:30 AM, Sebastian Berg wrote: > On Sa, 2014-07-05 at 22:13 -0400, Benjamin Root wrote: > > Drats... I actually know those two topics... and I might have free > > time tomorrow afternoon at SciPy. Maybe I could take a peek at it? > > > > Maybe if you have time. It is just the attempt_1d_fallback function in > the pull request https://github.com/numpy/numpy/pull/4804 > This is called only after the normal indexing code gave an exception > already and maybe we can make the warnings more informative. > > - Sebastian > > > > > Ben > > > > > > > > On Sat, Jul 5, 2014 at 11:24 AM, Sebastian Berg > > wrote: > > On Sa, 2014-07-05 at 00:41 +0100, Nathaniel Smith wrote: > > > On 5 Jul 2014 00:07, "Charles R Harris" > > > > > > > > > > > > > > That is a massive pile of affected projects :-(. > > > > > > My worry is that if all these projects we know about are > > broken, then > > > how many other codebases that we aren't testing are also > > broken? > > > > > > > > > Yeah, I would imagine quite a few might be. It isn't that I > > guess many > > used the "feature" deliberately, but it is easy to just code > > it and > > assume that the code is correct since it works. So I think I > > will just > > need to fix it. The pull request *should* already do this with > > a band > > aid-solution, by just falling back to the old funky stuff if > > there is a > > failure. If someone is good with python exception handling and > > string > > formatting in C, please feel free to have a look ;). > > > > - Sebastian > > > > > > If the issues are fixed in matplotlib and pandas I'd be > > inclined to > > > release as is with a mention of versions in the release > > notes. > > > > > > Even if it's fixed in pandas master, how long until it's in > > user's > > > hands? 
> > > > > > -n > > > > > > > Chuck > > > > > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Jul 6 14:07:11 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Jul 2014 12:07:11 -0600 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: <1404573876.3423.9.camel@sebastian-t440> <1404628234.12836.1.camel@sebastian-t440> Message-ID: On Sun, Jul 6, 2014 at 11:54 AM, Benjamin Root wrote: > I see that a solution has already been found and merged. Are there any > remaining issues for matplotlib to resolve? > > You might take a look at the fixes in the matplotlib PR. They struck me as a bit hasty rather than fixes for the underlying problems, especially in the cubic interpolation case. The other mismatched assignment might be fixable with a `.flat` on the lhs rather than a reshape on the rhs. At least that was one suggested fix, I don't know if it works... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sun Jul 6 14:40:01 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 6 Jul 2014 14:40:01 -0400 Subject: [Numpy-discussion] Cython requirement? Message-ID: When did Cython become a build requirement? I remember discussing the use of Cython a while back, and IIRC the agreement was that both the cython code and the generated C files would be included in version control so that cython wouldn't be a build requirement, only a developer requirement when modifying those files. I just did a git clean -fxd and rebase to current master, and I am getting a message indicating that I need Cython 0.19 to build numpy (I haven't updated cython in ages on this particular machine). ben at tigger:~/Programs/numpy$ python setup.py install --user Running from numpy source directory. 
Cythonizing sources Processing numpy/random/mtrand/mtrand.pyx Traceback (most recent call last): File "/home/ben/Programs/numpy/tools/cythonize.py", line 199, in main() File "/home/ben/Programs/numpy/tools/cythonize.py", line 195, in main find_process_files(root_dir) File "/home/ben/Programs/numpy/tools/cythonize.py", line 187, in find_process_files process(cur_dir, fromfile, tofile, function, hash_db) File "/home/ben/Programs/numpy/tools/cythonize.py", line 161, in process processor_function(fromfile, tofile) File "/home/ben/Programs/numpy/tools/cythonize.py", line 59, in process_pyx raise Exception('Building %s requires Cython >= 0.19' % VENDOR) Exception: Building NumPy requires Cython >= 0.19 Traceback (most recent call last): File "setup.py", line 251, in setup_package() File "setup.py", line 239, in setup_package generate_cython() File "setup.py", line 191, in generate_cython raise RuntimeError("Running cythonize failed!") RuntimeError: Running cythonize failed! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Sun Jul 6 14:53:46 2014 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Sun, 6 Jul 2014 20:53:46 +0200 Subject: [Numpy-discussion] Cython requirement? In-Reply-To: References: Message-ID: On 6 July 2014 20:40, Benjamin Root wrote: > When did Cython become a build requirement? I remember discussing the use > of Cython a while back, and IIRC the agreement was that both the cython > code and the generated C files would be included in version control so that > cython wouldn't be a build requirement, only a developer requirement when > modifying those files. The policy was changed to not include them in VC, but to them in the releases, not to pollute the repository and avoid having C files not matching the pyx, IIRC. The change was fairly recent, I was only able to dig this email mentioning it: http://numpy-discussion.10968.n7.nabble.com/numpy-git-master-requiring-cython-for-build-td37250.html /David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Jul 6 14:54:51 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 06 Jul 2014 20:54:51 +0200 Subject: [Numpy-discussion] Questions about fixes for 1.9.0rc2 In-Reply-To: References: <1404573876.3423.9.camel@sebastian-t440> <1404628234.12836.1.camel@sebastian-t440> Message-ID: <1404672891.14324.2.camel@sebastian-t440> On So, 2014-07-06 at 12:07 -0600, Charles R Harris wrote: > > > > On Sun, Jul 6, 2014 at 11:54 AM, Benjamin Root > wrote: > I see that a solution has already been found and merged. Are > there any remaining issues for matplotlib to resolve? > > > > > > You might take a look at the fixes in the matplotlib PR. They struck > me as a bit hasty rather than fixes for the underlying problems, > especially in the cubic interpolation case. The other mismatched > assignment might be fixable with a `.flat` on the lhs rather than a > reshape on the rhs. At least that was one suggested fix, I don't know > if it works... > Frankly, I wouldn't necessarily suggest using .flat assignments instead. `.flat` will basically enforce the old behavior, which is not necessarily better... - Sebastian > > > > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
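As an aside on the `.flat`-versus-reshape point above, here is a minimal sketch of the two spellings for assigning a same-size but differently-shaped value; the array names and shapes are invented for illustration, and this is not the matplotlib code under discussion:

import numpy as np

dest = np.zeros(6)
src = np.arange(6.0).reshape(2, 3)   # same number of elements, different shape

# reshape the right-hand side explicitly ...
dest[:] = src.reshape(dest.shape)

# ... or assign through .flat on the left-hand side, which fills dest in
# flat (C) order and so keeps the old, more permissive behaviour
dest.flat = src

# either way dest ends up as [ 0.  1.  2.  3.  4.  5.]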
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From robert.kern at gmail.com Sun Jul 6 15:00:24 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 6 Jul 2014 20:00:24 +0100 Subject: [Numpy-discussion] Cython requirement? In-Reply-To: References: Message-ID: On Sun, Jul 6, 2014 at 7:40 PM, Benjamin Root wrote: > When did Cython become a build requirement? I remember discussing the use of > Cython a while back, and IIRC the agreement was that both the cython code > and the generated C files would be included in version control so that > cython wouldn't be a build requirement, only a developer requirement when > modifying those files. It's a build requirement for building from the git checkout, but not the distributed source tarballs. The change was not too long ago, but it was discussed here. -- Robert Kern From ben.root at ou.edu Sun Jul 6 15:32:30 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 6 Jul 2014 15:32:30 -0400 Subject: [Numpy-discussion] indexed assignment testcases Message-ID: While trying to wrap my head around the issues with matplotlib's tri module and the new numpy indexing, I have made some test cases where I wonder if warnings should be issued. import numpy as np a = np.ones((10,)) all_false = np.zeros((10,), dtype=bool) a[all_false] = np.array([2.0]) # the shapes don't match here mask_in = np.array([False]*8 + [True, True]) a[mask_in] = np.array([]) # raises ValueError as expected a[mask_in] = np.array([[]]) # no exception because it is 2-D, for some reason (on master, but not release-0.9b1) a[mask_in] = np.array([2.0]) # This works and repeats 2.0 twice. I thought this wasn't supposed to happen anymore? Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sun Jul 6 15:33:38 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 6 Jul 2014 15:33:38 -0400 Subject: [Numpy-discussion] Cython requirement? In-Reply-To: References: Message-ID: Ok, must have missed that discussion. I don't like the reasoning, but that boat has sailed. On Sun, Jul 6, 2014 at 3:00 PM, Robert Kern wrote: > On Sun, Jul 6, 2014 at 7:40 PM, Benjamin Root wrote: > > When did Cython become a build requirement? I remember discussing the > use of > > Cython a while back, and IIRC the agreement was that both the cython code > > and the generated C files would be included in version control so that > > cython wouldn't be a build requirement, only a developer requirement when > > modifying those files. > > It's a build requirement for building from the git checkout, but not > the distributed source tarballs. The change was not too long ago, but > it was discussed here. > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Jul 6 15:58:36 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 06 Jul 2014 21:58:36 +0200 Subject: [Numpy-discussion] indexed assignment testcases In-Reply-To: References: Message-ID: <1404676716.14324.6.camel@sebastian-t440> On So, 2014-07-06 at 15:32 -0400, Benjamin Root wrote: > While trying to wrap my head around the issues with matplotlib's tri > module and the new numpy indexing, I have made some test cases where I > wonder if warnings should be issued. 
> > > import numpy as np > > a = np.ones((10,)) > > all_false = np.zeros((10,), dtype=bool) > > a[all_false] = np.array([2.0]) # the shapes don't match here > The shapes match using broadcasting. Values shape of (1,) can be broadcast to indexing result shape of (0,). > > mask_in = np.array([False]*8 + [True, True]) > > a[mask_in] = np.array([]) # raises ValueError as expected > > a[mask_in] = np.array([[]]) # no exception because it is 2-D, for > some reason (on master, but not release-0.9b1) > Gives a (maybe not good) deprecation warning in master. But those are typically invisible... > > a[mask_in] = np.array([2.0]) # This works and repeats 2.0 twice. I > thought this wasn't supposed to happen anymore? > Again, broadcasting of values onto out shape. - Sebastian > > Ben Root > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Sun Jul 6 15:59:00 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Jul 2014 13:59:00 -0600 Subject: [Numpy-discussion] indexed assignment testcases In-Reply-To: References: Message-ID: On Sun, Jul 6, 2014 at 1:32 PM, Benjamin Root wrote: > While trying to wrap my head around the issues with matplotlib's tri > module and the new numpy indexing, I have made some test cases where I > wonder if warnings should be issued. > > import numpy as np > a = np.ones((10,)) > all_false = np.zeros((10,), dtype=bool) > a[all_false] = np.array([2.0]) # the shapes don't match here > It broadcasts because the leading dimension is 1. > > mask_in = np.array([False]*8 + [True, True]) > a[mask_in] = np.array([]) # raises ValueError as expected > a[mask_in] = np.array([[]]) # no exception because it is 2-D, for some > reason (on master, but not release-0.9b1) > Now falls back to old behavior and raises a DeprecationWarning. You don't see that by default. > > a[mask_in] = np.array([2.0]) # This works and repeats 2.0 twice. I thought > this wasn't supposed to happen anymore? > Broadcasting again. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sun Jul 6 16:14:36 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 6 Jul 2014 16:14:36 -0400 Subject: [Numpy-discussion] indexed assignment testcases In-Reply-To: References: Message-ID: re: deprecation warnings... that's what I get when I am working on my non-dev box because I am at the conference, and have gotten too used to the setup of my dev box... as for the broadcasting issue, I can see it for the second case, but the first case still doesn't sit right with me. My understanding of broadcasting is to effectively *expand* an array to match the shape of another array (or some target shape). In this case, the array is being effectively *contracted* in shape. That makes zero sense to me. Ben On Sun, Jul 6, 2014 at 3:59 PM, Charles R Harris wrote: > > > > On Sun, Jul 6, 2014 at 1:32 PM, Benjamin Root wrote: > >> While trying to wrap my head around the issues with matplotlib's tri >> module and the new numpy indexing, I have made some test cases where I >> wonder if warnings should be issued. 
>> >> import numpy as np >> a = np.ones((10,)) >> all_false = np.zeros((10,), dtype=bool) >> a[all_false] = np.array([2.0]) # the shapes don't match here >> > > It broadcasts because the leading dimension is 1. > > >> >> mask_in = np.array([False]*8 + [True, True]) >> a[mask_in] = np.array([]) # raises ValueError as expected >> a[mask_in] = np.array([[]]) # no exception because it is 2-D, for some >> reason (on master, but not release-0.9b1) >> > > Now falls back to old behavior and raises a DeprecationWarning. You don't > see that by default. > > >> >> a[mask_in] = np.array([2.0]) # This works and repeats 2.0 twice. I >> thought this wasn't supposed to happen anymore? >> > > Broadcasting again. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From var.mail.daniel at gmail.com Sun Jul 6 16:35:39 2014 From: var.mail.daniel at gmail.com (Daniel da Silva) Date: Sun, 6 Jul 2014 16:35:39 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style Message-ID: The idea is that there be a short-hand for creating arrays as there is for matrices: np.mat('.2 .7 .1; .3 .5 .2; .1 .1 .9') It was suggested in GitHub issue #4817 in light that it would be beneficial to beginners and to presenters during demonstrations. In GitHub pull request #484 , I implemented this as the np.arr function. Does anyone have any feedback on the API details? Some examples from my implementation follow. >>> np.arr('3; 4; 5') array([[3], [4], [5]]) >>> np.arr('3; 4; 5', dtype=float) array([[ 3.], [ 4.], [ 5.]]) >>> np.arr('1 0 0; 0 1 0; 0 0 1') array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) >>> np.arr('4, 5; 6, 7') array([[4, 5], [6, 7]]) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Jul 6 17:04:04 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 06 Jul 2014 23:04:04 +0200 Subject: [Numpy-discussion] indexed assignment testcases In-Reply-To: References: Message-ID: <1404680644.16951.2.camel@sebastian-t440> On So, 2014-07-06 at 16:14 -0400, Benjamin Root wrote: > re: deprecation warnings... that's what I get when I am working on my > non-dev box because I am at the conference, and have gotten too used > to the setup of my dev box... > > > as for the broadcasting issue, I can see it for the second case, but > the first case still doesn't sit right with me. My understanding of > broadcasting is to effectively *expand* an array to match the shape of > another array (or some target shape). In this case, the array is being > effectively *contracted* in shape. That makes zero sense to me. > Well, from a technical point of view, it is more like changing the shape to whatever fits while setting the stride to 0. I am sure there are a few places where the doc is not clear. From a practical point of view, it makes sense if you consider this: arr[arr < 0] = 0 Where it might be that the array has no elements smaller 0. Though I admit I would write 0 here, and not [0]. - Sebastian > > Ben > > > > On Sun, Jul 6, 2014 at 3:59 PM, Charles R Harris > wrote: > > > > On Sun, Jul 6, 2014 at 1:32 PM, Benjamin Root > wrote: > While trying to wrap my head around the issues with > matplotlib's tri module and the new numpy indexing, I > have made some test cases where I wonder if warnings > should be issued. 
> > > import numpy as np > > a = np.ones((10,)) > > all_false = np.zeros((10,), dtype=bool) > > a[all_false] = np.array([2.0]) # the shapes don't > match here > > > > It broadcasts because the leading dimension is 1. > > > > > mask_in = np.array([False]*8 + [True, True]) > > a[mask_in] = np.array([]) # raises ValueError as > expected > > a[mask_in] = np.array([[]]) # no exception because it > is 2-D, for some reason (on master, but not > release-0.9b1) > > > Now falls back to old behavior and raises a > DeprecationWarning. You don't see that by default. > > > > > a[mask_in] = np.array([2.0]) # This works and repeats > 2.0 twice. I thought this wasn't supposed to happen > anymore? > > > > Broadcasting again. > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Sun Jul 6 17:43:10 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 6 Jul 2014 22:43:10 +0100 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: Message-ID: On Sun, Jul 6, 2014 at 9:35 PM, Daniel da Silva wrote: > The idea is that there be a short-hand for creating arrays as there is for > matrices: > > np.mat('.2 .7 .1; .3 .5 .2; .1 .1 .9') > > It was suggested in GitHub issue #4817 in light that it would be beneficial > to beginners and to presenters during demonstrations. In GitHub pull > request #484, I implemented this as the np.arr function. > > Does anyone have any feedback on the API details? Some examples from my > implementation follow. > >>>> np.arr('3; 4; 5') > array([[3], > [4], > [5]]) > >>>> np.arr('3; 4; 5', dtype=float) > array([[ 3.], > [ 4.], > [ 5.]]) > >>>> np.arr('1 0 0; 0 1 0; 0 0 1') > array([[1, 0, 0], > [0, 1, 0], > [0, 0, 1]]) > >>>> np.arr('4, 5; 6, 7') > array([[4, 5], > [6, 7]]) It occurs to me that np.mat always returns a 2d matrix, but for arrays there are more options. What should np.arr('1 2 3') return? a 1d array or a 2d row vector? (Maybe np.arr('1 2 3;') should give the row-vector?) Should there be some way to write 3d or higher-d arrays? -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Sun Jul 6 17:48:13 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 6 Jul 2014 22:48:13 +0100 Subject: [Numpy-discussion] indexed assignment testcases In-Reply-To: References: Message-ID: On Sun, Jul 6, 2014 at 9:14 PM, Benjamin Root wrote: > as for the broadcasting issue, I can see it for the second case, but the > first case still doesn't sit right with me. My understanding of broadcasting > is to effectively *expand* an array to match the shape of another array (or > some target shape). In this case, the array is being effectively > *contracted* in shape. That makes zero sense to me. 
That's how it's always worked though, in all cases of broadcasting; nothing special about indexing: In [8]: a = np.zeros((3, 0)) In [9]: a + 1 Out[9]: array([], shape=(3, 0), dtype=float64) In [10]: a + [[1], [2], [3]] Out[10]: array([], shape=(3, 0), dtype=float64) IME it's extremely useful in practice for avoiding special cases when some axis has a vary size that can be zero. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ben.root at ou.edu Sun Jul 6 17:57:37 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 6 Jul 2014 17:57:37 -0400 Subject: [Numpy-discussion] indexed assignment testcases In-Reply-To: References: Message-ID: I guess I always treated scalars as something special when it comes to broadcasting. Seeing these examples, I can see how my grokking of broadcasting was incomplete. I still think that the assignment of an array of values (as opposed to a scalar) to nothing could potentially mask deeper issues, but now I see that it may be impossible to distinguish from the perfectly normal case. Cheers! Ben Root On Sun, Jul 6, 2014 at 5:48 PM, Nathaniel Smith wrote: > On Sun, Jul 6, 2014 at 9:14 PM, Benjamin Root wrote: > > as for the broadcasting issue, I can see it for the second case, but the > > first case still doesn't sit right with me. My understanding of > broadcasting > > is to effectively *expand* an array to match the shape of another array > (or > > some target shape). In this case, the array is being effectively > > *contracted* in shape. That makes zero sense to me. > > That's how it's always worked though, in all cases of broadcasting; > nothing special about indexing: > > In [8]: a = np.zeros((3, 0)) > > In [9]: a + 1 > Out[9]: array([], shape=(3, 0), dtype=float64) > > In [10]: a + [[1], [2], [3]] > Out[10]: array([], shape=(3, 0), dtype=float64) > > IME it's extremely useful in practice for avoiding special cases when > some axis has a vary size that can be zero. > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sun Jul 6 18:06:25 2014 From: efiring at hawaii.edu (Eric Firing) Date: Sun, 06 Jul 2014 12:06:25 -1000 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: Message-ID: <53B9C861.3090809@hawaii.edu> On 2014/07/06, 11:43 AM, Nathaniel Smith wrote: > On Sun, Jul 6, 2014 at 9:35 PM, Daniel da Silva > wrote: >> The idea is that there be a short-hand for creating arrays as there is for >> matrices: >> >> np.mat('.2 .7 .1; .3 .5 .2; .1 .1 .9') >> >> It was suggested in GitHub issue #4817 in light that it would be beneficial >> to beginners and to presenters during demonstrations. In GitHub pull >> request #484, I implemented this as the np.arr function. >> >> Does anyone have any feedback on the API details? Some examples from my >> implementation follow. 
>> >>>>> np.arr('3; 4; 5') >> array([[3], >> [4], >> [5]]) >> >>>>> np.arr('3; 4; 5', dtype=float) >> array([[ 3.], >> [ 4.], >> [ 5.]]) >> >>>>> np.arr('1 0 0; 0 1 0; 0 0 1') >> array([[1, 0, 0], >> [0, 1, 0], >> [0, 0, 1]]) >> >>>>> np.arr('4, 5; 6, 7') >> array([[4, 5], >> [6, 7]]) > > It occurs to me that np.mat always returns a 2d matrix, but for arrays > there are more options. > > What should np.arr('1 2 3') return? a 1d array or a 2d row vector? I would say 1d array. This is numpy, not numpy.matrix. > (Maybe np.arr('1 2 3;') should give the row-vector?) Yes, it is reasonable that a semicolon should trigger 2d. > > Should there be some way to write 3d or higher-d arrays? No, there should not. This is for quick demos and that sort of thing. It is not a substitute for np.array(). (I'm not entirely convinced np.arr() is a good idea at all; but if it is, it must be kept simple.) A possible downside for beginners is that this might delay their understanding that the commas are needed for np.array([1, 2, 3]). Eric > > -n > From ted.sandler at gmail.com Sun Jul 6 18:47:47 2014 From: ted.sandler at gmail.com (Ted Sandler) Date: Sun, 6 Jul 2014 15:47:47 -0700 Subject: [Numpy-discussion] parsing dtype descriptors In-Reply-To: References: <20140703193506.GA25653@kudu.in-berlin.de> Message-ID: Thanks! On Fri, Jul 4, 2014 at 1:53 AM, Robert Kern wrote: > On Thu, Jul 3, 2014 at 10:53 PM, Ted Sandler > wrote: > > Thanks. No, it's not what I'm looking for. > > > > I'm looking for the code that parses the string " array > > header's descriptor: > > > > {'descr': ' > > > There are many different descriptor strings, e.g.: > > > > '>f8' > > '=f4' > > 'float32' > > '>c16' > > ... > > > > Ideally, I want the exhaustive list of valid input strings that describe > > standard ndarrays (i.e. ndarrays with simple entries as opposed to > records > > or subarrays). Lacking an exhaustive list or spec, I'd like the source > code > > that does the parsing for them. > > > https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/descriptor.c#L1321 > > https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/conversion_utils.c#L1000 > > https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/ndarraytypes.h#L97 > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Sun Jul 6 22:27:21 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sun, 6 Jul 2014 22:27:21 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <53B9C861.3090809@hawaii.edu> References: <53B9C861.3090809@hawaii.edu> Message-ID: On Sun, Jul 6, 2014 at 6:06 PM, Eric Firing wrote: > (I'm not entirely convinced > np.arr() is a good idea at all; but if it is, it must be kept simple.) > If you are going to introduce this functionality, please don't call it np.arr. Right now, np.a presents you with a whopping 53 completion choices. Adding "r", narrows that to 21, but np.arr completes to np.array right away. Please don't introduce another bump in this road. "Namespaces are one honking great idea -- let's do more of those!" I would suggest calling it something like np.array_simple or np.array_from_string, but the best choice IMO, would be np.ndarray.from_string (a static constructor method). 
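For the sake of discussion, a constructor along those lines can be sketched on top of plain numpy in a handful of lines (nothing like this exists yet; the name and the parsing rules below are purely illustrative):

import numpy as np

def from_string(s, dtype=float):
    # rows are separated by ';', elements by whitespace and/or commas
    rows = [r for r in s.split(';') if r.strip()]
    data = [[dtype(x) for x in r.replace(',', ' ').split()] for r in rows]
    if ';' not in s:
        # no ';' at all -> plain 1-d array
        return np.array(data[0], dtype=dtype)
    return np.array(data, dtype=dtype)

>>> from_string('1 2 3')
array([ 1.,  2.,  3.])
>>> from_string('1 2; 3 4')
array([[ 1.,  2.],
       [ 3.,  4.]])

With that convention a trailing ';' (as in '1 2 3;') forces a 2-d result, which is exactly the kind of API detail that still needs to be settled.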
-------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sun Jul 6 22:59:45 2014 From: efiring at hawaii.edu (Eric Firing) Date: Sun, 06 Jul 2014 16:59:45 -1000 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> Message-ID: <53BA0D21.5050508@hawaii.edu> On 2014/07/06, 4:27 PM, Alexander Belopolsky wrote: > > On Sun, Jul 6, 2014 at 6:06 PM, Eric Firing > wrote: > > (I'm not entirely convinced > np.arr() is a good idea at all; but if it is, it must be kept simple.) > > > If you are going to introduce this functionality, please don't call it > np.arr. > > Right now, np.a presents you with a whopping 53 completion choices. > Adding "r", narrows that to 21, but np.arr completes to np.array > right away. Please don't introduce another bump in this road. > > "Namespaces are one honking great idea -- let's do more of those!" > > I would suggest calling it something like np.array_simple or > np.array_from_string, but the best choice IMO, would be > np.ndarray.from_string (a static constructor method). I think the problem is that this defeats the point: minimizing typing when doing an off-the-cuff demo or test. I don't know that this use case justifies the clutter, regardless of what it is called; but evidently there is some demand for it. Eric > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ndarray at mac.com Mon Jul 7 00:29:33 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 7 Jul 2014 00:29:33 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <53BA0D21.5050508@hawaii.edu> References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> Message-ID: On Sun, Jul 6, 2014 at 10:59 PM, Eric Firing wrote: > > I would suggest calling it something like np.array_simple or > > np.array_from_string, but the best choice IMO, would be > > np.ndarray.from_string (a static constructor method). > > > I think the problem is that this defeats the point: minimizing typing > when doing an off-the-cuff demo or test. You can always put np.arr = np.ndarray.from_string or even arr = np.ndarray.from_string right next to the line where you define np. (Which makes me wonder if something like this belongs to ipython magic.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 7 01:53:22 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Jul 2014 23:53:22 -0600 Subject: [Numpy-discussion] 1.10-devel is open Message-ID: Just so. The fixes for 1.9.0b1 are now in that branch ready for the next beta. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.M.Hoekstra at tudelft.nl Mon Jul 7 02:48:52 2014 From: J.M.Hoekstra at tudelft.nl (Jacco Hoekstra - LR) Date: Mon, 7 Jul 2014 06:48:52 +0000 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> Message-ID: <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> How about using the old name np.mat() for this type of array creation? So the: A = np.mat(?1 2;3 4?) creates a two dimensional array. But then resulting in an array A instead of the matrix type? It might at least provide some partial downward compatibility. 
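In the meantime, one way to get that behaviour with what is already there is to wrap the existing parser (the helper name below is only for illustration):

import numpy as np

def mat_as_array(s):
    # np.mat already understands the '1 2; 3 4' string syntax;
    # np.asarray then drops the matrix subclass and returns a plain ndarray
    return np.asarray(np.mat(s))

>>> mat_as_array('1 2; 3 4')
array([[1, 2],
       [3, 4]])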
Best regards, Jacco Hoekstra From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Alexander Belopolsky Sent: maandag 7 juli 2014 6:30 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Short-hand array creation in `numpy.mat` style On Sun, Jul 6, 2014 at 10:59 PM, Eric Firing > wrote: > I would suggest calling it something like np.array_simple or > np.array_from_string, but the best choice IMO, would be > np.ndarray.from_string (a static constructor method). I think the problem is that this defeats the point: minimizing typing when doing an off-the-cuff demo or test. You can always put np.arr = np.ndarray.from_string or even arr = np.ndarray.from_string right next to the line where you define np. (Which makes me wonder if something like this belongs to ipython magic.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Jul 7 04:02:13 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 07 Jul 2014 10:02:13 +0200 Subject: [Numpy-discussion] 1.10-devel is open In-Reply-To: References: Message-ID: <53BA5405.5000508@googlemail.com> On 07.07.2014 07:53, Charles R Harris wrote: > Just so. The fixes for 1.9.0b1 are now in that branch ready for the next > beta. > how did you do that without a merge commit? however you did it you have git has lost ancestry which is not so nice for backporting. If there are no objections I'd like to rewind the maintenance branch back to beta1 and merge master in properly. From jtaylor.debian at googlemail.com Mon Jul 7 04:33:10 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 07 Jul 2014 10:33:10 +0200 Subject: [Numpy-discussion] 1.10-devel is open In-Reply-To: <53BA5405.5000508@googlemail.com> References: <53BA5405.5000508@googlemail.com> Message-ID: <53BA5B46.4010306@googlemail.com> On 07.07.2014 10:02, Julian Taylor wrote: > On 07.07.2014 07:53, Charles R Harris wrote: >> Just so. The fixes for 1.9.0b1 are now in that branch ready for the next >> beta. >> > > how did you do that without a merge commit? > however you did it you have git has lost ancestry which is not so nice > for backporting. > If there are no objections I'd like to rewind the maintenance branch > back to beta1 and merge master in properly. > I went ahead with the and rewind + merge [0], please reset your branches to the origin in case you updated the maintenance/1.9.x branch in last few hours and get merge errors when running git pull. [0] https://github.com/numpy/numpy/pull/4849 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From olivier.grisel at ensta.org Mon Jul 7 05:18:42 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 7 Jul 2014 11:18:42 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi! I gave appveyor a try this WE so as to build a minimalistic Python 3 project with a Cython extension. It works both with 32 and 64 bit MSVC++ and can generate wheel packages. See: https://github.com/ogrisel/python-appveyor-demo However 2008 is not (yet) installed so it cannot be used for Python 2.7. The Feodor Fitsner seems to be open to install older versions of MSVC++ on the worker VM image so this might be possible in the future. Let's see. 
Off-course for numpy / scipy this does not solve the fortran compiler issue, so Carl's static mingw-w64 toolchain still looks like a very promising solution (and could probably be run on the appveyor infra as well). Best, -- Olivier From davidmenhur at gmail.com Mon Jul 7 07:17:16 2014 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Mon, 7 Jul 2014 13:17:16 +0200 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> Message-ID: On 7 July 2014 08:48, Jacco Hoekstra - LR wrote: > How about using the old name np.mat() for this type of array creation? How about a new one? np.matarray, for MATLAB array. /David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Mon Jul 7 08:25:25 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 07 Jul 2014 08:25:25 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> Message-ID: <53BA91B5.6010604@gmail.com> On 7/7/2014 7:17 AM, Da?id wrote: > How about a new one? np.matarray, for MATLAB array. How about `str2arr` or even `build`, since teaching appears to be a focus. Also, I agree '1 2 3' shd become 1d and '1 2 3;' shd become 2d. It seems unambiguous to allow '1 2 3;;' to be 3d, or even '1 2;3 4;;5 6;7 8' (two 2d arrays), but I'm just noting that, not urging that it be implemented. Alan Isaac From charlesr.harris at gmail.com Mon Jul 7 08:34:18 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Jul 2014 06:34:18 -0600 Subject: [Numpy-discussion] 1.10-devel is open In-Reply-To: <53BA5405.5000508@googlemail.com> References: <53BA5405.5000508@googlemail.com> Message-ID: On Mon, Jul 7, 2014 at 2:02 AM, Julian Taylor wrote: > On 07.07.2014 07:53, Charles R Harris wrote: > > Just so. The fixes for 1.9.0b1 are now in that branch ready for the next > > beta. > > > > how did you do that without a merge commit? > git branch tmp maintenance/1.9.x git co tmp git branch -f maintenance/1.9.x d244ec7 git rebase -p --onto tmp 10098da maintenance/1.9.x > however you did it you have git has lost ancestry which is not so nice > for backporting. > Same changesets, I believe. If '-p' is omitted the merges are omitted. > If there are no objections I'd like to rewind the maintenance branch > back to beta1 and merge master in properly. > I thought this somewhat cleaner than a merge :0 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Jul 7 08:46:13 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 7 Jul 2014 14:46:13 +0200 Subject: [Numpy-discussion] 1.10-devel is open In-Reply-To: References: <53BA5405.5000508@googlemail.com> Message-ID: On Mon, Jul 7, 2014 at 2:34 PM, Charles R Harris wrote: > On Mon, Jul 7, 2014 at 2:02 AM, Julian Taylor > wrote: >> >> On 07.07.2014 07:53, Charles R Harris wrote: >> > Just so. The fixes for 1.9.0b1 are now in that branch ready for the next >> > beta. >> > >> >> how did you do that without a merge commit? 
> > > git branch tmp maintenance/1.9.x > git co tmp > git branch -f maintenance/1.9.x d244ec7 > git rebase -p --onto tmp 10098da maintenance/1.9.x > >> >> however you did it you have git has lost ancestry which is not so nice >> for backporting. > > > Same changesets, I believe. If '-p' is omitted the merges are omitted. > >> >> If there are no objections I'd like to rewind the maintenance branch >> back to beta1 and merge master in properly. > > > I thought this somewhat cleaner than a merge :0 > By rebasing or cherry-picking git loses the information that the changeset originates from another branch. So when you try to merge or cherrypick more changes from the branch the changes are coming from the automerging bails or is at least less useful. So if you are moving changes from one branch to another one should merge whenever possible. Now that both branches have diverged, 1.9 by the release commit, and 1.10 by the opening commit, there is no easy way for git to track the origins of a changeset and we have to do the usual cherry picking, as to my knowledge git does not have partial merges. From sebastian at sipsolutions.net Mon Jul 7 09:11:47 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 07 Jul 2014 15:11:47 +0200 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <53BA91B5.6010604@gmail.com> References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> <53BA91B5.6010604@gmail.com> Message-ID: <1404738707.25854.13.camel@sebastian-t440> On Mo, 2014-07-07 at 08:25 -0400, Alan G Isaac wrote: > On 7/7/2014 7:17 AM, Da?id wrote: > > How about a new one? np.matarray, for MATLAB array. > > > How about `str2arr` or even `build`, since teaching appears to be a focus. > Also, I agree '1 2 3' shd become 1d and '1 2 3;' shd become 2d. > It seems unambiguous to allow '1 2 3;;' to be 3d, or even > '1 2;3 4;;5 6;7 8' (two 2d arrays), but I'm just noting > that, not urging that it be implemented. > Probably overdoing it, but if we plan on more then just this, what about banning such functions to something like numpy.interactive/numpy.helpers which you can then import * (or better specific functions) from? I think the fact that you need many imports on startup should rather be fixed by an ipython scientific mode or other startup imports. - Sebastian > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Mon Jul 7 09:12:50 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Jul 2014 07:12:50 -0600 Subject: [Numpy-discussion] 1.10-devel is open In-Reply-To: References: <53BA5405.5000508@googlemail.com> Message-ID: On Mon, Jul 7, 2014 at 6:46 AM, Julian Taylor wrote: > On Mon, Jul 7, 2014 at 2:34 PM, Charles R Harris > wrote: > > On Mon, Jul 7, 2014 at 2:02 AM, Julian Taylor > > wrote: > >> > >> On 07.07.2014 07:53, Charles R Harris wrote: > >> > Just so. The fixes for 1.9.0b1 are now in that branch ready for the > next > >> > beta. > >> > > >> > >> how did you do that without a merge commit? 
> > > > > > git branch tmp maintenance/1.9.x > > git co tmp > > git branch -f maintenance/1.9.x d244ec7 > > git rebase -p --onto tmp 10098da maintenance/1.9.x > > > >> > >> however you did it you have git has lost ancestry which is not so nice > >> for backporting. > > > > > > Same changesets, I believe. If '-p' is omitted the merges are omitted. > > > >> > >> If there are no objections I'd like to rewind the maintenance branch > >> back to beta1 and merge master in properly. > > > > > > I thought this somewhat cleaner than a merge :0 > > > > By rebasing or cherry-picking git loses the information that the > changeset originates from another branch. > So when you try to merge or cherrypick more changes from the branch > the changes are coming from the automerging bails or is at least less > useful. > So if you are moving changes from one branch to another one should > merge whenever possible. > > Now that both branches have diverged, 1.9 by the release commit, and > 1.10 by the opening commit, there is no easy way for git to track the > origins of a changeset and we have to do the usual cherry picking, as > to my knowledge git does not have partial merges. > Yes, what I did was like one big cherry-pick. But I think we end up in the same place with two divergent branches. I think git history is just a string of changesets and each changeset has a hash. Same hash, same changeset, and I think that was preserved, so in that sense history was preserved. The 1.9.x branch pushed without trouble. Anyway, six of one, half dozen of the other. I was going to do the merge route originally, even did the merge. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jul 7 09:25:58 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 7 Jul 2014 14:25:58 +0100 Subject: [Numpy-discussion] 1.10-devel is open In-Reply-To: References: <53BA5405.5000508@googlemail.com> Message-ID: On 7 Jul 2014 14:12, "Charles R Harris" wrote:. > > Yes, what I did was like one big cherry-pick. But I think we end up in the same place with two divergent branches. I think git history is just a string of changesets and each changeset has a hash. Same hash, same changeset, and I think that was preserved, so in that sense history was preserved. No, git history hashes are effectively a hash of . So when you rebase, you keep the same changes but move them to be based on a different base revision. This means that the rebased changes get new hashes, and git has no idea that the changes are related. If you merge, then git marks the original changes as being parents of the merge node, so it can answer questions like "what changes have been applied to maintenance/1.9.x since it branched from master?", and can do better cherrypicks because it can use a more recent common ancestor. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Jul 7 09:30:59 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 7 Jul 2014 15:30:59 +0200 Subject: [Numpy-discussion] 1.10-devel is open In-Reply-To: References: <53BA5405.5000508@googlemail.com> Message-ID: On Mon, Jul 7, 2014 at 3:12 PM, Charles R Harris wrote: > On Mon, Jul 7, 2014 at 6:46 AM, Julian Taylor > wrote: >> >> On Mon, Jul 7, 2014 at 2:34 PM, Charles R Harris >> wrote: >> > On Mon, Jul 7, 2014 at 2:02 AM, Julian Taylor >> > wrote: >> >> >> >> On 07.07.2014 07:53, Charles R Harris wrote: >> >> > Just so. 
The fixes for 1.9.0b1 are now in that branch ready for the >> >> > next >> >> > beta. >> >> > >> >> >> >> how did you do that without a merge commit? >> > >> > >> > git branch tmp maintenance/1.9.x >> > git co tmp >> > git branch -f maintenance/1.9.x d244ec7 >> > git rebase -p --onto tmp 10098da maintenance/1.9.x >> > >> >> >> >> however you did it you have git has lost ancestry which is not so nice >> >> for backporting. >> > >> > >> > Same changesets, I believe. If '-p' is omitted the merges are omitted. >> > >> >> >> >> If there are no objections I'd like to rewind the maintenance branch >> >> back to beta1 and merge master in properly. >> > >> > >> > I thought this somewhat cleaner than a merge :0 >> > >> >> By rebasing or cherry-picking git loses the information that the >> changeset originates from another branch. >> So when you try to merge or cherrypick more changes from the branch >> the changes are coming from the automerging bails or is at least less >> useful. >> So if you are moving changes from one branch to another one should >> merge whenever possible. >> >> Now that both branches have diverged, 1.9 by the release commit, and >> 1.10 by the opening commit, there is no easy way for git to track the >> origins of a changeset and we have to do the usual cherry picking, as >> to my knowledge git does not have partial merges. > > > Yes, what I did was like one big cherry-pick. But I think we end up in the > same place with two divergent branches. I think git history is just a string > of changesets and each changeset has a hash. Same hash, same changeset, and > I think that was preserved, so in that sense history was preserved. The > 1.9.x branch pushed without trouble. Anyway, six of one, half dozen of the > other. I was going to do the merge route originally, even did the merge. > the rebase does not preserve hashes, it rewrites the commits (minimal change is changing the commiter). Your approach brings us to this state: R the maintenance release commit, D the master 1.10 opening commit A - > B -> C -> D < master \ R -> B' -> C' < maintenance whereas a merge is this: A -> B -> C -> D < master \ merge \ R -------- \ M < maintenance the difference is when you now want to merge D into the maintenance branch. In the first case git tries to merge the D changeset into the branch, it tracks down the anchestry of D and C' figures This leads to the merge base A, and git needs to merge B, C, D, R, B', C' now merging B and B' and C and C' will conflict as they change the same lines (in an ideal world git would realize that the diffs are actually equal, but it does not do that in my experience) and asks the user for help. now in the merge case its different. You want to move D into the branch it tracks down the ancestry of D and R This leads to the merge base A, and both branches have the same commit B and C. So now it only needs to merge D and R (leading to M) which will be automatic if they do not conflict. 
From josef.pktd at gmail.com Mon Jul 7 09:50:35 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 7 Jul 2014 09:50:35 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <1404738707.25854.13.camel@sebastian-t440> References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> <53BA91B5.6010604@gmail.com> <1404738707.25854.13.camel@sebastian-t440> Message-ID: On Mon, Jul 7, 2014 at 9:11 AM, Sebastian Berg wrote: > On Mo, 2014-07-07 at 08:25 -0400, Alan G Isaac wrote: > > On 7/7/2014 7:17 AM, Da?id wrote: > > > How about a new one? np.matarray, for MATLAB array. > > > > > > How about `str2arr` or even `build`, since teaching appears to be a > focus. > > Also, I agree '1 2 3' shd become 1d and '1 2 3;' shd become 2d. > > It seems unambiguous to allow '1 2 3;;' to be 3d, or even > > '1 2;3 4;;5 6;7 8' (two 2d arrays), but I'm just noting > > that, not urging that it be implemented. > > > > Probably overdoing it, but if we plan on more then just this, what about > banning such functions to something like numpy.interactive/numpy.helpers > which you can then import * (or better specific functions) from? > > I think the fact that you need many imports on startup should rather be > fixed by an ipython scientific mode or other startup imports. > Is this whole thing really worth it? We get back to a numpy pylab. First users learn the dirty shortcuts, and then they have to learn how to do it "properly". (I'm using quite often string split and reshape for copy-pasted text tables.) Josef > > - Sebastian > > > Alan Isaac > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Jul 7 10:28:09 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 07 Jul 2014 16:28:09 +0200 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> <53BA91B5.6010604@gmail.com> <1404738707.25854.13.camel@sebastian-t440> Message-ID: <1404743289.25854.23.camel@sebastian-t440> On Mo, 2014-07-07 at 09:50 -0400, josef.pktd at gmail.com wrote: > > > > On Mon, Jul 7, 2014 at 9:11 AM, Sebastian Berg > wrote: > On Mo, 2014-07-07 at 08:25 -0400, Alan G Isaac wrote: > > On 7/7/2014 7:17 AM, Da?id wrote: > > > How about a new one? np.matarray, for MATLAB array. > > > > > > How about `str2arr` or even `build`, since teaching appears > to be a focus. > > Also, I agree '1 2 3' shd become 1d and '1 2 3;' shd become > 2d. > > It seems unambiguous to allow '1 2 3;;' to be 3d, or even > > '1 2;3 4;;5 6;7 8' (two 2d arrays), but I'm just noting > > that, not urging that it be implemented. > > > > > Probably overdoing it, but if we plan on more then just this, > what about > banning such functions to something like > numpy.interactive/numpy.helpers > which you can then import * (or better specific functions) > from? 
> > I think the fact that you need many imports on startup should > rather be > fixed by an ipython scientific mode or other startup imports. > > > > > Is this whole thing really worth it? We get back to a numpy pylab. > > > First users learn the dirty shortcuts, and then they have to learn how > to do it "properly". > Yeah, you are right. Just a bit afraid of creating too many such functions that I am not sure are very useful/used much. For example I am not sure that many use np.r_ or np.c_ > > > (I'm using quite often string split and reshape for copy-pasted text > tables.) > > > Josef > > > > > - Sebastian > > > Alan Isaac > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Mon Jul 7 13:58:32 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 7 Jul 2014 18:58:32 +0100 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <1404743289.25854.23.camel@sebastian-t440> References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> <53BA91B5.6010604@gmail.com> <1404738707.25854.13.camel@sebastian-t440> <1404743289.25854.23.camel@sebastian-t440> Message-ID: On Mon, Jul 7, 2014 at 3:28 PM, Sebastian Berg wrote: > On Mo, 2014-07-07 at 09:50 -0400, josef.pktd at gmail.com wrote: >> >> On Mon, Jul 7, 2014 at 9:11 AM, Sebastian Berg >> wrote: >> On Mo, 2014-07-07 at 08:25 -0400, Alan G Isaac wrote: >> > On 7/7/2014 7:17 AM, Da?id wrote: >> > > How about a new one? np.matarray, for MATLAB array. >> > >> > >> > How about `str2arr` or even `build`, since teaching appears >> to be a focus. >> > Also, I agree '1 2 3' shd become 1d and '1 2 3;' shd become >> 2d. >> > It seems unambiguous to allow '1 2 3;;' to be 3d, or even >> > '1 2;3 4;;5 6;7 8' (two 2d arrays), but I'm just noting >> > that, not urging that it be implemented. >> > >> >> Probably overdoing it, but if we plan on more then just this, >> what about >> banning such functions to something like >> numpy.interactive/numpy.helpers >> which you can then import * (or better specific functions) >> from? >> >> I think the fact that you need many imports on startup should >> rather be >> fixed by an ipython scientific mode or other startup imports. >> >> >> >> >> Is this whole thing really worth it? We get back to a numpy pylab. >> >> >> First users learn the dirty shortcuts, and then they have to learn how >> to do it "properly". >> > > Yeah, you are right. Just a bit afraid of creating too many such > functions that I am not sure are very useful/used much. For example I am > not sure that many use np.r_ or np.c_ Yeah, we definitely have too many random bits of API around overall. But I think this one is probably worthwhile. 
It doesn't add any real complexity (no new types, trivial for readers to understand the first time they encounter it, etc.), and it addresses a recurring perceived shortcoming of numpy that people run into in the first 5 minutes of use, at a time when it's pretty easy to give up and go back to Matlab. And, it removes one of the perceived advantages of np.matrix over np.ndarray, so it smooths our way for eventually phasing out np.matrix. I'm not sure that preserving np.arr is that important ( here only saves 1 character!), but some possible alternatives for short names: np.marr ("matlab-like array construction") np.sarr ("string array") np.parse -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Mon Jul 7 14:15:33 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 7 Jul 2014 14:15:33 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> <53BA91B5.6010604@gmail.com> <1404738707.25854.13.camel@sebastian-t440> <1404743289.25854.23.camel@sebastian-t440> Message-ID: On Mon, Jul 7, 2014 at 1:58 PM, Nathaniel Smith wrote: > On Mon, Jul 7, 2014 at 3:28 PM, Sebastian Berg > wrote: > > On Mo, 2014-07-07 at 09:50 -0400, josef.pktd at gmail.com wrote: > >> > >> On Mon, Jul 7, 2014 at 9:11 AM, Sebastian Berg > >> wrote: > >> On Mo, 2014-07-07 at 08:25 -0400, Alan G Isaac wrote: > >> > On 7/7/2014 7:17 AM, Da?id wrote: > >> > > How about a new one? np.matarray, for MATLAB array. > >> > > >> > > >> > How about `str2arr` or even `build`, since teaching appears > >> to be a focus. > >> > Also, I agree '1 2 3' shd become 1d and '1 2 3;' shd become > >> 2d. > >> > It seems unambiguous to allow '1 2 3;;' to be 3d, or even > >> > '1 2;3 4;;5 6;7 8' (two 2d arrays), but I'm just noting > >> > that, not urging that it be implemented. > >> > > >> > >> Probably overdoing it, but if we plan on more then just this, > >> what about > >> banning such functions to something like > >> numpy.interactive/numpy.helpers > >> which you can then import * (or better specific functions) > >> from? > >> > >> I think the fact that you need many imports on startup should > >> rather be > >> fixed by an ipython scientific mode or other startup imports. > >> > >> > >> > >> > >> Is this whole thing really worth it? We get back to a numpy pylab. > >> > >> > >> First users learn the dirty shortcuts, and then they have to learn how > >> to do it "properly". > >> > > > > Yeah, you are right. Just a bit afraid of creating too many such > > functions that I am not sure are very useful/used much. For example I am > > not sure that many use np.r_ or np.c_ > > Yeah, we definitely have too many random bits of API around overall. > But I think this one is probably worthwhile. It doesn't add any real > complexity (no new types, trivial for readers to understand the first > time they encounter it, etc.), and it addresses a recurring perceived > shortcoming of numpy that people run into in the first 5 minutes of > use, at a time when it's pretty easy to give up and go back to Matlab. > And, it removes one of the perceived advantages of np.matrix over > np.ndarray, so it smooths our way for eventually phasing out > np.matrix. 
> > I'm not sure that preserving np.arr is that important ( here > only saves 1 character!), but some possible alternatives for short > names: > > np.marr ("matlab-like array construction") > np.sarr ("string array") > np.parse > short like np.s (didn't know there is already s_) something long like >>> np.fromstring('1 2', sep=' ') array([ 1., 2.]) >>> np.fromstring2d('1 2 3; 5 3.4 7') array([[ 1. , 2. , 3. ], [ 5. , 3.4, 7. ]]) Josef > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Mon Jul 7 14:20:30 2014 From: faltet at gmail.com (Francesc Alted) Date: Mon, 07 Jul 2014 20:20:30 +0200 Subject: [Numpy-discussion] ANN: python-blosc 1.2.7 released Message-ID: <53BAE4EE.5090909@gmail.com> ============================= Announcing python-blosc 1.2.4 ============================= What is new? ============ This is a maintenance release, where included c-blosc sources have been updated to 1.4.0. This adds support for non-Intel architectures, most specially those not supporting unaligned access. For more info, you can have a look at the release notes in: https://github.com/Blosc/python-blosc/wiki/Release-notes More docs and examples are available in the documentation site: http://python-blosc.blosc.org What is it? =========== Blosc (http://www.blosc.org) is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call. Blosc is the first compressor that is meant not only to reduce the size of large datasets on-disk or in-memory, but also to accelerate object manipulations that are memory-bound (http://www.blosc.org/docs/StarvingCPUs.pdf). See http://www.blosc.org/synthetic-benchmarks.html for some benchmarks on how much speed it can achieve in some datasets. Blosc works well for compressing numerical arrays that contains data with relatively low entropy, like sparse data, time series, grids with regular-spaced values, etc. python-blosc (http://python-blosc.blosc.org/) is the Python wrapper for the Blosc compression library. There is also a handy command line and Python library for Blosc called Bloscpack (https://github.com/Blosc/bloscpack) that allows you to compress large binary datafiles on-disk. Installing ========== python-blosc is in PyPI repository, so installing it is easy: $ pip install -U blosc # yes, you should omit the python- prefix Download sources ================ The sources are managed through github services at: http://github.com/Blosc/python-blosc Documentation ============= There is Sphinx-based documentation site at: http://python-blosc.blosc.org/ Mailing list ============ There is an official mailing list for Blosc at: blosc at googlegroups.com http://groups.google.es/group/blosc Licenses ======== Both Blosc and its Python wrapper are distributed using the MIT license. See: https://github.com/Blosc/python-blosc/blob/master/LICENSES for more details. 
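A minimal round-trip sketch with a numpy array (the exact compression ratio will of course depend on the data):

import numpy as np
import blosc

a = np.arange(1e6)                 # some easily compressible data
packed = blosc.compress(a.tostring(), typesize=a.itemsize)
restored = np.frombuffer(blosc.decompress(packed), dtype=a.dtype)
assert (a == restored).all()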
---- **Enjoy data!** -- Francesc Alted From faltet at gmail.com Mon Jul 7 14:28:31 2014 From: faltet at gmail.com (Francesc Alted) Date: Mon, 07 Jul 2014 20:28:31 +0200 Subject: [Numpy-discussion] [CORRECTION] python-blosc 1.2.4 released (Was: ANN: python-blosc 1.2.7 released) In-Reply-To: <53BAE4EE.5090909@gmail.com> References: <53BAE4EE.5090909@gmail.com> Message-ID: <53BAE6CF.3070008@gmail.com> Indeed it was 1.2.4 the version just released and not 1.2.7. Sorry for the typo! Francesc On 7/7/14, 8:20 PM, Francesc Alted wrote: > ============================= > Announcing python-blosc 1.2.4 > ============================= > > What is new? > ============ > > This is a maintenance release, where included c-blosc sources have been > updated to 1.4.0. This adds support for non-Intel architectures, most > specially those not supporting unaligned access. > > For more info, you can have a look at the release notes in: > > https://github.com/Blosc/python-blosc/wiki/Release-notes > > More docs and examples are available in the documentation site: > > http://python-blosc.blosc.org > > > What is it? > =========== > > Blosc (http://www.blosc.org) is a high performance compressor > optimized for binary data. It has been designed to transmit data to > the processor cache faster than the traditional, non-compressed, > direct memory fetch approach via a memcpy() OS call. > > Blosc is the first compressor that is meant not only to reduce the size > of large datasets on-disk or in-memory, but also to accelerate object > manipulations that are memory-bound > (http://www.blosc.org/docs/StarvingCPUs.pdf). See > http://www.blosc.org/synthetic-benchmarks.html for some benchmarks on > how much speed it can achieve in some datasets. > > Blosc works well for compressing numerical arrays that contains data > with relatively low entropy, like sparse data, time series, grids with > regular-spaced values, etc. > > python-blosc (http://python-blosc.blosc.org/) is the Python wrapper for > the Blosc compression library. > > There is also a handy command line and Python library for Blosc called > Bloscpack (https://github.com/Blosc/bloscpack) that allows you to > compress large binary datafiles on-disk. > > > Installing > ========== > > python-blosc is in PyPI repository, so installing it is easy: > > $ pip install -U blosc # yes, you should omit the python- prefix > > > Download sources > ================ > > The sources are managed through github services at: > > http://github.com/Blosc/python-blosc > > > Documentation > ============= > > There is Sphinx-based documentation site at: > > http://python-blosc.blosc.org/ > > > Mailing list > ============ > > There is an official mailing list for Blosc at: > > blosc at googlegroups.com > http://groups.google.es/group/blosc > > > Licenses > ======== > > Both Blosc and its Python wrapper are distributed using the MIT license. > See: > > https://github.com/Blosc/python-blosc/blob/master/LICENSES > > for more details. 
> > ---- > > **Enjoy data!** > -- Francesc Alted From valentin at haenel.co Mon Jul 7 14:30:44 2014 From: valentin at haenel.co (Valentin Haenel) Date: Mon, 7 Jul 2014 20:30:44 +0200 Subject: [Numpy-discussion] ANN: python-blosc 1.2.7 released In-Reply-To: <53BAE4EE.5090909@gmail.com> References: <53BAE4EE.5090909@gmail.com> Message-ID: <20140707183044.GB13382@kudu.in-berlin.de> Hi, * Francesc Alted [2014-07-07]: [snip] > There is also a handy command line and Python library for Blosc called > Bloscpack (https://github.com/Blosc/bloscpack) that allows you to > compress large binary datafiles on-disk. For this list, you might be interested to know, that Bloscpack also supports compressing/decompressing Numpy arrays out-of-the-box via a Python API: https://github.com/Blosc/bloscpack#numpy best, V- From chris.barker at noaa.gov Mon Jul 7 14:32:12 2014 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 7 Jul 2014 11:32:12 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> Message-ID: <-2968451659458027190@unknownmsgid> If you are going to introduce this functionality, please don't call it np.arr. I agree, but.., I would suggest calling it something like np.array_simple or np.array_from_string, but the best choice IMO, would be np.ndarray.from_string (a static constructor method). Except the entire point of his is that it's easy to type... -1 on the whole idea -- this isn't Matlab, I'd saving a little typing worth it? CHB _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Jul 7 15:02:45 2014 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 7 Jul 2014 12:02:45 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <1404743289.25854.23.camel@sebastian-t440> References: <53B9C861.3090809@hawaii.edu> <53BA0D21.5050508@hawaii.edu> <245AC908B39361438CFA2299B0DD50E438FAB9BD@SRV361.tudelft.net> <53BA91B5.6010604@gmail.com> <1404738707.25854.13.camel@sebastian-t440> <1404743289.25854.23.camel@sebastian-t440> Message-ID: <-337347367645528086@unknownmsgid> On Jul 7, 2014, at 7:28 AM, Sebastian Berg wrote: > not sure that many use np.r_ or np.c_ I actually really like those ;-) -Chris From pav at iki.fi Tue Jul 8 09:09:17 2014 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 08 Jul 2014 16:09:17 +0300 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <-2968451659458027190@unknownmsgid> References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: 07.07.2014 21:32, Chris Barker - NOAA Federal kirjoitti: > If you are going to introduce this functionality, please don't call it > np.arr. It might be appropriate for pirate versions of Numpy. *** Seriously though, having a variant of `mat` that returns arrays could be useful, so weak +0. Preferably, the name should be quite short to type. On the other hand, unlike r_ and c_, I haven't seen or used mat() in real code. 
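(For reference, those two give things like

>>> import numpy as np
>>> np.r_[1:4, 10]
array([ 1,  2,  3, 10])
>>> np.c_[np.array([1, 2, 3]), np.array([4, 5, 6])]
array([[1, 4],
       [2, 5],
       [3, 6]])

i.e. quick row/column building rather than string parsing.)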
-- Pauli Virtanen From joseluismietta at yahoo.com.ar Tue Jul 8 20:29:55 2014 From: joseluismietta at yahoo.com.ar (=?iso-8859-1?Q?Jos=E8_Luis_Mietta?=) Date: Tue, 8 Jul 2014 17:29:55 -0700 Subject: [Numpy-discussion] Number of elements in a intersection graph Message-ID: <1404865795.90882.YahooMailNeo@web142302.mail.bf1.yahoo.com> Hi experts!! I am studying the intersection between line segments (sticks). I have an Numpy array (M) corresponding to the intersection graph of the system (the element Mij = 1 if the sticks' i 'and' 'j' intersect, and Mij = 0 if not intersect). I want to determine the number of elements that form the path that connects two sticks (N and K), i.e.: the number of sticks that form the spanning cluster between stick N and K. How I can do?? Please explain step by step. Best regards! Thanks a lot. Jos? Luis -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Jul 9 06:41:17 2014 From: stefan at sun.ac.za (=?UTF-8?Q?St=C3=A9fan_van_der_Walt?=) Date: Wed, 9 Jul 2014 12:41:17 +0200 Subject: [Numpy-discussion] Remove bento from numpy In-Reply-To: References: Message-ID: On Sat, Jul 5, 2014 at 6:40 PM, David Cournapeau wrote: > The efforts are on average less demanding than this discussion. We are > talking about adding entries to a list in most cases... In scikit-image we use the following script to check for the most basic discrepancies: https://github.com/scikit-image/scikit-image/blob/master/check_bento_build.py St?fan From olivier.grisel at ensta.org Wed Jul 9 10:00:34 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 9 Jul 2014 16:00:34 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Feodor updated the AppVeyor nodes to have the Windows SDK matching MSVC 2008 Express for Python 2. I have updated my sample scripts and we now have a working example of a free CI system for: Python 2 and 3 both for 32 and 64 bit architectures. https://github.com/ogrisel/python-appveyor-demo Best, -- Olivier From bryanv at continuum.io Wed Jul 9 11:13:58 2014 From: bryanv at continuum.io (Bryan Van de Ven) Date: Wed, 9 Jul 2014 10:13:58 -0500 Subject: [Numpy-discussion] ANN: Bokeh 0.5 released Message-ID: <565B058A-583F-4D6E-B6B2-5C7FDB724F2E@continuum.io> I am very happy to announce the release of Bokeh version 0.5! (http://continuum.io/blog/bokeh-0.5) Bokeh is a Python library for visualizing large and realtime datasets on the web. This release includes many new features: weekly dev releases, a new plot frame, a click tool, "always on" hover tool, multiple axes, log axes, minor ticks, gears and gauges glyphs, and an NPM BokehJS package. Several usability enhancements have been made to the plotting.py interface to make it even easier to use. The Bokeh tutorial also now includes exercises in IPython notebook form. Of course, we've made many little bug fixes - see the CHANGELOG for full details. The biggest news is all the long-term and architectural goals landing in Bokeh 0.5: * Widgets! Build apps and dashboards with Bokeh * Very high level bokeh.charts interface * Initial Abstract Rendering support for big data visualizations * Tighter Pandas integration * Simpler, easier plot embedding options Expect dynamic, data-driven layouts, including ggplot style auto-faceting in upcoming releases, as well as R language bindings, more statistical plot types in bokeh.charts, and cloud hosting for Bokeh apps. 
Check out the full documentation, interactive gallery, and tutorial at http://bokeh.pydata.org as well as the new Bokeh IPython notebook nbviewer index (including all the tutorials) at: http://nbviewer.ipython.org/github/ContinuumIO/bokeh-notebooks/blob/master/index.ipynb If you are using Anaconda, you can install with conda: conda install bokeh Alternatively, you can install with pip: pip install bokeh BokehJS is also available by CDN for use in standalone javascript applications: http://cdn.pydata.org/bokeh-0.5.min.js http://cdn.pydata.org/bokeh-0.5.min.css Issues, enhancement requests, and pull requests can be made on the Bokeh Github page: https://github.com/continuumio/bokeh Questions can be directed to the Bokeh mailing list: bokeh at continuum.io If you have interest in helping to develop Bokeh, please get involved! Special thanks to recent contributors: Tabish Chasmawala, Samuel Colvin, Christina Doig, Tarun Gaba, Maggie Mari, Amy Troschinetz, Ben Zaitlen. Bryan Van de Ven Continuum Analytics http://continuum.io From robert.kern at gmail.com Wed Jul 9 15:31:35 2014 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Jul 2014 20:31:35 +0100 Subject: [Numpy-discussion] Number of elements in a intersection graph In-Reply-To: <1404865795.90882.YahooMailNeo@web142302.mail.bf1.yahoo.com> References: <1404865795.90882.YahooMailNeo@web142302.mail.bf1.yahoo.com> Message-ID: On Wed, Jul 9, 2014 at 1:29 AM, Jos? Luis Mietta wrote: > Hi experts!! > > I am studying the intersection between line segments (sticks). I have an > Numpy array (M) corresponding to the intersection graph of the system (the > element Mij = 1 if the sticks' i 'and' 'j' intersect, and Mij = 0 if not > intersect). > > I want to determine the number of elements that form the path that connects > two sticks (N and K), i.e.: the number of sticks that form the spanning > cluster between stick N and K. > How I can do?? Please explain step by step. The last time you asked a question about this project, we pointed you to the networkx package. http://networkx.github.io/documentation/latest/reference/algorithms.shortest_paths.html You can make a networkx.Graph object from your adjacency matrix very simply: graph = networkx.Graph(M) -- Robert Kern From rmcgibbo at gmail.com Wed Jul 9 18:53:26 2014 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Wed, 9 Jul 2014 15:53:26 -0700 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: This is an awesome resource for tons of projects. Thanks Olivier! -Robert On Wed, Jul 9, 2014 at 7:00 AM, Olivier Grisel wrote: > Feodor updated the AppVeyor nodes to have the Windows SDK matching > MSVC 2008 Express for Python 2. I have updated my sample scripts and > we now have a working example of a free CI system for: > > Python 2 and 3 both for 32 and 64 bit architectures. > > https://github.com/ogrisel/python-appveyor-demo > > Best, > > -- > Olivier > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ted.sandler at gmail.com Wed Jul 9 20:29:04 2014 From: ted.sandler at gmail.com (Ted Sandler) Date: Wed, 9 Jul 2014 17:29:04 -0700 Subject: [Numpy-discussion] Number of elements in a intersection graph In-Reply-To: References: <1404865795.90882.YahooMailNeo@web142302.mail.bf1.yahoo.com> Message-ID: Use NetworkX + breadth first search and you are done. On Wed, Jul 9, 2014 at 12:31 PM, Robert Kern wrote: > On Wed, Jul 9, 2014 at 1:29 AM, Jos? Luis Mietta > wrote: > > Hi experts!! > > > > I am studying the intersection between line segments (sticks). I have an > > Numpy array (M) corresponding to the intersection graph of the system > (the > > element Mij = 1 if the sticks' i 'and' 'j' intersect, and Mij = 0 if not > > intersect). > > > > I want to determine the number of elements that form the path that > connects > > two sticks (N and K), i.e.: the number of sticks that form the spanning > > cluster between stick N and K. > > How I can do?? Please explain step by step. > > The last time you asked a question about this project, we pointed you > to the networkx package. > > > http://networkx.github.io/documentation/latest/reference/algorithms.shortest_paths.html > > You can make a networkx.Graph object from your adjacency matrix very > simply: > > graph = networkx.Graph(M) > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Jul 10 03:46:00 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 10 Jul 2014 09:46:00 +0200 Subject: [Numpy-discussion] numpy.partition and the ICC compiler Message-ID: <53BE44B8.3070908@googlemail.com> hi, there seems to be some issue with the newish selection code when compiling numpy with the ICC compiler. See this issue: https://github.com/numpy/numpy/issues/4836 I cannot reproduce the problem even when compiling with ICC myself. I have also tried valgrind and GCC's undefined behavior sanitizer without any results. Can somebody with debugging experience please try the posted testcase and if it is reproduce-able provide the information needed to fix this. It should also affect 1.8.0 if you replace the percentile call with np.partition(imc[i], (0, 1, 1027603, 1027604)) I would need a backtrace, the current register state, the local variables, disassembly and ideally the few steps until the crash/wrong result with variable state. Cheers, Julian From jtaylor.debian at googlemail.com Fri Jul 11 03:39:46 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 11 Jul 2014 09:39:46 +0200 Subject: [Numpy-discussion] np.zeros of structured array of array of objects Message-ID: <53BF94C2.7000407@googlemail.com> Hi, looking at https://github.com/numpy/numpy/issues/4857 I noticed that np.zeros of a structured array of array of objects only initializes the first element of if the embedded array to zero and leaves the rest None: In [1]: a = numpy.zeros(10, dtype=[('multiple objects', object, 2)]); a Out[1]: array([([0, None],), ([0, None],), ([0, None],), ([0, None],), ([0, None],), ([0, None],), ([0, None],), ([0, None],), ([0, None],), ([0, None],)], dtype=[('multiple objects', 'O', (2,))]) Is this the intented behavior? I would have expected all fields to be set to an int-object 0. If not can we change it or is it too likely people rely on this behavior? 
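In the meantime, a minimal workaround (assuming an integer 0 in every slot is what is wanted) is to fill the object field explicitly after creation:

    import numpy

    a = numpy.zeros(10, dtype=[('multiple objects', object, 2)])
    a['multiple objects'].fill(0)   # every element of the field is now the int 0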
From michael.lehn at uni-ulm.de Fri Jul 11 02:21:29 2014 From: michael.lehn at uni-ulm.de (Dr. Michael Lehn) Date: Fri, 11 Jul 2014 08:21:29 +0200 Subject: [Numpy-discussion] The BLAS problem (was: Re: Wiki page for building numerical stuff on Windows) In-Reply-To: References: <46818810418925962.495791sturla.molden-gmail.com@news.gmane.org> <517271708418928107.376969sturla.molden-gmail.com@news.gmane.org> Message-ID: Am 29.04.2014 um 02:01 schrieb Nathaniel Smith : > On Tue, Apr 29, 2014 at 12:52 AM, Sturla Molden wrote: >> On 29/04/14 01:30, Nathaniel Smith wrote: >> >>> I finally read this paper: >>> >>> http://www.cs.utexas.edu/users/flame/pubs/blis2_toms_rev2.pdf >>> >>> and I have to say that I'm no longer so convinced that OpenBLAS is the >>> right starting point. >> >> I think OpenBLAS in the long run is doomed as an OSS project. Having >> huge portions of the source in assembly is not sustainable in 2014. >> OpenBLAS (like GotoBLAS2 before it) runs a high risk of becoming >> abandonware. > > Have you read the paper I linked? I really recommend it. BLIS is > apparently 95% straight-up-C, plus a slot where you stick in a tiny > CPU-specific super-optimized kernel [1]. So this localizes the nasty > stuff to one tiny function, plus most of the kernels that have been > written so far do in fact use intrinsics [2]. > > [1] https://code.google.com/p/blis/wiki/KernelsHowTo > [2] https://code.google.com/p/blis/wiki/HardwareSupport > I was teaching this summer an undergraduate class ?Software Basics on HPC?. Of course on topic was the efficient implementation of the matrix-matrix product GEMM. The BLIS paper [1] is a great source for that. In my opinion having your own hands-on experience is very important for actually understanding this concepts. That in particular means that we implemented our own matrix-matrix product. The pure C (ANSI C) implementation has less than 450 lines of code. The code consists of several function and students developed these functions one by one from one assignment to the other. You can see the result here: http://apfel.mathematik.uni-ulm.de/~lehn/sghpc/gemm/page02/index.html#toc4 Other assignments where about improving the micro kernel with SSE instructions. You can travers through the pages to see how we where doing so step by step. Please understand that this course material is still work in progress and needs some polish here and there. Still it could be useful for others and even a starting point for a simple BLAS implementation. Cheers, Michael [1]: http://www.cs.utexas.edu/users/flame/pubs/BLISTOMSrev2.pdf ----------------------------------------------------------------------------------- Dr. Michael Lehn University of Ulm, Institute for Numerical Mathematics Helmholtzstr. 20 D-89069 Ulm, Germany Phone: (+49) 731 50-23534, Fax: (+49) 731 50-23548 ----------------------------------------------------------------------------------- From olivier.grisel at ensta.org Fri Jul 11 06:30:40 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Fri, 11 Jul 2014 12:30:40 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: 2014-07-10 0:53 GMT+02:00 Robert McGibbon : > This is an awesome resource for tons of projects. Thanks. 
FYI here is the PR for sklearn to use AppVeyor CI: https://github.com/scikit-learn/scikit-learn/pull/3363 It's slightly different from the minimalistic sample I wrote for python-appveyor-demo in the sense that for sklearn I decided to actually install the generated wheel package and run the tests on the resulting installed library rather than on the project source folder. -- Olivier From jeffreback at gmail.com Fri Jul 11 07:56:38 2014 From: jeffreback at gmail.com (Jeff) Date: Fri, 11 Jul 2014 04:56:38 -0700 (PDT) Subject: [Numpy-discussion] ANN: Pandas 0.14.0 Release Candidate 1 In-Reply-To: References: Message-ID: <78326bb6-44e0-41c4-8fbe-526b01cec592@googlegroups.com> Matthew, we posted the release of 0.14.1 last night. Are these picked up and build here automatically? https://nipy.bic.berkeley.edu/scipy_installers/ thanks Jeff On Saturday, May 17, 2014 7:22:00 AM UTC-4, Jeff wrote: > > Hi, > > I'm pleased to announce the availability of the first release candidate of > Pandas 0.14.0. > Please try this RC and report any issues here: Pandas Issues > > We will be releasing officially in about 2 weeks or so. > > This is a major release from 0.13.1 and includes a small number of API > changes, several new features, enhancements, and > performance improvements along with a large number of bug fixes. > > Highlights include: > > - Officially support Python 3.4 > - SQL interfaces updated to use sqlalchemy, > - Display interface changes > - MultiIndexing Using Slicers > - Ability to join a singly-indexed DataFrame with a multi-indexed > DataFrame > - More consistency in groupby results and more flexible groupby > specifications > - Holiday calendars are now supported in CustomBusinessDay > - Several improvements in plotting functions, including: hexbin, area > and pie plots. > - Performance doc section on I/O operations > > Since there are some significant changes in the default way DataFrames are > displayed. I have put > up a comment issue looking for some feedback here > > > Here are the full whatsnew and documentation links: > > v0.14.0 Whatsnew > > > v0.14.0 Documentation Page > > > Source tarballs, and windows builds are available here: > > Pandas v0.14rc1 Release > > A big thank you to everyone who contributed to this release! > > Jeff > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Fri Jul 11 09:31:15 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 11 Jul 2014 09:31:15 -0400 Subject: [Numpy-discussion] ANN: pandas 0.14.1 released Message-ID: Hello, We are proud to announce v0.14.1 of pandas, a minor release from 0.14.0. This release includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. This was 1.5 months of work with 244 commits by 45 authors encompassing 306 issues. We recommend that all users upgrade to this version. *Highlights:* - New method select_dtypes() to select columns based on the dtype - New method sem() to calculate the standard error of the mean. - Support for dateutil timezones (see *docs* ). - Support for ignoring full line comments in the read_csv() text parser. - New documentation section on *Options and Settings* . - Lots of bug fixes For a more a full description of Whatsnew for v0.14.1 here: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html *What is it:* *pandas* is a Python package providing fast, flexible, and expressive data structures designed to make working with ?relational? or ?labeled? 
data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. Documentation: http://pandas.pydata.org/pandas-docs/stable/ Source tarballs, windows binaries are available on PyPI: https://pypi.python.org/pypi/pandas windows binaries are courtesy of Christoph Gohlke and are built on Numpy 1.8 macosx wheels will be available soon, courtesy of Matthew Brett Please report any issues here: https://github.com/pydata/pandas/issues Thanks The Pandas Development Team Contributors to the 0.14.1 release - Andrew Rosenfeld - Andy Hayden - Benjamin Adams - Benjamin M. Gross - Brian Quistorff - Brian Wignall - bwignall - clham - Daniel Waeber - David Bew - David Stephens - DSM - dsm054 - helger - immerrr - Jacob Schaer - jaimefrio - Jan Schulz - John David Reaver - John W. O?Brien - Joris Van den Bossche - jreback - Julien Danjou - Kevin Sheppard - K.-Michael Aye - Kyle Meyer - lexual - Matthew Brett - Matt Wittmann - Michael Mueller - Mortada Mehyar - onesandzeroes - Phillip Cloud - Rob Levy - rockg - sanguineturtle - Schaer, Jacob C - seth-p - sinhrks - Stephan Hoyer - Thomas Kluyver - Todd Jennings - TomAugspurger - unknown - yelite -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jul 11 10:35:48 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 11 Jul 2014 15:35:48 +0100 Subject: [Numpy-discussion] np.zeros of structured array of array of objects In-Reply-To: <53BF94C2.7000407@googlemail.com> References: <53BF94C2.7000407@googlemail.com> Message-ID: On Fri, Jul 11, 2014 at 8:39 AM, Julian Taylor wrote: > Hi, > looking at https://github.com/numpy/numpy/issues/4857 I noticed that > np.zeros of a structured array of array of objects only initializes the > first element of if the embedded array to zero and leaves the rest None: > > In [1]: a = numpy.zeros(10, dtype=[('multiple objects', object, 2)]); a > Out[1]: > array([([0, None],), ([0, None],), ([0, None],), ([0, None],), > ([0, None],), ([0, None],), ([0, None],), ([0, None],), > ([0, None],), ([0, None],)], > dtype=[('multiple objects', 'O', (2,))]) > > > Is this the intented behavior? I would have expected all fields to be > set to an int-object 0. > If not can we change it or is it too likely people rely on this behavior? Looks like a bug to me, and I can't off-hand think of any reason why anyone would be relying on this... I vote that unless someone speaks up we just fix it. If it really does break anything then we can always catch that in beta... -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From var.mail.daniel at gmail.com Fri Jul 11 16:30:36 2014 From: var.mail.daniel at gmail.com (Daniel da Silva) Date: Fri, 11 Jul 2014 16:30:36 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: I think the idea at hand is not that it would be used everyday, but it would be there when needed. What people do everyday is with *real* data. They are using functions to load the data. Where this would come in useful would be presentations and tutorials. If leading a presentation on scientific computing in Python to beginners, which would look better on a bullet in a slide? 
- np.build('.2 .7 .1; .3 .5 .2; .1 .1 .9')) - np.array([[.2, .7, .1], [.3, .5, .2], [.1, .1, .9]]) The default way of defining contrived arrays by passing lists of lists is awkward for beginners. While lists of lists are not a hard concept, it's not something you want to force on someone who doesn't know the Python language yet. The second bullet above doesn't represent the readability of the Python world. I would suggest that this be named np.build() (or np.helpers.build()) in light of it providing a simple interface to building arrays. Again, when you work with real data you are taking an extra step to think about how you load that data. That's not what you need to think about when being introduced to NumPy. On Tue, Jul 8, 2014 at 9:09 AM, Pauli Virtanen wrote: > 07.07.2014 21:32, Chris Barker - NOAA Federal kirjoitti: > > If you are going to introduce this functionality, please don't call it > > np.arr. > > It might be appropriate for pirate versions of Numpy. > > *** > > Seriously though, having a variant of `mat` that returns arrays could be > useful, so weak +0. Preferably, the name should be quite short to type. > > On the other hand, unlike r_ and c_, I haven't seen or used mat() in > real code. > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Fri Jul 11 12:10:30 2014 From: rays at blue-cove.com (RayS) Date: Fri, 11 Jul 2014 09:10:30 -0700 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 Release Candidate 1 In-Reply-To: <78326bb6-44e0-41c4-8fbe-526b01cec592@googlegroups.com> References: <78326bb6-44e0-41c4-8fbe-526b01cec592@googlegroups.com> Message-ID: <201407111610.s6BGAXia005296@blue-cove.com> At 04:56 AM 7/11/2014, you wrote: >Matthew, we posted the release of 0.14.1 last night. Are these >picked up and build here automatically? >https://nipy.bic.berkeley.edu/scipy_installers/ I see it's at http://www.lfd.uci.edu/~gohlke/pythonlibs/#pandas - Ray From jeffreback at gmail.com Sat Jul 12 06:33:37 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Sat, 12 Jul 2014 06:33:37 -0400 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 Release Candidate 1 In-Reply-To: <201407111610.s6BGAXia005296@blue-cove.com> References: <78326bb6-44e0-41c4-8fbe-526b01cec592@googlegroups.com> <201407111610.s6BGAXia005296@blue-cove.com> Message-ID: <2C2898F6-6754-4798-A2D0-7BAD32FE57AA@gmail.com> Ray Matthew builds Mac osx wheels for scipy stack (those are windows binaries) thanks anyhow > On Jul 11, 2014, at 12:10 PM, RayS wrote: > > At 04:56 AM 7/11/2014, you wrote: >> Matthew, we posted the release of 0.14.1 last night. Are these >> picked up and build here automatically? >> https://nipy.bic.berkeley.edu/scipy_installers/ > > I see it's at http://www.lfd.uci.edu/~gohlke/pythonlibs/#pandas > > - Ray > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sat Jul 12 13:17:14 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Jul 2014 12:17:14 -0500 Subject: [Numpy-discussion] String type again. Message-ID: As previous posts have pointed out, Numpy's `S` type is currently treated as a byte string, which leads to more complicated code in python3. 
OTOH, the unicode type is stored as UCS4, which consumes a lot of space, especially for ascii strings. This note proposes to adapt the currently existing 'a' type letter, currently aliased to 'S', as a new fixed encoding dtype. Python 3.3 introduced two one byte internal representations for unicode strings, ascii and latin1. Ascii has the advantage that it is a subset of UTF-8, whereas latin1 has a few more symbols. Another possibility is to just make it an UTF-8 encoding, but I think this would involve more overhead as Python would need to determine the maximum character size. These are just preliminary thoughts, comments are welcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From joseluismietta at yahoo.com.ar Sat Jul 12 12:53:45 2014 From: joseluismietta at yahoo.com.ar (=?iso-8859-1?Q?Jos=E8_Luis_Mietta?=) Date: Sat, 12 Jul 2014 09:53:45 -0700 Subject: [Numpy-discussion] plt.show() and plt.draw() doesnt work Message-ID: <1405184025.93121.YahooMailNeo@web142302.mail.bf1.yahoo.com> Hi experts! I have a numpy array M. I generate a graph using NetworkX and then I want to draw this graph: ??? import networkx as nx ??? import matplotlib.pyplot as plt ??? G=nx.graph(M) ??? nx.draw(G) ??? plt.draw() Doing this, no picture appears. In addition, if I do `plt.show()` no picture appears. Please help! Best regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Sat Jul 12 17:32:24 2014 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Sat, 12 Jul 2014 23:32:24 +0200 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: On 11 July 2014 22:30, Daniel da Silva wrote: > I think the idea at hand is not that it would be used everyday, but it > would be there when needed. What people do everyday is with *real* data. > They are using functions to load the data. > But sometimes we have to hard-code a few values, and it is true that making a list (or nested list) is quite verbose; one example are unittests. Having a MATLAB-style array creation would be convenient for those cases. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Jul 12 20:02:37 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 13 Jul 2014 01:02:37 +0100 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On 12 Jul 2014 23:06, "Charles R Harris" wrote: > > As previous posts have pointed out, Numpy's `S` type is currently treated as a byte string, which leads to more complicated code in python3. OTOH, the unicode type is stored as UCS4, which consumes a lot of space, especially for ascii strings. This note proposes to adapt the currently existing 'a' type letter, currently aliased to 'S', as a new fixed encoding dtype. Python 3.3 introduced two one byte internal representations for unicode strings, ascii and latin1. Ascii has the advantage that it is a subset of UTF-8, whereas latin1 has a few more symbols. Another possibility is to just make it an UTF-8 encoding, but I think this would involve more overhead as Python would need to determine the maximum character size. These are just preliminary thoughts, comments are welcome. I feel like for most purposes, what we *really* want is a variable length string dtype (I.e., where each element can be a different length.). 
Pandas pays quite some price in overhead to fake this right now. Adding such a thing will cause some problems regarding compatibility (what to do with array(["foo"])) and education, but I think it's worth it in the long run. A variable length string with out of band storage also would allow for a lot of py3.3-style storage tricks of we want then. Given that, though, I'm a little dubious about adding a third fixed length string type, since it seems like it might be a temporary patch, yet raises the prospect of having to indefinitely support *5* distinct string types (3 of which will map to py3 str)... OTOH, fixed length nul padded latin1 would be useful for various flat file reading tasks. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jul 13 08:11:07 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 13 Jul 2014 14:11:07 +0200 Subject: [Numpy-discussion] plt.show() and plt.draw() doesnt work In-Reply-To: <1405184025.93121.YahooMailNeo@web142302.mail.bf1.yahoo.com> References: <1405184025.93121.YahooMailNeo@web142302.mail.bf1.yahoo.com> Message-ID: On Sat, Jul 12, 2014 at 6:53 PM, Jos? Luis Mietta < joseluismietta at yahoo.com.ar> wrote: > Hi experts! > > I have a numpy array M. I generate a graph using NetworkX and then I want > to draw this graph: > > import networkx as nx > import matplotlib.pyplot as plt > G=nx.graph(M) > nx.draw(G) > plt.draw() > > Doing this, no picture appears. In addition, if I do `plt.show()` no > picture appears. > You're getting a TypeError I guess? The third line is incorrect, should be G = nx.graph.Graph(M) If that's not the issue and it's really about plotting, you should ask on the matplotlib users list. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Sun Jul 13 13:05:48 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sun, 13 Jul 2014 13:05:48 -0400 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Sat, Jul 12, 2014 at 8:02 PM, Nathaniel Smith wrote: > I feel like for most purposes, what we *really* want is a variable length > string dtype (I.e., where each element can be a different length.). I've been toying with the idea of creating an array type for interned strings. In many applications dealing with large arrays of variable size strings, the strings come from a relatively short set of names. Arrays of interned strings can be manipulated very efficiently because in may respects they are just like arrays of integers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Sun Jul 13 13:13:57 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sun, 13 Jul 2014 13:13:57 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: On Fri, Jul 11, 2014 at 4:30 PM, Daniel da Silva wrote: > If leading a presentation on scientific computing in Python to beginners, > which would look better on a bullet in a slide? > > - > > np.build('.2 .7 .1; .3 .5 .2; .1 .1 .9')) > > - > > np.array([[.2, .7, .1], [.3, .5, .2], [.1, .1, .9]]) > > > np.array([[.2, .7, .1], [.3, .5, .2], [.1, .1, .9]]) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ndarray at mac.com Sun Jul 13 13:31:14 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sun, 13 Jul 2014 13:31:14 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: Also, the use of strings will confuse most syntax highlighters. Compare the two options in this screenshot: [image: Inline image 2] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2014-07-13 at 1.29.20 PM.png Type: image/png Size: 26129 bytes Desc: not available URL: From ben.root at ou.edu Mon Jul 14 09:23:22 2014 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 14 Jul 2014 09:23:22 -0400 Subject: [Numpy-discussion] plt.show() and plt.draw() doesnt work In-Reply-To: <1405184025.93121.YahooMailNeo@web142302.mail.bf1.yahoo.com> References: <1405184025.93121.YahooMailNeo@web142302.mail.bf1.yahoo.com> Message-ID: Please send this question to the matplotlib-users mailing list (if you haven't already, I am still going through a huge backlog). This is the NumPy list. Ben Root On Sat, Jul 12, 2014 at 12:53 PM, Jos? Luis Mietta < joseluismietta at yahoo.com.ar> wrote: > Hi experts! > > I have a numpy array M. I generate a graph using NetworkX and then I want > to draw this graph: > > import networkx as nx > import matplotlib.pyplot as plt > G=nx.graph(M) > nx.draw(G) > plt.draw() > > Doing this, no picture appears. In addition, if I do `plt.show()` no > picture appears. > > Please help! > > Best regards > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Mon Jul 14 13:00:45 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 14 Jul 2014 19:00:45 +0200 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: 2014-07-13 19:05 GMT+02:00 Alexander Belopolsky : > > On Sat, Jul 12, 2014 at 8:02 PM, Nathaniel Smith wrote: >> >> I feel like for most purposes, what we *really* want is a variable length >> string dtype (I.e., where each element can be a different length.). > > > > I've been toying with the idea of creating an array type for interned > strings. In many applications dealing with large arrays of variable size > strings, the strings come from a relatively short set of names. Arrays of > interned strings can be manipulated very efficiently because in may respects > they are just like arrays of integers. +1 I think this is why pandas is using dtype=object to load string data: in many cases short string values are used to represent categorical variables with a comparatively small cardinality of possible values for a dataset with comparatively numerous records. In that case the dtype=object is not that bad as it just stores pointer on string objects managed by Python. It's possible to intern the strings manually at load time (I don't know if pandas or python already do it automatically in that case). The integer semantics is good for that case. Having an explicit dtype might be even better. 
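As a rough sketch of that integer semantics with plain numpy today (no new dtype involved), a string column can be factored into integer codes plus a lookup table:

    import numpy

    names = numpy.array(['spam', 'egg', 'spam', 'ham', 'egg', 'spam'], dtype=object)
    levels, codes = numpy.unique(names, return_inverse=True)
    # 'codes' is an ordinary int array; levels[codes] reconstructs the strings,
    # and operations such as counting reduce to integer work:
    counts = numpy.bincount(codes)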
-- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From andrew.collette at gmail.com Mon Jul 14 13:39:41 2014 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 14 Jul 2014 11:39:41 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: Hi Chuck, > This note proposes to adapt the currently existing 'a' > type letter, currently aliased to 'S', as a new fixed encoding dtype. Python > 3.3 introduced two one byte internal representations for unicode strings, > ascii and latin1. Ascii has the advantage that it is a subset of UTF-8, > whereas latin1 has a few more symbols. Another possibility is to just make > it an UTF-8 encoding, but I think this would involve more overhead as Python > would need to determine the maximum character size. For storing data in HDF5 (PyTables or h5py), it would be somewhat cleaner if either ASCII or UTF-8 are used, as these are the only two charsets officially supported by the library. Latin-1 would require a custom read/write converter, which isn't the end of the world but would be tricky to do in a correct way, and likely somewhat slow. We'd also run into truncation issues since certain latin-1 chars become multibyte sequences in UTF8. I assume 'a' strings would still be null-padded? Andrew From charlesr.harris at gmail.com Mon Jul 14 14:22:43 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 14 Jul 2014 12:22:43 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ Message-ID: Hi All, Julian has raised the question of including numpy_ufunc in numpy 1.9. I don't feel strongly one way or the other, but it doesn't seem to be finished yet and 1.10 might be a better place to work out the remaining problems along with the astropy folks testing possible uses. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Jul 14 16:13:00 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 14 Jul 2014 13:13:00 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Sat, Jul 12, 2014 at 10:17 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > As previous posts have pointed out, Numpy's `S` type is currently treated > as a byte string, which leads to more complicated code in python3. > Also, a byte string in py3 is not, in fact the same as the py2 string type. So we have a problem -- if we want 'S' to mean what it essentially does in py2, what do we map it to in pure-python land? I propose we embrace the py3 model as fully as possible: There is text data, and there is binary data. In py3, that is 'str' and 'bytes'. So numpy should have dtypes to match these. We're a bit stuck, however, because 'S' mapped to the py2 string type, which no longer exists in py3. Sorry not running py3 to see what 'S' does now, but I know it's bit broken, and may be too late to change it. But: it is certainly a common case in the scientific world to have 1-byte-per-character string data, and care about store size. So a 1-byte-per-character text data types may be a good idea: As for a bytes type -- do we need it, or are we fine with simply using uint8 arrays? (or, even the most common case, converting directly to the type that is actually stored in those bytes... > especially for ascii strings. This note proposes to adapt the currently > existing 'a' type letter, currently aliased to 'S', as a new fixed encoding > dtype. 
>
+1

> Python 3.3 introduced two one byte internal representations for unicode
> strings, ascii and latin1. Ascii has the advantage that it is a subset of
> UTF-8, whereas latin1 has a few more symbols.
>
+1 for latin-1 -- those extra symbols are handy. Also, at least with Python's stdlib encoding, you can round-trip any binary data through latin-1 -- kind of making it act like a bytes object....

> Another possibility is to just make it an UTF-8 encoding, but I think this
> would involve more overhead as Python would need to determine the maximum
> character size.
>
yeah -- that is a) overhead, and b) breaks the numpy fixed size dtype model. And it's trickier for numpy arrays, 'cause they are mutable -- python strings can do OK, as they don't need to accommodate potentially changing sizes of strings.

On Sat, Jul 12, 2014 at 5:02 PM, Nathaniel Smith wrote:

> I feel like for most purposes, what we *really* want is a variable length
> string dtype (I.e., where each element can be a different length.).

well, that is fundamentally different than the usual numpy data model -- it would require that the array store pointers and dereference them on use -- is there anywhere else in numpy (other than the object dtype) that does that? And if we did -- would it end up having any advantage over putting strings in an object array? Or for that matter, using a list of strings instead?

> Pandas pays quite some price in overhead to fake this right now. Adding
> such a thing will cause some problems regarding compatibility (what to do
> with array(["foo"])) and education, but I think it's worth it in the long
> run.

i.e. do you use the fixed-length type or the variable-length type? I'm not sure it's too much of a killer to have a default and let the user set a dtype if they want something else.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hodgson.neil at yahoo.co.uk Tue Jul 15 05:22:56 2014
From: hodgson.neil at yahoo.co.uk (Neil Hodgson)
Date: Tue, 15 Jul 2014 10:22:56 +0100
Subject: [Numpy-discussion] Bug in np.cross for 2D vectors
Message-ID: <1405416176.45058.YahooMailNeo@web133104.mail.ir2.yahoo.com>

Hi,

We came across this bug while using np.cross on 3D arrays of 2D vectors. The first example shows the problem, and we looked at the source for np.cross and believe we found the bug - an unnecessary swapaxes when returning the output (comment inserted in the code).

Thanks
Neil

# Example
shape = (3,5,7,2)
# These are effectively 3D arrays (3*5*7) of 2D vectors
data1 = np.random.randn(*shape)
data2 = np.random.randn(*shape)
# The cross product of data1 and data2 should produce a (3*5*7) array of scalars
cross_product_longhand = data1[:,:,:,0]*data2[:,:,:,1]-data1[:,:,:,1]*data2[:,:,:,0]
print 'longhand output shape:',cross_product_longhand.shape # and it does
cross_product_numpy = np.cross(data1,data2)
print 'numpy output shape:',cross_product_numpy.shape
# It seems to have transposed the last 2 dimensions
if (cross_product_longhand == np.transpose(cross_product_numpy, (0,2,1))).all():
    print 'Unexpected transposition in numpy.cross (numpy version %s)'%np.__version__

# np.cross L1464
if axis is not None:
    axisa, axisb, axisc = (axis,)*3
a = asarray(a).swapaxes(axisa, 0)
b = asarray(b).swapaxes(axisb, 0)
msg = "incompatible dimensions for cross product\n"\
      "(dimension must be 2 or 3)"
if (a.shape[0] not in [2, 3]) or (b.shape[0] not in [2, 3]):
    raise ValueError(msg)
if a.shape[0] == 2:
    if (b.shape[0] == 2):
        cp = a[0]*b[1] - a[1]*b[0]
        if cp.ndim == 0:
            return cp
        else:
            ## WE SHOULD NOT SWAPAXES HERE!
            ## For 2D vectors the first axis has been
            ## collapsed during the cross product
            return cp.swapaxes(0, axisc)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From J.M.Hoekstra at tudelft.nl Tue Jul 15 02:33:30 2014
From: J.M.Hoekstra at tudelft.nl (Jacco Hoekstra - LR)
Date: Tue, 15 Jul 2014 06:33:30 +0000
Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style
In-Reply-To:
References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid>
Message-ID: <245AC908B39361438CFA2299B0DD50E438FC7B7C@SRV361.tudelft.net>

Well, I do not see the confusion here (only due to the use of the array function, maybe). It is a string, after all, so it should be colour-coded as such.

I would love to keep this feature of np.mat in somehow, named np.txt2arr or something. We, linear algebraists, will already lose the .I method for matrix inversion, the * for matrix multiplication, let's keep at least one of the many handy features of the matrix type in.

It is simply a very useful, short-hand way, probably a separate function, to make a 2D array. If you think it's ugly, don't use it. But it certainly is faster to type, and former Matlab users will love it as well.

Just my 2 cts.

From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Alexander Belopolsky
Sent: zondag 13 juli 2014 19:31
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Short-hand array creation in `numpy.mat` style

Also, the use of strings will confuse most syntax highlighters. Compare the two options in this screenshot:
[Inline image 2]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.jpg
Type: image/jpeg
Size: 7699 bytes
Desc: image002.jpg
URL:

From njs at pobox.com Tue Jul 15 06:55:13 2014
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 15 Jul 2014 11:55:13 +0100
Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style
In-Reply-To:
References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid>
Message-ID:

On Sun, Jul 13, 2014 at 6:31 PM, Alexander Belopolsky wrote:

> Also, the use of strings will confuse most syntax highlighters. Compare
> the two options in this screenshot:
>
> [image: Inline image 2]
>

I guess this is a minor issue for "real" code, but even IPython doesn't (yet?) provide syntax highlighting for lines as they're typed, and this is a tool intended mainly for interactive use.

That screenshot also I think illustrates why people have such a preference for the first syntax. The second line looks nice, but try typing it quickly and getting all the commas located correctly inside versus outside of each of the triply-nested brackets...

No-one's come up with any names for this that are nearly as good as "arr". Is it really that bad to have to type one extra character, np.array instead of np.arr?

-n

--
Nathaniel J.
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2014-07-13 at 1.29.20 PM.png Type: image/png Size: 26129 bytes Desc: not available URL: From jeffreback at gmail.com Tue Jul 15 06:56:11 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Tue, 15 Jul 2014 06:56:11 -0400 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: <5A29EC4A-CFE7-4B16-9C0A-4541B5544D62@gmail.com> in 0.15.0 pandas will have full fledged support for categoricals which in effect allow u 2 map a smaller number of strings to integers this is now in pandas master http://pandas-docs.github.io/pandas-docs-travis/categorical.html feedback welcome! > On Jul 14, 2014, at 1:00 PM, Olivier Grisel wrote: > > 2014-07-13 19:05 GMT+02:00 Alexander Belopolsky : >> >>> On Sat, Jul 12, 2014 at 8:02 PM, Nathaniel Smith wrote: >>> >>> I feel like for most purposes, what we *really* want is a variable length >>> string dtype (I.e., where each element can be a different length.). >> >> >> >> I've been toying with the idea of creating an array type for interned >> strings. In many applications dealing with large arrays of variable size >> strings, the strings come from a relatively short set of names. Arrays of >> interned strings can be manipulated very efficiently because in may respects >> they are just like arrays of integers. > > +1 I think this is why pandas is using dtype=object to load string > data: in many cases short string values are used to represent > categorical variables with a comparatively small cardinality of > possible values for a dataset with comparatively numerous records. > > In that case the dtype=object is not that bad as it just stores > pointer on string objects managed by Python. It's possible to intern > the strings manually at load time (I don't know if pandas or python > already do it automatically in that case). The integer semantics is > good for that case. Having an explicit dtype might be even better. > > -- > Olivier > http://twitter.com/ogrisel - http://github.com/ogrisel > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Tue Jul 15 07:26:30 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 15 Jul 2014 13:26:30 +0200 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: <1405423590.8281.7.camel@sebastian-t440> On Sa, 2014-07-12 at 12:17 -0500, Charles R Harris wrote: > As previous posts have pointed out, Numpy's `S` type is currently > treated as a byte string, which leads to more complicated code in > python3. OTOH, the unicode type is stored as UCS4, which consumes a > lot of space, especially for ascii strings. This note proposes to > adapt the currently existing 'a' type letter, currently aliased to > 'S', as a new fixed encoding dtype. Python 3.3 introduced two one byte > internal representations for unicode strings, ascii and latin1. Ascii > has the advantage that it is a subset of UTF-8, whereas latin1 has a > few more symbols. Another possibility is to just make it an UTF-8 > encoding, but I think this would involve more overhead as Python would > need to determine the maximum character size. 
These are just > preliminary thoughts, comments are welcome. > Just wondering, couldn't we have a type which actually has an (arbitrary, python supported) encoding (and "bytes" might even just be a special case of no encoding)? Basically storing bytes and on access do element[i].decode(specified_encoding) and on storing element[i] = value.encode(specified_encoding). There is always the never ending small issue of trailing null bytes. If we want to be fully compatible, such a type would have to store the string length explicitly to support trailing null bytes. - Sebastian > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From jaime.frio at gmail.com Tue Jul 15 07:41:57 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Tue, 15 Jul 2014 04:41:57 -0700 Subject: [Numpy-discussion] Bug in np.cross for 2D vectors In-Reply-To: <1405416176.45058.YahooMailNeo@web133104.mail.ir2.yahoo.com> References: <1405416176.45058.YahooMailNeo@web133104.mail.ir2.yahoo.com> Message-ID: On Tue, Jul 15, 2014 at 2:22 AM, Neil Hodgson wrote: > Hi, > > We came across this bug while using np.cross on 3D arrays of 2D vectors. > What version of numpy are you using? This should already be solved in numpy master, and be part of the 1.9 release. Here's the relevant commit, although the code has been cleaned up a bit in later ones: https://github.com/numpy/numpy/commit/b9454f50f23516234c325490913224c3a69fb122 Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 15 11:15:17 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2014 09:15:17 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: <1405423590.8281.7.camel@sebastian-t440> References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: On Tue, Jul 15, 2014 at 5:26 AM, Sebastian Berg wrote: > On Sa, 2014-07-12 at 12:17 -0500, Charles R Harris wrote: > > As previous posts have pointed out, Numpy's `S` type is currently > > treated as a byte string, which leads to more complicated code in > > python3. OTOH, the unicode type is stored as UCS4, which consumes a > > lot of space, especially for ascii strings. This note proposes to > > adapt the currently existing 'a' type letter, currently aliased to > > 'S', as a new fixed encoding dtype. Python 3.3 introduced two one byte > > internal representations for unicode strings, ascii and latin1. Ascii > > has the advantage that it is a subset of UTF-8, whereas latin1 has a > > few more symbols. Another possibility is to just make it an UTF-8 > > encoding, but I think this would involve more overhead as Python would > > need to determine the maximum character size. These are just > > preliminary thoughts, comments are welcome. > > > > Just wondering, couldn't we have a type which actually has an > (arbitrary, python supported) encoding (and "bytes" might even just be a > special case of no encoding)? Basically storing bytes and on access do > element[i].decode(specified_encoding) and on storing element[i] = > value.encode(specified_encoding). > > There is always the never ending small issue of trailing null bytes. 
If > we want to be fully compatible, such a type would have to store the > string length explicitly to support trailing null bytes. > UTF-8 encoding works with null bytes. That is one of the reasons it is so popular. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 15 11:29:13 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2014 09:29:13 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: On Tue, Jul 15, 2014 at 9:15 AM, Charles R Harris wrote: > > > > On Tue, Jul 15, 2014 at 5:26 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> On Sa, 2014-07-12 at 12:17 -0500, Charles R Harris wrote: >> > As previous posts have pointed out, Numpy's `S` type is currently >> > treated as a byte string, which leads to more complicated code in >> > python3. OTOH, the unicode type is stored as UCS4, which consumes a >> > lot of space, especially for ascii strings. This note proposes to >> > adapt the currently existing 'a' type letter, currently aliased to >> > 'S', as a new fixed encoding dtype. Python 3.3 introduced two one byte >> > internal representations for unicode strings, ascii and latin1. Ascii >> > has the advantage that it is a subset of UTF-8, whereas latin1 has a >> > few more symbols. Another possibility is to just make it an UTF-8 >> > encoding, but I think this would involve more overhead as Python would >> > need to determine the maximum character size. These are just >> > preliminary thoughts, comments are welcome. >> > >> >> Just wondering, couldn't we have a type which actually has an >> (arbitrary, python supported) encoding (and "bytes" might even just be a >> special case of no encoding)? Basically storing bytes and on access do >> element[i].decode(specified_encoding) and on storing element[i] = >> value.encode(specified_encoding). >> >> There is always the never ending small issue of trailing null bytes. If >> we want to be fully compatible, such a type would have to store the >> string length explicitly to support trailing null bytes. >> > > UTF-8 encoding works with null bytes. That is one of the reasons it is so > popular. > > Thinking more about it, the easiest thing to do might be to make the S dtype a UTF-8 encoding. Most of the machinery to deal with that is already in place. That change might affect some users though, and we might need to do some work to make it backwards compatible with python 2. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Jul 15 12:18:30 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 15 Jul 2014 09:18:30 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Mon, Jul 14, 2014 at 10:39 AM, Andrew Collette wrote: > For storing data in HDF5 (PyTables or h5py), it would be somewhat > cleaner if either ASCII or UTF-8 are used, as these are the only two > charsets officially supported by the library. 
good argument for ASCII, but utf-8 is a bad idea, as there is no 1:1 correspondence between length of string in bytes and length in characters -- as numpy needs to pre-allocate a defined number of bytes for a dtype, there is a disconnect between the user and numpy as to how long a string is being stored...this isn't a problem for immutable strings, and less of a problem for HDF, as you can determine how many bytes you need before you write the file (or does HDF support var-length elements?) > Latin-1 would require a > custom read/write converter, which isn't the end of the world "custom"? it would be an encoding operation -- which you'd need to go from utf-8 to/from unicode anyway. So you would lose the ability to have a nice 1:1 binary representation map between numpy and HDF... good argument for ASCII, I guess. Or for HDF to use latin-1 ;-) Does HDF enforce ascii-only? what does it do with the > 127 values? > would be tricky to do in a correct way, and likely somewhat slow. > We'd also run into truncation issues since certain latin-1 chars > become multibyte sequences in UTF8. > that's the whole issue with UTF-8 -- it needs to be addressed somewhere, and the numpy-HDF interface seems like a smarter place to put it than the numpy-user interface! I assume 'a' strings would still be null-padded? yup. -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Jul 15 12:32:00 2014 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 15 Jul 2014 12:32:00 -0400 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: Perhaps a bit of context might be useful? How is numpy_ufunc different from the ufuncs that we know and love? What are the known implications? What are the known shortcomings? Are there ABI and/or API concerns between 1.9 and 1.10? Ben Root On Mon, Jul 14, 2014 at 2:22 PM, Charles R Harris wrote: > Hi All, > > Julian has raised the question of including numpy_ufunc in numpy 1.9. I > don't feel strongly one way or the other, but it doesn't seem to be > finished yet and 1.10 might be a better place to work out the remaining > problems along with the astropy folks testing possible uses. > > Thoughts? > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Tue Jul 15 14:06:26 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 15 Jul 2014 20:06:26 +0200 Subject: [Numpy-discussion] __numpy_ufunc__ and 1.9 release Message-ID: <53C56DA2.40402@googlemail.com> hi, as you may know we want to release numpy 1.9 soon. We should have solved most indexing regressions the first beta showed. The remaining blockers are finishing the new __numpy_ufunc__ feature. This feature should allow for alternative method to overriding the behavior of ufuncs from subclasses. 
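Roughly, the hook lets an object intercept ufunc calls that involve it; a bare-bones sketch follows (the method signature is the one given in the NEP referenced just below, everything else is simplified and not meant as the final semantics):

    import numpy as np

    class Wrapped(object):
        def __init__(self, data):
            self.data = np.asarray(data)

        def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
            # 'method' is e.g. '__call__' or 'reduce'; 'i' is our position in 'inputs'
            if method != '__call__':
                return NotImplemented
            args = [x.data if isinstance(x, Wrapped) else x for x in inputs]
            return Wrapped(ufunc(*args, **kwargs))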
It is described here: https://github.com/numpy/numpy/blob/master/doc/neps/ufunc-overrides.rst The current blocker issues are: https://github.com/numpy/numpy/issues/4753 https://github.com/numpy/numpy/pull/4815 I'm not to familiar with all the complications of subclassing so I can't really say how hard this is to solve. My issue is that it there still seems to be debate on how to handle operator overriding correctly and I am opposed to releasing a numpy with yet another experimental feature that may or may not be finished sometime later. Having datetime in infinite experimental state is bad enough. I think nobody is served well if we release 1.9 with the feature prematurely based on a not representative set of users and the later after more users showed up see we have to change its behavior. So I'm wondering if we should delay the introduction of this feature to 1.10 or is it important enough to wait until there is a consensus on the remaining issues? From shoyer at gmail.com Tue Jul 15 14:21:39 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 15 Jul 2014 11:21:39 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Mon, Jul 14, 2014 at 10:00 AM, Olivier Grisel wrote: > 2014-07-13 19:05 GMT+02:00 Alexander Belopolsky : > > I've been toying with the idea of creating an array type for interned > > strings. In many applications dealing with large arrays of variable size > > strings, the strings come from a relatively short set of names. Arrays > of > > interned strings can be manipulated very efficiently because in may > respects > > they are just like arrays of integers. > > +1 I think this is why pandas is using dtype=object to load string > data: in many cases short string values are used to represent > categorical variables with a comparatively small cardinality of > possible values for a dataset with comparatively numerous records. > Pandas has a new "categorical" type (just merged into master) which is pretty similar to interned strings: https://github.com/pydata/pandas/pull/7217 http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html Of course, it would be ideal for numpy itself to natively support categoricals and variables length strings. Best, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Tue Jul 15 14:40:58 2014 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Tue, 15 Jul 2014 14:40:58 -0400 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Sat, Jul 12, 2014 at 8:02 PM, Nathaniel Smith wrote: > On 12 Jul 2014 23:06, "Charles R Harris" > wrote: > > > > As previous posts have pointed out, Numpy's `S` type is currently > treated as a byte string, which leads to more complicated code in python3. > OTOH, the unicode type is stored as UCS4, which consumes a lot of space, > especially for ascii strings. This note proposes to adapt the currently > existing 'a' type letter, currently aliased to 'S', as a new fixed encoding > dtype. Python 3.3 introduced two one byte internal representations for > unicode strings, ascii and latin1. Ascii has the advantage that it is a > subset of UTF-8, whereas latin1 has a few more symbols. Another possibility > is to just make it an UTF-8 encoding, but I think this would involve more > overhead as Python would need to determine the maximum character size. > These are just preliminary thoughts, comments are welcome. 
> > I feel like for most purposes, what we *really* want is a variable length > string dtype (I.e., where each element can be a different length.). Pandas > pays quite some price in overhead to fake this right now. Adding such a > thing will cause some problems regarding compatibility (what to do with > array(["foo"])) and education, but I think it's worth it in the long run. A > variable length string with out of band storage also would allow for a lot > of py3.3-style storage tricks of we want then. > > Given that, though, I'm a little dubious about adding a third fixed length > string type, since it seems like it might be a temporary patch, yet raises > the prospect of having to indefinitely support *5* distinct string types (3 > of which will map to py3 str)... > > OTOH, fixed length nul padded latin1 would be useful for various flat file > reading tasks. > As one of the original agitators for this, let me re-iterate that what the astronomical community *really* wants is the original proposal as described by Chris Barker [1] and essentially what Charles said. We have large data archives that have ASCII string data in binary formats like FITS and HDF5. The current readers for those datasets present users with numpy S data types, which in Python 3 cannot be compared to str (unicode) literals. In many cases those datasets are large, and in my case I regularly deal with multi-Gb sized bytestring arrays. Converting those to a U dtype is not practical. This issue is the sole blocker that I personally have in beginning to move our operations code base to be Python 3 compatible, and eventually actually baselining Python 3. A variable length string would be great, but it feels like a different (and more difficult) problem to me. If, however, this can be the solution to the problem I described, and it can be implemented in a finite time, then I'm all for it! :-) I hate begging for features with no chance of contributing much to the implementation (lacking the necessary expertise in numpy internals). I would be happy to draft a NEP if that will help the process. Cheers, Tom [1]: http://mail.scipy.org/pipermail/numpy-discussion/2014-January/068622.html > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.collette at gmail.com Tue Jul 15 15:11:41 2014 From: andrew.collette at gmail.com (Andrew Collette) Date: Tue, 15 Jul 2014 13:11:41 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: Hi, > good argument for ASCII, but utf-8 is a bad idea, as there is no 1:1 correspondence between length of string in bytes and length in characters -- as numpy needs to pre-allocate a defined number of bytes for a dtype, there is a disconnect between the user and numpy as to how long a string is being stored...this isn't a problem for immutable strings, and less of a problem for HDF, as you can determine how many bytes you need before you write the file (or does HDF support var-length elements?) There is an HDF5 variable-length type, which we currently read and write as Python str objects (using NumPy's object type). But HDF5 additionally has a fixed-storage-width UTF8 type, so we could map to a NumPy fixed-storage-width type trivially. When determining the HDF5 data type, unfortunately all you have to go on is the NumPy dtype... 
creating an HDF5 dataset is done separately from writing the data. > "custom"? it would be an encoding operation -- which you'd need to go from utf-8 to/from unicode anyway. So you would lose the ability to have a nice 1:1 binary representation map between numpy and HDF... good argument for ASCII, I guess. Or for HDF to use latin-1 ;-) "Custom" in this context means a user-created HDF5 data-conversion filter, which is necessary since all data conversion is handled inside the HDF5 library. We've written several for things like the NumPy bool type, etc: https://github.com/h5py/h5py/blob/master/h5py/_conv.pyx As far as generic Unicode goes, we currently don't support the NumPy "U" dtype in h5py for similar reasons; there's no destination type in HDF5 which (1) would preserve the dtype for round-trip write/read operations and (2) doesn't risk truncation. A Latin-1 based 'a' type would have similar problems. > Does HDF enforce ascii-only? what does it do with the > 127 values? Unfortunately/fortunately the charset is not enforced for either ASCII or UTF-8, although the HDF Group has been thinking about it. > that's the whole issue with UTF-8 -- it needs to be addressed somewhere, and the numpy-HDF interface seems like a smarter place to put it than the numpy-user interface! I agree fixed-storage-width UTF-8 is likely too complex to use as a native NumPy type. Ideally, NumPy would support variable-length strings, in which case all these headaches would go away. But I imagine that's also somewhat complicated. :) Andrew From chris.barker at noaa.gov Tue Jul 15 16:45:41 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 15 Jul 2014 13:45:41 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: <1405423590.8281.7.camel@sebastian-t440> References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: On Tue, Jul 15, 2014 at 4:26 AM, Sebastian Berg wrote: > Just wondering, couldn't we have a type which actually has an > (arbitrary, python supported) encoding (and "bytes" might even just be a > special case of no encoding)? well, then we're back to the core issue here: numpy dtypes need to be a pre-specified length encoded bytes are an arbitrary length. This leads us to wanting to use only fixed-number-of-bytes-per-character encodings: - ascii - latin-a - UCS-4 (or UTF-32..I get a bit confused about the names) maybe UCS-2 (NOT UTF-16) would be worth considering, for a compromise between space and fraction of unicode supported. Basically storing bytes and on access do > element[i].decode(specified_encoding) and on storing element[i] = > value.encode(specified_encoding). > this really doesn't seem that different than just using python strings -- is there a point to having a pointer-to-python-string type as a less generalized version of the currently possible python strings in object arrays? There is always the never ending small issue of trailing null bytes. If > we want to be fully compatible, such a type would have to store the > string length explicitly to support trailing null bytes. > are null bytes legal (as something other than a terminator) in some encodings? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
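To make the width bookkeeping in the discussion above concrete, here is a small Python 3 sketch (nothing beyond current numpy is assumed, and the sample string is only an illustration) of why a fixed byte width tracks character count under latin-1 but not under utf-8:

import numpy as np

s = "degr\xe9"                           # "degre" with an accent: 5 characters
print(len(s), len(s.encode("utf-8")))    # 5 6  -- utf-8 needs 2 bytes for the accented char
print(len(s), len(s.encode("latin-1")))  # 5 5  -- latin-1 stays at 1 byte per character

# An "S6" field therefore holds any 6 latin-1 characters, but only some
# 6-character strings once utf-8 encoded -- the pre-allocation/truncation
# mismatch described above. The workaround available today is to keep
# bytes in 'S' and decode on access:
arr = np.array([s.encode("latin-1")], dtype="S6")
print(arr[0].decode("latin-1"))          # round-trips as long as the text fits in 6 bytes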
URL: From tsyu80 at gmail.com Wed Jul 16 00:37:13 2014 From: tsyu80 at gmail.com (Tony Yu) Date: Tue, 15 Jul 2014 23:37:13 -0500 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` Message-ID: Is there any reason why the defaults for `allclose` and `assert_allclose` differ? This makes debugging a broken test much more difficult. More importantly, using an absolute tolerance of 0 causes failures for some common cases. For example, if two values are very close to zero, a test will fail: np.testing.assert_allclose(0, 1e-14) Git blame suggests the change was made in the following commit, but I guess that change only reverted to the original behavior. https://github.com/numpy/numpy/commit/f43223479f917e404e724e6a3df27aa701e6d6bf It seems like the defaults for `allclose` and `assert_allclose` should match, and an absolute tolerance of 0 is probably not ideal. I guess this is a pretty big behavioral change, but the current default for `assert_allclose` doesn't seem ideal. Thanks, -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Jul 16 03:06:07 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 16 Jul 2014 09:06:07 +0200 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Mon, Jul 14, 2014 at 8:22 PM, Charles R Harris wrote: > Hi All, > > Julian has raised the question of including numpy_ufunc in numpy 1.9. I > don't feel strongly one way or the other, but it doesn't seem to be > finished yet and 1.10 might be a better place to work out the remaining > problems along with the astropy folks testing possible uses. > > Thoughts? > It's already in, so do you mean not using? Would help to know what the issue is, because it's finished enough that it's already used in a released version of scipy (in sparse matrices). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jul 16 04:07:40 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 16 Jul 2014 09:07:40 +0100 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: Weirdly, I never received Chuck's original email in this thread. Should some list admin be informed? I also am not sure what/where Julian's comments were, so I second the call for context :-). Putting it off until 1.10 doesn't seem like an obviously bad idea to me, but specifics would help... (__numpy_ufunc__ is the new system for allowing arbitrary third party objects to override how ufuncs are applied to them, i.e. it means np.sin(sparsemat) and np.sin(gpuarray) can be defined to do something sensible. Conceptually it replaces the old __array_prepare__/__array_wrap__ system, which was limited to ndarray subclasses and has major limits on what you can do. Of course __array_prepare/wrap__ will also continue to be supported for compatibility.) -n On 16 Jul 2014 00:10, "Benjamin Root" wrote: > Perhaps a bit of context might be useful? How is numpy_ufunc different > from the ufuncs that we know and love? What are the known implications? > What are the known shortcomings? Are there ABI and/or API concerns between > 1.9 and 1.10? > > Ben Root > > > On Mon, Jul 14, 2014 at 2:22 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> Julian has raised the question of including numpy_ufunc in numpy 1.9. 
I >> don't feel strongly one way or the other, but it doesn't seem to be >> finished yet and 1.10 might be a better place to work out the remaining >> problems along with the astropy folks testing possible uses. >> >> Thoughts? >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Wed Jul 16 06:48:09 2014 From: toddrjen at gmail.com (Todd) Date: Wed, 16 Jul 2014 12:48:09 +0200 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Jul 16, 2014 11:43 AM, "Chris Barker" wrote: > So numpy should have dtypes to match these. We're a bit stuck, however, because 'S' mapped to the py2 string type, which no longer exists in py3. Sorry not running py3 to see what 'S' does now, but I know it's bit broken, and may be too late to change it In py3 a 'S' dtype is converted to a python bytes object. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Wed Jul 16 09:18:44 2014 From: chaoyuejoy at gmail.com (Chao YUE) Date: Wed, 16 Jul 2014 15:18:44 +0200 Subject: [Numpy-discussion] Rounding float to integer while minizing the difference between the two arrays? Message-ID: Dear all, I have two arrays with both float type, let's say X and Y. I want to round the X to integers (intX) according to some decimal threshold, at the same time I want to limit the following difference as small: diff = np.sum(X*Y) - np.sum(intX*Y) I don't have to necessarily minimize the "diff" variable (If with this demand the computation time is too long). But I would like to limit the "diff" to, let's say ten percent within np.sum(X*Y). I have tried to write some functions, but I don't know where to start the opitimization. def convert_integer(x,threshold=0): """ This fucntion converts the float number x to integer according to the threshold. """ if abs(x-0) < 1e5: return 0 else: pdec,pint = math.modf(x) if pdec > threshold: return int(math.ceil(pint)+1) else: return int(math.ceil(pint)) def convert_arr(arr,threshold=0): out = arr.copy() for i,num in enumerate(arr): out[i] = convert_integer(num,threshold=threshold) return out In [147]: convert_arr(np.array([0.14,1.14,0.12]),0.13) Out[147]: array([1, 2, 0]) Now my problem is, how can I minimize or limit the following? diff = np.sum(X*Y) - np.sum(convert_arr(X,threshold=?)*Y) Because it's the first time I encounter such kind of question, so please give me some clue to start :p Thanks a lot in advance. Best, Chao -- please visit: http://www.globalcarbonatlas.org/ *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... 
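For reference while reading the replies that follow, the defaults Tony describes can be written out as a short sketch (default tolerances as of numpy 1.8; only documented numpy and numpy.testing calls are used):

import numpy as np

print(np.allclose(0, 1e-14))                 # True: allclose defaults to rtol=1e-5, atol=1e-8

try:
    np.testing.assert_allclose(0, 1e-14)     # assert_allclose defaults to rtol=1e-7, atol=0
except AssertionError:
    print("fails: 1e-14 exceeds atol + rtol*|desired| = 0 + 1e-7*1e-14")

np.testing.assert_allclose(0, 1e-14, atol=1e-8)   # passes once atol is given explicitly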
URL: From njs at pobox.com Wed Jul 16 09:52:31 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 16 Jul 2014 14:52:31 +0100 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On 16 Jul 2014 10:26, "Tony Yu" wrote: > > Is there any reason why the defaults for `allclose` and `assert_allclose` differ? This makes debugging a broken test much more difficult. More importantly, using an absolute tolerance of 0 causes failures for some common cases. For example, if two values are very close to zero, a test will fail: > > np.testing.assert_allclose(0, 1e-14) > > Git blame suggests the change was made in the following commit, but I guess that change only reverted to the original behavior. > > https://github.com/numpy/numpy/commit/f43223479f917e404e724e6a3df27aa701e6d6bf > > It seems like the defaults for `allclose` and `assert_allclose` should match, and an absolute tolerance of 0 is probably not ideal. I guess this is a pretty big behavioral change, but the current default for `assert_allclose` doesn't seem ideal. What you say makes sense to me, and loosening the default tolerances won't break any existing tests. (And I'm not too worried about people who were counting on getting 1e-7 instead of 1e-5 or whatever... if it matters that much to you exactly what tolerance you test, you should be setting the tolerance explicitly!) I vote that unless someone comes up with some terrible objection in the next few days then you should submit a PR :-) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Wed Jul 16 10:01:45 2014 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Wed, 16 Jul 2014 10:01:45 -0400 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: On Tue, Jul 15, 2014 at 11:15 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > > On Tue, Jul 15, 2014 at 5:26 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> On Sa, 2014-07-12 at 12:17 -0500, Charles R Harris wrote: >> > As previous posts have pointed out, Numpy's `S` type is currently >> > treated as a byte string, which leads to more complicated code in >> > python3. OTOH, the unicode type is stored as UCS4, which consumes a >> > lot of space, especially for ascii strings. This note proposes to >> > adapt the currently existing 'a' type letter, currently aliased to >> > 'S', as a new fixed encoding dtype. Python 3.3 introduced two one byte >> > internal representations for unicode strings, ascii and latin1. Ascii >> > has the advantage that it is a subset of UTF-8, whereas latin1 has a >> > few more symbols. Another possibility is to just make it an UTF-8 >> > encoding, but I think this would involve more overhead as Python would >> > need to determine the maximum character size. These are just >> > preliminary thoughts, comments are welcome. >> > >> >> Just wondering, couldn't we have a type which actually has an >> (arbitrary, python supported) encoding (and "bytes" might even just be a >> special case of no encoding)? Basically storing bytes and on access do >> element[i].decode(specified_encoding) and on storing element[i] = >> value.encode(specified_encoding). >> >> There is always the never ending small issue of trailing null bytes. If >> we want to be fully compatible, such a type would have to store the >> string length explicitly to support trailing null bytes. >> > > UTF-8 encoding works with null bytes. 
That is one of the reasons it is so > popular. > > > Thinking more about it, the easiest thing to do might be to make the S > dtype a UTF-8 encoding. Most of the machinery to deal with that is already > in place. That change might affect some users though, and we might need to > do some work to make it backwards compatible with python 2. > > Chuck Are you saying that numpy S dtypes would be exported to Py3 as str? This would work in my use case, though it seems it would break things for the (few-ish) people using the numpy S type in Py3 since it would now look like a Python str instead of bytes object. One other thought is that one *might* finesse the fixed width vs. utf-8 variable length issue by using the exact same rules that currently apply to strings in Py2: - When setting an array from input like a list of strings (unicode in Py3), make the array wide enough to handle the widest (in bytes) entry. - When setting an element in an existing array, truncate any characters that don't fit in the existing width. In the second point note that the truncation would be full unicode characters, not bytes. This could be a point of confusion in some cases, but it's simple to implement and formally consistent with current behavior. - Tom p.s. Strangely enough the mail I quoted from Chuck beginning with "Thinking about it more .." never got to my email and I only happened to have seen it in the archives. > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Wed Jul 16 11:26:05 2014 From: chaoyuejoy at gmail.com (Chao YUE) Date: Wed, 16 Jul 2014 17:26:05 +0200 Subject: [Numpy-discussion] Rounding float to integer while minizing the difference between the two arrays? In-Reply-To: References: Message-ID: Sorry, there is one error in this part of code, it should be: def convert_integer(x,threshold=0): """ This fucntion converts the float number x to integer according to the threshold. """ if abs(x-0) < 1e-5: return 0 else: pdec,pint = math.modf(x) if pdec > threshold: return int(math.ceil(pint)+1) else: return int(math.ceil(pint)) On Wed, Jul 16, 2014 at 3:18 PM, Chao YUE wrote: > Dear all, > > I have two arrays with both float type, let's say X and Y. I want to round > the X to integers (intX) according to some decimal threshold, at the same > time I want to limit the following difference as small: > > diff = np.sum(X*Y) - np.sum(intX*Y) > > I don't have to necessarily minimize the "diff" variable (If with this > demand the computation time is too long). But I would like to limit the > "diff" to, let's say ten percent within np.sum(X*Y). > > I have tried to write some functions, but I don't know where to start the > opitimization. > > def convert_integer(x,threshold=0): > """ > This fucntion converts the float number x to integer according to the > threshold. > """ > if abs(x-0) < 1e5: > return 0 > else: > pdec,pint = math.modf(x) > if pdec > threshold: > return int(math.ceil(pint)+1) > else: > return int(math.ceil(pint)) > > def convert_arr(arr,threshold=0): > out = arr.copy() > for i,num in enumerate(arr): > out[i] = convert_integer(num,threshold=threshold) > return out > > In [147]: > convert_arr(np.array([0.14,1.14,0.12]),0.13) > > Out[147]: > array([1, 2, 0]) > > Now my problem is, how can I minimize or limit the following? 
> diff = np.sum(X*Y) - np.sum(convert_arr(X,threshold=?)*Y) > > Because it's the first time I encounter such kind of question, so please > give me some clue to start :p Thanks a lot in advance. > > Best, > > Chao > > -- > please visit: > http://www.globalcarbonatlas.org/ > > *********************************************************************************** > Chao YUE > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) > UMR 1572 CEA-CNRS-UVSQ > Batiment 712 - Pe 119 > 91191 GIF Sur YVETTE Cedex > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 > > ************************************************************************************ > -- please visit: http://www.globalcarbonatlas.org/ *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Wed Jul 16 13:16:13 2014 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 16 Jul 2014 20:16:13 +0300 Subject: [Numpy-discussion] __numpy_ufunc__ and 1.9 release In-Reply-To: <53C56DA2.40402@googlemail.com> References: <53C56DA2.40402@googlemail.com> Message-ID: <53C6B35D.9020609@iki.fi> Hi, 15.07.2014 21:06, Julian Taylor kirjoitti: [clip: __numpy_ufunc__] > So I'm wondering if we should delay the introduction of this > feature to 1.10 or is it important enough to wait until there is a > consensus on the remaining issues? My 10c: The feature is not so much in hurry that it alone should delay 1.9. Moreover, it's best for everyone that it is bug-free on the first go, and it gets some real-world testing before the release. Better safe than sorry. I'd pull it out from 1.9.x branch, and iron out the remaining wrinkles before 1.10. Pauli From aldcroft at head.cfa.harvard.edu Wed Jul 16 13:32:44 2014 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Wed, 16 Jul 2014 13:32:44 -0400 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 6:48 AM, Todd wrote: > On Jul 16, 2014 11:43 AM, "Chris Barker" wrote: > > So numpy should have dtypes to match these. We're a bit stuck, however, > because 'S' mapped to the py2 string type, which no longer exists in py3. > Sorry not running py3 to see what 'S' does now, but I know it's bit broken, > and may be too late to change it > > In py3 a 'S' dtype is converted to a python bytes object. > As a slightly philosophical aside, at some point during Scipy, Nick Coghlan said that the core Python team had stopped recommending the use of `from __future__ import unicode_literals` for Python 2 / 3 compatible code. I have some experience now with writing 2 / 3 code for astropy and I came to the same conclusion. The point is that `str` is the "natural" text class that is used by default for both 2 and 3. Most scientific Py2 code is written to this model. Following this to the Py3 end, that would imply that the most natural convention for numpy S dtype in Py3 would be that it gets to Python as a utf-8 `str`, as Chuck suggested. I think the variable-length encoding issue is not such a problem if you follow the existing numpy convention of truncating overflowing strings on assignment. 
Using utf-8 like this would (I think) make most Py2 code that uses HDF5 and FITS ASCII string data just work out of the box on Py3, which would be super. - Tom > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Jul 16 14:47:24 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 16 Jul 2014 20:47:24 +0200 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 6:37 AM, Tony Yu wrote: > Is there any reason why the defaults for `allclose` and `assert_allclose` > differ? This makes debugging a broken test much more difficult. More > importantly, using an absolute tolerance of 0 causes failures for some > common cases. For example, if two values are very close to zero, a test > will fail: > > np.testing.assert_allclose(0, 1e-14) > > Git blame suggests the change was made in the following commit, but I > guess that change only reverted to the original behavior. > > > https://github.com/numpy/numpy/commit/f43223479f917e404e724e6a3df27aa701e6d6bf > Indeed, was reverting a change that crept into https://github.com/numpy/numpy/commit/f527b49a > > It seems like the defaults for `allclose` and `assert_allclose` should > match, and an absolute tolerance of 0 is probably not ideal. I guess this > is a pretty big behavioral change, but the current default for > `assert_allclose` doesn't seem ideal. > I agree, current behavior quite annoying. It would make sense to change the atol default to 1e-8, but technically it's a backwards compatibility break. Would probably have a very minor impact though. Changing the default for rtol in one of the functions may be much more painful though, I don't think that should be done. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Jul 16 14:53:32 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 16 Jul 2014 20:53:32 +0200 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 10:07 AM, Nathaniel Smith wrote: > Weirdly, I never received Chuck's original email in this thread. Should > some list admin be informed? > Also weirdly, my reply didn't show up on gmane. Not sure if it got through, so re-sending: It's already in, so do you mean not using? Would help to know what the issue is, because it's finished enough that it's already used in a released version of scipy (in sparse matrices). Ralf I also am not sure what/where Julian's comments were, so I second the call > for context :-). Putting it off until 1.10 doesn't seem like an obviously > bad idea to me, but specifics would help... > > (__numpy_ufunc__ is the new system for allowing arbitrary third party > objects to override how ufuncs are applied to them, i.e. it means > np.sin(sparsemat) and np.sin(gpuarray) can be defined to do something > sensible. Conceptually it replaces the old __array_prepare__/__array_wrap__ > system, which was limited to ndarray subclasses and has major limits on > what you can do. Of course __array_prepare/wrap__ will also continue to be > supported for compatibility.) > -n > On 16 Jul 2014 00:10, "Benjamin Root" wrote: > >> Perhaps a bit of context might be useful? How is numpy_ufunc different >> from the ufuncs that we know and love? 
What are the known implications? >> What are the known shortcomings? Are there ABI and/or API concerns between >> 1.9 and 1.10? >> >> Ben Root >> >> >> On Mon, Jul 14, 2014 at 2:22 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> Hi All, >>> >>> Julian has raised the question of including numpy_ufunc in numpy 1.9. I >>> don't feel strongly one way or the other, but it doesn't seem to be >>> finished yet and 1.10 might be a better place to work out the remaining >>> problems along with the astropy folks testing possible uses. >>> >>> Thoughts? >>> >>> Chuck >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Jul 16 16:51:39 2014 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 16 Jul 2014 13:51:39 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: <-4597269384285942771@unknownmsgid> > But HDF5 > additionally has a fixed-storage-width UTF8 type, so we could map to a > NumPy fixed-storage-width type trivially. Sure -- this is why *nix uses utf-8 for filenames -- it can just be a char*. But that just punts the problem to client code. I think a UTF-8 string type does not match the numpy model well, and I don't think we should support it just because it would be easier for the HDF 5 wrappers. ( to be fair, there are probably other similar systems numpy wants to interface with that cod use this...) It seems if you want a 1:1 binary mapping between HDF and numpy for utf strings, then a bytes type in numpy makes more sense. Numpy could/should have encode and decode methods for converting byte arrays to/from Unicode arrays (does it already? ). > "Custom" in this context means a user-created HDF5 data-conversion > filter, which is necessary since all data conversion is handled inside > the HDF5 library. > As far as generic Unicode goes, we currently don't support the NumPy > "U" dtype in h5py for similar reasons; there's no destination type in > HDF5 which (1) would preserve the dtype for round-trip write/read > operations and (2) doesn't risk truncation. It sounds to like HDF5 simply doesn't support Unicode. Calling an array of bytes utf-8 simple pushes the problem on to client libs. As that's where the problem lies, then the PyHDF may be the place to address it. If we put utf-8 in numpy, we have the truncation problem there instead -- which is exactly what I think we should avoid. > A Latin-1 based 'a' type > would have similar problems. Maybe not -- latin1 is fixed width. >> Does HDF enforce ascii-only? what does it do with the > 127 values? > > Unfortunately/fortunately the charset is not enforced for either ASCII So you can dump Latin-1 into and out of the HDF 'ASCII' type -- it's essentially the old char* / py2 string. An ugly situation, but why not use it? > or UTF-8, So ASCII and utf-8 are really the same thing, with different meta-data... > although the HDF Group has been thinking about it. 
I wonder if they would consider going Latin-1 instead of ASCII -- similarly to utf-8 it's backward compatible with ASCII, but gives you a little more. I don't know that there is another 1byte encoding worth using -- it maybe be my English bias, but it seems Latin-1 gives us ASCII+some extra stuff handy for science ( I use the degree symbol a lot, for instance) with nothing lost. > Ideally, NumPy would support variable-length > strings, in which case all these headaches would go away. Would they? That would push the problem back to PyHDF -- which I'm arguing is where it belongs, but I didn't think you were ;-) > > But I > imagine that's also somewhat complicated. :) That's a whole other kettle of fish, yes. -Chris From chaoyuejoy at gmail.com Wed Jul 16 16:59:32 2014 From: chaoyuejoy at gmail.com (Chao YUE) Date: Wed, 16 Jul 2014 22:59:32 +0200 Subject: [Numpy-discussion] Rounding float to integer while minizing the difference between the two arrays? In-Reply-To: References: Message-ID: Dear all, A bit sorry, this is not difficult. scipy.optimize.minimize_scalar seems to solve my problem. Thanks anyway, for this great tool. Cheers, Chao On Wed, Jul 16, 2014 at 3:18 PM, Chao YUE wrote: > Dear all, > > I have two arrays with both float type, let's say X and Y. I want to round > the X to integers (intX) according to some decimal threshold, at the same > time I want to limit the following difference as small: > > diff = np.sum(X*Y) - np.sum(intX*Y) > > I don't have to necessarily minimize the "diff" variable (If with this > demand the computation time is too long). But I would like to limit the > "diff" to, let's say ten percent within np.sum(X*Y). > > I have tried to write some functions, but I don't know where to start the > opitimization. > > def convert_integer(x,threshold=0): > """ > This fucntion converts the float number x to integer according to the > threshold. > """ > if abs(x-0) < 1e5: > return 0 > else: > pdec,pint = math.modf(x) > if pdec > threshold: > return int(math.ceil(pint)+1) > else: > return int(math.ceil(pint)) > > def convert_arr(arr,threshold=0): > out = arr.copy() > for i,num in enumerate(arr): > out[i] = convert_integer(num,threshold=threshold) > return out > > In [147]: > convert_arr(np.array([0.14,1.14,0.12]),0.13) > > Out[147]: > array([1, 2, 0]) > > Now my problem is, how can I minimize or limit the following? > diff = np.sum(X*Y) - np.sum(convert_arr(X,threshold=?)*Y) > > Because it's the first time I encounter such kind of question, so please > give me some clue to start :p Thanks a lot in advance. > > Best, > > Chao > > -- > please visit: > http://www.globalcarbonatlas.org/ > > *********************************************************************************** > Chao YUE > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) > UMR 1572 CEA-CNRS-UVSQ > Batiment 712 - Pe 119 > 91191 GIF Sur YVETTE Cedex > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 > > ************************************************************************************ > -- please visit: http://www.globalcarbonatlas.org/ *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... 
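A rough sketch of the approach Chao describes, with a vectorized stand-in for his convert_arr helper and made-up X, Y data (both are assumptions, not code from this thread). Note that the objective is piecewise constant in the threshold, so a bounded scalar search is only a heuristic; scanning the distinct fractional parts of X would be exhaustive:

import numpy as np
from scipy.optimize import minimize_scalar

def convert_arr(arr, threshold=0.0):
    # round up when the fractional part exceeds the threshold
    # (nonnegative input assumed; the near-zero special case is dropped)
    pdec, pint = np.modf(arr)
    return np.where(pdec > threshold, pint + 1, pint).astype(int)

def objective(threshold, X, Y):
    return abs(np.sum(X * Y) - np.sum(convert_arr(X, threshold) * Y))

rng = np.random.RandomState(0)
X, Y = rng.rand(100) * 5, rng.rand(100)

res = minimize_scalar(objective, bounds=(0.0, 1.0), args=(X, Y), method="bounded")
print(res.x, objective(res.x, X, Y))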
URL: From jtaylor.debian at googlemail.com Wed Jul 16 18:20:40 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 17 Jul 2014 00:20:40 +0200 Subject: [Numpy-discussion] parallel distutils extensions build? use gcc -flto Message-ID: <53C6FAB8.40107@googlemail.com> hi, I have been playing around a bit with gccs link time optimization feature and found that using it actually speeds up a from scratch build of numpy due to its ability to perform parallel optimization and linking. As a bonus you also should get faster binaries due to the better optimizations lto allows. As compiling with lto does require some possibly lesser know details I wanted to share it. Prerequesits are a working gcc toolchain of at least gcc-4.8 and binutils > 2.21, gcc 4.9 is better as its faster. First of all numpy checks the long double representation by compiling a file and looking at the binary, this won't work as the od -b reimplementation here does not understand lto objects, so on x86 we must short circuit that: --- a/numpy/core/setup_common.py +++ b/numpy/core/setup_common.py @@ -174,6 +174,7 @@ def check_long_double_representation(cmd): # We need to use _compile because we need the object filename src, object = cmd._compile(body, None, None, 'c') try: + return 'IEEE_DOUBLE_LE' type = long_double_representation(pyod(object)) return type finally: Next we build numpy as usual but override the compiler, linker and ar to add our custom flags. The setup.py call would look like this: CC='gcc -fno-fat-lto-objects -flto=4 -fuse-linker-plugin -O3' \ LDSHARED='gcc -fno-fat-lto-objects -flto=4 -fuse-linker-plugin -shared -O3' AR=gcc-ar \ python setup.py build_ext Some explanation: The ar override is needed as numpy builds a static library and ar needs to know about lto objects. gcc-ar does exactly that. -flto=4 the main flag tell gcc to perform link time optimizations using 4 parallel processes. -fno-fat-lto-objects tells gcc to only build lto objects, normally it builds both an lto object and a normal object for toolchain compatibilty. If our toolchain can handle lto objects this is just a waste of time and we skip it. (The flag is default in gcc-4.9 but not 4.8) -fuse-linker-plugin directs it to run its link time optimizer plugin in the linking step, the linker must support plugins, both bfd (> 2.21) and gold linker do so. This allows for more optimizations. -O3 has to be added to the linker too as thats where the optimization occurs. In general a problem with lto is that the compiler options of all steps much match the flags used for linking. If you are using c++ or gfortran you also have to override that to use lto (CXX and FF(?)) See https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html for a lot more details. For some numbers on my machine a from scratch numpy build with no caching takes 1min55s, with lto on 4 it only takes 55s. Pretty neat for a much more involved optimization process. Concerning the speed gain we get by this, I ran our benchmark suite with this build, there were no really significant gains which is somewhat expected as numpy is simple C code with most function bottlenecks already inlined. So conclusion: flto seems to work well with recent gccs and allows for faster builds using the limited distutils. While probably not useful for development where compiler caching (ccache) is of utmost importance it is still interesting for projects doing one shot uncached builds (travis like CI) and have huge objects (e.g. 
swig or cython) and don't want to change to proper parallel build systems like bento. PS: So far I know clang also supports lto but I never used it PPS: using NPY_SEPARATE_COMPILATION=0 crashes gcc-4.9, time for a bug report. Cheers, Julian From fperez.net at gmail.com Wed Jul 16 23:08:58 2014 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 16 Jul 2014 20:08:58 -0700 Subject: [Numpy-discussion] Numpy BoF at SciPy 2014 - quick report Message-ID: Hi all, sorry for not posting earlier, post-conference InboxInfinity blues and all that... The BoF did go as planned, and it was a good discussion, mostly following the tentative agenda outlined here: https://github.com/numpy/numpy/wiki/Numpy-BoF-at-Scipy-2014 Various folks were kind enough to take notes during the conversation on an Etherpad instance: https://scipy2014.etherpad.mozilla.org/35 For the sake of completeness and future reference, below I'm including a copy of the notes in this email. Other than what's in the notes, my take home from the discussion is mostly that: - we probably needed a longer slot than 45 minutes to have a chance to dig in a little deeper. - it would have been more productive if a focused numpy sprint had been also planned, so that there could be more structured follow-up on the ideas that came up. It would be great to hear from others who were present at the conference. In particular, Chris Barker brought up a number of things regarding datetime and planned on following up during the sprints, but I'm not sure what ended up happening. Thanks to everyone who participated! Cheers f #### Copy of Etherpad notes as of 7/16/2014: Notes from BoF: 1:30, July 19, 2014 Working with topics on this page: https://github.com/numpy/numpy/wiki/Numpy-BoF-at-Scipy-2014 chuck: where do we go from here? -- what is the role of numpy now? Generalized ufuncs -- still some more to do -- (LA stuff - norms) - some ufuncs don't impliment array interface -- which are those -- sprint topic? - zeros_like, ones_like, more... (duplicate) github issue: https://github.com/numpy/numpy/issues/4862 Here's the original issue: https://github.com/numpy/numpy/issues/3602 Implementation of @ (matrix multiplication) - will be in 3.5 ~ 18months - no work started yet -- have to make sure we do it. - @@ was not added. - The PEP for numpy is well-defined. Not much thinking to be done. (Good for a sprint) Datetime: - Can it be done? -- too many calendars -- to many time scales, etc. - Can we cover most applications? - DynND -- higher abstraction -- convert to back end implimentation - Also look at what R and Julia do? - Maybe fix up the little issues in datetime64, first? - Pandas does not use numpy machinery - uses a array of objects: those objects are subclassed form datetime.datetime - does use int64, but gets unboxed on storage. - Root cause is using UTC, rather than a naive time. - Naive is not associated with a time zone. Can be interpreted in any way. - Ripping out the locale timezone on I/O would help. - More often than not, using the locale timezone is not desired. - For example, many experimental data do not attach time zones. (Or wrong timezone) - Consider laboratory time (stopwatch rather than a clock). (timedelta) - The C++ committee is standardizing this. - A key feature which is missing, is being able to choose your epoch. New DTypes - Example: quad float types. A solution for missing values? Adding units support. - Record & structured arrays play around with dtypes. Needs to be easier to use these. - Improve documentation. 
- How to extend to support things like labeled arrays? - This is orthogonal to dtypes. - Would rather access time column instead of 3rd column. - Would provide a better foundation for pandas. - Key is to keep inputs simple. - Finish the DataArray push? - We are very closely there. It has been sitting there for a while. - If interested, talk at sprints on July 10. Missing values? - maybe improve masked array. - give up for now. Inheriting ndarray - introduces many bugs. - should discourage this, but make it easier to work with it. Dynd - The issues discussed so far were motivation for starting dynd - for example, a pluggable type system - adding a categorical type in numpy (at Continuum) broke lots. Easier in dynd. - Commitment for dynd is to give it a numpy-like API - Both need to evolve together. - Find ways to make things more uniform (in numpy) - Dynd is more an experimental phase, changing quickly. - Can we import dynd as np? - Not a goal. More exploratory in this phase. - Adding a layer like that at a later time would be good. Not there, yet. - Do not want to repeat py2->py3 debacle. - Buffer protocol: - Supported, but dynd extends it. - As a pure C++ library, goal is to freeze once stable so systems beyond Python can depend on it as a stable interface for working with array data. Boost::Python - Nothing official from numpy for using numpy arrays in C++ - Not prioritized. - Numpy has gotten better about namespace pollution? - It kind of works already. Talk to Mike Droettboom -- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail -------------- next part -------------- An HTML attachment was scrubbed... URL: From joseph.martinot-lagarde at m4x.org Wed Jul 16 14:57:26 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Wed, 16 Jul 2014 20:57:26 +0200 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: <53C6CB16.2060503@m4x.org> Le 15/07/2014 18:18, Chris Barker a ?crit : > (or does HDF support var-length > elements?) > It does: http://www.hdfgroup.org/HDF5/doc/TechNotes/VLTypes.html From sebastian at sipsolutions.net Tue Jul 15 07:16:34 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 15 Jul 2014 13:16:34 +0200 Subject: [Numpy-discussion] Bug in np.cross for 2D vectors In-Reply-To: <1405416176.45058.YahooMailNeo@web133104.mail.ir2.yahoo.com> References: <1405416176.45058.YahooMailNeo@web133104.mail.ir2.yahoo.com> Message-ID: <1405422994.8281.1.camel@sebastian-t440> On Di, 2014-07-15 at 10:22 +0100, Neil Hodgson wrote: > Hi, > > We came across this bug while using np.cross on 3D arrays of 2D > vectors. Hi, which numpy version are you using? Until recently, the cross product simply did *not* work in a broadcasting manner (3d arrays of 2d vectors), it did something, but usually not the right thing. This is fixed in recent versions (not sure if 1.8 or only now with 1.9) - Sebastian > The first example shows the problem and we looked at the source for > np.cross and believe we found the bug - an unnecessary swapaxes when > returning the output (comment inserted in the code). 
> > Thanks > Neil > > # Example > > shape = (3,5,7,2) > > > # These are effectively 3D arrays (3*5*7) of 2D vectors > data1 = np.random.randn(*shape) > data2 = np.random.randn(*shape) > > > # The cross product of data1 and data2 should produce a (3*5*7) array > of scalars > cross_product_longhand = > data1[:,:,:,0]*data2[:,:,:,1]-data1[:,:,:,1]*data2[:,:,:,0] > print 'longhand output shape:',cross_product_longhand.shape # and it > does > > > cross_product_numpy = np.cross(data1,data2) > print 'numpy output shape:',cross_product_numpy.shape # It seems to > have transposed the last 2 dimensions > > > if (cross_product_longhand == np.transpose(cross_product_numpy, > (0,2,1))).all(): > print 'Unexpected transposition in numpy.cross (numpy version %s)'% > np.__version__ > > > # np.cross L1464 > if axis is not None: > axisa, axisb, axisc=(axis,)*3 > a = asarray(a).swapaxes(axisa, 0) > b = asarray(b).swapaxes(axisb, 0) > msg = "incompatible dimensions for cross product\n"\ > "(dimension must be 2 or 3)" > if (a.shape[0] not in [2, 3]) or (b.shape[0] not in [2, 3]): > raise ValueError(msg) > if a.shape[0] == 2: > if (b.shape[0] == 2): > cp = a[0]*b[1] - a[1]*b[0] > if cp.ndim == 0: > return cp > else: > ## WE SHOULD NOT SWAPAXES HERE! > ## For 2D vectors the first axis has been > > ## collapsed during the cross product > return cp.swapaxes(0, axisc) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Wed Jul 16 05:14:10 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 16 Jul 2014 11:14:10 +0200 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: <1405502050.6657.0.camel@sebastian-t440> On Mi, 2014-07-16 at 09:07 +0100, Nathaniel Smith wrote: > Weirdly, I never received Chuck's original email in this thread. > Should some list admin be informed? > I send some mails yesterday and they never arrived... Not sure if it is a problem on my side or not. > I also am not sure what/where Julian's comments were, so I second the > call for context :-). Putting it off until 1.10 doesn't seem like an > obviously bad idea to me, but specifics would help... > > (__numpy_ufunc__ is the new system for allowing arbitrary third party > objects to override how ufuncs are applied to them, i.e. it means > np.sin(sparsemat) and np.sin(gpuarray) can be defined to do something > sensible. Conceptually it replaces the old > __array_prepare__/__array_wrap__ system, which was limited to ndarray > subclasses and has major limits on what you can do. Of course > __array_prepare/wrap__ will also continue to be supported for > compatibility.) > > -n > > On 16 Jul 2014 00:10, "Benjamin Root" wrote: > Perhaps a bit of context might be useful? How is numpy_ufunc > different from the ufuncs that we know and love? What are the > known implications? What are the known shortcomings? Are there > ABI and/or API concerns between 1.9 and 1.10? > > > Ben Root > > > > On Mon, Jul 14, 2014 at 2:22 PM, Charles R Harris > wrote: > Hi All, > > Julian has raised the question of including > numpy_ufunc in numpy 1.9. I don't feel strongly one > way or the other, but it doesn't seem to be finished > yet and 1.10 might be a better place to work out the > remaining problems along with the astropy folks > testing possible uses. > > > Thoughts? 
> > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Thu Jul 17 07:04:04 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 17 Jul 2014 12:04:04 +0100 Subject: [Numpy-discussion] Mailing list slowdown (was Re: __numpy_ufunc__) Message-ID: On 17 Jul 2014 11:51, "Sebastian Berg" wrote: > > On Mi, 2014-07-16 at 09:07 +0100, Nathaniel Smith wrote: > > Weirdly, I never received Chuck's original email in this thread. > > Should some list admin be informed? > > > > I send some mails yesterday and they never arrived... Not sure if it is > a problem on my side or not. I did eventually get Chuck's original message, but not until several days later. CC'ing postmaster at enthought.com in case they have some insight into what's going on! -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From hodgson.neil at yahoo.co.uk Wed Jul 16 16:25:44 2014 From: hodgson.neil at yahoo.co.uk (Neil Hodgson) Date: Wed, 16 Jul 2014 21:25:44 +0100 Subject: [Numpy-discussion] Bug in np.cross for 2D vectors Message-ID: <1405542344.31622.YahooMailNeo@web133105.mail.ir2.yahoo.com> > Hi, > > We came across this bug while using np.cross on 3D arrays of 2D vectors. > > What version of numpy are you using? This should already be solved in numpy > master, and be part of the 1.9 release. Here's the relevant commit, > although the code has been cleaned up a bit in later ones: > https://github.com/numpy/numpy/commit/b9454f50f23516234c325490913224c3a69fb122 > Jaime Yes, we are using 1.8 - sorry I should have checked! Thanks Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From hodgson.neil at yahoo.co.uk Thu Jul 17 08:06:45 2014 From: hodgson.neil at yahoo.co.uk (Neil Hodgson) Date: Thu, 17 Jul 2014 13:06:45 +0100 Subject: [Numpy-discussion] Bug in np.cross for 2D vectors In-Reply-To: <1405542344.31622.YahooMailNeo@web133105.mail.ir2.yahoo.com> References: <1405542344.31622.YahooMailNeo@web133105.mail.ir2.yahoo.com> Message-ID: <1405598805.57307.YahooMailNeo@web133104.mail.ir2.yahoo.com> > Hi, > > We came across this bug while using np.cross on 3D arrays of 2D vectors. > > What version of numpy are you using? This should already be solved in numpy > master, and be part of the 1.9 release. Here's the relevant commit, > although the code has been cleaned up a bit in later ones: > https://github.com/numpy/numpy/commit/b9454f50f23516234c325490913224c3a69fb122 > Jaime >Hi, > >which numpy version are you using? Until recently, the cross product >simply did *not* work in a broadcasting manner (3d arrays of 2d >vectors), it did something, but usually not the right thing. This is >fixed in recent versions (not sure if 1.8 or only now with 1.9) >- Sebastian Hi, I thought I replied, but I don't see it on the list, so here goes again... Yes, we are using 1.8, will confirm it's ok with 1.9 Thanks Neil -------------- next part -------------- An HTML attachment was scrubbed... 
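A condensed version of Neil's shape check, for anyone wanting to reproduce it (results as reported in this thread: consistent shapes on numpy >= 1.9, transposed last axes on 1.8):

import numpy as np

shape = (3, 5, 7, 2)                   # stacks of 2D vectors
a = np.random.randn(*shape)
b = np.random.randn(*shape)

longhand = a[..., 0] * b[..., 1] - a[..., 1] * b[..., 0]
cross = np.cross(a, b)                 # 2D inputs give the scalar z-component per vector pair

print(longhand.shape, cross.shape)     # both (3, 5, 7) with the broadcasting fix
print(np.allclose(longhand, cross))    # True on numpy >= 1.9 (on 1.8 the shapes already disagree)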
URL: From njs at pobox.com Thu Jul 17 11:37:24 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 17 Jul 2014 16:37:24 +0100 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 7:47 PM, Ralf Gommers wrote: > > On Wed, Jul 16, 2014 at 6:37 AM, Tony Yu wrote: >> It seems like the defaults for `allclose` and `assert_allclose` should >> match, and an absolute tolerance of 0 is probably not ideal. I guess this is >> a pretty big behavioral change, but the current default for >> `assert_allclose` doesn't seem ideal. > > I agree, current behavior quite annoying. It would make sense to change the > atol default to 1e-8, but technically it's a backwards compatibility break. > Would probably have a very minor impact though. Changing the default for > rtol in one of the functions may be much more painful though, I don't think > that should be done. Currently we have: allclose: rtol=1e-5, atol=1e-8 assert_allclose: rtol=1e-7, atol=0 Why would it be painful to change assert_allclose to match allclose? It would weaken some tests, but no code would break. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Thu Jul 17 11:48:19 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 17 Jul 2014 16:48:19 +0100 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: On Tue, Jul 15, 2014 at 4:29 PM, Charles R Harris wrote: > Thinking more about it, the easiest thing to do might be to make the S dtype > a UTF-8 encoding. Most of the machinery to deal with that is already in > place. That change might affect some users though, and we might need to do > some work to make it backwards compatible with python 2. I'd be very concerned about backcompat for existing code that uses e.g. "S128" as a dtype to mean "128 arbitrary bytes". An example is this file format reading code: https://github.com/rerpy/rerpy/blob/master/rerpy/io/erpss.py#L123 The file format says there are 128 bytes there, and their interpretation depends on other fields in the header -- but in one case, for "large montages", there's an encoding where every 3 bytes represents 4 characters using an ad hoc 6-bit character set: https://github.com/rerpy/rerpy/blob/master/rerpy/io/erpss.py#L133 Perhaps this case could be handled better by using a u8 subarray or something (that code also goes to some efforts to work around nul padding), and that particular project hasn't been ported to py3 yet so technically wouldn't be affected if we changed the meaning of "S" on py3. But it does seem useful to have a "fixed length bytes" dtype even in py3, and if we declare that be "S" then it avoids breaking any existing code depending on it... -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Thu Jul 17 11:52:59 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 17 Jul 2014 16:52:59 +0100 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Tue, Jul 15, 2014 at 7:40 PM, Aldcroft, Thomas wrote: > > On Sat, Jul 12, 2014 at 8:02 PM, Nathaniel Smith wrote: >> >> OTOH, fixed length nul padded latin1 would be useful for various flat file >> reading tasks. 
> > As one of the original agitators for this, let me re-iterate that what the > astronomical community *really* wants is the original proposal as described > by Chris Barker [1] and essentially what Charles said. We have large data > archives that have ASCII string data in binary formats like FITS and HDF5. > The current readers for those datasets present users with numpy S data > types, which in Python 3 cannot be compared to str (unicode) literals. In > many cases those datasets are large, and in my case I regularly deal with > multi-Gb sized bytestring arrays. Converting those to a U dtype is not > practical. This is feedback is *super* useful, thanks. Can you elaborate a bit more on your requirements? I get that: - You have data that is treated as text, so it is convenient to be able to use Python strings for things like equality tests, np.sum(arr == "green") etc. - Your data uses only ASCII characters, and you don't want to spend more than 1 byte of memory per character. Do you ever have 8 bit characters, and if so, what encoding do you use? Does it matter to you that the memory layout for these 1-byte-per-char strings remain fixed-width nul-padded concatenated strings (e.g., because you are mmap'ing files that have this format)? Or do FITS/HDF5 handle layout details internally and you don't care so long as the above requirements are met? Does the fixed-width nature of numpy strings cause problems in the above setting? -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Thu Jul 17 12:11:11 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 17 Jul 2014 17:11:11 +0100 Subject: [Numpy-discussion] [SciPy-Dev] __numpy_ufunc__ and 1.9 release In-Reply-To: <53C56DA2.40402@googlemail.com> References: <53C56DA2.40402@googlemail.com> Message-ID: On Tue, Jul 15, 2014 at 7:06 PM, Julian Taylor wrote: > hi, > as you may know we want to release numpy 1.9 soon. We should have solved > most indexing regressions the first beta showed. > > The remaining blockers are finishing the new __numpy_ufunc__ feature. > This feature should allow for alternative method to overriding the > behavior of ufuncs from subclasses. > It is described here: > https://github.com/numpy/numpy/blob/master/doc/neps/ufunc-overrides.rst > > The current blocker issues are: > https://github.com/numpy/numpy/issues/4753 > https://github.com/numpy/numpy/pull/4815 > > I'm not to familiar with all the complications of subclassing so I can't > really say how hard this is to solve. > My issue is that it there still seems to be debate on how to handle > operator overriding correctly and I am opposed to releasing a numpy with > yet another experimental feature that may or may not be finished > sometime later. Having datetime in infinite experimental state is bad > enough. > I think nobody is served well if we release 1.9 with the feature > prematurely based on a not representative set of users and the later > after more users showed up see we have to change its behavior. > > So I'm wondering if we should delay the introduction of this feature to > 1.10 or is it important enough to wait until there is a consensus on the > remaining issues? -1 on delaying the release (but you knew I'd say that) I don't have a strong feeling about whether or not we should disable __numpy_ufunc__ for the 1.9 release based on those bugs. They don't seem obviously catastrophic to me, but you make a good point about datetime. I think it's your call as release manager... 
-n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Thu Jul 17 16:07:03 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 17 Jul 2014 16:07:03 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 9:52 AM, Nathaniel Smith wrote: > On 16 Jul 2014 10:26, "Tony Yu" wrote: > > > > Is there any reason why the defaults for `allclose` and > `assert_allclose` differ? This makes debugging a broken test much more > difficult. More importantly, using an absolute tolerance of 0 causes > failures for some common cases. For example, if two values are very close > to zero, a test will fail: > > > > np.testing.assert_allclose(0, 1e-14) > > > > Git blame suggests the change was made in the following commit, but I > guess that change only reverted to the original behavior. > > > > > https://github.com/numpy/numpy/commit/f43223479f917e404e724e6a3df27aa701e6d6bf > > > > It seems like the defaults for `allclose` and `assert_allclose` should > match, and an absolute tolerance of 0 is probably not ideal. I guess this > is a pretty big behavioral change, but the current default for > `assert_allclose` doesn't seem ideal. > > What you say makes sense to me, and loosening the default tolerances won't > break any existing tests. (And I'm not too worried about people who were > counting on getting 1e-7 instead of 1e-5 or whatever... if it matters that > much to you exactly what tolerance you test, you should be setting the > tolerance explicitly!) I vote that unless someone comes up with some > terrible objection in the next few days then you should submit a PR :-) > If you mean by this to add atol=1e-8 as default, then I'm against it. At least it will change the meaning of many of our tests in statsmodels. I'm using rtol to check for correct 1e-15 or 1e-30, which would be completely swamped if you change the default atol=0. Adding atol=0 to all assert_allclose that currently use only rtol is a lot of work. I think I almost never use a default rtol, but I often leave atol at the default = 0. If we have zeros, then I don't think it's too much work to decide whether this should be atol=1e-20, or 1e-8. Josef > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jul 17 16:21:33 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 17 Jul 2014 16:21:33 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Thu, Jul 17, 2014 at 4:07 PM, wrote: > > > > On Wed, Jul 16, 2014 at 9:52 AM, Nathaniel Smith wrote: > >> On 16 Jul 2014 10:26, "Tony Yu" wrote: >> > >> > Is there any reason why the defaults for `allclose` and >> `assert_allclose` differ? This makes debugging a broken test much more >> difficult. More importantly, using an absolute tolerance of 0 causes >> failures for some common cases. For example, if two values are very close >> to zero, a test will fail: >> > >> > np.testing.assert_allclose(0, 1e-14) >> > >> > Git blame suggests the change was made in the following commit, but I >> guess that change only reverted to the original behavior. 
>> > >> > >> https://github.com/numpy/numpy/commit/f43223479f917e404e724e6a3df27aa701e6d6bf >> > >> > It seems like the defaults for `allclose` and `assert_allclose` should >> match, and an absolute tolerance of 0 is probably not ideal. I guess this >> is a pretty big behavioral change, but the current default for >> `assert_allclose` doesn't seem ideal. >> >> What you say makes sense to me, and loosening the default tolerances >> won't break any existing tests. (And I'm not too worried about people who >> were counting on getting 1e-7 instead of 1e-5 or whatever... if it matters >> that much to you exactly what tolerance you test, you should be setting the >> tolerance explicitly!) I vote that unless someone comes up with some >> terrible objection in the next few days then you should submit a PR :-) >> > > If you mean by this to add atol=1e-8 as default, then I'm against it. > > At least it will change the meaning of many of our tests in statsmodels. > > I'm using rtol to check for correct 1e-15 or 1e-30, which would be > completely swamped if you change the default atol=0. > Adding atol=0 to all assert_allclose that currently use only rtol is a lot > of work. > I think I almost never use a default rtol, but I often leave atol at the > default = 0. > > If we have zeros, then I don't think it's too much work to decide whether > this should be atol=1e-20, or 1e-8. > Just to explain, p-values, sf of the distributions are usually accurate at 1e-30 or 1e-50 or something like that. And when we test the tails of the distributions we use that the relative error is small and the absolute error is "tiny". We would need to do a grep to see how many cases there actually are in scipy and statsmodels, before we change it because for some use cases we only get atol 1e-5 or 1e-7 (e.g. nonlinear optimization). Linear algebra is usually atol or rtol 1e-11 to 1e-14 in my cases, AFAIR. Josef > > Josef > > > >> -n >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jul 17 17:01:36 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 17 Jul 2014 17:01:36 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Thu, Jul 17, 2014 at 4:21 PM, wrote: > > > > On Thu, Jul 17, 2014 at 4:07 PM, wrote: > >> >> >> >> On Wed, Jul 16, 2014 at 9:52 AM, Nathaniel Smith wrote: >> >>> On 16 Jul 2014 10:26, "Tony Yu" wrote: >>> > >>> > Is there any reason why the defaults for `allclose` and >>> `assert_allclose` differ? This makes debugging a broken test much more >>> difficult. More importantly, using an absolute tolerance of 0 causes >>> failures for some common cases. For example, if two values are very close >>> to zero, a test will fail: >>> >> And one more comment: I debug "broken tests" pretty often. My favorites in pdb are np.max(np.abs(x - y)) and np.max(np.abs(x / y - 1)) to see how much I would have to adjust atol and rtol in assert_allclose in the tests to make them pass, and to decide whether this is an acceptable numerical difference or a bug. allclose doesn't tell me anything and I almost never use it. Josef > > >>> > np.testing.assert_allclose(0, 1e-14) >>> > >>> > Git blame suggests the change was made in the following commit, but I >>> guess that change only reverted to the original behavior. 
>>> > >>> > >>> https://github.com/numpy/numpy/commit/f43223479f917e404e724e6a3df27aa701e6d6bf >>> > >>> > It seems like the defaults for `allclose` and `assert_allclose` >>> should match, and an absolute tolerance of 0 is probably not ideal. I guess >>> this is a pretty big behavioral change, but the current default for >>> `assert_allclose` doesn't seem ideal. >>> >>> What you say makes sense to me, and loosening the default tolerances >>> won't break any existing tests. (And I'm not too worried about people who >>> were counting on getting 1e-7 instead of 1e-5 or whatever... if it matters >>> that much to you exactly what tolerance you test, you should be setting the >>> tolerance explicitly!) I vote that unless someone comes up with some >>> terrible objection in the next few days then you should submit a PR :-) >>> >> >> If you mean by this to add atol=1e-8 as default, then I'm against it. >> >> At least it will change the meaning of many of our tests in statsmodels. >> >> I'm using rtol to check for correct 1e-15 or 1e-30, which would be >> completely swamped if you change the default atol=0. >> Adding atol=0 to all assert_allclose that currently use only rtol is a >> lot of work. >> I think I almost never use a default rtol, but I often leave atol at the >> default = 0. >> >> If we have zeros, then I don't think it's too much work to decide whether >> this should be atol=1e-20, or 1e-8. >> > > Just to explain, p-values, sf of the distributions are usually accurate at > 1e-30 or 1e-50 or something like that. And when we test the tails of the > distributions we use that the relative error is small and the absolute > error is "tiny". > > We would need to do a grep to see how many cases there actually are in > scipy and statsmodels, before we change it because for some use cases we > only get atol 1e-5 or 1e-7 (e.g. nonlinear optimization). > Linear algebra is usually atol or rtol 1e-11 to 1e-14 in my cases, AFAIR. > > Josef > > >> >> Josef >> >> >> >>> -n >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jul 17 17:05:26 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 17 Jul 2014 14:05:26 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 3:48 AM, Todd wrote: > On Jul 16, 2014 11:43 AM, "Chris Barker" wrote: > > So numpy should have dtypes to match these. We're a bit stuck, however, > because 'S' mapped to the py2 string type, which no longer exists in py3. > Sorry not running py3 to see what 'S' does now, but I know it's bit broken, > and may be too late to change it > > In py3 a 'S' dtype is converted to a python bytes object. > right -- thanks. That's the source of the problems. A bit of a higher-level view of the issues at hand. Python has three relevant data types: A unicode type (unicode in py2, str in py3) A one-byte-per-char stringtype (py2 string) A bytes type The big problem is that py2 only has the unicode and py2string types, and py3 only has the unicode and bytes type. numpy has 'S' and 'U' types: which map naturally to the py2string and unicode types. but since py3 has no py2string type, we have a problem. If numpy were to embrace the py3 model, then 'S' should have mapped to py3's string, aka unicode. 
But: 1) then there would be no bytes type, which is a problem, as people do need to pass collections of bytes around. I've always figured numpy's uint8 should suffice for that, but "strings of bytes" are useful, and it seems to be awkward, or maybe impossible to construct such a beast with the usual dtype machinery 2) there is a need (or at least a desire), to have a compact, one-byte-per-character text type in numpy. Thinking of it in this framework leads me to the conclusion that numpy should have three types: 1) A unicode type --no change here 2) A bytes type -- almost the current 'S' type - A bytes type would map to/from py3 bytes objects (and py2 bytes objects, which are the same as py2strings) - one way it would differ from a py2str is that there would be no assumption of null-termination (not sure where that is now) 3) A one-byte-per-char text type -- more or less Chuck's current proposal. - it would map to/from the py3 string -- it is text after all - it would be null-terminated - it would have a one-byte per-char encoding: ascii, latin-1 or settable (TBA) It would be nice if numpy had built-in encoding/decoding to/from the unicode type to/from the bytes type (tricky due to not knowing how many bytes a given string will encode to without encoding it). Which leaves us with the decisions: * what does 'S' map to? - currently it's almost a bytes type, and maps to bytes in py3 -- so maybe keep that status quo. Except that it really doesn't act like text anymore, so 2 to 3 transition is kind of ugly, and the name is misleading. * what encoding to use for the one-byte-per-char text type? - I think latin-1 is the way to go -- you could use it like ascii if you want, but if you need a few other characters they are there. And you can even store binary data in it, though that's a "bad idea" anyway. - ascii would solve common use cases, but I see no reason to restrict folks to 127 characters -- you can use those if you like. If the binary data needs to get passed to something that really needs to be ascii-only, it could be checked at that point. - perhaps the best option is for client code to be able to choose an encoding -- but more code, maybe a more confusing interface? worth it? * Do we have a utf-8 type?: I think not -- it simply does not map to both unicode and numpy's fixed-length requirement. If all this gets done, we have some transition issues, but I think it would solve everyone's problems (though maybe not as cleanly as we'd like...). For instance, if someone needs to map numpy arrays to utf-8 data (i.e. HDF5), then they can either use the bytes type and let the user decode, or encode/decode to unicode on i/o. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From charles at crunch.io Thu Jul 17 18:10:14 2014 From: charles at crunch.io (Charles G. Waldman) Date: Thu, 17 Jul 2014 15:10:14 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: -1 on the 'arr' name. I think if we're going to support this function at all (which I'm not convinced is a good idea), it should be np.fromsomething like the other from* functions. Maybe frommatlab? I think that 'arr' is just too generic and too close to 'array'.
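For reference, a rough sketch of the spellings being compared -- np.mat is the existing function, while np.arr below is only the proposed name, not something that exists today:

import numpy as np

np.mat('1 2; 3 4')            # matrix([[1, 2], [3, 4]]) -- existing MATLAB-style string shorthand
np.array([[1, 2], [3, 4]])    # the spelled-out ndarray equivalent
# np.arr('1 2; 3 4')          # the proposed ndarray-returning shorthand (name still under debate)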
On Tue, Jul 15, 2014 at 3:55 AM, Nathaniel Smith wrote: > On Sun, Jul 13, 2014 at 6:31 PM, Alexander Belopolsky > wrote: > >> Also, the use of strings will confuse most syntax highlighters. Compare >> the two options in this screenshot: >> >> [image: Inline image 2] >> > > I guess this is a minor issue for "real" code, but even IPython doesn't > (yet?) provide syntax highlighting for lines as they're typed, and this is > a tool intended mainly for interactive use. > > That screenshot also I think illustrates why people have such a preference > for the first syntax. The second line looks nice, but try typing it quickly > and getting all the commas located correctly inside versus outside of each > of the triply-nested brackets... > > No-one's come up with any names for this that are nearly as good as "arr". > Is it really that bad to have to type one extra character, np.array instead > of np.arr? > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2014-07-13 at 1.29.20 PM.png Type: image/png Size: 26129 bytes Desc: not available URL: From tsyu80 at gmail.com Fri Jul 18 00:33:49 2014 From: tsyu80 at gmail.com (Tony Yu) Date: Thu, 17 Jul 2014 23:33:49 -0500 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 1:47 PM, Ralf Gommers wrote: > > > > On Wed, Jul 16, 2014 at 6:37 AM, Tony Yu wrote: > >> Is there any reason why the defaults for `allclose` and `assert_allclose` >> differ? This makes debugging a broken test much more difficult. More >> importantly, using an absolute tolerance of 0 causes failures for some >> common cases. For example, if two values are very close to zero, a test >> will fail: >> >> np.testing.assert_allclose(0, 1e-14) >> >> Git blame suggests the change was made in the following commit, but I >> guess that change only reverted to the original behavior. >> >> >> https://github.com/numpy/numpy/commit/f43223479f917e404e724e6a3df27aa701e6d6bf >> > > Indeed, was reverting a change that crept into > https://github.com/numpy/numpy/commit/f527b49a > > >> >> It seems like the defaults for `allclose` and `assert_allclose` should >> match, and an absolute tolerance of 0 is probably not ideal. I guess this >> is a pretty big behavioral change, but the current default for >> `assert_allclose` doesn't seem ideal. >> > > I agree, current behavior quite annoying. It would make sense to change > the atol default to 1e-8, but technically it's a backwards compatibility > break. Would probably have a very minor impact though. Changing the default > for rtol in one of the functions may be much more painful though, I don't > think that should be done. > > Ralf > Thanks for the feedback. I've opened up a PR here: https://github.com/numpy/numpy/pull/4880 Best, -Tony -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rhl at astro.princeton.edu Thu Jul 17 09:48:16 2014 From: rhl at astro.princeton.edu (Robert Lupton the Good) Date: Thu, 17 Jul 2014 09:48:16 -0400 Subject: [Numpy-discussion] Numpy BoF at SciPy 2014 - quick report In-Reply-To: References: Message-ID: <3A0037EB-BF51-4943-8E27-EDDFC8F09456@astro.princeton.edu> Having just re-read the PEP I'm concerned that this proposal leaves at least one major (?) trap for naive users, namely x = np.array([1, 10]) print x.T @ x which will print 101, not [[1, 10], [10, 100]] Yes, I know why this is happening but it's still a problem -- the user said, "I'm thinking matrices" when they wrote @ but the x.T had done the "wrong" thing before the @ kicked in. And yes, a savvy user would have written x = np.array([[1, 10]]) (but then np.dot(x, x.T) isn't a scalar). This is the way things are at present, but with the new @ syntax coming in I think we should consider fixing it. I can think of three possibilities: 1. Leave this as a trap for the unwary, and a reason for people to stick to np.matrix (np.matrix([1, 10]) behaves "correctly") 2. Make x.T a syntax error for 1-D arrays. It's a no-op and IMHO a trap. 3. Make x.T promote the shape == (2,) array to (1, 2) and return a (2, 1) array. This may be too magic, but it's my preferred solution. R > Implementation of @ (matrix multiplication) > - will be in 3.5 ~ 18months > - no work started yet -- have to make sure we do it. > - @@ was not added. > - The PEP for numpy is well-defined. Not much thinking to be done. (Good for a sprint) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: From sebastian at sipsolutions.net Fri Jul 18 04:03:59 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 18 Jul 2014 10:03:59 +0200 Subject: [Numpy-discussion] Numpy BoF at SciPy 2014 - quick report In-Reply-To: <3A0037EB-BF51-4943-8E27-EDDFC8F09456@astro.princeton.edu> References: <3A0037EB-BF51-4943-8E27-EDDFC8F09456@astro.princeton.edu> Message-ID: <1405670639.6974.4.camel@sebastian-t440> On Do, 2014-07-17 at 09:48 -0400, Robert Lupton the Good wrote: > Having just re-read the PEP I'm concerned that this proposal leaves at least one major (?) trap for naive users, namely > x = np.array([1, 10]) > print x.T @ x > which will print 101, not [[1, 10], [10, 100]] > > Yes, I know why this is happening but it's still a problem -- the user said, "I'm thinking matrices" when they wrote @ but the x.T had done the "wrong" thing before the @ kicked in. And yes, a savvy user would have written x = np.array([[1, 10]]) (but then np.dot(x, x.T) isn't a scalar). > > This is the way things are at present, but with the new @ syntax coming in I think we should consider fixing it. > > I can think of three possibilities: > 1. Leave this as a trap for the unwary, and a reason for people to stick to np.matrix (np.matrix([1, 10]) behaves "correctly") > 2. Make x.T a syntax error for 1-D arrays. It's a no-op and IMHO a trap. > 3. Make x.T promote the shape == (2,) array to (1, 2) and return a (2, 1) array. This may be too magic, but it's my preferred solution. > Making it a warning may be another option. Changing `.T` to promote to 2-d (also maybe to actually only transpose the last two axes for higher D arrays), could be nice, but getting there might take quite a long FutureWarning or even Error -> new feature cycle...
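For concreteness, a minimal sketch of the trap being described, written with np.dot since the @ operator itself only arrives with Python 3.5:

import numpy as np

x = np.array([1, 10])        # shape (2,); .T is a no-op on 1-d arrays
x.T.shape                    # -> (2,), unchanged
np.dot(x.T, x)               # -> 101, the inner product, not a matrix

row = np.array([[1, 10]])    # shape (1, 2); an explicit 2-d row vector
np.dot(row.T, row)           # -> array([[  1,  10],
                             #           [ 10, 100]])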
- Sebastian > R > > > Implementation of @ (matrix multiplication) > > - will be in 3.5 ~ 18months > > - no work started yet -- have to make sure we do it. > > - @@ was not added. > > - The PEP for numpy is well-defined. Not much thinking to be done. (Good for a sprint) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Fri Jul 18 06:31:13 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 18 Jul 2014 11:31:13 +0100 Subject: [Numpy-discussion] Numpy BoF at SciPy 2014 - quick report In-Reply-To: <1405670639.6974.4.camel@sebastian-t440> References: <3A0037EB-BF51-4943-8E27-EDDFC8F09456@astro.princeton.edu> <1405670639.6974.4.camel@sebastian-t440> Message-ID: On Fri, Jul 18, 2014 at 9:03 AM, Sebastian Berg wrote: > On Do, 2014-07-17 at 09:48 -0400, Robert Lupton the Good wrote: >> Having just re-read the PEP I'm concerned that this proposal leaves at least one major (?) trap for naive users, namely >> x = np.array([1, 10]) >> print X.T at x >> which will print 101, not [[1, 10], [10, 100]] >> >> Yes, I know why this is happening but it's still a problem -- the user said, "I'm thinking matrices" when they wrote @ but the x.T had done the "wrong" thing before the @ kicked in. And yes, a savvy user would have written x = np.ones([[1, 10]]) (but then np.dot(x, x.T) isn't a scalar). >> >> This is the way things are at present, but with the new @ syntax coming in I think we should consider fixing it. >> >> I can think of three possibilities: >> 1. Leave this as a trap for the unwary, and a reason for people to stick to np.matrix (np.matrix([1, 10]) behaves "correctly") >> 2. Make x.T a syntax error for 1-D arrays. It's a no-op and IMHO a trap. >> 3. Make x.T promote the shape == (2,) array to (1, 2) and return a (2, 1) array. This may be too magic, but it's my preferred solution. > > Making it a warning may be another option. Changing `.T` to promote to > 2-d (also maybe to actually only transpose the last two axes for higher > D arrays), could be nice, but getting there might take quite a long > FutureWarning or even Error -> new feature cycle... Hmm, just the other day I wrote some code that relies on the current behavior. I was writing a function that could work both on 3-vectors and arrays of 3-vectors. To unpack the input into the separate components, I did: x, y, z = vector.T Which works correctly whether `vector` is shaped (3,) or (N, 3). -- Robert Kern From njs at pobox.com Fri Jul 18 06:33:00 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 11:33:00 +0100 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Thu, Jul 17, 2014 at 10:05 PM, Chris Barker wrote: > A bit of a higher-level view of the issues at hand. > > Python has three relevant data types: > > A unicode type (unicode in py2, str in py3) > A one-byte-per-char stringtype (py2 string) > A bytes type > > The big problem is that py2 only has the unicode and py2string types, and > py3 only has the unicode and bytes type. > > numpy has 'S' and 'U' types: which map naturally to the py2string and > unicode types. > > but since py3 has no py2string type, we have a problem. > > If numpy were to embrace the py3 model, then 'S' should have mapped to py3's > string, aka unicode. > > But: > > 1) then there would be no bytes type, which is a problem, as people do need > to a pass collections of bytes around. 
I"ve alwyas figured numpy's uint8 > should suffice for that, but "strings of bytes" are useful, and it seem to > be awkward, or maybe impossible to construct such a beast with the usual > dtype machinery > > 2) there is a need (or at least a desire), to have a compact, > one-byte-per-charater text type in numpy. > > Thinking of it in this framework leads me to the conclusion that numpy > should have three types: This sounds pretty reasonable to me. > 1) A unicode type --no change here > > 2) A bytes types -- almost the current 'S' type > - A bytes type would map to/from py3 bytes objects (and py2 bytes > objects, which are the same as py2strings) > - one way is would differ from a py2str is that there would be no > assumption of null-termination (not sure where that is now) AFAICT this is *exactly* the same as the current 'S' type. What differences do you see? > 3) A one-byte-per-char text type -- more or less Chuck's current proposal. > - it would map to/from the py3 string -- it is text after all > - it would be null-terminated Numpy strings types are never null-terminated ATM. They're null-padded, which is slightly different. When storing data in an S5, for instance, strings of length 5 have no nulls appending, strings of length 4 have 1 null appended, strings of length 3 have 2 nulls appended, etc. When reading data out of an S5, then all trailing nulls are stripped. So, they may not be null terminated (if the length of the string exactly matches the length of the dtype), and the strings being stored can contain internal nulls ("foo\x00bar" is fine), but they cannot contain trailing nulls ("foo\x00" will come back as just "foo"). Do you actually care about null-termination specifically? Or did you just mean "it should work like the other ones, which I vaguely remember involves nulls"? ;-) > - it would have a one-byte per-char encoding: ascii, latin-1 or settable > (TBA) Settable is technically very difficult until we redo the dtype machinery to allow parametrized types. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Fri Jul 18 06:37:46 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 11:37:46 +0100 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: On Thu, Jul 17, 2014 at 11:10 PM, Charles G. Waldman wrote: > > -1 on the 'arr' name. I think if we're going to support this function at all (which I'm not convinced is a good idea), it should be np.fromsomething like the other from* functions. > > Maybe frommatlab? > > I think that 'arr' is just too generic and too close to 'array'. Well, it's definitely not a good idea if we name it something like that :-). The whole motivation is to provide a quick way to type 2d arrays interactively, hence the current name "np.mat". (The fact that it happens to match matlab syntax is a nice bonus, because stealing is always better than inventing when it works.) -- Nathaniel J. 
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Fri Jul 18 06:44:49 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 11:44:49 +0100 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Thu, Jul 17, 2014 at 9:07 PM, wrote: > On Wed, Jul 16, 2014 at 9:52 AM, Nathaniel Smith wrote: >> What you say makes sense to me, and loosening the default tolerances won't >> break any existing tests. (And I'm not too worried about people who were >> counting on getting 1e-7 instead of 1e-5 or whatever... if it matters that >> much to you exactly what tolerance you test, you should be setting the >> tolerance explicitly!) I vote that unless someone comes up with some >> terrible objection in the next few days then you should submit a PR :-) > > If you mean by this to add atol=1e-8 as default, then I'm against it. > > At least it will change the meaning of many of our tests in statsmodels. > > I'm using rtol to check for correct 1e-15 or 1e-30, which would be > completely swamped if you change the default atol=0. > Adding atol=0 to all assert_allclose that currently use only rtol is a lot > of work. > I think I almost never use a default rtol, but I often leave atol at the > default = 0. > > If we have zeros, then I don't think it's too much work to decide whether > this should be atol=1e-20, or 1e-8. This is a compelling use-case, but there are also lots of compelling usecases that want some non-zero atol (i.e., comparing stuff to 0). Saying that allclose is for one of those use cases and assert_allclose is for the other is... not a very felicitious API design, I think. So we really should do *something*. Are there really any cases where you want non-zero atol= that don't involve comparing something against a 'desired' value of zero? It's a little wacky, but I'm wondering if we ought to change the rule (for all versions of allclose) to if desired == 0: tol = atol else: tol = rtol * desired In particular, means that np.allclose(x, 1e-30) would reject x values of 0 or 2e-30, but np.allclose(x, 0) will accept x == 1e-30 or 2e-30. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Fri Jul 18 07:07:36 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 07:07:36 -0400 Subject: [Numpy-discussion] problems with mailing list ? Message-ID: Are the problems with sending out the messages with the mailing lists? I'm getting some replies without original messages, and in some threads I don't get replies, missing part of the discussions. Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jul 18 07:38:21 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 07:38:21 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Thu, Jul 17, 2014 at 4:07 PM, wrote: > > > > On Wed, Jul 16, 2014 at 9:52 AM, Nathaniel Smith wrote: > >> On 16 Jul 2014 10:26, "Tony Yu" wrote: >> > >> > Is there any reason why the defaults for `allclose` and >> `assert_allclose` differ? This makes debugging a broken test much more >> difficult. More importantly, using an absolute tolerance of 0 causes >> failures for some common cases. 
For example, if two values are very close >> to zero, a test will fail: >> > >> > np.testing.assert_allclose(0, 1e-14) >> > >> > Git blame suggests the change was made in the following commit, but I >> guess that change only reverted to the original behavior. >> > >> > >> https://github.com/numpy/numpy/commit/f43223479f917e404e724e6a3df27aa701e6d6bf >> > >> > It seems like the defaults for `allclose` and `assert_allclose` should >> match, and an absolute tolerance of 0 is probably not ideal. I guess this >> is a pretty big behavioral change, but the current default for >> `assert_allclose` doesn't seem ideal. >> >> What you say makes sense to me, and loosening the default tolerances >> won't break any existing tests. (And I'm not too worried about people who >> were counting on getting 1e-7 instead of 1e-5 or whatever... if it matters >> that much to you exactly what tolerance you test, you should be setting the >> tolerance explicitly!) I vote that unless someone comes up with some >> terrible objection in the next few days then you should submit a PR :-) >> > > If you mean by this to add atol=1e-8 as default, then I'm against it. > > At least it will change the meaning of many of our tests in statsmodels. > > I'm using rtol to check for correct 1e-15 or 1e-30, which would be > completely swamped if you change the default atol=0. > Adding atol=0 to all assert_allclose that currently use only rtol is a lot > of work. > I think I almost never use a default rtol, but I often leave atol at the > default = 0. > > If we have zeros, then I don't think it's too much work to decide whether > this should be atol=1e-20, or 1e-8. > copied from http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070639.html since I didn't get any messages here This is a compelling use-case, but there are also lots of compelling usecases that want some non-zero atol (i.e., comparing stuff to 0). Saying that allclose is for one of those use cases and assert_allclose is for the other is... not a very felicitious API design, I think. So we really should do *something*. Are there really any cases where you want non-zero atol= that don't involve comparing something against a 'desired' value of zero? It's a little wacky, but I'm wondering if we ought to change the rule (for all versions of allclose) to if desired == 0: tol = atol else: tol = rtol * desired In particular, means that np.allclose(x, 1e-30) would reject x values of 0 or 2e-30, but np.allclose(x, 0) will accept x == 1e-30 or 2e-30. -n That's much too confusing. I don't know what the usecases for np.allclose are since I don't have any. assert_allclose is one of our (statsmodels) most frequently used numpy function this is not informative: `np.allclose(x, 1e-30)` since there are keywords either np.assert_allclose(x, atol=1e-30) if I want to be "close" to zero or np.assert_allclose(x, rtol=1e-11, atol=1e-25) if we have a mix of large numbers and "zeros" in an array. Making the behavior of assert_allclose depending on whether desired is exactly zero or 1e-20 looks too difficult to remember, and which desired I use would depend on what I get out of R or Stata. atol=1e-8 is not close to zero in most cases in my experience. The numpy.testing assert functions are some of the most useful functions in numpy, and heavily used "code". 
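For reference, a minimal sketch of the difference in defaults being discussed (using the rtol/atol values quoted in this thread; numpy 1.8 era behaviour):

import numpy as np

# np.allclose(a, b)                 uses rtol=1e-5, atol=1e-8
# np.testing.assert_allclose(a, b)  uses rtol=1e-7, atol=0

np.allclose(0, 1e-14)                            # True: 1e-14 is within the default atol of 1e-8
np.testing.assert_allclose(0, 1e-14, atol=1e-8)  # passes once atol is given explicitly
np.testing.assert_allclose(0, 1e-14)             # raises AssertionError: atol defaults to 0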
Josef > > Josef > > > >> -n >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jul 18 07:41:57 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 07:41:57 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Thu, Jul 17, 2014 at 11:37 AM, Nathaniel Smith wrote: > On Wed, Jul 16, 2014 at 7:47 PM, Ralf Gommers > wrote: > > > > On Wed, Jul 16, 2014 at 6:37 AM, Tony Yu wrote: > >> It seems like the defaults for `allclose` and `assert_allclose` should > >> match, and an absolute tolerance of 0 is probably not ideal. I guess > this is > >> a pretty big behavioral change, but the current default for > >> `assert_allclose` doesn't seem ideal. > > > > I agree, current behavior quite annoying. It would make sense to change > the > > atol default to 1e-8, but technically it's a backwards compatibility > break. > > Would probably have a very minor impact though. Changing the default for > > rtol in one of the functions may be much more painful though, I don't > think > > that should be done. > > Currently we have: > > allclose: rtol=1e-5, atol=1e-8 > assert_allclose: rtol=1e-7, atol=0 > > Why would it be painful to change assert_allclose to match allclose? > It would weaken some tests, but no code would break. > We might break our code, if suddenly our test suite doesn't do what it is supposed to do. (rough guess: 40% of the statsmodels code are unit tests.) Josef > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 18 09:18:59 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2014 07:18:59 -0600 Subject: [Numpy-discussion] Numpy BoF at SciPy 2014 - quick report In-Reply-To: <1405670639.6974.4.camel@sebastian-t440> References: <3A0037EB-BF51-4943-8E27-EDDFC8F09456@astro.princeton.edu> <1405670639.6974.4.camel@sebastian-t440> Message-ID: On Fri, Jul 18, 2014 at 2:03 AM, Sebastian Berg wrote: > On Do, 2014-07-17 at 09:48 -0400, Robert Lupton the Good wrote: > > Having just re-read the PEP I'm concerned that this proposal leaves at > least one major (?) trap for naive users, namely > > x = np.array([1, 10]) > > print X.T at x > > which will print 101, not [[1, 10], [10, 100]] > > > > Yes, I know why this is happening but it's still a problem -- the user > said, "I'm thinking matrices" when they wrote @ but the x.T had done the > "wrong" thing before the @ kicked in. And yes, a savvy user would have > written x = np.ones([[1, 10]]) (but then np.dot(x, x.T) isn't a scalar). > > > > This is the way things are at present, but with the new @ syntax coming > in I think we should consider fixing it. > > > > I can think of three possibilities: > > 1. Leave this as a trap for the unwary, and a reason for people to > stick to np.matrix (np.matrix([1, 10]) behaves "correctly") > > 2. Make x.T a syntax error for 1-D arrays. It's a no-op and IMHO > a trap. > > 3. 
Make x.T promote the shape == (2,) array to (1, 2) and return a > (2, 1) array. This may be too magic, but it's my preferred solution. > > > > Making it a warning may be another option. Changing `.T` to promote to > 2-d (also maybe to actually only transpose the last two axes for higher > D arrays), could be nice, but getting there might take quite a long > FutureWarning or even Error -> new feature cycle... > I've toyed some with the idea of adding a flag bit for transpose of 1-d arrays. It would flip with every transpose and be ignored for non 1-d arrays. A bit of a hack, but would allow for a column/row vector distinction. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy.terrel at gmail.com Fri Jul 18 09:51:04 2014 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Fri, 18 Jul 2014 09:51:04 -0400 Subject: [Numpy-discussion] problems with mailing list ? In-Reply-To: References: Message-ID: Yes I've filed a ticket with Enthought. On Fri, Jul 18, 2014 at 7:07 AM, wrote: > Are the problems with sending out the messages with the mailing lists? > > I'm getting some replies without original messages, and in some threads I > don't get replies, missing part of the discussions. > > > Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charles at crunch.io Fri Jul 18 10:02:06 2014 From: charles at crunch.io (Charles G. Waldman) Date: Fri, 18 Jul 2014 07:02:06 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: I greatly prefer "np.mat" to "np.arr" for this, FWIW On Fri, Jul 18, 2014 at 3:37 AM, Nathaniel Smith wrote: > On Thu, Jul 17, 2014 at 11:10 PM, Charles G. Waldman wrote: >> >> -1 on the 'arr' name. I think if we're going to support this function at all (which I'm not convinced is a good idea), it should be np.fromsomething like the other from* functions. >> >> Maybe frommatlab? >> >> I think that 'arr' is just too generic and too close to 'array'. > > Well, it's definitely not a good idea if we name it something like that :-). > > The whole motivation is to provide a quick way to type 2d arrays > interactively, hence the current name "np.mat". (The fact that it > happens to match matlab syntax is a nice bonus, because stealing is > always better than inventing when it works.) > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From aldcroft at head.cfa.harvard.edu Fri Jul 18 10:04:22 2014 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Fri, 18 Jul 2014 10:04:22 -0400 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Thu, Jul 17, 2014 at 11:52 AM, Nathaniel Smith wrote: > On Tue, Jul 15, 2014 at 7:40 PM, Aldcroft, Thomas > wrote: > > > > On Sat, Jul 12, 2014 at 8:02 PM, Nathaniel Smith wrote: > >> > >> OTOH, fixed length nul padded latin1 would be useful for various flat > file > >> reading tasks. 
> > > > As one of the original agitators for this, let me re-iterate that what > the > > astronomical community *really* wants is the original proposal as > described > > by Chris Barker [1] and essentially what Charles said. We have large > data > > archives that have ASCII string data in binary formats like FITS and > HDF5. > > The current readers for those datasets present users with numpy S data > > types, which in Python 3 cannot be compared to str (unicode) literals. > In > > many cases those datasets are large, and in my case I regularly deal with > > multi-Gb sized bytestring arrays. Converting those to a U dtype is not > > practical. > > This is feedback is *super* useful, thanks. Can you elaborate a bit > more on your requirements? > > I get that: > - You have data that is treated as text, so it is convenient to be > able to use Python strings for things like equality tests, np.sum(arr > == "green") etc. > - Your data uses only ASCII characters, and you don't want to spend > more than 1 byte of memory per character. > > Do you ever have 8 bit characters, and if so, what encoding do you use? > No. > > Does it matter to you that the memory layout for these 1-byte-per-char > strings remain fixed-width nul-padded concatenated strings (e.g., > because you are mmap'ing files that have this format)? Or do FITS/HDF5 > handle layout details internally and you don't care so long as the > above requirements are met? > Yes, memory layout matters since mmap'ing files is a key feature in FITS. > > Does the fixed-width nature of numpy strings cause problems in the > above setting? > No. In particular FITS is ubiquitous as the binary data transport format in astronomy, and it specifies fixed width strings, so fixed width in numpy is a good thing in this case. More generally legacy (or even modern high-performance) Fortran / C will commonly handle string arrays as arrays of fixed width characters. In the majority of cases these codes (that I'm aware of) know nothing about unicode. This all works transparently with Python 2 + Numpy, so the goal is to have that same "it just works" capability in Python 3 with minimal code changes. Thanks, Tom > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri Jul 18 10:13:53 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 18 Jul 2014 16:13:53 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes Message-ID: hi, I have been doing a lot of backporting for the last few bugfix releases and noticed that our current approach committing to master and cherrypicking is not so good for the git history. When cherry picking a bugfix from master to a maintenance branch both branches contain a commit with the same content and git knows of no relation between them. This causes unnecessary merge conflicts when cherry picking two changes that modify the same file. The git version (1.9.1) I am using is not smart enough too figure out the changesets in both leaf commits is the same. Additionally the output of `git log maintenance/1.9.x..master` becomes very large as all already backported issues appear again in master. 
[0] To help with this I want to propose new best practices for pull requests of bugfixes suitable for backporting. Instead of basing the bugfix on the head commit of master, base it on the merge base between master and the latest maintenance branch. This allows merging the PR into both master and the maintenance branch without pulling in any extra changes from either branch. Then both branches contain the same commit, git's automerging can work better, and git log will only show you the commits that are really on one branch or the other. In practice this is very simple. You can still develop your bugfix on master but before you push it you just run: git rebase --onto $(git merge-base master maintenance/1.9.x) HEAD^ In most bugfix PRs this should work without conflict as they should be relatively small. If you get a merge conflict during this operation, just do git rebase --abort and do a normal pull request; in that case the backporter should worry about the conflict. Does this sound like a reasonable procedure? Cheers, Julian [0] git cherry is supposed to help with that, but it never really worked properly for me From jtaylor.debian at googlemail.com Fri Jul 18 11:10:53 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 18 Jul 2014 17:10:53 +0200 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: On Thu, Jul 17, 2014 at 5:48 PM, Nathaniel Smith wrote: > On Tue, Jul 15, 2014 at 4:29 PM, Charles R Harris > wrote: >> Thinking more about it, the easiest thing to do might be to make the S dtype >> a UTF-8 encoding. Most of the machinery to deal with that is already in >> place. That change might affect some users though, and we might need to do >> some work to make it backwards compatible with python 2. > > I'd be very concerned about backcompat for existing code that uses > e.g. "S128" as a dtype to mean "128 arbitrary bytes". An example is > this file format reading code: > https://github.com/rerpy/rerpy/blob/master/rerpy/io/erpss.py#L123 > The file format says there are 128 bytes there, and their > interpretation depends on other fields in the header -- but in one > case, for "large montages", there's an encoding where every 3 bytes > represents 4 characters using an ad hoc 6-bit character set: > https://github.com/rerpy/rerpy/blob/master/rerpy/io/erpss.py#L133 > > Perhaps this case could be handled better by using a u8 subarray or > something (that code also goes to some efforts to work around nul > padding), and that particular project hasn't been ported to py3 yet so > technically wouldn't be affected if we changed the meaning of "S" on > py3. But it does seem useful to have a "fixed length bytes" dtype even > in py3, and if we declare that be "S" then it avoids breaking any > existing code depending on it... > > We break code either way. Either we break applications using S as string type, but now it becomes bytes in python3. Or we break applications treating S as byte type and we change it to string in python3. Unfortunately we missed the opportunity when adding python3 support to fix the exact same bytes/text boundary issue which is the main reason why python3 exists in the first place. We should have made porting to numpy3 an intentionally(!) backward incompatible change just like python itself did. Now we are stuck with deciding which option breaks less. On the one hand, that S is bytes in python3 is somewhat established by now and lots of workarounds are already in place.
On the other hand, I think code that relies on S being bytes is in the minority and python3 usage is probably still insignificant in this area. Unfortunately getting actual numbers and not wild guesses on this is probably not easy. From charlesr.harris at gmail.com Fri Jul 18 11:20:31 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2014 09:20:31 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 12:53 PM, Ralf Gommers wrote: > > > > On Wed, Jul 16, 2014 at 10:07 AM, Nathaniel Smith wrote: > >> Weirdly, I never received Chuck's original email in this thread. Should >> some list admin be informed? >> > Also weirdly, my reply didn't show up on gmane. Not sure if it got > through, so re-sending: > > It's already in, so do you mean not using? Would help to know what the > issue is, because it's finished enough that it's already used in a released > version of scipy (in sparse matrices). > My own feeling is that we should leave it in as it is fairly useable and just needs to have some problematic case worked out. The fact that scipy already uses it is a strong argument to keep it in. I think Julian's concern is that they won't be worked out. Julian has started another thread on the topic and that is probably where the conversation should continue. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 18 11:38:02 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2014 09:38:02 -0600 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 8:13 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > hi, > I have been doing a lot of backporting for the last few bugfix > releases and noticed that our current approach committing to master > and cherrypicking is not so good for the git history. > When cherry picking a bugfix from master to a maintenance branch both > branches contain a commit with the same content and git knows of no > relation between them. This causes unnecessary merge conflicts when > cherry picking two changes that modify the same file. The git version > (1.9.1) I am using is not smart enough too figure out the changesets > in both leaf commits is the same. > Additionally the output of `git log maintenance/1.9.x..master` becomes > very large as all already backported issues appear again in master. > [0] > > To help with this I want to propose new best practices for pull > requests of bugfixes suitable for backporting. > Instead of basing the bugfix on the head commit of the master, base > them on the merge base between master and the latest maintenance > branch. > This allows merging the PR into both master and the maintenance branch > without pulling in any extra changes from either branches. > Then both branches contain the same commit and gits automerging can > work better and git log will only show you the commits that are only > really on one branch or the other. > > In practice this is very simple. You can still develop your bugfix on > master but before you push it you just run: > > git rebase --onto $(git merge-base master maintenance/1.9.x) HEAD^ > > In most bugfix PRs this should work without conflict as they should be > relatively small. 
> If you get a merge conflict during this operation, just do git rebase > --abort and do a normal pull request, in that case the backporter > should worry about the conflict. > > Does this sound like a reasonable procedure? > Cheers, > Julian > > [0] git cherry is supposed to help with that, but it never really > worked properly for me > Arrived here promptly. This looks OK to me, but with the understanding that a number of folks won't know what is going on. It should be documented in doc/source/dev/gitwash/development_workflow.rst and perhaps a command alias in .git/config would help, something like npyrebase, or hopefully something better ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Fri Jul 18 08:09:31 2014 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 18 Jul 2014 14:09:31 +0200 Subject: [Numpy-discussion] problems with mailing list ? In-Reply-To: References: Message-ID: <6B2A9899-AB3E-40E5-AB43-BD7A2DC07D8A@astro.physik.uni-goettingen.de> On 18 Jul 2014, at 01:07 pm, josef.pktd at gmail.com wrote: > Are the problems with sending out the messages with the mailing lists? > > I'm getting some replies without original messages, and in some threads I don't get replies, missing part of the discussions. > There seem to be problems with the Scipy list server; my last mails to astropy at scipy.org have taken 12-18 hours before they made it to the list, and some people here reported messages staying in the void for several days. But I think it?s been reported to Enthought already. Derek From charlesr.harris at gmail.com Fri Jul 18 11:57:33 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2014 09:57:33 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ and 1.9 release In-Reply-To: <53C6B35D.9020609@iki.fi> References: <53C56DA2.40402@googlemail.com> <53C6B35D.9020609@iki.fi> Message-ID: On Wed, Jul 16, 2014 at 11:16 AM, Pauli Virtanen wrote: > Hi, > > 15.07.2014 21:06, Julian Taylor kirjoitti: > [clip: __numpy_ufunc__] > > So I'm wondering if we should delay the introduction of this > > feature to 1.10 or is it important enough to wait until there is a > > consensus on the remaining issues? > > My 10c: > > The feature is not so much in hurry that it alone should delay 1.9. > Moreover, it's best for everyone that it is bug-free on the first go, > and it gets some real-world testing before the release. Better safe than > sorry. > > I'd pull it out from 1.9.x branch, and iron out the remaining wrinkles > before 1.10. > Thanks Pauli, your opinion on the matter is what I needed to see and I'll take it as dispositive. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Fri Jul 18 12:06:57 2014 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Fri, 18 Jul 2014 12:06:57 -0400 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: On Fri, Jul 18, 2014 at 11:10 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On Thu, Jul 17, 2014 at 5:48 PM, Nathaniel Smith wrote: > > On Tue, Jul 15, 2014 at 4:29 PM, Charles R Harris > > wrote: > >> Thinking more about it, the easiest thing to do might be to make the S > dtype > >> a UTF-8 encoding. Most of the machinery to deal with that is already in > >> place. 
That change might affect some users though, and we might need to > do > >> some work to make it backwards compatible with python 2. > > > > I'd be very concerned about backcompat for existing code that uses > > e.g. "S128" as a dtype to mean "128 arbitrary bytes". An example is > > this file format reading code: > > https://github.com/rerpy/rerpy/blob/master/rerpy/io/erpss.py#L123 > > The file format says there are 128 bytes there, and their > > interpretation depends on other fields in the header -- but in one > > case, for "large montages", there's an encoding where every 3 bytes > > represents 4 characters using an ad hoc 6-bit character set: > > https://github.com/rerpy/rerpy/blob/master/rerpy/io/erpss.py#L133 > > > > Perhaps this case could be handled better by using a u8 subarray or > > something (that code also goes to some efforts to work around nul > > padding), and that particular project hasn't been ported to py3 yet so > > technically wouldn't be affected if we changed the meaning of "S" on > > py3. But it does seem useful to have a "fixed length bytes" dtype even > > in py3, and if we declare that be "S" then it avoids breaking any > > existing code depending on it... > > > > We break code either way. > Either we break applications using S as string type, but now it > becomes bytes in python3. > Or we break applications treating S as byte type and we change it to > string in python3. > > Unfortunately we missed the opportunity when adding python3 support to > fix the same exact same bytes/text boundary issue which is the main > reason why pythons3 exists in the first place. > We should have made porting to numpy3 a intentionally(!) backward > incompatible change just like python itself did. > > Now we are stuck with deciding, which option breaks less. > On the one hand, that S is bytes in python3 is somewhat established by > now and lots of workarounds are already place. > Removing workarounds is generally a good thing (!), and often not that hard to do by numpy version number for libraries that need to support multiple numpy versions. It's never ideal to break compatibility, but in this case it would be fixing something that is currently not working in a useful way. - Tom > On the other hand, I think code that relies on S being bytes is in the > minority and python3 usage is probably still insignificant in this > area. Unfortunately getting actual numbers and not wild guesses on > this is probably not easy. _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Jul 18 12:07:45 2014 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 18 Jul 2014 19:07:45 +0300 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: <53C94651.4040805@iki.fi> 18.07.2014 18:10, Julian Taylor kirjoitti: [clip] > We break code either way. Either we break applications using S as > string type, but now it becomes bytes in python3. Or we break > applications treating S as byte type and we change it to string in > python3. > > Unfortunately we missed the opportunity when adding python3 support > to fix the same exact same bytes/text boundary issue which is the > main reason why pythons3 exists in the first place. We should have > made porting to numpy3 a intentionally(!) 
backward incompatible > change just like python itself did. > > Now we are stuck with deciding, which option breaks less. On the > one hand, that S is bytes in python3 is somewhat established by now > and lots of workarounds are already place. On the other hand, I > think code that relies on S being bytes is in the minority and > python3 usage is probably still insignificant in this area. > Unfortunately getting actual numbers and not wild guesses on this > is probably not easy. One way to try this out is to change the meaning of 'S' and see how badly e.g. pandas or matplotlib break on py3 as a consequence. Another approach would be to add a new 1-byte unicode as a type code different from 'S'. The automatic ASCII encoding in constructor/assignment on Py3 can be deprecated, which would make 'S' a strict bytes dtype. This also is not perfect, since array(['foo']) on Py2 should for backward compatibility continue returning dtype='S'. Moreover, already existing code does not make use of it. -- Pauli Virtanen From chris.barker at noaa.gov Fri Jul 18 12:10:00 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 09:10:00 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <1405423590.8281.7.camel@sebastian-t440> Message-ID: On Thu, Jul 17, 2014 at 8:48 AM, Nathaniel Smith wrote: > I'd be very concerned about backcompat for existing code that uses > e.g. "S128" as a dtype to mean "128 arbitrary bytes". yup -- 'S' matches teh py2 string well, which is BOTH text and bytes. That should not change -- at least in py2. > An example is > this file format reading code: > https://github.com/rerpy/rerpy/blob/master/rerpy/io/erpss.py#L123 > The file format says there are 128 bytes there, and their > interpretation depends on other fields in the header -- but in one > case, for "large montages", there's an encoding where every 3 bytes > represents 4 characters using an ad hoc 6-bit character set: > https://github.com/rerpy/rerpy/blob/master/rerpy/io/erpss.py#L133 > > Perhaps this case could be handled better by using a u8 subarray or > something (that code also goes to some efforts to work around nul padding), yes -- that might have been better, though I have not been successful at figuring out how to spell a dtype that works well -- hence my suggestion that we have a bytes type. > and that particular project hasn't been ported to py3 yet so > technically wouldn't be affected if we changed the meaning of "S" on > py3. But it does seem useful to have a "fixed length bytes" dtype even > in py3, and if we declare that be "S" then it avoids breaking any > existing code depending on it... > sure, but having 'S' be bytes does break other code that depends on it being a text type. Unfortunately, py2 mingled text and bytes, numpy mirrored that, so there is no completely backward compatible way to go forward. But for some guidance -- text is the big issue with py2 <-> p3 migration, so folks are presumable going to expect things to change with numpy text handling as well. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alan.isaac at gmail.com Fri Jul 18 12:11:06 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 18 Jul 2014 12:11:06 -0400 Subject: [Numpy-discussion] Numpy BoF at SciPy 2014 - quick report In-Reply-To: <1405670639.6974.4.camel@sebastian-t440> References: <3A0037EB-BF51-4943-8E27-EDDFC8F09456@astro.princeton.edu> <1405670639.6974.4.camel@sebastian-t440> Message-ID: <53C9471A.3010805@gmail.com> On 7/18/2014 4:03 AM, Sebastian Berg wrote: > Changing `.T` to promote to > 2-d (also maybe to actually only transpose the last two axes for higher > D arrays), could be nice, but getting there might take quite a long > FutureWarning or even Error -> new feature cycle. Considering the extent of implied breakage, I hope this will not be considered. Also, there are already nice ways to add an axis (even optionally, with `atleast_2d`). I think having `.T` as a no-op on a 1d array is correct behavior. I would not change it. However I can understand preferring an error. (Mathematica considers it an error.) Alan Isaac From chris.barker at noaa.gov Fri Jul 18 12:15:35 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 09:15:35 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 3:33 AM, Nathaniel Smith wrote: > > 2) A bytes types -- almost the current 'S' type > > - A bytes type would map to/from py3 bytes objects (and py2 bytes > > objects, which are the same as py2strings) > > - one way is would differ from a py2str is that there would be no > > assumption of null-termination (not sure where that is now) > > AFAICT this is *exactly* the same as the current 'S' type. What > differences do you see? as you mention it, it is the same on py3, except maybe handling of null bytes -- you mentioned that you had to do some work-arounds for that. a proper bytes type would do nothing special with null bytes. > > 3) A one-byte-per-char text type -- more or less Chuck's current > proposal. > > - it would map to/from the py3 string -- it is text after all > > - it would be null-terminated > > Numpy strings types are never null-terminated ATM. They're > null-padded, which is slightly different. When storing data in an S5, > for instance, strings of length 5 have no nulls appending, strings of > length 4 have 1 null appended, strings of length 3 have 2 nulls > appended, etc. When reading data out of an S5, then all trailing nulls > are stripped. > > So, they may not be null terminated (if the length of the string > exactly matches the length of the dtype), and the strings being stored > can contain internal nulls ("foo\x00bar" is fine), but they cannot > contain trailing nulls ("foo\x00" will come back as just "foo"). > > Do you actually care about null-termination specifically? Or did you > just mean "it should work like the other ones, which I vaguely > remember involves nulls"? ;-) > That's pretty much what I meant, yes ;-) But the key is that when pushing one of these things to a python string, any thing after a null byte is ignored. Which is why you can't use it for arbitrary bytes. > - it would have a one-byte per-char encoding: ascii, latin-1 or > settable > > (TBA) > > Settable is technically very difficult until we redo the dtype > machinery to allow parametrized types. indeed -- we have that a bit with Datetime -- but that's a whole other kettle of fish. -CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jul 18 12:23:40 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 17:23:40 +0100 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: Message-ID: On 18 Jul 2014 15:36, "Julian Taylor" wrote: > > git rebase --onto $(git merge-base master maintenance/1.9.x) HEAD^ As a potential refinement, this might be simpler if we define a branch that points to this commit. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri Jul 18 12:30:04 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 18 Jul 2014 18:30:04 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 5:38 PM, Charles R Harris wrote: > > > On Fri, Jul 18, 2014 at 8:13 AM, Julian Taylor > wrote: >> >> hi, >> I have been doing a lot of backporting for the last few bugfix >> releases and noticed that our current approach committing to master >> and cherrypicking is not so good for the git history. >> When cherry picking a bugfix from master to a maintenance branch both >> branches contain a commit with the same content and git knows of no >> relation between them. This causes unnecessary merge conflicts when >> cherry picking two changes that modify the same file. The git version >> (1.9.1) I am using is not smart enough too figure out the changesets >> in both leaf commits is the same. >> Additionally the output of `git log maintenance/1.9.x..master` becomes >> very large as all already backported issues appear again in master. >> [0] >> >> To help with this I want to propose new best practices for pull >> requests of bugfixes suitable for backporting. >> Instead of basing the bugfix on the head commit of the master, base >> them on the merge base between master and the latest maintenance >> branch. >> This allows merging the PR into both master and the maintenance branch >> without pulling in any extra changes from either branches. >> Then both branches contain the same commit and gits automerging can >> work better and git log will only show you the commits that are only >> really on one branch or the other. >> >> In practice this is very simple. You can still develop your bugfix on >> master but before you push it you just run: >> >> git rebase --onto $(git merge-base master maintenance/1.9.x) HEAD^ >> >> In most bugfix PRs this should work without conflict as they should be >> relatively small. >> If you get a merge conflict during this operation, just do git rebase >> --abort and do a normal pull request, in that case the backporter >> should worry about the conflict. >> >> Does this sound like a reasonable procedure? >> Cheers, >> Julian >> >> [0] git cherry is supposed to help with that, but it never really >> worked properly for me > > > Arrived here promptly. This looks OK to me, but with the understanding that > a number of folks won't know what is going on. 
It should be documented in > doc/source/dev/gitwash/development_workflow.rst and perhaps a command alias > in .git/config would help, something like npyrebase, or hopefully something > better ;) > > Chuck > Yes of course I would document it when its ok for everyone. I do not want that this inconveniences contributors, maybe we can just ask for it if extra changes are required for the PR anyway. I would just like that the people who merge PR's (which are currently just a handful) try to use this method when the PR is applicable for a maintenance branch. We can add a small tool that does what does the rebase, merges to both master and the branch and closes the PR, something like: tools/merge-backport-pr #pr-number From andrew.collette at gmail.com Fri Jul 18 12:32:24 2014 From: andrew.collette at gmail.com (Andrew Collette) Date: Fri, 18 Jul 2014 10:32:24 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: <-4597269384285942771@unknownmsgid> References: <-4597269384285942771@unknownmsgid> Message-ID: Hi Chris, >> A Latin-1 based 'a' type >> would have similar problems. > > Maybe not -- latin1 is fixed width. Yes, Latin-1 is fixed width, but the issue is that when writing to a fixed-width UTF8 string in HDF5, it will expand, possibly losing data. What I would like to avoid is a situation where a user writes a 10-byte string from NumPy into a 10-byte space in an HDF5 dataset, and unexpectedly loses the last few characters because of the encoding mismatch. People are used to truncation when e.g. storing a 20-byte string in a 10-byte dataset, but it's surprising when the source and destination are the same size. :) In any case, I certainly agree NumPy shouldn't be limited by the capabilities of HDF5. There are other valuable use cases, including access to the high-bit characters Latin-1 provides. But from a strict compatibility standpoint, ASCII would be beneficial. Andrew From chris.barker at noaa.gov Fri Jul 18 12:33:32 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 09:33:32 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: <53C94651.4040805@iki.fi> References: <1405423590.8281.7.camel@sebastian-t440> <53C94651.4040805@iki.fi> Message-ID: On Fri, Jul 18, 2014 at 9:07 AM, Pauli Virtanen wrote: > Another approach would be to add a new 1-byte unicode you can't do unicode in 1-byte -- so what does this mean, exactly? > This also is not perfect, since array(['foo']) on Py2 should for > backward compatibility continue returning dtype='S'. yup. but we may be OK -- as "bytes" in py2 is the same as string anyway. But what do we do with null bytes? when going from 'S' to py2 string? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri Jul 18 12:35:31 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 18 Jul 2014 18:35:31 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 6:23 PM, Nathaniel Smith wrote: > On 18 Jul 2014 15:36, "Julian Taylor" wrote: >> >> git rebase --onto $(git merge-base master maintenance/1.9.x) HEAD^ > > As a potential refinement, this might be simpler if we define a branch that > points to this commit. 
> we could do that, though the merge base changes to the last commit that was merged in that way. The old merge base is still valid but much older. I applied this method to some of my bugfixes so the current merge base of master and 1.9 is a commit from yesterday not anymore the diverging point of master and 1.9. But I don't know if the newer merge base makes any difference to git. From markperrymiller at gmail.com Fri Jul 18 12:45:52 2014 From: markperrymiller at gmail.com (Mark Miller) Date: Fri, 18 Jul 2014 09:45:52 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 3:37 AM, Nathaniel Smith wrote: > On Thu, Jul 17, 2014 at 11:10 PM, Charles G. Waldman > wrote: > > > > -1 on the 'arr' name. I think if we're going to support this function > at all (which I'm not convinced is a good idea), it should be > np.fromsomething like the other from* functions. > > > > Maybe frommatlab? > > > > I think that 'arr' is just too generic and too close to 'array'. > > Well, it's definitely not a good idea if we name it something like that > :-). > > The whole motivation is to provide a quick way to type 2d arrays > interactively, hence the current name "np.mat". (The fact that it > happens to match matlab syntax is a nice bonus, because stealing is > always better than inventing when it works.) > > Some minor confusion on my part. If the true goal is to just allow quick entry of a 2d array, why not just advocate using a = numpy.array(numpy.mat("1 2 3; 4 5 6; 7 8 9")) If anyone is really set on having this functionality, they could just write a one-line wrapper function and call it a day. Note that I would personally not use this type of shorthand syntax for teaching or presentations. I'd prefer to use proper python syntax myself from the get go rather than having to start over from square one and teach a completely different syntax for constructing >2d arrays. "There should be one-- and preferably only one --obvious way to do it." -Zen of Python -Mark _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jul 18 12:53:51 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 17:53:51 +0100 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 12:38 PM, wrote: > > On Thu, Jul 17, 2014 at 4:07 PM, wrote: > >> If you mean by this to add atol=1e-8 as default, then I'm against it. >> >> At least it will change the meaning of many of our tests in statsmodels. >> >> I'm using rtol to check for correct 1e-15 or 1e-30, which would be >> completely swamped if you change the default atol=0. >> Adding atol=0 to all assert_allclose that currently use only rtol is a lot >> of work. >> I think I almost never use a default rtol, but I often leave atol at the >> default = 0. >> >> If we have zeros, then I don't think it's too much work to decide whether >> this should be atol=1e-20, or 1e-8. > > > copied from > http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070639.html > since I didn't get any messages here > > This is a compelling use-case, but there are also lots of compelling > usecases that want some non-zero atol (i.e., comparing stuff to 0). 
> Saying that allclose is for one of those use cases and assert_allclose > is for the other is... not a very felicitious API design, I think. So > we really should do *something*. > > Are there really any cases where you want non-zero atol= that don't > involve comparing something against a 'desired' value of zero? It's a > little wacky, but I'm wondering if we ought to change the rule (for > all versions of allclose) to > > if desired == 0: > tol = atol > else: > tol = rtol * desired > > In particular, means that np.allclose(x, 1e-30) would reject x values > of 0 or 2e-30, but np.allclose(x, 0) will accept x == 1e-30 or 2e-30. > > -n > > > That's much too confusing. > I don't know what the usecases for np.allclose are since I don't have any. I wrote allclose because it's shorter, but my point is that assert_allclose and allclose should use the same criterion, and was making a suggestion for what that shared criterion might be. > assert_allclose is one of our (statsmodels) most frequently used numpy > function > > this is not informative: > > `np.allclose(x, 1e-30)` > > > since there are keywords > either np.assert_allclose(x, atol=1e-30) I think we might be talking past each other here -- 1e-30 here is my "gold" p-value that I'm hoping x will match, not a tolerance argument. > if I want to be "close" to zero > or > > np.assert_allclose(x, rtol=1e-11, atol=1e-25) > > if we have a mix of large numbers and "zeros" in an array. > > Making the behavior of assert_allclose depending on whether desired is > exactly zero or 1e-20 looks too difficult to remember, and which desired I > use would depend on what I get out of R or Stata. I thought your whole point here was that 1e-20 and zero are qualitatively different values that you would not want to accidentally confuse? Surely R and Stata aren't returning exact zeros for small non-zero values like probability tails? > atol=1e-8 is not close to zero in most cases in my experience. If I understand correctly (Tony?) the problem here is that another common use case for assert_allclose is in cases like assert_allclose(np.sin(some * complex ** calculation / (that - should - be * zero)), 0) For cases like this, you need *some* non-zero atol or the thing just doesn't work, and one could quibble over the exact value as long as it's larger than "normal" floating point error. These calculations usually involve "normal" sized numbers, so atol should be comparable to eps * these values. eps is 2e-16, so atol=1e-8 works for values up to around 1e8, which is a plausible upper bound for where people might expect assert_allclose to just work. I'm trying to figure out some way to support your use cases while also supporting other use cases. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From chris.barker at noaa.gov Fri Jul 18 12:54:03 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 09:54:03 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 9:32 AM, Andrew Collette wrote: > >> A Latin-1 based 'a' type > >> would have similar problems. > > > > Maybe not -- latin1 is fixed width. > > Yes, Latin-1 is fixed width, but the issue is that when writing to a > fixed-width UTF8 string in HDF5, it will expand, possibly losing data. > you shouldn't do that -- I was in no way suggesting that a latin-1 string get pushed to a utf-8 array by default -- that would be a bad idea. 
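To make the width mismatch concrete, here is a small plain-Python sketch (no h5py or numpy involved, and the sample string is made up):

    text = u"caf\xe9" * 2             # 8 characters, all of them in Latin-1
    len(text.encode("latin-1"))       # 8  -- one byte per character
    len(text.encode("utf-8"))         # 10 -- each e-acute takes 2 bytes in UTF-8

So 8 characters that fit exactly in an 8-byte Latin-1 field need 10 bytes of UTF-8 storage, which is where the silent truncation risk comes from.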
utf-8 is a unicode encoding, it should be used for unicode. As for truncation -- that's inherent in using a fixed-width array to store a non-fixed width encoding. What I would like to avoid is a situation where a user writes a > 10-byte string from NumPy into a 10-byte space in an HDF5 dataset, and > unexpectedly loses the last few characters because of the encoding > mismatch. > Again, they shouldn't do that, they should be pushing a 10-character string into something -- and utf-8 is going to (Possible) truncate that. That's HDF/utf-8 limitation that people are going to have to deal with. I think you're suggesting that numpy follow the HDF model, so that the numpy-HDF transition can be clean and easy. However, I think that utf-8 is an inappropriate model for numpy, and that the mess of bytes to utf-8 is pyHDF's problem, not numpy's. i.e your issue above -- should users put a 10 character string into a numpy 10 byte utf-8 type and see it truncated? That's what I want to avoid. In any case, I certainly agree NumPy shouldn't be limited by the > capabilities of HDF5. There are other valuable use cases, including > access to the high-bit characters Latin-1 provides. But from a strict > compatibility standpoint, ASCII would be beneficial. > This is where I wonder about HDF's "ascii" type -- is it really ascii? Or is it that old standby one-byte-per-character-and-if-it's-ascii-we-all-know-what-it-means-but-if-it's-not-we'll-still-pass-it-around type? i.e the old char* ? In which case, you can just push a latin-1 type into and out of your HDF ascii arrays and everything will work just fine. Unless someone stores something other than latin-1 or ascii in it -- but even then, the bytes would still be preserved. This is why I see no downside to latin-1 -- if you don't use the > 127 code points, it's the same thing -- if you do, you get some extra handy characters. The only difference is that a proper ascii type would not let you store anything above 127 at all -- why restrict ourselves? And if you want utf-8 in HDF, then use a unicode array knowing that some truncation could occur, or use a byte array, and do the encoding yourself, so the user knows exactly what they are doing. [it would be nice if numpy had a pure numpy solution to encoding/decoding, though maybe it wouldn't really be any faster than going through python anyway...] -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jul 18 12:59:32 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 17:59:32 +0100 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 5:54 PM, Chris Barker wrote: > > This is why I see no downside to latin-1 -- if you don't use the > 127 code > points, it's the same thing -- if you do, you get some extra handy > characters. The only difference is that a proper ascii type would not let > you store anything above 127 at all -- why restrict ourselves? IMO the extra characters aren't the most compelling argument for latin1 over ascii. Latin1 gives the nice assurance that if some jerk *does* give me an "ascii" file that somewhere has some byte with the 8th bit set, then I can still load the data and fix things by hand. 
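A plain-Python illustration of that failure mode (the byte string is made up, not from any particular file):

    raw = b"caf\xe9 au lait"      # nominally "ascii" data with one byte > 127
    raw.decode("latin-1")         # works: u'caf\xe9 au lait' -- every byte round-trips
    raw.decode("ascii")           # raises UnicodeDecodeError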
This is trickier if numpy just refuses to touch the data, blowing up with an exception when I try. In general it's easy to create numpy arrays containing arbitrary bitpatterns, so it's nice to have some strategy for what to do with them. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Fri Jul 18 13:00:24 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 18:00:24 +0100 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 3:02 PM, Charles G. Waldman wrote: > I greatly prefer "np.mat" to "np.arr" for this, FWIW Unfortunately that's already taken... -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From chris.barker at noaa.gov Fri Jul 18 13:00:02 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 10:00:02 -0700 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 9:53 AM, Nathaniel Smith wrote: > > I don't know what the usecases for np.allclose are since I don't have > any. > I use it all the time -- sometimes you want to check something, but not raise an assertion -- and I use it like: assert np.allclose() with pytest, because it does some nice failure reporting that way (though maybe because I just landed on that). Though I have to say I"m very surprised that assert_allclose() doesn't simpily call allclose() to do it's work, and having different default is really really bad. but that cat's out of the bag. If we don't normalize these, we should put nice strong notes in the docs for both that they are NOT the same. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy.terrel at gmail.com Fri Jul 18 13:00:28 2014 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Fri, 18 Jul 2014 13:00:28 -0400 Subject: [Numpy-discussion] Mailing list slowdown (was Re: __numpy_ufunc__) In-Reply-To: References: Message-ID: We think this is fixed now. Let me know if it is otherwise. On Thu, Jul 17, 2014 at 7:04 AM, Nathaniel Smith wrote: > On 17 Jul 2014 11:51, "Sebastian Berg" wrote: >> >> On Mi, 2014-07-16 at 09:07 +0100, Nathaniel Smith wrote: >> > Weirdly, I never received Chuck's original email in this thread. >> > Should some list admin be informed? >> > >> >> I send some mails yesterday and they never arrived... Not sure if it is >> a problem on my side or not. > > I did eventually get Chuck's original message, but not until several days > later. > > CC'ing postmaster at enthought.com in case they have some insight into what's > going on! > > -n > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andy.terrel at gmail.com Fri Jul 18 13:01:18 2014 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Fri, 18 Jul 2014 13:01:18 -0400 Subject: [Numpy-discussion] problems with mailing list ? 
In-Reply-To: <6B2A9899-AB3E-40E5-AB43-BD7A2DC07D8A@astro.physik.uni-goettingen.de> References: <6B2A9899-AB3E-40E5-AB43-BD7A2DC07D8A@astro.physik.uni-goettingen.de> Message-ID: The Enthought support tells me this is fixed now. Please let me know if otherwise. On Fri, Jul 18, 2014 at 8:09 AM, Derek Homeier wrote: > On 18 Jul 2014, at 01:07 pm, josef.pktd at gmail.com wrote: > >> Are the problems with sending out the messages with the mailing lists? >> >> I'm getting some replies without original messages, and in some threads I don't get replies, missing part of the discussions. >> > There seem to be problems with the Scipy list server; my last mails to astropy at scipy.org have taken > 12-18 hours before they made it to the list, and some people here reported messages staying in the > void for several days. But I think it?s been reported to Enthought already. > > Derek > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sturla.molden at gmail.com Fri Jul 18 13:03:10 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 18 Jul 2014 17:03:10 +0000 (UTC) Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes References: Message-ID: <1569159014427395668.306103sturla.molden-gmail.com@news.gmane.org> Julian Taylor wrote: > git rebase --onto $(git merge-base master maintenance/1.9.x) HEAD^ That's the problem with Git, it solves one problem an creates another. Personally I have no idea what that command might do. Sturla From alan.isaac at gmail.com Fri Jul 18 13:05:57 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 18 Jul 2014 13:05:57 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> Message-ID: <53C953F5.90100@gmail.com> On 7/18/2014 12:45 PM, Mark Miller wrote: > If the true goal is to just allow quick entry of a 2d array, why not just advocate using > a = numpy.array(numpy.mat("1 2 3; 4 5 6; 7 8 9")) It's even simpler: a = np.mat(' 1 2 3;4 5 6;7 8 9').A I'm not putting a dog in this race. Still I would say that the reason why such proposals miss the point is that there are introductory settings where one would like to explain as few complications as possible. In particular, one might prefer *not* to discuss the existence of a matrix type. As an additional downside, this is only good for 2d, and there have been proposals for the new array builder to handle other dimensions. fwiw, Alan Isaac From pav at iki.fi Fri Jul 18 13:26:59 2014 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 18 Jul 2014 20:26:59 +0300 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <1405423590.8281.7.camel@sebastian-t440> <53C94651.4040805@iki.fi> Message-ID: <53C958E3.9070700@iki.fi> 18.07.2014 19:33, Chris Barker kirjoitti: > On Fri, Jul 18, 2014 at 9:07 AM, Pauli Virtanen > wrote: > >> Another approach would be to add a new 1-byte unicode > > you can't do unicode in 1-byte -- so what does this mean, exactly? The first 256 unicode code points, which happen to coincide with latin1. >> This also is not perfect, since array(['foo']) on Py2 should for >> backward compatibility continue returning dtype='S'. > > yup. but we may be OK -- as "bytes" in py2 is the same as string > anyway. But what do we do with null bytes? when going from 'S' to > py2 string? 
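For reference, a quick sketch of the current 'S' behaviour in question (it matches the null-padding description earlier in the thread):

    import numpy as np
    a = np.array([b"foo\x00bar", b"baz\x00"], dtype="S8")
    a[0]        # b'foo\x00bar' -- interior nulls survive
    a[1]        # b'baz'        -- trailing nulls are stripped on the way out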
Changing the null chopping and preserving backward compat would require yet another new dtype. This would then mean that the 'S' dtype would become pretty much deprecated on Py3. Forcing everyone to re-do their Python 3 ports would be somewhat cleaner. However, this train may have left a couple of years ago. -- Pauli Virtanen From andrew.collette at gmail.com Fri Jul 18 13:29:10 2014 From: andrew.collette at gmail.com (Andrew Collette) Date: Fri, 18 Jul 2014 11:29:10 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: Hi Chris, > Again, they shouldn't do that, they should be pushing a 10-character string > into something -- and utf-8 is going to (Possible) truncate that. That's > HDF/utf-8 limitation that people are going to have to deal with. I think > you're suggesting that numpy follow the HDF model, so that the numpy-HDF > transition can be clean and easy. However, I think that utf-8 is an > inappropriate model for numpy, and that the mess of bytes to utf-8 is > pyHDF's problem, not numpy's. The root of the issue is that HDF5 provides a limited set of fixed-storage-width string types, and a fixed-storage-width NumPy type of the same size using Latin-1 can't map to any of them without losing data. For example, if "a10" is a hypothetical 10-byte-wide NumPy dtype using Latin-1, reading/writing to an "a10" HDF5 dataset backed with 10-byte UTF-8 storage would risk truncation, even if the advertised widths are the same. There is unfortunately nothing we can do in the h5py code base to paper over this... it's a limitation of the format. > This is where I wonder about HDF's "ascii" type -- is it really ascii? Or is > it that old standby > one-byte-per-character-and-if-it's-ascii-we-all-know-what-it-means-but-if-it's-not-we'll-still-pass-it-around > type? i.e the old char* ? > > In which case, you can just push a latin-1 type into and out of your HDF > ascii arrays and everything will work just fine. Unless someone stores > something other than latin-1 or ascii in it -- but even then, the bytes > would still be preserved. The encoding is explicitly ASCII (H5T_ASCII, in HDF5 lingo). Anecdotally, I've heard people store other encodings in it, but (1) I'm not eager to make things worse by mis-labelling data, and (2) the HDF Group has made indications that they may start checking the encoding at conversion time. (1) is particularly important, as a major focus of h5py is compatibility with the rest of the HDF5 ecosystem. Again, I wouldn't argue that these considerations by themselves are enough of a reason for NumPy to use ASCII or UTF-8, certainly. Just that from this particular HDF5 perspective, they provide maximum compatibility and minimize the chances of accidental data loss. Andrew From charlesr.harris at gmail.com Fri Jul 18 13:39:21 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2014 11:39:21 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 10:59 AM, Nathaniel Smith wrote: > On Fri, Jul 18, 2014 at 5:54 PM, Chris Barker > wrote: > > > > This is why I see no downside to latin-1 -- if you don't use the > 127 > code > > points, it's the same thing -- if you do, you get some extra handy > > characters. The only difference is that a proper ascii type would not let > > you store anything above 127 at all -- why restrict ourselves? 
> > IMO the extra characters aren't the most compelling argument for > latin1 over ascii. Latin1 gives the nice assurance that if some jerk > *does* give me an "ascii" file that somewhere has some byte with the > 8th bit set, then I can still load the data and fix things by hand. > This is trickier if numpy just refuses to touch the data, blowing up > with an exception when I try. In general it's easy to create numpy > arrays containing arbitrary bitpatterns, so it's nice to have some > strategy for what to do with them. > > Just to throw in one more complication, there is no buffer protocol for a fixed encoding type. In Python 3 'c', 's', 'p' are all considered as bytes, in Python 2 as strings. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Jul 18 13:47:15 2014 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 18 Jul 2014 20:47:15 +0300 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: Message-ID: <53C95DA3.7010901@iki.fi> 18.07.2014 19:35, Julian Taylor kirjoitti: > On Fri, Jul 18, 2014 at 6:23 PM, Nathaniel Smith > wrote: >> On 18 Jul 2014 15:36, "Julian Taylor" >> wrote: >>> >>> git rebase --onto $(git merge-base master maintenance/1.9.x) >>> HEAD^ >> >> As a potential refinement, this might be simpler if we define a >> branch that points to this commit. >> > > we could do that, though the merge base changes to the last commit > that was merged in that way. The old merge base is still valid but > much older. I applied this method to some of my bugfixes so the > current merge base of master and 1.9 is a commit from yesterday > not anymore the diverging point of master and 1.9. But I don't know > if the newer merge base makes any difference to git. Will the merge base actually ever change if you don't merge the branches to each other? *** The other well-known alternative to bugfixes is to first commit it in the earliest maintenance branch where you want to have it, and then merge that branch forward to the newer maintenance branches, and finally into master. Pauli From josef.pktd at gmail.com Fri Jul 18 14:03:56 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 14:03:56 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 12:53 PM, Nathaniel Smith wrote: > On Fri, Jul 18, 2014 at 12:38 PM, wrote: > > > > On Thu, Jul 17, 2014 at 4:07 PM, wrote: > > > >> If you mean by this to add atol=1e-8 as default, then I'm against it. > >> > >> At least it will change the meaning of many of our tests in statsmodels. > >> > >> I'm using rtol to check for correct 1e-15 or 1e-30, which would be > >> completely swamped if you change the default atol=0. > >> Adding atol=0 to all assert_allclose that currently use only rtol is a > lot > >> of work. > >> I think I almost never use a default rtol, but I often leave atol at the > >> default = 0. > >> > >> If we have zeros, then I don't think it's too much work to decide > whether > >> this should be atol=1e-20, or 1e-8. > > > > > > copied from > > http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070639.html > > since I didn't get any messages here > > > > This is a compelling use-case, but there are also lots of compelling > > usecases that want some non-zero atol (i.e., comparing stuff to 0). > > Saying that allclose is for one of those use cases and assert_allclose > > is for the other is... 
not a very felicitious API design, I think. So > > we really should do *something*. > > > > Are there really any cases where you want non-zero atol= that don't > > involve comparing something against a 'desired' value of zero? It's a > > little wacky, but I'm wondering if we ought to change the rule (for > > all versions of allclose) to > > > > if desired == 0: > > tol = atol > > else: > > tol = rtol * desired > > > > In particular, means that np.allclose(x, 1e-30) would reject x values > > of 0 or 2e-30, but np.allclose(x, 0) will accept x == 1e-30 or 2e-30. > > > > -n > > > > > > That's much too confusing. > > I don't know what the usecases for np.allclose are since I don't have > any. > > I wrote allclose because it's shorter, but my point is that > assert_allclose and allclose should use the same criterion, and was > making a suggestion for what that shared criterion might be. > > > assert_allclose is one of our (statsmodels) most frequently used numpy > > function > > > > this is not informative: > > > > `np.allclose(x, 1e-30)` > > > > > > since there are keywords > > either np.assert_allclose(x, atol=1e-30) > > I think we might be talking past each other here -- 1e-30 here is my > "gold" p-value that I'm hoping x will match, not a tolerance argument. > my mistake > > > if I want to be "close" to zero > > or > > > > np.assert_allclose(x, rtol=1e-11, atol=1e-25) > > > > if we have a mix of large numbers and "zeros" in an array. > > > > Making the behavior of assert_allclose depending on whether desired is > > exactly zero or 1e-20 looks too difficult to remember, and which desired > I > > use would depend on what I get out of R or Stata. > > I thought your whole point here was that 1e-20 and zero are > qualitatively different values that you would not want to accidentally > confuse? Surely R and Stata aren't returning exact zeros for small > non-zero values like probability tails? > > > atol=1e-8 is not close to zero in most cases in my experience. > > If I understand correctly (Tony?) the problem here is that another > common use case for assert_allclose is in cases like > > assert_allclose(np.sin(some * complex ** calculation / (that - should > - be * zero)), 0) > > For cases like this, you need *some* non-zero atol or the thing just > doesn't work, and one could quibble over the exact value as long as > it's larger than "normal" floating point error. These calculations > usually involve "normal" sized numbers, so atol should be comparable > to eps * these values. eps is 2e-16, so atol=1e-8 works for values up > to around 1e8, which is a plausible upper bound for where people might > expect assert_allclose to just work. I'm trying to figure out some way > to support your use cases while also supporting other use cases. > my problem is that there is no "normal" floating point error. If I have units in 1000 or units in 0.0001 depends on the example and dataset that we use for testing. 
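For reference, the defaults at issue and a toy case for each side of the argument (these are the actual current signatures, nothing hypothetical):

    import numpy as np
    from numpy.testing import assert_allclose

    # np.allclose(a, b, rtol=1e-05, atol=1e-08)
    # assert_allclose(actual, desired, rtol=1e-07, atol=0)

    np.allclose(np.sin(np.pi), 0)        # True:  atol=1e-8 absorbs the ~1.2e-16 residue
    np.allclose(1e-12, 1e-30)            # True:  atol=1e-8 swamps tiny "desired" values
    assert_allclose(np.sin(np.pi), 0)    # raises AssertionError: atol=0 and rtol * 0 == 0
    assert_allclose(1e-12, 1e-30)        # raises AssertionError: only the (huge) relative error counts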
this test two different functions/methods that calculate the same thing (Pdb) pval array([ 3.01270184e-42, 5.90847367e-02, 3.00066946e-12]) (Pdb) res2.pvalues array([ 3.01270184e-42, 5.90847367e-02, 3.00066946e-12]) (Pdb) assert_allclose(pval, res2.pvalues, rtol=5 * rtol, atol=1e-25) I don't care about errors that are smaller that 1e-25 for example testing p-values against Stata (Pdb) tt.pvalue array([ 5.70315140e-30, 6.24662551e-02, 5.86024090e-11]) (Pdb) res2.pvalues array([ 5.70315140e-30, 6.24662551e-02, 5.86024090e-11]) (Pdb) tt.pvalue - res2.pvalues array([ 2.16612016e-40, 2.51187959e-15, 4.30027936e-21]) (Pdb) tt.pvalue / res2.pvalues - 1 array([ 3.79811738e-11, 4.01900735e-14, 7.33806349e-11]) (Pdb) rtol 1e-10 (Pdb) assert_allclose(tt.pvalue, res2.pvalues, rtol=5 * rtol) I could find a lot more and maybe nicer examples, since I spend quite a bit of time fine tuning unit tests. Of course you can change it. But the testing functions are code and very popular code. And if you break backwards compatibility, then I wouldn't mind reviewing a pull request for statsmodels that adds 300 to 400 `atol=0` to the unit tests. :) Josef > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jul 18 14:20:54 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 14:20:54 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 2:03 PM, wrote: > > > > On Fri, Jul 18, 2014 at 12:53 PM, Nathaniel Smith wrote: > >> On Fri, Jul 18, 2014 at 12:38 PM, wrote: >> > >> > On Thu, Jul 17, 2014 at 4:07 PM, wrote: >> > >> >> If you mean by this to add atol=1e-8 as default, then I'm against it. >> >> >> >> At least it will change the meaning of many of our tests in >> statsmodels. >> >> >> >> I'm using rtol to check for correct 1e-15 or 1e-30, which would be >> >> completely swamped if you change the default atol=0. >> >> Adding atol=0 to all assert_allclose that currently use only rtol is a >> lot >> >> of work. >> >> I think I almost never use a default rtol, but I often leave atol at >> the >> >> default = 0. >> >> >> >> If we have zeros, then I don't think it's too much work to decide >> whether >> >> this should be atol=1e-20, or 1e-8. >> > >> > >> > copied from >> > http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070639.html >> > since I didn't get any messages here >> > >> > This is a compelling use-case, but there are also lots of compelling >> > usecases that want some non-zero atol (i.e., comparing stuff to 0). >> > Saying that allclose is for one of those use cases and assert_allclose >> > is for the other is... not a very felicitious API design, I think. So >> > we really should do *something*. >> > >> > Are there really any cases where you want non-zero atol= that don't >> > involve comparing something against a 'desired' value of zero? 
It's a >> > little wacky, but I'm wondering if we ought to change the rule (for >> > all versions of allclose) to >> > >> > if desired == 0: >> > tol = atol >> > else: >> > tol = rtol * desired >> > >> > In particular, means that np.allclose(x, 1e-30) would reject x values >> > of 0 or 2e-30, but np.allclose(x, 0) will accept x == 1e-30 or 2e-30. >> > >> > -n >> > >> > >> > That's much too confusing. >> > I don't know what the usecases for np.allclose are since I don't have >> any. >> >> I wrote allclose because it's shorter, but my point is that >> assert_allclose and allclose should use the same criterion, and was >> making a suggestion for what that shared criterion might be. >> >> > assert_allclose is one of our (statsmodels) most frequently used numpy >> > function >> > >> > this is not informative: >> > >> > `np.allclose(x, 1e-30)` >> > >> > >> > since there are keywords >> > either np.assert_allclose(x, atol=1e-30) >> >> I think we might be talking past each other here -- 1e-30 here is my >> "gold" p-value that I'm hoping x will match, not a tolerance argument. >> > > my mistake > > > >> >> > if I want to be "close" to zero >> > or >> > >> > np.assert_allclose(x, rtol=1e-11, atol=1e-25) >> > >> > if we have a mix of large numbers and "zeros" in an array. >> > >> > Making the behavior of assert_allclose depending on whether desired is >> > exactly zero or 1e-20 looks too difficult to remember, and which >> desired I >> > use would depend on what I get out of R or Stata. >> >> I thought your whole point here was that 1e-20 and zero are >> qualitatively different values that you would not want to accidentally >> confuse? Surely R and Stata aren't returning exact zeros for small >> non-zero values like probability tails? >> >> > atol=1e-8 is not close to zero in most cases in my experience. >> >> If I understand correctly (Tony?) the problem here is that another >> common use case for assert_allclose is in cases like >> >> assert_allclose(np.sin(some * complex ** calculation / (that - should >> - be * zero)), 0) >> >> For cases like this, you need *some* non-zero atol or the thing just >> doesn't work, and one could quibble over the exact value as long as >> it's larger than "normal" floating point error. These calculations >> usually involve "normal" sized numbers, so atol should be comparable >> to eps * these values. eps is 2e-16, so atol=1e-8 works for values up >> to around 1e8, which is a plausible upper bound for where people might >> expect assert_allclose to just work. I'm trying to figure out some way >> to support your use cases while also supporting other use cases. >> > > my problem is that there is no "normal" floating point error. > If I have units in 1000 or units in 0.0001 depends on the example and > dataset that we use for testing. 
> > this test two different functions/methods that calculate the same thing > > (Pdb) pval > array([ 3.01270184e-42, 5.90847367e-02, 3.00066946e-12]) > (Pdb) res2.pvalues > array([ 3.01270184e-42, 5.90847367e-02, 3.00066946e-12]) > (Pdb) assert_allclose(pval, res2.pvalues, rtol=5 * rtol, atol=1e-25) > > I don't care about errors that are smaller that 1e-25 > > for example testing p-values against Stata > > (Pdb) tt.pvalue > array([ 5.70315140e-30, 6.24662551e-02, 5.86024090e-11]) > (Pdb) res2.pvalues > array([ 5.70315140e-30, 6.24662551e-02, 5.86024090e-11]) > (Pdb) tt.pvalue - res2.pvalues > array([ 2.16612016e-40, 2.51187959e-15, 4.30027936e-21]) > (Pdb) tt.pvalue / res2.pvalues - 1 > array([ 3.79811738e-11, 4.01900735e-14, 7.33806349e-11]) > (Pdb) rtol > 1e-10 > (Pdb) assert_allclose(tt.pvalue, res2.pvalues, rtol=5 * rtol) > > > I could find a lot more and maybe nicer examples, since I spend quite a > bit of time fine tuning unit tests. > > Of course you can change it. > > But the testing functions are code and very popular code. > > And if you break backwards compatibility, then I wouldn't mind reviewing a > pull request for statsmodels that adds 300 to 400 `atol=0` to the unit > tests. :) > scipy (not current master) doesn't look "so" bad. I find 400 "assert_allclose(" and maybe a third to half use atol. As expected optimize uses only atol because of the convergence criteria. scipy.stats uses mostly rtol or default. Josef > > Josef > > >> >> -n >> >> -- >> Nathaniel J. Smith >> Postdoctoral researcher - Informatics - University of Edinburgh >> http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Jul 18 14:29:25 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 19:29:25 +0100 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 7:03 PM, wrote: > > On Fri, Jul 18, 2014 at 12:53 PM, Nathaniel Smith wrote: >> >> For cases like this, you need *some* non-zero atol or the thing just >> doesn't work, and one could quibble over the exact value as long as >> it's larger than "normal" floating point error. These calculations >> usually involve "normal" sized numbers, so atol should be comparable >> to eps * these values. eps is 2e-16, so atol=1e-8 works for values up >> to around 1e8, which is a plausible upper bound for where people might >> expect assert_allclose to just work. I'm trying to figure out some way >> to support your use cases while also supporting other use cases. > > > my problem is that there is no "normal" floating point error. > If I have units in 1000 or units in 0.0001 depends on the example and > dataset that we use for testing. 
> > this test two different functions/methods that calculate the same thing > > (Pdb) pval > array([ 3.01270184e-42, 5.90847367e-02, 3.00066946e-12]) > (Pdb) res2.pvalues > array([ 3.01270184e-42, 5.90847367e-02, 3.00066946e-12]) > (Pdb) assert_allclose(pval, res2.pvalues, rtol=5 * rtol, atol=1e-25) > > I don't care about errors that are smaller that 1e-25 > > for example testing p-values against Stata > > (Pdb) tt.pvalue > array([ 5.70315140e-30, 6.24662551e-02, 5.86024090e-11]) > (Pdb) res2.pvalues > array([ 5.70315140e-30, 6.24662551e-02, 5.86024090e-11]) > (Pdb) tt.pvalue - res2.pvalues > array([ 2.16612016e-40, 2.51187959e-15, 4.30027936e-21]) > (Pdb) tt.pvalue / res2.pvalues - 1 > array([ 3.79811738e-11, 4.01900735e-14, 7.33806349e-11]) > (Pdb) rtol > 1e-10 > (Pdb) assert_allclose(tt.pvalue, res2.pvalues, rtol=5 * rtol) > > > I could find a lot more and maybe nicer examples, since I spend quite a bit > of time fine tuning unit tests. ...these are all cases where there are not exact zeros, so my proposal would not affect them? I can see the argument that we shouldn't provide any default rtol/atol at all because there is no good default, but... I don't think putting that big of a barrier in front of newbies writing their first tests is a good idea. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Fri Jul 18 14:31:01 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 14:31:01 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: > > > > Making the behavior of assert_allclose depending on whether desired is > > exactly zero or 1e-20 looks too difficult to remember, and which desired > I > > use would depend on what I get out of R or Stata. > > I thought your whole point here was that 1e-20 and zero are > qualitatively different values that you would not want to accidentally > confuse? Surely R and Stata aren't returning exact zeros for small > non-zero values like probability tails? > > I was thinking of the case when we only see "pvalue < 1e-16" or something like this, and we replace this by assert close to zero. which would translate to `assert_allclose(pvalue, 0, atol=1e-16)` with maybe an additional rtol=1e-11 if we have an array of pvalues where some are "large" (>0.5). It's not a very frequent case, mainly when we don't have access to the underlying float numbers and only have the print representation. Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jul 18 14:41:25 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 14:41:25 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 2:29 PM, Nathaniel Smith wrote: > On Fri, Jul 18, 2014 at 7:03 PM, wrote: > > > > On Fri, Jul 18, 2014 at 12:53 PM, Nathaniel Smith wrote: > >> > >> For cases like this, you need *some* non-zero atol or the thing just > >> doesn't work, and one could quibble over the exact value as long as > >> it's larger than "normal" floating point error. These calculations > >> usually involve "normal" sized numbers, so atol should be comparable > >> to eps * these values. eps is 2e-16, so atol=1e-8 works for values up > >> to around 1e8, which is a plausible upper bound for where people might > >> expect assert_allclose to just work. 
I'm trying to figure out some way > >> to support your use cases while also supporting other use cases. > > > > > > my problem is that there is no "normal" floating point error. > > If I have units in 1000 or units in 0.0001 depends on the example and > > dataset that we use for testing. > > > > this test two different functions/methods that calculate the same thing > > > > (Pdb) pval > > array([ 3.01270184e-42, 5.90847367e-02, 3.00066946e-12]) > > (Pdb) res2.pvalues > > array([ 3.01270184e-42, 5.90847367e-02, 3.00066946e-12]) > > (Pdb) assert_allclose(pval, res2.pvalues, rtol=5 * rtol, atol=1e-25) > > > > I don't care about errors that are smaller that 1e-25 > > > > for example testing p-values against Stata > > > > (Pdb) tt.pvalue > > array([ 5.70315140e-30, 6.24662551e-02, 5.86024090e-11]) > > (Pdb) res2.pvalues > > array([ 5.70315140e-30, 6.24662551e-02, 5.86024090e-11]) > > (Pdb) tt.pvalue - res2.pvalues > > array([ 2.16612016e-40, 2.51187959e-15, 4.30027936e-21]) > > (Pdb) tt.pvalue / res2.pvalues - 1 > > array([ 3.79811738e-11, 4.01900735e-14, 7.33806349e-11]) > > (Pdb) rtol > > 1e-10 > > (Pdb) assert_allclose(tt.pvalue, res2.pvalues, rtol=5 * rtol) > > > > > > I could find a lot more and maybe nicer examples, since I spend quite a > bit > > of time fine tuning unit tests. > > ...these are all cases where there are not exact zeros, so my proposal > would not affect them? > > I can see the argument that we shouldn't provide any default rtol/atol > at all because there is no good default, but... I don't think putting > that big of a barrier in front of newbies writing their first tests is > a good idea. > I think atol=0 is **very** good for newbies, and everyone else. If expected is really zero or very small, then it immediately causes a test failure, and it's relatively obvious how to fix it. I worry a lot more about unit tests that don't "bite" written by newbies or not so newbies who just use a default. That's one of the problems we had with assert_almost_equal, and why I was very happy to switch to assert_allclose with it's emphasis on relative tolerance. Josef > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charles at crunch.io Fri Jul 18 14:42:22 2014 From: charles at crunch.io (Charles G. Waldman) Date: Fri, 18 Jul 2014 11:42:22 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <53C953F5.90100@gmail.com> References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> Message-ID: Well, if the goal is "shorthand", typing numpy.array(numpy.mat()) won't please many users. But the more I think about it, the less I think Numpy should support this (non-Pythonic) input mode. Too much molly-coddling of new users! When doing interactive work I usually just type: >>> np.array([[1,2,3], ... [4,5,6], ... [7,8,9]]) which is (IMO) easier to read: e.g. it's not totally obvious that "1,0,0;0,1,0;0,0,1" represents a 3x3 identity matrix, but [[1,0,0], [0,1,0], [0,0,1]] is pretty obvious. The difference in (non-whitespace) chars is 19 vs 25, so the "shorthand" doesn't seem to save that much. 
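For reference, the semicolon form is only a few lines of plain Python to
emulate -- the following is a rough sketch, not a proposal for an actual
numpy function, and the helper name `arr` is made up:

import numpy as np

def arr(text):
    # toy parser: ';' separates rows, whitespace or ',' separates entries;
    # handles only 2-d numeric input, no error checking
    rows = [row.replace(',', ' ').split() for row in text.split(';')]
    return np.array(rows, dtype=float)

arr("1 0 0; 0 1 0; 0 0 1")   # -> the same 3x3 identity as the bracket literal

So the comparison above is really about readability rather than feasibility.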
Just my ?0.02, - C On Fri, Jul 18, 2014 at 10:05 AM, Alan G Isaac wrote: > On 7/18/2014 12:45 PM, Mark Miller wrote: >> If the true goal is to just allow quick entry of a 2d array, why not just advocate using >> a = numpy.array(numpy.mat("1 2 3; 4 5 6; 7 8 9")) > > > It's even simpler: > a = np.mat(' 1 2 3;4 5 6;7 8 9').A > > I'm not putting a dog in this race. Still I would say that > the reason why such proposals miss the point is that > there are introductory settings where one would like > to explain as few complications as possible. In > particular, one might prefer *not* to discuss the > existence of a matrix type. As an additional downside, > this is only good for 2d, and there have been proposals > for the new array builder to handle other dimensions. > > fwiw, > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Fri Jul 18 14:44:08 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 19:44:08 +0100 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On 18 Jul 2014 19:31, wrote: >> >> >> > Making the behavior of assert_allclose depending on whether desired is >> > exactly zero or 1e-20 looks too difficult to remember, and which desired I >> > use would depend on what I get out of R or Stata. >> >> I thought your whole point here was that 1e-20 and zero are >> qualitatively different values that you would not want to accidentally >> confuse? Surely R and Stata aren't returning exact zeros for small >> non-zero values like probability tails? >> > > I was thinking of the case when we only see "pvalue < 1e-16" or something like this, and we replace this by assert close to zero. > which would translate to `assert_allclose(pvalue, 0, atol=1e-16)` > with maybe an additional rtol=1e-11 if we have an array of pvalues where some are "large" (>0.5). This example is also handled correctly by my proposal :-) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jul 18 14:43:39 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 11:43:39 -0700 Subject: [Numpy-discussion] Numpy BoF at SciPy 2014 - quick report In-Reply-To: References: <3A0037EB-BF51-4943-8E27-EDDFC8F09456@astro.princeton.edu> <1405670639.6974.4.camel@sebastian-t440> Message-ID: On Fri, Jul 18, 2014 at 6:18 AM, Charles R Harris wrote: > I've toyed some with the idea of adding a flag bit for transpose of 1-d > arrays. It would flip with every transpose and be ignored for non 1-d > arrays. A bit of a hack, but would allow for a column/row vector > distinction. > very cool! I've thought for a while that one of the major things lacking from numpy.matrix was row and column vectors. To do linear algebra naturally, you really need those. This may be a really lightweight way to get that - without the distinction between "arrays" and "matrixes", which I think we're trying to get rid of with the @ operator. when would this flag be used? - linear algebra operations (mostly @ -- anything else?) - broadcasting??? neat idea, anyway. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
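For anyone following along, this is the gap such a flag would paper over --
just current, documented ndarray behaviour, not the proposed change:

import numpy as np

v = np.array([1.0, 2.0, 3.0])
A = np.eye(3)

print(v.T.shape)               # (3,)   -- transposing a 1-d array is a no-op
print(A.dot(v).shape)          # (3,)   -- dot treats v as a column but returns 1-d
print(v[:, np.newaxis].shape)  # (3, 1) -- today's explicit way to get a column vector

The flag idea is presumably aimed at letting linear-algebra code make that
row/column distinction without the explicit newaxis.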
URL: From pav at iki.fi Fri Jul 18 14:47:20 2014 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 18 Jul 2014 21:47:20 +0300 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: <53C96BB8.4060104@iki.fi> 18.07.2014 21:03, josef.pktd at gmail.com kirjoitti: [clip] > Of course you can change it. > > But the testing functions are code and very popular code. > > And if you break backwards compatibility, then I wouldn't mind reviewing a > pull request for statsmodels that adds 300 to 400 `atol=0` to the unit > tests. :) 10c: Scipy has 960 of those, and atol ~ 0 is required in some cases (difficult to say in how big percentage without review). The default of atol=1e-8 is pretty large. There's ~60 instances of allclose(), most of which are in tests. About half of those don't have atol=, whereas most have rtol. Using allclose in non-test code without specifying both tolerances explicitly is IMHO a sign of sloppiness, as the default tolerances are both pretty big (and atol != 0 is not scale-free). *** Consistency would be nice, especially in not having traps like assert_allclose(a, b, eps) -> assert_(not np.allclose(a, b, eps)) Bumping the tolerances in assert_allclose() up to match allclose() will probably not break code, but it can render some tests ineffective. If the change is made, it needs to be noted in the release notes. I think the number of project authors who relied on that the default was atol=0 is not so big. (In other news, we should discourage use of assert_almost_equal, by telling people to use assert_allclose instead in the docstring at the least. It has only atol= and it specifies it in a very cumbersome log10 basis...) -- Pauli Virtanen From njs at pobox.com Fri Jul 18 14:49:04 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 19:49:04 +0100 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <53C953F5.90100@gmail.com> References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> Message-ID: On 18 Jul 2014 18:06, "Alan G Isaac" wrote: > > On 7/18/2014 12:45 PM, Mark Miller wrote: > > If the true goal is to just allow quick entry of a 2d array, why not just advocate using > > a = numpy.array(numpy.mat("1 2 3; 4 5 6; 7 8 9")) > > > It's even simpler: > a = np.mat(' 1 2 3;4 5 6;7 8 9').A > > I'm not putting a dog in this race. Still I would say that > the reason why such proposals miss the point is that > there are introductory settings where one would like > to explain as few complications as possible. In > particular, one might prefer *not* to discuss the > existence of a matrix type. As an additional downside, > this is only good for 2d, and there have been proposals > for the new array builder to handle other dimensions. Going through np.mat also fails on the meta-goal, which is to remove reasons for people to prefer np.matrix to np.ndarray, so that eventually we can deprecate the former without harm. As far as this goal goes, it's all very well for some of us to say that users should toughen up or whatever, but it's useless: they'll just ignore you and use np.mat because it's easier. And then we have even more of a mess to clean up later. -n -------------- next part -------------- An HTML attachment was scrubbed... 
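The pull towards np.matrix that this is trying to remove is mostly the
operator semantics -- standard behaviour of both types, shown here only for
context:

import numpy as np

M = np.matrix('1 2; 3 4')
A = np.array([[1, 2], [3, 4]])

print(M * M)       # matrix product: [[ 7 10] [15 22]]
print(A * A)       # elementwise:    [[ 1  4] [ 9 16]]
print(A.dot(A))    # matrix product for plain ndarrays, until `@` lands

Once `@` gives ndarrays the same one-character spelling, that particular
reason largely goes away.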
URL: From chris.barker at noaa.gov Fri Jul 18 14:50:07 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 11:50:07 -0700 Subject: [Numpy-discussion] Numpy BoF at SciPy 2014 - quick report In-Reply-To: References: Message-ID: On Wed, Jul 16, 2014 at 8:08 PM, Fernando Perez wrote: > - it would have been more productive if a focused numpy sprint had been > also planned, so that there could be more structured follow-up on the ideas > that came up. > The trick is people to do it -- there are a scary few number of people with skills, time, and inclination to work on the core numpy code. Exactly one of them (thanks Chuck!) was there for the sprints this year. If there were a way to put together a stand-alone numpy sprint at some point, that would be really great! In particular, Chris Barker brought up a number of things regarding > datetime and planned on following up during the sprints, but I'm not sure > what ended up happening. > We did indeed follow op. No code was written, but: Chuck, Mark W. and I come up with a rough proposal. A handful of other folks came by to chat about it, and seemed to think it would be useful. In short: Some minor changes to time zone handling, with a hook in place to potentially plug in fancier support in the future. Possibly a hook in to plug in addition calendars. We're working on a NEP as we speak (or, correctly speaking, I'm distracted from working on the PEP by reading the numpy list....) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jul 18 14:52:31 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 11:52:31 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> Message-ID: On Fri, Jul 18, 2014 at 11:49 AM, Nathaniel Smith wrote: > Going through np.mat also fails on the meta-goal, which is to remove > reasons for people to prefer np.matrix to np.ndarray, so that eventually we > can deprecate the former without harm. > > As far as this goal goes, it's all very well for some of us to say that > users should toughen up or whatever, but it's useless: they'll just ignore > you and use np.mat because it's easier. And then we have even more of a > mess to clean up later. > so maybe don't do anything new, and np.mat can produce an array at some point in the future when np.matrix is deprecated.... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
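In the meantime the "np.mat that returns an array" behaviour is easy to fake
in user code -- a throwaway sketch with a made-up name, not anything numpy
currently provides:

import numpy as np

def mat_as_array(text):
    # reuse the existing matlab-style string parser, but hand back a plain ndarray
    return np.asarray(np.mat(text))

mat_as_array("1 2 3; 4 5 6")   # 2x3 ndarray, not an np.matrix instance

which is the same trick as the np.array(np.mat(...)) and .A spellings earlier
in the thread.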
URL: From josef.pktd at gmail.com Fri Jul 18 15:13:05 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 15:13:05 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: Message-ID: On Fri, Jul 18, 2014 at 2:44 PM, Nathaniel Smith wrote: > On 18 Jul 2014 19:31, wrote: > >> > >> > >> > Making the behavior of assert_allclose depending on whether desired is > >> > exactly zero or 1e-20 looks too difficult to remember, and which > desired I > >> > use would depend on what I get out of R or Stata. > >> > >> I thought your whole point here was that 1e-20 and zero are > >> qualitatively different values that you would not want to accidentally > >> confuse? Surely R and Stata aren't returning exact zeros for small > >> non-zero values like probability tails? > >> > > > > I was thinking of the case when we only see "pvalue < 1e-16" or > something like this, and we replace this by assert close to zero. > > which would translate to `assert_allclose(pvalue, 0, atol=1e-16)` > > with maybe an additional rtol=1e-11 if we have an array of pvalues where > some are "large" (>0.5). > > This example is also handled correctly by my proposal :-) > depends on the details of your proposal alternative: desired is exactly zero means assert_equal (Pdb) self.res_reg.params[m:] array([ 0., 0., 0.]) (Pdb) assert_allclose(0, self.res_reg.params[m:]) (Pdb) assert_allclose(0, self.res_reg.params[m:], rtol=0, atol=0) (Pdb) This test uses currently assert_almost_equal with decimal=4 :( regularized estimation with hard thresholding: the first m values are estimate not equal zero, the m to the end elements are "exactly zero". This is discrete models fit_regularized which predates numpy assert_allclose. I haven't checked what the unit test of Kerby's current additions for fit_regularized looks like. unit testing is serious business: I'd rather have good unit test in SciPy related packages than convincing a few more newbies that they can use the defaults for everything. Josef > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jul 18 15:13:21 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 12:13:21 -0700 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: <53C96BB8.4060104@iki.fi> References: <53C96BB8.4060104@iki.fi> Message-ID: On Fri, Jul 18, 2014 at 11:47 AM, Pauli Virtanen wrote: > Using allclose in non-test code without specifying both tolerances > explicitly is IMHO a sign of sloppiness, as the default tolerances are > both pretty big (and atol != 0 is not scale-free). > using it without specifying tolerances is sloppy in ANY use case. Bumping the tolerances in assert_allclose() up to match allclose() will > probably not break code, but it can render some tests ineffective. > being a bit pedantic here, but rendering a test ineffective IS breaking code. And I'd rather a change break my tests than render them ineffective -- if they break, I'll go look at them. If they are rendered ineffective, I'll never notice. Curious here -- is atol necessary for anything OTHER than near zero? 
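Concretely, with an expected value of exactly zero the rtol term contributes
nothing, so atol is the only thing that can make the comparison pass --
standard allclose semantics, shown with made-up numbers:

>>> import numpy as np
>>> np.allclose(1e-12, 0.0, rtol=1e-5, atol=0)    # |a - b| <= atol + rtol*|b| = 0
False
>>> np.allclose(1e-12, 0.0, rtol=1e-5, atol=1e-8)
True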
I can see that in a given case, you may know exactly what range of values to expect (and everything in the array is of the same order of magnitude), but an appropriate rtol would work there too. If only zero testing is needed, then atol=0 makes sense as a default. (or maybe atol=eps) Note: """ The relative difference (`rtol` * abs(`b`)) and the absolute difference `atol` are added together to compare against the absolute difference between `a` and `b`. """ Which points to seting atol=0 for the default as well, or it can totally mess up a test on very small numbers. I'll bet there is a LOT of sloppy use of these out the wild (I know I've been sloppy), and Im starting to think that atol=0 is the ONLY appropriate default for the sloppy among us for instance: In [40]: a1 = np.array([1e-100]) In [41]: a2 = np.array([1.00000001e-100]) In [42]: np.all np.all np.allclose np.alltrue In [42]: np.allclose(a1, a2, rtol=1e-10) Out[42]: True In [43]: np.allclose(a1, a2, rtol=1e-10, atol=0) Out[43]: False That's really not good. By the way: Definition: np.allclose(a, b, rtol=1e-05, atol=1e-08) Really? those are HUGE defaults for double-precision math. I can't believe I haven't looked more closely at this before! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jul 18 15:16:53 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 12:16:53 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 9:59 AM, Nathaniel Smith wrote: > IMO the extra characters aren't the most compelling argument for > latin1 over ascii. Latin1 gives the nice assurance that if some jerk > *does* give me an "ascii" file that somewhere has some byte with the > 8th bit set, then I can still load the data and fix things by hand. > Absolutely! py2's frequent barfing on the ascii encoding is really a pain. And if you aren't going tin enforce ascii, then better to be clear about what those extra bits mean. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jul 18 15:27:43 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 12:27:43 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 10:29 AM, Andrew Collette wrote: > The root of the issue is that HDF5 provides a limited set of > fixed-storage-width string types, and a fixed-storage-width NumPy type > of the same size using Latin-1 can't map to any of them without losing > data. For example, if "a10" is a hypothetical 10-byte-wide NumPy > dtype using Latin-1, reading/writing to an "a10" HDF5 dataset backed > with 10-byte UTF-8 storage would risk truncation, even if the > advertised widths are the same. > I do get this, yes. > There is unfortunately nothing we can do in the h5py code base to > paper over this... it's a limitation of the format. yup. Similar limitations in numpy. 
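For instance, assignment into a fixed-width numpy string array is silently
truncated in just the same way:

>>> import numpy as np
>>> a = np.zeros(2, dtype='S5')          # fixed 5-byte fields
>>> a[0] = b'hello world'                # no error: silently truncated to the field width
>>> a[0] == b'hello'
True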
> This is where I wonder about HDF's "ascii" type -- is it really ascii? > Or is > > it that old standby > > > one-byte-per-character-and-if-it's-ascii-we-all-know-what-it-means-but-if-it's-not-we'll-still-pass-it-around > > type? i.e the old char* ? > > > > In which case, you can just push a latin-1 type into and out of your HDF > > ascii arrays and everything will work just fine. Unless someone stores > > something other than latin-1 or ascii in it -- but even then, the bytes > > would still be preserved. > > The encoding is explicitly ASCII (H5T_ASCII, in HDF5 lingo). > Anecdotally, I've heard people store other encodings in it, but (1) > I'm not eager to make things worse by mis-labelling data, and (2) the > HDF Group has made indications that they may start checking the > encoding at conversion time. (1) is particularly important, as a > major focus of h5py is compatibility with the rest of the HDF5 > ecosystem. > If it were me, I'd encourage the HDF group to NOT enforce ascii. just like with the numpy 'S' type, I'm guessing there is a fair bit of code in the wild that [ab]uses the ascii type by throwing other bytes in there. In fact, this one reason that utf-8 is so popular -- you still use all that code that simply takes a char* and passes it around (or maybe compares it), without making any assumptions about what it means. that from this particular HDF5 perspective, they provide maximum > compatibility and minimize the chances of accidental data loss. What it would do is push the problem from the HDF5<->numpy interface to the python<->numpy interface. I'm not sure that's a good trade off. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jul 18 15:32:55 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 12:32:55 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 9:59 AM, Nathaniel Smith wrote: > IMO the extra characters aren't the most compelling argument for > latin1 over ascii. Latin1 gives the nice assurance that if some jerk > *does* give me an "ascii" file that somewhere has some byte with the > 8th bit set, then I can still load the data and fix things by hand. > On Fri, Jul 18, 2014 at 10:39 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Just to throw in one more complication, there is no buffer protocol for a > fixed encoding type. In Python 3 'c', 's', 'p' are all considered as bytes, > in Python 2 as strings. I suppose another option is to formally cal it what has been a defacto non-standard for years: ascii-with-who-knows-what-for-the-higher-codes. i.e ASCII, but not barf on decoding, (replace?). but you can use latin-1 the same way, so why not? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
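That property is worth spelling out: latin-1 assigns a code point to every
byte value 0-255, so arbitrary bytes always decode and round-trip, while a
strict ascii decode refuses anything above 127. Plain Python 3, nothing
numpy-specific:

>>> raw = bytes(range(256))                        # every possible byte value
>>> raw.decode('latin-1').encode('latin-1') == raw
True
>>> raw.decode('ascii')
Traceback (most recent call last):
  ...
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 128: ordinal not in range(128)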
URL: From pav at iki.fi Fri Jul 18 15:43:05 2014 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 18 Jul 2014 22:43:05 +0300 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: <53C96BB8.4060104@iki.fi> Message-ID: 18.07.2014 22:13, Chris Barker kirjoitti: [clip] > but an appropriate rtol would work there too. If only zero testing is > needed, then atol=0 makes sense as a default. (or maybe atol=eps) There's plenty of room below eps, but finfo(float).tiny ~ 3e-308 (or some big multiple) is also reasonable in the scale-freeness sense. From andrew.collette at gmail.com Fri Jul 18 15:52:10 2014 From: andrew.collette at gmail.com (Andrew Collette) Date: Fri, 18 Jul 2014 13:52:10 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: Hi Chris, > What it would do is push the problem from the HDF5<->numpy interface to the > python<->numpy interface. > > I'm not sure that's a good trade off. Maybe I'm being too paranoid about the truncation issue. We already perform truncation when going from e.g. vlen to fixed-width strings in h5py... it's just the truncation behavior for same-width data that throws me. Here's a strawman for how a Latin-1 "a" type might be handled in h5py: 1. Creation from existing "a" data: Use vlen strings. Doesn't preserve the dtype, but maybe that's not so important. 2. Writing from "a" data to fixed-width ASCII: Copy, and replace bytes>127 with "?" (or don't) 3. Writing from "a" data to fixed-width UTF-8: Transcode and truncate (being careful not to end in the middle of a multibyte character) 4. Reading from fixed-width ASCII to "a": Straight copy, no inspection 5. Reading from fixed-width UTF-8 to "a": Copy, and replace non-Latin-1 chars with "?" (The above example uses replacement rather than raising an exception, because an exception in the HDF5 conversion callback will leave the write/read half-completed). In any case, I can say that the lack of an text 'S' type in NumPy has been a significant pain point for h5py users on Python 3 over the years. Whatever specific encoding ends up being used, such a type can only improve the situation, and I'm firmly in favor of it. Andrew From joseph.martinot-lagarde at m4x.org Fri Jul 18 16:15:42 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Fri, 18 Jul 2014 22:15:42 +0200 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> Message-ID: <53C9806E.3090008@m4x.org> Le 18/07/2014 20:42, Charles G. Waldman a ?crit : > Well, if the goal is "shorthand", typing numpy.array(numpy.mat()) > won't please many users. > > But the more I think about it, the less I think Numpy should support > this (non-Pythonic) input mode. Too much molly-coddling of new users! > When doing interactive work I usually just type: > >>>> np.array([[1,2,3], > ... [4,5,6], > ... [7,8,9]]) > > which is (IMO) easier to read: e.g. it's not totally obvious that > "1,0,0;0,1,0;0,0,1" represents a 3x3 identity matrix, but > > [[1,0,0], > [0,1,0], > [0,0,1]] > > is pretty obvious. > Compare what's comparable: [[1,0,0], [0,1,0], [0,0,1]] vs "1 0 0;" "0 1 0;" "0 0 1" or """ 1 0 0; 0 1 0; 0 0 1 """ [[1,0,0], [0,1,0], [0,0,1]] vs "1 0 0; 0 1 0; 0 0 1" > The difference in (non-whitespace) chars is 19 vs 25, so the > "shorthand" doesn't seem to save that much. 
Well, it's easier to type "" (twice the same character) than [], and you have no risk in swapping en opening and a closing bracket. In addition, you have to use AltGr on some keyboards to get the brackets. It doesn't boils down to a number of characters. > > Just my ?0.02, > > - C > > > > > On Fri, Jul 18, 2014 at 10:05 AM, Alan G Isaac wrote: >> On 7/18/2014 12:45 PM, Mark Miller wrote: >>> If the true goal is to just allow quick entry of a 2d array, why not just advocate using >>> a = numpy.array(numpy.mat("1 2 3; 4 5 6; 7 8 9")) >> >> >> It's even simpler: >> a = np.mat(' 1 2 3;4 5 6;7 8 9').A >> >> I'm not putting a dog in this race. Still I would say that >> the reason why such proposals miss the point is that >> there are introductory settings where one would like >> to explain as few complications as possible. In >> particular, one might prefer *not* to discuss the >> existence of a matrix type. As an additional downside, >> this is only good for 2d, and there have been proposals >> for the new array builder to handle other dimensions. >> >> fwiw, >> Alan Isaac >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charles at crunch.io Fri Jul 18 16:21:04 2014 From: charles at crunch.io (Charles G. Waldman) Date: Fri, 18 Jul 2014 13:21:04 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <53C9806E.3090008@m4x.org> References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> <53C9806E.3090008@m4x.org> Message-ID: Joseph Martinot-Lagarde writes: > Compare what's comparable: That's fair. > In addition, you have to use AltGr on some keyboards to get the brackets Wow, it must be rather painful to do any real programming on such a keyboard! - C On Fri, Jul 18, 2014 at 1:15 PM, Joseph Martinot-Lagarde wrote: > Le 18/07/2014 20:42, Charles G. Waldman a ?crit : >> Well, if the goal is "shorthand", typing numpy.array(numpy.mat()) >> won't please many users. >> >> But the more I think about it, the less I think Numpy should support >> this (non-Pythonic) input mode. Too much molly-coddling of new users! >> When doing interactive work I usually just type: >> >>>>> np.array([[1,2,3], >> ... [4,5,6], >> ... [7,8,9]]) >> >> which is (IMO) easier to read: e.g. it's not totally obvious that >> "1,0,0;0,1,0;0,0,1" represents a 3x3 identity matrix, but >> >> [[1,0,0], >> [0,1,0], >> [0,0,1]] >> >> is pretty obvious. >> > Compare what's comparable: > > [[1,0,0], > [0,1,0], > [0,0,1]] > > vs > > "1 0 0;" > "0 1 0;" > "0 0 1" > > or > > """ > 1 0 0; > 0 1 0; > 0 0 1 > """ > > [[1,0,0], [0,1,0], [0,0,1]] > vs > "1 0 0; 0 1 0; 0 0 1" > >> The difference in (non-whitespace) chars is 19 vs 25, so the >> "shorthand" doesn't seem to save that much. > > Well, it's easier to type "" (twice the same character) than [], and you > have no risk in swapping en opening and a closing bracket. In addition, > you have to use AltGr on some keyboards to get the brackets. It doesn't > boils down to a number of characters. 
> >> >> Just my ?0.02, >> >> - C >> >> >> >> >> On Fri, Jul 18, 2014 at 10:05 AM, Alan G Isaac wrote: >>> On 7/18/2014 12:45 PM, Mark Miller wrote: >>>> If the true goal is to just allow quick entry of a 2d array, why not just advocate using >>>> a = numpy.array(numpy.mat("1 2 3; 4 5 6; 7 8 9")) >>> >>> >>> It's even simpler: >>> a = np.mat(' 1 2 3;4 5 6;7 8 9').A >>> >>> I'm not putting a dog in this race. Still I would say that >>> the reason why such proposals miss the point is that >>> there are introductory settings where one would like >>> to explain as few complications as possible. In >>> particular, one might prefer *not* to discuss the >>> existence of a matrix type. As an additional downside, >>> this is only good for 2d, and there have been proposals >>> for the new array builder to handle other dimensions. >>> >>> fwiw, >>> Alan Isaac >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From chris.barker at noaa.gov Fri Jul 18 16:32:50 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 13:32:50 -0700 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: <53C96BB8.4060104@iki.fi> Message-ID: On Fri, Jul 18, 2014 at 12:43 PM, Pauli Virtanen wrote: > 18.07.2014 22:13, Chris Barker kirjoitti: > [clip] > > but an appropriate rtol would work there too. If only zero testing is > > needed, then atol=0 makes sense as a default. (or maybe atol=eps) > > There's plenty of room below eps, but finfo(float).tiny ~ 3e-308 (or > some big multiple) is also reasonable in the scale-freeness sense. right! brain blip -- eps is the difference between 1 and then next larger representable number, yes? So a long way away from smallest representable number. So yes, zero or [something]e-308 -- making zero seem like a good idea again.... is it totally ridiculous to have the default be dependent on dtype? float32 vs float64? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jul 18 16:44:34 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 16:44:34 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> <53C9806E.3090008@m4x.org> Message-ID: On Fri, Jul 18, 2014 at 4:21 PM, Charles G. Waldman wrote: > Joseph Martinot-Lagarde writes: > > > Compare what's comparable: > > That's fair. > > > In addition, you have to use AltGr on some keyboards to get the brackets > > Wow, it must be rather painful to do any real programming on such a > keyboard! > > - C > > > On Fri, Jul 18, 2014 at 1:15 PM, Joseph Martinot-Lagarde > wrote: > > Le 18/07/2014 20:42, Charles G. 
Waldman a ?crit : > >> Well, if the goal is "shorthand", typing numpy.array(numpy.mat()) > >> won't please many users. > >> > >> But the more I think about it, the less I think Numpy should support > >> this (non-Pythonic) input mode. Too much molly-coddling of new users! > >> When doing interactive work I usually just type: > >> > >>>>> np.array([[1,2,3], > >> ... [4,5,6], > >> ... [7,8,9]]) > >> > >> which is (IMO) easier to read: e.g. it's not totally obvious that > >> "1,0,0;0,1,0;0,0,1" represents a 3x3 identity matrix, but > >> > >> [[1,0,0], > >> [0,1,0], > >> [0,0,1]] > >> > >> is pretty obvious. > >> > > Compare what's comparable: > > > > [[1,0,0], > > [0,1,0], > > [0,0,1]] > > > > vs > > > > "1 0 0;" > > "0 1 0;" > > "0 0 1" > > > > or > > > > """ > > 1 0 0; > > 0 1 0; > > 0 0 1 > > """ > > > > [[1,0,0], [0,1,0], [0,0,1]] > > vs > > "1 0 0; 0 1 0; 0 0 1" > > > >> The difference in (non-whitespace) chars is 19 vs 25, so the > >> "shorthand" doesn't seem to save that much. > > > > Well, it's easier to type "" (twice the same character) than [], and you > > have no risk in swapping en opening and a closing bracket. In addition, > > you have to use AltGr on some keyboards to get the brackets. It doesn't > > boils down to a number of characters. > > > >> > >> Just my ?0.02, > It's the year of the notebook. notebooks are reusable. notebooks correctly align the brackets in the second and third line and it looks pretty, just like a matrix (But, I don't have to teach newbies, and often I even correct whitespace on the commandline, because it looks ugly and I will eventually copy it to a script file.) Josef no broken windows! well, except for the ones I don't feel like fixing right now. > >> > >> - C > >> > >> > >> > >> > >> On Fri, Jul 18, 2014 at 10:05 AM, Alan G Isaac > wrote: > >>> On 7/18/2014 12:45 PM, Mark Miller wrote: > >>>> If the true goal is to just allow quick entry of a 2d array, why not > just advocate using > >>>> a = numpy.array(numpy.mat("1 2 3; 4 5 6; 7 8 9")) > >>> > >>> > >>> It's even simpler: > >>> a = np.mat(' 1 2 3;4 5 6;7 8 9').A > >>> > >>> I'm not putting a dog in this race. Still I would say that > >>> the reason why such proposals miss the point is that > >>> there are introductory settings where one would like > >>> to explain as few complications as possible. In > >>> particular, one might prefer *not* to discuss the > >>> existence of a matrix type. As an additional downside, > >>> this is only good for 2d, and there have been proposals > >>> for the new array builder to handle other dimensions. > >>> > >>> fwiw, > >>> Alan Isaac > >>> > >>> _______________________________________________ > >>> NumPy-Discussion mailing list > >>> NumPy-Discussion at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Fri Jul 18 16:44:39 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 13:44:39 -0700 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 12:52 PM, Andrew Collette wrote: > > What it would do is push the problem from the HDF5<->numpy interface to > the > > python<->numpy interface. > > > > I'm not sure that's a good trade off. > > Maybe I'm being too paranoid about the truncation issue. Actually, I agree about the truncation issue, but it's a question of where to put it -- I'm suggesting that I don't want it at the python<->numpy interface. > Here's a strawman for how a Latin-1 "a" type might be handled in h5py: > > 1. Creation from existing "a" data: Use vlen strings. Doesn't > preserve the dtype, but maybe that's not so important. > do vlen strings support full unicode? -- then, yes, that's good. > 2. Writing from "a" data to fixed-width ASCII: Copy, and replace > bytes>127 with "?" (or don't) > I'd vote for don't, unless HDF starts enforcing pure ascii. But if it does, then yes, replacement makes more sense than exceptions. 3. Writing from "a" data to fixed-width UTF-8: Transcode and truncate > (being careful not to end in the middle of a multibyte character) > yup -- buyer beware. > 4. Reading from fixed-width ASCII to "a": Straight copy, no inspection > yup. > 5. Reading from fixed-width UTF-8 to "a": Copy, and replace > non-Latin-1 chars with "?" > sure what about reading from fixed-width UTF-8 to 'U' -- that seems like the natural way to go for unicode. Tough a bit hard to know how long U needs to be -- but <= the length of the utf-8 array (in characters). > (The above example uses replacement rather than raising an exception, > because an exception in the HDF5 conversion callback will leave the > write/read half-completed). > and really -- what would you do with an exception on read? give up and throw the file away? note that I'm also proposing a "bytes" dtype, which might make sense for grabbing utf-8 data from HDF-5. Then either h5py or the user could decode to a unicode type. In any case, I can say that the lack of an text 'S' type in NumPy has > been a significant pain point for h5py users on Python 3 over the > years. isn't the current 'S' a pretty good map to hdf ascii? Whatever specific encoding ends up being used, such a type can > only improve the situation, and I'm firmly in favor of it. agreed. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jul 18 16:46:23 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 18 Jul 2014 13:46:23 -0700 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: <53C9806E.3090008@m4x.org> References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> <53C9806E.3090008@m4x.org> Message-ID: On Fri, Jul 18, 2014 at 1:15 PM, Joseph Martinot-Lagarde < joseph.martinot-lagarde at m4x.org> wrote: > In addition, > you have to use AltGr on some keyboards to get the brackets. If it's hard to type square brackets -- you're kind of dead in the water with Python anyway -- this is not going to help. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri Jul 18 16:53:26 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 18 Jul 2014 22:53:26 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: <53C95DA3.7010901@iki.fi> References: <53C95DA3.7010901@iki.fi> Message-ID: <53C98946.3020405@googlemail.com> On 18.07.2014 19:47, Pauli Virtanen wrote: > 18.07.2014 19:35, Julian Taylor kirjoitti: >> On Fri, Jul 18, 2014 at 6:23 PM, Nathaniel Smith >> wrote: >>> On 18 Jul 2014 15:36, "Julian Taylor" >>> wrote: >>>> >>>> git rebase --onto $(git merge-base master maintenance/1.9.x) >>>> HEAD^ >>> >>> As a potential refinement, this might be simpler if we define a >>> branch that points to this commit. >>> >> >> we could do that, though the merge base changes to the last commit >> that was merged in that way. The old merge base is still valid but >> much older. I applied this method to some of my bugfixes so the >> current merge base of master and 1.9 is a commit from yesterday >> not anymore the diverging point of master and 1.9. But I don't know >> if the newer merge base makes any difference to git. > > Will the merge base actually ever change if you don't merge the > branches to each other we want to merge them into each other so a change of merge base is unavoidable. > > The other well-known alternative to bugfixes is to first commit it in > the earliest maintenance branch where you want to have it, and then > merge that branch forward to the newer maintenance branches, and > finally into master. > wouldn't that still require basing bugfixes onto the point before the master and maintenance branch diverged? otherwise a merge from maintenance to master would include the commits that are only part of the maintenance branch (release commits, regression fixes etc.) basing bugfixes on maintenance does allow cherry picking into master as you don't care too much about backward mergeability here, but you still lose a good git log and git branch --contains to check which bugfix is in which branch. From joseph.martinot-lagarde at m4x.org Fri Jul 18 17:04:11 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Fri, 18 Jul 2014 23:04:11 +0200 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> <53C9806E.3090008@m4x.org> Message-ID: Le 18/07/2014 22:46, Chris Barker a ?crit : > On Fri, Jul 18, 2014 at 1:15 PM, Joseph Martinot-Lagarde > > wrote: > > In addition, > you have to use AltGr on some keyboards to get the brackets. > > > If it's hard to type square brackets -- you're kind of dead in the water > with Python anyway -- this is not going to help. > > -Chris > Welcome to the azerty world ! ;) It's not that hard to type, just a bit more involved. My biggest problem is that you have to type the opening and closing bracket for each line, with a comma in between. It will always be harder and more error prone than a single semicolon, whatever the keyboard. My use case is not teaching but doing quick'n'dirty computations with a few values. 
Sometimes these values are copy-pasted from a space separated file, or from a printed array in another console. Having to add comas and bracket makes simple computations less easy. That's why I often use Octave for these. From andrew.collette at gmail.com Fri Jul 18 17:30:59 2014 From: andrew.collette at gmail.com (Andrew Collette) Date: Fri, 18 Jul 2014 15:30:59 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: Hi Chris, > Actually, I agree about the truncation issue, but it's a question of where > to put it -- I'm suggesting that I don't want it at the python<->numpy > interface. Yes, that's a good point. Of course, by using Latin-1 rather than UTF-8 we can't support all Unicode code points (hence the "?" replacement possible on read from HDF5). > do vlen strings support full unicode? -- then, yes, that's good. Yes, they do. It's somewhat unfortunate to immediately cast to vlen though, since people usually have fixed-width datasets to start with for efficiency reasons... > what about reading from fixed-width UTF-8 to 'U' -- that seems like the > natural way to go for unicode. Tough a bit hard to know how long U needs to > be -- but <= the length of the utf-8 array (in characters). Space concerns ("U" has a 4x space penalty for ASCII-ish data). Plus, for similar reasons to this discussion, creating "U" datasets is unsupported at the moment. > note that I'm also proposing a "bytes" dtype, which might make sense for > grabbing utf-8 data from HDF-5. Then either h5py or the user could decode to > a unicode type. Sound quite like the existing 'S' type. >> In any case, I can say that the lack of an text 'S' type in NumPy has >> been a significant pain point for h5py users on Python 3 over the >> years. > > isn't the current 'S' a pretty good map to hdf ascii? Yes; in fact, right now all fixed-width strings in h5py (ASCII and UTF-8) are read/written as 'S'. The problem is that on Py3, 'S' is treated as bytes, not text, so you can't freely mix it with str. I am about to leave for the weekend... thanks for a great discussion! To conclude, it strikes me that in choosing an encoding we get to pick at most two of the following: 1. Support for all Unicode characters 2. Fixed number of characters 3. Fixed number of storage bytes At this point, I would vote for UTF-8 in a fixed width buffer (1/3), but I imagine as this progresses towards a NEP others will weigh in. Andrew From josef.pktd at gmail.com Fri Jul 18 17:31:28 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Jul 2014 17:31:28 -0400 Subject: [Numpy-discussion] Short-hand array creation in `numpy.mat` style In-Reply-To: References: <53B9C861.3090809@hawaii.edu> <-2968451659458027190@unknownmsgid> <53C953F5.90100@gmail.com> <53C9806E.3090008@m4x.org> Message-ID: On Fri, Jul 18, 2014 at 5:04 PM, Joseph Martinot-Lagarde < joseph.martinot-lagarde at m4x.org> wrote: > Le 18/07/2014 22:46, Chris Barker a ?crit : > > On Fri, Jul 18, 2014 at 1:15 PM, Joseph Martinot-Lagarde > > > > wrote: > > > > In addition, > > you have to use AltGr on some keyboards to get the brackets. > > > > > > If it's hard to type square brackets -- you're kind of dead in the water > > with Python anyway -- this is not going to help. > > > > -Chris > > > Welcome to the azerty world ! ;) > > It's not that hard to type, just a bit more involved. My biggest problem > is that you have to type the opening and closing bracket for each line, > with a comma in between. 
It will always be harder and more error prone > than a single semicolon, whatever the keyboard. > > My use case is not teaching but doing quick'n'dirty computations with a > few values. Sometimes these values are copy-pasted from a space > separated file, or from a printed array in another console. Having to > add comas and bracket makes simple computations less easy. That's why I > often use Octave for these. > my copy paste approaches for almost quick'n'dirty (no semicolons): given: a b c 1 2 3 4 5 6 7 8 9 (select & Ctrl-C) >>> pandas.read_clipboard(sep=' ') a b c 0 1 2 3 1 4 5 6 2 7 8 9 >>> np.asarray(pandas.read_clipboard()) array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=int64) >>> pandas.read_clipboard().values array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=int64) arr = np.array('''\ 1 2 3 4 5 6 7 8 9'''.split(), float).reshape(-1, 3) the last is not so quick and dirty but reusable and reused. Josef > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 18 17:49:20 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2014 15:49:20 -0600 Subject: [Numpy-discussion] String type again. In-Reply-To: References: <-4597269384285942771@unknownmsgid> Message-ID: On Fri, Jul 18, 2014 at 3:30 PM, Andrew Collette wrote: > Hi Chris, > > > Actually, I agree about the truncation issue, but it's a question of > where > > to put it -- I'm suggesting that I don't want it at the python<->numpy > > interface. > > Yes, that's a good point. Of course, by using Latin-1 rather than > UTF-8 we can't support all Unicode code points (hence the "?" > replacement possible on read from HDF5). > > > do vlen strings support full unicode? -- then, yes, that's good. > > Yes, they do. It's somewhat unfortunate to immediately cast to vlen > though, since people usually have fixed-width datasets to start with > for efficiency reasons... > > > what about reading from fixed-width UTF-8 to 'U' -- that seems like the > > natural way to go for unicode. Tough a bit hard to know how long U needs > to > > be -- but <= the length of the utf-8 array (in characters). > > Space concerns ("U" has a 4x space penalty for ASCII-ish data). Plus, > for similar reasons to this discussion, creating "U" datasets is > unsupported at the moment. > > > note that I'm also proposing a "bytes" dtype, which might make sense for > > grabbing utf-8 data from HDF-5. Then either h5py or the user could > decode to > > a unicode type. > > Sound quite like the existing 'S' type. > > >> In any case, I can say that the lack of an text 'S' type in NumPy has > >> been a significant pain point for h5py users on Python 3 over the > >> years. > > > > isn't the current 'S' a pretty good map to hdf ascii? > > Yes; in fact, right now all fixed-width strings in h5py (ASCII and > UTF-8) are read/written as 'S'. The problem is that on Py3, 'S' is > treated as bytes, not text, so you can't freely mix it with str. > > I am about to leave for the weekend... thanks for a great discussion! > To conclude, it strikes me that in choosing an encoding we get to pick > at most two of the following: > > 1. Support for all Unicode characters > 2. Fixed number of characters > 3. 
Fixed number of storage bytes > > At this point, I would vote for UTF-8 in a fixed width buffer (1/3), > but I imagine as this progresses towards a NEP others will weigh in. > At some point I'm pretty sure we will want to support utf-8 as it looks well on its way to a universal standard. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Jul 18 18:44:47 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 Jul 2014 01:44:47 +0300 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: <53C98946.3020405@googlemail.com> References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> Message-ID: 18.07.2014 23:53, Julian Taylor kirjoitti: > On 18.07.2014 19:47, Pauli Virtanen wrote: [clip] > > The other well-known alternative to bugfixes is to first commit it in > > the earliest maintenance branch where you want to have it, and then > > merge that branch forward to the newer maintenance branches, and > > finally into master. > > wouldn't that still require basing bugfixes onto the point before the > master and maintenance branch diverged? > otherwise a merge from maintenance to master would include the commits > that are only part of the maintenance branch (release commits, > regression fixes etc.) If I understand correctly, the idea is to manually revert the changes that don't belong in, which needs to be only done once for each, as the merge logic should deal with it in all subsequent merges. I think there are in practice not so many commits that you want to have only in the release branch. Version number bumping is one (and easily addressed by a follow-up commit in master that bumps it again) --- what else? The bugfix-in-release-and-forward-port-to-master seems to be the recommended practice for Mercurial: http://mercurial.selenic.com/wiki/StandardBranching https://docs.python.org/devguide/committing.html I think there are also git guides that recommend using it. The option of basing commits on last merge base is probably not really feasible with Mercurial (I haven't seen git guides that propose it either). > basing bugfixes on maintenance does allow cherry picking into master as > you don't care too much about backward mergeability here, but you still > lose a good git log and git branch --contains to check which bugfix is > in which branch. I don't disagree with this. Cherry picking is OK, but only as long as the number of commits is not too large and you use a tool (e.g. my git-cherry-tree) that tries to check which patches are in and which not. Pauli From njs at pobox.com Fri Jul 18 18:49:08 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Jul 2014 23:49:08 +0100 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> Message-ID: On Fri, Jul 18, 2014 at 11:44 PM, Pauli Virtanen wrote: > 18.07.2014 23:53, Julian Taylor kirjoitti: >> On 18.07.2014 19:47, Pauli Virtanen wrote: > [clip] >> > The other well-known alternative to bugfixes is to first commit it in >> > the earliest maintenance branch where you want to have it, and then >> > merge that branch forward to the newer maintenance branches, and >> > finally into master. >> >> wouldn't that still require basing bugfixes onto the point before the >> master and maintenance branch diverged? 
>> otherwise a merge from maintenance to master would include the commits >> that are only part of the maintenance branch (release commits, >> regression fixes etc.) > > If I understand correctly, the idea is to manually revert the changes > that don't belong in, which needs to be only done once for each, as the > merge logic should deal with it in all subsequent merges. > > I think there are in practice not so many commits that you want to have > only in the release branch. Version number bumping is one (and easily > addressed by a follow-up commit in master that bumps it again) --- what > else? Presumably all the commits that we miss on the first pass and end up backporting the hard way later :-) -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From pav at iki.fi Fri Jul 18 19:10:09 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 Jul 2014 02:10:09 +0300 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> Message-ID: 19.07.2014 01:49, Nathaniel Smith kirjoitti: > On Fri, Jul 18, 2014 at 11:44 PM, Pauli Virtanen wrote: >> 18.07.2014 23:53, Julian Taylor kirjoitti: >>> On 18.07.2014 19:47, Pauli Virtanen wrote: >> [clip] >>>> The other well-known alternative to bugfixes is to first commit it in >>>> the earliest maintenance branch where you want to have it, and then >>>> merge that branch forward to the newer maintenance branches, and >>>> finally into master. >>> >>> wouldn't that still require basing bugfixes onto the point before the >>> master and maintenance branch diverged? >>> otherwise a merge from maintenance to master would include the commits >>> that are only part of the maintenance branch (release commits, >>> regression fixes etc.) >> >> If I understand correctly, the idea is to manually revert the changes >> that don't belong in, which needs to be only done once for each, as the >> merge logic should deal with it in all subsequent merges. >> >> I think there are in practice not so many commits that you want to have >> only in the release branch. Version number bumping is one (and easily >> addressed by a follow-up commit in master that bumps it again) --- what >> else? > > Presumably all the commits that we miss on the first pass and end up > backporting the hard way later :-) If those are just cherry-picked, they will generate merge conflicts the next time things are merged back (or, the merge will be smart enough to note the patch was already applied some time ago). This is then probably not really a big problem. Pauli From pav at iki.fi Fri Jul 18 19:13:34 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 Jul 2014 02:13:34 +0300 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> Message-ID: 19.07.2014 02:10, Pauli Virtanen kirjoitti: > 19.07.2014 01:49, Nathaniel Smith kirjoitti: >> On Fri, Jul 18, 2014 at 11:44 PM, Pauli Virtanen wrote: [clip] >> Presumably all the commits that we miss on the first pass and end up >> backporting the hard way later :-) > > If those are just cherry-picked, they will generate merge conflicts the > next time things are merged back (or, the merge will be smart enough to > note the patch was already applied some time ago). This is then probably > not really a big problem. NB. 
this is a bit playing devil's advocate --- I'm not advocating porting bugfixes from merge branches, as using the merge base should also work fine. From bramwillemsen at gmail.com Fri Jul 18 20:03:11 2014 From: bramwillemsen at gmail.com (Bram Willemsen) Date: Fri, 18 Jul 2014 19:03:11 -0500 Subject: [Numpy-discussion] BLAS / LAPACK / MKL cannot be found? Message-ID: Hi everyone, I am trying to install a package called PySparse. I have modified the paths in the template site.cfg file. It appears that during the built process it compiles numpy as well, because the error messages I get are all over the numpy mailing list (I did not find one addressing my problem exactly). My MKL libs are installed at "/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64" . There are BLAS / LAPACK libraries here (I provided the library mkl_rt.so to SuiteSparse for instance, and it resolved all LAPACK and BLAS dependencies that way). But for some reason "/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64" does not work in the built process below, in what I think is a numpy installer. The output shows that the directory is searched, but that the result is "NOT AVAILABLE". Could someone give me a pointer for why my the MKL/BLAS/LAPACK dependencies cannot be resolved? It would be very appreciated sincerely, Bram --------------------------------------------------------------------------- PART OF THE OUTPUT OF PySparse SETUP.PY --------------------------------------------------------------------------- blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in ['/usr/local/lib', '/wgdisk/hy3300/re15/lwillemsen/local_install/lib', '/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64'] NOT AVAILABLE openblas_info: libraries not found in ['/usr/local/lib', '/wgdisk/hy3300/re15/lwillemsen/local_install/lib', '/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64'] NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in ['/usr/local/lib', '/wgdisk/hy3300/re15/lwillemsen/local_install/lib', '/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64'] NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in ['/usr/local/lib', '/wgdisk/hy3300/re15/lwillemsen/local_install/lib', '/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64'] NOT AVAILABLE blas_info: libraries blas not found in ['/usr/local/lib', '/wgdisk/hy3300/re15/lwillemsen/local_install/lib', '/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64'] NOT AVAILABLE blas_src_info: NOT AVAILABLE NOT AVAILABLE No blas info found Sparse:: Using BLAS info: {} Using dflt_lib_dirs = /usr/local/lib:/wgdisk/hy3300/re15/lwillemsen/local_install/lib:/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64 Using dflt_libs = [] No blas info found Eigen:: Using BLAS info: {} lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in ['/usr/local/lib', '/wgdisk/hy3300/re15/lwillemsen/local_install/lib', '/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64'] NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /wgdisk/hy3300/re15/lwillemsen/local_install/lib libraries lapack_atlas not found in /wgdisk/hy3300/re15/lwillemsen/local_install/lib 
libraries ptf77blas,ptcblas,atlas not found in /wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64 libraries lapack_atlas not found in /wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64 numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /wgdisk/hy3300/re15/lwillemsen/local_install/lib libraries lapack_atlas not found in /wgdisk/hy3300/re15/lwillemsen/local_install/lib libraries f77blas,cblas,atlas not found in /wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64 libraries lapack_atlas not found in /wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64 numpy.distutils.system_info.atlas_info NOT AVAILABLE lapack_info: libraries lapack not found in ['/usr/local/lib', '/wgdisk/hy3300/re15/lwillemsen/local_install/lib', '/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64'] NOT AVAILABLE lapack_src_info: NOT AVAILABLE NOT AVAILABLE No lapack info found Eigen:: Using LAPACK info: {} non-existing path in 'pysparse/eigen': '/usr/local/lib:/wgdisk/hy3300/re15/lwillemsen/local_install/lib:/wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64' No blas info found Direct:: Using BLAS info: {} -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 18 21:47:25 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2014 19:47:25 -0600 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` In-Reply-To: References: <53C96BB8.4060104@iki.fi> Message-ID: On Fri, Jul 18, 2014 at 2:32 PM, Chris Barker wrote: > On Fri, Jul 18, 2014 at 12:43 PM, Pauli Virtanen wrote: > >> 18.07.2014 22:13, Chris Barker kirjoitti: >> [clip] >> > but an appropriate rtol would work there too. If only zero testing is >> > needed, then atol=0 makes sense as a default. (or maybe atol=eps) >> >> There's plenty of room below eps, but finfo(float).tiny ~ 3e-308 (or >> some big multiple) is also reasonable in the scale-freeness sense. > > > right! brain blip -- eps is the difference between 1 and then next larger > representable number, yes? So a long way away from smallest representable > number. So yes, zero or [something]e-308 -- making zero seem like a good > idea again.... > > is it totally ridiculous to have the default be dependent on dtype? > float32 vs float64? > > Whatever the final decision is, if the defaults change we should start with a FutureWarning. How we can make that work is uncertain, because I don't know of any reliable way to detect if we are using the default value or if a value was passed in. Maybe just warn if `atol == 0` ? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From argriffi at ncsu.edu Fri Jul 18 21:56:41 2014 From: argriffi at ncsu.edu (alex) Date: Fri, 18 Jul 2014 21:56:41 -0400 Subject: [Numpy-discussion] `allclose` vs `assert_allclose` Message-ID: On Fri, Jul 18, 2014 at 9:47 PM, Charles R Harris wrote: > > > > On Fri, Jul 18, 2014 at 2:32 PM, Chris Barker wrote: >> >> On Fri, Jul 18, 2014 at 12:43 PM, Pauli Virtanen wrote: >>> >>> 18.07.2014 22:13, Chris Barker kirjoitti: >>> [clip] >>> > but an appropriate rtol would work there too. If only zero testing is >>> > needed, then atol=0 makes sense as a default. 
(or maybe atol=eps) >>> >>> There's plenty of room below eps, but finfo(float).tiny ~ 3e-308 (or >>> some big multiple) is also reasonable in the scale-freeness sense. >> >> >> right! brain blip -- eps is the difference between 1 and then next larger >> representable number, yes? So a long way away from smallest representable >> number. So yes, zero or [something]e-308 -- making zero seem like a good >> idea again.... >> >> is it totally ridiculous to have the default be dependent on dtype? >> float32 vs float64? >> > > Whatever the final decision is, if the defaults change we should start with > a FutureWarning. How we can make that work is uncertain, because I don't > know of any reliable way to detect if we are using the default value or if a > value was passed in. There are tricks like http://stackoverflow.com/questions/12265695, not that I'm suggesting to do that. From ralf.gommers at gmail.com Sat Jul 19 04:04:10 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 19 Jul 2014 10:04:10 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> Message-ID: On Sat, Jul 19, 2014 at 12:44 AM, Pauli Virtanen wrote: > 18.07.2014 23:53, Julian Taylor kirjoitti: > > On 18.07.2014 19:47, Pauli Virtanen wrote: > [clip] > > > The other well-known alternative to bugfixes is to first commit it in > > > the earliest maintenance branch where you want to have it, and then > > > merge that branch forward to the newer maintenance branches, and > > > finally into master. > > > > wouldn't that still require basing bugfixes onto the point before the > > master and maintenance branch diverged? > > otherwise a merge from maintenance to master would include the commits > > that are only part of the maintenance branch (release commits, > > regression fixes etc.) > > If I understand correctly, the idea is to manually revert the changes > that don't belong in, which needs to be only done once for each, as the > merge logic should deal with it in all subsequent merges. > > I think there are in practice not so many commits that you want to have > only in the release branch. Version number bumping is one (and easily > addressed by a follow-up commit in master that bumps it again) --- what > else? > > The bugfix-in-release-and-forward-port-to-master seems to be the > recommended practice for Mercurial: > > http://mercurial.selenic.com/wiki/StandardBranching > > https://docs.python.org/devguide/committing.html > > I think there are also git guides that recommend using it. > > The option of basing commits on last merge base is probably not really > feasible with Mercurial (I haven't seen git guides that propose it either). > > > basing bugfixes on maintenance does allow cherry picking into master as > > you don't care too much about backward mergeability here, but you still > > lose a good git log and git branch --contains to check which bugfix is > > in which branch. > > I don't disagree with this. Cherry picking is OK, but only as long as > the number of commits is not too large This should be the case most of the time I think. It looks like we've started backporting more and more though, even things like minor doc fixes. The maintenance overhead would be much lower if we would stick to only backporting important bug fixes. Any strategy chosen is fine with me, but I would like to see considered how this affects the number of PRs and the complexity for occasional contributors. 
Those contributors can't really judge what's backportable and don't want to deal with rebasing. So the new strategy would be something like: 1. bugfix PR sent to master by contributor 2. maintainer decides it's backportable, so after review he doesn't merge PR but rebases it and sends a second PR. First one, with review content, is closed not merged. 3. merge PR into maintenance branch. 4. send third PR to merge back or forward port the fix to master, and merge that. (or some variation with merge bases which is even more involved) Compare to what we did a while ago for numpy and still do for scipy: 1. all PRs are sent to master 2. hit green button after review 3. bugfix is cherry-picked and pushed directly to the maintenance branch The downside of the second strategy is indeed the occasional extra merge conflict, but having 3x less PRs, 2x less merge commits and a less confusing process for occasional contributors could well be worth dealing with that merge conflict. Cheers, Ralf and you use a tool (e.g. my > git-cherry-tree) that tries to check which patches are in and which not. > > Pauli > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Jul 19 06:29:14 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 Jul 2014 13:29:14 +0300 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> Message-ID: 19.07.2014 11:04, Ralf Gommers kirjoitti: [clip] > 1. bugfix PR sent to master by contributor > 2. maintainer decides it's backportable, so after review he doesn't merge > PR but rebases it and sends a second PR. First one, with review content, is > closed not merged. > 3. merge PR into maintenance branch. > 4. send third PR to merge back or forward port the fix to master, and > merge that. > (or some variation with merge bases which is even more involved) The maintainer can just rebase on merge base, and then merge and push it via git as usual, without having to deal with Github. If the pull request happens to be already based on an OK merge base, it can be merged via Github directly to master. The only thing maintainer gains from sending additional pull request via Github is that the code gets run by Travis-CI. However, the tests will also run automatically after pushing the merge commits, so test failures can be caught (although after the fact). This is also the case for directly pushed cherry-picked commits. -- Pauli Virtanen From ralf.gommers at gmail.com Sat Jul 19 07:04:17 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 19 Jul 2014 13:04:17 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> Message-ID: On Sat, Jul 19, 2014 at 12:29 PM, Pauli Virtanen wrote: > 19.07.2014 11:04, Ralf Gommers kirjoitti: > [clip] > > 1. bugfix PR sent to master by contributor > > 2. maintainer decides it's backportable, so after review he doesn't > merge > > PR but rebases it and sends a second PR. First one, with review content, > is > > closed not merged. > > 3. merge PR into maintenance branch. > > 4. send third PR to merge back or forward port the fix to master, and > > merge that. 
> > (or some variation with merge bases which is even more involved) > > The maintainer can just rebase on merge base, and then merge and push it > via git as usual, without having to deal with Github. I agree, but note that that's not what's happening in the numpy repo at the moment and that Julian (and maybe Chuck as well?) is explicitly against any direct pushes. So the 3x more PRs between what the process used to be and what Julian proposes is not unrealistic. Maybe still worth it, but it's a trade-off (example: I used to use "gitk --all", but it's a spaghetti now). Ralf > If the pull > request happens to be already based on an OK merge base, it can be > merged via Github directly to master. > > The only thing maintainer gains from sending additional pull request via > Github is that the code gets run by Travis-CI. However, the tests will > also run automatically after pushing the merge commits, so test failures > can be caught (although after the fact). This is also the case for > directly pushed cherry-picked commits. > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Sat Jul 19 07:26:10 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 19 Jul 2014 13:26:10 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> Message-ID: <53CA55D2.5020605@googlemail.com> On 19.07.2014 13:04, Ralf Gommers wrote: > > > > On Sat, Jul 19, 2014 at 12:29 PM, Pauli Virtanen > wrote: > > 19.07.2014 11:04, Ralf Gommers kirjoitti: > [clip] > > 1. bugfix PR sent to master by contributor > > 2. maintainer decides it's backportable, so after review he > doesn't merge > > PR but rebases it and sends a second PR. First one, with review > content, is > > closed not merged. > > 3. merge PR into maintenance branch. > > 4. send third PR to merge back or forward port the fix to > master, and > > merge that. > > (or some variation with merge bases which is even more involved) > > The maintainer can just rebase on merge base, and then merge and push it > via git as usual, without having to deal with Github. > > > I agree, but note that that's not what's happening in the numpy repo at > the moment and that Julian (and maybe Chuck as well?) is explicitly > against any direct pushes. So the 3x more PRs between what the process > used to be and what Julian proposes is not unrealistic. > It is what is happening at the numpy repo. We are never directly pushing unreviewed changes, we always have at least one PR. We only directly push changes that have been approved to be applied two more than one branch. With the method I propose there are not any more PRs. You have the main PR targeted to master and the bugfix PR targeted to the maintenance branch, it was the same before except the bugfix PR was a cherry pick instead of a merge. When directly pushing the second merge we even cut one PR from the process. E.g. I pushed Pauls PR #4882 directly to 1.9 without asking him to create a new PR but as far as git is concerned there is no difference, it as still two merges. We could always ask for a new PR for the branch merge to see travis results before the merge. E.g. #4877 and #4891 same branch two PRs two merges. 
I don't think that should be currently required as master and 1.9 are almost identical and there is little value in seeing travis results for the second merge before doing the merge. But when the branches diverge more the two PRs should probably be preferred to avoid having broken commits on the branches that make bisecting harder. From ralf.gommers at gmail.com Sat Jul 19 08:09:26 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 19 Jul 2014 14:09:26 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: <53CA55D2.5020605@googlemail.com> References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> <53CA55D2.5020605@googlemail.com> Message-ID: On Sat, Jul 19, 2014 at 1:26 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 19.07.2014 13:04, Ralf Gommers wrote: > > > > > > > > On Sat, Jul 19, 2014 at 12:29 PM, Pauli Virtanen > > wrote: > > > > 19.07.2014 11:04, Ralf Gommers kirjoitti: > > [clip] > > > 1. bugfix PR sent to master by contributor > > > 2. maintainer decides it's backportable, so after review he > > doesn't merge > > > PR but rebases it and sends a second PR. First one, with review > > content, is > > > closed not merged. > > > 3. merge PR into maintenance branch. > > > 4. send third PR to merge back or forward port the fix to > > master, and > > > merge that. > > > (or some variation with merge bases which is even more involved) > > > > The maintainer can just rebase on merge base, and then merge and > push it > > via git as usual, without having to deal with Github. > > > > > > I agree, but note that that's not what's happening in the numpy repo at > > the moment and that Julian (and maybe Chuck as well?) is explicitly > > against any direct pushes. So the 3x more PRs between what the process > > used to be and what Julian proposes is not unrealistic. > > > > It is what is happening at the numpy repo. > We are never directly pushing unreviewed changes, we always have at > least one PR. We only directly push changes that have been approved to > be applied two more than one branch. > OK never mind then. I was pretty sure you said you were against this, and I see a lot of PRs for simple backports in 1.8.x and 1.9.x. If you now say it's fine (or even preferred) to push directly, my worry about multiple PRs isn't relevant anymore. Ralf > With the method I propose there are not any more PRs. You have the main > PR targeted to master > and the bugfix PR targeted to the maintenance > branch, it was the same before except the bugfix PR was a cherry pick > instead of a merge. > When directly pushing the second merge we even cut one PR from the process. > E.g. I pushed Pauls PR #4882 directly to 1.9 without asking him to > create a new PR but as far as git is concerned there is no difference, > it as still two merges. > > We could always ask for a new PR for the branch merge to see travis > results before the merge. E.g. #4877 and #4891 same branch two PRs two > merges. > I don't think that should be currently required as master and 1.9 are > almost identical and there is little value in seeing travis results for > the second merge before doing the merge. > But when the branches diverge more the two PRs should probably be > preferred to avoid having broken commits on the branches that make > bisecting harder. 
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Sat Jul 19 08:12:57 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 19 Jul 2014 14:12:57 +0200 Subject: [Numpy-discussion] proposal: new commit guidelines for backportable bugfixes In-Reply-To: References: <53C95DA3.7010901@iki.fi> <53C98946.3020405@googlemail.com> <53CA55D2.5020605@googlemail.com> Message-ID: <53CA60C9.9090902@googlemail.com> On 19.07.2014 14:09, Ralf Gommers wrote: > > > > On Sat, Jul 19, 2014 at 1:26 PM, Julian Taylor > > > wrote: > > On 19.07.2014 13:04, Ralf Gommers wrote: > > > > > > > > On Sat, Jul 19, 2014 at 12:29 PM, Pauli Virtanen > > >> wrote: > > > > 19.07.2014 11:04, Ralf Gommers kirjoitti: > > [clip] > > > 1. bugfix PR sent to master by contributor > > > 2. maintainer decides it's backportable, so after review he > > doesn't merge > > > PR but rebases it and sends a second PR. First one, with review > > content, is > > > closed not merged. > > > 3. merge PR into maintenance branch. > > > 4. send third PR to merge back or forward port the fix to > > master, and > > > merge that. > > > (or some variation with merge bases which is even more involved) > > > > The maintainer can just rebase on merge base, and then merge > and push it > > via git as usual, without having to deal with Github. > > > > > > I agree, but note that that's not what's happening in the numpy > repo at > > the moment and that Julian (and maybe Chuck as well?) is explicitly > > against any direct pushes. So the 3x more PRs between what the process > > used to be and what Julian proposes is not unrealistic. > > > > It is what is happening at the numpy repo. > We are never directly pushing unreviewed changes, we always have at > least one PR. We only directly push changes that have been approved to > be applied two more than one branch. > > > OK never mind then. I was pretty sure you said you were against this, > and I see a lot of PRs for simple backports in 1.8.x and 1.9.x. If you > now say it's fine (or even preferred) to push directly, my worry about > multiple PRs isn't relevant anymore. > thats not what I'm saying. I'm strongly against pushing unreviewed changes. There must *always* be at least one PR. Pushing this PR to multiple branches without another PR is fine with me if it makes sense in the situation (== the merge is trivial enough to not need *another* review) From bramwillemsen at gmail.com Sat Jul 19 13:31:53 2014 From: bramwillemsen at gmail.com (Bram Willemsen) Date: Sat, 19 Jul 2014 17:31:53 +0000 (UTC) Subject: [Numpy-discussion] BLAS / LAPACK / MKL cannot be found? References: Message-ID: Okay I figured out how to do it, in case someone finds this message later. You need to enter this specific section for the MKL implementation of BLAS and LAPACK, and it will find it! #https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl [mkl] library_dirs = /wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/lib/intel64 include_dirs = /wgdisk/omega2dev2/env/EL5/intel/composer_xe_2013.0.079/mkl/include mkl_libs = mkl_rt lapack_libs = Note that no libs are given for lapack_libs. This is not a failed copy-paste :) Hopefully this will help someone! 
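For anyone following the same route, a quick sanity check (not part of Bram's message, but plain numpy.distutils API) is to ask system_info directly whether the [mkl] section is being picked up; run it from the directory containing that site.cfg:

from numpy.distutils import system_info

# Prints a dict with 'libraries', 'library_dirs', 'include_dirs', ... when the
# [mkl] section is resolved, and an empty dict when it is not.
print(system_info.get_info('mkl'))

An empty dict corresponds to the NOT AVAILABLE lines in the build output quoted earlier in the thread.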
From joseluismietta at yahoo.com.ar Tue Jul 22 07:19:09 2014 From: joseluismietta at yahoo.com.ar (=?iso-8859-1?Q?Jos=E8_Luis_Mietta?=) Date: Tue, 22 Jul 2014 04:19:09 -0700 Subject: [Numpy-discussion] length - sticks algorithm Message-ID: <1406027949.48361.YahooMailNeo@web142302.mail.bf1.yahoo.com> Hi experts! I'm working with the conductivity of stick film systems. In my algorithm (N sticks) I have the intersection graph matrix M (M is an NxN matrix, M_ij=1 if sticks 'i' and 'j' intersect, and M_ij=0 if they do not). I also have 2 lists with the end-points of each stick. In addition, I can calculate the intersection point (if it exists) between sticks. I want to calculate all the distances between the points of intersection (1,2,3,...N) in the next figure, without losing the connectivity information (which intersection is connected to which). In the figure, (A) is the system with sticks. I don't know how to do this. I'm a python + numpy user. Waiting for your answers! Thanks a lot -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jul 22 08:02:03 2014 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 22 Jul 2014 13:02:03 +0100 Subject: [Numpy-discussion] length - sticks algorithm In-Reply-To: <1406027949.48361.YahooMailNeo@web142302.mail.bf1.yahoo.com> References: <1406027949.48361.YahooMailNeo@web142302.mail.bf1.yahoo.com> Message-ID: What have you tried? What exactly are you having problems with? Loosely, I would suggest the following approach: For each stick, iterate over each stick that intersects with it (as recorded in M). Find the coordinates of all of the intersection points. Label the intersection points by the IDs of the two sticks that form the intersection (normalize these IDs by keeping them in order so you don't duplicate intersections already found; e.g. (2, 5), not (5, 2)). Arbitrarily, but consistently, pick one end of the stick and find the distances from that end to each of the intersection points. This induces an order on the intersections with that stick by sorting the intersections by their distance from the arbitrary end of the stick. You will need this to determine which intersections on the same stick are neighbors and which aren't. I.e., if you have 3 intersections with a given stick, (i,j), (i,k), and (i,l), you want (i,j)-(i,k), and (i,k)-(i,l), but not (i,j)-(i,l). You can find the distances between each of the intersections easily from that. Use a networkx Graph to record the distances (you are making a so-called "weighted graph"). On Tue, Jul 22, 2014 at 12:19 PM, Josè Luis Mietta < joseluismietta at yahoo.com.ar> wrote: > > Hi experts! > > I'm working with the conductivity of stick film systems. > > In my algorithm (N sticks) I have the intersection graph matrix M (M is an > NxN matrix, M_ij=1 if sticks 'i' and 'j' intersect, and M_ij=0 if they do not). > I also have 2 lists with the end-points of each stick. In addition, I can > calculate the intersection point (if it exists) between sticks. > > I want to calculate all the distances between the points of intersection > (1,2,3,...N) in the next figure: > [image: figure (A), the stick system] > without losing the connectivity information (which intersection is connected > to which). In the figure, (A) is the system with sticks. > > I don't know how to do this. I'm a python + numpy user. > > Waiting for your answers!
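A minimal sketch of the bookkeeping Robert describes, assuming a hypothetical `segments` list holding the two end-points of each stick and a hypothetical `intersection(s, t)` helper that returns the crossing point of two sticks or None; both stand in for data the original poster says he already has, and the code is an illustration of the approach rather than a tested solution:

import numpy as np
import networkx as nx

def build_intersection_graph(segments, intersection):
    # segments[i] = ((x1, y1), (x2, y2)); intersection(s, t) -> (x, y) or None
    G = nx.Graph()
    pos = {}                                   # (i, j) -> coordinates of that intersection
    on_stick = {i: [] for i in range(len(segments))}
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            pt = intersection(segments[i], segments[j])
            if pt is None:
                continue
            node = (i, j)                      # normalized label: smaller stick id first
            pos[node] = np.asarray(pt, dtype=float)
            G.add_node(node)
            on_stick[i].append(node)
            on_stick[j].append(node)
    for i, nodes in on_stick.items():
        # order the intersections along stick i by distance from one fixed, arbitrary end
        anchor = np.asarray(segments[i][0], dtype=float)
        nodes.sort(key=lambda nd: np.linalg.norm(pos[nd] - anchor))
        # connect only neighbouring intersections, weighted by their separation
        for a, b in zip(nodes, nodes[1:]):
            G.add_edge(a, b, weight=np.linalg.norm(pos[a] - pos[b]))
    return G, pos

Connectivity and path-length queries then reduce to standard graph calls, e.g. nx.shortest_path_length(G, a, b, weight='weight') for the distance along the stick network between two intersections.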
> > Thans a lot > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Tue Jul 22 10:53:54 2014 From: faltet at gmail.com (Francesc Alted) Date: Tue, 22 Jul 2014 16:53:54 +0200 Subject: [Numpy-discussion] ANN: bcolz 0.7.0 released Message-ID: <53CE7B02.5070000@gmail.com> ====================== Announcing bcolz 0.7.0 ====================== What's new ========== In this release, support for Python 3 has been added, Pandas and HDF5/PyTables conversion, support for different compressors via latest release of Blosc, and a new `iterblocks()` iterator. Also, intensive benchmarking has lead to an important tuning of buffer sizes parameters so that compression and evaluation goes faster than ever. Together, bcolz and the Blosc compressor, are finally fullfilling the promise of accelerating memory I/O, at least for some real scenarios: http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots ``bcolz`` is a renaming of the ``carray`` project. The new goals for the project are to create simple, yet flexible compressed containers, that can live either on-disk or in-memory, and with some high-performance iterators (like `iter()`, `where()`) for querying them. For more detailed info, see the release notes in: https://github.com/Blosc/bcolz/wiki/Release-Notes What it is ========== bcolz provides columnar and compressed data containers. Column storage allows for efficiently querying tables with a large number of columns. It also allows for cheap addition and removal of column. In addition, bcolz objects are compressed by default for reducing memory/disk I/O needs. The compression process is carried out internally by Blosc, a high-performance compressor that is optimized for binary data. bcolz can use numexpr internally so as to accelerate many vector and query operations (although it can use pure NumPy for doing so too). numexpr optimizes the memory usage and use several cores for doing the computations, so it is blazing fast. Moreover, the carray/ctable containers can be disk-based, and it is possible to use them for seamlessly performing out-of-memory computations. bcolz has minimal dependencies (NumPy), comes with an exhaustive test suite and fully supports both 32-bit and 64-bit platforms. Also, it is typically tested on both UNIX and Windows operating systems. Installing ========== bcolz is in the PyPI repository, so installing it is easy: $ pip install -U bcolz Resources ========= Visit the main bcolz site repository at: http://github.com/Blosc/bcolz Manual: http://bcolz.blosc.org Home of Blosc compressor: http://blosc.org User's mail list: bcolz at googlegroups.com http://groups.google.com/group/bcolz License is the new BSD: https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt ---- **Enjoy data!** -- Francesc Alted From totonixsame at gmail.com Tue Jul 22 14:34:04 2014 From: totonixsame at gmail.com (Thiago Franco Moraes) Date: Tue, 22 Jul 2014 15:34:04 -0300 Subject: [Numpy-discussion] =?utf-8?q?Research_position_in_the_Brazilian_R?= =?utf-8?q?esearch_Institute_for_Science_and_Neurotechnology_?= =?utf-8?q?=E2=80=93_BRAINN?= Message-ID: *Research position in the Brazilian Research Institute for Science and Neurotechnology ? 
BRAINN Postdoc researcher to work with software development for medical imaging* The Brazilian Research Institute for Neuroscience and Neurotechnology (BRAINN) (www.brainn.org.br) focuses on the investigation of basic mechanisms leading to epilepsy and stroke, and the injury mechanisms that follow disease onset and progression. This research has important applications related to prevention, diagnosis, treatment and rehabilitation and will serve as a model for better understanding normal and abnormal brain function. The BRAINN Institute is composed of 10 institutions from Brazil and abroad and hosted by State University of Campinas (UNICAMP). Among the associated institutions is Renato Archer Information Technology Center (CTI) that has a specialized team in open-source software development for medical imaging (www.cti.gov.br/invesalius) and 3D printing applications for healthcare. CTI is located close the UNICAMP in the city of Campinas, State of S?o Paulo in a very technological region of Brazil and is looking for a postdoc researcher to work with software development for medical imaging related to the imaging analysis, diagnosis and treatment of brain diseases. The postdoc position is for two years with the possibility of being renovated for more two years. *Education* - PhD in computer science, computer engineering, mathematics, physics or related. *Requirements* - Digital image processing (Medical imaging) - Computer graphics (basic) * Benefits* 6.143,40 Reais per month free of taxes (about US$ 2.800,00); 15% technical reserve for conferences participation and specific materials acquisition; *Interested* Send curriculum to: jorge.silva at cti.gov.br with subject ?Postdoc position? Applications reviews will begin August 1, 2014 and continue until the position is filled. -------------- next part -------------- An HTML attachment was scrubbed... URL: From scopatz at gmail.com Tue Jul 22 19:04:00 2014 From: scopatz at gmail.com (Anthony Scopatz) Date: Tue, 22 Jul 2014 18:04:00 -0500 Subject: [Numpy-discussion] ANN: bcolz 0.7.0 released In-Reply-To: <53CE7B02.5070000@gmail.com> References: <53CE7B02.5070000@gmail.com> Message-ID: Congrats Francesc! On Tue, Jul 22, 2014 at 9:53 AM, Francesc Alted wrote: > ====================== > Announcing bcolz 0.7.0 > ====================== > > What's new > ========== > > In this release, support for Python 3 has been added, Pandas and > HDF5/PyTables conversion, support for different compressors via latest > release of Blosc, and a new `iterblocks()` iterator. > > Also, intensive benchmarking has lead to an important tuning of buffer > sizes parameters so that compression and evaluation goes faster than > ever. Together, bcolz and the Blosc compressor, are finally fullfilling > the promise of accelerating memory I/O, at least for some real > scenarios: > > > http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots > > ``bcolz`` is a renaming of the ``carray`` project. The new goals for > the project are to create simple, yet flexible compressed containers, > that can live either on-disk or in-memory, and with some > high-performance iterators (like `iter()`, `where()`) for querying them. > > For more detailed info, see the release notes in: > https://github.com/Blosc/bcolz/wiki/Release-Notes > > > What it is > ========== > > bcolz provides columnar and compressed data containers. Column storage > allows for efficiently querying tables with a large number of columns. 
> It also allows for cheap addition and removal of column. In addition, > bcolz objects are compressed by default for reducing memory/disk I/O > needs. The compression process is carried out internally by Blosc, a > high-performance compressor that is optimized for binary data. > > bcolz can use numexpr internally so as to accelerate many vector and > query operations (although it can use pure NumPy for doing so too). > numexpr optimizes the memory usage and use several cores for doing the > computations, so it is blazing fast. Moreover, the carray/ctable > containers can be disk-based, and it is possible to use them for > seamlessly performing out-of-memory computations. > > bcolz has minimal dependencies (NumPy), comes with an exhaustive test > suite and fully supports both 32-bit and 64-bit platforms. Also, it is > typically tested on both UNIX and Windows operating systems. > > > Installing > ========== > > bcolz is in the PyPI repository, so installing it is easy: > > $ pip install -U bcolz > > > Resources > ========= > > Visit the main bcolz site repository at: > http://github.com/Blosc/bcolz > > Manual: > http://bcolz.blosc.org > > Home of Blosc compressor: > http://blosc.org > > User's mail list: > bcolz at googlegroups.com > http://groups.google.com/group/bcolz > > License is the new BSD: > https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt > > > ---- > > **Enjoy data!** > > -- Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Jul 23 13:19:58 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 23 Jul 2014 19:19:58 +0200 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? Message-ID: <53CFEEBE.5000207@googlemail.com> hi, it recently came to my attention that the default integer type in numpy on windows 64 bit is a 32 bit integers [0]. This seems like a quite serious problem as it means you can't use any integers created from python integers < 32 bit to index arrays larger than 2GB. For example np.product(array.shape) which will never overflow on linux and mac, can overflow on win64. I think this is a very dangerous platform difference and a quite large inconvenience for win64 users so I think it would be good to fix this. This would be a very large change of API and probably also ABI. But as we also never officially released win64 binaries we could change it for from source compilations and give win64 binary distributors the option to keep the old ABI/API at their discretion. Any thoughts on this from win64 users? Cheers, Julian Taylor [0] https://github.com/astropy/astropy/pull/2697 From jtaylor.debian at googlemail.com Wed Jul 23 13:37:39 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 23 Jul 2014 19:37:39 +0200 Subject: [Numpy-discussion] __numpy_ufunc__ and 1.9 release In-Reply-To: <53C56DA2.40402@googlemail.com> References: <53C56DA2.40402@googlemail.com> Message-ID: <53CFF2E3.1020708@googlemail.com> On 15.07.2014 20:06, Julian Taylor wrote: > hi, > as you may know we want to release numpy 1.9 soon. We should have solved > most indexing regressions the first beta showed. > > The remaining blockers are finishing the new __numpy_ufunc__ feature. 
> This feature should allow for an alternative method of overriding the > behavior of ufuncs from subclasses. > It is described here: > https://github.com/numpy/numpy/blob/master/doc/neps/ufunc-overrides.rst > > The current blocker issues are: > https://github.com/numpy/numpy/issues/4753 > https://github.com/numpy/numpy/pull/4815 > > I'm not too familiar with all the complications of subclassing so I can't > really say how hard this is to solve. > My issue is that there still seems to be debate on how to handle > operator overriding correctly and I am opposed to releasing a numpy with > yet another experimental feature that may or may not be finished > sometime later. Having datetime in an infinite experimental state is bad > enough. > I think nobody is served well if we release 1.9 with the feature > prematurely based on an unrepresentative set of users and then later, > after more users show up, see that we have to change its behavior. > > So I'm wondering if we should delay the introduction of this feature to > 1.10 or is it important enough to wait until there is a consensus on the > remaining issues? > So it's been a week and we got a few answers and new issues. To summarize: - to my knowledge no progress was made on the issues - scipy already has a released version using the current implementation - no very loud objections to delaying the feature to 1.10 - I am still unfamiliar with the intricacies of subclassing, but don't want to release something new which has unsolved issues. That scipy already uses it in a released version (0.14) is very problematic. Can someone maybe give some insight into whether the potential changes to resolve the remaining issues would break scipy? If so, we have the following choices: - declare what we have as final and close the remaining issues as 'won't fix'. Any changes would then have to use a new name, __numpy_ufunc2__, or somehow version the interface - delay the introduction, potentially breaking scipy 0.14 when numpy 1.10 is released. I would like to get the next (and last) numpy 1.9 beta out soon, so I would propose to make a decision by this Saturday, 26.07.2014, however misinformed it may be. Please note that the numpy 1.10 release cycle is likely going to be a very long one as we are currently planning to change a bunch of default behaviours that currently raise deprecation warnings and possibly will try to fix string types, text IO and datetime. Please see the future changes notes in the current 1.9.x release notes. If we delay numpy_ufunc it is not unlikely that it will take a year until we release 1.10. Though we could still put it into an earlier 1.9.1. Cheers, Julian From robert.kern at gmail.com Wed Jul 23 14:54:28 2014 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 23 Jul 2014 19:54:28 +0100 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: <53CFEEBE.5000207@googlemail.com> References: <53CFEEBE.5000207@googlemail.com> Message-ID: On Wed, Jul 23, 2014 at 6:19 PM, Julian Taylor wrote: > hi, > it recently came to my attention that the default integer type in numpy > on windows 64 bit is a 32 bit integers [0]. > This seems like a quite serious problem as it means you can't use any > integers created from python integers < 32 bit to index arrays larger > than 2GB. > For example np.product(array.shape) which will never overflow on linux > and mac, can overflow on win64. Currently, on win64, we use Python long integer objects for `.shape` and related attributes.
I wonder if we could return numpy int64 scalars instead. Then np.product() (or anything else that consumes these via np.asarray()) would infer the correct dtype for the result. > I think this is a very dangerous platform difference and a quite large > inconvenience for win64 users so I think it would be good to fix this. > This would be a very large change of API and probably also ABI. Yes. Not only would it be a very large change from the status quo, I think it introduces *much greater* platform difference than what we have currently. The assumption that the default integer object corresponds to the platform C long, whatever that is, is pretty heavily ingrained. > But as we also never officially released win64 binaries we could change > it for from source compilations and give win64 binary distributors the > option to keep the old ABI/API at their discretion. That option would make the problem worse, not better. -- Robert Kern From jtaylor.debian at googlemail.com Wed Jul 23 15:50:31 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 23 Jul 2014 21:50:31 +0200 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: References: <53CFEEBE.5000207@googlemail.com> Message-ID: <53D01207.2090807@googlemail.com> On 23.07.2014 20:54, Robert Kern wrote: > On Wed, Jul 23, 2014 at 6:19 PM, Julian Taylor > wrote: >> hi, >> it recently came to my attention that the default integer type in numpy >> on windows 64 bit is a 32 bit integers [0]. >> This seems like a quite serious problem as it means you can't use any >> integers created from python integers < 32 bit to index arrays larger >> than 2GB. >> For example np.product(array.shape) which will never overflow on linux >> and mac, can overflow on win64. > > Currently, on win64, we use Python long integer objects for `.shape` > and related attributes. I wonder if we could return numpy int64 > scalars instead. Then np.product() (or anything else that consumes > these via np.asarray()) would infer the correct dtype for the result. this might be a less invasive alternative that might solve a lot of the incompatibilities, but it would probably also change np.arange(5) and similar functions to int64 which might change the dtype of a lot of arrays. The difference to just changing it everywhere might not be so large anymore. > >> I think this is a very dangerous platform difference and a quite large >> inconvenience for win64 users so I think it would be good to fix this. >> This would be a very large change of API and probably also ABI. > > Yes. Not only would it be a very large change from the status quo, I > think it introduces *much greater* platform difference than what we > have currently. The assumption that the default integer object > corresponds to the platform C long, whatever that is, is pretty > heavily ingrained. This should be only a concern for the ABI which can be solved by simply recompiling. In comparison that the API is different on win64 compared to all other platforms is something that needs source level changes. > >> But as we also never officially released win64 binaries we could change >> it for from source compilations and give win64 binary distributors the >> option to keep the old ABI/API at their discretion. > > That option would make the problem worse, not better. > maybe, I'm not familiar with the numpy win64 distribution landscape. Is it not like linux where you have one distributor per workstation setup that can update all its packages to a new ABI on one go? 
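To make the overflow concern concrete, a small illustration (the shape below is hypothetical, not taken from the thread): a plain reduction ends up in the default integer type and may wrap around where that default is 32 bit, while forcing the pointer-sized np.intp gives the exact count on any 64-bit build.

import numpy as np

shape = (100000, 200000, 400000)        # product is 8e15, far beyond 2**31 - 1
print(np.prod(shape))                   # result has the default integer dtype; wraps if that is 32 bit
print(np.prod(shape, dtype=np.intp))    # 8000000000000000 on any 64-bit platform

np.intp is the C intptr_t-sized integer, so it can always index the full address space regardless of what the default integer happens to be.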
From robert.kern at gmail.com Wed Jul 23 16:04:41 2014 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 23 Jul 2014 21:04:41 +0100 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: <53D01207.2090807@googlemail.com> References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> Message-ID: On Wed, Jul 23, 2014 at 8:50 PM, Julian Taylor wrote: > On 23.07.2014 20:54, Robert Kern wrote: >> On Wed, Jul 23, 2014 at 6:19 PM, Julian Taylor >> wrote: >>> hi, >>> it recently came to my attention that the default integer type in numpy >>> on windows 64 bit is a 32 bit integers [0]. >>> This seems like a quite serious problem as it means you can't use any >>> integers created from python integers < 32 bit to index arrays larger >>> than 2GB. >>> For example np.product(array.shape) which will never overflow on linux >>> and mac, can overflow on win64. >> >> Currently, on win64, we use Python long integer objects for `.shape` >> and related attributes. I wonder if we could return numpy int64 >> scalars instead. Then np.product() (or anything else that consumes >> these via np.asarray()) would infer the correct dtype for the result. > > this might be a less invasive alternative that might solve a lot of the > incompatibilities, but it would probably also change np.arange(5) and > similar functions to int64 which might change the dtype of a lot of > arrays. The difference to just changing it everywhere might not be so > large anymore. No, np.arange(5) would not change behavior given my suggestion, only the type of the integer objects in ndarray.shape and related tuples. >>> I think this is a very dangerous platform difference and a quite large >>> inconvenience for win64 users so I think it would be good to fix this. >>> This would be a very large change of API and probably also ABI. >> >> Yes. Not only would it be a very large change from the status quo, I >> think it introduces *much greater* platform difference than what we >> have currently. The assumption that the default integer object >> corresponds to the platform C long, whatever that is, is pretty >> heavily ingrained. > > This should be only a concern for the ABI which can be solved by simply > recompiling. > In comparison that the API is different on win64 compared to all other > platforms is something that needs source level changes. No, the API is no different on win64 than other platforms. Why do you think it is? The win64 platform is a weird platform in this respect, having made a choice that other 64-bit platforms didn't, but numpy's API treats it consistently. When we say that something is a C long, it's a C long on all platforms. >>> But as we also never officially released win64 binaries we could change >>> it for from source compilations and give win64 binary distributors the >>> option to keep the old ABI/API at their discretion. >> >> That option would make the problem worse, not better. > > maybe, I'm not familiar with the numpy win64 distribution landscape. > Is it not like linux where you have one distributor per workstation > setup that can update all its packages to a new ABI on one go? No. There tend to be multiple providers. -- Robert Kern From sebastian at sipsolutions.net Wed Jul 23 16:06:11 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 23 Jul 2014 22:06:11 +0200 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? 
In-Reply-To: <53D01207.2090807@googlemail.com> References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> Message-ID: <1406145971.2895.5.camel@sebastian-laptop> On Wed, 2014-07-23 at 21:50 +0200, Julian Taylor wrote: > On 23.07.2014 20:54, Robert Kern wrote: > > On Wed, Jul 23, 2014 at 6:19 PM, Julian Taylor > > wrote: > >> hi, > >> it recently came to my attention that the default integer type in numpy > >> on windows 64 bit is a 32 bit integers [0]. > >> This seems like a quite serious problem as it means you can't use any > >> integers created from python integers < 32 bit to index arrays larger > >> than 2GB. > >> For example np.product(array.shape) which will never overflow on linux > >> and mac, can overflow on win64. > > > > Currently, on win64, we use Python long integer objects for `.shape` > > and related attributes. I wonder if we could return numpy int64 > > scalars instead. Then np.product() (or anything else that consumes > > these via np.asarray()) would infer the correct dtype for the result. > > this might be a less invasive alternative that might solve a lot of the > incompatibilities, but it would probably also change np.arange(5) and > similar functions to int64 which might change the dtype of a lot of > arrays. The difference to just changing it everywhere might not be so > large anymore. > Aren't most such functions already using intp? Just guessing, but: In [16]: np.arange(30, dtype=np.long).dtype.num Out[16]: 9 In [17]: np.arange(30, dtype=np.intp).dtype.num Out[17]: 7 In [18]: np.arange(30).dtype.num Out[18]: 7 frankly, I am not sure what needs to change at all, except the normal array creation and the sum promotion rule. I am probably naive here, but what is the ABI change that is necessary for that? I guess the problem you see is breaking code doing np.array([1,2,3]) and then assuming in C that it is a long array? - Sebastian > > > >> I think this is a very dangerous platform difference and a quite large > >> inconvenience for win64 users so I think it would be good to fix this. > >> This would be a very large change of API and probably also ABI. > > > > Yes. Not only would it be a very large change from the status quo, I > > think it introduces *much greater* platform difference than what we > > have currently. The assumption that the default integer object > > corresponds to the platform C long, whatever that is, is pretty > > heavily ingrained. > > This should be only a concern for the ABI which can be solved by simply > recompiling. > In comparison that the API is different on win64 compared to all other > platforms is something that needs source level changes. > > > > >> But as we also never officially released win64 binaries we could change > >> it for from source compilations and give win64 binary distributors the > >> option to keep the old ABI/API at their discretion. > > > > That option would make the problem worse, not better. > > > > maybe, I'm not familiar with the numpy win64 distribution landscape. > Is it not like linux where you have one distributor per workstation > setup that can update all its packages to a new ABI on one go? 
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sebastian at sipsolutions.net Wed Jul 23 16:17:01 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 23 Jul 2014 22:17:01 +0200 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: <1406145971.2895.5.camel@sebastian-laptop> References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> <1406145971.2895.5.camel@sebastian-laptop> Message-ID: <1406146621.2895.6.camel@sebastian-laptop> On Wed, 2014-07-23 at 22:06 +0200, Sebastian Berg wrote: > On Wed, 2014-07-23 at 21:50 +0200, Julian Taylor wrote: > > On 23.07.2014 20:54, Robert Kern wrote: > > > On Wed, Jul 23, 2014 at 6:19 PM, Julian Taylor > > > wrote: > > >> hi, > > >> it recently came to my attention that the default integer type in numpy > > >> on windows 64 bit is a 32 bit integers [0]. > > >> This seems like a quite serious problem as it means you can't use any > > >> integers created from python integers < 32 bit to index arrays larger > > >> than 2GB. > > >> For example np.product(array.shape) which will never overflow on linux > > >> and mac, can overflow on win64. > > > > > > Currently, on win64, we use Python long integer objects for `.shape` > > > and related attributes. I wonder if we could return numpy int64 > > > scalars instead. Then np.product() (or anything else that consumes > > > these via np.asarray()) would infer the correct dtype for the result. > > > > this might be a less invasive alternative that might solve a lot of the > > incompatibilities, but it would probably also change np.arange(5) and > > similar functions to int64 which might change the dtype of a lot of > > arrays. The difference to just changing it everywhere might not be so > > large anymore. > > > > Aren't most such functions already using intp? Just guessing, but: > > In [16]: np.arange(30, dtype=np.long).dtype.num > Out[16]: 9 > > In [17]: np.arange(30, dtype=np.intp).dtype.num > Out[17]: 7 > > In [18]: np.arange(30).dtype.num > Out[18]: 7 > Ops, never mind that stuff, probably not... np.int_ is 7 too, this is just the way how intp is chosen. > frankly, I am not sure what needs to change at all, except the normal > array creation and the sum promotion rule. I am probably naive here, but > what is the ABI change that is necessary for that? > > I guess the problem you see is breaking code doing np.array([1,2,3]) and > then assuming in C that it is a long array? > > - Sebastian > > > > > > >> I think this is a very dangerous platform difference and a quite large > > >> inconvenience for win64 users so I think it would be good to fix this. > > >> This would be a very large change of API and probably also ABI. > > > > > > Yes. Not only would it be a very large change from the status quo, I > > > think it introduces *much greater* platform difference than what we > > > have currently. The assumption that the default integer object > > > corresponds to the platform C long, whatever that is, is pretty > > > heavily ingrained. > > > > This should be only a concern for the ABI which can be solved by simply > > recompiling. > > In comparison that the API is different on win64 compared to all other > > platforms is something that needs source level changes. 
> > > > > > > >> But as we also never officially released win64 binaries we could change > > >> it for from source compilations and give win64 binary distributors the > > >> option to keep the old ABI/API at their discretion. > > > > > > That option would make the problem worse, not better. > > > > > > > maybe, I'm not familiar with the numpy win64 distribution landscape. > > Is it not like linux where you have one distributor per workstation > > setup that can update all its packages to a new ABI on one go? > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jtaylor.debian at googlemail.com Wed Jul 23 16:34:40 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 23 Jul 2014 22:34:40 +0200 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> Message-ID: <53D01C60.1090307@googlemail.com> On 23.07.2014 22:04, Robert Kern wrote: > On Wed, Jul 23, 2014 at 8:50 PM, Julian Taylor > wrote: >> On 23.07.2014 20:54, Robert Kern wrote: >>> On Wed, Jul 23, 2014 at 6:19 PM, Julian Taylor >>> wrote: >>>> hi, >>>> it recently came to my attention that the default integer type in numpy >>>> on windows 64 bit is a 32 bit integers [0]. >>>> This seems like a quite serious problem as it means you can't use any >>>> integers created from python integers < 32 bit to index arrays larger >>>> than 2GB. >>>> For example np.product(array.shape) which will never overflow on linux >>>> and mac, can overflow on win64. >>> >>> Currently, on win64, we use Python long integer objects for `.shape` >>> and related attributes. I wonder if we could return numpy int64 >>> scalars instead. Then np.product() (or anything else that consumes >>> these via np.asarray()) would infer the correct dtype for the result. >> >> this might be a less invasive alternative that might solve a lot of the >> incompatibilities, but it would probably also change np.arange(5) and >> similar functions to int64 which might change the dtype of a lot of >> arrays. The difference to just changing it everywhere might not be so >> large anymore. > > No, np.arange(5) would not change behavior given my suggestion, only > the type of the integer objects in ndarray.shape and related tuples. ndarray.shape are not numpy scalars but python objects, so they would always be converted back to 32 bit integers when given back to numpy. > >>>> I think this is a very dangerous platform difference and a quite large >>>> inconvenience for win64 users so I think it would be good to fix this. >>>> This would be a very large change of API and probably also ABI. >>> >>> Yes. Not only would it be a very large change from the status quo, I >>> think it introduces *much greater* platform difference than what we >>> have currently. The assumption that the default integer object >>> corresponds to the platform C long, whatever that is, is pretty >>> heavily ingrained. >> >> This should be only a concern for the ABI which can be solved by simply >> recompiling. >> In comparison that the API is different on win64 compared to all other >> platforms is something that needs source level changes. 
> > No, the API is no different on win64 than other platforms. Why do you > think it is? The win64 platform is a weird platform in this respect, > having made a choice that other 64-bit platforms didn't, but numpy's > API treats it consistently. When we say that something is a C long, > it's a C long on all platforms. The API is different if you consider it from a python perspective. The default integer dtype should be sufficiently large to index into any numpy array, thats what I call an API here. win64 behaves different, you have to explicitly upcast your index to be able to index all memory. But API or ABI is just semantics here, what I actually mean is the difference of source changes vs recompiling to deal with the issue. Of course there might be C code that needs more than recompiling, but it should not be that much, it would have to be already somewhat broken/restrictive code that uses numpy buffers without first checking which type it has. There can also be python code that might need source changes e.g. np.int_ memory mapping a binary from win32 assuming np.int_ is also 32 bit on win64, but this would be broken on linux and mac already now. >>>> But as we also never officially released win64 binaries we could change >>>> it for from source compilations and give win64 binary distributors the >>>> option to keep the old ABI/API at their discretion. >>> >>> That option would make the problem worse, not better. >> >> maybe, I'm not familiar with the numpy win64 distribution landscape. >> Is it not like linux where you have one distributor per workstation >> setup that can update all its packages to a new ABI on one go? > > No. There tend to be multiple providers. > From robert.kern at gmail.com Wed Jul 23 16:57:50 2014 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 23 Jul 2014 21:57:50 +0100 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: <53D01C60.1090307@googlemail.com> References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> <53D01C60.1090307@googlemail.com> Message-ID: On Wed, Jul 23, 2014 at 9:34 PM, Julian Taylor wrote: > On 23.07.2014 22:04, Robert Kern wrote: >> On Wed, Jul 23, 2014 at 8:50 PM, Julian Taylor >> wrote: >>> On 23.07.2014 20:54, Robert Kern wrote: >>>> On Wed, Jul 23, 2014 at 6:19 PM, Julian Taylor >>>> wrote: >>>>> hi, >>>>> it recently came to my attention that the default integer type in numpy >>>>> on windows 64 bit is a 32 bit integers [0]. >>>>> This seems like a quite serious problem as it means you can't use any >>>>> integers created from python integers < 32 bit to index arrays larger >>>>> than 2GB. >>>>> For example np.product(array.shape) which will never overflow on linux >>>>> and mac, can overflow on win64. >>>> >>>> Currently, on win64, we use Python long integer objects for `.shape` >>>> and related attributes. I wonder if we could return numpy int64 >>>> scalars instead. Then np.product() (or anything else that consumes >>>> these via np.asarray()) would infer the correct dtype for the result. >>> >>> this might be a less invasive alternative that might solve a lot of the >>> incompatibilities, but it would probably also change np.arange(5) and >>> similar functions to int64 which might change the dtype of a lot of >>> arrays. The difference to just changing it everywhere might not be so >>> large anymore. >> >> No, np.arange(5) would not change behavior given my suggestion, only >> the type of the integer objects in ndarray.shape and related tuples. 
> > ndarray.shape are not numpy scalars but python objects, so they would > always be converted back to 32 bit integers when given back to numpy. That's what I'm suggesting that we change: make `type(ndarray.shape[i])` be `np.intp` instead of `long`. However, I'm not sure that this is an issue with numpy 1.8.0 at least. I can't reproduce the reported problem on Win64: In [12]: import numpy as np In [13]: from numpy.lib import stride_tricks In [14]: import sys In [15]: b = stride_tricks.as_strided(np.zeros(1), shape=(100000, 200000, 400000), strides=(0, 0, 0)) In [16]: b.shape Out[16]: (100000L, 200000L, 400000L) In [17]: np.product(b.shape) Out[17]: 8000000000000000 In [18]: np.product(b.shape).dtype Out[18]: dtype('int64') In [19]: sys.maxint Out[19]: 2147483647 In [20]: np.__version__ Out[20]: '1.8.0' In [21]: np.array(b.shape) Out[21]: array([100000, 200000, 400000], dtype=int64) This is on Python 2.7, so maybe something got weird in the Python 3 version that Chris Gohlke tested? >>>>> I think this is a very dangerous platform difference and a quite large >>>>> inconvenience for win64 users so I think it would be good to fix this. >>>>> This would be a very large change of API and probably also ABI. >>>> >>>> Yes. Not only would it be a very large change from the status quo, I >>>> think it introduces *much greater* platform difference than what we >>>> have currently. The assumption that the default integer object >>>> corresponds to the platform C long, whatever that is, is pretty >>>> heavily ingrained. >>> >>> This should be only a concern for the ABI which can be solved by simply >>> recompiling. >>> In comparison that the API is different on win64 compared to all other >>> platforms is something that needs source level changes. >> >> No, the API is no different on win64 than other platforms. Why do you >> think it is? The win64 platform is a weird platform in this respect, >> having made a choice that other 64-bit platforms didn't, but numpy's >> API treats it consistently. When we say that something is a C long, >> it's a C long on all platforms. > > The API is different if you consider it from a python perspective. > The default integer dtype should be sufficiently large to index into any > numpy array, thats what I call an API here. That's perhaps what you want, but numpy has never claimed to do this. The numpy project deliberately chose (and is so documented) to make its default integer type a C long, not a C size_t, to match Python's default. > win64 behaves different, you > have to explicitly upcast your index to be able to index all memory. > But API or ABI is just semantics here, what I actually mean is the > difference of source changes vs recompiling to deal with the issue. > Of course there might be C code that needs more than recompiling, but it > should not be that much, it would have to be already somewhat > broken/restrictive code that uses numpy buffers without first checking > which type it has. > > There can also be python code that might need source changes e.g. > np.int_ memory mapping a binary from win32 assuming np.int_ is also 32 > bit on win64, but this would be broken on linux and mac already now. Anything that assumes that np.int_ is any particular fixed size is always broken, naturally. -- Robert Kern From robert.kern at gmail.com Wed Jul 23 17:07:10 2014 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 23 Jul 2014 22:07:10 +0100 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? 
In-Reply-To: References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> <53D01C60.1090307@googlemail.com> Message-ID: On Wed, Jul 23, 2014 at 9:57 PM, Robert Kern wrote: > That's what I'm suggesting that we change: make > `type(ndarray.shape[i])` be `np.intp` instead of `long`. > > However, I'm not sure that this is an issue with numpy 1.8.0 at least. > I can't reproduce the reported problem on Win64: > > In [12]: import numpy as np > > In [13]: from numpy.lib import stride_tricks > > In [14]: import sys > > In [15]: b = stride_tricks.as_strided(np.zeros(1), shape=(100000, > 200000, 400000), strides=(0, 0, 0)) > > In [16]: b.shape > Out[16]: (100000L, 200000L, 400000L) > > In [17]: np.product(b.shape) > Out[17]: 8000000000000000 > > In [18]: np.product(b.shape).dtype > Out[18]: dtype('int64') > > In [19]: sys.maxint > Out[19]: 2147483647 > > In [20]: np.__version__ > Out[20]: '1.8.0' > > In [21]: np.array(b.shape) > Out[21]: array([100000, 200000, 400000], dtype=int64) > > > This is on Python 2.7, so maybe something got weird in the Python 3 > version that Chris Gohlke tested? Ah yes, naturally. Because there is no separate `long` type in Python 3, np.asarray() can't use the type to distinguish what type to build the array. Returning np.intp objects in the tuple would resolve the problem in much the same way the problem is currently resolved in Python 2. This would also have the effect of unifying API on all platforms: currently, win64 is the only platform where the `.shape` tuple and related attribute returns Python longs instead of Python ints. -- Robert Kern From njs at pobox.com Wed Jul 23 17:13:33 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 23 Jul 2014 22:13:33 +0100 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> <53D01C60.1090307@googlemail.com> Message-ID: On Wed, Jul 23, 2014 at 9:57 PM, Robert Kern wrote: > That's perhaps what you want, but numpy has never claimed to do this. > The numpy project deliberately chose (and is so documented) to make > its default integer type a C long, not a C size_t, to match Python's > default. This is true, but it's not very compelling on its own -- "big as a pointer" is a much much more useful property than "big as a long". The only real reason this made sense in the first place is the equivalence between Python int and C long, but even that is gone now with Python 3. IMO at this point backcompat is really the only serious reason for keeping int32 as the default integer type in win64. But of course this is a pretty serious concern... Julian: making the change experimentally and checking how badly scipy and some similar libraries break might be a way to focus the backcompat discussion more. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From pav at iki.fi Wed Jul 23 18:35:57 2014 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 24 Jul 2014 01:35:57 +0300 Subject: [Numpy-discussion] __numpy_ufunc__ and 1.9 release In-Reply-To: <53CFF2E3.1020708@googlemail.com> References: <53C56DA2.40402@googlemail.com> <53CFF2E3.1020708@googlemail.com> Message-ID: <53D038CD.3000306@iki.fi> 23.07.2014, 20:37, Julian Taylor kirjoitti: [clip: __numpy_ufunc__] > So its been a week and we got a few answers and new issues. 
To > summarize: - to my knowledge no progress was made on the issues - > scipy already has a released version using the current > implementation - no very loud objections to delaying the feature to > 1.10 - I am still unfamiliar with the problematics of subclassing, > but don't want to release something new which has unsolved issues. > > That scipy already uses it in a released version (0.14) is very > problematic. Can maybe someone give some insight if the potential > changes to resolve the remaining issues would break scipy? > > If so we have following choices: > > - declare what we have as final and close the remaining issues as > 'won't fix'. Any changes would have to have a new name > __numpy_ufunc2__ or a somehow versioned the interface - delay the > introduction, potentially breaking scipy 0.14 when numpy 1.10 is > released. > > I would like to get the next (and last) numpy 1.9 beta out soon, so > I would propose to make a decision until this Saturday the > 26.02.2014 however misinformed it may be. It seems fairly unlikely to me that the `__numpy_ufunc__` interface itself requires any changes. I believe the definition of the interface is quite safe to consider as fixed --- it is a fairly straighforward hook for Numpy ufuncs. (There are also no essential changes in it since last year.) For the binary operator overriding, Scipy sets the constraint that ndarray * spmatrix MUST call spmatrix.__rmul__ even if spmatrix.__numpy_ufunc__ is defined. spmatrixes are not ndarray subclasses, and various subclassing problems do not enter here. Note that this binop discussion is somewhat separate from the __numpy_ufunc__ interface itself. The only information available about it at the binop stage is `hasattr(other, '__numpy_ufunc__')`. *** Regarding the blockers: (1) https://github.com/numpy/numpy/issues/4753 This is a bug in the argument normalization --- output arguments are not checked for the presence of "__numpy_ufunc__" if they are passed as keyword arguments (as a positional argument it works). It's a bug in the implementation, but I don't think it is really a blocker. Scipy sparse matrices will in practice seldom be used as output args for ufuncs. *** (2) https://github.com/numpy/numpy/pull/4815 The is open question concerns semantics of `__numpy_ufunc__` versus Python operator overrides. When should ndarray.__mul__(other) return NotImplemented? Scipy sparse matrices are not subclasses of ndarray, so the code in question in Numpy gets to run only for ndarray * spmatrix This provides a constraint to what solution we can choose in Numpy to deal with the issue: ndarray.__mul__(spmatrix) MUST continue to return NotImplemented This is the current behavior, and cannot be changed: it is not possible to defer this to __numpy_ufunc__(ufunc=np.multiply), because sparse matrices define `*` as the matrix multiply, and not the elementwise multiply. (This settles one line of discussion in the issues --- ndarray should defer.) How Numpy currently determines whether to return NotImplemented in this case or to call np.multiply(self, other) is by comparing `__array_priority__` attributes of `self` and `other`. Scipy sparse matrices define an `__array_priority__` larger than ndarrays, which then makes a NotImplemented be returned. The idea in the __numpy_ufunc__ NEP was to replace this with `hasattr(other, '__numpy_ufunc__') and hasattr(other, '__rmul__')`. However, when both self and other are ndarray subclasses in a certain configuration, both end up returning NotImplemented, and Python raises TypeError. 
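For concreteness, a minimal sketch of the current `__array_priority__`-based deferral mentioned above (the NEP would replace the priority check with the hasattr test just described). The SparseLike class is a hypothetical stand-in for scipy's spmatrix, used only to show the mechanism, not scipy code:

import numpy as np

class SparseLike(object):
    # Hypothetical stand-in for a scipy.sparse matrix: a priority higher than
    # ndarray's default __array_priority__ of 0.0, plus a reflected operator.
    __array_priority__ = 10.1

    def __mul__(self, other):
        return "SparseLike.__mul__ (matrix product)"

    def __rmul__(self, other):
        return "SparseLike.__rmul__ (matrix product)"

a = np.arange(3)
# ndarray.__mul__ sees the higher __array_priority__ together with __rmul__,
# returns NotImplemented, and Python falls back to SparseLike.__rmul__:
print(a * SparseLike())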
The `__array_priority__` mechanism is also broken in some of the subclassing cases: https://github.com/numpy/numpy/issues/4766 As far as I see, the backward compatibility requirement from Scipy only rules out the option that ndarray.__mul__(other) should unconditionally call `np.add(self, other)`. We have some freedom how to solve the binop vs. subclass issues. It's possible to e.g. retain the __array_priority__ stuff as a backward compatibility measure as we do currently. -- Pauli Virtanen From sturla.molden at gmail.com Wed Jul 23 22:47:05 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 24 Jul 2014 02:47:05 +0000 (UTC) Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> <53D01C60.1090307@googlemail.com> Message-ID: <877454677427862634.570293sturla.molden-gmail.com@news.gmane.org> Julian Taylor wrote: > The default integer dtype should be sufficiently large to index into any > numpy array, thats what I call an API here. win64 behaves different, you > have to explicitly upcast your index to be able to index all memory. No, you don't have to manually upcast Python int to Python long. Python 2 will automatically create a Python long if you overflow a Python int. On Python 3 the Python int does not have a size limit. Sturla From robert.kern at gmail.com Thu Jul 24 04:36:18 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Jul 2014 09:36:18 +0100 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: <877454677427862634.570293sturla.molden-gmail.com@news.gmane.org> References: <53CFEEBE.5000207@googlemail.com> <53D01207.2090807@googlemail.com> <53D01C60.1090307@googlemail.com> <877454677427862634.570293sturla.molden-gmail.com@news.gmane.org> Message-ID: On Thu, Jul 24, 2014 at 3:47 AM, Sturla Molden wrote: > Julian Taylor wrote: > >> The default integer dtype should be sufficiently large to index into any >> numpy array, thats what I call an API here. win64 behaves different, you >> have to explicitly upcast your index to be able to index all memory. > > No, you don't have to manually upcast Python int to Python long. > > Python 2 will automatically create a Python long if you overflow a Python > int. > > On Python 3 the Python int does not have a size limit. Please reread the thread more carefully. That's not what this discussion is about. -- Robert Kern From thomas_unterthiner at web.de Thu Jul 24 05:32:24 2014 From: thomas_unterthiner at web.de (Thomas Unterthiner) Date: Thu, 24 Jul 2014 11:32:24 +0200 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays Message-ID: <53D0D2A8.2060308@web.de> Hi! The following is a known "bug" since at least 2010 [1]: import numpy as np X = np.ones((50000, 1024), np.float32) print X.mean() >>> 0.32768 I ran into this for the first time today as part of a larger program. I was very surprised by this, and spent over an hour looking for bugs in my code before noticing that the culprit was `mean` being broken for large float32 arrays. I realize that this behavior is actually documented, but it is absolutely non-intuitive. I assume most users expect `mean` to just work. This has been discussed once two years ago [2], but nothing came of that. 
This could be easily fixed by making `np.float64` the default dtype (as it already is for integer types), or by at least checking inside mean if the passed array was a large np.float32 array and switch the dtype to np.float64 in that case. Is there a reason why this has not been done? Cheers Thomas [1] http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053697.html [2] http://numpy-discussion.10968.n7.nabble.com/Bug-in-numpy-mean-revisited-td1293.html From larsmans at gmail.com Thu Jul 24 05:39:30 2014 From: larsmans at gmail.com (Lars Buitinck) Date: Thu, 24 Jul 2014 11:39:30 +0200 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? Message-ID: Wed, 23 Jul 2014 22:13:33 +0100 Nathaniel Smith : > On Wed, Jul 23, 2014 at 9:57 PM, Robert Kern wrote: >> That's perhaps what you want, but numpy has never claimed to do this. ... except in np.where, which promises to return indices but actually returns arrays of longs and thus doesn't work with large arrays on Windows. I know this is a bug that can be fixed without changing the size of np.int, but it goes to show that even core functionality in NumPy gets it wrong. > This is true, but it's not very compelling on its own -- "big as a > pointer" is a much much more useful property than "big as a long". The > only real reason this made sense in the first place is the equivalence > between Python int and C long, but even that is gone now with Python > 3. IMO at this point backcompat is really the only serious reason for > keeping int32 as the default integer type in win64. But of course this > is a pretty serious concern... Hear, hear. The C type long is only useful as an "at least 32-bit" integer, but on the platforms that NumPy targets, int is also at least that large. The only real benefit of long is that it makes porting more interesting . If you have intp and a bunch of explicitly-sized integer types, you don't need an additional type that behaves like a long *except* for backward compat. The Go people got this right; they only have explicitly-sized integer types and an int type the size of a pointer [1]. [1] http://golang.org/doc/go1.1#int From robert.kern at gmail.com Thu Jul 24 05:46:00 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Jul 2014 10:46:00 +0100 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: References: Message-ID: On Thu, Jul 24, 2014 at 10:39 AM, Lars Buitinck wrote: > Wed, 23 Jul 2014 22:13:33 +0100 Nathaniel Smith : >> On Wed, Jul 23, 2014 at 9:57 PM, Robert Kern wrote: >>> That's perhaps what you want, but numpy has never claimed to do this. > > ... except in np.where, which promises to return indices but actually > returns arrays of longs and thus doesn't work with large arrays on > Windows. > > I know this is a bug that can be fixed without changing the size of > np.int, but it goes to show that even core functionality in NumPy gets > it wrong. Does it? I don't have my Windows VM available at the moment, but it looks like PyArray_Nonzero() is correctly returning an intp array: https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/item_selection.c#L2478 If it is incorrect somewhere else, please submit a bug report. 
-- Robert Kern From hoogendoorn.eelco at gmail.com Thu Jul 24 05:59:16 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Thu, 24 Jul 2014 11:59:16 +0200 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: <53D0D2A8.2060308@web.de> References: <53D0D2A8.2060308@web.de> Message-ID: Arguably, this isn't a problem of numpy, but of programmers being trained to think of floating point numbers as 'real' numbers, rather than just a finite number of states with a funny distribution over the number line. np.mean isn't broken; your understanding of floating point number is. What you appear to wish for is a silent upcasting of the accumulated result. This is often performed in reducing operations, but I can imagine it runs into trouble for nd-arrays. After all, if I have a huge array that I want to reduce over a very short axis, upcasting might be very undesirable; it wouldn't buy me any extra precision, but it would increase memory use from 'huge' to 'even more huge'. np.mean has a kwarg that allows you to explicitly choose the dtype of the accumulant. X.mean(dtype=np.float64)==1.0. Personally, I have a distaste for implicit behavior, unless the rule is simple and there really can be no negative downsides; which doesn't apply here I would argue. Perhaps when reducing an array completely to a single value, there is no harm in upcasting to the maximum machine precision; but that becomes a rather complex rule which would work out differently for different machines. Its better to be confronted with the limitations of floating point numbers earlier, rather than later when you want to distribute your work and run into subtle bugs on other peoples computers.? -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas_unterthiner at web.de Thu Jul 24 06:55:07 2014 From: thomas_unterthiner at web.de (Thomas Unterthiner) Date: Thu, 24 Jul 2014 12:55:07 +0200 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: References: <53D0D2A8.2060308@web.de> Message-ID: <53D0E60B.9060500@web.de> I don't agree. The problem is that I expect `mean` to do something reasonable. The documentation mentions that the results can be "inaccurate", which is a huge understatement: the results can be utterly wrong. That is not reasonable. At the very least, a warning should be issued in cases where the dtype might not be appropriate. One cannot predict what input sizes a program will be run with once it's in use (especially if it's in use for several years). I'd argue this is true for pretty much every code except quick one-off scripts. Thus one would have to use `dtype=np.float64` everywhere. By which point it seems obvious that it should have been the default in the first place. The other alternative would be to extend np.mean with some logic that internally figures out the right thing to do (which I don't think is too hard, since ). Your example with the short axis is something that can be checked for. I agree that the logic could become a bit hairy, but not too much: If we are going to sum up more than N values (where N could be determined at compile time, or simply be some constant), we upcast unless the user explicitly specified a dtype. Of course, this would incur an increase in memory. However I'd argue that it's not even a large increase: If you can fit the matrix in memory, then allocating a row/column of float64 instead of float32 should be doable, as well. 
And I'd much rather get an OutOfMemory execption than silently continue my calculations with useless/wrong results. Cheers Thomas On 2014-07-24 11:59, Eelco Hoogendoorn wrote: > Arguably, this isn't a problem of numpy, but of programmers being > trained to think of floating point numbers as 'real' numbers, rather > than just a finite number of states with a funny distribution over the > number line. np.mean isn't broken; your understanding of floating > point number is. > > What you appear to wish for is a silent upcasting of the accumulated > result. This is often performed in reducing operations, but I can > imagine it runs into trouble for nd-arrays. After all, if I have a > huge array that I want to reduce over a very short axis, upcasting > might be very undesirable; it wouldn't buy me any extra precision, but > it would increase memory use from 'huge' to 'even more huge'. > > np.mean has a kwarg that allows you to explicitly choose the dtype of > the accumulant. X.mean(dtype=np.float64)==1.0. Personally, I have a > distaste for implicit behavior, unless the rule is simple and there > really can be no negative downsides; which doesn't apply here I would > argue. Perhaps when reducing an array completely to a single value, > there is no harm in upcasting to the maximum machine precision; but > that becomes a rather complex rule which would work out differently > for different machines. Its better to be confronted with the > limitations of floating point numbers earlier, rather than later when > you want to distribute your work and run into subtle bugs on other > peoples computers.? > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From fabien.maussion at gmail.com Thu Jul 24 07:33:06 2014 From: fabien.maussion at gmail.com (Fabien) Date: Thu, 24 Jul 2014 13:33:06 +0200 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: References: <53D0D2A8.2060308@web.de> Message-ID: Hi all, On 24.07.2014 11:59, Eelco Hoogendoorn wrote: > np.mean isn't broken; your understanding of floating point number is. I am quite new to python, and this problem is discussed over and over for other languages too. However, numpy's summation problem appears with relatively small arrays already: py>import numpy as np py>np.ones((4000,4000), np.float32).mean() 1.0 py>np.ones((5000,5000), np.float32).mean() 0.67108864000000001 A 5000*5000 image is not unusual anymore today. In IDL: IDL> mean(fltarr(5000L, 5000L)+1) 1.0000000 IDL> mean(fltarr(7000L, 7000L)+1) 1.0000000 IDL> mean(fltarr(10000L, 10000L)+1) 0.67108864 I can't really explain why there are differences between the two languages (IDL uses 32-bit, single-precision, floating-point numbers) Fabien From jtaylor.debian at googlemail.com Thu Jul 24 07:56:17 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 24 Jul 2014 13:56:17 +0200 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: References: <53D0D2A8.2060308@web.de> Message-ID: On Thu, Jul 24, 2014 at 1:33 PM, Fabien wrote: > Hi all, > > On 24.07.2014 11:59, Eelco Hoogendoorn wrote: >> np.mean isn't broken; your understanding of floating point number is. > > I am quite new to python, and this problem is discussed over and over > for other languages too. 
However, numpy's summation problem appears with > relatively small arrays already: > > py>import numpy as np > py>np.ones((4000,4000), np.float32).mean() > 1.0 > py>np.ones((5000,5000), np.float32).mean() > 0.67108864000000001 > > A 5000*5000 image is not unusual anymore today. > > In IDL: > IDL> mean(fltarr(5000L, 5000L)+1) > 1.0000000 > IDL> mean(fltarr(7000L, 7000L)+1) > 1.0000000 > IDL> mean(fltarr(10000L, 10000L)+1) > 0.67108864 > > I can't really explain why there are differences between the two > languages (IDL uses 32-bit, single-precision, floating-point numbers) > > Fabien > something as simple as summation is already an interesting algorithmic problem there are several ways do to with different speeds and accuracies. E.g. pythons math.fsum is always exact to one ulp but is very slow as it requires partial sorting. Then there is kahan summation that has an accuracy of O(1) ulp but its about 4 times slower than the naive sum. In practice one of the better methods is pairwise summation that is pretty much as fast as a naive summation but has an accuracy of O(logN) ulp. This is the method numpy 1.9 will use this method by default (+ its even a bit faster than our old implementation of the naive sum): https://github.com/numpy/numpy/pull/3685 but it has some limitations, it is limited to blocks fo the buffer size (8192 elements by default) and does not work along the slow axes due to limitations in the numpy iterator. From jaime.frio at gmail.com Thu Jul 24 10:27:28 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Thu, 24 Jul 2014 07:27:28 -0700 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: References: <53D0D2A8.2060308@web.de> Message-ID: On Thu, Jul 24, 2014 at 4:56 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > In practice one of the better methods is pairwise summation that is > pretty much as fast as a naive summation but has an accuracy of > O(logN) ulp. > This is the method numpy 1.9 will use this method by default (+ its > even a bit faster than our old implementation of the naive sum): > https://github.com/numpy/numpy/pull/3685 > > but it has some limitations, it is limited to blocks fo the buffer > size (8192 elements by default) and does not work along the slow axes > due to limitations in the numpy iterator. > For what it's worth, I see the issue on a 64-bit Windows numpy 1.8, but cannot on a 32-bit Windows numpy master: >>> np.__version__ '1.8.0' >>> np.ones(100000000, dtype=np.float32).mean() 0.16777216 >>> np.__version__ '1.10.0.dev-Unknown' >>> np.ones(100000000, dtype=np.float32).mean() 1.0 -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Thu Jul 24 11:09:12 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 24 Jul 2014 11:09:12 -0400 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: References: <53D0D2A8.2060308@web.de> Message-ID: <53D12198.8040308@gmail.com> On 7/24/2014 5:59 AM, Eelco Hoogendoorn wrote to Thomas: > np.mean isn't broken; your understanding of floating point number is. This comment seems to conflate separate issues: the desirable return type, and the computational algorithm. It is certainly possible to compute a mean of float32 doing reduction in float64 and still return a float32. 
There is nothing implicit in the name `mean` that says we have to just add everything up and divide by the count. My own view is that `mean` would behave enough better if computed as a running mean to justify the speed loss. Naturally similar issues arise for `var` and `std`, etc. See http://www.johndcook.com/standard_deviation.html for some discussion and references. Alan Isaac From hoogendoorn.eelco at gmail.com Thu Jul 24 11:31:11 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Thu, 24 Jul 2014 18:31:11 +0300 Subject: [Numpy-discussion] numpy.mean still broken for large float32arrays Message-ID: <53d126f2.46b3c20a.4c2b.ffff8ede@mx.google.com> Thanks Julian, those seem like Nice improvements. The fact that it either does or doesnt work depending on the axis makes me a Little queesy; but yeah, checking that fp's do what You think they should, is unfortunately best left as the responsibility of the programmer. -----Original Message----- From: "Julian Taylor" Sent: ?24-?7-?2014 14:56 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32arrays On Thu, Jul 24, 2014 at 1:33 PM, Fabien wrote: > Hi all, > > On 24.07.2014 11:59, Eelco Hoogendoorn wrote: >> np.mean isn't broken; your understanding of floating point number is. > > I am quite new to python, and this problem is discussed over and over > for other languages too. However, numpy's summation problem appears with > relatively small arrays already: > > py>import numpy as np > py>np.ones((4000,4000), np.float32).mean() > 1.0 > py>np.ones((5000,5000), np.float32).mean() > 0.67108864000000001 > > A 5000*5000 image is not unusual anymore today. > > In IDL: > IDL> mean(fltarr(5000L, 5000L)+1) > 1.0000000 > IDL> mean(fltarr(7000L, 7000L)+1) > 1.0000000 > IDL> mean(fltarr(10000L, 10000L)+1) > 0.67108864 > > I can't really explain why there are differences between the two > languages (IDL uses 32-bit, single-precision, floating-point numbers) > > Fabien > something as simple as summation is already an interesting algorithmic problem there are several ways do to with different speeds and accuracies. E.g. pythons math.fsum is always exact to one ulp but is very slow as it requires partial sorting. Then there is kahan summation that has an accuracy of O(1) ulp but its about 4 times slower than the naive sum. In practice one of the better methods is pairwise summation that is pretty much as fast as a naive summation but has an accuracy of O(logN) ulp. This is the method numpy 1.9 will use this method by default (+ its even a bit faster than our old implementation of the naive sum): https://github.com/numpy/numpy/pull/3685 but it has some limitations, it is limited to blocks fo the buffer size (8192 elements by default) and does not work along the slow axes due to limitations in the numpy iterator. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hoogendoorn.eelco at gmail.com Thu Jul 24 11:34:28 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Thu, 24 Jul 2014 18:34:28 +0300 Subject: [Numpy-discussion] numpy.mean still broken for large float32arrays Message-ID: <53d127b7.a5cbc20a.62be.ffffa347@mx.google.com> True, i suppose there is no harm in accumulating with max precision, and storing the result in the Original dtype, unless otherwise specified, although i wonder if the current nditer supports such behavior. -----Original Message----- From: "Alan G Isaac" Sent: ?24-?7-?2014 18:09 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32arrays On 7/24/2014 5:59 AM, Eelco Hoogendoorn wrote to Thomas: > np.mean isn't broken; your understanding of floating point number is. This comment seems to conflate separate issues: the desirable return type, and the computational algorithm. It is certainly possible to compute a mean of float32 doing reduction in float64 and still return a float32. There is nothing implicit in the name `mean` that says we have to just add everything up and divide by the count. My own view is that `mean` would behave enough better if computed as a running mean to justify the speed loss. Naturally similar issues arise for `var` and `std`, etc. See http://www.johndcook.com/standard_deviation.html for some discussion and references. Alan Isaac _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 24 12:59:38 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 24 Jul 2014 10:59:38 -0600 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: References: <53D0D2A8.2060308@web.de> Message-ID: On Thu, Jul 24, 2014 at 8:27 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Thu, Jul 24, 2014 at 4:56 AM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> In practice one of the better methods is pairwise summation that is >> pretty much as fast as a naive summation but has an accuracy of >> O(logN) ulp. >> This is the method numpy 1.9 will use this method by default (+ its >> even a bit faster than our old implementation of the naive sum): >> https://github.com/numpy/numpy/pull/3685 >> >> but it has some limitations, it is limited to blocks fo the buffer >> size (8192 elements by default) and does not work along the slow axes >> due to limitations in the numpy iterator. >> > > For what it's worth, I see the issue on a 64-bit Windows numpy 1.8, but > cannot on a 32-bit Windows numpy master: > > >>> np.__version__ > '1.8.0' > >>> np.ones(100000000, dtype=np.float32).mean() > 0.16777216 > > >>> np.__version__ > '1.10.0.dev-Unknown' > >>> np.ones(100000000, dtype=np.float32).mean() > 1.0 > > Interesting. Might be compiler related as there are many choices for floating point instructions/registers in i386. The i386 version may effectively be working in double precision. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From joseph.martinot-lagarde at m4x.org Thu Jul 24 13:03:50 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Thu, 24 Jul 2014 19:03:50 +0200 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: <53D0E60B.9060500@web.de> References: <53D0D2A8.2060308@web.de> <53D0E60B.9060500@web.de> Message-ID: Le 24/07/2014 12:55, Thomas Unterthiner a ?crit : > I don't agree. The problem is that I expect `mean` to do something > reasonable. The documentation mentions that the results can be > "inaccurate", which is a huge understatement: the results can be utterly > wrong. That is not reasonable. At the very least, a warning should be > issued in cases where the dtype might not be appropriate. > Maybe the problem is the documentation, then. If this is a common error, it could be explicitly documented in the function documentation. From nouiz at nouiz.org Thu Jul 24 13:04:43 2014 From: nouiz at nouiz.org (=?UTF-8?B?RnLDqWTDqXJpYyBCYXN0aWVu?=) Date: Thu, 24 Jul 2014 13:04:43 -0400 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: References: <53D0D2A8.2060308@web.de> Message-ID: On Thu, Jul 24, 2014 at 12:59 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > > On Thu, Jul 24, 2014 at 8:27 AM, Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> On Thu, Jul 24, 2014 at 4:56 AM, Julian Taylor < >> jtaylor.debian at googlemail.com> wrote: >> >>> In practice one of the better methods is pairwise summation that is >>> pretty much as fast as a naive summation but has an accuracy of >>> O(logN) ulp. >>> This is the method numpy 1.9 will use this method by default (+ its >>> even a bit faster than our old implementation of the naive sum): >>> https://github.com/numpy/numpy/pull/3685 >>> >>> but it has some limitations, it is limited to blocks fo the buffer >>> size (8192 elements by default) and does not work along the slow axes >>> due to limitations in the numpy iterator. >>> >> >> For what it's worth, I see the issue on a 64-bit Windows numpy 1.8, but >> cannot on a 32-bit Windows numpy master: >> >> >>> np.__version__ >> '1.8.0' >> >>> np.ones(100000000, dtype=np.float32).mean() >> 0.16777216 >> >> >>> np.__version__ >> '1.10.0.dev-Unknown' >> >>> np.ones(100000000, dtype=np.float32).mean() >> 1.0 >> >> > Interesting. Might be compiler related as there are many choices for > floating point instructions/registers in i386. The i386 version may > effectively be working in double precision. > Also note the different numpy version. Julian told that numpy 1.9 will use a more precise version in that case. That could explain that. Fred -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rays at blue-cove.com Thu Jul 24 13:36:12 2014 From: rays at blue-cove.com (RayS) Date: Thu, 24 Jul 2014 10:36:12 -0700 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: References: <53D0D2A8.2060308@web.de> Message-ID: <201407241736.s6OHaEcK032578@blue-cove.com> import numpy print numpy.__version__ for s in range(1864100, 1864200): if numpy.ones((s, 9), numpy.float32).sum()!= s*9: print '\nbroke', s break else: print '\r',s, C:\temp>python np_sum.py 1.8.0b2 1864135 broke 1864136 import numpy print numpy.__version__ for s in range(1864130*9, 1864135*9): if numpy.ones((s, 1), numpy.float32).sum()!= s: print '\nbroke', s break else: print '\r',s, C:\temp>python np_sum.py 1.8.0b2 16777214 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Thu Jul 24 13:53:08 2014 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 24 Jul 2014 13:53:08 -0400 Subject: [Numpy-discussion] masked_where broadcasting? Message-ID: I ran into this this morning while writing up a new test for matplotlib. Shouldn't these two arrays be broadcasted automatically or maybe np.ma is being overly cautious? u = np.ma.masked_where((-0.4 < x) & (x < 0.1), u, copy=False) File "/home/ben/.local/lib/python2.7/site-packages/numpy/ma/core.py", line 1806, in masked_where " (got %s and %s)" % (cshape, ashape)) IndexError: Inconsistant shape between the condition and the input (got (10, 1, 1) and (10, 10, 3)) x has shape (10, 1, 1) and u has shape (10, 10, 3). This is on a recent-ish numpy master. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Thu Jul 24 15:05:56 2014 From: rays at blue-cove.com (RayS) Date: Thu, 24 Jul 2014 12:05:56 -0700 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays Message-ID: <201407241906.s6OJ61gG006795@blue-cove.com> Probably a number of scipy places as well import numpy import scipy.stats print numpy.__version__ print scipy.__version__ for s in range(16777214, 16777944): if scipy.stats.nanmean(numpy.ones((s, 1), numpy.float32))[0]!=1: print '\nbroke', s, scipy.stats.nanmean(numpy.ones((s, 1), numpy.float32)) break else: print '\r',s, c:\temp>python np_sum.py 1.8.0b2 0.11.0 16777216 broke 16777217 [ 0.99999994] From jeffreback at gmail.com Thu Jul 24 15:25:06 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Thu, 24 Jul 2014 15:25:06 -0400 Subject: [Numpy-discussion] numpy.mean still broken for large float32 arrays In-Reply-To: <201407241906.s6OJ61gG006795@blue-cove.com> References: <201407241906.s6OJ61gG006795@blue-cove.com> Message-ID: related recent issue: https://github.com/numpy/numpy/issues/4638 and pandas is now explicitly specifying the accumulator to avoid this problem: https://github.com/pydata/pandas/pull/6954/files pandas also implemented the Welfords method for rolling_var in 0.14.0, see here: https://github.com/pydata/pandas/pull/6817 On Thu, Jul 24, 2014 at 3:05 PM, RayS wrote: > Probably a number of scipy places as well > > > > import numpy > import scipy.stats > print numpy.__version__ > print scipy.__version__ > for s in range(16777214, 16777944): > if scipy.stats.nanmean(numpy.ones((s, 1), numpy.float32))[0]!=1: > print '\nbroke', s, scipy.stats.nanmean(numpy.ones((s, 1), > numpy.float32)) > break > else: > print '\r',s, > > c:\temp>python np_sum.py > 1.8.0b2 > 0.11.0 > 16777216 > broke 16777217 [ 0.99999994] > > _______________________________________________ > NumPy-Discussion 
mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Thu Jul 24 16:42:53 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Thu, 24 Jul 2014 23:42:53 +0300 Subject: [Numpy-discussion] numpy.mean still broken for large float32arrays Message-ID: <53d17011.234dc20a.45b5.ffffeacf@mx.google.com> Inaccurate and utterly wrong are subjective. If You want To Be sufficiently strict, floating point calculations are almost always 'utterly wrong'. Granted, It would Be Nice if the docs specified the algorithm used. But numpy does not produce anything different than what a standard c loop or c++ std lib func would. This isn't a bug report, but rather a feature request. That said, support for fancy reduction algorithms would certainly be nice, if implementing it in numpy in a coherent manner is feasible. -----Original Message----- From: "Joseph Martinot-Lagarde" Sent: ?24-?7-?2014 20:04 To: "numpy-discussion at scipy.org" Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32arrays Le 24/07/2014 12:55, Thomas Unterthiner a ?crit : > I don't agree. The problem is that I expect `mean` to do something > reasonable. The documentation mentions that the results can be > "inaccurate", which is a huge understatement: the results can be utterly > wrong. That is not reasonable. At the very least, a warning should be > issued in cases where the dtype might not be appropriate. > Maybe the problem is the documentation, then. If this is a common error, it could be explicitly documented in the function documentation. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Thu Jul 24 17:10:15 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 24 Jul 2014 17:10:15 -0400 Subject: [Numpy-discussion] numpy.mean still broken for large float32arrays In-Reply-To: <53d17011.234dc20a.45b5.ffffeacf@mx.google.com> References: <53d17011.234dc20a.45b5.ffffeacf@mx.google.com> Message-ID: <53D17637.4000008@gmail.com> On 7/24/2014 4:42 PM, Eelco Hoogendoorn wrote: > This isn't a bug report, but rather a feature request. I'm not sure statement this is correct. The mean of a float32 array can certainly be computed as a float32. Currently this is not necessarily what happens, not even approximately. That feels a lot like a bug, even if we can readily understand how the algorithm currently used produces it. To say whether it is a bug or not, don't we have to ask about the intent of `mean`? If the intent is to sum and divide, then it is not a bug. If the intent is to produce the mean, then it is a bug. Alan Isaac From hoogendoorn.eelco at gmail.com Thu Jul 24 23:37:49 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Fri, 25 Jul 2014 06:37:49 +0300 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays Message-ID: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> Perhaps it is a slightly semantical discussion; but all fp calculations have errors, and there are always strategies for making them smaller. We just don't happen to like the error for this case; but rest assured it won't be hard to find new cases of 'blatantly wrong' results, no matter what accumulator is implemented. 
That's no reason to not try and be clever about it, but there isn't going to be an algorithm that is best for all possible inputs, and in the end the most important thing is that the algorithm used is specified in the docs. -----Original Message----- From: "Alan G Isaac" Sent: ?25-?7-?2014 00:10 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for largefloat32arrays On 7/24/2014 4:42 PM, Eelco Hoogendoorn wrote: > This isn't a bug report, but rather a feature request. I'm not sure statement this is correct. The mean of a float32 array can certainly be computed as a float32. Currently this is not necessarily what happens, not even approximately. That feels a lot like a bug, even if we can readily understand how the algorithm currently used produces it. To say whether it is a bug or not, don't we have to ask about the intent of `mean`? If the intent is to sum and divide, then it is not a bug. If the intent is to produce the mean, then it is a bug. Alan Isaac _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Fri Jul 25 04:22:56 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Fri, 25 Jul 2014 10:22:56 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> References: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> Message-ID: To elaborate on that point; knowing that numpy accumulates in a simple first-to-last sweep, and does not implicitly upcast, the original problem can be solved in several ways; specifying a higher precision to sum with, or by a nested summation, like X.mean(0).mean(0)==1.0. I personally like this explicitness, and am wary of numpy doing overly clever things behind the scenes, as I can think of other code that might become broken if things change too radically. For instance, I often sort large arrays with a large spread in magnitudes before summation, relying on the fact that summing the smallest values first gives best precision. Any changes made to reduction behavior should try and be backwards compatible with such properties of straightforward reductions, or else a lot of code is going to be broken without warning. I suppose using maximum precision internally, and nesting all reductions over multiple axes of an ndarray, are both easy to implement improvements that do not come with any drawbacks that I can think of. Actually the maximum precision I am not so sure of, as I personally prefer to make an informed decision about precision used, and get an error on a platform that does not support the specified precision, rather than obtain subtly or horribly broken results without warning when moving your code to a different platform/compiler whatever. On Fri, Jul 25, 2014 at 5:37 AM, Eelco Hoogendoorn < hoogendoorn.eelco at gmail.com> wrote: > Perhaps it is a slightly semantical discussion; but all fp calculations > have errors, and there are always strategies for making them smaller. We > just don't happen to like the error for this case; but rest assured it > won't be hard to find new cases of 'blatantly wrong' results, no matter > what accumulator is implemented. 
That's no reason to not try and be clever > about it, but there isn't going to be an algorithm that is best for all > possible inputs, and in the end the most important thing is that the > algorithm used is specified in the docs. > ------------------------------ > From: Alan G Isaac > Sent: ?25-?7-?2014 00:10 > > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] numpy.mean still broken for > largefloat32arrays > > On 7/24/2014 4:42 PM, Eelco Hoogendoorn wrote: > > This isn't a bug report, but rather a feature request. > > I'm not sure statement this is correct. The mean of a float32 array > can certainly be computed as a float32. Currently this is > not necessarily what happens, not even approximately. > That feels a lot like a bug, even if we can readily understand > how the algorithm currently used produces it. To say whether > it is a bug or not, don't we have to ask about the intent of `mean`? > If the intent is to sum and divide, then it is not a bug. > If the intent is to produce the mean, then it is a bug. > > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Fri Jul 25 09:06:40 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Fri, 25 Jul 2014 15:06:40 +0200 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: References: Message-ID: The dtype returned by np.where looks right (int64): >>> import platform >>> platform.architecture() ('64bit', 'WindowsPE') >>> import numpy as np >>> np.__version__ '1.9.0b1' >>> a = np.zeros(10) >>> np.where(a == 0) (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64),) -- Olivier From jeffreback at gmail.com Fri Jul 25 09:52:37 2014 From: jeffreback at gmail.com (Jeff) Date: Fri, 25 Jul 2014 06:52:37 -0700 (PDT) Subject: [Numpy-discussion] ANN: Pandas 0.14.0 Release Candidate 1 In-Reply-To: References: Message-ID: <733e0eed-fbef-46d4-b797-c5368b280fee@googlegroups.com> How does the build trigger? If its just a matter of clicking on something when released. I think we can handle that :) On Saturday, May 17, 2014 7:22:00 AM UTC-4, Jeff wrote: > > Hi, > > I'm pleased to announce the availability of the first release candidate of > Pandas 0.14.0. > Please try this RC and report any issues here: Pandas Issues > > We will be releasing officially in about 2 weeks or so. > > This is a major release from 0.13.1 and includes a small number of API > changes, several new features, enhancements, and > performance improvements along with a large number of bug fixes. > > Highlights include: > > - Officially support Python 3.4 > - SQL interfaces updated to use sqlalchemy, > - Display interface changes > - MultiIndexing Using Slicers > - Ability to join a singly-indexed DataFrame with a multi-indexed > DataFrame > - More consistency in groupby results and more flexible groupby > specifications > - Holiday calendars are now supported in CustomBusinessDay > - Several improvements in plotting functions, including: hexbin, area > and pie plots. > - Performance doc section on I/O operations > > Since there are some significant changes in the default way DataFrames are > displayed. 
I have put > up a comment issue looking for some feedback here > > > Here are the full whatsnew and documentation links: > > v0.14.0 Whatsnew > > > v0.14.0 Documentation Page > > > Source tarballs, and windows builds are available here: > > Pandas v0.14rc1 Release > > A big thank you to everyone who contributed to this release! > > Jeff > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Fri Jul 25 10:11:52 2014 From: rays at blue-cove.com (RayS) Date: Fri, 25 Jul 2014 07:11:52 -0700 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> Message-ID: <201407251411.s6PEBrpw018675@blue-cove.com> At 01:22 AM 7/25/2014, you wrote: > Actually the maximum precision I am not so > sure of, as I personally prefer to make an > informed decision about precision used, and get > an error on a platform that does not support > the specified precision, rather than obtain > subtly or horribly broken results without > warning? when moving your code to a different platform/compiler whatever. We were talking on this in the office, as we realized it does affect a couple of lines dealing with large arrays, including complex64. As I expect Python modules to work uniformly cross platform unless documented otherwise, to me that includes 32 vs 64 bit platforms, implying that the modules should automatically use large enough accumulators for the data type input. http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html does mention inaccuracy. http://docs.scipy.org/doc/scipy-0.13.0/reference/generated/scipy.stats.mstats.gmean.html http://docs.scipy.org/doc/numpy/reference/generated/numpy.sum.html etc do not, exactly - Ray From robert.kern at gmail.com Fri Jul 25 10:22:36 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Jul 2014 15:22:36 +0100 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <201407251411.s6PEBrpw018675@blue-cove.com> References: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> <201407251411.s6PEBrpw018675@blue-cove.com> Message-ID: On Fri, Jul 25, 2014 at 3:11 PM, RayS wrote: > At 01:22 AM 7/25/2014, you wrote: >> Actually the maximum precision I am not so >> sure of, as I personally prefer to make an >> informed decision about precision used, and get >> an error on a platform that does not support >> the specified precision, rather than obtain >> subtly or horribly broken results without >> warning? when moving your code to a different platform/compiler whatever. > > We were talking on this in the office, as we > realized it does affect a couple of lines dealing > with large arrays, including complex64. > As I expect Python modules to work uniformly > cross platform unless documented otherwise, to me > that includes 32 vs 64 bit platforms, implying > that the modules should automatically use large > enough accumulators for the data type input. The 32/64-bitness of your platform has nothing to do with floating point. Nothing discussed in this thread is platform-specific (modulo some minor details about the hardware FPU, but that should be taken as read). 
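(One way to see that from the Python prompt -- the float32 property at the heart of this thread is the same on every platform and compiler:)

import numpy as np

x = np.float32(2**24)
print(x + np.float32(1.0) == x)        # True: the float32 spacing above 2**24 is 2.0
print(np.spacing(np.float32(2**24)))   # 2.0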
-- Robert Kern From rays at blue-cove.com Fri Jul 25 12:56:39 2014 From: rays at blue-cove.com (RayS) Date: Fri, 25 Jul 2014 09:56:39 -0700 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> <201407251411.s6PEBrpw018675@blue-cove.com> Message-ID: <201407251656.s6PGui7J027752@blue-cove.com> At 07:22 AM 7/25/2014, you wrote: > > We were talking on this in the office, as we > > realized it does affect a couple of lines dealing > > with large arrays, including complex64. > > As I expect Python modules to work uniformly > > cross platform unless documented otherwise, to me > > that includes 32 vs 64 bit platforms, implying > > that the modules should automatically use large > > enough accumulators for the data type input. > >The 32/64-bitness of your platform has nothing to do with floating >point. As a naive end user, I can, and do, download different binaries for different CPUs/Windows versions and will get different results http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070747.html > Nothing discussed in this thread is platform-specific (modulo >some minor details about the hardware FPU, but that should be taken as >read). And compilers, apparently. The important point was that it would be best if all of the methods affected by summing 32 bit floats with 32 bit accumulators had the same Notes as numpy.mean(). We went through a lot of code yesterday, assuming that any numpy or Scipy.stats functions that use accumulators suffer the same issue, whether noted or not, and found it true. "Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue." seems rather un-Pythonic. - Ray -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Fri Jul 25 13:40:17 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Fri, 25 Jul 2014 20:40:17 +0300 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays Message-ID: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> Arguably, the whole of floating point numbers and their related shenanigans is not very pythonic in the first place. The accuracy of the output WILL depend on the input, to some degree or another. At the risk of repeating myself: explicit is better than implicit -----Original Message----- From: "RayS" Sent: ?25-?7-?2014 19:56 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for largefloat32arrays At 07:22 AM 7/25/2014, you wrote: > We were talking on this in the office, as we > realized it does affect a couple of lines dealing > with large arrays, including complex64. > As I expect Python modules to work uniformly > cross platform unless documented otherwise, to me > that includes 32 vs 64 bit platforms, implying > that the modules should automatically use large > enough accumulators for the data type input. The 32/64-bitness of your platform has nothing to do with floating point. As a naive end user, I can, and do, download different binaries for different CPUs/Windows versions and will get different results http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070747.html Nothing discussed in this thread is platform-specific (modulo some minor details about the hardware FPU, but that should be taken as read). And compilers, apparently. 
The important point was that it would be best if all of the methods affected by summing 32 bit floats with 32 bit accumulators had the same Notes as numpy.mean(). We went through a lot of code yesterday, assuming that any numpy or Scipy.stats functions that use accumulators suffer the same issue, whether noted or not, and found it true. "Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue." seems rather un-Pythonic. - Ray -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Fri Jul 25 14:00:15 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 25 Jul 2014 14:00:15 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> Message-ID: <53D29B2F.9060005@gmail.com> On 7/25/2014 1:40 PM, Eelco Hoogendoorn wrote: > At the risk of repeating myself: explicit is better than implicit This sounds like an argument for renaming the `mean` function `naivemean` rather than `mean`. Whatever numpy names `mean`, shouldn't it implement an algorithm that produces the mean? And obviously, for any float data type, the mean value of the values in the array is representable as a value of the same type. Alan Isaac From matthew.brett at gmail.com Fri Jul 25 14:06:30 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 25 Jul 2014 14:06:30 -0400 Subject: [Numpy-discussion] [pydata] Re: ANN: Pandas 0.14.0 Release Candidate 1 In-Reply-To: <733e0eed-fbef-46d4-b797-c5368b280fee@googlegroups.com> References: <733e0eed-fbef-46d4-b797-c5368b280fee@googlegroups.com> Message-ID: Hi, On Fri, Jul 25, 2014 at 9:52 AM, Jeff wrote: > How does the build trigger? If its just a matter of clicking on something > when released. I think we can handle that :) > The two options are: * I add you and whoever else does releases to my repo, and you can trigger builds by pressing a button on the travis page for my repo, or pushing commits to the repo * You take over the repo, I submit a pull request to make sure you have auth to upload to rackspace, and proceed as above. But yes - single click -> build.... Cheers, Matthew From njs at pobox.com Fri Jul 25 14:29:16 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 25 Jul 2014 19:29:16 +0100 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <201407251656.s6PGui7J027752@blue-cove.com> References: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> <201407251411.s6PEBrpw018675@blue-cove.com> <201407251656.s6PGui7J027752@blue-cove.com> Message-ID: On Fri, Jul 25, 2014 at 5:56 PM, RayS wrote: > The important point was that it would be best if all of the methods affected > by summing 32 bit floats with 32 bit accumulators had the same Notes as > numpy.mean(). We went through a lot of code yesterday, assuming that any > numpy or Scipy.stats functions that use accumulators suffer the same issue, > whether noted or not, and found it true. Do you have a list of the functions that are affected? > "Depending on the input data, this can cause the results to be inaccurate, > especially for float32 (see example below). Specifying a higher-precision > accumulator using the dtype keyword can alleviate this issue." seems rather > un-Pythonic. 
It's true that in its full generality, this problem just isn't something numpy can solve. Using float32 is extremely dangerous and should not be attempted unless you're prepared to seriously analyze all your code for numeric stability; IME it often runs into problems in practice, in any number of ways. Remember that it only has as much precision as a 24 bit integer. There are good reasons why float64 is the default! That said, it does seem that np.mean could be implemented better than it is, even given float32's inherent limitations. If anyone wants to implement better algorithms for computing the mean, variance, sums, etc., then we would love to add them to numpy. I'd suggest implementing them as gufuncs -- there are examples of defining gufuncs in numpy/linalg/umath_linalg.c.src. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From hoogendoorn.eelco at gmail.com Fri Jul 25 15:23:43 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Fri, 25 Jul 2014 21:23:43 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <53D29B2F.9060005@gmail.com> References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> Message-ID: It need not be exactly representable as such; take the mean of [1, 1+eps] for instance. Granted, there are at most two number in the range of the original dtype which are closest to the true mean; but im not sure that computing them exactly is a tractable problem for arbitrary input. Im not sure what is considered best practice for these problems; or if there is one, considering the hetrogenity of the problem. As noted earlier, summing a list of floating point values is a remarkably multifaceted problem, once you get down into the details. I think it should be understood that all floating point algorithms are subject to floating point errors. As long as the algorithm used is specified, one can make an informed decision if the given algorithm will do what you expect of it. That's the best we can hope for. If we are going to advocate doing 'clever' things behind the scenes, we have to take backwards compatibility (not creating a possibility of producing worse results on the same input) and platform independence in mind. Funny summation orders could violate the former depending on the implementation details, and 'using the highest machine precision available' violates the latter (and is horrible practice in general, imo. Either you don't need the extra accuracy, or you do, and the absence on a given platform should be an error) Perhaps pairwise summation in the original order of the data is the best option: q = np.ones((2,)*26, np.float32) print q.mean() while q.ndim > 0: q = q.mean(axis=-1, dtype=np.float32) print q This only requires log(N) space on the stack if properly implemented, and is not platform dependent, nor should have any backward compatibility issues that I can think of. But im not sure how easy it would be to implement, given the current framework. The ability to specify different algorithms per kwarg wouldn't be a bad idea either, imo; or the ability to explicitly specify a separate output and accumulator dtype. On Fri, Jul 25, 2014 at 8:00 PM, Alan G Isaac wrote: > On 7/25/2014 1:40 PM, Eelco Hoogendoorn wrote: > > At the risk of repeating myself: explicit is better than implicit > > > This sounds like an argument for renaming the `mean` function `naivemean` > rather than `mean`. 
Whatever numpy names `mean`, shouldn't it > implement an algorithm that produces the mean? And obviously, for any > float data type, the mean value of the values in the array is representable > as a value of the same type. > > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Fri Jul 25 16:25:57 2014 From: rays at blue-cove.com (RayS) Date: Fri, 25 Jul 2014 13:25:57 -0700 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> <201407251411.s6PEBrpw018675@blue-cove.com> <201407251656.s6PGui7J027752@blue-cove.com> Message-ID: <201407252026.s6PKQ2kZ016912@blue-cove.com> At 11:29 AM 7/25/2014, you wrote: >On Fri, Jul 25, 2014 at 5:56 PM, RayS wrote: > > The important point was that it would be best if all of the > methods affected > > by summing 32 bit floats with 32 bit accumulators had the same Notes as > > numpy.mean(). We went through a lot of code yesterday, assuming that any > > numpy or Scipy.stats functions that use accumulators suffer the same issue, > > whether noted or not, and found it true. > >Do you have a list of the functions that are affected? We only tested a few we used, but scipy.stats.nanmean, or any .*mean() numpy.sum, mean, average, std, var,... via something like: import numpy import scipy.stats print numpy.__version__ print scipy.__version__ onez = numpy.ones((2**25, 1), numpy.float32) step = 2**10 func = scipy.stats.nanmean for s in range(2**24-step, 2**25, step): if func(onez[:s+step])!=1.: print '\nbroke', s, func(onez[:s+step]) break else: print '\r',s, > That said, it does seem that np.mean could be implemented better than >it is, even given float32's inherent limitations. If anyone wants to >implement better algorithms for computing the mean, variance, sums, >etc., then we would love to add them to numpy. Others have pointed out the possible tradeoffs in summation algos, perhaps a method arg would be appropriate, "better" depending on your desire for speed vs. accuracy. It just occurred to me that if the STSI folks (who count photons) took the mean() or other such func of an image array from Hubble sensors to find background value, they'd better always be using float64. - Ray From josef.pktd at gmail.com Fri Jul 25 17:36:27 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 25 Jul 2014 17:36:27 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <201407252026.s6PKQ2kZ016912@blue-cove.com> References: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> <201407251411.s6PEBrpw018675@blue-cove.com> <201407251656.s6PGui7J027752@blue-cove.com> <201407252026.s6PKQ2kZ016912@blue-cove.com> Message-ID: On Fri, Jul 25, 2014 at 4:25 PM, RayS wrote: > At 11:29 AM 7/25/2014, you wrote: > >On Fri, Jul 25, 2014 at 5:56 PM, RayS wrote: > > > The important point was that it would be best if all of the > > methods affected > > > by summing 32 bit floats with 32 bit accumulators had the same Notes as > > > numpy.mean(). We went through a lot of code yesterday, assuming that > any > > > numpy or Scipy.stats functions that use accumulators suffer the same > issue, > > > whether noted or not, and found it true. > > > >Do you have a list of the functions that are affected? 
> > We only tested a few we used, but > scipy.stats.nanmean, or any .*mean() > numpy.sum, mean, average, std, var,... > > via something like: > > import numpy > import scipy.stats > print numpy.__version__ > print scipy.__version__ > onez = numpy.ones((2**25, 1), numpy.float32) > step = 2**10 > func = scipy.stats.nanmean > for s in range(2**24-step, 2**25, step): > if func(onez[:s+step])!=1.: > print '\nbroke', s, func(onez[:s+step]) > break > else: > print '\r',s, > > > That said, it does seem that np.mean could be implemented better than > >it is, even given float32's inherent limitations. If anyone wants to > >implement better algorithms for computing the mean, variance, sums, > >etc., then we would love to add them to numpy. > > Others have pointed out the possible tradeoffs in summation algos, > perhaps a method arg would be appropriate, "better" depending on your > desire for speed vs. accuracy. > I think this would be a good improvement. But it doesn't compensate for users to be aware of the problems. I think the docstring and the description of the dtype argument is pretty clear. I'm largely with Eelco, whatever precision or algorithm we use, with floating point calculations we run into problems in some cases. And I don't think this is a "broken" function but a design decision that takes the different tradeoffs into account. Whether it's the right decision is an open question, if there are better algorithm with essentially not disadvantages. Two examples: I had problems to verify some results against Stata at more than a few significant digits, until I realized that Stata had used float32 for the calculations by default in this case, while I was working with float64. Using single precision linear algebra causes the same numerical problems as numpy.mean runs into. A few years ago I tried to match some tougher NIST examples that were intentionally very badly scaled. numpy.mean at float64 had quite large errors, but a simple trick with two passes through the data managed to get very close to the certified NIST examples. my conclusion: don't use float32 unless you know you don't need any higher precision. even with float64 it is possible to run into extreme cases where you get numerical garbage or large precision losses. However, in the large majority of cases a boring fast "naive" implementation is enough. Also, whether we use mean, sum or dot in a calculation is an implementation detail, which in the case of dot doesn't have a dtype argument and always depends on the dtype of the arrays, AFAIK. Separating the accumulation dtype from the array dtype would require a lot of work except in the simplest cases, like those that only use sum and mean with specified dtype argument. trying out the original example: >>> X = np.ones((50000, 1024), np.float32) >>> X.mean() 1.0 >>> X.mean(dtype=np.float32) 1.0 >>> np.dot(X.ravel(), np.ones(X.ravel().shape) *1. / X.ravel().shape) 1.0000000002299174 >>> np.__version__ '1.5.1' Win32 Josef > > It just occurred to me that if the STSI folks (who count photons) > took the mean() or other such func of an image array from Hubble > sensors to find background value, they'd better always be using float64. > > - Ray > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
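For illustration, one common form of the "two passes through the data" trick mentioned above (the exact method used against the NIST examples is not stated in the thread, so this is only a sketch): compute a provisional mean, then correct it with the mean of the residuals, which are close to zero and therefore lose far less precision when summed.

    import numpy as np

    def corrected_mean(x, dtype=np.float32):
        # first pass: provisional mean with the requested accumulator dtype
        m0 = x.mean(dtype=dtype)
        # second pass: the residuals are small, so their sum is far less
        # affected by rounding; their mean corrects the provisional estimate
        return m0 + (x - m0).mean(dtype=dtype)

For badly scaled data (a large offset with small variation), the corrected value is generally at least as close to a float64 reference as a single float32 pass.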
URL: From hoogendoorn.eelco at gmail.com Fri Jul 25 17:51:40 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sat, 26 Jul 2014 00:51:40 +0300 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays Message-ID: <53d2d191.48c9c20a.026d.4a6a@mx.google.com> Ray: I'm not working with Hubble data, but yeah these are all issues I've run into with my terrabytes of microscopy data as well. Given that such raw data comes as uint16, its best to do your calculations as much as possible in good old ints. What you compute is what you get, no obscure shenanigans. It just occurred to me that pairwise summation will lead to highly branchy code, and you can forget about any vector extensions. Tradeoffs indeed. Any such hierarchical summation is probably best done by aggregating naively summed blocks. -----Original Message----- From: "RayS" Sent: ?25-?7-?2014 23:26 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for largefloat32arrays At 11:29 AM 7/25/2014, you wrote: >On Fri, Jul 25, 2014 at 5:56 PM, RayS wrote: > > The important point was that it would be best if all of the > methods affected > > by summing 32 bit floats with 32 bit accumulators had the same Notes as > > numpy.mean(). We went through a lot of code yesterday, assuming that any > > numpy or Scipy.stats functions that use accumulators suffer the same issue, > > whether noted or not, and found it true. > >Do you have a list of the functions that are affected? We only tested a few we used, but scipy.stats.nanmean, or any .*mean() numpy.sum, mean, average, std, var,... via something like: import numpy import scipy.stats print numpy.__version__ print scipy.__version__ onez = numpy.ones((2**25, 1), numpy.float32) step = 2**10 func = scipy.stats.nanmean for s in range(2**24-step, 2**25, step): if func(onez[:s+step])!=1.: print '\nbroke', s, func(onez[:s+step]) break else: print '\r',s, > That said, it does seem that np.mean could be implemented better than >it is, even given float32's inherent limitations. If anyone wants to >implement better algorithms for computing the mean, variance, sums, >etc., then we would love to add them to numpy. Others have pointed out the possible tradeoffs in summation algos, perhaps a method arg would be appropriate, "better" depending on your desire for speed vs. accuracy. It just occurred to me that if the STSI folks (who count photons) took the mean() or other such func of an image array from Hubble sensors to find background value, they'd better always be using float64. - Ray _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri Jul 25 17:57:51 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 25 Jul 2014 23:57:51 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <53d2d191.48c9c20a.026d.4a6a@mx.google.com> References: <53d2d191.48c9c20a.026d.4a6a@mx.google.com> Message-ID: <53D2D2DF.9000907@googlemail.com> On 25.07.2014 23:51, Eelco Hoogendoorn wrote: > Ray: I'm not working with Hubble data, but yeah these are all issues > I've run into with my terrabytes of microscopy data as well. Given that > such raw data comes as uint16, its best to do your calculations as much > as possible in good old ints. 
What you compute is what you get, no > obscure shenanigans. integers are dangerous too, they overflow quickly and signed overflow is even undefined in C the standard. > > It just occurred to me that pairwise summation will lead to highly > branchy code, and you can forget about any vector extensions. Tradeoffs > indeed. Any such hierarchical summation is probably best done by > aggregating naively summed blocks. pairwise summation is usually implemented with a naive sum cutoff large enough so the recursion does not matter much. In numpy 1.9 this cutoff is 128 elements, but the inner loop is unrolled 8 times which makes it effectively 16 elements. the unrolling factor of 8 was intentionally chosen to allow using AVX in the inner loop without changing the summation ordering, but last I tested actually using AVX here only gave mediocre speedups (10%-20% on an i5). > ------------------------------------------------------------------------ > From: RayS > Sent: ?25-?7-?2014 23:26 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] numpy.mean still broken for > largefloat32arrays > > At 11:29 AM 7/25/2014, you wrote: >>On Fri, Jul 25, 2014 at 5:56 PM, RayS wrote: >> > The important point was that it would be best if all of the >> methods affected >> > by summing 32 bit floats with 32 bit accumulators had the same Notes as >> > numpy.mean(). We went through a lot of code yesterday, assuming that any >> > numpy or Scipy.stats functions that use accumulators suffer the same > issue, >> > whether noted or not, and found it true. >> >>Do you have a list of the functions that are affected? > > We only tested a few we used, but > scipy.stats.nanmean, or any .*mean() > numpy.sum, mean, average, std, var,... > > via something like: > > import numpy > import scipy.stats > print numpy.__version__ > print scipy.__version__ > onez = numpy.ones((2**25, 1), numpy.float32) > step = 2**10 > func = scipy.stats.nanmean > for s in range(2**24-step, 2**25, step): > if func(onez[:s+step])!=1.: > print '\nbroke', s, func(onez[:s+step]) > break > else: > print '\r',s, > >> That said, it does seem that np.mean could be implemented better than >>it is, even given float32's inherent limitations. If anyone wants to >>implement better algorithms for computing the mean, variance, sums, >>etc., then we would love to add them to numpy. > > Others have pointed out the possible tradeoffs in summation algos, > perhaps a method arg would be appropriate, "better" depending on your > desire for speed vs. accuracy. > > It just occurred to me that if the STSI folks (who count photons) > took the mean() or other such func of an image array from Hubble > sensors to find background value, they'd better always be using float64. 
> > - Ray > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rays at blue-cove.com Fri Jul 25 19:51:03 2014 From: rays at blue-cove.com (RayS) Date: Fri, 25 Jul 2014 16:51:03 -0700 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d1d132.93d3b40a.2d6b.0714@mx.google.com> <201407251411.s6PEBrpw018675@blue-cove.com> <201407251656.s6PGui7J027752@blue-cove.com> <201407252026.s6PKQ2kZ016912@blue-cove.com> Message-ID: <201407252351.s6PNpBGH022808@blue-cove.com> At 02:36 PM 7/25/2014, you wrote: >But it doesn't compensate for users to be aware of the problems. I >think the docstring and the description of the dtype argument is pretty clear. Most of the docs for the affected functions do not have a Note with the same warning as mean() - Ray From larsmans at gmail.com Sat Jul 26 04:19:01 2014 From: larsmans at gmail.com (Lars Buitinck) Date: Sat, 26 Jul 2014 10:19:01 +0200 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? Message-ID: > Date: Fri, 25 Jul 2014 15:06:40 +0200 > From: Olivier Grisel > Subject: Re: [Numpy-discussion] change default integer from int32 to > int64 on win64? > To: Discussion of Numerical Python > Content-Type: text/plain; charset=UTF-8 > > The dtype returned by np.where looks right (int64): > >>>> import platform >>>> platform.architecture() > ('64bit', 'WindowsPE') >>>> import numpy as np >>>> np.__version__ > '1.9.0b1' >>>> a = np.zeros(10) >>>> np.where(a == 0) > (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64),) Strange. In [1] we had to cast the result of np.where because it was an array of long. I ran through the NumPy code, and I couldn't find the flaw, but neither could I find a point in the history where it was fixed. [1] https://github.com/scikit-learn/scikit-learn/commit/ebdeddbab1620c2473d04dc242d1e30684af9511 From robert.kern at gmail.com Sat Jul 26 04:49:54 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 26 Jul 2014 09:49:54 +0100 Subject: [Numpy-discussion] change default integer from int32 to int64 on win64? In-Reply-To: References: Message-ID: On Sat, Jul 26, 2014 at 9:19 AM, Lars Buitinck wrote: >> Date: Fri, 25 Jul 2014 15:06:40 +0200 >> From: Olivier Grisel >> Subject: Re: [Numpy-discussion] change default integer from int32 to >> int64 on win64? >> To: Discussion of Numerical Python >> Content-Type: text/plain; charset=UTF-8 >> >> The dtype returned by np.where looks right (int64): >> >>>>> import platform >>>>> platform.architecture() >> ('64bit', 'WindowsPE') >>>>> import numpy as np >>>>> np.__version__ >> '1.9.0b1' >>>>> a = np.zeros(10) >>>>> np.where(a == 0) >> (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64),) > > Strange. In [1] we had to cast the result of np.where because it was > an array of long. I ran through the NumPy code, and I couldn't find > the flaw, but neither could I find a point in the history where it was > fixed. 
> > [1] https://github.com/scikit-learn/scikit-learn/commit/ebdeddbab1620c2473d04dc242d1e30684af9511 As far as I can tell, it's been that way essentially forever, before numpy was numpy: https://github.com/numpy/numpy/commit/8cb36a62#diff-88aedadb94e0ead6b434d55f81668471R645 -- Robert Kern From hoogendoorn.eelco at gmail.com Sat Jul 26 05:05:08 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sat, 26 Jul 2014 12:05:08 +0300 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays Message-ID: <53d36f6e.a959b40a.7f23.738c@mx.google.com> Cool, sounds like great improvements. I can imagine that after some loop unrolling one becomes memory bound pretty soon. Is the summation guaranteed to traverse the data in its natural order? And do you happen to know what the rules for choosing accumulator dtypes are? -----Original Message----- From: "Julian Taylor" Sent: ?26-?7-?2014 00:58 To: "Discussion of Numerical Python" Subject: Re: [Numpy-discussion] numpy.mean still broken for largefloat32arrays On 25.07.2014 23:51, Eelco Hoogendoorn wrote: > Ray: I'm not working with Hubble data, but yeah these are all issues > I've run into with my terrabytes of microscopy data as well. Given that > such raw data comes as uint16, its best to do your calculations as much > as possible in good old ints. What you compute is what you get, no > obscure shenanigans. integers are dangerous too, they overflow quickly and signed overflow is even undefined in C the standard. > > It just occurred to me that pairwise summation will lead to highly > branchy code, and you can forget about any vector extensions. Tradeoffs > indeed. Any such hierarchical summation is probably best done by > aggregating naively summed blocks. pairwise summation is usually implemented with a naive sum cutoff large enough so the recursion does not matter much. In numpy 1.9 this cutoff is 128 elements, but the inner loop is unrolled 8 times which makes it effectively 16 elements. the unrolling factor of 8 was intentionally chosen to allow using AVX in the inner loop without changing the summation ordering, but last I tested actually using AVX here only gave mediocre speedups (10%-20% on an i5). > ------------------------------------------------------------------------ > From: RayS > Sent: ?25-?7-?2014 23:26 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] numpy.mean still broken for > largefloat32arrays > > At 11:29 AM 7/25/2014, you wrote: >>On Fri, Jul 25, 2014 at 5:56 PM, RayS wrote: >> > The important point was that it would be best if all of the >> methods affected >> > by summing 32 bit floats with 32 bit accumulators had the same Notes as >> > numpy.mean(). We went through a lot of code yesterday, assuming that any >> > numpy or Scipy.stats functions that use accumulators suffer the same > issue, >> > whether noted or not, and found it true. >> >>Do you have a list of the functions that are affected? > > We only tested a few we used, but > scipy.stats.nanmean, or any .*mean() > numpy.sum, mean, average, std, var,... > > via something like: > > import numpy > import scipy.stats > print numpy.__version__ > print scipy.__version__ > onez = numpy.ones((2**25, 1), numpy.float32) > step = 2**10 > func = scipy.stats.nanmean > for s in range(2**24-step, 2**25, step): > if func(onez[:s+step])!=1.: > print '\nbroke', s, func(onez[:s+step]) > break > else: > print '\r',s, > >> That said, it does seem that np.mean could be implemented better than >>it is, even given float32's inherent limitations. 
If anyone wants to >>implement better algorithms for computing the mean, variance, sums, >>etc., then we would love to add them to numpy. > > Others have pointed out the possible tradeoffs in summation algos, > perhaps a method arg would be appropriate, "better" depending on your > desire for speed vs. accuracy. > > It just occurred to me that if the STSI folks (who count photons) > took the mean() or other such func of an image array from Hubble > sensors to find background value, they'd better always be using float64. > > - Ray > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sat Jul 26 05:15:00 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 26 Jul 2014 11:15:00 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> Message-ID: <1406366100.30315.7.camel@sebastian-t440> On Fr, 2014-07-25 at 21:23 +0200, Eelco Hoogendoorn wrote: > It need not be exactly representable as such; take the mean of [1, 1 > +eps] for instance. Granted, there are at most two number in the range > of the original dtype which are closest to the true mean; but im not > sure that computing them exactly is a tractable problem for arbitrary > input. > > > This only requires log(N) space on the stack if properly implemented, > and is not platform dependent, nor should have any backward > compatibility issues that I can think of. But im not sure how easy it > would be to implement, given the current framework. The ability to > specify different algorithms per kwarg wouldn't be a bad idea either, > imo; or the ability to explicitly specify a separate output and > accumulator dtype. > > Well, you already can use dtype to cause an upcast of both arrays. However this currently will cause a buffered upcast to float64 for the float32 data. You could also add a d,f->d loop to avoid the cast, but then you would have to use the out argument currently. In any case, the real solution here is IMO what I think most of us already thought before would be good, and that is a keyword argument or maybe context (though I am unsure about details with threading, etc.) to chose more stable algorithms for such statistical functions. The pairwise summation that is in master now is very awesome, but it is not secure enough in the sense that a new user will have difficulty understanding when he can be sure it is used. 
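To make the pairwise scheme discussed here concrete, a rough pure-Python sketch (illustrative only; NumPy's C implementation differs in detail, e.g. the naive leaf loop is unrolled): blocks below a cutoff are summed naively, larger ranges are split in half and combined, so partial sums stay small and the rounding error grows roughly like O(log n) rather than O(n).

    import numpy as np

    def pairwise_sum(x, block=128):
        n = x.shape[0]
        if n <= block:
            # leaf: a plain small-block sum (an unrolled naive loop in the C code)
            return x.sum(dtype=x.dtype)
        half = n // 2
        return pairwise_sum(x[:half], block) + pairwise_sum(x[half:], block)

    a = np.ones(2**25, dtype=np.float32)
    print(pairwise_sum(a))                      # 33554432.0, exact for this input
    print(np.cumsum(a, dtype=np.float32)[-1])   # 16777216.0: a strictly sequential
                                                # float32 running sum stalls at 2**24

Every partial sum in the recursive construction stays exactly representable for the all-ones input, so the result is exact, while the sequential float32 running sum cannot get past 2**24.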
- Sebastian > > On Fri, Jul 25, 2014 at 8:00 PM, Alan G Isaac > wrote: > On 7/25/2014 1:40 PM, Eelco Hoogendoorn wrote: > > At the risk of repeating myself: explicit is better than > implicit From sturla.molden at gmail.com Sat Jul 26 06:39:18 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 26 Jul 2014 10:39:18 +0000 (UTC) Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> Message-ID: <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> Sebastian Berg wrote: > chose more stable algorithms for such statistical functions. The > pairwise summation that is in master now is very awesome, but it is not > secure enough in the sense that a new user will have difficulty > understanding when he can be sure it is used. Why is it not always used? From hoogendoorn.eelco at gmail.com Sat Jul 26 09:38:46 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sat, 26 Jul 2014 15:38:46 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> Message-ID: I was wondering the same thing. Are there any known tradeoffs to this method of reduction? On Sat, Jul 26, 2014 at 12:39 PM, Sturla Molden wrote: > Sebastian Berg wrote: > > > chose more stable algorithms for such statistical functions. The > > pairwise summation that is in master now is very awesome, but it is not > > secure enough in the sense that a new user will have difficulty > > understanding when he can be sure it is used. > > Why is it not always used? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Sat Jul 26 09:53:06 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 26 Jul 2014 15:53:06 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> Message-ID: <53D3B2C2.7090309@googlemail.com> On 26.07.2014 15:38, Eelco Hoogendoorn wrote: > > Why is it not always used? for 1d reduction the iterator blocks by 8192 elements even when no buffering is required. There is a TODO in the source to fix that by adding additional checks. Unfortunately nobody knows hat these additional tests would need to be and Mark Wiebe who wrote it did not reply to a ping yet. Also along the non-fast axes the iterator optimizes the reduction to remove the strided access, see: https://github.com/numpy/numpy/pull/4697#issuecomment-42752599 Instead of having a keyword argument to mean I would prefer a context manager that changes algorithms for different requirements. This would easily allow changing the accuracy and performance of third party functions using numpy without changing the third party library as long as they are using numpy as the base. E.g. 
with np.precisionstate(sum="kahan"): scipy.stats.nanmean(d) We also have case where numpy uses algorithms that are far more precise than most people needs them. E.g. np.hypot and the related complex absolute value and division. These are very slow with glibc as it provides 1ulp accuracy, this is hardly ever needed. Another case that could use dynamic changing is flushing subnormals to zero. But this api is like Nathaniels parameterizable dtypes just an idea floating in my head which needs proper design and implementation written down. The issue is as usual ENOTIME. From ben.root at ou.edu Sat Jul 26 09:57:05 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sat, 26 Jul 2014 09:57:05 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <53D3B2C2.7090309@googlemail.com> References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> Message-ID: I could get behind the context manager approach. It would help keep backwards compatibility, while providing a very easy (and clean) way of consistently using the same reduction operation. Adding kwargs is just a road to hell. Cheers! Ben Root On Sat, Jul 26, 2014 at 9:53 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 26.07.2014 15:38, Eelco Hoogendoorn wrote: > > > > Why is it not always used? > > for 1d reduction the iterator blocks by 8192 elements even when no > buffering is required. There is a TODO in the source to fix that by > adding additional checks. Unfortunately nobody knows hat these > additional tests would need to be and Mark Wiebe who wrote it did not > reply to a ping yet. > > Also along the non-fast axes the iterator optimizes the reduction to > remove the strided access, see: > https://github.com/numpy/numpy/pull/4697#issuecomment-42752599 > > > Instead of having a keyword argument to mean I would prefer a context > manager that changes algorithms for different requirements. > This would easily allow changing the accuracy and performance of third > party functions using numpy without changing the third party library as > long as they are using numpy as the base. > E.g. > with np.precisionstate(sum="kahan"): > scipy.stats.nanmean(d) > > We also have case where numpy uses algorithms that are far more precise > than most people needs them. E.g. np.hypot and the related complex > absolute value and division. > These are very slow with glibc as it provides 1ulp accuracy, this is > hardly ever needed. > Another case that could use dynamic changing is flushing subnormals to > zero. > > But this api is like Nathaniels parameterizable dtypes just an idea > floating in my head which needs proper design and implementation written > down. The issue is as usual ENOTIME. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
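For reference, a small sketch of the compensated (Kahan) summation that the sum="kahan" setting above alludes to; np.precisionstate is only a hypothetical name in the quoted message, and this pure-Python version is illustrative, not how such an option would actually be implemented.

    import numpy as np

    def kahan_sum(x):
        s = np.float32(0.0)
        c = np.float32(0.0)      # running compensation for lost low-order bits
        for v in x:
            y = v - c
            t = s + y            # the low-order bits of y are lost in this addition...
            c = (t - s) - y      # ...and recovered here for the next step
            s = t
        return s

    x = np.array([2.0**24, 1, 1, 1, 1], dtype=np.float32)
    naive = np.float32(0.0)
    for v in x:
        naive = naive + v
    print(naive)                 # 16777216.0: the four 1.0s are rounded away
    print(kahan_sum(x))          # 16777220.0: the compensation recovers them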
URL: From sebastian at sipsolutions.net Sat Jul 26 09:58:51 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 26 Jul 2014 15:58:51 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> Message-ID: <1406383131.30315.9.camel@sebastian-t440> On Sa, 2014-07-26 at 15:38 +0200, Eelco Hoogendoorn wrote: > I was wondering the same thing. Are there any known tradeoffs to this > method of reduction? > Yes, it is much more complicated and incompatible with naive ufuncs if you want your memory access to be optimized. And optimizing that is very much worth it speed wise... - Sebastian > > On Sat, Jul 26, 2014 at 12:39 PM, Sturla Molden > wrote: > Sebastian Berg wrote: > > > chose more stable algorithms for such statistical functions. > The > > pairwise summation that is in master now is very awesome, > but it is not > > secure enough in the sense that a new user will have > difficulty > > understanding when he can be sure it is used. > > > Why is it not always used? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hoogendoorn.eelco at gmail.com Sat Jul 26 10:10:50 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sat, 26 Jul 2014 16:10:50 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <53D3B2C2.7090309@googlemail.com> References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> Message-ID: A context manager makes sense. I very much appreciate the time constraints and the effort put in this far, but if we can not make something work uniformly, I wonder if we should include it in the master at all. I don't have a problem with customizing algorithms where fp accuracy demands it; I have more of a problem with hard to predict behavior. If np.ones(bigN).sum() gives different results than np.ones((bigN,2)).sum(0), aside from the obvious differences, that would be one hard to catch source of bugs. Wouldn't per-axis reduction, as a limited form of nested reduction, provide most of the benefits, without any of the drawbacks? On Sat, Jul 26, 2014 at 3:53 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 26.07.2014 15:38, Eelco Hoogendoorn wrote: > > > > Why is it not always used? > > for 1d reduction the iterator blocks by 8192 elements even when no > buffering is required. There is a TODO in the source to fix that by > adding additional checks. Unfortunately nobody knows hat these > additional tests would need to be and Mark Wiebe who wrote it did not > reply to a ping yet. > > Also along the non-fast axes the iterator optimizes the reduction to > remove the strided access, see: > https://github.com/numpy/numpy/pull/4697#issuecomment-42752599 > > > Instead of having a keyword argument to mean I would prefer a context > manager that changes algorithms for different requirements. 
> This would easily allow changing the accuracy and performance of third > party functions using numpy without changing the third party library as > long as they are using numpy as the base. > E.g. > with np.precisionstate(sum="kahan"): > scipy.stats.nanmean(d) > > We also have case where numpy uses algorithms that are far more precise > than most people needs them. E.g. np.hypot and the related complex > absolute value and division. > These are very slow with glibc as it provides 1ulp accuracy, this is > hardly ever needed. > Another case that could use dynamic changing is flushing subnormals to > zero. > > But this api is like Nathaniels parameterizable dtypes just an idea > floating in my head which needs proper design and implementation written > down. The issue is as usual ENOTIME. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Jul 26 11:11:09 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 26 Jul 2014 15:11:09 +0000 (UTC) Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <1406383131.30315.9.camel@sebastian-t440> Message-ID: <555575025428079997.909854sturla.molden-gmail.com@news.gmane.org> Sebastian Berg wrote: > Yes, it is much more complicated and incompatible with naive ufuncs if > you want your memory access to be optimized. And optimizing that is very > much worth it speed wise... Why? Couldn't we just copy the data chunk-wise to a temporary buffer of say 2**13 numbers and then reduce that? I don't see why we need another iterator for that. Sturla From sturla.molden at gmail.com Sat Jul 26 12:34:10 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 26 Jul 2014 16:34:10 +0000 (UTC) Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <1406383131.30315.9.camel@sebastian-t440> <555575025428079997.909854sturla.molden-gmail.com@news.gmane.org> Message-ID: <2097579651428085154.727016sturla.molden-gmail.com@news.gmane.org> Sturla Molden wrote: > Sebastian Berg wrote: > >> Yes, it is much more complicated and incompatible with naive ufuncs if >> you want your memory access to be optimized. And optimizing that is very >> much worth it speed wise... > > Why? Couldn't we just copy the data chunk-wise to a temporary buffer of say > 2**13 numbers and then reduce that? I don't see why we need another > iterator for that. I am sorry if this is a stupid suggestion. My knowledge of how NumPy ufuncs works could have been better. 
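One way to read the chunk-wise suggestion, as a rough illustration only (2**13 is just the buffer size mentioned; the NumPy iterator does not actually work like this): sum each block in the array's own dtype, then reduce the much smaller array of partial sums, so no single accumulation runs over millions of terms.

    import numpy as np

    def chunked_sum(x, chunk=2**13):
        # per-chunk partial sums; each chunk is short enough to sum accurately
        partials = np.array([x[i:i + chunk].sum(dtype=x.dtype)
                             for i in range(0, x.size, chunk)], dtype=x.dtype)
        # second, much shorter reduction over the partial sums
        return partials.sum(dtype=x.dtype)

    a = np.ones(2**25, dtype=np.float32)
    print(chunked_sum(a))        # 33554432.0 for this input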
Sturla From josef.pktd at gmail.com Sat Jul 26 14:29:11 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 26 Jul 2014 14:29:11 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> Message-ID: On Sat, Jul 26, 2014 at 9:57 AM, Benjamin Root wrote: > I could get behind the context manager approach. It would help keep > backwards compatibility, while providing a very easy (and clean) way of > consistently using the same reduction operation. Adding kwargs is just a > road to hell. > Wouldn't a context manager require a global state that changes how everything is calculated ? Josef > > Cheers! > Ben Root > > > On Sat, Jul 26, 2014 at 9:53 AM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> On 26.07.2014 15:38, Eelco Hoogendoorn wrote: >> > >> > Why is it not always used? >> >> for 1d reduction the iterator blocks by 8192 elements even when no >> buffering is required. There is a TODO in the source to fix that by >> adding additional checks. Unfortunately nobody knows hat these >> additional tests would need to be and Mark Wiebe who wrote it did not >> reply to a ping yet. >> >> Also along the non-fast axes the iterator optimizes the reduction to >> remove the strided access, see: >> https://github.com/numpy/numpy/pull/4697#issuecomment-42752599 >> >> >> Instead of having a keyword argument to mean I would prefer a context >> manager that changes algorithms for different requirements. >> This would easily allow changing the accuracy and performance of third >> party functions using numpy without changing the third party library as >> long as they are using numpy as the base. >> E.g. >> with np.precisionstate(sum="kahan"): >> scipy.stats.nanmean(d) >> >> We also have case where numpy uses algorithms that are far more precise >> than most people needs them. E.g. np.hypot and the related complex >> absolute value and division. >> These are very slow with glibc as it provides 1ulp accuracy, this is >> hardly ever needed. >> Another case that could use dynamic changing is flushing subnormals to >> zero. >> >> But this api is like Nathaniels parameterizable dtypes just an idea >> floating in my head which needs proper design and implementation written >> down. The issue is as usual ENOTIME. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sat Jul 26 14:44:08 2014 From: ben.root at ou.edu (Benjamin Root) Date: Sat, 26 Jul 2014 14:44:08 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> Message-ID: That is one way of doing it, and probably the cleanest way. Or else you have to pass in the context object everywhere anyway. 
But I am not so concerned about that (we do that for other things as well). Bigger concerns would be nested contexts. For example, what if one of the scikit functions use such a context to explicitly state that they need a particular reduction algorithm, but the call to that scikit function is buried under a few layers of user functions, at the top of which has a context manager that states a different reduction op. Whose context wins? Naively, the scikit's context wins (because that's how contexts work). But, does that break with the very broad design goal here? To let the user specify the reduction kernel? Practically speaking, we could see users naively puting in context managers all over the place in their libraries, possibly choosing incorrect algorithms (I am serious here, how often have we seen stackoverflow instructions just blindly parrot certain arguments "just because")? This gives the user no real mechanism to override the library, largely defeating the purpose. My other concern would be with multi-threaded code (which is where a global state would be bad). Ben On Sat, Jul 26, 2014 at 2:29 PM, wrote: > > > > On Sat, Jul 26, 2014 at 9:57 AM, Benjamin Root wrote: > >> I could get behind the context manager approach. It would help keep >> backwards compatibility, while providing a very easy (and clean) way of >> consistently using the same reduction operation. Adding kwargs is just a >> road to hell. >> > > Wouldn't a context manager require a global state that changes how > everything is calculated ? > > Josef > > > >> >> Cheers! >> Ben Root >> >> >> On Sat, Jul 26, 2014 at 9:53 AM, Julian Taylor < >> jtaylor.debian at googlemail.com> wrote: >> >>> On 26.07.2014 15:38, Eelco Hoogendoorn wrote: >>> > >>> > Why is it not always used? >>> >>> for 1d reduction the iterator blocks by 8192 elements even when no >>> buffering is required. There is a TODO in the source to fix that by >>> adding additional checks. Unfortunately nobody knows hat these >>> additional tests would need to be and Mark Wiebe who wrote it did not >>> reply to a ping yet. >>> >>> Also along the non-fast axes the iterator optimizes the reduction to >>> remove the strided access, see: >>> https://github.com/numpy/numpy/pull/4697#issuecomment-42752599 >>> >>> >>> Instead of having a keyword argument to mean I would prefer a context >>> manager that changes algorithms for different requirements. >>> This would easily allow changing the accuracy and performance of third >>> party functions using numpy without changing the third party library as >>> long as they are using numpy as the base. >>> E.g. >>> with np.precisionstate(sum="kahan"): >>> scipy.stats.nanmean(d) >>> >>> We also have case where numpy uses algorithms that are far more precise >>> than most people needs them. E.g. np.hypot and the related complex >>> absolute value and division. >>> These are very slow with glibc as it provides 1ulp accuracy, this is >>> hardly ever needed. >>> Another case that could use dynamic changing is flushing subnormals to >>> zero. >>> >>> But this api is like Nathaniels parameterizable dtypes just an idea >>> floating in my head which needs proper design and implementation written >>> down. The issue is as usual ENOTIME. 
>>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Jul 26 15:00:12 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 26 Jul 2014 15:00:12 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> Message-ID: On Sat, Jul 26, 2014 at 2:44 PM, Benjamin Root wrote: > That is one way of doing it, and probably the cleanest way. Or else you > have to pass in the context object everywhere anyway. But I am not so > concerned about that (we do that for other things as well). Bigger concerns > would be nested contexts. For example, what if one of the scikit functions > use such a context to explicitly state that they need a particular > reduction algorithm, but the call to that scikit function is buried under a > few layers of user functions, at the top of which has a context manager > that states a different reduction op. > > Whose context wins? Naively, the scikit's context wins (because that's how > contexts work). But, does that break with the very broad design goal here? > To let the user specify the reduction kernel? Practically speaking, we > could see users naively puting in context managers all over the place in > their libraries, possibly choosing incorrect algorithms (I am serious here, > how often have we seen stackoverflow instructions just blindly parrot > certain arguments "just because")? This gives the user no real mechanism to > override the library, largely defeating the purpose. > > My other concern would be with multi-threaded code (which is where a > global state would be bad). > statsmodels still has avoided anything that smells like a global state that changes calculation. (We never even implemented different global warning levels.) https://groups.google.com/d/msg/pystatsmodels/-J9WXKLjyH4/5xvKu9_mbbEJ Josef There be Dragons. > > Ben > > > > On Sat, Jul 26, 2014 at 2:29 PM, wrote: > >> >> >> >> On Sat, Jul 26, 2014 at 9:57 AM, Benjamin Root wrote: >> >>> I could get behind the context manager approach. It would help keep >>> backwards compatibility, while providing a very easy (and clean) way of >>> consistently using the same reduction operation. Adding kwargs is just a >>> road to hell. >>> >> >> Wouldn't a context manager require a global state that changes how >> everything is calculated ? >> >> Josef >> >> >> >>> >>> Cheers! >>> Ben Root >>> >>> >>> On Sat, Jul 26, 2014 at 9:53 AM, Julian Taylor < >>> jtaylor.debian at googlemail.com> wrote: >>> >>>> On 26.07.2014 15:38, Eelco Hoogendoorn wrote: >>>> > >>>> > Why is it not always used? >>>> >>>> for 1d reduction the iterator blocks by 8192 elements even when no >>>> buffering is required. 
There is a TODO in the source to fix that by >>>> adding additional checks. Unfortunately nobody knows hat these >>>> additional tests would need to be and Mark Wiebe who wrote it did not >>>> reply to a ping yet. >>>> >>>> Also along the non-fast axes the iterator optimizes the reduction to >>>> remove the strided access, see: >>>> https://github.com/numpy/numpy/pull/4697#issuecomment-42752599 >>>> >>>> >>>> Instead of having a keyword argument to mean I would prefer a context >>>> manager that changes algorithms for different requirements. >>>> This would easily allow changing the accuracy and performance of third >>>> party functions using numpy without changing the third party library as >>>> long as they are using numpy as the base. >>>> E.g. >>>> with np.precisionstate(sum="kahan"): >>>> scipy.stats.nanmean(d) >>>> >>>> We also have case where numpy uses algorithms that are far more precise >>>> than most people needs them. E.g. np.hypot and the related complex >>>> absolute value and division. >>>> These are very slow with glibc as it provides 1ulp accuracy, this is >>>> hardly ever needed. >>>> Another case that could use dynamic changing is flushing subnormals to >>>> zero. >>>> >>>> But this api is like Nathaniels parameterizable dtypes just an idea >>>> floating in my head which needs proper design and implementation written >>>> down. The issue is as usual ENOTIME. >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Jul 26 15:04:10 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 26 Jul 2014 19:04:10 +0000 (UTC) Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays References: <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> Message-ID: <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> Benjamin Root wrote: > My other concern would be with multi-threaded code (which is where a global > state would be bad). It would presumably require a global threading.RLock for protecting the global state. 
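A rough sketch of what such an RLock-protected global state could look like (illustrative only; the class name and the state contents are made up):

import threading

_lock = threading.RLock()
_state = {"sum": "naive"}            # one state shared by all threads

class precisionstate(object):        # hypothetical, for illustration only
    def __init__(self, **kwargs):
        self._new = kwargs
    def __enter__(self):
        _lock.acquire()              # held for the whole with-block
        self._saved = dict(_state)
        _state.update(self._new)
        return self
    def __exit__(self, *exc_info):
        _state.clear()
        _state.update(self._saved)
        _lock.release()

Because the lock is held for the duration of the with-block, any other thread that wants to enter its own precisionstate block has to wait, which is the scalability concern raised below.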
Sturla From hoogendoorn.eelco at gmail.com Sat Jul 26 15:19:59 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sat, 26 Jul 2014 21:19:59 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <2097579651428085154.727016sturla.molden-gmail.com@news.gmane.org> References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <1406383131.30315.9.camel@sebastian-t440> <555575025428079997.909854sturla.molden-gmail.com@news.gmane.org> <2097579651428085154.727016sturla.molden-gmail.com@news.gmane.org> Message-ID: Perhaps I in turn am missing something; but I would suppose that any algorithm that requires multiple passes over the data is off the table? Perhaps I am being a little old fashioned and performance oriented here, but to make the ultra-majority of use cases suffer a factor two performance penalty for an odd use case which already has a plethora of fine and dandy solutions? Id vote against, fwiw... On Sat, Jul 26, 2014 at 6:34 PM, Sturla Molden wrote: > Sturla Molden wrote: > > Sebastian Berg wrote: > > > >> Yes, it is much more complicated and incompatible with naive ufuncs if > >> you want your memory access to be optimized. And optimizing that is very > >> much worth it speed wise... > > > > Why? Couldn't we just copy the data chunk-wise to a temporary buffer of > say > > 2**13 numbers and then reduce that? I don't see why we need another > > iterator for that. > > I am sorry if this is a stupid suggestion. My knowledge of how NumPy ufuncs > works could have been better. > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Jul 26 15:19:52 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 26 Jul 2014 19:19:52 +0000 (UTC) Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> Message-ID: <1227118823428094389.828592sturla.molden-gmail.com@news.gmane.org> wrote: > statsmodels still has avoided anything that smells like a global state that > changes calculation. If global states are stored in a stack, as in OpenGL, it is not so bad. A context manager could push a state in __enter__ and pop the state in __exit__. This is actually how I write OpenGL code in Python and Cython: pairs of glBegin/glEnd, glPushMatrix/glPopMatrix, and glPushAttrib/glPopAttrib nicely fits with Python context managers. However, the bigger issue is multithreading scalability. You need to protect the global state with a recursive lock, and it might not scale like you want. A thread might call a lengthy computation that releases the GIL, but still hold the rlock that protects the state. All your hopes for seing more then one core saturated will go down the drain. It is even bad for i/o bound code, e.g. on-line signal processing: Data might be ready for processing in one thread, but the global state is locked by an idle thread waiting for data. 
Sturla From sylvain.corlay at gmail.com Sat Jul 26 15:30:06 2014 From: sylvain.corlay at gmail.com (Sylvain Corlay) Date: Sat, 26 Jul 2014 15:30:06 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <53d296aa.d1c6b40a.6e8d.ffffe5bf@mx.google.com> <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <1406383131.30315.9.camel@sebastian-t440> <555575025428079997.909854sturla.molden-gmail.com@news.gmane.org> <2097579651428085154.727016sturla.molden-gmail.com@news.gmane.org> Message-ID: I completely agree with Eelco. I expect numpy.mean to do something simple and straightforward. If the naive method is not well suited for my data, I can deal with it and have my own ad hoc method. On Sat, Jul 26, 2014 at 3:19 PM, Eelco Hoogendoorn wrote: > Perhaps I in turn am missing something; but I would suppose that any > algorithm that requires multiple passes over the data is off the table? > Perhaps I am being a little old fashioned and performance oriented here, but > to make the ultra-majority of use cases suffer a factor two performance > penalty for an odd use case which already has a plethora of fine and dandy > solutions? Id vote against, fwiw... > > > On Sat, Jul 26, 2014 at 6:34 PM, Sturla Molden > wrote: >> >> Sturla Molden wrote: >> > Sebastian Berg wrote: >> > >> >> Yes, it is much more complicated and incompatible with naive ufuncs if >> >> you want your memory access to be optimized. And optimizing that is >> >> very >> >> much worth it speed wise... >> > >> > Why? Couldn't we just copy the data chunk-wise to a temporary buffer of >> > say >> > 2**13 numbers and then reduce that? I don't see why we need another >> > iterator for that. >> >> I am sorry if this is a stupid suggestion. My knowledge of how NumPy >> ufuncs >> works could have been better. >> >> Sturla >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Sat Jul 26 16:06:21 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 26 Jul 2014 21:06:21 +0100 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> References: <53D29B2F.9060005@gmail.com> <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> Message-ID: On Sat, Jul 26, 2014 at 8:04 PM, Sturla Molden wrote: > Benjamin Root wrote: > >> My other concern would be with multi-threaded code (which is where a global >> state would be bad). > > It would presumably require a global threading.RLock for protecting the > global state. We would use thread-local storage like we currently do with the np.errstate() context manager. Each thread will have its own "global" state. 
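A sketch of the thread-local variant Robert describes (again hypothetical; only np.errstate itself is an existing API):

import threading

_local = threading.local()                  # each thread gets its own state

def _get_state():
    if not hasattr(_local, "state"):
        _local.state = {"sum": "naive"}     # per-thread default
    return _local.state

class precisionstate(object):               # hypothetical, for illustration only
    def __init__(self, **kwargs):
        self._new = kwargs
    def __enter__(self):
        state = _get_state()
        self._saved = dict(state)
        state.update(self._new)
        return self
    def __exit__(self, *exc_info):
        state = _get_state()
        state.clear()
        state.update(self._saved)

No lock is needed, and nothing one thread changes inside its with-block is visible to any other thread.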
-- Robert Kern From sturla.molden at gmail.com Sat Jul 26 17:19:06 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 26 Jul 2014 21:19:06 +0000 (UTC) Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> Message-ID: <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> Robert Kern wrote: >> It would presumably require a global threading.RLock for protecting the >> global state. > > We would use thread-local storage like we currently do with the > np.errstate() context manager. Each thread will have its own "global" > state. That sounds like a better plan, yes :) Sturla From gabriel.altay at gmail.com Sat Jul 26 17:32:11 2014 From: gabriel.altay at gmail.com (Gabriel Altay) Date: Sat, 26 Jul 2014 17:32:11 -0400 Subject: [Numpy-discussion] ImportError while building Numpy on Ubuntu 14.04 Message-ID: I'm attempting to build Numpy from source in order to do some development. I've cloned the github repo and installed the pre-reqs for Ubuntu http://www.scipy.org/scipylib/building/linux.html#debian-ubuntu However, when I do >>> python setup.py build I get Running from numpy source directory. Traceback (most recent call last): File "setup.py", line 251, in setup_package() File "setup.py", line 235, in setup_package from numpy.distutils.core import setup File "/home/galtay/github/numpy-env/numpy/numpy/distutils/__init__.py", line 37, in from numpy.testing import Tester File "/home/galtay/github/numpy-env/numpy/numpy/testing/__init__.py", line 13, in from .utils import * File "/home/galtay/github/numpy-env/numpy/numpy/testing/utils.py", line 17, in from numpy.core import float32, empty, arange, array_repr, ndarray File "/home/galtay/github/numpy-env/numpy/numpy/core/__init__.py", line 6, in from . import multiarray ImportError: cannot import name multiarray Any hints? I'm running Continuum Analytics Anaconda Python distribution thanks, -Gabriel -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Jul 27 02:04:50 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 27 Jul 2014 02:04:50 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> Message-ID: On Sat, Jul 26, 2014 at 5:19 PM, Sturla Molden wrote: > Robert Kern wrote: > > >> It would presumably require a global threading.RLock for protecting the > >> global state. > > > > We would use thread-local storage like we currently do with the > > np.errstate() context manager. Each thread will have its own "global" > > state. > > That sounds like a better plan, yes :) > Any "global" state that changes how things are calculated will have unpredictable results. And I don't trust python users to be disciplined enough. issue: Why do I get different results after `import this_funy_package`? 
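The worry above, stated in terms of an API that exists today: a module-level setter can be left changed by an import, while the context-manager form cannot leak past its block.

import numpy as np

# "Fire and forget": a line like this buried in some package's __init__
# silently changes behaviour for everything that runs after the import.
old = np.seterr(all='ignore')
np.seterr(**old)                 # has to be undone by hand

# The context-manager form restores the previous settings automatically,
# even if an exception is raised inside the block:
with np.errstate(all='ignore'):
    pass                         # changed only in here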
Josef > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jul 27 04:24:58 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 27 Jul 2014 09:24:58 +0100 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> Message-ID: On Sun, Jul 27, 2014 at 7:04 AM, wrote: > > On Sat, Jul 26, 2014 at 5:19 PM, Sturla Molden > wrote: >> >> Robert Kern wrote: >> >> >> It would presumably require a global threading.RLock for protecting the >> >> global state. >> > >> > We would use thread-local storage like we currently do with the >> > np.errstate() context manager. Each thread will have its own "global" >> > state. >> >> That sounds like a better plan, yes :) > > Any "global" state that changes how things are calculated will have > unpredictable results. > > And I don't trust python users to be disciplined enough. > > issue: Why do I get different results after `import this_funy_package`? That's why the suggestion is that it be controlled by a context manager. The state change will only be limited to the `with:` statement. You would not be able to "fire-and-forget" the state change. -- Robert Kern From josef.pktd at gmail.com Sun Jul 27 04:56:32 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 27 Jul 2014 04:56:32 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> Message-ID: On Sun, Jul 27, 2014 at 4:24 AM, Robert Kern wrote: > On Sun, Jul 27, 2014 at 7:04 AM, wrote: > > > > On Sat, Jul 26, 2014 at 5:19 PM, Sturla Molden > > wrote: > >> > >> Robert Kern wrote: > >> > >> >> It would presumably require a global threading.RLock for protecting > the > >> >> global state. > >> > > >> > We would use thread-local storage like we currently do with the > >> > np.errstate() context manager. Each thread will have its own "global" > >> > state. > >> > >> That sounds like a better plan, yes :) > > > > Any "global" state that changes how things are calculated will have > > unpredictable results. > > > > And I don't trust python users to be disciplined enough. > > > > issue: Why do I get different results after `import this_funy_package`? > > That's why the suggestion is that it be controlled by a context > manager. The state change will only be limited to the `with:` > statement. You would not be able to "fire-and-forget" the state > change. > Can you implement a context manager without introducing a global variable that everyone could set, and forget? 
Josef > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jul 27 05:04:41 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 27 Jul 2014 10:04:41 +0100 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> Message-ID: On Sun, Jul 27, 2014 at 9:56 AM, wrote: > > On Sun, Jul 27, 2014 at 4:24 AM, Robert Kern wrote: >> >> On Sun, Jul 27, 2014 at 7:04 AM, wrote: >> > >> > On Sat, Jul 26, 2014 at 5:19 PM, Sturla Molden >> > wrote: >> >> >> >> Robert Kern wrote: >> >> >> >> >> It would presumably require a global threading.RLock for protecting >> >> >> the >> >> >> global state. >> >> > >> >> > We would use thread-local storage like we currently do with the >> >> > np.errstate() context manager. Each thread will have its own "global" >> >> > state. >> >> >> >> That sounds like a better plan, yes :) >> > >> > Any "global" state that changes how things are calculated will have >> > unpredictable results. >> > >> > And I don't trust python users to be disciplined enough. >> > >> > issue: Why do I get different results after `import this_funy_package`? >> >> That's why the suggestion is that it be controlled by a context >> manager. The state change will only be limited to the `with:` >> statement. You would not be able to "fire-and-forget" the state >> change. > > Can you implement a context manager without introducing a global variable > that everyone could set, and forget? Oh sure, with enough effort and digging, someone could search through the C source, find the hidden, private API that does this, and deliberately mess with it. But they can already do that with all of the other necessarily-global state; every module object is a glorified global variable that can be mutated. You won't be able to do it by accident or omission or a lack of discipline. It's not a tempting public target like, say, np.seterr(). -- Robert Kern From rays at blue-cove.com Sun Jul 27 10:16:47 2014 From: rays at blue-cove.com (RayS) Date: Sun, 27 Jul 2014 07:16:47 -0700 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> Message-ID: <201407271416.s6REGn0J031512@blue-cove.com> At 02:04 AM 7/27/2014, you wrote: >You won't be able to do it by accident or omission or a lack of >discipline. It's not a tempting public target like, say, np.seterr(). BTW, why not throw an overflow error in the large float32 sum() case? Is it too expensive to check while accumulating? 
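The accumulation behaviour behind that question can be seen directly at the float32 rounding boundary of 2**24:

>>> import numpy as np
>>> x = np.float32(2**24)        # 16777216
>>> x + np.float32(1) == x
True

Once a running float32 total reaches 2**24, adding further ones no longer changes it; the total never overflows to inf, it simply stops growing.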
- Ray From njs at pobox.com Sun Jul 27 10:44:46 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 27 Jul 2014 15:44:46 +0100 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <201407271416.s6REGn0J031512@blue-cove.com> References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> <201407271416.s6REGn0J031512@blue-cove.com> Message-ID: On Sun, Jul 27, 2014 at 3:16 PM, RayS wrote: > At 02:04 AM 7/27/2014, you wrote: > >>You won't be able to do it by accident or omission or a lack of >>discipline. It's not a tempting public target like, say, np.seterr(). > > BTW, why not throw an overflow error in the large float32 sum() case? > Is it too expensive to check while accumulating? In the example that started this thread, there's no overflow (in the technical sense) occurring. Overflow for ints means wrapping around, and for floats it means exceeding the maximum possible value and overflowing to infinity. The problem here is that when summing up the values, the sum gets large enough that after rounding, x + 1 = x and the sum stops increasing. (For float32's all this requires is x > 16777216.) So while the final error is massive, the mechanism is just ordinary floating-point round-off error. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From rays at blue-cove.com Sun Jul 27 13:02:16 2014 From: rays at blue-cove.com (RayS) Date: Sun, 27 Jul 2014 10:02:16 -0700 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <1406366100.30315.7.camel@sebastian-t440> <956345389428063831.570461sturla.molden-gmail.com@news.gmane.org> <53D3B2C2.7090309@googlemail.com> <1676055168428094052.869608sturla.molden-gmail.com@news.gmane.org> <686489827428101123.449928sturla.molden-gmail.com@news.gmane.org> <201407271416.s6REGn0J031512@blue-cove.com> Message-ID: <201407271702.s6RH2I7K000353@blue-cove.com> Thanks for the clarification, but how is the numpy rounding directed? Round to nearest, ties to even? http://en.wikipedia.org/wiki/IEEE_floating_point#Rounding_rules Just curious, as I couldn't find a reference. - Ray At 07:44 AM 7/27/2014, you wrote: >On Sun, Jul 27, 2014 at 3:16 PM, RayS wrote: > > At 02:04 AM 7/27/2014, you wrote: > > > >>You won't be able to do it by accident or omission or a lack of > >>discipline. It's not a tempting public target like, say, np.seterr(). > > > > BTW, why not throw an overflow error in the large float32 sum() case? > > Is it too expensive to check while accumulating? > >In the example that started this thread, there's no overflow (in the >technical sense) occurring. Overflow for ints means wrapping around, >and for floats it means exceeding the maximum possible value and >overflowing to infinity. > >The problem here is that when summing up the values, the sum gets >large enough that after rounding, x + 1 = x and the sum stops >increasing. (For float32's all this requires is x > 16777216.) So >while the final error is massive, the mechanism is just ordinary >floating-point round-off error. > >-n > >-- >Nathaniel J. 
Smith >Postdoctoral researcher - Informatics - University of Edinburgh >http://vorpus.org >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion From sturla.molden at gmail.com Sun Jul 27 14:26:37 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 27 Jul 2014 18:26:37 +0000 (UTC) Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays References: <201407271416.s6REGn0J031512@blue-cove.com> Message-ID: <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > The problem here is that when summing up the values, the sum gets > large enough that after rounding, x + 1 = x and the sum stops > increasing. Interesting. That explains why the divide-and-conquer reduction is much more robust. Thanks :) Sturla From rmcgibbo at gmail.com Sun Jul 27 22:32:54 2014 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Sun, 27 Jul 2014 19:32:54 -0700 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: I forked Olivier's example project to use the same infrastructure for building conda binaries and deploying them to binstar, which might also be useful for some projects. https://github.com/rmcgibbo/python-appveyor-conda-example -Robert On Wed, Jul 9, 2014 at 3:53 PM, Robert McGibbon wrote: > This is an awesome resource for tons of projects. > > Thanks Olivier! > > -Robert > > > On Wed, Jul 9, 2014 at 7:00 AM, Olivier Grisel > wrote: > >> Feodor updated the AppVeyor nodes to have the Windows SDK matching >> MSVC 2008 Express for Python 2. I have updated my sample scripts and >> we now have a working example of a free CI system for: >> >> Python 2 and 3 both for 32 and 64 bit architectures. >> >> https://github.com/ogrisel/python-appveyor-demo >> >> Best, >> >> -- >> Olivier >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Mon Jul 28 08:37:13 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Mon, 28 Jul 2014 14:37:13 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> Message-ID: To rephrase my most pressing question: may np.ones((N,2)).mean(0) and np.ones((2,N)).mean(1) produce different results with the implementation in the current master? If so, I think that would be very much regrettable; and if this is a minority opinion, I do hope that at least this gets documented in a most explicit manner. On Sun, Jul 27, 2014 at 8:26 PM, Sturla Molden wrote: > Nathaniel Smith wrote: > > > The problem here is that when summing up the values, the sum gets > > large enough that after rounding, x + 1 = x and the sum stops > > increasing. > > Interesting. That explains why the divide-and-conquer reduction is much > more robust. 
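A small pure-Python illustration of why the divide-and-conquer (pairwise) idea is more robust. float16 is used only so the effect shows up with few elements, and this is a sketch of the idea, not NumPy's actual implementation:

import numpy as np

def pairwise_sum(x):
    # divide and conquer: rounding error grows roughly like log(n)
    # instead of n for a naive left-to-right accumulation
    if len(x) <= 8:
        s = np.float16(0)
        for v in x:
            s += v
        return s
    mid = len(x) // 2
    return pairwise_sum(x[:mid]) + pairwise_sum(x[mid:])

data = np.ones(20000, dtype=np.float16)

naive = np.float16(0)
for v in data:
    naive += v

print(naive)               # stalls at 2048.0, the float16 version of the problem
print(pairwise_sum(data))  # 20000.0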
> > Thanks :) > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Jul 28 08:46:35 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 28 Jul 2014 14:46:35 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> Message-ID: <1406551595.11957.4.camel@sebastian-t440> On Mo, 2014-07-28 at 14:37 +0200, Eelco Hoogendoorn wrote: > To rephrase my most pressing question: may np.ones((N,2)).mean(0) and > np.ones((2,N)).mean(1) produce different results with the > implementation in the current master? If so, I think that would be > very much regrettable; and if this is a minority opinion, I do hope > that at least this gets documented in a most explicit manner. > This will always give you different results. Though in master. the difference is more likely to be large, since (often the second one) maybe be less likely to run into bigger numerical issues. > > On Sun, Jul 27, 2014 at 8:26 PM, Sturla Molden > wrote: > Nathaniel Smith wrote: > > > The problem here is that when summing up the values, the sum > gets > > large enough that after rounding, x + 1 = x and the sum > stops > > increasing. > > > Interesting. That explains why the divide-and-conquer > reduction is much > more robust. > > Thanks :) > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From argriffi at ncsu.edu Mon Jul 28 09:21:15 2014 From: argriffi at ncsu.edu (alex) Date: Mon, 28 Jul 2014 09:21:15 -0400 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <1406551595.11957.4.camel@sebastian-t440> References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> Message-ID: On Mon, Jul 28, 2014 at 8:46 AM, Sebastian Berg wrote: > On Mo, 2014-07-28 at 14:37 +0200, Eelco Hoogendoorn wrote: >> To rephrase my most pressing question: may np.ones((N,2)).mean(0) and >> np.ones((2,N)).mean(1) produce different results with the >> implementation in the current master? If so, I think that would be >> very much regrettable; and if this is a minority opinion, I do hope >> that at least this gets documented in a most explicit manner. >> > > This will always give you different results. Though in master. the > difference is more likely to be large, since (often the second one) > maybe be less likely to run into bigger numerical issues. Are you sure they always give different results? Notice that np.ones((N,2)).mean(0) np.ones((2,N)).mean(1) compute means of different axes on transposed arrays so these differences 'cancel out'. My understanding of the question is to clarify how numpy reduction algorithms are special-cased for the fast axis vs. other axes. 
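One way to check that empirically; whether the two printed results agree depends on the NumPy version and on which reduction path each axis ends up using:

import numpy as np

N = 10**7
a = np.ones((N, 2), dtype=np.float32).mean(0)   # reduce over the slow axis
b = np.ones((2, N), dtype=np.float32).mean(1)   # reduce over the fast axis
print(a)
print(b)
print(np.array_equal(a, b))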
From cmkleffner at gmail.com Mon Jul 28 09:25:33 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Mon, 28 Jul 2014 15:25:33 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, on https://bitbucket.org/carlkl/mingw-w64-for-python/downloads I uploaded 7z-archives for mingw-w64 and for OpenBLAS-0.2.10 for 32 bit and for 64 bit. To use mingw-w64 for Python >= 3.3 you have to manually tweak the so called specs file - see readme.txt in the archive. Regards Carl 2014-07-28 4:32 GMT+02:00 Robert McGibbon : > I forked Olivier's example project to use the same infrastructure for > building conda binaries and deploying them to binstar, which might also be > useful for some projects. > > https://github.com/rmcgibbo/python-appveyor-conda-example > > -Robert > > > On Wed, Jul 9, 2014 at 3:53 PM, Robert McGibbon > wrote: > >> This is an awesome resource for tons of projects. >> >> Thanks Olivier! >> >> -Robert >> >> >> On Wed, Jul 9, 2014 at 7:00 AM, Olivier Grisel >> wrote: >> >>> Feodor updated the AppVeyor nodes to have the Windows SDK matching >>> MSVC 2008 Express for Python 2. I have updated my sample scripts and >>> we now have a working example of a free CI system for: >>> >>> Python 2 and 3 both for 32 and 64 bit architectures. >>> >>> https://github.com/ogrisel/python-appveyor-demo >>> >>> Best, >>> >>> -- >>> Olivier >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Mon Jul 28 09:30:24 2014 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Mon, 28 Jul 2014 15:30:24 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <1406551595.11957.4.camel@sebastian-t440> References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> Message-ID: On 28 July 2014 14:46, Sebastian Berg wrote: > > To rephrase my most pressing question: may np.ones((N,2)).mean(0) and > > np.ones((2,N)).mean(1) produce different results with the > > implementation in the current master? If so, I think that would be > > very much regrettable; and if this is a minority opinion, I do hope > > that at least this gets documented in a most explicit manner. > > > > This will always give you different results. Though in master. the > difference is more likely to be large, since (often the second one) > maybe be less likely to run into bigger numerical issues. > An example using float16 on Numpy 1.8.1 (I haven't seen diferences with float32): for N in np.logspace(2, 6): print N, (np.ones((N,2), dtype=np.float16).mean(0), np.ones((2,N), dtype=np.float16).mean(1)) The first one gives correct results up to 2049, from where the values start to fall. The second one, on the other hand, gives correct results up to 65519, where it blows to infinity. Interestingly, in the second case there are fluctuations. For example, for N = 65424, the mean is 0.99951172, but 1 for the next and previous numbers. 
But I think they are just an effect of the rounding, as: In [33]: np.ones(N+1, dtype=np.float16).sum() - N Out[33]: 16.0 In [35]: np.ones(N+1, dtype=np.float16).sum() - (N +1) Out[35]: 15.0 In [36]: np.ones(N-1, dtype=np.float16).sum() - (N -1) Out[36]: -15.0 In [37]: N = 65519 - 20 In [38]: np.ones(N, dtype=np.float16).sum() - N Out[38]: 5.0 In [39]: np.ones(N+1, dtype=np.float16).sum() - (N +1) Out[39]: 4.0 In [40]: np.ones(N-1, dtype=np.float16).sum() - (N -1) Out[40]: 6.0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Jul 28 09:35:23 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 28 Jul 2014 15:35:23 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> Message-ID: On 28/07/14 15:21, alex wrote: > Are you sure they always give different results? Notice that > np.ones((N,2)).mean(0) > np.ones((2,N)).mean(1) > compute means of different axes on transposed arrays so these > differences 'cancel out'. They will be if different algorithms are used. np.ones((N,2)).mean(0) will have larger accumulated rounding error than np.ones((2,N)).mean(1), if only the latter uses the divide-and-conquer summation. I would suggest that in the first case we try to copy the array to a temporary contiguous buffer and use the same divide-and-conquer algorithm, unless some heuristics on memory usage fails. Sturla From fabien.maussion at gmail.com Mon Jul 28 09:50:50 2014 From: fabien.maussion at gmail.com (Fabien) Date: Mon, 28 Jul 2014 15:50:50 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> Message-ID: On 28.07.2014 15:30, Da?id wrote: > An example using float16 on Numpy 1.8.1 (I haven't seen diferences with > float32): Why aren't there differences between float16 and float32 ? Could this be related to my earlier post in this thread where I mentioned summation problems occurring much earlier in numpy than in IDL? Fabien From sebastian at sipsolutions.net Mon Jul 28 10:06:12 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 28 Jul 2014 16:06:12 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> Message-ID: <1406556372.11957.20.camel@sebastian-t440> On Mo, 2014-07-28 at 15:35 +0200, Sturla Molden wrote: > On 28/07/14 15:21, alex wrote: > > > Are you sure they always give different results? Notice that > > np.ones((N,2)).mean(0) > > np.ones((2,N)).mean(1) > > compute means of different axes on transposed arrays so these > > differences 'cancel out'. > > They will be if different algorithms are used. np.ones((N,2)).mean(0) > will have larger accumulated rounding error than np.ones((2,N)).mean(1), > if only the latter uses the divide-and-conquer summation. > What I wanted to point out is that to some extend the algorithm does not matter. You will not necessarily get identical results already if you use a different iteration order, and we have been doing that for years for speed reasons. 
All libs like BLAS do the same. Yes, the new changes make this much more dramatic, but they only make some paths much better, never worse. It might be dangerous, but only in the sense that you test it with the good path and it works good enough, but later (also) use the other one in some lib. I am not even sure if I > I would suggest that in the first case we try to copy the array to a > temporary contiguous buffer and use the same divide-and-conquer > algorithm, unless some heuristics on memory usage fails. > Sure, but you have to make major changes to the buffered iterator to do that without larger speed implications. It might be a good idea, but it requires someone who knows this stuff to spend a lot of time and care in the depths of numpy. > Sturla > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sebastian at sipsolutions.net Mon Jul 28 10:08:39 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 28 Jul 2014 16:08:39 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> Message-ID: <1406556519.11957.22.camel@sebastian-t440> On Mo, 2014-07-28 at 15:50 +0200, Fabien wrote: > On 28.07.2014 15:30, Da?id wrote: > > An example using float16 on Numpy 1.8.1 (I haven't seen diferences with > > float32): > > Why aren't there differences between float16 and float32 ? > float16 calculations are actually float32 calculations. If done along the fast axis they will not get rounded in between (within those 8192 elements chunks). Basically something like the difference we are talking about for float32 and float64 has for years existed in float16. > Could this be related to my earlier post in this thread where I > mentioned summation problems occurring much earlier in numpy than in IDL? > > Fabien > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hoogendoorn.eelco at gmail.com Mon Jul 28 10:31:41 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Mon, 28 Jul 2014 16:31:41 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <1406556372.11957.20.camel@sebastian-t440> References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> <1406556372.11957.20.camel@sebastian-t440> Message-ID: Sebastian: Those are good points. Indeed iteration order may already produce different results, even though the semantics of numpy suggest identical operations. Still, I feel this different behavior without any semantical clues is something to be minimized. Indeed copying might have large speed implications. But on second thought, does it? Either the data is already aligned and no copy is required, or it isn't aligned, and we need one pass of cache inefficient access to the data anyway. Infact, if we had one low level function which does cache-intelligent transposition of numpy arrays (using some block strategy), this might be faster even than the current simple reduction operations when forced to work on awkwardly aligned data. 
Ideally, this intelligent access and intelligent reduction would be part of a single pass of course; but that wouldn't really fit within the numpy design, and merely such an intelligent transpose would provide most of the benefit I think. Or is the mechanism behind ascontiguousarray already intelligent in this sense? On Mon, Jul 28, 2014 at 4:06 PM, Sebastian Berg wrote: > On Mo, 2014-07-28 at 15:35 +0200, Sturla Molden wrote: > > On 28/07/14 15:21, alex wrote: > > > > > Are you sure they always give different results? Notice that > > > np.ones((N,2)).mean(0) > > > np.ones((2,N)).mean(1) > > > compute means of different axes on transposed arrays so these > > > differences 'cancel out'. > > > > They will be if different algorithms are used. np.ones((N,2)).mean(0) > > will have larger accumulated rounding error than np.ones((2,N)).mean(1), > > if only the latter uses the divide-and-conquer summation. > > > > What I wanted to point out is that to some extend the algorithm does not > matter. You will not necessarily get identical results already if you > use a different iteration order, and we have been doing that for years > for speed reasons. All libs like BLAS do the same. > Yes, the new changes make this much more dramatic, but they only make > some paths much better, never worse. It might be dangerous, but only in > the sense that you test it with the good path and it works good enough, > but later (also) use the other one in some lib. I am not even sure if I > > > I would suggest that in the first case we try to copy the array to a > > temporary contiguous buffer and use the same divide-and-conquer > > algorithm, unless some heuristics on memory usage fails. > > > > Sure, but you have to make major changes to the buffered iterator to do > that without larger speed implications. It might be a good idea, but it > requires someone who knows this stuff to spend a lot of time and care in > the depths of numpy. > > > Sturla > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Mon Jul 28 10:46:26 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 28 Jul 2014 16:46:26 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: 2014-07-28 15:25 GMT+02:00 Carl Kleffner : > Hi, > > on https://bitbucket.org/carlkl/mingw-w64-for-python/downloads I uploaded > 7z-archives for mingw-w64 and for OpenBLAS-0.2.10 for 32 bit and for 64 bit. > To use mingw-w64 for Python >= 3.3 you have to manually tweak the so called > specs file - see readme.txt in the archive. Have the patches to build numpy and scipy with mingw-w64 been merged in the master branches of those projects? 
-- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From cmkleffner at gmail.com Mon Jul 28 11:16:47 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Mon, 28 Jul 2014 17:16:47 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: I had to move my development enviroment on different windows box recently (stilll in progress). On this box I don't have full access unfortunately. The patch for scipy build was merged into scipy master some time ago, see https://github.com/scipy/scipy/pull/3484 . I have some additional patches for scipy.test. The pull request for numpy build has not yet been made for the reasons I mentioned. Cheers, Carl 2014-07-28 16:46 GMT+02:00 Olivier Grisel : > 2014-07-28 15:25 GMT+02:00 Carl Kleffner : > > Hi, > > > > on https://bitbucket.org/carlkl/mingw-w64-for-python/downloads I > uploaded > > 7z-archives for mingw-w64 and for OpenBLAS-0.2.10 for 32 bit and for 64 > bit. > > To use mingw-w64 for Python >= 3.3 you have to manually tweak the so > called > > specs file - see readme.txt in the archive. > > Have the patches to build numpy and scipy with mingw-w64 been merged > in the master branches of those projects? > > > -- > Olivier > http://twitter.com/ogrisel - http://github.com/ogrisel > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Jul 28 11:22:33 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 28 Jul 2014 17:22:33 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> <1406556372.11957.20.camel@sebastian-t440> Message-ID: <1406560953.11957.28.camel@sebastian-t440> On Mo, 2014-07-28 at 16:31 +0200, Eelco Hoogendoorn wrote: > Sebastian: > > > Those are good points. Indeed iteration order may already produce > different results, even though the semantics of numpy suggest > identical operations. Still, I feel this different behavior without > any semantical clues is something to be minimized. > > Indeed copying might have large speed implications. But on second > thought, does it? Either the data is already aligned and no copy is > required, or it isn't aligned, and we need one pass of cache > inefficient access to the data anyway. Infact, if we had one low level > function which does cache-intelligent transposition of numpy arrays > (using some block strategy), this might be faster even than the > current simple reduction operations when forced to work on awkwardly > aligned data. Ideally, this intelligent access and intelligent > reduction would be part of a single pass of course; but that wouldn't > really fit within the numpy design, and merely such an intelligent > transpose would provide most of the benefit I think. Or is the > mechanism behind ascontiguousarray already intelligent in this sense? > The iterator is currently smart in the sense that it will (obviously low level), do something like [1]. Most things in numpy use that iterator, ascontiguousarray does so as well. 
Such a blocked cache aware iterator is what I mean by, someone who knows it would have to spend a lot of time on it :). [1] Appendix: arr = np.ones((100, 100)) arr.sum(1) # being equivalent (iteration order wise) to: res = np.zeros(100) for i in range(100): res += arr[i, :] # while arr.sum(0) would be: for i in range(100): res[i] = arr[i, :].sum() > > On Mon, Jul 28, 2014 at 4:06 PM, Sebastian Berg > wrote: > On Mo, 2014-07-28 at 15:35 +0200, Sturla Molden wrote: > > On 28/07/14 15:21, alex wrote: > > > > > Are you sure they always give different results? Notice > that > > > np.ones((N,2)).mean(0) > > > np.ones((2,N)).mean(1) > > > compute means of different axes on transposed arrays so > these > > > differences 'cancel out'. > > > > They will be if different algorithms are used. > np.ones((N,2)).mean(0) > > will have larger accumulated rounding error than > np.ones((2,N)).mean(1), > > if only the latter uses the divide-and-conquer summation. > > > > > What I wanted to point out is that to some extend the > algorithm does not > matter. You will not necessarily get identical results already > if you > use a different iteration order, and we have been doing that > for years > for speed reasons. All libs like BLAS do the same. > Yes, the new changes make this much more dramatic, but they > only make > some paths much better, never worse. It might be dangerous, > but only in > the sense that you test it with the good path and it works > good enough, > but later (also) use the other one in some lib. I am not even > sure if I > > > I would suggest that in the first case we try to copy the > array to a > > temporary contiguous buffer and use the same > divide-and-conquer > > algorithm, unless some heuristics on memory usage fails. > > > > > Sure, but you have to make major changes to the buffered > iterator to do > that without larger speed implications. It might be a good > idea, but it > requires someone who knows this stuff to spend a lot of time > and care in > the depths of numpy. > > > Sturla > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hoogendoorn.eelco at gmail.com Mon Jul 28 17:32:15 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Mon, 28 Jul 2014 23:32:15 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: <1406560953.11957.28.camel@sebastian-t440> References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> <1406556372.11957.20.camel@sebastian-t440> <1406560953.11957.28.camel@sebastian-t440> Message-ID: I see, thanks for the clarification. 
Just for the sake of argument, since unfortunately I don't have the time to go dig in the guts of numpy myself: a design which always produces results of the same (high) accuracy, but only optimizes the common access patterns in a hacky way, and may be inefficient in case it needs to fall back on dumb iteration or array copying, is the best compromise between features and the ever limiting amount of time available, I would argue, no? Its preferable if your code works, but may be hacked to work more efficiently, than that it works efficiently, but may need hacking to work correctly under all circumstances. But fun as it is to think about what ought to be, i suppose the people who do actually pour in the effort have thought about these things already. A numpy 2.0 could probably borrow/integrate a lot from numexpr, I suppose. By the way, the hierarchical summation would make it fairly easy to erase (and in any case would minimize) summation differences due to differences between logical and actual ordering in memory of the data, no? On Mon, Jul 28, 2014 at 5:22 PM, Sebastian Berg wrote: > On Mo, 2014-07-28 at 16:31 +0200, Eelco Hoogendoorn wrote: > > Sebastian: > > > > > > Those are good points. Indeed iteration order may already produce > > different results, even though the semantics of numpy suggest > > identical operations. Still, I feel this different behavior without > > any semantical clues is something to be minimized. > > > > Indeed copying might have large speed implications. But on second > > thought, does it? Either the data is already aligned and no copy is > > required, or it isn't aligned, and we need one pass of cache > > inefficient access to the data anyway. Infact, if we had one low level > > function which does cache-intelligent transposition of numpy arrays > > (using some block strategy), this might be faster even than the > > current simple reduction operations when forced to work on awkwardly > > aligned data. Ideally, this intelligent access and intelligent > > reduction would be part of a single pass of course; but that wouldn't > > really fit within the numpy design, and merely such an intelligent > > transpose would provide most of the benefit I think. Or is the > > mechanism behind ascontiguousarray already intelligent in this sense? > > > > The iterator is currently smart in the sense that it will (obviously low > level), do something like [1]. Most things in numpy use that iterator, > ascontiguousarray does so as well. Such a blocked cache aware iterator > is what I mean by, someone who knows it would have to spend a lot of > time on it :). > > [1] Appendix: > > arr = np.ones((100, 100)) > arr.sum(1) > # being equivalent (iteration order wise) to: > res = np.zeros(100) > for i in range(100): > res += arr[i, :] > # while arr.sum(0) would be: > for i in range(100): > res[i] = arr[i, :].sum() > > > > > On Mon, Jul 28, 2014 at 4:06 PM, Sebastian Berg > > wrote: > > On Mo, 2014-07-28 at 15:35 +0200, Sturla Molden wrote: > > > On 28/07/14 15:21, alex wrote: > > > > > > > Are you sure they always give different results? Notice > > that > > > > np.ones((N,2)).mean(0) > > > > np.ones((2,N)).mean(1) > > > > compute means of different axes on transposed arrays so > > these > > > > differences 'cancel out'. > > > > > > They will be if different algorithms are used. > > np.ones((N,2)).mean(0) > > > will have larger accumulated rounding error than > > np.ones((2,N)).mean(1), > > > if only the latter uses the divide-and-conquer summation. 
> > > > > > > > > What I wanted to point out is that to some extend the > > algorithm does not > > matter. You will not necessarily get identical results already > > if you > > use a different iteration order, and we have been doing that > > for years > > for speed reasons. All libs like BLAS do the same. > > Yes, the new changes make this much more dramatic, but they > > only make > > some paths much better, never worse. It might be dangerous, > > but only in > > the sense that you test it with the good path and it works > > good enough, > > but later (also) use the other one in some lib. I am not even > > sure if I > > > > > I would suggest that in the first case we try to copy the > > array to a > > > temporary contiguous buffer and use the same > > divide-and-conquer > > > algorithm, unless some heuristics on memory usage fails. > > > > > > > > > Sure, but you have to make major changes to the buffered > > iterator to do > > that without larger speed implications. It might be a good > > idea, but it > > requires someone who knows this stuff to spend a lot of time > > and care in > > the depths of numpy. > > > > > Sturla > > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Jul 28 18:03:35 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 29 Jul 2014 00:03:35 +0200 Subject: [Numpy-discussion] numpy.mean still broken for largefloat32arrays In-Reply-To: References: <201407271416.s6REGn0J031512@blue-cove.com> <1486437082428178269.968493sturla.molden-gmail.com@news.gmane.org> <1406551595.11957.4.camel@sebastian-t440> <1406556372.11957.20.camel@sebastian-t440> <1406560953.11957.28.camel@sebastian-t440> Message-ID: <53D6C8B7.7040606@googlemail.com> On 28.07.2014 23:32, Eelco Hoogendoorn wrote: > I see, thanks for the clarification. Just for the sake of argument, > since unfortunately I don't have the time to go dig in the guts of numpy > myself: a design which always produces results of the same (high) > accuracy, but only optimizes the common access patterns in a hacky way, > and may be inefficient in case it needs to fall back on dumb iteration > or array copying, is the best compromise between features and the ever > limiting amount of time available, I would argue, no? Its preferable if > your code works, but may be hacked to work more efficiently, than that > it works efficiently, but may need hacking to work correctly under all > circumstances. I don't see the inconsistency as such a big problem. 
If applications are so sensitive to accurate summations over large uniform datasets they will most likely implement their own algorithm instead of relying on the black box in numpy (which never documented any accuracy bounds or used algorithms on summation so far I know). If they do they should add testsuites that will detect accidental use the less accurate path in numpy and fix it before they even release. General purpose libraries that may not be able to test every input third party users may give them usually don't have the luxury of only supporting the latest version of numpy to have pairwise summation guaranteed in the first place, so they would just have to implement their own algorithms anyway. > > But fun as it is to think about what ought to be, i suppose the people > who do actually pour in the effort have thought about these things > already. A numpy 2.0 could probably borrow/integrate a lot from numexpr, > I suppose. > > By the way, the hierarchical summation would make it fairly easy to > erase (and in any case would minimize) summation differences due to > differences between logical and actual ordering in memory of the data, no? > > > On Mon, Jul 28, 2014 at 5:22 PM, Sebastian Berg > > wrote: > > On Mo, 2014-07-28 at 16:31 +0200, Eelco Hoogendoorn wrote: > > Sebastian: > > > > > > Those are good points. Indeed iteration order may already produce > > different results, even though the semantics of numpy suggest > > identical operations. Still, I feel this different behavior without > > any semantical clues is something to be minimized. > > > > Indeed copying might have large speed implications. But on second > > thought, does it? Either the data is already aligned and no copy is > > required, or it isn't aligned, and we need one pass of cache > > inefficient access to the data anyway. Infact, if we had one low level > > function which does cache-intelligent transposition of numpy arrays > > (using some block strategy), this might be faster even than the > > current simple reduction operations when forced to work on awkwardly > > aligned data. Ideally, this intelligent access and intelligent > > reduction would be part of a single pass of course; but that wouldn't > > really fit within the numpy design, and merely such an intelligent > > transpose would provide most of the benefit I think. Or is the > > mechanism behind ascontiguousarray already intelligent in this sense? > > > > The iterator is currently smart in the sense that it will (obviously low > level), do something like [1]. Most things in numpy use that iterator, > ascontiguousarray does so as well. Such a blocked cache aware iterator > is what I mean by, someone who knows it would have to spend a lot of > time on it :). > > [1] Appendix: > > arr = np.ones((100, 100)) > arr.sum(1) > # being equivalent (iteration order wise) to: > res = np.zeros(100) > for i in range(100): > res += arr[i, :] > # while arr.sum(0) would be: > for i in range(100): > res[i] = arr[i, :].sum() > > > > > On Mon, Jul 28, 2014 at 4:06 PM, Sebastian Berg > > > > wrote: > > On Mo, 2014-07-28 at 15:35 +0200, Sturla Molden wrote: > > > On 28/07/14 15:21, alex wrote: > > > > > > > Are you sure they always give different results? Notice > > that > > > > np.ones((N,2)).mean(0) > > > > np.ones((2,N)).mean(1) > > > > compute means of different axes on transposed arrays so > > these > > > > differences 'cancel out'. > > > > > > They will be if different algorithms are used. 
> > np.ones((N,2)).mean(0) > > > will have larger accumulated rounding error than > > np.ones((2,N)).mean(1), > > > if only the latter uses the divide-and-conquer summation. > > > > > > > > > What I wanted to point out is that to some extent the > > algorithm does not > > matter. You will not necessarily get identical results already > > if you > > use a different iteration order, and we have been doing that > > for years > > for speed reasons. All libs like BLAS do the same. > > Yes, the new changes make this much more dramatic, but they > > only make > > some paths much better, never worse. It might be dangerous, > > but only in > > the sense that you test it with the good path and it works > > good enough, > > but later (also) use the other one in some lib. I am not even > > sure if I > > > > > I would suggest that in the first case we try to copy the > > array to a > > > temporary contiguous buffer and use the same > > divide-and-conquer > > > algorithm, unless some heuristics on memory usage fails. > > > > > > > > > Sure, but you have to make major changes to the buffered > > iterator to do > > that without larger speed implications. It might be a good > > idea, but it > > requires someone who knows this stuff to spend a lot of time > > and care in > > the depths of numpy. > > > > > Sturla > > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion >
From joseluismietta at yahoo.com.ar Tue Jul 29 07:47:20 2014 From: joseluismietta at yahoo.com.ar (=?iso-8859-1?Q?Jos=E8_Luis_Mietta?=) Date: Tue, 29 Jul 2014 04:47:20 -0700 Subject: [Numpy-discussion] length - sticks algorithm In-Reply-To: References: <1406027949.48361.YahooMailNeo@web142302.mail.bf1.yahoo.com> Message-ID: <1406634440.12428.YahooMailNeo@web142306.mail.bf1.yahoo.com> Robert, thanks for your help! Now I have: * Q nodes (Q stick-stick intersections) * a list 'NODES'=[(x,y,i,j)_1,........, (x,y,i,j)_Q], where each element (x,y,i,j) represents the intersection point (x,y) of the sticks i and j. * a matrix 'H' with Q elements {H_k,l}. H_k,l=0 if nodes 'k' and 'l' aren't joined by an edge, and H_k,l = R_k,l = the electrical resistance associated with the union of the nodes 'k' and 'l' (directly proportional to the length of the edge that connects these nodes). * a list 'nodes_resistances'=[R_1, ....., R_Q]. All nodes with 'j' (or 'i') = N+1 have an electric potential 'V' with respect to all nodes with 'j' or 'i' = N. Now I must apply nodal analysis to determine the electrical current through each of the edges, and the net current (see attached files). I have no idea how to do that. Can you help me? Thanks a lot! Best regards, José Luis
On Tuesday, 22 July 2014 at 9:02, Robert Kern wrote: What have you tried? What exactly are you having problems with? Loosely, I would suggest the following approach: For each stick, iterate over each stick that intersects with it (as recorded in M). Find the coordinates of all of the intersection points. Label the intersection points by the IDs of the two sticks that form the intersection (normalize these IDs by keeping them in order so you don't duplicate intersections already found; e.g. (2, 5), not (5, 2)). Arbitrarily, but consistently, pick one end of the stick and find the distances from that end to each of the intersection points. This induces an order on the intersections with that stick by sorting the intersections by their distance from the arbitrary end of the stick. You will need this to determine which intersections on the same stick are neighbors and which aren't. I.e., if you have 3 intersections with a given stick, (i,j), (i,k), and (i,l), you want (i,j)-(i,k), and (i,k)-(i,l), but not (i,j)-(i,l). You can find the distances between each of the intersections easily from that. Use a networkx Graph to record the distances (you are making a so-called "weighted graph"). On Tue, Jul 22, 2014 at 12:19 PM, José Luis Mietta wrote: > Hi experts! > > >I'm working with the conductivity of stick-film systems. > > > >In my algorithm (N sticks) I have the intersection graph matrix M (M is a NxN matrix, M_ij=1 if sticks 'i' and 'j' do intersect, and M_ij=0 if sticks 'i' and 'j' do not). >Also I have 2 lists with the end-points of each stick. In addition, I can calculate the intersection point (if it exists) between sticks. > > >I want to calculate all the distances between the points of intersection (1,2,3,...N) in the next figure: >without losing the connectivity information (which intersection is connected to which). In the figure, (A) is the system with sticks. > > >I don't know how to do this. I'm a python + numpy user. > > >Waiting for your answers! > > >Thanks a lot >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Robert Kern _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: EE201_matrix_analysis.pdf Type: image/ipeg Size: 184815 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 10.1103 at PhysRevB.86.134202.pdf Type: image/ipeg Size: 1146335 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: rahman2012.pdf Type: image/ipeg Size: 482265 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Dibujo.png Type: image/png Size: 446152 bytes Desc: not available URL:
From cjw at ncf.ca Tue Jul 29 08:24:41 2014 From: cjw at ncf.ca (Colin J. Williams) Date: Tue, 29 Jul 2014 08:24:41 -0400 (EDT) Subject: [Numpy-discussion] Compiling Numpy-1.8.1 In-Reply-To: <53D6C8B7.7040606@googlemail.com> Message-ID: <1583752360.20803.1406636681520.JavaMail.root@ncf.ca> This version of Numpy does not appear to be available as an installable binary.
In any event, the LAPACK and other packages do not seem to be available with the installable versions. I understand that Windows Studio 2008 is normally used for Windows compiling. Unfortunately, this is no longer available from Microsoft. The link is replaced by a Power Point presentation. Can anyone suggest an alternative compiler/linker? Colin W. From robert.kern at gmail.com Tue Jul 29 08:43:03 2014 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Jul 2014 13:43:03 +0100 Subject: [Numpy-discussion] length - sticks algorithm In-Reply-To: <1406634440.12428.YahooMailNeo@web142306.mail.bf1.yahoo.com> References: <1406027949.48361.YahooMailNeo@web142302.mail.bf1.yahoo.com> <1406634440.12428.YahooMailNeo@web142306.mail.bf1.yahoo.com> Message-ID: On Tue, Jul 29, 2014 at 12:47 PM, Jos? Luis Mietta < joseluismietta at yahoo.com.ar> wrote: > Robert, thanks for your help! > > Now I have: > > * Q nodes (Q stick-stick intersections) > * a list 'NODES'=[(x,y,i,j)_1,........, (x,y,i,j)_Q], where each element > (x,y,i,j) represent the intersection point (x,y) of the sticks i and j. > * a matrix 'H' with Q elements {H_k,l}. > H_k,l=0 if nodes 'k' and 'l' aren't joined by a edge, and H_k,l = R_k,l = > the electrical resistance associated with the union of the nodes 'k' and > 'l' (directly proportional to the length of the edge that connects these > nodes). > * a list 'nodes_resistances'=[R_1, ....., R_Q]. > > All nodes with 'j' (or 'i') = N+1 have a electric potential 'V' respect > all nodes with 'j' or 'i' = N. > > Now i must apply NODAL ANALYSIS for determinate the electrical current > through each of the edges, and the net current (see attached files). I > have no ideas about how to do that. Can you help me? > Please do not send largish binary attachments to this list. I do not know off-hand how to do this, but it looks like the EE201 document you attached tells you how. It is somewhat beyond the scope of this mailing list to help you understand that document, sorry. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Tue Jul 29 08:50:12 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Tue, 29 Jul 2014 14:50:12 +0200 Subject: [Numpy-discussion] Compiling Numpy-1.8.1 In-Reply-To: <1583752360.20803.1406636681520.JavaMail.root@ncf.ca> References: <53D6C8B7.7040606@googlemail.com> <1583752360.20803.1406636681520.JavaMail.root@ncf.ca> Message-ID: 2014-07-29 14:24 GMT+02:00 Colin J. Williams : > > This version of Numpy does not appear to be available as an installable binary. In any event, the LAPACK and other packages do not seem to be available with the installable versions. > > I understand that Windows Studio 2008 is normally used for Windows compiling. Unfortunately, this is no longer available from Microsoft. The link is replaced by a Power Point presentation. > > Can anyone suggest an alternative compiler/linker? The web installers for MSVC Express 2008 is still online at: http://go.microsoft.com/?linkid=7729279 FYI I recently update the scikit-learn documentation for building under windows, both for Python 2 and Python 3 as well as 32 bit and 64 bit architectures: http://scikit-learn.org/stable/install.html#building-on-windows The same build environment should work for numpy (I think). -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From cjwilliams43 at gmail.com Tue Jul 29 09:48:44 2014 From: cjwilliams43 at gmail.com (Colin J. 
Williams) Date: Tue, 29 Jul 2014 09:48:44 -0400 Subject: [Numpy-discussion] Compiling Numpy-1.8.1 In-Reply-To: References: <53D6C8B7.7040606@googlemail.com> <1583752360.20803.1406636681520.JavaMail.root@ncf.ca> Message-ID: Olivier, Thanks. I've installed Windows Studio 2008 Express. I'll read your Building on Windows document. Colin W. On 29 July 2014 08:50, Olivier Grisel wrote: > 2014-07-29 14:24 GMT+02:00 Colin J. Williams : > > > > This version of Numpy does not appear to be available as an installable > binary. In any event, the LAPACK and other packages do not seem to be > available with the installable versions. > > > > I understand that Windows Studio 2008 is normally used for Windows > compiling. Unfortunately, this is no longer available from Microsoft. The > link is replaced by a Power Point presentation. > > > > Can anyone suggest an alternative compiler/linker? > > The web installers for MSVC Express 2008 is still online at: > http://go.microsoft.com/?linkid=7729279 > > FYI I recently update the scikit-learn documentation for building > under windows, both for Python 2 and Python 3 as well as 32 bit and 64 > bit architectures: > > http://scikit-learn.org/stable/install.html#building-on-windows > > The same build environment should work for numpy (I think). > > -- > Olivier > http://twitter.com/ogrisel - http://github.com/ogrisel > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From derek at astro.physik.uni-goettingen.de Tue Jul 29 13:52:51 2014 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Tue, 29 Jul 2014 19:52:51 +0200 Subject: [Numpy-discussion] length - sticks algorithm In-Reply-To: References: <1406027949.48361.YahooMailNeo@web142302.mail.bf1.yahoo.com> <1406634440.12428.YahooMailNeo@web142306.mail.bf1.yahoo.com> Message-ID: <5BAE10D9-E948-49FE-BD25-1D97A118D2A7@astro.physik.uni-goettingen.de> On 29 Jul 2014, at 02:43 pm, Robert Kern wrote: > On Tue, Jul 29, 2014 at 12:47 PM, José Luis Mietta wrote: > Robert, thanks for your help! > > Now I have: > > * Q nodes (Q stick-stick intersections) > * a list 'NODES'=[(x,y,i,j)_1,........, (x,y,i,j)_Q], where each element (x,y,i,j) represents the intersection point (x,y) of the sticks i and j. > * a matrix 'H' with Q elements {H_k,l}. > H_k,l=0 if nodes 'k' and 'l' aren't joined by an edge, and H_k,l = R_k,l = the electrical resistance associated with the union of the nodes 'k' and 'l' (directly proportional to the length of the edge that connects these nodes). > * a list 'nodes_resistances'=[R_1, ....., R_Q]. > > All nodes with 'j' (or 'i') = N+1 have an electric potential 'V' with respect to all nodes with 'j' or 'i' = N. > > Now I must apply nodal analysis to determine the electrical current through each of the edges, and the net current (see attached files). I have no idea how to do that. Can you help me? > > Please do not send largish binary attachments to this list. I do not know off-hand how to do this, but it looks like the EE201 document you attached tells you how. It is somewhat beyond the scope of this mailing list to help you understand that document, sorry. > And it is not a good idea to post copyrighted journal articles to a list where they will end up in a public list archive (even if not immediately recognisable as such).
Derek From faltet at gmail.com Wed Jul 30 06:34:02 2014 From: faltet at gmail.com (Francesc Alted) Date: Wed, 30 Jul 2014 12:34:02 +0200 Subject: [Numpy-discussion] ANN: bcolz 0.7.1 released Message-ID: <53D8CA1A.6090400@gmail.com> ====================== Announcing bcolz 0.7.1 ====================== What's new ========== This is maintenance release, where bcolz got rid of the nose dependency for Python 2.6 (only unittest2 should be required). Also, some small fixes for the test suite, specially in 32-bit has been done. Thanks to Ilan Schnell for pointing out the problems and for suggesting fixes. ``bcolz`` is a renaming of the ``carray`` project. The new goals for the project are to create simple, yet flexible compressed containers, that can live either on-disk or in-memory, and with some high-performance iterators (like `iter()`, `where()`) for querying them. Together, bcolz and the Blosc compressor, are finally fullfilling the promise of accelerating memory I/O, at least for some real scenarios: http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots For more detailed info, see the release notes in: https://github.com/Blosc/bcolz/wiki/Release-Notes What it is ========== bcolz provides columnar and compressed data containers. Column storage allows for efficiently querying tables with a large number of columns. It also allows for cheap addition and removal of column. In addition, bcolz objects are compressed by default for reducing memory/disk I/O needs. The compression process is carried out internally by Blosc, a high-performance compressor that is optimized for binary data. bcolz can use numexpr internally so as to accelerate many vector and query operations (although it can use pure NumPy for doing so too). numexpr optimizes the memory usage and use several cores for doing the computations, so it is blazing fast. Moreover, the carray/ctable containers can be disk-based, and it is possible to use them for seamlessly performing out-of-memory computations. bcolz has minimal dependencies (NumPy), comes with an exhaustive test suite and fully supports both 32-bit and 64-bit platforms. Also, it is typically tested on both UNIX and Windows operating systems. Installing ========== bcolz is in the PyPI repository, so installing it is easy: $ pip install -U bcolz Resources ========= Visit the main bcolz site repository at: http://github.com/Blosc/bcolz Manual: http://bcolz.blosc.org Home of Blosc compressor: http://blosc.org User's mail list: bcolz at googlegroups.com http://groups.google.com/group/bcolz License is the new BSD: https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt ---- **Enjoy data!** -- Francesc Alted From jtaylor.debian at googlemail.com Wed Jul 30 16:20:05 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 30 Jul 2014 22:20:05 +0200 Subject: [Numpy-discussion] ANN: NumPy 1.9.0 beta 2 release Message-ID: <53D95375.5080707@googlemail.com> Hello, The source packages and binaries got numpy 1.9.0 beta 2 have just been uploaded to sourceforge. https://sourceforge.net/projects/numpy/files/NumPy/1.9.0b2 1.9.0 will be a new feature release supporting Python 2.6 - 2.7 and 3.2 - 3.4. Unfortunately we have disabled the new __numpy_ufunc__ feature for overriding ufuncs in subclasses for now. There are still some unresolved issues with its behavior regarding python operator precedence and subclasses. 
If you have a stake in the issue, please read Pauli's summary of the remaining issues: http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070737.html When the issues are resolved to everyone's satisfaction we hope to enable the feature for 1.10 in its final form. We have restored the indexing edge case that broke matplotlib with numpy 1.9.0 beta 1, but some of the other test failures in other packages are deemed bugs in their code and not reasonable to support in numpy anymore. Most projects have fixed the issues in their latest stable or development versions. Depending on how bad the broken functionality is, you may need to update your third party packages when updating numpy to 1.9.0b2. An attempt was made to update the windows binary toolchain to the latest mingw/mingw64 version and an up-to-date ATLAS version, but this turned up a few ugly test failures. Help in resolving these issues is appreciated, as no core developer has Windows debugging experience. Please see this issue for details: https://github.com/numpy/numpy/issues/4909 The changelog is mostly the same as in beta1. Please read it carefully; there have been many small changes that could affect your code. https://github.com/numpy/numpy/blob/maintenance/1.9.x/doc/release/1.9.0-notes.rst Please also take special note of the future changes section, which will apply to the following release, 1.10.0, and make sure to check if your applications would be affected by them. Source tarballs, windows installers and release notes can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.9.0b2 Cheers, Julian Taylor -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL:
From jks257 at cornell.edu Wed Jul 30 16:36:24 2014 From: jks257 at cornell.edu (Jeffrey Ken Smith) Date: Wed, 30 Jul 2014 20:36:24 +0000 Subject: [Numpy-discussion] Can't build numpy on my Windows 7 desktop computer Message-ID: I have been unable to install numpy on my Windows 7 desktop computer, which is a Dell - I had no problems installing it on my new laptop, which is also a Dell. When I try to run the superpack .exe file, I get a message claiming that Python2.7 is not in the registry even though it is and even though I was able to install pyodbc. If I download the zip file and try to use setup.py, I get messages like "No module named msvccompiler in numpy.distutils: trying from distutils error: unable to find vcvarsall.bat" I have no idea what this means or what to do about it. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From chris.barker at noaa.gov Wed Jul 30 18:01:03 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 30 Jul 2014 15:01:03 -0700 Subject: [Numpy-discussion] Can't build numpy on my Windows 7 desktop computer In-Reply-To: References: Message-ID: On Wed, Jul 30, 2014 at 1:36 PM, Jeffrey Ken Smith wrote: > I have been unable to install numpy on my Windows 7 desktop computer, which is > a Dell - I had no problems installing it on my new laptop, which is also a > Dell. When I try to run the superpack .exe file, I get a message claiming > that Python2.7 is not in the registry even though it is and even though I > was able to install pyodbc. > Really bad error message - you are probably trying to install a 64 bit numpy into a 32 bit python, or vice versa -- make sure you are doing both the same.
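A quick way to check which flavor of Python you actually have (a generic standard-library check, nothing numpy-specific):

import struct
import sys

print(sys.version)               # the build banner names the compiler and platform
print(struct.calcsize("P") * 8)  # pointer size in bits: prints 32 or 64

Then pick the numpy installer (32 bit or 64 bit) that matches that number.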
And I recommend the binaries from here: http://www.lfd.uci.edu/~gohlke/pythonlibs/ (or Anaconda or Canopy) -Chris > If I download the zip file and try to use setup.py, I get messages like > > > > ?No module named msvccompiler in numpy.distutils: trying from distutils > > error: unable to find vcvarsall.bat? > > > > I have no idea what this means or what to do about it. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Jul 30 18:02:14 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 30 Jul 2014 15:02:14 -0700 Subject: [Numpy-discussion] Can't build numpy on my Windows 7 desktop computer In-Reply-To: References: Message-ID: one more note: > >> If I download the zip file and try to use setup.py, I get messages like >> >> >> >> ?No module named msvccompiler in numpy.distutils: trying from distutils >> >> error: unable to find vcvarsall.bat? >> >> >> >> I have no idea what this means or what to do about it. >> > It means it is trying to compile numpy, and you don't have the compiler set up to do that. But I suspect you don't want to anyway. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Jul 30 18:34:32 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 30 Jul 2014 16:34:32 -0600 Subject: [Numpy-discussion] Remove user_array.py Message-ID: Hi All, numpy/lib/user_array.py is an old module (2006) that documents itself as unfinished. The only recent changes are my work for supporting both python2 and python3 from the same code base. It was apparently intended as an alternative to inheriting from ndarray. It has no tests to speak of except a few odds and ends included in the module. I suspect this is one of those features that few have heard of. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Jul 30 18:52:50 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 30 Jul 2014 15:52:50 -0700 Subject: [Numpy-discussion] OSX wheels for older numpy versions on pypi Message-ID: Hi, I took the liberty of uploading OSX wheels for some older numpy versions to pypi. These can be useful for testing, or when building your own wheels to be compatible with earlier numpy versions - see: http://stackoverflow.com/questions/17709641/valueerror-numpy-dtype-has-the-wrong-size-try-recompiling/18369312#18369312 There are currently wheels for numpy 1.5.1 py27 numpy 1.6.0 py27 numpy 1.6.1 py27 numpy 1.7.1 py27, 32, 33, 34 These are all compiled against ATLAS: https://github.com/matthew-brett/numpy-atlas-binaries install with e.g. 
pip install numpy==1.6.1 If anyone needs other wheels compiled, let me know, I'll try and upload them, Cheers, Matthew From cmkleffner at gmail.com Wed Jul 30 20:12:49 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Thu, 31 Jul 2014 02:12:49 +0200 Subject: [Numpy-discussion] ANN: NumPy 1.9.0 beta 2 release In-Reply-To: <53D95375.5080707@googlemail.com> References: <53D95375.5080707@googlemail.com> Message-ID: Hi, I created mingw-w64 builds for testing based on OpenBLAS, see: https://bitbucket.org/carlkl/mingw-w64-for-python/downloads . gists for numpy.test run: win32: https://gist.github.com/carlkl/43182c7c5e0049db7b4e amd64: https://gist.github.com/carlkl/c528505af31ac32720b0 Regards, Carl 2014-07-30 22:20 GMT+02:00 Julian Taylor : > Hello, > > The source packages and binaries got numpy 1.9.0 beta 2 have just been > uploaded to sourceforge. > https://sourceforge.net/projects/numpy/files/NumPy/1.9.0b2 > > 1.9.0 will be a new feature release supporting Python 2.6 - 2.7 and 3.2 > - 3.4. > > Unfortunately we have disabled the new __numpy_ufunc__ feature for > overriding ufuncs in subclasses for now. There are still some unresolved > issues with its behavior regarding python operator precedence and > subclasses. > If you have a stake in the issue please read Pauli's summary of the > remaining issues: > http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070737.html > > When the issues are resolved to everyones satisfaction we hope to enable > the feature for 1.10 in its final form. > > We have restored the indexing edge case that broke matplotlib with numpy > 1.9.0 beta 1 but some of the other test failures in other packages are > deemed bugs in their code and not reasonable to support in numpy > anymore. Most projects have fixed the issues in their latest stable or > development versions. Depending on how bad the broken functionality is > you may need to update your third party packages when updating numpy to > 1.9.0b2. > > An attempt was made to update the windows binary toolchain to the latest > mingw/mingw64 version and an up to date ATLAS version but this turned up > a few ugly test failures. > Help in resolving these issues is appreciated, no core developer has > Windows debugging experience. > Please see this issue for details: > https://github.com/numpy/numpy/issues/4909 > > > The changelog is mostly the same as in beta1. Please read it carefully > there have been many small changes that could affect your code. > > https://github.com/numpy/numpy/blob/maintenance/1.9.x/doc/release/1.9.0-notes.rst > Please also take special note of the future changes section which will > apply to the following release 1.10.0 and make sure to check if your > applications would be affected by them. > > Source tarballs, windows installers and release notes can be found at > https://sourceforge.net/projects/numpy/files/NumPy/1.9.0b2 > > Cheers, > Julian Taylor > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Wed Jul 30 22:06:43 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 30 Jul 2014 19:06:43 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.9.0 beta 2 release In-Reply-To: References: <53D95375.5080707@googlemail.com> Message-ID: Hi, On Wed, Jul 30, 2014 at 5:12 PM, Carl Kleffner wrote: > Hi, > > I created mingw-w64 builds for testing based on OpenBLAS, see: > https://bitbucket.org/carlkl/mingw-w64-for-python/downloads . > > gists for numpy.test run: > > win32: https://gist.github.com/carlkl/43182c7c5e0049db7b4e > amd64: https://gist.github.com/carlkl/c528505af31ac32720b0 Thanks all for all the hard work. Here's OSX wheels for testing: http://wheels.scikit-image.org Try with: pip install --pre -f http://wheels.scikit-image.org numpy This should work with Python.org Python on OSX 10.6+, homebrew / macports / system Python for 10.9 [1] Please do send feedback. Cheers, Matthew [1] System Python (/usr/bin/python) will only see your new copy of numpy if you adjust the default path, or test in a virtualenv, because of the system Python sys.path setup From matthew.brett at gmail.com Wed Jul 30 22:42:50 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 30 Jul 2014 19:42:50 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.9.0 beta 2 release In-Reply-To: References: <53D95375.5080707@googlemail.com> Message-ID: Hi, On Wed, Jul 30, 2014 at 5:12 PM, Carl Kleffner wrote: > Hi, > > I created mingw-w64 builds for testing based on OpenBLAS, see: > https://bitbucket.org/carlkl/mingw-w64-for-python/downloads . > > gists for numpy.test run: > > win32: https://gist.github.com/carlkl/43182c7c5e0049db7b4e > amd64: https://gist.github.com/carlkl/c528505af31ac32720b0 I believe the amd64 failure is because Windows doesn't like you trying to open a file that is already open - maybe this will fix it: https://github.com/numpy/numpy/pull/4927 Cheers, Matthew From charlesr.harris at gmail.com Wed Jul 30 23:20:15 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 30 Jul 2014 21:20:15 -0600 Subject: [Numpy-discussion] ANN: NumPy 1.9.0 beta 2 release In-Reply-To: References: <53D95375.5080707@googlemail.com> Message-ID: On Wed, Jul 30, 2014 at 8:42 PM, Matthew Brett wrote: > Hi, > > On Wed, Jul 30, 2014 at 5:12 PM, Carl Kleffner > wrote: > > Hi, > > > > I created mingw-w64 builds for testing based on OpenBLAS, see: > > https://bitbucket.org/carlkl/mingw-w64-for-python/downloads . > > > > gists for numpy.test run: > > > > win32: https://gist.github.com/carlkl/43182c7c5e0049db7b4e > > amd64: https://gist.github.com/carlkl/c528505af31ac32720b0 > > I believe the amd64 failure is because Windows doesn't like you trying > to open a file that is already open - maybe this will fix it: > > https://github.com/numpy/numpy/pull/4927 > > Cheers, > Thanks for getting this out. I just noticed that we are getting a couple of warnings on some platforms. *Python 3.2 debug*; /usr/lib/python3.2/platform.py:381: ResourceWarning: unclosed file <_io.TextIOWrapper name='/etc/lsb-release' mode='rU' encoding='UTF-8'> full_distribution_name=0) *USE_CHROOT=1 ARCH=i386 DIST=trusty PYTHON=3.4* /usr/local/lib/python3.4/dist-packages/numpy/distutils/cpuinfo.py:120: UserWarning: [Errno 2] No such file or directory: '/proc/cpuinfo' warnings.warn(str(e), UserWarning) Not sure about the second one. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Thu Jul 31 02:55:11 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 31 Jul 2014 07:55:11 +0100 Subject: [Numpy-discussion] Remove user_array.py In-Reply-To: References: Message-ID: On Wed, Jul 30, 2014 at 11:34 PM, Charles R Harris wrote: > Hi All, > > numpy/lib/user_array.py is an old module (2006) that documents itself as > unfinished. The only recent changes are my work for supporting both python2 > and python3 from the same code base. It was apparently intended as an > alternative to inheriting from ndarray. It has no tests to speak of except a > few odds and ends included in the module. I suspect this is one of those > features that few have heard of. Agreed. -- Robert Kern From olivier.grisel at ensta.org Thu Jul 31 07:56:25 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Thu, 31 Jul 2014 13:56:25 +0200 Subject: [Numpy-discussion] OSX wheels for older numpy versions on pypi In-Reply-To: References: Message-ID: 2014-07-31 0:52 GMT+02:00 Matthew Brett : > Hi, > > I took the liberty of uploading OSX wheels for some older numpy > versions to pypi. These can be useful for testing, or when building > your own wheels to be compatible with earlier numpy versions - see: > > http://stackoverflow.com/questions/17709641/valueerror-numpy-dtype-has-the-wrong-size-try-recompiling/18369312#18369312 > > There are currently wheels for > > numpy 1.5.1 py27 > numpy 1.6.0 py27 > numpy 1.6.1 py27 > numpy 1.7.1 py27, 32, 33, 34 > > These are all compiled against ATLAS: > > https://github.com/matthew-brett/numpy-atlas-binaries > > install with e.g. > > pip install numpy==1.6.1 Thanks, this is very helpful for project maintainers who have to switch between versions to reproduce bugs reported by users. Do you plan do do the same for scipy? As scipy is even slower to build that would be even more helpful. -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From jtaylor.debian at googlemail.com Thu Jul 31 13:45:57 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 31 Jul 2014 19:45:57 +0200 Subject: [Numpy-discussion] ANN: NumPy 1.9.0 beta 2 release In-Reply-To: References: <53D95375.5080707@googlemail.com> Message-ID: <53DA80D5.20909@googlemail.com> On 31.07.2014 05:20, Charles R Harris wrote: > > > I just noticed that we are getting a couple of warnings on some platforms. > ... > > *USE_CHROOT=1 ARCH=i386 DIST=trusty PYTHON=3.4* > > /usr/local/lib/python3.4/dist-packages/numpy/distutils/cpuinfo.py:120: > UserWarning: [Errno 2] No such file or directory: '/proc/cpuinfo' > > warnings.warn(str(e), UserWarning) > > Not sure about the second one. > this should harmless, the chroot we use to test 32 bit here does not have the proc filesystem mounted, we could mount it but this distutils feature should not be relevant for travis. From matthew.brett at gmail.com Thu Jul 31 16:40:21 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 31 Jul 2014 13:40:21 -0700 Subject: [Numpy-discussion] OSX wheels for older numpy versions on pypi In-Reply-To: References: Message-ID: On Thu, Jul 31, 2014 at 4:56 AM, Olivier Grisel wrote: > 2014-07-31 0:52 GMT+02:00 Matthew Brett : >> Hi, >> >> I took the liberty of uploading OSX wheels for some older numpy >> versions to pypi. 
These can be useful for testing, or when building >> your own wheels to be compatible with earlier numpy versions - see: >> >> http://stackoverflow.com/questions/17709641/valueerror-numpy-dtype-has-the-wrong-size-try-recompiling/18369312#18369312 >> >> There are currently wheels for >> >> numpy 1.5.1 py27 >> numpy 1.6.0 py27 >> numpy 1.6.1 py27 >> numpy 1.7.1 py27, 32, 33, 34 >> >> These are all compiled against ATLAS: >> >> https://github.com/matthew-brett/numpy-atlas-binaries >> >> install with e.g. >> >> pip install numpy==1.6.1 > > Thanks, this is very helpful for project maintainers who have to > switch between versions to reproduce bugs reported by users. > > Do you plan do do the same for scipy? As scipy is even slower to build > that would be even more helpful. Sure, I built and uploaded: scipy-0.12.0 py27 scipy-0.13.0 py27, 33, 34 Are there any others you need? Cheers, Matthew From Catherine.M.Moroney at jpl.nasa.gov Thu Jul 31 18:31:02 2014 From: Catherine.M.Moroney at jpl.nasa.gov (Moroney, Catherine M (398D)) Date: Thu, 31 Jul 2014 22:31:02 +0000 Subject: [Numpy-discussion] working with numpy object arrays Message-ID: <5AAFD452-2882-4D7F-883E-C7C3148D882A@jpl.nasa.gov> In the example code below, is it possible to return an array of all the ".a" values of the MyClass objects as stored in the object array "a"? I am successfully able to retrieve the "a" attributes if I loop through the array elements one by one, but I cannot do a whole-array operation to retrieve the "a" attributes. Is there any way to retrieve all the "a" attributes of the MyClass objects all at once, or do I have to loop through all elements of "array" one-by-one? Thanks for any help, Catherine import numpy class MyClass(object): def __init__(self, a): self.a = a def add(self, b, c): self.a += b+c def return_a(self): return self.a array = numpy.empty((2,2), dtype=object) for i in xrange(0, 2): for j in xrange(0, 2): array[i,j] = MyClass(i+j) for i in xrange(0, 2): for j in xrange(0, 2): array[i,j].add(i, j) print "(%i,%i) = %i" % (i, j, array[i,j].a) try: array_a = array[:,:].a print "a values =",array_a except AttributeError: print "Unable to access a attributes of array as a whole." array_a = numpy.empty((2,2)) for i in xrange(0, 2): for j in xrange(0, 2): array_a[i,j] = array[i,j].a print "a values =",array_a From matthew.brett at gmail.com Thu Jul 31 18:55:43 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 31 Jul 2014 15:55:43 -0700 Subject: [Numpy-discussion] OSX wheels for older numpy versions on pypi In-Reply-To: References: Message-ID: On Thu, Jul 31, 2014 at 1:40 PM, Matthew Brett wrote: > On Thu, Jul 31, 2014 at 4:56 AM, Olivier Grisel > wrote: >> 2014-07-31 0:52 GMT+02:00 Matthew Brett : >>> Hi, >>> >>> I took the liberty of uploading OSX wheels for some older numpy >>> versions to pypi. These can be useful for testing, or when building >>> your own wheels to be compatible with earlier numpy versions - see: >>> >>> http://stackoverflow.com/questions/17709641/valueerror-numpy-dtype-has-the-wrong-size-try-recompiling/18369312#18369312 >>> >>> There are currently wheels for >>> >>> numpy 1.5.1 py27 >>> numpy 1.6.0 py27 >>> numpy 1.6.1 py27 >>> numpy 1.7.1 py27, 32, 33, 34 >>> >>> These are all compiled against ATLAS: >>> >>> https://github.com/matthew-brett/numpy-atlas-binaries >>> >>> install with e.g. >>> >>> pip install numpy==1.6.1 >> >> Thanks, this is very helpful for project maintainers who have to >> switch between versions to reproduce bugs reported by users. 
>> >> Do you plan do do the same for scipy? As scipy is even slower to build >> that would be even more helpful. > > Sure, I built and uploaded: > > scipy-0.12.0 py27 > scipy-0.13.0 py27, 33, 34 I uploaded 0.11.0 and 0.10.0 for py27 in the meantime, Cheers, Matthew From charlesr.harris at gmail.com Thu Jul 31 21:27:55 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 31 Jul 2014 19:27:55 -0600 Subject: [Numpy-discussion] Remove numpy/compat/_inspect.py ? Message-ID: Hi All, The _inspect.py function looks like a numpy version of the python inspect function. ISTR that is was a work around for problems with the early python versions, but that would have been back in 2009. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:
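For reference, a small sketch of the standard-library call that _inspect.py appears to mirror (assuming it really is just a trimmed-down copy of the stdlib inspect helpers, as described above), which behaves the same on the Python versions numpy currently supports:

import inspect

def example(a, b=1, *args, **kwargs):
    return a + b

print(inspect.getargspec(example))
# -> ArgSpec(args=['a', 'b'], varargs='args', keywords='kwargs', defaults=(1,))

If nothing in numpy needs more than that, the compat module looks like a reasonable removal candidate.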