From russell at keith-magee.com Sun Mar 4 03:05:57 2018 From: russell at keith-magee.com (Russell Keith-Magee) Date: Sun, 4 Mar 2018 16:05:57 +0800 Subject: [Numpy-discussion] Request for review: PR #10689 Message-ID: <385EFD5F-EFD8-44BC-B11F-C60AE00EFE4B@keith-magee.com> Hi all, I've just submitted PR #10689, making some small changes to allow NumPy to be compiled on iOS. https://github.com/numpy/numpy/pull/10689 The changes are described in detail on the PR description. For the most part, they're changes to differentiate between building *on* macOS, and building *for* macOS. Yours, Russ Magee %-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From marko.asplund at gmail.com Tue Mar 6 04:39:31 2018 From: marko.asplund at gmail.com (Marko Asplund) Date: Tue, 6 Mar 2018 11:39:31 +0200 Subject: [Numpy-discussion] numpy.random.randn Message-ID: I've some neural network code in NumPy that I'd like to compare with a Scala based implementation. My problem is currently random initialization of the neural net parameters. I'd like to be able to get the same results from both implementations when using the same random seed. One approach I've thought of would be to use the NumPy random generator also with the Scala implementation, but unfortunately the linear algebra library I'm using doesn't provide an equivalent for this. Could someone give pointers to implementing numpy.random.randn? Or alternatively, is there an equivalent random generator for Scala or Java? marko -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Tue Mar 6 06:06:20 2018 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 6 Mar 2018 12:06:20 +0100 Subject: [Numpy-discussion] ANN: SfePy 2018.1 Message-ID: <07f7c4f6-abae-a7f4-5e87-e29dbab2e296@ntc.zcu.cz> I am pleased to announce release 2018.1 of SfePy. Description ----------- SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method or by isogeometric analysis (limited support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: https://mail.python.org/mm3/mailman3/lists/sfepy.python.org/ Git (source) repository, issue tracker: https://github.com/sfepy/sfepy Highlights of this release -------------------------- - major update of time-stepping solvers and solver handling - Newmark and Bathe elastodynamics solvers - interface to MUMPS linear solver - new examples: - iron plate impact problem (elastodynamics) - incompressible Mooney-Rivlin material model (hyperelasticity) as a script For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Cheers, Robert Cimrman --- Contributors to this release in alphabetical order: Robert Cimrman Jan Heczko Jan Kopacka Vladimir Lukes From robert.kern at gmail.com Tue Mar 6 15:52:14 2018 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 6 Mar 2018 12:52:14 -0800 Subject: [Numpy-discussion] numpy.random.randn In-Reply-To: References: Message-ID: On Tue, Mar 6, 2018 at 1:39 AM, Marko Asplund wrote: > > I've some neural network code in NumPy that I'd like to compare with a Scala based implementation. > My problem is currently random initialization of the neural net parameters. > I'd like to be able to get the same results from both implementations when using the same random seed.
> > One approach I've thought of would be to use the NumPy random generator also with the Scala implementation, but unfortunately the linear algebra library I'm using doesn't provide an equivalent for this. > > Could someone give pointers to implementing numpy.random.randn? > Or alternatively, is there an equivalent random generator for Scala or Java? I would just recommend using one of the codebases to initialize the network, save the network out to disk, and load up the initialized network in each of the different codebases for training. That way you are sure that they are both starting from the same exact network parameters. Even if you do rewrite a precisely equivalent np.random.randn() for Scala/Java, you ought to write the code to serialize the initialized network anyways so that you can test that the two initialization routines are equivalent. But if you're going to do that, you might as well take my recommended approach. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From marko.asplund at gmail.com Wed Mar 7 16:10:38 2018 From: marko.asplund at gmail.com (Marko Asplund) Date: Wed, 7 Mar 2018 23:10:38 +0200 Subject: [Numpy-discussion] numpy.random.randn In-Reply-To: References: Message-ID: On Tue, 6 Mar 2018 12:52:14, Robert Kern wrote: > I would just recommend using one of the codebases to initialize the > network, save the network out to disk, and load up the initialized network > in each of the different codebases for training. That way you are sure that > they are both starting from the same exact network parameters. > > Even if you do rewrite a precisely equivalent np.random.randn() for > Scala/Java, you ought to write the code to serialize the initialized > network anyways so that you can test that the two initialization routines > are equivalent. But if you're going to do that, you might as well take my > recommended approach. Thanks for the suggestion! I decided to use the approach you proposed. Still, I'm puzzled by an issue that seems to be related to random initialization. I've three different NN implementations, 2 in Scala and one in NumPy. When using the exact same initialization parameters I get the same cost after each training iteration from each implementation. So, based on this I'd infer that the implementations work equivalently. However, the results look very different when using random initialization. With respect to exact cost this is of course expected, but what I find troublesome is that after N training iterations the cost starts approaching zero with the NumPy code (most of the time), whereas with the Scala based implementations cost fails to converge (most of the time). With NumPy I'm simply using the following random initialization code: np.random.randn(n_h, n_x) * 0.01 I'm trying to emulate the same behaviour in my Scala code by sampling from a Gaussian distribution with mean = 0 and std dev = 1. Any ideas? Marko -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Mar 7 16:14:36 2018 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 7 Mar 2018 13:14:36 -0800 Subject: [Numpy-discussion] numpy.random.randn In-Reply-To: References: Message-ID: On Wed, Mar 7, 2018 at 1:10 PM, Marko Asplund wrote: > > However, the results look very different when using random initialization.
> With respect to exact cost this is of course expected, but what I find troublesome > is that after N training iterations the cost starts approaching zero with the NumPy > code (most of the time), whereas with the Scala based implementations cost fails > to converge (most of the time). > > With NumPy I'm simply using the following random initialization code: > > np.random.randn(n_h, n_x) * 0.01 > > I'm trying to emulate the same behaviour in my Scala code by sampling from a > Gaussian distribution with mean = 0 and std dev = 1. `np.random.randn(n_h, n_x) * 0.01` gives a Gaussian distribution of mean=0 and stdev=0.01 -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 8 03:25:00 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Mar 2018 00:25:00 -0800 Subject: [Numpy-discussion] New NEP: merging multiarray and umath Message-ID: Hi all, Well, this is something that we've discussed for a while and I think generally has consensus already, but I figured I'd write it down anyway to make sure. There's a rendered version here: https://github.com/njsmith/numpy/blob/nep-0015-merge-multiarray-umath/doc/neps/nep-0015-merge-multiarray-umath.rst ----- ============================ Merging multiarray and umath ============================ :Author: Nathaniel J. Smith :Status: Draft :Type: Standards Track :Created: 2018-02-22 Abstract -------- Let's merge ``numpy.core.multiarray`` and ``numpy.core.umath`` into a single extension module, and deprecate ``np.set_numeric_ops``. Background ---------- Currently, numpy's core C code is split between two separate extension modules. ``numpy.core.multiarray`` is built from ``numpy/core/src/multiarray/*.c``, and contains the core array functionality (in particular, the ``ndarray`` object). ``numpy.core.umath`` is built from ``numpy/core/src/umath/*.c``, and contains the ufunc machinery. These two modules each expose their own separate C API, accessed via ``import_multiarray()`` and ``import_umath()`` respectively. The idea is that they're supposed to be independent modules, with ``multiarray`` as a lower-level layer and ``umath`` built on top. In practice this has turned out to be problematic. First, the layering isn't perfect: when you write ``ndarray + ndarray``, this invokes ``ndarray.__add__``, which then calls the ufunc ``np.add``. This means that ``ndarray`` needs to know about ufuncs -- so instead of a clean layering, we have a circular dependency. To solve this, ``multiarray`` exports a somewhat terrifying function called ``set_numeric_ops``. The bootstrap procedure each time you ``import numpy`` is: 1. ``multiarray`` and its ``ndarray`` object are loaded, but arithmetic operations on ndarrays are broken. 2. ``umath`` is loaded. 3. ``set_numeric_ops`` is used to monkeypatch all the methods like ``ndarray.__add__`` with objects from ``umath``. In addition, ``set_numeric_ops`` is exposed as a public API, ``np.set_numeric_ops``. Furthermore, even when this layering does work, it ends up distorting the shape of our public ABI. In recent years, the most common reason for adding new functions to ``multiarray``\'s "public" ABI is not that they really need to be public or that we expect other projects to use them, but rather just that we need to call them from ``umath``. This is extremely unfortunate, because it makes our public ABI unnecessarily large, and since we can never remove things from it, this creates an ongoing maintenance burden.
The way C works, you can have internal API that's visible to everything inside the same extension module, or you can have a public API that everyone can use; you can't have an API that's visible to multiple extension modules inside numpy, but not to external users. We've also increasingly been putting utility code into ``numpy/core/src/private/``, which now contains a bunch of files which are ``#include``\d twice, once into ``multiarray`` and once into ``umath``. This is pretty gross, and is purely a workaround for these being separate C extensions. Proposed changes ---------------- This NEP proposes three changes: 1. We should start building ``numpy/core/src/multiarray/*.c`` and ``numpy/core/src/umath/*.c`` together into a single extension module. 2. Instead of ``set_numeric_ops``, we should use some new, private API to set up ``ndarray.__add__`` and friends. 3. We should deprecate, and eventually remove, ``np.set_numeric_ops``. Non-proposed changes -------------------- We don't necessarily propose to throw away the distinction between multiarray/ and umath/ in terms of our source code organization: internal organization is useful! We just want to build them together into a single extension module. Of course, this does open the door for potential future refactorings, which we can then evaluate based on their merits as they come up. It also doesn't propose that we break the public C ABI. We should continue to provide ``import_multiarray()`` and ``import_umath()`` functions ? it's just that now both ABIs will ultimately be loaded from the same C library. Due to how ``import_multiarray()`` and ``import_umath()`` are written, we'll also still need to have modules called ``numpy.core.multiarray`` and ``numpy.core.umath``, and they'll need to continue to export ``_ARRAY_API`` and ``_UFUNC_API`` objects ? but we can make one or both of these modules be tiny shims that simply re-export the magic API object from where-ever it's actually defined. (See ``numpy/core/code_generators/generate_{numpy,ufunc}_api.py`` for details of how these imports work.) Backward compatibility ---------------------- The only compatibility break is the deprecation of ``np.set_numeric_ops``. Alternatives ------------ n/a Discussion ---------- TBD Copyright --------- This document has been placed in the public domain. -- Nathaniel J. Smith -- https://vorpus.org From wieser.eric+numpy at gmail.com Thu Mar 8 03:47:46 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Thu, 08 Mar 2018 08:47:46 +0000 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: Message-ID: This means that ndarray needs to know about ufuncs ? so instead of a clean layering, we have a circular dependency. Perhaps we should split ndarray into a base_ndarray class with no arithmetic support (*add*, sum, etc), and then provide an ndarray subclass from umath instead (either the separate extension, or just a different set of files) ? On Thu, 8 Mar 2018 at 08:25 Nathaniel Smith wrote: > Hi all, > > Well, this is something that we've discussed for a while and I think > generally has consensus already, but I figured I'd write it down > anyway to make sure. > > There's a rendered version here: > > https://github.com/njsmith/numpy/blob/nep-0015-merge-multiarray-umath/doc/neps/nep-0015-merge-multiarray-umath.rst > > ----- > > ============================ > Merging multiarray and umath > ============================ > > :Author: Nathaniel J. 
Smith > :Status: Draft > :Type: Standards Track > :Created: 2018-02-22 > > > Abstract > -------- > > Let's merge ``numpy.core.multiarray`` and ``numpy.core.umath`` into a > single extension module, and deprecate ``np.set_numeric_ops``. > > > Background > ---------- > > Currently, numpy's core C code is split between two separate extension > modules. > > ``numpy.core.multiarray`` is built from > ``numpy/core/src/multiarray/*.c``, and contains the core array > functionality (in particular, the ``ndarray`` object). > > ``numpy.core.umath`` is built from ``numpy/core/src/umath/*.c``, and > contains the ufunc machinery. > > These two modules each expose their own separate C API, accessed via > ``import_multiarray()`` and ``import_umath()`` respectively. The idea > is that they're supposed to be independent modules, with > ``multiarray`` as a lower-level layer with ``umath`` built on top. In > practice this has turned out to be problematic. > > First, the layering isn't perfect: when you write ``ndarray + > ndarray``, this invokes ``ndarray.__add__``, which then calls the > ufunc ``np.add``. This means that ``ndarray`` needs to know about > ufuncs ? so instead of a clean layering, we have a circular > dependency. To solve this, ``multiarray`` exports a somewhat > terrifying function called ``set_numeric_ops``. The bootstrap > procedure each time you ``import numpy`` is: > > 1. ``multiarray`` and its ``ndarray`` object are loaded, but > arithmetic operations on ndarrays are broken. > > 2. ``umath`` is loaded. > > 3. ``set_numeric_ops`` is used to monkeypatch all the methods like > ``ndarray.__add__`` with objects from ``umath``. > > In addition, ``set_numeric_ops`` is exposed as a public API, > ``np.set_numeric_ops``. > > Furthermore, even when this layering does work, it ends up distorting > the shape of our public ABI. In recent years, the most common reason > for adding new functions to ``multiarray``\'s "public" ABI is not that > they really need to be public or that we expect other projects to use > them, but rather just that we need to call them from ``umath``. This > is extremely unfortunate, because it makes our public ABI > unnecessarily large, and since we can never remove things from it then > this creates an ongoing maintenance burden. The way C works, you can > have internal API that's visible to everything inside the same > extension module, or you can have a public API that everyone can use; > you can't have an API that's visible to multiple extension modules > inside numpy, but not to external users. > > We've also increasingly been putting utility code into > ``numpy/core/src/private/``, which now contains a bunch of files which > are ``#include``\d twice, once into ``multiarray`` and once into > ``umath``. This is pretty gross, and is purely a workaround for these > being separate C extensions. > > > Proposed changes > ---------------- > > This NEP proposes three changes: > > 1. We should start building ``numpy/core/src/multiarray/*.c`` and > ``numpy/core/src/umath/*.c`` together into a single extension > module. > > 2. Instead of ``set_numeric_ops``, we should use some new, private API > to set up ``ndarray.__add__`` and friends. > > 3. We should deprecate, and eventually remove, ``np.set_numeric_ops``. > > > Non-proposed changes > -------------------- > > We don't necessarily propose to throw away the distinction between > multiarray/ and umath/ in terms of our source code organization: > internal organization is useful! 
We just want to build them together > into a single extension module. Of course, this does open the door for > potential future refactorings, which we can then evaluate based on > their merits as they come up. > > It also doesn't propose that we break the public C ABI. We should > continue to provide ``import_multiarray()`` and ``import_umath()`` > functions ? it's just that now both ABIs will ultimately be loaded > from the same C library. Due to how ``import_multiarray()`` and > ``import_umath()`` are written, we'll also still need to have modules > called ``numpy.core.multiarray`` and ``numpy.core.umath``, and they'll > need to continue to export ``_ARRAY_API`` and ``_UFUNC_API`` objects ? > but we can make one or both of these modules be tiny shims that simply > re-export the magic API object from where-ever it's actually defined. > (See ``numpy/core/code_generators/generate_{numpy,ufunc}_api.py`` for > details of how these imports work.) > > > Backward compatibility > ---------------------- > > The only compatibility break is the deprecation of ``np.set_numeric_ops``. > > > Alternatives > ------------ > > n/a > > > Discussion > ---------- > > TBD > > > Copyright > --------- > > This document has been placed in the public domain. > > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 8 03:57:38 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Mar 2018 00:57:38 -0800 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: Message-ID: On Thu, Mar 8, 2018 at 12:47 AM, Eric Wieser wrote: > This means that ndarray needs to know about ufuncs ? so instead of a clean > layering, we have a circular dependency. > > Perhaps we should split ndarray into a base_ndarray class with no arithmetic > support (add, sum, etc), and then provide an ndarray subclass from umath > instead (either the separate extension, or just a different set of files) This just seems like adding more complexity because we can, though? -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Thu Mar 8 04:33:56 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Mar 2018 01:33:56 -0800 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray Message-ID: Hi all, Here's a more substantive NEP: trying to define how to define a standard way for functions to say that they can accept any "duck array". Biggest open question for me: the name "asabstractarray" kinda sucks (for reasons described in the NEP), and I'd love to have something better. Any ideas? Rendered version: https://github.com/njsmith/numpy/blob/nep-16-abstract-array/doc/neps/nep-0016-abstract-array.rst -n ---- ==================================================== An abstract base class for identifying "duck arrays" ==================================================== :Author: Nathaniel J. Smith :Status: Draft :Type: Standards Track :Created: 2018-03-06 Abstract -------- We propose to add an abstract base class ``AbstractArray`` so that third-party classes can declare their ability to "quack like" an ``ndarray``, and an ``asabstractarray`` function that performs similarly to ``asarray`` except that it passes through ``AbstractArray`` instances unchanged. 
Detailed description -------------------- Many functions, in NumPy and in third-party packages, start with some code like:: def myfunc(a, b): a = np.asarray(a) b = np.asarray(b) ... This ensures that ``a`` and ``b`` are ``np.ndarray`` objects, so ``myfunc`` can carry on assuming that they'll act like ndarrays both semantically (at the Python level), and also in terms of how they're stored in memory (at the C level). But many of these functions only work with arrays at the Python level, which means that they don't actually need ``ndarray`` objects *per se*: they could work just as well with any Python object that "quacks like" an ndarray, such as sparse arrays, dask's lazy arrays, or xarray's labeled arrays. However, currently, there's no way for these libraries to express that their objects can quack like an ndarray, and there's no way for functions like ``myfunc`` to express that they'd be happy with anything that quacks like an ndarray. The purpose of this NEP is to provide those two features. Sometimes people suggest using ``np.asanyarray`` for this purpose, but unfortunately its semantics are exactly backwards: it guarantees that the object it returns uses the same memory layout as an ``ndarray``, but tells you nothing at all about its semantics, which makes it essentially impossible to use safely in practice. Indeed, the two ``ndarray`` subclasses distributed with NumPy ? ``np.matrix`` and ``np.ma.masked_array`` ? do have incompatible semantics, and if they were passed to a function like ``myfunc`` that doesn't check for them as a special-case, then it may silently return incorrect results. Declaring that an object can quack like an array ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There are two basic approaches we could use for checking whether an object quacks like an array. We could check for a special attribute on the class:: def quacks_like_array(obj): return bool(getattr(type(obj), "__quacks_like_array__", False)) Or, we could define an `abstract base class (ABC) `__:: def quacks_like_array(obj): return isinstance(obj, AbstractArray) If you look at how ABCs work, this is essentially equivalent to keeping a global set of types that have been declared to implement the ``AbstractArray`` interface, and then checking it for membership. Between these, the ABC approach seems to have a number of advantages: * It's Python's standard, "one obvious way" of doing this. * ABCs can be introspected (e.g. ``help(np.AbstractArray)`` does something useful). * ABCs can provide useful mixin methods. * ABCs integrate with other features like mypy type-checking, ``functools.singledispatch``, etc. One obvious thing to check is whether this choice affects speed. Using the attached benchmark script on a CPython 3.7 prerelease (revision c4d77a661138d, self-compiled, no PGO), on a Thinkpad T450s running Linux, we find:: np.asarray(ndarray_obj) 330 ns np.asarray([]) 1400 ns Attribute check, success 80 ns Attribute check, failure 80 ns ABC, success via subclass 340 ns ABC, success via register() 700 ns ABC, failure 370 ns Notes: * The first two lines are included to put the other lines in context. * This used 3.7 because both ``getattr`` and ABCs are receiving substantial optimizations in this release, and it's more representative of the long-term future of Python. (Failed ``getattr`` doesn't necessarily construct an exception object anymore, and ABCs were reimplemented in C.) * The "success" lines refer to cases where ``quacks_like_array`` would return True. 
The "failure" lines are cases where it would return False. * The first measurement for ABCs is subclasses defined like:: class MyArray(AbstractArray): ... The second is for subclasses defined like:: class MyArray: ... AbstractArray.register(MyArray) I don't know why there's such a large difference between these. In practice, either way we'd only do the full test after first checking for well-known types like ``ndarray``, ``list``, etc. `This is how NumPy currently checks for other double-underscore attributes `__ and the same idea applies here to either approach. So these numbers won't affect the common case, just the case where we actually have an ``AbstractArray``, or else another third-party object that will end up going through ``__array__`` or ``__array_interface__`` or end up as an object array. So in summary, using an ABC will be slightly slower than using an attribute, but this doesn't affect the most common paths, and the magnitude of slowdown is fairly small (~250 ns on an operation that already takes longer than that). Furthermore, we can potentially optimize this further (e.g. by keeping a tiny LRU cache of types that are known to be AbstractArray subclasses, on the assumption that most code will only use one or two of these types at a time), and it's very unclear that this even matters ? if the speed of ``asarray`` no-op pass-throughs were a bottleneck that showed up in profiles, then probably we would have made them faster already! (It would be trivial to fast-path this, but we don't.) Given the semantic and usability advantages of ABCs, this seems like an acceptable trade-off. .. CPython 3.6 (from Debian):: Attribute check, success 110 ns Attribute check, failure 370 ns ABC, success via subclass 690 ns ABC, success via register() 690 ns ABC, failure 1220 ns Specification of ``asabstractarray`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Given ``AbstractArray``, the definition of ``asabstractarray`` is simple:: def asabstractarray(a, dtype=None): if isinstance(a, AbstractArray): if dtype is not None and dtype != a.dtype: return a.astype(dtype) return a return asarray(a, dtype=dtype) Things to note: * ``asarray`` also accepts an ``order=`` argument, but we don't include that here because it's about details of memory representation, and the whole point of this function is that you use it to declare that you don't care about details of memory representation. * Using the ``astype`` method allows the ``a`` object to decide how to implement casting for its particular type. * For strict compatibility with ``asarray``, we skip calling ``astype`` when the dtype is already correct. Compare:: >>> a = np.arange(10) # astype() always returns a view: >>> a.astype(a.dtype) is a False # asarray() returns the original object if possible: >>> np.asarray(a, dtype=a.dtype) is a True What exactly are you promising if you inherit from ``AbstractArray``? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This will presumably be refined over time. The ideal of course is that your class should be indistinguishable from a real ``ndarray``, but nothing enforces that except the expectations of users. In practice, declaring that your class implements the ``AbstractArray`` interface simply means that it will start passing through ``asabstractarray``, and so by subclassing it you're saying that if some code works for ``ndarray``\s but breaks for your class, then you're willing to accept bug reports on that. 
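For illustration, the opt-in mechanics would look something like this (a self-contained sketch using a stand-in ABC, since ``AbstractArray`` and ``asabstractarray`` do not exist yet; the simplified ``asabstractarray`` here just restates the definition given above)::

    from abc import ABC
    import numpy as np

    class AbstractArray(ABC):            # stand-in for the proposed np.AbstractArray
        pass

    def asabstractarray(a, dtype=None):  # simplified restatement of the spec above
        if isinstance(a, AbstractArray):
            if dtype is not None and dtype != a.dtype:
                return a.astype(dtype)
            return a
        return np.asarray(a, dtype=dtype)

    class MyDuckArray(AbstractArray):    # third parties opt in by subclassing...
        def __init__(self, data):
            self._data = np.asarray(data)
        @property
        def dtype(self):
            return self._data.dtype
        def astype(self, dtype):
            return MyDuckArray(self._data.astype(dtype))

    # ...or by registering an existing class: AbstractArray.register(SomeClass)

    a = MyDuckArray([1, 2, 3])
    assert asabstractarray(a) is a                          # passes through unchanged
    assert isinstance(asabstractarray([1, 2]), np.ndarray)  # everything else is coerced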
To start with, we should declare ``__array_ufunc__`` to be an abstract method, and add the ``NDArrayOperatorsMixin`` methods as mixin methods. Declaring ``astype`` as an ``@abstractmethod`` probably makes sense as well, since it's used by ``asabstractarray``. We might also want to go ahead and add some basic attributes like ``ndim``, ``shape``, ``dtype``. Adding new abstract methods will be a bit tricky, because ABCs enforce these at subclass time; therefore, simply adding a new `@abstractmethod` will be a backwards compatibility break. If this becomes a problem then we can use some hacks to implement an `@upcoming_abstractmethod` decorator that only issues a warning if the method is missing, and treat it like a regular deprecation cycle. (In this case, the thing we'd be deprecating is "support for abstract arrays that are missing feature X".) Naming ~~~~~~ The name of the ABC doesn't matter too much, because it will only be referenced rarely and in relatively specialized situations. The name of the function matters a lot, because most existing instances of ``asarray`` should be replaced by this, and in the future it's what everyone should be reaching for by default unless they have a specific reason to use ``asarray`` instead. This suggests that its name really should be *shorter* and *more memorable* than ``asarray``... which is difficult. I've used ``asabstractarray`` in this draft, but I'm not really happy with it, because it's too long and people are unlikely to start using it by habit without endless exhortations. One option would be to actually change ``asarray``\'s semantics so that *it* passes through ``AbstractArray`` objects unchanged. But I'm worried that there may be a lot of code out there that calls ``asarray`` and then passes the result into some C function that doesn't do any further type checking (because it knows that its caller has already used ``asarray``). If we allow ``asarray`` to return ``AbstractArray`` objects, and then someone calls one of these C wrappers and passes it an ``AbstractArray`` object like a sparse array, then they'll get a segfault. Right now, in the same situation, ``asarray`` will instead invoke the object's ``__array__`` method, or use the buffer interface to make a view, or pass through an array with object dtype, or raise an error, or similar. Probably none of these outcomes are actually desirable in most cases, so maybe making it a segfault instead would be OK? But it's dangerous given that we don't know how common such code is. OTOH, if we were starting from scratch then this would probably be the ideal solution. We can't use ``asanyarray`` or ``array``, since those are already taken. Any other ideas? ``np.cast``, ``np.coerce``? Implementation -------------- 1. Rename ``NDArrayOperatorsMixin`` to ``AbstractArray`` (leaving behind an alias for backwards compatibility) and make it an ABC. 2. Add ``asabstractarray`` (or whatever we end up calling it), and probably a C API equivalent. 3. Begin migrating NumPy internal functions to using ``asabstractarray`` where appropriate. Backward compatibility ---------------------- This is purely a new feature, so there are no compatibility issues. (Unless we decide to change the semantics of ``asarray`` itself.) Rejected alternatives --------------------- One suggestion that has come up is to define multiple abstract classes for different subsets of the array interface.
Nothing in this proposal stops either NumPy or third-parties from doing this in the future, but it's very difficult to guess ahead of time which subsets would be useful. Also, "the full ndarray interface" is something that existing libraries are written to expect (because they work with actual ndarrays) and test (because they test with actual ndarrays), so it's by far the easiest place to start. Links to discussion ------------------- TBD Appendix: Benchmark script -------------------------- .. literal-include:: nep-0016-benchmark.py Copyright --------- This document has been placed in the public domain. -n -- Nathaniel J. Smith -- https://vorpus.org From gregor.thalhammer at gmail.com Thu Mar 8 04:52:15 2018 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Thu, 8 Mar 2018 10:52:15 +0100 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: Message-ID: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> Hi, long time ago I wrote a wrapper to to use optimised and parallelized math functions from Intels vector math library geggo/uvml: Provide vectorized math function (MKL) for numpy I found it useful to inject (some of) the fast methods into numpy via np.set_num_ops(), to gain more performance without changing my programs. While this original project is outdated, I can imagine that a centralised way to swap the implementation of math functions is useful. Therefor I suggest to keep np.set_num_ops(), but admittedly I do not understand all the technical implications of the proposed change. best Gregor -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Thu Mar 8 10:06:23 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Thu, 8 Mar 2018 10:06:23 -0500 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: Hi Nathaniel, Overall, hugely in favour! For detailed comments, it would be good to have a link to a PR; could you put that up? A larger comment: you state that you think `np.asanyarray` is a mistake since `np.matrix` and `np.ma.MaskedArray` would pass through and that those do not strictly mimic `NDArray`. Here, I agree with `matrix` (but since we're deprecating it, let's remove that from the discussion), but I do not see how your proposed interface would not let `MaskedArray` pass through, nor really that one would necessarily want that. I think it may be good to distinguish two separate cases: 1. Everything has exactly the same meaning as for `ndarray` but the data is stored differently (i.e., only `view` does not work). One can thus expect that for `output = function(inputs)`, at the end all `duck_output == ndarray_output`. 2. Everything is implemented but operations may give different output (depending on masks for masked arrays, units for quantities, etc.), so generally `duck_output != ndarray_output`. Which one of these are you aiming at? By including `NDArrayOperatorsMixin`, it would seem option (2), but perhaps not? Is there a case for both separately? Smaller general comment: at least in the NEP I would not worry about deprecating `NDArrayOperatorsMixin` - this may well be handy in itself (for things that implement `__array_ufunc__` but do not have shape, etc. (I have been doing some work on creating ufunc chains that would use this -- but they definitely are not array-like). Similarly, I think there is room for an `NDArrayShapeMixin` which might help with `concatenate` and friends. 
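For concreteness, the existing mixin is typically used along these lines (a minimal runnable sketch; the wrapper class is purely illustrative and not part of any proposal):

    import numpy as np
    from numpy.lib.mixins import NDArrayOperatorsMixin

    class Wrapped(NDArrayOperatorsMixin):
        # Toy wrapper: all arithmetic operators are supplied by the mixin,
        # which routes them through __array_ufunc__. Note there is no shape.
        def __init__(self, data):
            self.data = np.asarray(data)

        def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
            # unwrap Wrapped inputs, let the ufunc do the work, re-wrap the result
            unwrapped = [x.data if isinstance(x, Wrapped) else x for x in inputs]
            return Wrapped(getattr(ufunc, method)(*unwrapped, **kwargs))

    w = Wrapped([1.0, 2.0, 3.0])
    print((w + 1).data)    # the mixin turns "+" into np.add, which dispatches here
    print(np.exp(w).data)  # ufuncs called directly take the same path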
Finally, on the name: `asarray` and `asanyarray` are just shims over `array`, so one option would be to add an argument in `array` (or broaden the scope of `subok`). As an explicit suggestion, one could introduce a `duck` or `abstract` argument to `array` which is used in `asarray` and `asanyarray` as well (corresponding to options 1 and 2), and eventually default to something sensible (I would think `False` for `asarray` and `True` for `asanyarray`). All the best, Marten From charlesr.harris at gmail.com Thu Mar 8 11:20:08 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Mar 2018 09:20:08 -0700 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> References: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> Message-ID: On Thu, Mar 8, 2018 at 2:52 AM, Gregor Thalhammer < gregor.thalhammer at gmail.com> wrote: > > Hi, > > long time ago I wrote a wrapper to to use optimised and parallelized math > functions from Intels vector math library > geggo/uvml: Provide vectorized math function (MKL) for numpy > > > I found it useful to inject (some of) the fast methods into numpy via > np.set_num_ops(), to gain more performance without changing my programs. > I think that was much of the original motivation for `set_num_ops` back in the Numeric days, where there was little commonality among platforms and getting hold of optimized libraries was very much an individual thing. The former cblas module, now merged with multiarray, was present for the same reasons. > > While this original project is outdated, I can imagine that a centralised > way to swap the implementation of math functions is useful. Therefor I > suggest to keep np.set_num_ops(), but admittedly I do not understand all > the technical implications of the proposed change. > I suppose we could set it up to detect and use an external library during compilation. The CBLAS implementations currently do that and should pick up the MKL version when available. Where are the MKL functions you used presented? That is an admittedly lower level interface, however. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Thu Mar 8 11:27:40 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Thu, 8 Mar 2018 11:27:40 -0500 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> References: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> Message-ID: On Thu, Mar 8, 2018 at 4:52 AM, Gregor Thalhammer wrote: > > Hi, > > long time ago I wrote a wrapper to to use optimised and parallelized math > functions from Intels vector math library > geggo/uvml: Provide vectorized math function (MKL) for numpy > > I found it useful to inject (some of) the fast methods into numpy via > np.set_num_ops(), to gain more performance without changing my programs. > > While this original project is outdated, I can imagine that a centralised > way to swap the implementation of math functions is useful. Therefor I > suggest to keep np.set_num_ops(), but admittedly I do not understand all the > technical implications of the proposed change. There may still be a case for being able to swap out the functions that do the actual work, i.e., the parts of the ufuncs that are called once any conversion to ndarray has been done. 
-- Marten From charlesr.harris at gmail.com Thu Mar 8 11:30:22 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Mar 2018 09:30:22 -0700 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> Message-ID: On Thu, Mar 8, 2018 at 9:20 AM, Charles R Harris wrote: > > > On Thu, Mar 8, 2018 at 2:52 AM, Gregor Thalhammer < > gregor.thalhammer at gmail.com> wrote: > >> >> Hi, >> >> long time ago I wrote a wrapper to to use optimised and parallelized math >> functions from Intels vector math library >> geggo/uvml: Provide vectorized math function (MKL) for numpy >> >> >> I found it useful to inject (some of) the fast methods into numpy via >> np.set_num_ops(), to gain more performance without changing my programs. >> > > I think that was much of the original motivation for `set_num_ops` back in > the Numeric days, where there was little commonality among platforms and > getting hold of optimized libraries was very much an individual thing. The > former cblas module, now merged with multiarray, was present for the same > reasons. > > >> >> While this original project is outdated, I can imagine that a centralised >> way to swap the implementation of math functions is useful. Therefor I >> suggest to keep np.set_num_ops(), but admittedly I do not understand all >> the technical implications of the proposed change. >> > > I suppose we could set it up to detect and use an external library during > compilation. The CBLAS implementations currently do that and should pick up > the MKL version when available. Where are the MKL functions you used > presented? That is an admittedly lower level interface, however. > > Note that Intel is also working to support NumPy and intends to use the Intel optimizations as part of that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Thu Mar 8 11:34:34 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Thu, 8 Mar 2018 11:34:34 -0500 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: Message-ID: I think part of the problem is that ufuncs actually have two parts: a generic interface, which turns all its arguments into ndarray (or calls `__array_ufunc__`) and an ndarray-specific implementation of the given function (partially, just the iterator, partially the inner loop). The latter could logically be moved to `ndarray.__array_ufunc__` (and thus to `multiarray`). In that case, `umath` would hardly depend on `multiarray` any more. But perhaps this is a bit besides the point: building the two at the same time would go a long way to making it easier to do a move like the above. -- Marten From shoyer at gmail.com Thu Mar 8 13:56:19 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 08 Mar 2018 18:56:19 +0000 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: Hi Nathaniel, Thanks for starting the discussion! Like Marten says, I think it would be useful to more clearly define what it means to be an abstract array. ndarray has lots of methods/properties that expose internal implementation (e.g., view, strides) that presumably we don't want to require as part of this interfaces. On the other hand, dtype and shape are almost assuredly part of this interface. 
To help guide the discussion, it would be good to identify concrete examples of types that should and should not satisfy this interface, e.g., Marten's case 1: works exactly like ndarray, but stores data differently: parallel arrays (e.g., dask.array), sparse arrays (e.g., https://github.com/pydata/sparse), hypothetical non-strided arrays (e.g., always C ordered). Marten's case 2: same methods as ndarray, but gives different results: np.ma.MaskedArray, arrays with units (quantities), maybe labeled arrays like xarray.DataArray I don't think we have a hope of making a single base class for case 2 work with everything in NumPy, but we can define interfaces with different levels of functionality. Because there is such a gradation of "duck array" types, I agree with Marten that we should not deprecate NDArrayOperatorsMixin. It's useful for types like xarray.Dataset that define __array_ufunc__ but cannot satisfy the full abstract array interface. Finally for the name, what about `asduckarray`? Thought perhaps that could be a source of confusion, and given the gradation of duck array like types. Cheers, Stephan On Thu, Mar 8, 2018 at 7:07 AM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Nathaniel, > > Overall, hugely in favour! For detailed comments, it would be good to > have a link to a PR; could you put that up? > > A larger comment: you state that you think `np.asanyarray` is a > mistake since `np.matrix` and `np.ma.MaskedArray` would pass through > and that those do not strictly mimic `NDArray`. Here, I agree with > `matrix` (but since we're deprecating it, let's remove that from the > discussion), but I do not see how your proposed interface would not > let `MaskedArray` pass through, nor really that one would necessarily > want that. > > I think it may be good to distinguish two separate cases: > 1. Everything has exactly the same meaning as for `ndarray` but the > data is stored differently (i.e., only `view` does not work). One can > thus expect that for `output = function(inputs)`, at the end all > `duck_output == ndarray_output`. > 2. Everything is implemented but operations may give different output > (depending on masks for masked arrays, units for quantities, etc.), so > generally `duck_output != ndarray_output`. > > Which one of these are you aiming at? By including > `NDArrayOperatorsMixin`, it would seem option (2), but perhaps not? Is > there a case for both separately? > > Smaller general comment: at least in the NEP I would not worry about > deprecating `NDArrayOperatorsMixin` - this may well be handy in itself > (for things that implement `__array_ufunc__` but do not have shape, > etc. (I have been doing some work on creating ufunc chains that would > use this -- but they definitely are not array-like). Similarly, I > think there is room for an `NDArrayShapeMixin` which might help with > `concatenate` and friends. > > Finally, on the name: `asarray` and `asanyarray` are just shims over > `array`, so one option would be to add an argument in `array` (or > broaden the scope of `subok`). > > As an explicit suggestion, one could introduce a `duck` or `abstract` > argument to `array` which is used in `asarray` and `asanyarray` as > well (corresponding to options 1 and 2), and eventually default to > something sensible (I would think `False` for `asarray` and `True` for > `asanyarray`). 
> > All the best, > > Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marko.asplund at gmail.com Thu Mar 8 15:44:39 2018 From: marko.asplund at gmail.com (Marko Asplund) Date: Thu, 8 Mar 2018 22:44:39 +0200 Subject: [Numpy-discussion] numpy.random.randn Message-ID: On Wed, 7 Mar 2018 13:14:36, Robert Kern wrote: > > With NumPy I'm simply using the following random initialization code: > > > > np.random.randn(n_h, n_x) * 0.01 > > > > I'm trying to emulate the same behaviour in my Scala code by sampling > from a > > Gaussian distribution with mean = 0 and std dev = 1. > `np.random.randn(n_h, n_x) * 0.01` gives a Gaussian distribution of mean=0 > and stdev=0.01 Sorry for being a bit inaccurate. My Scala code actually mirrors the NumPy based random initialization, so I sample with Gaussian of mean = 0 and std dev = 1, then multiply with 0.01. Despite the extra step the result should be the same as with the NumPy code above. Is there anything else that could be different with the random initialization methods? marko -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Mar 8 18:55:10 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Mar 2018 16:55:10 -0700 Subject: [Numpy-discussion] NumPy 1.14.2 release Message-ID: Hi All, I'm looking to make a NumPy release soonish, possibly at the beginning of next week. The only change planned is a fix for the printing problem that the astropy folks reported. The fix for that problem is also in master, so if you test against master you should be able to check if the fix works for you. If you have experienced any other problems with 1.14.1, please report them, and also mention them here so that they don't fall through the cracks. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 8 20:06:52 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Mar 2018 17:06:52 -0800 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> References: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> Message-ID: On Thu, Mar 8, 2018 at 1:52 AM, Gregor Thalhammer wrote: > > Hi, > > long time ago I wrote a wrapper to to use optimised and parallelized math > functions from Intels vector math library > geggo/uvml: Provide vectorized math function (MKL) for numpy > > I found it useful to inject (some of) the fast methods into numpy via > np.set_num_ops(), to gain more performance without changing my programs. > > While this original project is outdated, I can imagine that a centralised > way to swap the implementation of math functions is useful. Therefor I > suggest to keep np.set_num_ops(), but admittedly I do not understand all the > technical implications of the proposed change. The main part of the proposal is to merge the two libraries; the question of whether to deprecate set_numeric_ops is a bit separate. There's no technical obstacle to keeping it, except the usual issue of having more cruft to maintain :-). It's usually true that any monkeypatching interface will be useful to someone under some circumstances, but we usually don't consider this a good enough reason on its own to add and maintain these kinds of interfaces.
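For concreteness, the kind of swap being discussed looks roughly like this (a sketch only; my_fast_add is a hypothetical drop-in replacement with the same calling convention as np.add):

    import numpy as np

    def my_fast_add(a, b, *args, **kwargs):
        # hypothetical optimized implementation; must behave exactly like np.add
        return np.add(a, b, *args, **kwargs)

    old_ops = np.set_numeric_ops(add=my_fast_add)  # ndarray.__add__ now calls my_fast_add
    # ... run the workload that benefits from the faster routine ...
    np.set_numeric_ops(**old_ops)                  # restore the original ufuncs

Note the patch is global: it changes the behaviour of every ndarray in the process.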
And an unfortunate side-effect of these kinds of hacky interfaces is that they can end up removing the pressure to solve problems properly. In this case, better solutions would include: - Adding support for accelerated vector math libraries to NumPy directly (e.g. MKL, yeppp) - Overriding the inner loops inside ufuncs like numpy.add that np.ndarray.__add__ ultimately calls. This would speed up all addition (whether or not it uses Python + syntax), would be a more general solution (e.g. you could monkeypatch np.exp to use MKL's fast vectorized exp), would let you skip reimplementing all the tricky shared bits of the ufunc logic, etc. Conceptually it's not even very hacky, because we allow you add new loops to existing ufuncs; making it possible to replace existing loops wouldn't be a big stretch. (In fact it's possible that we already allow this; I haven't checked.) So I still lean towards deprecating set_numeric_ops. It's not the most crucial part of the proposal though; if it turns out to be too controversial then I'll take it out. -n -- Nathaniel J. Smith -- https://vorpus.org From jni.soma at gmail.com Thu Mar 8 20:51:56 2018 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Fri, 09 Mar 2018 12:51:56 +1100 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> On Fri, Mar 9, 2018, at 5:56 AM, Stephan Hoyer wrote: > Marten's case 1: works exactly like ndarray, but stores data > differently: parallel arrays (e.g., dask.array), sparse arrays (e.g., > https://github.com/pydata/sparse), hypothetical non-strided arrays > (e.g., always C ordered). Two other "hypotheticals" that would fit nicely in this space: - the Open Connectome folks (https://neurodata.io) proposed linearising indices using space-filling curves, which minimizes cache misses (or IO reads) for giant volumes. I believe they implemented this but can't find it currently.- the N5 format for chunked arrays on disk: https://github.com/saalfeldlab/n5 > Finally for the name, what about `asduckarray`? Thought perhaps that > could be a source of confusion, and given the gradation of duck array > like types. I suggest that the name should *not* use programmer lingo, so neither "abstract" nor "duck" should be in there. My humble proposal is "arraylike". (I know that this term has included things like "list-of- list" before but only in text, not code, as far as I know.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 8 23:22:29 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Mar 2018 20:22:29 -0800 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) Message-ID: On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk wrote: > Hi Nathaniel, > > Overall, hugely in favour! For detailed comments, it would be good to > have a link to a PR; could you put that up? Well, there's a PR here: https://github.com/numpy/numpy/pull/10706 But, this raises a question :-). (One which also came up here: https://github.com/numpy/numpy/pull/10704#issuecomment-371684170) There are sensible two workflows we could use (or at least, two that I can think of): 1. We merge updates to the NEPs as we go, so that whatever's in the repo is the current draft. Anyone can go to the NEP webpage at http://numpy.org/neps (WIP, see #10702) to see the latest version of all NEPs, whether accepted, rejected, or in progress. 
Discussion happens on the mailing list, and line-by-line feedback can be done by quote-replying and commenting on individual lines. From time to time, the NEP author takes all the accumulated feedback, updates the document, and makes a new post to the list to let people know about the updated version. This is how python-dev handles PEPs. 2. We use Github itself to manage the review. The repo only contains "accepted" NEPs; draft NEPs are represented by open PRs, and rejected NEPs are represented by PRs that were closed-without-merging. Discussion uses Github's commenting/review tools, and happens in the PR itself. This is roughly how Rust handles their RFC process, for example: https://github.com/rust-lang/rfcs Trying to do some hybrid version of these seems like it would be pretty painful, so we should pick one. Given that historically we've tried to use the mailing list for substantive features/planning discussions, and that our NEP process has been much closer to workflow 1 than workflow 2 (e.g., there are already a bunch of old NEPs already in the repo that are effectively rejected/withdrawn), I think we should maybe continue that way, and keep discussions here? So my suggestion is discussion should happen on the list, and NEP updates should be merged promptly, or just self-merged. Sound good? -n -- Nathaniel J. Smith -- https://vorpus.org From shoyer at gmail.com Fri Mar 9 00:45:35 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 09 Mar 2018 05:45:35 +0000 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: On Thu, Mar 8, 2018 at 5:54 PM Juan Nunez-Iglesias wrote: > On Fri, Mar 9, 2018, at 5:56 AM, Stephan Hoyer wrote: > > Marten's case 1: works exactly like ndarray, but stores data differently: > parallel arrays (e.g., dask.array), sparse arrays (e.g., > https://github.com/pydata/sparse), hypothetical non-strided arrays (e.g., > always C ordered). > > > Two other "hypotheticals" that would fit nicely in this space: > - the Open Connectome folks (https://neurodata.io) proposed linearising > indices using space-filling curves, which minimizes cache misses (or IO > reads) for giant volumes. I believe they implemented this but can't find it > currently. > - the N5 format for chunked arrays on disk: > https://github.com/saalfeldlab/n5 > I think these fall into another important category of duck arrays. "Indexable" arrays the serve as storage, but that don't support computation. These sorts of arrays typically support operations like indexing and define handful of array-like properties (e.g., dtype and shape), but not arithmetic, reductions or reshaping. This means you can't quite use them as a drop-in replacement for NumPy arrays in all cases, but that's OK. In contrast, both dask.array and sparse do aspire to do fill out nearly the full numpy.ndarray API. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Mar 9 01:26:46 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 8 Mar 2018 22:26:46 -0800 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: On Thu, Mar 8, 2018 at 8:22 PM, Nathaniel Smith wrote: > On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk > wrote: > > Hi Nathaniel, > > > > Overall, hugely in favour! 
For detailed comments, it would be good to > > have a link to a PR; could you put that up? > > Well, there's a PR here: https://github.com/numpy/numpy/pull/10706 > > But, this raises a question :-). (One which also came up here: > https://github.com/numpy/numpy/pull/10704#issuecomment-371684170) > > There are sensible two workflows we could use (or at least, two that I > can think of): > > 1. We merge updates to the NEPs as we go, so that whatever's in the > repo is the current draft. Anyone can go to the NEP webpage at > http://numpy.org/neps (WIP, see #10702) to see the latest version of > all NEPs, whether accepted, rejected, or in progress. Discussion > happens on the mailing list, and line-by-line feedback can be done by > quote-replying and commenting on individual lines. From time to time, > the NEP author takes all the accumulated feedback, updates the > document, and makes a new post to the list to let people know about > the updated version. > > This is how python-dev handles PEPs. > > 2. We use Github itself to manage the review. The repo only contains > "accepted" NEPs; draft NEPs are represented by open PRs, and rejected > NEPs are represented by PRs that were closed-without-merging. > Discussion uses Github's commenting/review tools, and happens in the > PR itself. > > This is roughly how Rust handles their RFC process, for example: > https://github.com/rust-lang/rfcs > > Trying to do some hybrid version of these seems like it would be > pretty painful, so we should pick one. > > Given that historically we've tried to use the mailing list for > substantive features/planning discussions, and that our NEP process > has been much closer to workflow 1 than workflow 2 (e.g., there are > already a bunch of old NEPs already in the repo that are effectively > rejected/withdrawn), I think we should maybe continue that way, and > keep discussions here? > > So my suggestion is discussion should happen on the list, and NEP > updates should be merged promptly, or just self-merged. Sound good? Agreed that overall (1) is better than (2), rejected NEPs should be visible. However there's no need for super-quick self-merge, and I think it would be counter-productive. Instead, just send a PR, leave it open for some discussion, and update for detailed comments (as well as long in-depth discussions that only a couple of people care about) in the Github UI and major ones on the list. Once it's stabilized a bit, then merge with status "Draft" and update once in a while. I think this is also much more in like with what python-dev does, I have seen substantial discussion on Github and have not seen quick self-merges. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Fri Mar 9 02:21:01 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Thu, 8 Mar 2018 23:21:01 -0800 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: Not that I?m against different ?levels? of ndarray granularity, but I just don?t want it to introduce complexity for the end-user. For example, it would be unreasonable to expect the end-user to check for all parts of the interface that they need support for separately. 
Keeping this in view; different levels only make sense if and only if they are strict sub/supersets of each other, so the user can just check for the highest level of compatibility they require, but even then they would need to learn about the different ?levels". PS, thanks for putting this together! I was thinking of doing it this weekend but you beat me to it and covered aspects I wouldn?t have thought of. The name ?asarraylike? appeals to me, as does a ?custom=? kwarg for asanyarray. Sent from Astro for Mac On Mar 9, 2018 at 02:51, Juan Nunez-Iglesias wrote: On Fri, Mar 9, 2018, at 5:56 AM, Stephan Hoyer wrote: Marten's case 1: works exactly like ndarray, but stores data differently: parallel arrays (e.g., dask.array), sparse arrays (e.g., https://github.com/pydata/sparse), hypothetical non-strided arrays (e.g., always C ordered). Two other "hypotheticals" that would fit nicely in this space: - the Open Connectome folks (https://neurodata.io) proposed linearising indices using space-filling curves, which minimizes cache misses (or IO reads) for giant volumes. I believe they implemented this but can't find it currently. - the N5 format for chunked arrays on disk: https://github.com/saalfeldlab/n5 Finally for the name, what about `asduckarray`? Thought perhaps that could be a source of confusion, and given the gradation of duck array like types. I suggest that the name should *not* use programmer lingo, so neither "abstract" nor "duck" should be in there. My humble proposal is "arraylike". (I know that this term has included things like "list-of-list" before but only in text, not code, as far as I know.) _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Fri Mar 9 02:23:37 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 8 Mar 2018 23:23:37 -0800 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: <20180309072337.slrjkru657dpbmuo@carbo> On Thu, 08 Mar 2018 20:22:29 -0800, Nathaniel Smith wrote: > 1. We merge updates to the NEPs as we go, so that whatever's in the > repo is the current draft. Anyone can go to the NEP webpage at > http://numpy.org/neps (WIP, see #10702) to see the latest version of > all NEPs, whether accepted, rejected, or in progress. If we go this route, it may also be useful to give some more guidance on how complete we expect a first draft of a NEP to be before it is submitted as a PR. We currently only have: """ The NEP champion (a.k.a. Author) should first attempt to ascertain whether the idea is suitable for a NEP. Posting to the numpy-discussion mailing list is the best way to go about doing this. Following a discussion on the mailing list, the proposal should be submitted as a draft NEP via a GitHub pull request to the doc/neps directory [...] """ Best regards St?fan From matti.picus at gmail.com Fri Mar 9 02:49:27 2018 From: matti.picus at gmail.com (Matti Picus) Date: Fri, 9 Mar 2018 09:49:27 +0200 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: From njs at pobox.com Fri Mar 9 03:00:47 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Mar 2018 00:00:47 -0800 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: On Thu, Mar 8, 2018 at 10:26 PM, Ralf Gommers wrote: > > > On Thu, Mar 8, 2018 at 8:22 PM, Nathaniel Smith wrote: >> >> On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk >> wrote: >> > Hi Nathaniel, >> > >> > Overall, hugely in favour! For detailed comments, it would be good to >> > have a link to a PR; could you put that up? >> >> Well, there's a PR here: https://github.com/numpy/numpy/pull/10706 >> >> But, this raises a question :-). (One which also came up here: >> https://github.com/numpy/numpy/pull/10704#issuecomment-371684170) >> >> There are sensible two workflows we could use (or at least, two that I >> can think of): >> >> 1. We merge updates to the NEPs as we go, so that whatever's in the >> repo is the current draft. Anyone can go to the NEP webpage at >> http://numpy.org/neps (WIP, see #10702) to see the latest version of >> all NEPs, whether accepted, rejected, or in progress. Discussion >> happens on the mailing list, and line-by-line feedback can be done by >> quote-replying and commenting on individual lines. From time to time, >> the NEP author takes all the accumulated feedback, updates the >> document, and makes a new post to the list to let people know about >> the updated version. >> >> This is how python-dev handles PEPs. >> >> 2. We use Github itself to manage the review. The repo only contains >> "accepted" NEPs; draft NEPs are represented by open PRs, and rejected >> NEPs are represented by PRs that were closed-without-merging. >> Discussion uses Github's commenting/review tools, and happens in the >> PR itself. >> >> This is roughly how Rust handles their RFC process, for example: >> https://github.com/rust-lang/rfcs >> >> Trying to do some hybrid version of these seems like it would be >> pretty painful, so we should pick one. >> >> Given that historically we've tried to use the mailing list for >> substantive features/planning discussions, and that our NEP process >> has been much closer to workflow 1 than workflow 2 (e.g., there are >> already a bunch of old NEPs already in the repo that are effectively >> rejected/withdrawn), I think we should maybe continue that way, and >> keep discussions here? >> >> So my suggestion is discussion should happen on the list, and NEP >> updates should be merged promptly, or just self-merged. Sound good? > > > Agreed that overall (1) is better than (2), rejected NEPs should be visible. > However there's no need for super-quick self-merge, and I think it would be > counter-productive. > > Instead, just send a PR, leave it open for some discussion, and update for > detailed comments (as well as long in-depth discussions that only a couple > of people care about) in the Github UI and major ones on the list. Once it's > stabilized a bit, then merge with status "Draft" and update once in a while. > I think this is also much more in like with what python-dev does, I have > seen substantial discussion on Github and have not seen quick self-merges. Not sure what you mean about python-dev. Are you looking at the peps repository? https://github.com/python/peps >From a quick skim, it looks like of the last 37 commits, only 8 came in through PRs and the other 29 were pushed directly by committers without any review. 
3 of the 8 PRs were self-merged immediately after submission, and of the remaining 5 PRs, 4 of them were from external contributors who didn't have commit rights, and the 1 other was a fix to the repo README, rather than an actual PEP change. I don't think I've ever seen any kind of substantive discussion in that repo -- any discussion is mostly restricted to helping new contributors with procedural stuff, maybe formatting issues or fixes to the PEP tooling. Anyway, just because python-dev does it that way doesn't mean that we have to too. But if we split discussions between GH and the mailing list, then we're definitely going to end up discussing substantive issues there (how do we know which discussions only a couple of people care about?), and trying to juggle that seems confusing to me, plus makes it harder to track down what happened later, after we've had multiple PRs each with their own comments... -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Fri Mar 9 04:29:17 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Mar 2018 01:29:17 -0800 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk wrote: > A larger comment: you state that you think `np.asanyarray` is a > mistake since `np.matrix` and `np.ma.MaskedArray` would pass through > and that those do not strictly mimic `NDArray`. Here, I agree with > `matrix` (but since we're deprecating it, let's remove that from the > discussion), but I do not see how your proposed interface would not > let `MaskedArray` pass through, nor really that one would necessarily > want that. We can discuss whether MaskedArray should be an AbstractArray. Conceptually it probably should be; I think that was a goal of the MaskedArray authors (even if they wouldn't have put it that way). In practice there are a lot of funny quirks in MaskedArray, so I'd want to look more carefully in case there are weird incompatibilities that would cause problems. Note that we can figure this out after the NEP is finished, too. I wonder if the matplotlib folks have any thoughts on this? I know they're one of the more prominent libraries that tries to handle both regular and masked arrays, so maybe they could comment on how often they run > I think it may be good to distinguish two separate cases: > 1. Everything has exactly the same meaning as for `ndarray` but the > data is stored differently (i.e., only `view` does not work). One can > thus expect that for `output = function(inputs)`, at the end all > `duck_output == ndarray_output`. > 2. Everything is implemented but operations may give different output > (depending on masks for masked arrays, units for quantities, etc.), so > generally `duck_output != ndarray_output`. > > Which one of these are you aiming at? By including > `NDArrayOperatorsMixin`, it would seem option (2), but perhaps not? Is > there a case for both separately? Well, (1) is much easier to design around, because it's well-defined :-). And I'm not sure that there's a principled difference between regular arrays and masked arrays/quantity arrays; these *could* be ndarray objects with special dtypes and extra methods, neither of which would disqualify you from being a "case 1" array. (I guess one issue is that because MaskedArray ignores the mask by default, you could get weird results from things like mean calculations: np.sum(masked_arr) / np.prod(masked_arr.shape) does not give the right result. 
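A small example of the mismatch meant here (a sketch, with arbitrary values):

```
import numpy as np

masked_arr = np.ma.masked_array([1.0, 2.0, 3.0, 4.0],
                                mask=[False, False, True, True])

masked_arr.mean()                               # 1.5: averages the unmasked entries
np.sum(masked_arr) / np.prod(masked_arr.shape)  # 0.75: the sum skips masked values,
                                                # but we still divide by 4
```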
This isn't an issue for quantities, though, or for an R-style NA that propagated by default.) > Smaller general comment: at least in the NEP I would not worry about > deprecating `NDArrayOperatorsMixin` - this may well be handy in itself > (for things that implement `__array_ufunc__` but do not have shape, > etc. (I have been doing some work on creating ufunc chains that would > use this -- but they definitely are not array-like). Similarly, I > think there is room for an `NDArrayShapeMixin` which might help with > `concatenate` and friends. Fair enough. > Finally, on the name: `asarray` and `asanyarray` are just shims over > `array`, so one option would be to add an argument in `array` (or > broaden the scope of `subok`). We definitely don't want to broaden the scope of 'subok', because one of the goals here is to have something that projects like sklearn can use, and they won't use subok :-). (In particular, np.matrix is definitely not a duck array of any kind.) And supporting array() is tricky, because then you have to figure out what to do with the copy=, order=, subok=, ndmin= arguments. copy= in particular is tricky given that we don't know the object's type! I guess we could call obj.copy() or something... but for this first iteration it seemed simplest to make a new function that just has the most important stuff for writing generic functions that accept duck arrays. What we could do is, in addition to adding some kind of asabstractarray() function, *also* make it so asanyarray() starts accepting abstract/duck arrays, on the theory that anyone who's willing to put up with asanyarrays()'s weak guarantees won't notice if we weaken them a bit more. Honestly though I'd rather just not touch asanyarray at all, and maybe even deprecate it someday. -n -- Nathaniel J. Smith -- https://vorpus.org From cmkleffner at gmail.com Fri Mar 9 04:46:11 2018 From: cmkleffner at gmail.com (Carl Kleffner) Date: Fri, 9 Mar 2018 10:46:11 +0100 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> Message-ID: 2018-03-09 2:06 GMT+01:00 Nathaniel Smith : > On Thu, Mar 8, 2018 at 1:52 AM, Gregor Thalhammer > wrote: > > > > Hi, > > > > long time ago I wrote a wrapper to to use optimised and parallelized math > > functions from Intels vector math library > > geggo/uvml: Provide vectorized math function (MKL) for numpy > > > > I found it useful to inject (some of) the fast methods into numpy via > > np.set_num_ops(), to gain more performance without changing my programs. > > > > While this original project is outdated, I can imagine that a centralised > > way to swap the implementation of math functions is useful. Therefor I > > suggest to keep np.set_num_ops(), but admittedly I do not understand all > the > > technical implications of the proposed change. > > The main part of the proposal is to merge the two libraries; the > question of whether to deprecate set_numeric_ops is a bit separate. > There's no technical obstacle to keeping it, except the usual issue of > having more cruft to maintain :-). > > It's usually true that any monkeypatching interface will be useful to > someone under some circumstances, but we usually don't consider this a > good enough reason on its own to add and maintain these kinds of > interfaces. And an unfortunate side-effect of these kinds of hacky > interfaces is that they can end up removing the pressure to solve > problems properly. 
In this case, better solutions would include: > > - Adding support for accelerated vector math libraries to NumPy > directly (e.g. MKL, yeppp) > > I just want to bring the Sleef library for vectorized math (C99) into the discussion. Recently a new version with a stabilized API has been provided by its authors. The library is now well documented http://sleef.org and available under the permissive boost license. A runtime CPU dispatcher is used for the different SIMD variants (SSE2, AVX, AVX2, FMA ...) However, I never understand how a vectorized math library can be easily used with numpy arrays in all manners (strided arrays i.e.). > - Overriding the inner loops inside ufuncs like numpy.add that > np.ndarray.__add__ ultimately calls. This would speed up all addition > (whether or not it uses Python + syntax), would be a more general > solution (e.g. you could monkeypatch np.exp to use MKL's fast > vectorized exp), would let you skip reimplementing all the tricky > shared bits of the ufunc logic, etc. Conceptually it's not even very > hacky, because we allow you add new loops to existing ufuncs; making > it possible to replace existing loops wouldn't be a big stretch. (In > fact it's possible that we already allow this; I haven't checked.) > > So I still lean towards deprecating set_numeric_ops. It's not the most > crucial part of the proposal though; if it turns out to be too > controversial then I'll take it out. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Mar 9 06:04:53 2018 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 9 Mar 2018 11:04:53 +0000 Subject: [Numpy-discussion] Endian dtype specifier without using character codes? Message-ID: Hi, We (over at https://github.com/nipy/nibabel) often want to do stuff like this: ``` dtype_type = 'i' size = 8 endianness = '<' dtype = np.dtype('{}{}{}'.format(endianness, dtype_type, size)) ``` I see that """ Use of the character codes, however, is discouraged. """ https://docs.scipy.org/doc/numpy-1.14.0/reference/arrays.scalars.html What is the recommended way of specifying endianness in my dtype, if I am not using the character codes? Do I have to use something like: ``` np.dtype('int64').newbyteorder(endianness) ``` ? Cheers, Matthew From jtaylor.debian at googlemail.com Fri Mar 9 06:33:21 2018 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 9 Mar 2018 12:33:21 +0100 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> Message-ID: <81930c51-ac3c-77e9-74c0-ccf12691096a@googlemail.com> On 08.03.2018 17:20, Charles R Harris wrote: > > > On Thu, Mar 8, 2018 at 2:52 AM, Gregor Thalhammer > > wrote: > > > Hi, > > long time ago I wrote a wrapper to to use optimised and parallelized > math functions from Intels vector math library? > geggo/uvml: Provide vectorized math function (MKL) for numpy > > > I found it useful to inject (some of) the fast methods into numpy > via np.set_num_ops(), to gain more performance without changing my > programs. 
> > > I think that was much of the original motivation for `set_num_ops` back > in the Numeric days, where there was little commonality among platforms > and getting hold of optimized libraries was very much an individual > thing. The former cblas module, now merged with multiarray, was present > for the same reasons. > ?? > > > While this original project is outdated, I can imagine that a > centralised way to swap the implementation of math functions is > useful. Therefor I suggest to keep np.set_num_ops(), but admittedly > I do not understand all the technical implications of the proposed > change. > > > I suppose we could set it up to detect and use an external library > during compilation. The CBLAS implementations currently do that and > should pick up the MKL version when available. Where are the MKL > functions you used presented? That is an admittedly lower level > interface, however. > > Chuck As the functions of the different libraries have vastly different accuracies you want to be able to exchange numeric ops at runtime or at least during load time (like our cblas) and not limit yourself one compile time defined set of functions. Keeping set_numeric_ops would be preferable to me. Though I am not clear on why the two things are connected? Why can't we keep set_numeric_ops and merge multiarray and umath into one shared object? From sebastian at sipsolutions.net Fri Mar 9 05:51:21 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 09 Mar 2018 11:51:21 +0100 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: <1520592681.19004.11.camel@sipsolutions.net> On Thu, 2018-03-08 at 18:56 +0000, Stephan Hoyer wrote: > Hi Nathaniel, > > Thanks for starting the discussion! > > Like Marten says, I think it would be useful to more clearly define > what it means to be an abstract array. ndarray has lots of > methods/properties that expose internal implementation (e.g., view, > strides) that presumably we don't want to require as part of this > interfaces. On the other hand, dtype and shape are almost assuredly > part of this interface. > > To help guide the discussion, it would be good to identify concrete > examples of types that should and should not satisfy this interface, > e.g., > Marten's case 1: works exactly like ndarray, but stores data > differently: parallel arrays (e.g., dask.array), sparse arrays (e.g., > https://github.com/pydata/sparse), hypothetical non-strided arrays > (e.g., always C ordered). > Marten's case 2: same methods as ndarray, but gives different > results: np.ma.MaskedArray, arrays with units (quantities), maybe > labeled arrays like xarray.DataArray > > I don't think we have a hope of making a single base class for case 2 > work with everything in NumPy, but we can define interfaces with > different levels of functionality. True, but I guess the aim is not to care at all about how things are implemented (so only 2)? I agree that we can aim to be as close as possible, but should not expect to reach it. My personal opinion: 1. To do this, we should start it "experimentally" 2. We need something like a reference implementation. First, because it allows testing whether a function e.g. in numpy is actually abstract- safe and second because it will be the only way to find out what our minimal abstract interface actually is (assuming we have started 3). 3. Go ahead with putting it into numpy functions and see how much you need to make them work. 
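To make point 2 concrete, such a reference implementation could start out as minimal as the sketch below (all names are hypothetical, and details like out= handling are glossed over):

```
import numpy as np

class MinimalDuckArray:
    # just enough interface to probe which numpy functions are duck-safe
    def __init__(self, data):
        self._data = np.asarray(data)

    @property
    def shape(self):
        return self._data.shape

    @property
    def dtype(self):
        return self._data.dtype

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        unwrapped = [x._data if isinstance(x, MinimalDuckArray) else x
                     for x in inputs]
        return MinimalDuckArray(getattr(ufunc, method)(*unwrapped, **kwargs))
```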
In the end, my guess is, everything that works for MaskedArrays and xarray is a pretty safe bet. I disagree with the statement that we do not need to define the minimal reference. In practice we do as soon as we use it for numpy functions. - Sebastian > > Because there is such a gradation of "duck array" types, I agree with > Marten that we should not deprecate NDArrayOperatorsMixin. It's > useful for types like xarray.Dataset that define __array_ufunc__ but > cannot satisfy the full abstract array interface. > > Finally for the name, what about `asduckarray`? Thought perhaps that > could be a source of confusion, and given the gradation of duck array > like types. > > Cheers, > Stephan > > On Thu, Mar 8, 2018 at 7:07 AM Marten van Kerkwijk mail.com> wrote: > > Hi Nathaniel, > > > > Overall, hugely in favour! For detailed comments, it would be good > > to > > have a link to a PR; could you put that up? > > > > A larger comment: you state that you think `np.asanyarray` is a > > mistake since `np.matrix` and `np.ma.MaskedArray` would pass > > through > > and that those do not strictly mimic `NDArray`. Here, I agree with > > `matrix` (but since we're deprecating it, let's remove that from > > the > > discussion), but I do not see how your proposed interface would not > > let `MaskedArray` pass through, nor really that one would > > necessarily > > want that. > > > > I think it may be good to distinguish two separate cases: > > 1. Everything has exactly the same meaning as for `ndarray` but the > > data is stored differently (i.e., only `view` does not work). One > > can > > thus expect that for `output = function(inputs)`, at the end all > > `duck_output == ndarray_output`. > > 2. Everything is implemented but operations may give different > > output > > (depending on masks for masked arrays, units for quantities, etc.), > > so > > generally `duck_output != ndarray_output`. > > > > Which one of these are you aiming at? By including > > `NDArrayOperatorsMixin`, it would seem option (2), but perhaps not? > > Is > > there a case for both separately? > > > > Smaller general comment: at least in the NEP I would not worry > > about > > deprecating `NDArrayOperatorsMixin` - this may well be handy in > > itself > > (for things that implement `__array_ufunc__` but do not have shape, > > etc. (I have been doing some work on creating ufunc chains that > > would > > use this -- but they definitely are not array-like). Similarly, I > > think there is room for an `NDArrayShapeMixin` which might help > > with > > `concatenate` and friends. > > > > Finally, on the name: `asarray` and `asanyarray` are just shims > > over > > `array`, so one option would be to add an argument in `array` (or > > broaden the scope of `subok`). > > > > As an explicit suggestion, one could introduce a `duck` or > > `abstract` > > argument to `array` which is used in `asarray` and `asanyarray` as > > well (corresponding to options 1 and 2), and eventually default to > > something sensible (I would think `False` for `asarray` and `True` > > for > > `asanyarray`). > > > > All the best, > > > > Marten > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Fri Mar 9 10:23:19 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 Mar 2018 08:23:19 -0700 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: On Thu, Mar 8, 2018 at 11:26 PM, Ralf Gommers wrote: > > > On Thu, Mar 8, 2018 at 8:22 PM, Nathaniel Smith wrote: > >> On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk >> wrote: >> > Hi Nathaniel, >> > >> > Overall, hugely in favour! For detailed comments, it would be good to >> > have a link to a PR; could you put that up? >> >> Well, there's a PR here: https://github.com/numpy/numpy/pull/10706 >> >> But, this raises a question :-). (One which also came up here: >> https://github.com/numpy/numpy/pull/10704#issuecomment-371684170) >> >> There are sensible two workflows we could use (or at least, two that I >> can think of): >> >> 1. We merge updates to the NEPs as we go, so that whatever's in the >> repo is the current draft. Anyone can go to the NEP webpage at >> http://numpy.org/neps (WIP, see #10702) to see the latest version of >> all NEPs, whether accepted, rejected, or in progress. Discussion >> happens on the mailing list, and line-by-line feedback can be done by >> quote-replying and commenting on individual lines. From time to time, >> the NEP author takes all the accumulated feedback, updates the >> document, and makes a new post to the list to let people know about >> the updated version. >> >> This is how python-dev handles PEPs. >> >> 2. We use Github itself to manage the review. The repo only contains >> "accepted" NEPs; draft NEPs are represented by open PRs, and rejected >> NEPs are represented by PRs that were closed-without-merging. >> Discussion uses Github's commenting/review tools, and happens in the >> PR itself. >> >> This is roughly how Rust handles their RFC process, for example: >> https://github.com/rust-lang/rfcs >> >> Trying to do some hybrid version of these seems like it would be >> pretty painful, so we should pick one. >> >> Given that historically we've tried to use the mailing list for >> substantive features/planning discussions, and that our NEP process >> has been much closer to workflow 1 than workflow 2 (e.g., there are >> already a bunch of old NEPs already in the repo that are effectively >> rejected/withdrawn), I think we should maybe continue that way, and >> keep discussions here? >> >> So my suggestion is discussion should happen on the list, and NEP >> updates should be merged promptly, or just self-merged. Sound good? > > > Agreed that overall (1) is better than (2), rejected NEPs should be > visible. However there's no need for super-quick self-merge, and I think it > would be counter-productive. > > Instead, just send a PR, leave it open for some discussion, and update for > detailed comments (as well as long in-depth discussions that only a couple > of people care about) in the Github UI and major ones on the list. Once > it's stabilized a bit, then merge with status "Draft" and update once in a > while. I think this is also much more in like with what python-dev does, I > have seen substantial discussion on Github and have not seen quick > self-merges. > > I have a slight preference for managing the discussion on Github. 
Note that I added a `component: NEP` label and that discussion can take place on merged/closed PRs, the index could also contain links to proposed NEP PRs. If we just left PR open until acceptance/rejection the label would allow the proposed NEPs to be easily found, especially if we include the NEP number in the title, something like `NEP-10111: ` . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Fri Mar 9 11:58:55 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 09 Mar 2018 16:58:55 +0000 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: I also have a slight preference for managing the discussion on GitHub, which is a bit more fully featured than email for long discussion (e.g., it supports code formatting and editing comments). But I'm really OK either way as long as discussion is kept in one place. We could still stipulate that NEPs are advertised on the mailing list: first, to announce them, and second, before merging them marked as accepted. We could even still merge rejected/abandoned NEPs as long as they are clearly marked. On Fri, Mar 9, 2018 at 7:24 AM Charles R Harris wrote: > On Thu, Mar 8, 2018 at 11:26 PM, Ralf Gommers > wrote: > >> >> >> On Thu, Mar 8, 2018 at 8:22 PM, Nathaniel Smith wrote: >> >>> On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk >>> wrote: >>> > Hi Nathaniel, >>> > >>> > Overall, hugely in favour! For detailed comments, it would be good to >>> > have a link to a PR; could you put that up? >>> >>> Well, there's a PR here: https://github.com/numpy/numpy/pull/10706 >>> >>> But, this raises a question :-). (One which also came up here: >>> https://github.com/numpy/numpy/pull/10704#issuecomment-371684170) >>> >>> There are sensible two workflows we could use (or at least, two that I >>> can think of): >>> >>> 1. We merge updates to the NEPs as we go, so that whatever's in the >>> repo is the current draft. Anyone can go to the NEP webpage at >>> http://numpy.org/neps (WIP, see #10702) to see the latest version of >>> all NEPs, whether accepted, rejected, or in progress. Discussion >>> happens on the mailing list, and line-by-line feedback can be done by >>> quote-replying and commenting on individual lines. From time to time, >>> the NEP author takes all the accumulated feedback, updates the >>> document, and makes a new post to the list to let people know about >>> the updated version. >>> >>> This is how python-dev handles PEPs. >>> >>> 2. We use Github itself to manage the review. The repo only contains >>> "accepted" NEPs; draft NEPs are represented by open PRs, and rejected >>> NEPs are represented by PRs that were closed-without-merging. >>> Discussion uses Github's commenting/review tools, and happens in the >>> PR itself. >>> >>> This is roughly how Rust handles their RFC process, for example: >>> https://github.com/rust-lang/rfcs >>> >>> Trying to do some hybrid version of these seems like it would be >>> pretty painful, so we should pick one. >>> >>> Given that historically we've tried to use the mailing list for >>> substantive features/planning discussions, and that our NEP process >>> has been much closer to workflow 1 than workflow 2 (e.g., there are >>> already a bunch of old NEPs already in the repo that are effectively >>> rejected/withdrawn), I think we should maybe continue that way, and >>> keep discussions here? 
>>> >>> So my suggestion is discussion should happen on the list, and NEP >>> updates should be merged promptly, or just self-merged. Sound good? >> >> >> Agreed that overall (1) is better than (2), rejected NEPs should be >> visible. However there's no need for super-quick self-merge, and I think it >> would be counter-productive. >> >> Instead, just send a PR, leave it open for some discussion, and update >> for detailed comments (as well as long in-depth discussions that only a >> couple of people care about) in the Github UI and major ones on the list. >> Once it's stabilized a bit, then merge with status "Draft" and update once >> in a while. I think this is also much more in like with what python-dev >> does, I have seen substantial discussion on Github and have not seen quick >> self-merges. >> >> > I have a slight preference for managing the discussion on Github. Note > that I added a `component: NEP` label and that discussion can take place on > merged/closed PRs, the index could also contain links to proposed NEP PRs. > If we just left PR open until acceptance/rejection the label would allow > the proposed NEPs to be easily found, especially if we include the NEP > number in the title, something like `NEP-10111: ` . > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Fri Mar 9 12:00:43 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 09 Mar 2018 17:00:43 +0000 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) References: Message-ID: I'll note that we basically used GitHub for revising __array_ufunc__ NEP, and I think that worked out better for everyone involved. The discussion was a little too specialized and high volume to be well handled on the mailing list. On Fri, Mar 9, 2018 at 8:58 AM Stephan Hoyer wrote: > I also have a slight preference for managing the discussion on GitHub, > which is a bit more fully featured than email for long discussion (e.g., it > supports code formatting and editing comments). But I'm really OK either > way as long as discussion is kept in one place. > > We could still stipulate that NEPs are advertised on the mailing list: > first, to announce them, and second, before merging them marked as > accepted. We could even still merge rejected/abandoned NEPs as long as they > are clearly marked. > > On Fri, Mar 9, 2018 at 7:24 AM Charles R Harris > wrote: > >> On Thu, Mar 8, 2018 at 11:26 PM, Ralf Gommers >> wrote: >> >>> >>> >>> On Thu, Mar 8, 2018 at 8:22 PM, Nathaniel Smith wrote: >>> >>>> On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk >>>> wrote: >>>> > Hi Nathaniel, >>>> > >>>> > Overall, hugely in favour! For detailed comments, it would be good to >>>> > have a link to a PR; could you put that up? >>>> >>>> Well, there's a PR here: https://github.com/numpy/numpy/pull/10706 >>>> >>>> But, this raises a question :-). (One which also came up here: >>>> https://github.com/numpy/numpy/pull/10704#issuecomment-371684170) >>>> >>>> There are sensible two workflows we could use (or at least, two that I >>>> can think of): >>>> >>>> 1. We merge updates to the NEPs as we go, so that whatever's in the >>>> repo is the current draft. 
Anyone can go to the NEP webpage at >>>> http://numpy.org/neps (WIP, see #10702) to see the latest version of >>>> all NEPs, whether accepted, rejected, or in progress. Discussion >>>> happens on the mailing list, and line-by-line feedback can be done by >>>> quote-replying and commenting on individual lines. From time to time, >>>> the NEP author takes all the accumulated feedback, updates the >>>> document, and makes a new post to the list to let people know about >>>> the updated version. >>>> >>>> This is how python-dev handles PEPs. >>>> >>>> 2. We use Github itself to manage the review. The repo only contains >>>> "accepted" NEPs; draft NEPs are represented by open PRs, and rejected >>>> NEPs are represented by PRs that were closed-without-merging. >>>> Discussion uses Github's commenting/review tools, and happens in the >>>> PR itself. >>>> >>>> This is roughly how Rust handles their RFC process, for example: >>>> https://github.com/rust-lang/rfcs >>>> >>>> Trying to do some hybrid version of these seems like it would be >>>> pretty painful, so we should pick one. >>>> >>>> Given that historically we've tried to use the mailing list for >>>> substantive features/planning discussions, and that our NEP process >>>> has been much closer to workflow 1 than workflow 2 (e.g., there are >>>> already a bunch of old NEPs already in the repo that are effectively >>>> rejected/withdrawn), I think we should maybe continue that way, and >>>> keep discussions here? >>>> >>>> So my suggestion is discussion should happen on the list, and NEP >>>> updates should be merged promptly, or just self-merged. Sound good? >>> >>> >>> Agreed that overall (1) is better than (2), rejected NEPs should be >>> visible. However there's no need for super-quick self-merge, and I think it >>> would be counter-productive. >>> >>> Instead, just send a PR, leave it open for some discussion, and update >>> for detailed comments (as well as long in-depth discussions that only a >>> couple of people care about) in the Github UI and major ones on the list. >>> Once it's stabilized a bit, then merge with status "Draft" and update once >>> in a while. I think this is also much more in like with what python-dev >>> does, I have seen substantial discussion on Github and have not seen quick >>> self-merges. >>> >>> >> I have a slight preference for managing the discussion on Github. Note >> that I added a `component: NEP` label and that discussion can take place on >> merged/closed PRs, the index could also contain links to proposed NEP PRs. >> If we just left PR open until acceptance/rejection the label would allow >> the proposed NEPs to be easily found, especially if we include the NEP >> number in the title, something like `NEP-10111: ` . >> >> Chuck >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kirit.thadaka at gmail.com Fri Mar 9 13:41:56 2018 From: kirit.thadaka at gmail.com (Kirit Thadaka) Date: Sat, 10 Mar 2018 00:11:56 +0530 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram Message-ID: Hi! I've created a PR to add a function called "histogram_bin_edges" which will allow a user to calculate the bins used by the histogram for some data without requiring the entire histogram to be calculated. 
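A usage sketch, assuming the new function accepts the same data and bins arguments as np.histogram (the exact signature is whatever the PR settles on):

```
import numpy as np

sample_a = np.random.normal(0.0, 1.0, 1000)
sample_b = np.random.normal(0.5, 1.0, 1000)

# compute one set of edges from the pooled data, then reuse it everywhere
edges = np.histogram_bin_edges(np.concatenate([sample_a, sample_b]), bins='auto')
counts_a, _ = np.histogram(sample_a, bins=edges)
counts_b, _ = np.histogram(sample_b, bins=edges)
```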
https://github.com/numpy/numpy/pull/10591#issuecomment-371863472 This function allows one set of bins to be computed, and reused across multiple histograms which gives more easily comparable results than using separate bins for each histogram. Please let me know if you have any suggestions on how to improve this PR. Thanks! - Kirit -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Fri Mar 9 14:36:47 2018 From: rmay31 at gmail.com (Ryan May) Date: Fri, 9 Mar 2018 12:36:47 -0700 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: On Fri, Mar 9, 2018 at 2:29 AM, Nathaniel Smith wrote: > On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk > wrote: > > A larger comment: you state that you think `np.asanyarray` is a > > mistake since `np.matrix` and `np.ma.MaskedArray` would pass through > > and that those do not strictly mimic `NDArray`. Here, I agree with > > `matrix` (but since we're deprecating it, let's remove that from the > > discussion), but I do not see how your proposed interface would not > > let `MaskedArray` pass through, nor really that one would necessarily > > want that. > > We can discuss whether MaskedArray should be an AbstractArray. > Conceptually it probably should be; I think that was a goal of the > MaskedArray authors (even if they wouldn't have put it that way). In > practice there are a lot of funny quirks in MaskedArray, so I'd want > to look more carefully in case there are weird incompatibilities that > would cause problems. Note that we can figure this out after the NEP > is finished, too. > > I wonder if the matplotlib folks have any thoughts on this? I know > they're one of the more prominent libraries that tries to handle both > regular and masked arrays, so maybe they could comment on how often > they run There's a lot of places in matplotlib where this could simplify our checks, though probably more from a standpoint of "does this thing we've been given need converting?" There are also a lot of places where matplotlib needs to know if we have actually been given a MaskedArray so that we can handle it specially. Ryan -- Ryan May -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Mar 9 14:38:55 2018 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 9 Mar 2018 11:38:55 -0800 Subject: [Numpy-discussion] numpy.random.randn In-Reply-To: References: Message-ID: On Thu, Mar 8, 2018 at 12:44 PM, Marko Asplund wrote: > > On Wed, 7 Mar 2018 13:14:36, Robert Kern wrote: > > > > With NumPy I'm simply using the following random initilization code: > > > > > > np.random.randn(n_h, n_x) * 0.01 > > > > > > I'm trying to emulate the same behaviour in my Scala code by sampling > > from a > > > Gaussian distribution with mean = 0 and std dev = 1. > > > `np.random.randn(n_h, n_x) * 0.01` gives a Gaussian distribution of mean=0 > > and stdev=0.01 > > Sorry for being a bit inaccurate. > My Scala code actually mirrors the NumPy based random initialization, so I sample with Gaussian of mean = 0 and std dev = 1, then multiply with 0.01. Have you verified this? I.e. save out the Scala-initialized network and load it up with numpy to check the mean and std dev? How about if you run the numpy NN training with the Scala-initialized network? Does that also diverge? -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefanv at berkeley.edu Fri Mar 9 14:51:40 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Fri, 9 Mar 2018 11:51:40 -0800 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: <20180309195140.ga465g7bbv6byuqh@carbo> On Fri, 09 Mar 2018 17:00:43 +0000, Stephan Hoyer wrote: > I'll note that we basically used GitHub for revising __array_ufunc__ NEP, > and I think that worked out better for everyone involved. The discussion > was a little too specialized and high volume to be well handled on the > mailing list. A disadvantage of GitHub PR comments is that they do not track sub-threads of conversation, so you cannot "reply to" a previous concern directly. PRs also mix inline comments (that become much less visible after rebases and updates) and "story line" comments. These two "modes" of commenting, substantive discussion around ideas, v.s. concerns about specific phrasing, usage of words, typos, content of code snippets, etc., may require different approaches. It would be quite easy to redirect the prior to the mailing list and the latter to the GitHub PR. I'm also not too keen on repeated PR creation and merging (it splits up the PR discussion even further). Why not simply hold off until the PEP is ready, and view the documents on GitHub? The rendering there is just as good. +1 also on merging rejected PEPs, once they are fully developed. St?fan From rmay31 at gmail.com Fri Mar 9 14:42:05 2018 From: rmay31 at gmail.com (Ryan May) Date: Fri, 9 Mar 2018 12:42:05 -0700 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: On Fri, Mar 9, 2018 at 12:21 AM, Hameer Abbasi wrote: > Not that I?m against different ?levels? of ndarray granularity, but I just > don?t want it to introduce complexity for the end-user. For example, it > would be unreasonable to expect the end-user to check for all parts of the > interface that they need support for separately. > I wouldn't necessarily want all of the granularity exposed in something like "asarraylike"--that should be kept really simple. But I think there's value in numpy providing multiple ABCs for portions of the interface (and one big one that combines them all). That way, people who want the finer-grained checking (say for a more limited array-like) can use a common, shared, existing ABC, rather than having everyone re-invent it. Ryan -- Ryan May -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Fri Mar 9 16:55:31 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Fri, 9 Mar 2018 16:55:31 -0500 Subject: [Numpy-discussion] NumPy 1.14.2 release In-Reply-To: References: Message-ID: Hi Chuck, Astropy tests indeed all pass again against master, without the work-arounds for 1.14.1. 
Thanks, of course also to Allan for the fix, Marten From m.h.vankerkwijk at gmail.com Fri Mar 9 17:10:33 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Fri, 9 Mar 2018 17:10:33 -0500 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: <20180309195140.ga465g7bbv6byuqh@carbo> References: <20180309195140.ga465g7bbv6byuqh@carbo> Message-ID: Hi Nathaniel, astropy is an example of a project that does essentially all discussion of its "Astropy Proposals for Enhancement" on github. I actually like the numpy approach of sending anything to the mailing list that deserves community input (which includes NEP by their very nature). I don't think it has to be either/or, though; maybe the preferred approach is in fact a combination, where the draft is send to the mailing list, initial general comments are incorporated, and then discussion moves to github when one is past the "general interest" stage. When exactly this happens will be somewhat subjective, but probably is not important to nail down anyway. All the best, Marten p.s. I think the __array_ufunc__ discussion indeed showed that github can work, but only once the general ideas are agreed upon - the initial discussion become hopeless to follow (though I'm not sure a mailing list discussion would have been any better). From m.h.vankerkwijk at gmail.com Fri Mar 9 17:49:21 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Fri, 9 Mar 2018 17:49:21 -0500 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: We may be getting a bit distracted by the naming -- though I'll throw out `asarraymimic` as another non-programmer-lingo option that doesn't reuse `arraylike` and might describe what the duck array is attempting to do more closely. But more to the point: I think in essence, we're trying to create a function that does the equivalent of: ``` def ...(arraylike, ...) if isinstance(arraylike, NDAbstractArray): return arraylike else: return np.array(arraylike, ...) ``` Given that one possibly might want to check for partial compatibility, maybe the new or old function should just expose what compatibility is desired, via something like: ``` input = np.as...(input, ..., mimicok='shape|operator|...') ``` Where one could have `mimicok=True` to indicate the highest level (maybe not including being viewable?), `False` to not allow any mimics. This might even work for np.array itself: - dtype - any mimic must provide `astype` (which can error if not possible; this could be the ABC default) - copy - can't one just use `copy.copy`? I think this defaults to `__copy__`. - order - can be passed to `astype` as well; up to code to error if not possible. - subok - meaningless - ndmin - requirement of mimicok='shape' would be to provide a shape attribute and reshape method. -- Marten From stefanv at berkeley.edu Fri Mar 9 18:26:38 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Fri, 9 Mar 2018 15:26:38 -0800 Subject: [Numpy-discussion] NEP sprint: 21 and 22 March Message-ID: <20180309232638.vumxg3z4dzfaz3yo@carbo> Hi everyone, As you may have noticed, there's been quite a bit of movement recently around NumPy Enhancement Proposals---on setting specifications, building infrastructure, as well as writing new proposals. To further support this work, we will be hosting an informal NEP sprint at Berkeley on 21 and 22 March. 
Our aim is to bring core contributors and interested community members together to discuss proposal ideas, write up new NEPs, and polish existing ones. Some potential topics of discussion are: - Duck arrays - Array concatenation - Random number generator seed versioning - User defined dtypes - Deprecation pathways for `np.matrix` - What to do about nditer? All community members are welcome to attend. If you are a core contributor, we may be able to fund some travel costs as well; please let me know. Best regards St?fan From njs at pobox.com Fri Mar 9 18:32:18 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Mar 2018 15:32:18 -0800 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: On Thu, Mar 8, 2018 at 9:45 PM, Stephan Hoyer wrote: > On Thu, Mar 8, 2018 at 5:54 PM Juan Nunez-Iglesias > wrote: >> >> On Fri, Mar 9, 2018, at 5:56 AM, Stephan Hoyer wrote: >> >> Marten's case 1: works exactly like ndarray, but stores data differently: >> parallel arrays (e.g., dask.array), sparse arrays (e.g., >> https://github.com/pydata/sparse), hypothetical non-strided arrays (e.g., >> always C ordered). >> >> >> Two other "hypotheticals" that would fit nicely in this space: >> - the Open Connectome folks (https://neurodata.io) proposed linearising >> indices using space-filling curves, which minimizes cache misses (or IO >> reads) for giant volumes. I believe they implemented this but can't find it >> currently. >> - the N5 format for chunked arrays on disk: >> https://github.com/saalfeldlab/n5 > > > I think these fall into another important category of duck arrays. > "Indexable" arrays the serve as storage, but that don't support computation. > These sorts of arrays typically support operations like indexing and define > handful of array-like properties (e.g., dtype and shape), but not > arithmetic, reductions or reshaping. > > This means you can't quite use them as a drop-in replacement for NumPy > arrays in all cases, but that's OK. In contrast, both dask.array and sparse > do aspire to do fill out nearly the full numpy.ndarray API. I'm not sure if these particular formats fall into that category or not (isn't the point of the space-filling curves to support cache-efficient computation?). But I suppose you're also thinking of things like h5py.Dataset? My impression is that these are mostly handled pretty well already by defining __array__ and/or providing array operations that implicitly convert to ndarray -- do you agree? This does raise an interesting point: maybe we'll eventually want an __abstract_array__ method that asabstractarray tries calling if defined, so e.g. if your object isn't itself an array but can be efficiently converted into a *sparse* array, you have a way to declare that? I think this is something to file under "worry about later, after we have the basic infrastructure", but it's not something I'd thought of before so mentioning here. -n -- Nathaniel J. 
Smith -- https://vorpus.org From njs at pobox.com Fri Mar 9 19:40:11 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Mar 2018 16:40:11 -0800 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: <20180309195140.ga465g7bbv6byuqh@carbo> References: <20180309195140.ga465g7bbv6byuqh@carbo> Message-ID: On Fri, Mar 9, 2018 at 11:51 AM, Stefan van der Walt wrote: > On Fri, 09 Mar 2018 17:00:43 +0000, Stephan Hoyer wrote: >> I'll note that we basically used GitHub for revising __array_ufunc__ NEP, >> and I think that worked out better for everyone involved. The discussion >> was a little too specialized and high volume to be well handled on the >> mailing list. > > A disadvantage of GitHub PR comments is that they do not track > sub-threads of conversation, so you cannot "reply to" a previous concern > directly. Yeah, I actually find email much easier for this kind of complex high-volume discussion. Even if lots of people don't use traditional threaded mail clients anymore [1], archives are still threaded, and the tools that make line-by-line responses easy and the ability to split off conversations are both really helpful. (E.g., the way I split this thread off from the original one :-).) The __array_ufunc__ discussion was almost impenetrable on GH, I think. I admit though that some of this is probably just that I'm more used to the email-based discussion workflow. Honestly none of these tools are particularly amazing, and the __array_ufunc__ conversation would have been difficult and inaccessible to outsiders no matter what medium we used. It's much more important that we just pick something and use it consistently than that pick the Most Optimal Solution. [1] Meaning this, not gmail's threads: https://en.wikipedia.org/wiki/Conversation_threading#/media/File:Nntp.jpg > PRs also mix inline comments (that become much less visible after > rebases and updates) and "story line" comments. These two "modes" of > commenting, substantive discussion around ideas, v.s. concerns about > specific phrasing, usage of words, typos, content of code snippets, > etc., may require different approaches. It would be quite easy to > redirect the prior to the mailing list and the latter to the GitHub PR. I don't think we should worry about this. Fiddly detail comments are, by definition, not super important, and generally make up a tiny volume of the discussion around a proposal. Also in practice reviewers are no good at splitting up substantive comments from fiddly details: the review workflow is that you read through and as thoughts occur you write them down, so even if you start out thinking "okay, I'm only going to comment on typos", then half-way through some paragraph sparks a thought and suddenly you're writing something substantive (and I'm as guilty of this as anyone, maybe more so...). Asking people to classify their comments and then chiding them for putting them in the wrong place etc. isn't a good use of time. Let's just pick one place for everything and stick with it. > I'm also not too keen on repeated PR creation and merging (it splits up > the PR discussion even further). Why not simply hold off until the PEP > is ready, and view the documents on GitHub? The rendering there is just > as good. Well, if we aren't using PRs for discussion then multiple PRs are fine :-). 
And merging changes quickly is helpful because it makes the rendered NEPs page a single one-stop-shop to see all the latest NEPs, no matter what their current status. If we do use PRs for discussion, then I agree that we should try to keep the PR open until the NEP is "done", to minimize the splitting of discussion. This does create a bit of extra friction because it turns out that "is this done?" is not something you can really ever answer for certain :-). Even after PEPs are accepted they usually end up getting some further tweaks once people start implementing them. Sometimes PEPs get abandoned in "Draft" state without ever being accepted/rejected, and sometimes a PEP that had been abandoned for years gets picked up and finished. You can see this in the Rust RFC guidelines too [2]; they specifically address the issue of post-merge changes, and it sounds like their solution is that if a substantive issue is discovered in an accepted RFC, then you have to create a new "fixup" RFC, which then gets its own PR for discussion. I guess if this were our process then __array_ufunc__ would have ended up with ~3 NEPs :-). This is all doable -- every approach has trade-offs. But we should pick one, so we can adapt to those trade-offs. [2] https://github.com/rust-lang/rfcs#the-rfc-life-cycle -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Fri Mar 9 20:10:17 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Mar 2018 17:10:17 -0800 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: <81930c51-ac3c-77e9-74c0-ccf12691096a@googlemail.com> References: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> <81930c51-ac3c-77e9-74c0-ccf12691096a@googlemail.com> Message-ID: On Fri, Mar 9, 2018 at 3:33 AM, Julian Taylor wrote: > As the functions of the different libraries have vastly different > accuracies you want to be able to exchange numeric ops at runtime or at > least during load time (like our cblas) and not limit yourself one > compile time defined set of functions. > Keeping set_numeric_ops would be preferable to me. > > Though I am not clear on why the two things are connected? > Why can't we keep set_numeric_ops and merge multiarray and umath into > one shared object? I think I addressed both of these topics here? https://mail.python.org/pipermail/numpy-discussion/2018-March/077777.html Looking again now, I see that we actually *do* have an explicit API for monkeypatching ufuncs: https://docs.scipy.org/doc/numpy/reference/c-api.ufunc.html#c.PyUFunc_ReplaceLoopBySignature So this seems to be a strictly more general/powerful/useful version of set_numeric_ops... I added some discussion to the NEP: https://github.com/numpy/numpy/pull/10704/commits/4c4716ee0b3bc51d5be9baa891d60473f480d1f2 -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Fri Mar 9 20:45:20 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Mar 2018 17:45:20 -0800 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: On Thu, Mar 8, 2018 at 5:51 PM, Juan Nunez-Iglesias wrote: >> Finally for the name, what about `asduckarray`? Thought perhaps that could >> be a source of confusion, and given the gradation of duck array like types. > > I suggest that the name should *not* use programmer lingo, so neither > "abstract" nor "duck" should be in there. 
My humble proposal is "arraylike". > (I know that this term has included things like "list-of-list" before but > only in text, not code, as far as I know.) I agree with your point about avoiding programmer lingo. My first draft actually used 'asduckarray', but that's like an in-joke; it works fine for us, but it's not really something I want teachers to have to explain on day 1... Array-like is problematic too though, because we still need a way to say "thing that can be coerced to an array", which is what array-like has been used to mean historically. And with the new type hints stuff, it is actually becoming code. E.g. what should the type hints here be: asabstractarray(a: X) -> Y Right now "X" is "ArrayLike", but if we make "Y" be "ArrayLike" then we'll need to come up with some other name for "X" :-). Maybe we can call duck arrays "py arrays", since the idea is that they implement the standard Python array API (but not necessarily the C-level array API)? np.PyArray, np.aspyarray()? -n -- Nathaniel J. Smith -- https://vorpus.org From ralf.gommers at gmail.com Sat Mar 10 00:24:52 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 9 Mar 2018 21:24:52 -0800 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: On Fri, Mar 9, 2018 at 12:00 AM, Nathaniel Smith wrote: > On Thu, Mar 8, 2018 at 10:26 PM, Ralf Gommers > wrote: > > > > > > On Thu, Mar 8, 2018 at 8:22 PM, Nathaniel Smith wrote: > >> > >> On Thu, Mar 8, 2018 at 7:06 AM, Marten van Kerkwijk > >> wrote: > >> > Hi Nathaniel, > >> > > >> > Overall, hugely in favour! For detailed comments, it would be good to > >> > have a link to a PR; could you put that up? > >> > >> Well, there's a PR here: https://github.com/numpy/numpy/pull/10706 > >> > >> But, this raises a question :-). (One which also came up here: > >> https://github.com/numpy/numpy/pull/10704#issuecomment-371684170) > >> > >> There are sensible two workflows we could use (or at least, two that I > >> can think of): > >> > >> 1. We merge updates to the NEPs as we go, so that whatever's in the > >> repo is the current draft. Anyone can go to the NEP webpage at > >> http://numpy.org/neps (WIP, see #10702) to see the latest version of > >> all NEPs, whether accepted, rejected, or in progress. Discussion > >> happens on the mailing list, and line-by-line feedback can be done by > >> quote-replying and commenting on individual lines. From time to time, > >> the NEP author takes all the accumulated feedback, updates the > >> document, and makes a new post to the list to let people know about > >> the updated version. > >> > >> This is how python-dev handles PEPs. > >> > >> 2. We use Github itself to manage the review. The repo only contains > >> "accepted" NEPs; draft NEPs are represented by open PRs, and rejected > >> NEPs are represented by PRs that were closed-without-merging. > >> Discussion uses Github's commenting/review tools, and happens in the > >> PR itself. > >> > >> This is roughly how Rust handles their RFC process, for example: > >> https://github.com/rust-lang/rfcs > >> > >> Trying to do some hybrid version of these seems like it would be > >> pretty painful, so we should pick one. 
> >> > >> Given that historically we've tried to use the mailing list for > >> substantive features/planning discussions, and that our NEP process > >> has been much closer to workflow 1 than workflow 2 (e.g., there are > >> already a bunch of old NEPs already in the repo that are effectively > >> rejected/withdrawn), I think we should maybe continue that way, and > >> keep discussions here? > >> > >> So my suggestion is discussion should happen on the list, and NEP > >> updates should be merged promptly, or just self-merged. Sound good? > > > > > > Agreed that overall (1) is better than (2), rejected NEPs should be > visible. > > However there's no need for super-quick self-merge, and I think it would > be > > counter-productive. > > > > Instead, just send a PR, leave it open for some discussion, and update > for > > detailed comments (as well as long in-depth discussions that only a > couple > > of people care about) in the Github UI and major ones on the list. Once > it's > > stabilized a bit, then merge with status "Draft" and update once in a > while. > > I think this is also much more in like with what python-dev does, I have > > seen substantial discussion on Github and have not seen quick > self-merges. > > Not sure what you mean about python-dev. Are you looking at the peps > repository? https://github.com/python/peps I was mostly thinking about packaging PEPs that are now also there, but were separate. Stuff like https://github.com/pypa/interoperability-peps/pull/54. There seems to be significantly more comments on packaging things than on other PEPs. > > > From a quick skim, it looks like of the last 37 commits, only 8 came > in through PRs and the other 29 were pushed directly by committers > without any review. 3 of the 8 PRs were self-merged immediately after > submission, and of the remaining 5 PRs, 4 of them were from external > contributors who didn't have commit rights, and the 1 other was a fix > to the repo README, rather than an actual PEP change. I don't think > I've ever seen any kind of substantive discussion in that repo -- any > discussion is mostly restricted to helping new contributors with > procedural stuff, maybe formatting issues or fixes to the PEP tooling. > > Anyway, just because python-dev does it that way doesn't mean that we > have to too. > > But if we split discussions between GH and the mailing list, then > we're definitely going to end up discussing substantive issues there > (how do we know which discussions only a couple of people care > about?), and trying to juggle that seems confusing to me, plus makes > it harder to track down what happened later, after we've had multiple > PRs each with their own comments... > It's not imho, because it's what we already do on this list. Github is a superior review interface over mailing list, so my vote goes to using that interface, while keeping this list in the loop on critical stuff and decisions about to be made. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrocklin at gmail.com Sat Mar 10 07:27:04 2018 From: mrocklin at gmail.com (Matthew Rocklin) Date: Sat, 10 Mar 2018 07:27:04 -0500 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: I'm very glad to see this discussion. I think that coming up with a single definition of array-like may be difficult, and that we might end up wanting to embrace duck typing instead. 
It seems to me that different array-like classes will implement different mixtures of features. It may be difficult to pin down a single definition that includes anything except for the most basic attributes (shape and dtype?). Consider two extreme cases of restrictive functionality: 1. LinearOperators (support dot in a numpy-like way) 2. Storage objects like h5py (support getitem in a numpy-like way) I can imagine authors of both groups saying that they should qualify as array-like because downstream projects that consume them should not convert them to numpy arrays in important contexts. The name "duck arrays" that we sometimes use doesn't necessarily mean "quack like an ndarray" but might actually mean a number of different things in different contexts. Making a single class or predicate for duck arrays may not be as effective as we want. Instead, it might be that we need a number of different protocols like `__array_mat_vec__` or `__array_slice__` that downstream projects can check instead. I can imagine cases where I want to check only "can I use this thing to multiply against arrays" or "can I get numpy arrays out of this thing with numpy slicing" rather than "is this thing array-like" because I may genuinely not care about most of the functionality in a blessed definition of "array-like". On Fri, Mar 9, 2018 at 8:45 PM, Nathaniel Smith wrote: > On Thu, Mar 8, 2018 at 5:51 PM, Juan Nunez-Iglesias > wrote: > >> Finally for the name, what about `asduckarray`? Thought perhaps that > could > >> be a source of confusion, and given the gradation of duck array like > types. > > > > I suggest that the name should *not* use programmer lingo, so neither > > "abstract" nor "duck" should be in there. My humble proposal is > "arraylike". > > (I know that this term has included things like "list-of-list" before but > > only in text, not code, as far as I know.) > > I agree with your point about avoiding programmer lingo. My first > draft actually used 'asduckarray', but that's like an in-joke; it > works fine for us, but it's not really something I want teachers to > have to explain on day 1... > > Array-like is problematic too though, because we still need a way to > say "thing that can be coerced to an array", which is what array-like > has been used to mean historically. And with the new type hints stuff, > it is actually becoming code. E.g. what should the type hints here be: > > asabstractarray(a: X) -> Y > > Right now "X" is "ArrayLike", but if we make "Y" be "ArrayLike" then > we'll need to come up with some other name for "X" :-). > > Maybe we can call duck arrays "py arrays", since the idea is that they > implement the standard Python array API (but not necessarily the > C-level array API)? np.PyArray, np.aspyarray()? > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Sat Mar 10 17:39:40 2018 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 10 Mar 2018 23:39:40 +0100 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: On Sat, Mar 10, 2018 at 1:27 PM, Matthew Rocklin wrote: > I'm very glad to see this discussion. > me too, but.... 
> I think that coming up with a single definition of array-like may be > difficult, and that we might end up wanting to embrace duck typing instead. > exactly -- I think there is a clear line between "uses the numpy memory layout" and the Python API. But the python API is pretty darn big, and many "array_ish" classes implement only partvof it, and may even implement some parts a bit differently. So really hard to have "one" definition, except "Python API exactly like a ndarray" -- and I'm wondering how useful that is. It seems to me that different array-like classes will implement different > mixtures of features. It may be difficult to pin down a single definition > that includes anything except for the most basic attributes (shape and > dtype?). > or a minimum set -- but again, how useful?? > Storage objects like h5py (support getitem in a numpy-like way) > Exactly -- though I don't know about h5py, but netCDF4 variables supoprt a useful subst of ndarray, but do "fancy indexing" differently -- so are they ndarray_ish? -- sorry to coin yet another term :-) > I can imagine authors of both groups saying that they should qualify as > array-like because downstream projects that consume them should not convert > them to numpy arrays in important contexts. > indeed. My solution so far is to define my own duck types "asarraylike" that checks for the actual methods I need: https://github.com/NOAA-ORR-ERD/gridded/blob/master/gridded/utilities.py which has: must_have = ['dtype', 'shape', 'ndim', '__len__', '__getitem__', ' __getattribute__'] def isarraylike(obj): """ tests if obj acts enough like an array to be used in gridded. This should catch netCDF4 variables and numpy arrays, at least, etc. Note: these won't check if the attributes required actually work right. """ for attr in must_have: if not hasattr(obj, attr): return False return True def asarraylike(obj): """ If it satisfies the requirements of pyugrid the object is returned as is. If not, then numpy's array() will be called on it. :param obj: The object to check if it's like an array """ return obj if isarraylike(obj) else np.array(obj) It's possible that we could come up with semi-standard "groupings" of attributes to produce "levels" of compatibility, or maybe not levels, but independentgroupings, so you could specify which groupings you need in this instance. > The name "duck arrays" that we sometimes use doesn't necessarily mean > "quack like an ndarray" but might actually mean a number of different > things in different contexts. Making a single class or predicate for duck > arrays may not be as effective as we want. Instead, it might be that we > need a number of different protocols like `__array_mat_vec__` or `__array_slice__` > that downstream projects can check instead. I can imagine cases where I > want to check only "can I use this thing to multiply against arrays" or > "can I get numpy arrays out of this thing with numpy slicing" rather than > "is this thing array-like" because I may genuinely not care about most of > the functionality in a blessed definition of "array-like". > exactly. but maybe we won't know until we try. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
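To make the "groupings of attributes" idea above concrete, here is a minimal sketch of what opt-in compatibility groups could look like. The group names and their contents are purely illustrative -- nothing like this is an agreed NumPy API -- but it shows how a consumer could ask only for the capabilities it actually needs:

import numpy as np

# Hypothetical attribute "groupings"; names and contents are illustrative only.
ATTRIBUTE_GROUPS = {
    "shape": ("shape", "ndim", "dtype"),
    "indexing": ("__getitem__", "__len__"),
    "arithmetic": ("__add__", "__mul__", "__array_ufunc__"),
}

def supports(obj, groups):
    """Return True if obj exposes every attribute in the requested groups."""
    return all(hasattr(obj, name)
               for group in groups
               for name in ATTRIBUTE_GROUPS[group])

def as_arraylike(obj, groups=("shape", "indexing")):
    """Pass obj through if it quacks enough, otherwise coerce to ndarray."""
    return obj if supports(obj, groups) else np.asarray(obj)

print(supports(np.arange(4), ["shape", "indexing", "arithmetic"]))  # True
print(as_arraylike([[1, 2], [3, 4]]).shape)  # (2, 2): the list was coerced

Something like an h5py dataset or netCDF4 variable would likely pass the "shape" and "indexing" groups but not "arithmetic", which is exactly the distinction drawn above.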
URL: From m.h.vankerkwijk at gmail.com Sat Mar 10 19:13:50 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 10 Mar 2018 19:13:50 -0500 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: ?I think we don't have to make it sounds like there are *that* many types of compatibility: really there is just array organisation (indexing/reshaping) and array arithmetic. These correspond roughly to ShapedLikeNDArray in astropy and NDArrayOperatorMixin in numpy (missing so far is concatenation). The advantage of the ABC classes is that they can supply missing methods (say, size, isscalar, __len__, and ndim given shape; __iter__ given __getitem__, ravel, squeeze, flatten given reshape; etc.). -- Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregor.thalhammer at gmail.com Sun Mar 11 15:52:46 2018 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Sun, 11 Mar 2018 20:52:46 +0100 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: <23471BD4-A81B-4B9C-AECC-D161C3643B81@gmail.com> Message-ID: > Am 09.03.2018 um 02:06 schrieb Nathaniel Smith : > > On Thu, Mar 8, 2018 at 1:52 AM, Gregor Thalhammer > > wrote: >> >> Hi, >> >> long time ago I wrote a wrapper to to use optimised and parallelized math >> functions from Intels vector math library >> geggo/uvml: Provide vectorized math function (MKL) for numpy >> >> I found it useful to inject (some of) the fast methods into numpy via >> np.set_num_ops(), to gain more performance without changing my programs. >> >> While this original project is outdated, I can imagine that a centralised >> way to swap the implementation of math functions is useful. Therefor I >> suggest to keep np.set_num_ops(), but admittedly I do not understand all the >> technical implications of the proposed change. > > The main part of the proposal is to merge the two libraries; the > question of whether to deprecate set_numeric_ops is a bit separate. > There's no technical obstacle to keeping it, except the usual issue of > having more cruft to maintain :-). > > It's usually true that any monkeypatching interface will be useful to > someone under some circumstances, but we usually don't consider this a > good enough reason on its own to add and maintain these kinds of > interfaces. And an unfortunate side-effect of these kinds of hacky > interfaces is that they can end up removing the pressure to solve > problems properly. In this case, better solutions would include: > > - Adding support for accelerated vector math libraries to NumPy > directly (e.g. MKL, yeppp) > > - Overriding the inner loops inside ufuncs like numpy.add that > np.ndarray.__add__ ultimately calls. This would speed up all addition > (whether or not it uses Python + syntax), would be a more general > solution (e.g. you could monkeypatch np.exp to use MKL's fast > vectorized exp), would let you skip reimplementing all the tricky > shared bits of the ufunc logic, etc. Conceptually it's not even very > hacky, because we allow you add new loops to existing ufuncs; making > it possible to replace existing loops wouldn't be a big stretch. (In > fact it's possible that we already allow this; I haven't checked.) > > So I still lean towards deprecating set_numeric_ops. 
It's not the most > crucial part of the proposal though; if it turns out to be too > controversial then I'll take it out. Dear Nathaniel, since you referred to your reply in your latest post in this thread I comment here. First, I agree that set_numeric_ops() is not very important for replacing numpy math functions with faster implementations, mostly because this covers only the basic operations (+, *, boolean operations), which are fast anyhow, only pow can be accelerated by a substantial factor. I also agree that adding support for optimised math function libraries directly to numpy might be a better solution than patching numpy. But in the past there have been a couple of proposals to add fast vectorised math functions directly to numpy, e.g. for a GSoC project. There have always been long discussions about maintainability, testing, vendor lock-in, free versus non-free software ? all attempts failed. Only the Intel accelerated Python distribution claims that it boosted performance for transcendental functions, but I do not know how they achieved this and if this could be integrated in the official numpy. Therefor I think there is some need for an ?official? way to swap numpy math functions at the user (Python) level at runtime. As Julian commented, you want this flexibility because of speed and accuracy trade-offs. Just replacing the inner loop might be an alternative way, but I am not sure. Many optimised vector math libraries require contiguous arrays, so they don?t fulfil the expectations numpy has for an inner loop. So you would need to allocate memory, copy, and free memory for each call to the inner loop. I image this gives quite some overhead you could avoid by a completely custom ufunc. On the other hand, setting up a ufunc from inner loop functions is easy, you can reuse all the numpy machinery. I disagree with you that you have to reimplement the whole ufunc machinery if you swap math functions at the ufunc level. Stupid question: how to get the first argument of int PyUFunc_ReplaceLoopBySignature(PyUFuncObject * ufunc, e.g. for np.add ? So, please consider this when refactoring/redesigning the ufunc module. Gregor > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Mar 12 12:05:15 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 12 Mar 2018 12:05:15 -0400 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: Message-ID: Hi Nathanial, I looked through the revised text at https://github.com/numpy/numpy/pull/10704 and think it covers things well; any improvements on the organisation I can think of would seem to start with doing the merge anyway (e.g., I quite like Eric Wieser's suggested base ndarray class; the additional bits that implement operators might quite easily become useful for duck arrays). One request: can it be part of the NEP to aim to document the organisation of the whole more clearly? For me at least, one of the big hurdles to trying to contribute to the C code has been the absence of a mental picture of how it all hangs together. 
All the best, Marten From charlesr.harris at gmail.com Mon Mar 12 14:25:42 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 Mar 2018 12:25:42 -0600 Subject: [Numpy-discussion] NumPy 1.14.2 released Message-ID: Hi All, I am pleased to announce the release of NumPy 1.14.2. This is a bugfix release for some bugs reported following the 1.14.1 release. The major problems dealt with are as follows. - Residual bugs in the new array printing functionality. - Regression resulting in a relocation problem with shared library. - Improved PyPy compatibility. This release supports Python 2.7 and 3.4 - 3.6. Wheels for the release are available on PyPI. Source tarballs, zipfiles, release notes, and the changelog are available on github . The Python 3.6 wheels available from PIP are built with Python 3.6.2 and should be compatible with all previous versions of Python 3.6. The source releases were cythonized with Cython 0.26.1, which is known to *not* support the upcoming Python 3.7 release. People who wish to run Python 3.7 should check out the NumPy repo and try building with the, as yet, unreleased master branch of Cython. Contributors ============ A total of 4 people contributed to this release. People with a "+" by their names contributed a patch for the first time. * Allan Haldane * Charles Harris * Eric Wieser * Pauli Virtanen Pull requests merged ==================== A total of 5 pull requests were merged for this release. * `#10674 `__: BUG: Further back-compat fix for subclassed array repr * `#10725 `__: BUG: dragon4 fractional output mode adds too many trailing zeros * `#10726 `__: BUG: Fix f2py generated code to work on PyPy * `#10727 `__: BUG: Fix missing NPY_VISIBILITY_HIDDEN on npy_longdouble_to_PyLong * `#10729 `__: DOC: Create 1.14.2 notes and changelog. Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Mar 12 14:44:31 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 Mar 2018 12:44:31 -0600 Subject: [Numpy-discussion] NumPy 1.15 release schedule Message-ID: Hi All, I'm thinking of branching NumPy in the middle/end of April. That is quicker than usual, but there don't seem to be any major changes proposed for the near future, we have merged a reasonable number of PRs, and a Python 3.7 compatible release of Cython looks to be forthcoming. An early release will also give us time for possibly two following releases before we drop Python 2.7 support. With that schedule, I also propose to drop Python 3.4 support in NumPy 1.16. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Mar 12 15:01:40 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 Mar 2018 13:01:40 -0600 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: Message-ID: On Thu, Mar 8, 2018 at 1:25 AM, Nathaniel Smith wrote: > Hi all, > > Well, this is something that we've discussed for a while and I think > generally has consensus already, but I figured I'd write it down > anyway to make sure. > > There's a rendered version here: > https://github.com/njsmith/numpy/blob/nep-0015-merge- > multiarray-umath/doc/neps/nep-0015-merge-multiarray-umath.rst > > ----- > > ============================ > Merging multiarray and umath > ============================ > > :Author: Nathaniel J. 
Smith > :Status: Draft > :Type: Standards Track > :Created: 2018-02-22 > > > Abstract > -------- > > Let's merge ``numpy.core.multiarray`` and ``numpy.core.umath`` into a > single extension module, and deprecate ``np.set_numeric_ops``. > > > Background > ---------- > > Currently, numpy's core C code is split between two separate extension > modules. > > ``numpy.core.multiarray`` is built from > ``numpy/core/src/multiarray/*.c``, and contains the core array > functionality (in particular, the ``ndarray`` object). > > ``numpy.core.umath`` is built from ``numpy/core/src/umath/*.c``, and > contains the ufunc machinery. > > These two modules each expose their own separate C API, accessed via > ``import_multiarray()`` and ``import_umath()`` respectively. The idea > is that they're supposed to be independent modules, with > ``multiarray`` as a lower-level layer with ``umath`` built on top. In > practice this has turned out to be problematic. > > First, the layering isn't perfect: when you write ``ndarray + > ndarray``, this invokes ``ndarray.__add__``, which then calls the > ufunc ``np.add``. This means that ``ndarray`` needs to know about > ufuncs ? so instead of a clean layering, we have a circular > dependency. To solve this, ``multiarray`` exports a somewhat > terrifying function called ``set_numeric_ops``. The bootstrap > procedure each time you ``import numpy`` is: > > 1. ``multiarray`` and its ``ndarray`` object are loaded, but > arithmetic operations on ndarrays are broken. > > 2. ``umath`` is loaded. > > 3. ``set_numeric_ops`` is used to monkeypatch all the methods like > ``ndarray.__add__`` with objects from ``umath``. > > In addition, ``set_numeric_ops`` is exposed as a public API, > ``np.set_numeric_ops``. > > Furthermore, even when this layering does work, it ends up distorting > the shape of our public ABI. In recent years, the most common reason > for adding new functions to ``multiarray``\'s "public" ABI is not that > they really need to be public or that we expect other projects to use > them, but rather just that we need to call them from ``umath``. This > is extremely unfortunate, because it makes our public ABI > unnecessarily large, and since we can never remove things from it then > this creates an ongoing maintenance burden. The way C works, you can > have internal API that's visible to everything inside the same > extension module, or you can have a public API that everyone can use; > you can't have an API that's visible to multiple extension modules > inside numpy, but not to external users. > > We've also increasingly been putting utility code into > ``numpy/core/src/private/``, which now contains a bunch of files which > are ``#include``\d twice, once into ``multiarray`` and once into > ``umath``. This is pretty gross, and is purely a workaround for these > being separate C extensions. > > > Proposed changes > ---------------- > > This NEP proposes three changes: > > 1. We should start building ``numpy/core/src/multiarray/*.c`` and > ``numpy/core/src/umath/*.c`` together into a single extension > module. > > 2. Instead of ``set_numeric_ops``, we should use some new, private API > to set up ``ndarray.__add__`` and friends. > > 3. We should deprecate, and eventually remove, ``np.set_numeric_ops``. > > > Non-proposed changes > -------------------- > > We don't necessarily propose to throw away the distinction between > multiarray/ and umath/ in terms of our source code organization: > internal organization is useful! 
We just want to build them together > into a single extension module. Of course, this does open the door for > potential future refactorings, which we can then evaluate based on > their merits as they come up. > > It also doesn't propose that we break the public C ABI. We should > continue to provide ``import_multiarray()`` and ``import_umath()`` > functions ? it's just that now both ABIs will ultimately be loaded > from the same C library. Due to how ``import_multiarray()`` and > ``import_umath()`` are written, we'll also still need to have modules > called ``numpy.core.multiarray`` and ``numpy.core.umath``, and they'll > need to continue to export ``_ARRAY_API`` and ``_UFUNC_API`` objects ? > but we can make one or both of these modules be tiny shims that simply > re-export the magic API object from where-ever it's actually defined. > (See ``numpy/core/code_generators/generate_{numpy,ufunc}_api.py`` for > details of how these imports work.) > > > Backward compatibility > ---------------------- > > The only compatibility break is the deprecation of ``np.set_numeric_ops``. > > > Alternatives > ------------ > > n/a > > > Discussion > ---------- > > TBD > > > Copyright > --------- > > This document has been placed in the public domain. > If we accept this NEP, I'd like to get it done soon, preferably and the next few months, so that it is finished before we drop Python 2.7 support. That will make maintenance of the NumPy long term support release through 2019 easier. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 12 15:25:20 2018 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 12 Mar 2018 12:25:20 -0700 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: Message-ID: On Mar 12, 2018 12:02, "Charles R Harris" wrote: If we accept this NEP, I'd like to get it done soon, preferably and the next few months, so that it is finished before we drop Python 2.7 support. That will make maintenance of the NumPy long term support release through 2019 easier. The reason you're seeing this spurt of activity on NEPs and NEP infrastructure from people at Berkeley is that we're preparing for the upcoming arrival of full time devs on the numpy grant. (More announcements there soon.) So if it's accepted then I don't think there will be any problem getting it implemented by then. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Mar 12 15:40:42 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 Mar 2018 13:40:42 -0600 Subject: [Numpy-discussion] New NEP: merging multiarray and umath In-Reply-To: References: Message-ID: On Mon, Mar 12, 2018 at 1:25 PM, Nathaniel Smith wrote: > On Mar 12, 2018 12:02, "Charles R Harris" > wrote: > > > If we accept this NEP, I'd like to get it done soon, preferably and the > next few months, so that it is finished before we drop Python 2.7 support. > That will make maintenance of the NumPy long term support release through > 2019 easier. > > > The reason you're seeing this spurt of activity on NEPs and NEP > infrastructure from people at Berkeley is that we're preparing for the > upcoming arrival of full time devs on the numpy grant. (More announcements > there soon.) So if it's accepted then I don't think there will be any > problem getting it implemented by then. > Depends on background. Even the best developers need some time to come up to speed on a new project ... 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tcaswell at gmail.com Mon Mar 12 17:45:11 2018 From: tcaswell at gmail.com (Thomas Caswell) Date: Mon, 12 Mar 2018 21:45:11 +0000 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: As commented in the OP, this would be very useful for Matplotlib. Tom On Fri, Mar 9, 2018 at 1:42 PM Kirit Thadaka wrote: > Hi! > > I've created a PR to add a function called "histogram_bin_edges" which > will allow a user to calculate the bins used by the histogram for some data > without requiring the entire histogram to be calculated. > > https://github.com/numpy/numpy/pull/10591#issuecomment-371863472 > > This function allows one set of bins to be computed, and reused across > multiple histograms which gives more easily comparable results than using > separate bins for each histogram. > > Please let me know if you have any suggestions on how to improve this PR. > > Thanks! > > - > Kirit > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Mon Mar 12 19:08:45 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Mon, 12 Mar 2018 23:08:45 +0000 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: As likely one of the primary users, Tom - does the function name seem reasonable? Eric On Mon, Mar 12, 2018, 21:45 Thomas Caswell wrote: > As commented in the OP, this would be very useful for Matplotlib. > > Tom > > On Fri, Mar 9, 2018 at 1:42 PM Kirit Thadaka > wrote: > >> Hi! >> >> I've created a PR to add a function called "histogram_bin_edges" which >> will allow a user to calculate the bins used by the histogram for some data >> without requiring the entire histogram to be calculated. >> >> https://github.com/numpy/numpy/pull/10591#issuecomment-371863472 >> >> This function allows one set of bins to be computed, and reused across >> multiple histograms which gives more easily comparable results than using >> separate bins for each histogram. >> >> Please let me know if you have any suggestions on how to improve this PR. >> >> Thanks! >> >> - >> Kirit >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Mar 12 22:58:09 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 12 Mar 2018 22:58:09 -0400 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: On Mon, Mar 12, 2018 at 7:08 PM, Eric Wieser wrote: > As likely one of the primary users, Tom - does the function name seem > reasonable? > > Eric > > > On Mon, Mar 12, 2018, 21:45 Thomas Caswell wrote: >> >> As commented in the OP, this would be very useful for Matplotlib. 
>> >> Tom >> >> On Fri, Mar 9, 2018 at 1:42 PM Kirit Thadaka >> wrote: >>> >>> Hi! >>> >>> I've created a PR to add a function called "histogram_bin_edges" which >>> will allow a user to calculate the bins used by the histogram for some data >>> without requiring the entire histogram to be calculated. >>> >>> https://github.com/numpy/numpy/pull/10591#issuecomment-371863472 >>> >>> This function allows one set of bins to be computed, and reused across >>> multiple histograms which gives more easily comparable results than using >>> separate bins for each histogram. Given that the bin selection are data driven, transferring them across datasets might not be so useful. (Aside I usually pick the bin_edges returned by the first histogram to use for any follow-up histograms, or pick something on a common range.) >>> >>> Please let me know if you have any suggestions on how to improve this PR. >>> >>> Thanks! as a bystander: LGTM and I think it's a good idea Josef >>> >>> - >>> Kirit >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From wieser.eric+numpy at gmail.com Mon Mar 12 23:20:17 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Mon, 12 Mar 2018 20:20:17 -0700 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: > Given that the bin selection are data driven, transferring them across datasets might not be so useful. The main application would be to compute bins across the union of all datasets. This is already possibly by using `np.histogram` and discarding the first result, but that's super wasteful. From josef.pktd at gmail.com Mon Mar 12 23:34:41 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 12 Mar 2018 23:34:41 -0400 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser wrote: >> Given that the bin selection are data driven, transferring them across datasets might not be so useful. > > The main application would be to compute bins across the union of all > datasets. This is already possibly by using `np.histogram` and > discarding the first result, but that's super wasteful. assuming "union" means a combined dataset. If you stack datasets, then the number of observations will not be correct for individual datasets. In that case an additional keyword like nobs, or whatever name would be appropriate for numpy, would be useful, e.g. use the average number of observations across datasets. Auxiliary statistic like std could then be computed on the total dataset (if that makes sense, which would not be the case if the variance across datasets is larger than the variance within datasets. 
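For concreteness, the workaround under discussion (compute the edges once from the pooled data, then reuse them) looks roughly like this -- a sketch only, and the caveat above about the pooled sample size driving the "auto" rules still applies:

import numpy as np

rng = np.random.RandomState(0)
datasets = [rng.normal(0, 1, 1000), rng.normal(0.5, 2, 300)]

# Run np.histogram on the pooled data purely for its edges (the counts
# are thrown away), then bin every dataset on that common grid.
_, edges = np.histogram(np.concatenate(datasets), bins="auto")
counts = [np.histogram(d, bins=edges)[0] for d in datasets]

# The proposed histogram_bin_edges would replace the wasteful first call.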
Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From marko.asplund at gmail.com Wed Mar 14 02:32:20 2018 From: marko.asplund at gmail.com (Marko Asplund) Date: Wed, 14 Mar 2018 08:32:20 +0200 Subject: [Numpy-discussion] numpy.random.randn Message-ID: On Fri, 9 Mar 2018 11:38:55, Robert Kern wrote: > > Sorry for being a bit inaccurate. > > My Scala code actually mirrors the NumPy based random initialization, so > > I sample with Gaussian of mean = 0 and std dev = 1, then multiply with 0.01. > > Have you verified this? I.e. save out the Scala-initialized network and > load it up with numpy to check the mean and std dev? How about if you run > the numpy NN training with the Scala-initialized network? Does that also > diverge? I did what you suggested and it turned out my NumPy NN code was behaving exactly as the Scala code when using Scala-initialized network. After digging deeper into this I managed to find and fix a bug in how I was doing the random initilization and it's working correctly now. Thanks a lot for your help! Marko -------------- next part -------------- An HTML attachment was scrubbed... URL: From jkkulick at amazon.de Wed Mar 14 04:05:42 2018 From: jkkulick at amazon.de (Kulick, Johannes) Date: Wed, 14 Mar 2018 08:05:42 +0000 Subject: [Numpy-discussion] ENH: softmax Message-ID: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> Hi, I regularly need the softmax function (https://en.wikipedia.org/wiki/Softmax_function) for my code. I have a quite efficient pure python implementation (credits to Nolan Conaway). I think it would be a valuable enhancement of the ndarray class. But since it is kind of a specialty function I wanted to ask you if you would consider it to be part of the numpy core (alongside ndarray.max and ndarray.argmax) or rather in scipy (e.g. scipy.stats seems also an appropriate place). Best Johannes Amazon Development Center Germany GmbH Berlin - Dresden - Aachen main office: Krausenstr. 38, 10117 Berlin Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger Ust-ID: DE289237879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Wed Mar 14 04:22:14 2018 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 14 Mar 2018 04:22:14 -0400 Subject: [Numpy-discussion] ENH: softmax In-Reply-To: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> References: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> Message-ID: On Wed, Mar 14, 2018 at 4:05 AM, Kulick, Johannes wrote: > Hi, > > > > I regularly need the softmax function (https://en.wikipedia.org/ > wiki/Softmax_function) for my code. I have a quite efficient pure python > implementation (credits to Nolan Conaway). I think it would be a valuable > enhancement of the ndarray class. But since it is kind of a specialty > function I wanted to ask you if you would consider it to be part of the > numpy core (alongside ndarray.max and ndarray.argmax) or rather in scipy > (e.g. scipy.stats seems also an appropriate place). > > Johannes, If the numpy devs aren't interested in adding it to numpy, I'm pretty sure we can get it in scipy. I've had adding it (or at least proposing that it be added) to scipy on my to-do list for quite a while now. 
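For reference, the usual numerically stable formulation is only a few lines of plain NumPy -- a sketch of the standard recipe, not of whatever API eventually lands in scipy:

import numpy as np

def softmax(x, axis=-1):
    # Subtracting the running maximum keeps exp() from overflowing; the
    # result is unchanged because softmax is invariant to constant shifts.
    x = np.asarray(x, dtype=float)
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

print(softmax([1.0, 2.0, 3.0]))     # [0.09003057 0.24472847 0.66524096]
print(softmax([[1000.0, 1000.0]]))  # [[0.5 0.5]] -- no overflow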
Warren > > > Best > > Johannes > > > > Amazon Development Center Germany GmbH > Berlin - Dresden - Aachen > main office: Krausenstr. 38, 10117 Berlin > Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger > Ust-ID: DE289237879 > Eingetragen am Amtsgericht Charlottenburg HRB 149173 B > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Mar 14 04:41:25 2018 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 Mar 2018 17:41:25 +0900 Subject: [Numpy-discussion] ENH: softmax In-Reply-To: References: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> Message-ID: On Wed, Mar 14, 2018 at 5:22 PM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > > On Wed, Mar 14, 2018 at 4:05 AM, Kulick, Johannes wrote: >> >> Hi, >> >> I regularly need the softmax function ( https://en.wikipedia.org/wiki/Softmax_function) for my code. I have a quite efficient pure python implementation (credits to Nolan Conaway). I think it would be a valuable enhancement of the ndarray class. But since it is kind of a specialty function I wanted to ask you if you would consider it to be part of the numpy core (alongside ndarray.max and ndarray.argmax) or rather in scipy (e.g. scipy.stats seems also an appropriate place). > > Johannes, > > If the numpy devs aren't interested in adding it to numpy, I'm pretty sure we can get it in scipy. I've had adding it (or at least proposing that it be added) to scipy on my to-do list for quite a while now. +1 for scipy.special. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Wed Mar 14 09:27:21 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 14 Mar 2018 09:27:21 -0400 Subject: [Numpy-discussion] ENH: softmax In-Reply-To: References: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> Message-ID: I think this indeed makes most sense for scipy. I possible, write it as a `gufunc`, so duck arrays can override with `__array_ufunc__` if necessary. -- Marten From ralf.gommers at gmail.com Wed Mar 14 09:37:46 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 14 Mar 2018 06:37:46 -0700 Subject: [Numpy-discussion] ENH: softmax In-Reply-To: References: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> Message-ID: On Wed, Mar 14, 2018 at 1:41 AM, Robert Kern wrote: > On Wed, Mar 14, 2018 at 5:22 PM, Warren Weckesser < > warren.weckesser at gmail.com> wrote: > > > > On Wed, Mar 14, 2018 at 4:05 AM, Kulick, Johannes > wrote: > >> > >> Hi, > >> > >> I regularly need the softmax function (https://en.wikipedia.org/ > wiki/Softmax_function) for my code. I have a quite efficient pure python > implementation (credits to Nolan Conaway). I think it would be a valuable > enhancement of the ndarray class. But since it is kind of a specialty > function I wanted to ask you if you would consider it to be part of the > numpy core (alongside ndarray.max and ndarray.argmax) or rather in scipy > (e.g. scipy.stats seems also an appropriate place). > > > > Johannes, > > > > If the numpy devs aren't interested in adding it to numpy, I'm pretty > sure we can get it in scipy. I've had adding it (or at least proposing > that it be added) to scipy on my to-do list for quite a while now. > > +1 for scipy.special. 
> scipy.special sounds right to me too Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Wed Mar 14 09:44:49 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Wed, 14 Mar 2018 06:44:49 -0700 Subject: [Numpy-discussion] ENH: softmax In-Reply-To: References: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> Message-ID: I possible, write it as a `gufunc`, so duck arrays can override with `__array_ufunc__` if necessary. -- Marten Softmax is a very simple combination of elementary `ufunc`s with two inputs, the weight vector `w` and the data `x`. Writing it as a `gufunc` would be going overboard, IMO. Writing it as a combination of `ufunc`s and avoiding Numpy-specific stuff should be good enough. -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Wed Mar 14 14:01:18 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 14 Mar 2018 14:01:18 -0400 Subject: [Numpy-discussion] ENH: softmax In-Reply-To: References: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> Message-ID: On Wed, Mar 14, 2018 at 9:44 AM, Hameer Abbasi wrote: > I possible, write it as a `gufunc`, so duck arrays can override with > `__array_ufunc__` if > > necessary. -- Marten > > Softmax is a very simple combination of elementary `ufunc`s with two inputs, > the weight vector `w` and the data `x`. Writing it as a `gufunc` would be > going overboard, IMO. Writing it as a combination of `ufunc`s and avoiding > Numpy-specific stuff should be good enough. My mistake - I thought the result was reduced, but you only need a reduction along the way. Writing this in terms of standard functions is certainly fine! -- Marten From jkkulick at amazon.de Wed Mar 14 18:04:46 2018 From: jkkulick at amazon.de (Kulick, Johannes) Date: Wed, 14 Mar 2018 22:04:46 +0000 Subject: [Numpy-discussion] ENH: softmax In-Reply-To: References: <968D67CA-5A81-48EB-87CF-B03091A933C2@amazon.com> Message-ID: Alright. Going for scipy.special then. Thanks for the quick answer. Cheers Johannes ?On 14.03.18, 19:03, "NumPy-Discussion on behalf of Marten van Kerkwijk" wrote: On Wed, Mar 14, 2018 at 9:44 AM, Hameer Abbasi wrote: > I possible, write it as a `gufunc`, so duck arrays can override with > `__array_ufunc__` if > > necessary. -- Marten > > Softmax is a very simple combination of elementary `ufunc`s with two inputs, > the weight vector `w` and the data `x`. Writing it as a `gufunc` would be > going overboard, IMO. Writing it as a combination of `ufunc`s and avoiding > Numpy-specific stuff should be good enough. My mistake - I thought the result was reduced, but you only need a reduction along the way. Writing this in terms of standard functions is certainly fine! -- Marten _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion Amazon Development Center Germany GmbH Berlin - Dresden - Aachen main office: Krausenstr. 38, 10117 Berlin Geschaeftsfuehrer: Dr. 
Ralf Herbrich, Christian Schlaeger Ust-ID: DE289237879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B From m.h.vankerkwijk at gmail.com Wed Mar 14 21:27:46 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 14 Mar 2018 21:27:46 -0400 Subject: [Numpy-discussion] Where to discuss NEPs (was: Re: new NEP: np.AbstractArray and np.asabstractarray) In-Reply-To: References: Message-ID: Apparently, where and how to discuss enhancement proposals was recently a topic on the python mailing list as well -- see the write-up at LWN: https://lwn.net/SubscriberLink/749200/4343911ee71e35cf/ The conclusion seems to be that one should switch to mailman3... -- Marten From stefanv at berkeley.edu Thu Mar 15 18:29:06 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 15 Mar 2018 15:29:06 -0700 Subject: [Numpy-discussion] NEP sprint: 21 and 22 March In-Reply-To: <20180309232638.vumxg3z4dzfaz3yo@carbo> References: <20180309232638.vumxg3z4dzfaz3yo@carbo> Message-ID: <20180315222906.xc33qjkgas2k55xs@carbo> Hi everyone, A quick reminder of the NEP sprint that will happen at Berkeley next Wednesday and Thursday. Please let me know if you are interested in joining. Best regards St?fan On Fri, 09 Mar 2018 15:26:38 -0800, Stefan van der Walt wrote: > Hi everyone, > > As you may have noticed, there's been quite a bit of movement recently > around NumPy Enhancement Proposals---on setting specifications, > building infrastructure, as well as writing new proposals. > > To further support this work, we will be hosting an informal NEP > sprint at Berkeley on 21 and 22 March. Our aim is to bring core > contributors and interested community members together to discuss > proposal ideas, write up new NEPs, and polish existing ones. > > Some potential topics of discussion are: > > - Duck arrays > - Array concatenation > - Random number generator seed versioning > - User defined dtypes > - Deprecation pathways for `np.matrix` > - What to do about nditer? > > All community members are welcome to attend. If you are a core > contributor, we may be able to fund some travel costs as well; please > let me know. > > Best regards > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From tcaswell at gmail.com Thu Mar 15 22:56:41 2018 From: tcaswell at gmail.com (Thomas Caswell) Date: Fri, 16 Mar 2018 02:56:41 +0000 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: Yes I like the name. The primary use-case for Matplotlib is that our `hist` method can take in a list of arrays and produces N histograms in one shot. Currently with 'auto' we only use the first data set to sort out what the bins should be and then re-use those for the rest of the data sets. This will let us get the bins on the merged input, but I take Josef's point that this is not actually what we want.... Tom On Mon, Mar 12, 2018 at 11:35 PM wrote: > On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser > wrote: > >> Given that the bin selection are data driven, transferring them across > datasets might not be so useful. > > > > The main application would be to compute bins across the union of all > > datasets. This is already possibly by using `np.histogram` and > > discarding the first result, but that's super wasteful. > > assuming "union" means a combined dataset. 
> > If you stack datasets, then the number of observations will not be > correct for individual datasets. > > In that case an additional keyword like nobs, or whatever name would > be appropriate for numpy, would be useful, e.g. use the average number > of observations across datasets. > Auxiliary statistic like std could then be computed on the total > dataset (if that makes sense, which would not be the case if the > variance across datasets is larger than the variance within datasets. > > Josef > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 15 23:13:47 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 15 Mar 2018 20:13:47 -0700 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: Instead of an nobs argument, maybe we should have a version that accepts multiple data sets, so that we have the full information and can improve the algorithm over time. On Mar 15, 2018 7:57 PM, "Thomas Caswell" wrote: > Yes I like the name. > > The primary use-case for Matplotlib is that our `hist` method can take in > a list of arrays and produces N histograms in one shot. Currently with > 'auto' we only use the first data set to sort out what the bins should be > and then re-use those for the rest of the data sets. This will let us get > the bins on the merged input, but I take Josef's point that this is not > actually what we want.... > > Tom > > On Mon, Mar 12, 2018 at 11:35 PM wrote: > >> On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser >> wrote: >> >> Given that the bin selection are data driven, transferring them across >> datasets might not be so useful. >> > >> > The main application would be to compute bins across the union of all >> > datasets. This is already possibly by using `np.histogram` and >> > discarding the first result, but that's super wasteful. >> >> assuming "union" means a combined dataset. >> >> If you stack datasets, then the number of observations will not be >> correct for individual datasets. >> >> In that case an additional keyword like nobs, or whatever name would >> be appropriate for numpy, would be useful, e.g. use the average number >> of observations across datasets. >> Auxiliary statistic like std could then be computed on the total >> dataset (if that makes sense, which would not be the case if the >> variance across datasets is larger than the variance within datasets. >> >> Josef >> >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
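A rough sketch of what such a multi-data-set variant could look like; the name histogram_bin_edges_multi and the simple pooling strategy are placeholders for discussion, not a worked-out proposal:

import numpy as np

def histogram_bin_edges_multi(datasets, bins="auto", range=None):
    # Naive strategy: pool all samples and let the existing bin-width rules
    # run on the pooled data.  A smarter version could instead weight the
    # rules by the typical per-data-set sample size.
    pooled = np.concatenate([np.ravel(d) for d in datasets])
    _, edges = np.histogram(pooled, bins=bins, range=range)  # counts discarded
    return edges

rng = np.random.RandomState(42)
data = [rng.standard_normal(500), rng.standard_normal(80) + 3]
edges = histogram_bin_edges_multi(data)
counts = [np.histogram(d, bins=edges)[0] for d in data]

Once the proposed histogram_bin_edges lands, the np.histogram call above could be replaced by it directly.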
URL: From ralf.gommers at gmail.com Fri Mar 16 00:35:40 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 15 Mar 2018 21:35:40 -0700 Subject: [Numpy-discussion] NumPy 1.15 release schedule In-Reply-To: References: Message-ID: On Mon, Mar 12, 2018 at 11:44 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi All, > > I'm thinking of branching NumPy in the middle/end of April. That is > quicker than usual, but there don't seem to be any major changes proposed > for the near future, we have merged a reasonable number of PRs, and a > Python 3.7 compatible release of Cython looks to be forthcoming. An early > release will also give us time for possibly two following releases before > we drop Python 2.7 support. With that schedule, I also propose to drop > Python 3.4 support in NumPy 1.16. > > Thoughts? > Sounds fine to me. Thanks Chuck! Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Fri Mar 16 00:45:19 2018 From: matti.picus at gmail.com (matti picus) Date: Fri, 16 Mar 2018 04:45:19 +0000 Subject: [Numpy-discussion] NEP sprint: 21 and 22 March In-Reply-To: <20180315222906.xc33qjkgas2k55xs@carbo> References: <20180309232638.vumxg3z4dzfaz3yo@carbo> <20180315222906.xc33qjkgas2k55xs@carbo> Message-ID: I would love to join but I will be at the PyPy yearly sprint in Switzerland from Saturday to Wednesday, and traveling back to Israel on Thursday. I can join virtually Wednesday, my evening will be your morning. I begin traveling Thurs morning which is sometime Wed afternoon for you and will be offline until I arrive home around 20:00 Israel time, which is Thurs morning. Matti On Fri, 16 Mar 2018 at 00:29, Stefan van der Walt wrote: > Hi everyone, > > A quick reminder of the NEP sprint that will happen at Berkeley next > Wednesday and Thursday. Please let me know if you are interested in > joining. > > Best regards > St?fan > > On Fri, 09 Mar 2018 15:26:38 -0800, Stefan van der Walt wrote: > > Hi everyone, > > > > As you may have noticed, there's been quite a bit of movement recently > > around NumPy Enhancement Proposals---on setting specifications, > > building infrastructure, as well as writing new proposals. > > > > To further support this work, we will be hosting an informal NEP > > sprint at Berkeley on 21 and 22 March. Our aim is to bring core > > contributors and interested community members together to discuss > > proposal ideas, write up new NEPs, and polish existing ones. > > > > Some potential topics of discussion are: > > > > - Duck arrays > > - Array concatenation > > - Random number generator seed versioning > > - User defined dtypes > > - Deprecation pathways for `np.matrix` > > - What to do about nditer? > > > > All community members are welcome to attend. If you are a core > > contributor, we may be able to fund some travel costs as well; please > > let me know. > > > > Best regards > > St?fan > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wieser.eric+numpy at gmail.com Fri Mar 16 01:09:52 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Fri, 16 Mar 2018 05:09:52 +0000 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: That sounds like a reasonable extension - but I think there still exist cases where you want to treat the data as one uniform set when computing bins (toggling between orthogonal subsets of data) so isn't really a useful replacement. I suppose this becomes relevant when `density` is passed to the individual histogram invocations. Does matplotlib handle that correctly for stacked histograms? On Thu, Mar 15, 2018, 20:14 Nathaniel Smith wrote: > Instead of an nobs argument, maybe we should have a version that accepts > multiple data sets, so that we have the full information and can improve > the algorithm over time. > > On Mar 15, 2018 7:57 PM, "Thomas Caswell" wrote: > >> Yes I like the name. >> >> The primary use-case for Matplotlib is that our `hist` method can take in >> a list of arrays and produces N histograms in one shot. Currently with >> 'auto' we only use the first data set to sort out what the bins should be >> and then re-use those for the rest of the data sets. This will let us get >> the bins on the merged input, but I take Josef's point that this is not >> actually what we want.... >> >> Tom >> >> On Mon, Mar 12, 2018 at 11:35 PM wrote: >> >>> On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser >>> wrote: >>> >> Given that the bin selection are data driven, transferring them >>> across datasets might not be so useful. >>> > >>> > The main application would be to compute bins across the union of all >>> > datasets. This is already possibly by using `np.histogram` and >>> > discarding the first result, but that's super wasteful. >>> >>> assuming "union" means a combined dataset. >>> >>> If you stack datasets, then the number of observations will not be >>> correct for individual datasets. >>> >>> In that case an additional keyword like nobs, or whatever name would >>> be appropriate for numpy, would be useful, e.g. use the average number >>> of observations across datasets. >>> Auxiliary statistic like std could then be computed on the total >>> dataset (if that makes sense, which would not be the case if the >>> variance across datasets is larger than the variance within datasets. >>> >>> Josef >>> >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at python.org >>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Fri Mar 16 03:06:58 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 16 Mar 2018 00:06:58 -0700 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: Oh sure, I'm not suggesting it be impossible to calculate for a single data set. If nothing else, if we had a version that accepted a list of data sets, then you could always pass in a single-element list :-). On Mar 15, 2018 22:10, "Eric Wieser" wrote: > That sounds like a reasonable extension - but I think there still exist > cases where you want to treat the data as one uniform set when computing > bins (toggling between orthogonal subsets of data) so isn't really a useful > replacement. > > I suppose this becomes relevant when `density` is passed to the individual > histogram invocations. Does matplotlib handle that correctly for stacked > histograms? > > On Thu, Mar 15, 2018, 20:14 Nathaniel Smith wrote: > >> Instead of an nobs argument, maybe we should have a version that accepts >> multiple data sets, so that we have the full information and can improve >> the algorithm over time. >> >> On Mar 15, 2018 7:57 PM, "Thomas Caswell" wrote: >> >>> Yes I like the name. >>> >>> The primary use-case for Matplotlib is that our `hist` method can take >>> in a list of arrays and produces N histograms in one shot. Currently with >>> 'auto' we only use the first data set to sort out what the bins should be >>> and then re-use those for the rest of the data sets. This will let us get >>> the bins on the merged input, but I take Josef's point that this is not >>> actually what we want.... >>> >>> Tom >>> >>> On Mon, Mar 12, 2018 at 11:35 PM wrote: >>> >>>> On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser >>>> wrote: >>>> >> Given that the bin selection are data driven, transferring them >>>> across datasets might not be so useful. >>>> > >>>> > The main application would be to compute bins across the union of all >>>> > datasets. This is already possibly by using `np.histogram` and >>>> > discarding the first result, but that's super wasteful. >>>> >>>> assuming "union" means a combined dataset. >>>> >>>> If you stack datasets, then the number of observations will not be >>>> correct for individual datasets. >>>> >>>> In that case an additional keyword like nobs, or whatever name would >>>> be appropriate for numpy, would be useful, e.g. use the average number >>>> of observations across datasets. >>>> Auxiliary statistic like std could then be computed on the total >>>> dataset (if that makes sense, which would not be the case if the >>>> variance across datasets is larger than the variance within datasets. 
>>>> >>>> Josef >>>> >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at python.org >>>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Fri Mar 16 03:14:43 2018 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 16 Mar 2018 07:14:43 +0000 Subject: [Numpy-discussion] NEP sprint: 21 and 22 March In-Reply-To: <20180315222906.xc33qjkgas2k55xs@carbo> References: <20180309232638.vumxg3z4dzfaz3yo@carbo> <20180315222906.xc33qjkgas2k55xs@carbo> Message-ID: I will not be joining you for this sprint, but will be in the Bay Area from May 12th to May 25th, and wouldn't mind spending a day visiting you. If it works for you and anyone else want to join we could try to give it a little more structure than "just came over to say hi!" Jaime On Thu, Mar 15, 2018 at 11:29 PM Stefan van der Walt wrote: > Hi everyone, > > A quick reminder of the NEP sprint that will happen at Berkeley next > Wednesday and Thursday. Please let me know if you are interested in > joining. > > Best regards > St?fan > > On Fri, 09 Mar 2018 15:26:38 -0800, Stefan van der Walt wrote: > > Hi everyone, > > > > As you may have noticed, there's been quite a bit of movement recently > > around NumPy Enhancement Proposals---on setting specifications, > > building infrastructure, as well as writing new proposals. > > > > To further support this work, we will be hosting an informal NEP > > sprint at Berkeley on 21 and 22 March. Our aim is to bring core > > contributors and interested community members together to discuss > > proposal ideas, write up new NEPs, and polish existing ones. > > > > Some potential topics of discussion are: > > > > - Duck arrays > > - Array concatenation > > - Random number generator seed versioning > > - User defined dtypes > > - Deprecation pathways for `np.matrix` > > - What to do about nditer? > > > > All community members are welcome to attend. If you are a core > > contributor, we may be able to fund some travel costs as well; please > > let me know. > > > > Best regards > > St?fan > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Fri Mar 16 09:43:41 2018 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Mar 2018 09:43:41 -0400 Subject: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram In-Reply-To: References: Message-ID: passing a list of arrays would be useful (aside of discriminating between list and array_like) In that case I would add a keyword like "within=True" to compute the additional statistics like std or iqr on the group demeaned data. This would remove the effect of (mean-)shifted datasets on those auxiliary statistics. aside: An alternative to using a list of arrays would be to include a "groups" indicator as keyword, and if it is not None, then compute based on averages across groups or pooled within statistics. Josef On Fri, Mar 16, 2018 at 3:06 AM, Nathaniel Smith wrote: > Oh sure, I'm not suggesting it be impossible to calculate for a single data > set. If nothing else, if we had a version that accepted a list of data sets, > then you could always pass in a single-element list :-). > > On Mar 15, 2018 22:10, "Eric Wieser" wrote: >> >> That sounds like a reasonable extension - but I think there still exist >> cases where you want to treat the data as one uniform set when computing >> bins (toggling between orthogonal subsets of data) so isn't really a useful >> replacement. >> >> I suppose this becomes relevant when `density` is passed to the individual >> histogram invocations. Does matplotlib handle that correctly for stacked >> histograms? >> >> On Thu, Mar 15, 2018, 20:14 Nathaniel Smith wrote: >>> >>> Instead of an nobs argument, maybe we should have a version that accepts >>> multiple data sets, so that we have the full information and can improve the >>> algorithm over time. >>> >>> On Mar 15, 2018 7:57 PM, "Thomas Caswell" wrote: >>>> >>>> Yes I like the name. >>>> >>>> The primary use-case for Matplotlib is that our `hist` method can take >>>> in a list of arrays and produces N histograms in one shot. Currently with >>>> 'auto' we only use the first data set to sort out what the bins should be >>>> and then re-use those for the rest of the data sets. This will let us get >>>> the bins on the merged input, but I take Josef's point that this is not >>>> actually what we want.... >>>> >>>> Tom >>>> >>>> On Mon, Mar 12, 2018 at 11:35 PM wrote: >>>>> >>>>> On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser >>>>> wrote: >>>>> >> Given that the bin selection are data driven, transferring them >>>>> >> across datasets might not be so useful. >>>>> > >>>>> > The main application would be to compute bins across the union of all >>>>> > datasets. This is already possibly by using `np.histogram` and >>>>> > discarding the first result, but that's super wasteful. >>>>> >>>>> assuming "union" means a combined dataset. >>>>> >>>>> If you stack datasets, then the number of observations will not be >>>>> correct for individual datasets. >>>>> >>>>> In that case an additional keyword like nobs, or whatever name would >>>>> be appropriate for numpy, would be useful, e.g. use the average number >>>>> of observations across datasets. >>>>> Auxiliary statistic like std could then be computed on the total >>>>> dataset (if that makes sense, which would not be the case if the >>>>> variance across datasets is larger than the variance within datasets. 
>>>>> >>>>> Josef >>>>> >>>>> > _______________________________________________ >>>>> > NumPy-Discussion mailing list >>>>> > NumPy-Discussion at python.org >>>>> > https://mail.python.org/mailman/listinfo/numpy-discussion >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at python.org >>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From einstein.edison at gmail.com Fri Mar 16 13:10:06 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Fri, 16 Mar 2018 10:10:06 -0700 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) Message-ID: Hello, everyone. I?ve submitted a PR to add a initializer kwarg to ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply a ?default? value for identity-less ufunc reductions, and specify an initial value for reductions such as sum (other than zero.) Please feel free to review or leave feedback, (although I think Eric and Marten have picked it apart pretty well). https://github.com/numpy/numpy/pull/10635 Thanks, Hameer Sent from Astro for Mac -------------- next part -------------- An HTML attachment was scrubbed... URL: From tcaswell at gmail.com Sat Mar 17 17:42:01 2018 From: tcaswell at gmail.com (Thomas Caswell) Date: Sat, 17 Mar 2018 21:42:01 +0000 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: It would be nice if there was an IntEnum [1] that was taken is an input to `np.asarrayish` and `np.isarrayish` to require a combination of the groups of attributes/methods/semantics. Tom [1] https://docs.python.org/3/library/enum.html#intenum On Sat, Mar 10, 2018 at 7:14 PM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > > ?I think we don't have to make it sounds like there are *that* many types > of compatibility: really there is just array organisation > (indexing/reshaping) and array arithmetic. These correspond roughly to > ShapedLikeNDArray in astropy and NDArrayOperatorMixin in numpy (missing so > far is concatenation). The advantage of the ABC classes is that they can > supply missing methods (say, size, isscalar, __len__, and ndim given shape; > __iter__ given __getitem__, ravel, squeeze, flatten given reshape; etc.). > > -- Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From einstein.edison at gmail.com Sat Mar 17 18:01:57 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Sat, 17 Mar 2018 15:01:57 -0700 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: It would be nice if there was an IntEnum [1] that was taken is an input to `np.asarrayish` and `np.isarrayish` to require a combination of the groups of attributes/methods/semantics. Don?t you mean IntFlag ? I like Marten?s idea of ?grouping together? related functionality via ABCs and implementing different parts via ABCs (for example, in pydata/sparse we use NDArrayOperatorsMixin for exactly this), but I believe that separate ABCs should be provided for different parts of the interface. Then we can either: 1. Check with isinstance for the ABCs, or 2. Check with hasattr. I like the IntFlag idea most (it seems to be designed for use-cases like these), but a string-based (np.aspyarray(x, functionality=?arithmetic|reductions')) or list-based (np.aspyarray(x, functionality=[?arithmetic?, ?reductions?]) is also fine. It might help to have some sort of a ?dry-run? interface that (given a run of code) figures out which parts you need. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tcaswell at gmail.com Sat Mar 17 18:09:51 2018 From: tcaswell at gmail.com (Thomas Caswell) Date: Sat, 17 Mar 2018 22:09:51 +0000 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: Yes, meant IntFlag :sheep: On Sat, Mar 17, 2018 at 6:02 PM Hameer Abbasi wrote: > > It would be nice if there was an IntEnum [1] that was taken is an input to > `np.asarrayish` and `np.isarrayish` to require a combination of the groups > of attributes/methods/semantics. > > > Don?t you mean IntFlag > ? I like Marten?s > idea of ?grouping together? related functionality via ABCs and implementing > different parts via ABCs (for example, in pydata/sparse we use > NDArrayOperatorsMixin for exactly this), but I believe that separate ABCs > should be provided for different parts of the interface. > > Then we can either: > > 1. Check with isinstance for the ABCs, or > 2. Check with hasattr. > > I like the IntFlag idea most (it seems to be designed for use-cases like > these), but a string-based (np.aspyarray(x, > functionality=?arithmetic|reductions')) or list-based (np.aspyarray(x, > functionality=[?arithmetic?, ?reductions?]) is also fine. > > It might help to have some sort of a ?dry-run? interface that (given a run > of code) figures out which parts you need. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Sat Mar 17 20:25:59 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sun, 18 Mar 2018 00:25:59 +0000 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: I would have thought that a simple tuple of types would be more appropriate than using integer flags, since that means that isinstance can be used on the individual elements. Ideally there?d be a typing.Intersection[TraitA, TraitB] for this kind of thing. ? 
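As a rough sketch of the IntFlag idea, where each flag stands for one group of attributes/methods/semantics; every name below (ArrayishTraits, __arrayish_traits__) is invented for illustration and is not existing NumPy API:

from enum import IntFlag, auto

class ArrayishTraits(IntFlag):
    INDEXING = auto()
    RESHAPING = auto()
    ARITHMETIC = auto()
    REDUCTIONS = auto()

def provides(obj, required):
    # Hypothetical protocol: an object advertises its capability groups via
    # an __arrayish_traits__ attribute; flags are combined with | and the
    # requirement is tested with &.
    offered = getattr(obj, '__arrayish_traits__', ArrayishTraits(0))
    return (offered & required) == required

# e.g. a hypothetical np.asarrayish(x, ArrayishTraits.ARITHMETIC | ArrayishTraits.REDUCTIONS)
# could pass x through only when provides(x, ...) is true.
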
On Sat, 17 Mar 2018 at 15:10 Thomas Caswell wrote: > Yes, meant IntFlag :sheep: > > On Sat, Mar 17, 2018 at 6:02 PM Hameer Abbasi > wrote: > >> >> It would be nice if there was an IntEnum [1] that was taken is an input >> to `np.asarrayish` and `np.isarrayish` to require a combination of the >> groups of attributes/methods/semantics. >> >> >> Don?t you mean IntFlag >> ? I like Marten?s >> idea of ?grouping together? related functionality via ABCs and implementing >> different parts via ABCs (for example, in pydata/sparse we use >> NDArrayOperatorsMixin for exactly this), but I believe that separate ABCs >> should be provided for different parts of the interface. >> >> Then we can either: >> >> 1. Check with isinstance for the ABCs, or >> 2. Check with hasattr. >> >> I like the IntFlag idea most (it seems to be designed for use-cases like >> these), but a string-based (np.aspyarray(x, >> functionality=?arithmetic|reductions')) or list-based (np.aspyarray(x, >> functionality=[?arithmetic?, ?reductions?]) is also fine. >> >> It might help to have some sort of a ?dry-run? interface that (given a >> run of code) figures out which parts you need. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sun Mar 18 11:57:32 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 18 Mar 2018 11:57:32 -0400 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: Message-ID: Yes, a tuple of types would make more sense, given `isinstance` -- string abbreviations for those could be there for convenience. -- Marten On Sat, Mar 17, 2018 at 8:25 PM, Eric Wieser wrote: > I would have thought that a simple tuple of types would be more appropriate > than using integer flags, since that means that isinstance can be used on > the individual elements. Ideally there?d be a typing.Intersection[TraitA, > TraitB] for this kind of thing. > > > On Sat, 17 Mar 2018 at 15:10 Thomas Caswell wrote: >> >> Yes, meant IntFlag :sheep: >> >> On Sat, Mar 17, 2018 at 6:02 PM Hameer Abbasi >> wrote: >>> >>> >>> It would be nice if there was an IntEnum [1] that was taken is an input >>> to `np.asarrayish` and `np.isarrayish` to require a combination of the >>> groups of attributes/methods/semantics. >>> >>> >>> Don?t you mean IntFlag? I like Marten?s idea of ?grouping together? >>> related functionality via ABCs and implementing different parts via ABCs >>> (for example, in pydata/sparse we use NDArrayOperatorsMixin for exactly >>> this), but I believe that separate ABCs should be provided for different >>> parts of the interface. >>> >>> Then we can either: >>> >>> Check with isinstance for the ABCs, or >>> Check with hasattr. >>> >>> I like the IntFlag idea most (it seems to be designed for use-cases like >>> these), but a string-based (np.aspyarray(x, >>> functionality=?arithmetic|reductions')) or list-based (np.aspyarray(x, >>> functionality=[?arithmetic?, ?reductions?]) is also fine. >>> >>> It might help to have some sort of a ?dry-run? interface that (given a >>> run of code) figures out which parts you need. 
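For comparison, a sketch of the tuple-of-types alternative, with small trait ABCs that can also supply derived defaults in the way Marten describes; the class names are again purely illustrative:

from abc import ABC, abstractmethod

class ShapedLike(ABC):
    @property
    @abstractmethod
    def shape(self): ...

    @property
    def ndim(self):
        # a default derived from shape, the kind of helper an ABC can supply
        return len(self.shape)

class ArithmeticLike(ABC):
    @abstractmethod
    def __add__(self, other): ...

def satisfies(obj, required=(ShapedLike, ArithmeticLike)):
    # require every trait in the tuple, checking each element with isinstance
    return all(isinstance(obj, trait) for trait in required)
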
>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Mon Mar 19 21:06:10 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 19 Mar 2018 19:06:10 -0600 Subject: [Numpy-discussion] NEP sprint: 21 and 22 March In-Reply-To: References: <20180309232638.vumxg3z4dzfaz3yo@carbo> <20180315222906.xc33qjkgas2k55xs@carbo> Message-ID: On Fri, Mar 16, 2018 at 1:14 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > I will not be joining you for this sprint, but will be in the Bay Area > from May 12th to May 25th, and wouldn't mind spending a day visiting you. > > If it works for you and anyone else want to join we could try to give it a > little more structure than "just came over to say hi!" > > Jaime > That would be a good time frame for me also. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 22 04:14:23 2018 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 22 Mar 2018 01:14:23 -0700 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: On Sat, Mar 10, 2018 at 4:27 AM, Matthew Rocklin wrote: > I'm very glad to see this discussion. > > I think that coming up with a single definition of array-like may be > difficult, and that we might end up wanting to embrace duck typing instead. > > It seems to me that different array-like classes will implement different > mixtures of features. It may be difficult to pin down a single definition > that includes anything except for the most basic attributes (shape and > dtype?). Consider two extreme cases of restrictive functionality: > > LinearOperators (support dot in a numpy-like way) > Storage objects like h5py (support getitem in a numpy-like way) > > I can imagine authors of both groups saying that they should qualify as > array-like because downstream projects that consume them should not convert > them to numpy arrays in important contexts. I think this is an important point -- there are a lot of subtleties in the interfaces that different objects might want to provide. Some interesting ones that haven't been mentioned: - a "duck array" that has everything except fancy indexing - xarray's arrays are just like numpy arrays in most ways, but they have incompatible broadcasting semantics - immutable vs. mutable arrays When faced with this kind of situation, always it's tempting to try to write down some classification system to capture every possible configuration of interesting behavior. In fact, this is one of the most classic nerd snipes; it's been catching people for literally thousands of years [1]. Most of these attempts fail though :-). So let's back up -- I probably erred in not making this more clear in the NEP, but I actually have a fairly concrete use case in mind here. 
What happened is, I started working on a NEP for __array_concatenate__, and my thought pattern went as follows: 1) Cool, this should work for np.concatenate. 2) But what about all the other variants, like np.row_stack. We don't want __array_row_stack__; we want to express row_stack in terms of concatenate. 3) Ok, what's row_stack? It's: np.concatenate([np.atleast_2d(arr) for arr in arrs], axis=0) 4) So I need to make atleast_2d work on duck arrays. What's atleast_2d? It's: asarray + some shape checks and indexing with newaxis 5) Okay, so I need something atleast_2d can call instead of asarray [2]. And this kind of pattern shows up everywhere inside numpy, e.g. it's the first thing inside lots of functions in np.linalg b/c they do some futzing with dtypes and shape before delegating to ufuncs, it's the first thing the mean() function does b/c it needs to check arr.dtype before proceeding, etc. etc. So, we need something we can use in these functions as a first step towards unlocking the use of duck arrays in general. But we can't realistically go through each of these functions, make an exact list of all the operations/attributes it cares about, and then come up with exactly the right type constraint for it to impose at the top. And these functions aren't generally going to work on LinearOperators or h5py datasets anyway. We also don't want to go through every function in numpy and add new arguments to control this coercion behavior. What we can do, at least to start, is to have a mechanism that passes through objects that aspire to be "complete" duck arrays, like dask arrays or sparse arrays or astropy's unit arrays, and then if it turns out that in practice people find uses for finer-grained distinctions, we can iteratively add those as a second pass. Notice that if a function starts out requiring a "complete" duck array, and then later relaxes that to accept "partial" duck arrays, that's actually increasing the domain of objects that it can act on, so it's a backwards-compatible change that we can do later. So I think we should start out with a concept of "duck array" that's fairly strong but a bit vague on the exact details (e.g., dask.array.Array is currently missing some weird things like arr.ptp() and arr.tolist(), I guess because no-one has ever noticed or cared?). ------------ Thinking things through like this, I also realized that this proposal jumps through hoops to avoid changing np.asarray itself, because I was nervous about changing the rule that its output is always an ndarray... but actually, this is currently the rule for most functions in numpy, and the whole point of this proposal is to relax that rule for most functions, in cases where the user is explicitly passing in a duck-array object. So maybe I'm being overparanoid? I'm genuinely unsure here. Instead of messing about with ABCs, an alternative mechanism would be to add a new method __arrayish__ (hat tip to Tom Caswell for the name :-)), that essentially acts as an override for Python-level calls to np.array / np.asarray, in much the same way that __array_ufunc__ overrides ufuncs, etc. (C level calls to PyArray_FromAny and similar would of course continue to return ndarray objects, and I assume we'd add some argument like require_ndarray= that you could pass to explicitly indicate whether you needed C-level compatibility.) This would also allow objects like h5py datasets to *produce* an arrayish object on demand, even if they aren't one themselves. 
(E.g., imagine some hdf5-like storage that holds sparse arrays instead of regular arrays.) I'm thinking I may write this option up as a second NEP, to compete with my first one. -n [1] See: https://www.wiley.com/en-us/The+Search+for+the+Perfect+Language-p-9780631205104 [2] Actually atleast_2d calls asanyarray, not asarray, but that's just a detail; the way to solve this problem for asanyarray is to first solve it for asarray. -- Nathaniel J. Smith -- https://vorpus.org From mhimes at knights.ucf.edu Wed Mar 21 16:40:55 2018 From: mhimes at knights.ucf.edu (Michael Himes) Date: Wed, 21 Mar 2018 20:40:55 +0000 Subject: [Numpy-discussion] 3D array slicing bug? Message-ID: Hi, I have discovered what I believe is a bug with array slicing involving 3D (and higher) dimension arrays. When slicing a 3D array by a single value for axis 0, all values for axis 1, and a list to slice axis 2, the dimensionality of the resulting 2D array is flipped. However, slicing more than a single index for axis 0 or performing the slicing in two steps results in the correct dimensionality. Below is a quick example to demonstrate this behavior. import numpy as np arr = np.arange(54).reshape(2, 3, 9) list = [0, 2, 4, 5, 8] print(arr.shape) # (2, 3, 9) print(arr[0, :, list].shape) # (5, 3) -- but it should be (3, 5)? print(arr[0][:, list].shape) # (3, 5), as expected print(arr[0:1, :, list].shape) # (1, 3, 5), as expected This behavior carries over to 4D arrays as well, where the axis sliced with a list becomes the 0th axis regardless of order. Below demonstrates that. arr2 = np.arange(324).reshape(2, 3, 6, 9) print(arr2[0, :, :, list].shape) # (5, 3, 6), but I expect (3, 6, 5) arr3 = np.arange(324).reshape(2, 3, 9, 6) print(arr3[0, :, list].shape) # (5, 3, 6), expected (3, 5, 6) print(arr3[0, :, list, :].shape) # same as above Can anyone explain this behavior, or is this a bug? Best, Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Mar 22 05:41:18 2018 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 22 Mar 2018 10:41:18 +0100 Subject: [Numpy-discussion] 3D array slicing bug? In-Reply-To: References: Message-ID: <1521711678.6503.44.camel@iki.fi> ke, 2018-03-21 kello 20:40 +0000, Michael Himes kirjoitti: > I have discovered what I believe is a bug with array slicing > involving 3D (and higher) dimension arrays. When slicing a 3D array > by a single value for axis 0, all values for axis 1, and a list to > slice axis 2, the dimensionality of the resulting 2D array is > flipped. However, slicing more than a single index for axis 0 or > performing the slicing in two steps results in the correct > dimensionality. Below is a quick example to demonstrate this > behavior. > https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing The key part seems to be: "There are two parts to the indexing operation, the subspace defined by the basic indexing (**excluding integers**) and the subspace from the advanced indexing part." -- Pauli Virtanen From einstein.edison at gmail.com Thu Mar 22 06:35:46 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Thu, 22 Mar 2018 11:35:46 +0100 Subject: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray In-Reply-To: References: <1520560316.2962680.1296803088.6C85AC87@webmail.messagingengine.com> Message-ID: I think that with your comments in mind, it may just be best to embrace duck typing, like Matthew suggested. 
I propose the following workflow: - __array_concatenate__ and similar "protocol" functions return NotImplemented if they won't work. - "Base functions" that can be called directly like __getitem__ raise NotImplementedError if they won't work. - __arrayish__ = True Then, something like np.concatenate would do the following: - Call __array_concatenate__ following the same order as ufunc arguments. - If everything fails, raise NotImplementedError (or convert everything to ndarray). Overloaded functions would do something like this (perhaps a simple decorator will do for the repetitive work?): - Try with np.arrayish - Catch NotImplementedError - Try with np.array Then, we use abstract classes just to overload functionality or implement things in terms of others. If something fails, we have a decent fallback. We don't need to do anything special in order to "check" functionality. Feel free to propose changes, but this is the best I could come up with that would require the smallest incremental changes to Numpy while also supporting everything right from the start. On Thu, Mar 22, 2018 at 9:14 AM, Nathaniel Smith wrote: > On Sat, Mar 10, 2018 at 4:27 AM, Matthew Rocklin > wrote: > > I'm very glad to see this discussion. > > > > I think that coming up with a single definition of array-like may be > > difficult, and that we might end up wanting to embrace duck typing > instead. > > > > It seems to me that different array-like classes will implement different > > mixtures of features. It may be difficult to pin down a single > definition > > that includes anything except for the most basic attributes (shape and > > dtype?). Consider two extreme cases of restrictive functionality: > > > > LinearOperators (support dot in a numpy-like way) > > Storage objects like h5py (support getitem in a numpy-like way) > > > > I can imagine authors of both groups saying that they should qualify as > > array-like because downstream projects that consume them should not > convert > > them to numpy arrays in important contexts. > > I think this is an important point -- there are a lot of subtleties in > the interfaces that different objects might want to provide. Some > interesting ones that haven't been mentioned: > > - a "duck array" that has everything except fancy indexing > - xarray's arrays are just like numpy arrays in most ways, but they > have incompatible broadcasting semantics > - immutable vs. mutable arrays > > When faced with this kind of situation, always it's tempting to try to > write down some classification system to capture every possible > configuration of interesting behavior. In fact, this is one of the > most classic nerd snipes; it's been catching people for literally > thousands of years [1]. Most of these attempts fail though :-). > > So let's back up -- I probably erred in not making this more clear in > the NEP, but I actually have a fairly concrete use case in mind here. > What happened is, I started working on a NEP for > __array_concatenate__, and my thought pattern went as follows: > > 1) Cool, this should work for np.concatenate. > 2) But what about all the other variants, like np.row_stack. We don't > want __array_row_stack__; we want to express row_stack in terms of > concatenate. > 3) Ok, what's row_stack? It's: > np.concatenate([np.atleast_2d(arr) for arr in arrs], axis=0) > 4) So I need to make atleast_2d work on duck arrays. What's > atleast_2d? 
It's: asarray + some shape checks and indexing with > newaxis > 5) Okay, so I need something atleast_2d can call instead of asarray [2]. > > And this kind of pattern shows up everywhere inside numpy, e.g. it's > the first thing inside lots of functions in np.linalg b/c they do some > futzing with dtypes and shape before delegating to ufuncs, it's the > first thing the mean() function does b/c it needs to check arr.dtype > before proceeding, etc. etc. > > So, we need something we can use in these functions as a first step > towards unlocking the use of duck arrays in general. But we can't > realistically go through each of these functions, make an exact list > of all the operations/attributes it cares about, and then come up with > exactly the right type constraint for it to impose at the top. And > these functions aren't generally going to work on LinearOperators or > h5py datasets anyway. > > We also don't want to go through every function in numpy and add new > arguments to control this coercion behavior. > > What we can do, at least to start, is to have a mechanism that passes > through objects that aspire to be "complete" duck arrays, like dask > arrays or sparse arrays or astropy's unit arrays, and then if it turns > out that in practice people find uses for finer-grained distinctions, > we can iteratively add those as a second pass. Notice that if a > function starts out requiring a "complete" duck array, and then later > relaxes that to accept "partial" duck arrays, that's actually > increasing the domain of objects that it can act on, so it's a > backwards-compatible change that we can do later. > > So I think we should start out with a concept of "duck array" that's > fairly strong but a bit vague on the exact details (e.g., > dask.array.Array is currently missing some weird things like arr.ptp() > and arr.tolist(), I guess because no-one has ever noticed or cared?). > > ------------ > > Thinking things through like this, I also realized that this proposal > jumps through hoops to avoid changing np.asarray itself, because I was > nervous about changing the rule that its output is always an > ndarray... but actually, this is currently the rule for most functions > in numpy, and the whole point of this proposal is to relax that rule > for most functions, in cases where the user is explicitly passing in a > duck-array object. So maybe I'm being overparanoid? I'm genuinely > unsure here. > > Instead of messing about with ABCs, an alternative mechanism would be > to add a new method __arrayish__ (hat tip to Tom Caswell for the name > :-)), that essentially acts as an override for Python-level calls to > np.array / np.asarray, in much the same way that __array_ufunc__ > overrides ufuncs, etc. (C level calls to PyArray_FromAny and similar > would of course continue to return ndarray objects, and I assume we'd > add some argument like require_ndarray= that you could pass to > explicitly indicate whether you needed C-level compatibility.) > > This would also allow objects like h5py datasets to *produce* an > arrayish object on demand, even if they aren't one themselves. (E.g., > imagine some hdf5-like storage that holds sparse arrays instead of > regular arrays.) > > I'm thinking I may write this option up as a second NEP, to compete > with my first one. 
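A minimal sketch of the fallback dispatch proposed in the workflow above. Neither np.arrayish nor __array_concatenate__ exists in NumPy; the protocol name comes from this thread, the decorator only illustrates the control flow, and a plain list (which implements nothing) exercises the ndarray fallback:

import functools
import numpy as np

def with_ndarray_fallback(ndarray_impl):
    # Decorator sketch: try the duck-array protocol first, and only coerce
    # the arguments to ndarray and use the plain implementation if the
    # protocol raises NotImplementedError.
    def decorator(duck_impl):
        @functools.wraps(duck_impl)
        def wrapper(arrays, **kwargs):
            try:
                return duck_impl(arrays, **kwargs)
            except NotImplementedError:
                return ndarray_impl([np.asarray(a) for a in arrays], **kwargs)
        return wrapper
    return decorator

@with_ndarray_fallback(np.concatenate)
def concatenate(arrays, axis=0):
    # Give each argument's (hypothetical) __array_concatenate__ a chance;
    # NotImplemented means "not mine", and if nobody handles the call we
    # give up so the fallback can take over.
    for a in arrays:
        handler = getattr(type(a), '__array_concatenate__', None)
        if handler is not None:
            result = handler(a, arrays, axis=axis)
            if result is not NotImplemented:
                return result
    raise NotImplementedError

print(concatenate([[1, 2], [3, 4]]))   # plain lists fall back to ndarrays: [1 2 3 4]
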
> > -n > > [1] See: https://www.wiley.com/en-us/The+Search+for+the+Perfect+ > Language-p-9780631205104 > [2] Actually atleast_2d calls asanyarray, not asarray, but that's just > a detail; the way to solve this problem for asanyarray is to first > solve it for asarray. > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Mar 22 05:23:38 2018 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 22 Mar 2018 10:23:38 +0100 Subject: [Numpy-discussion] 3D array slicing bug? In-Reply-To: References: Message-ID: <1521710618.6503.43.camel@iki.fi> ke, 2018-03-21 kello 20:40 +0000, Michael Himes kirjoitti: > I have discovered what I believe is a bug with array slicing > involving 3D (and higher) dimension arrays. When slicing a 3D array > by a single value for axis 0, all values for axis 1, and a list to > slice axis 2, the dimensionality of the resulting 2D array is > flipped. https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing The key part seems to be: "There are two parts to the indexing operation, the subspace defined by the basic indexing (**excluding integers**) and the subspace from the advanced indexing part." From sebastian at sipsolutions.net Thu Mar 22 08:44:38 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 22 Mar 2018 13:44:38 +0100 Subject: [Numpy-discussion] 3D array slicing bug? In-Reply-To: <1521711678.6503.44.camel@iki.fi> References: <1521711678.6503.44.camel@iki.fi> Message-ID: <1521722678.19593.2.camel@sipsolutions.net> This NEP draft has some more hints/explanations if you are interested: https://github.com/seberg/numpy/blob/5becd12914d0402967205579d6f59a9815 1e0d98/doc/neps/indexing.rst#examples Plus, it tries to avoid the word "subspace" hehehe. - Sebastian On Thu, 2018-03-22 at 10:41 +0100, Pauli Virtanen wrote: > ke, 2018-03-21 kello 20:40 +0000, Michael Himes kirjoitti: > > I have discovered what I believe is a bug with array slicing > > involving 3D (and higher) dimension arrays. When slicing a 3D array > > by a single value for axis 0, all values for axis 1, and a list to > > slice axis 2, the dimensionality of the resulting 2D array is > > flipped. However, slicing more than a single index for axis 0 or > > performing the slicing in two steps results in the correct > > dimensionality. Below is a quick example to demonstrate this > > behavior. > > > > https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combi > ning-advanced-and-basic-indexing > > The key part seems to be: "There are two parts to the indexing > operation, the subspace defined by the basic indexing > (**excluding integers**) and the subspace from the advanced indexing > part." > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From matti.picus at gmail.com Thu Mar 22 13:37:03 2018 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 22 Mar 2018 19:37:03 +0200 Subject: [Numpy-discussion] nditer as a context manager Message-ID: |Hello all, PR #9998 (https://github.com/numpy/numpy/pull/9998/) proposes an update to the nditer API, both C and python. 
The issue (link) is that sometimes nditer uses temp arrays via the "writeback" mechanism, the data is copied back to the original arrays "when finished". However "when finished" was implemented using nditer deallocation. This mechanism is implicit and unclear, and relies on refcount semantics which do not work on non-refcount python implementations like PyPY. It also leads to lines of code like "iter=None" to trigger the writeback resolution. On the c-api level the agreed upon solution is to add a new `NpyIter_Close` function in C, this is to be called before `NpyIter_Dealloc`. The reviewers and I would like to ask the wider NumPy community for opinions about the proposed python-level solution: turning the python nditer object into a context manager. This way "writeback" occurs at context manager exit via a call to `NpyIter_Close`, instead of like before when it occurred at nditer deallocation (which might not happen until much later in Pypy, and could be delayed by GC even in Cpython). Another solution that was rejected (https://github.com/numpy/numpy/pull/10184) was to add an nditer.close() python-level function that would not require a context manager It was felt that this is more error-prone since it requires users to add the line for each iterator created. The back-compat issues are that: 1. We are adding a new function to the numpy API, `NpyIter_Close` (pretty harmless) 2. We want people to update their C code using nditer, to call `NpyIter_Close` before they call `NpyIter_Dealloc` and will start raising a deprecation warning if misuse is detected 3. We want people to update their Python code to use the nditer object as a context manager, and will warn if they do not. We tried to minimize back-compat issues, in the sense that old code (which didn't work in PyPy anyway) will still work, although it will now emit deprecation warnings. In the future we also plan to raise an error if an nditer is used in Python without a context manager (when it should have been). For C code, we plan to leave the deprecation warning in place probably forever, as we can only detect the deprecated behavior in the deallocator, where exceptions cannot be raised. Anybody who uses nditers should take a look and please reply if it seems the change will be too painful. For more details, please see the updated docs in that PR Matti (and reviewers) From matti.picus at gmail.com Thu Mar 22 13:43:23 2018 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 22 Mar 2018 19:43:23 +0200 Subject: [Numpy-discussion] nditer as a context manager (reformatted?) In-Reply-To: References: Message-ID: <1522036d-a561-ba14-8dc3-48e329266827@gmail.com> Hello all, PR #9998 (https://github.com/numpy/numpy/pull/9998/) proposes an update to the nditer API, both C and python. The issue (https://github.com/numpy/numpy/issues/9714) is that sometimes nditer uses temp arrays via the "writeback" mechanism, the data is copied back to the original arrays "when finished". However "when finished" was implemented using nditer deallocation. This mechanism is implicit and unclear, and relies on refcount semantics which do not work on non-refcount python implementations like PyPY. It also leads to lines of code like "iter=None" to trigger the writeback resolution. On the c-api level the agreed upon solution is to add a new `NpyIter_Close` function in C, this is to be called before `NpyIter_Dealloc`.
The reviewers and I would like to ask the wider NumPy community for opinions about the proposed python-level solution: turning the python nditer object into a context manager. This way "writeback" occurs at context manager exit via a call to `NpyIter_Close`, instead of like before when it occurred at `nditer` deallocation (which might not happen until much later in Pypy, and could be delayed by GC even in Cpython). Another solution that was rejected (https://github.com/numpy/numpy/pull/10184) was to add an nditer.close() python-level function that would not require a context manager It was felt that this is more error-prone since it requires users to add the line for each iterator created. The back-compat issues are that: 1. We are adding a new function to the numpy API, `NpyIter_Close` (pretty harmless) 2. We want people to update their C code using nditer, to call `NpyIter_Close` before ?they call `NpyIter_Dealloc` and will start raising a deprecation warning if misuse is detected 3. We want people to update their Python code to use the nditer object as a context manager, and will warn if they do not. We tried to minimize back-compat issues, in the sense that old code (which didn't work in PyPy anyway) will still work, although it will now emit deprecation warnings. In the future we also plan to raise an error if an nditer is used in Python without a context manager (when it should have been). For C code, we plan to leave the deprecation warning in place probably forever, as we can only detect the deprecated behavior in the deallocator, where exceptions cannot be raised. Anybody who uses nditers should take a look and please reply if it seems the change will be too painful. For more details, please see the updated docs in that PR Matti (and reviewers) From oc-spam66 at laposte.net Thu Mar 22 15:05:57 2018 From: oc-spam66 at laposte.net (Olivier) Date: Thu, 22 Mar 2018 20:05:57 +0100 Subject: [Numpy-discussion] round(numpy.float64(0.0)) is a numpy.float64 In-Reply-To: <422941419.2737564.1521718689632.JavaMail.zimbra@laposte.net> References: <422941419.2737564.1521718689632.JavaMail.zimbra@laposte.net> Message-ID: Hello, Is it normal, expected and desired that : ????round(numpy.float64(0.0)) is a numpy.float64 while ????round(numpy.float(0.0)) is an integer? I find it disturbing and misleading. What do you think? Has it already been discussed somewhere else? Best regards, Olivier From nathan12343 at gmail.com Thu Mar 22 15:32:57 2018 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Thu, 22 Mar 2018 19:32:57 +0000 Subject: [Numpy-discussion] round(numpy.float64(0.0)) is a numpy.float64 In-Reply-To: References: <422941419.2737564.1521718689632.JavaMail.zimbra@laposte.net> Message-ID: numpy.float is an alias to the python float builtin. https://github.com/numpy/numpy/issues/3998 On Thu, Mar 22, 2018 at 2:26 PM Olivier wrote: > Hello, > > > Is it normal, expected and desired that : > > > round(numpy.float64(0.0)) is a numpy.float64 > > > while > > round(numpy.float(0.0)) is an integer? > > > I find it disturbing and misleading. What do you think? Has it already been > discussed somewhere else? > > > Best regards, > > > Olivier > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
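To make the nditer proposal above concrete, here is the kind of Python-level usage being asked about, assuming PR #9998 lands as described (so that leaving the with-block calls NpyIter_Close and resolves any writeback buffers):

import numpy as np

a = np.arange(6).reshape(2, 3)

with np.nditer(a, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = 2 * x
# Exiting the block triggers the writeback resolution deterministically, so
# the doubled values are guaranteed to be in `a` here and no "it = None"
# line is needed.
print(a)
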
URL: From opossumnano at gmail.com Fri Mar 23 04:24:10 2018 From: opossumnano at gmail.com (Python School Organizers) Date: Fri, 23 Mar 2018 01:24:10 -0700 (PDT) Subject: [Numpy-discussion] =?utf-8?b?W0FOTl0gMTHhtZfKsCBBZHZhbmNlZCBT?= =?utf-8?q?cientific_Programming_in_Python_in_Camerino=2C_Italy=2C_3?= =?utf-8?q?=E2=80=948_September=2C_2018?= Message-ID: <5ab4b9aa.08c41c0a.55881.885a@mx.google.com> 11?? Advanced Scientific Programming in Python ============================================== a Summer School by the G-Node and the University of Camerino https://python.g-node.org Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only few scientists have been trained to use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques and best practices which are standard in the industry, but especially tailored to the needs of a programming scientist. Lectures are devised to be interactive and to give the students enough time to acquire direct hands-on experience with the materials. Students will work in pairs throughout the school and will team up to practice the newly learned skills in a real programming project ? an entertaining computer game. We use the Python programming language for the entire course. Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist. This school is targeted at Master or PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic knowledge of Python and of a version control system such as git, subversion, mercurial, or bazaar is assumed. Participants without any prior experience with Python and/or git should work through the proposed introductory material before the course. We are striving hard to get a pool of students which is international and gender-balanced: see how far we got in previous years ! Date & Location =============== 3?8 September, 2018. Camerino, Italy. Application =========== You can apply online: https://python.g-node.org/wiki/applications Application deadline: 23:59 UTC, 31 May, 2018. There will be no deadline extension, so be sure to apply on time. Be sure to read the FAQ before applying: https://python.g-node.org/wiki/faq Participation is for free, i.e. no fee is charged! Participants however should take care of travel, living, and accommodation expenses by themselves. Program ======= ? Version control with git and how to contribute to open source projects with GitHub ? Best practices in data visualization ? Organizing, documenting, and distributing scientific code ? Testing scientific code ? Profiling scientific code ? Advanced NumPy ? Advanced scientific Python: decorators, context managers, generators, and elements of object oriented programming ? Writing parallel applications in Python ? Speeding up scientific code with Cython and numba ? Memory-bound computations and the memory hierarchy ? 
Programming in teams Also see the detailed day-by-day schedule: https://python.g-node.org/wiki/schedule Faculty ======= ? Ashwin Trikuta Srinath, Cyberinfrastructure Technology Integration, Clemson University, SC USA ? Jenni Rinker, Department of Wind Energy, Technical University of Denmark, Roskilde Denmark ? Juan Nunez-Iglesias, Melbourne Bioinformatics, University of Melbourne Australia ? Nicolas P. Rougier, Inria Bordeaux Sud-Ouest, Institute of Neurodegenerative Disease, University of Bordeaux France ? Pietro Berkes, NAGRA Kudelski, Lausanne Switzerland ? Rike-Benjamin Schuppner, Institute for Theoretical Biology, Humboldt-Universit?t zu Berlin Germany ? Tiziano Zito, freelance consultant, Berlin Germany ? Zbigniew J?drzejewski-Szmek, Red Hat Inc., Warsaw Poland Organizers ========== For the German Neuroinformatics Node of the INCF (G-Node) Germany: ? Tiziano Zito, freelance consultant, Berlin Germany ? Caterina Buizza, Personal Robotics Lab, Imperial College London UK ? Zbigniew J?drzejewski-Szmek, Red Hat Inc., Warsaw Poland ? Jakob Jordan, Department of Physiology, University of Bern, Switzerland Switzerland For the University of Camerino Italy: ? Flavio Corradini, Computer Science Division, School of Science and Technology, University of Camerino Italy ? Barbara Re, Computer Science Division, School of Science and Technology, University of Camerino Italy Website: https://python.g-node.org Contact: python-info at g-node.org From ralf.gommers at gmail.com Sat Mar 24 23:33:21 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 24 Mar 2018 20:33:21 -0700 Subject: [Numpy-discussion] ANN: SciPy 1.0.1 released Message-ID: On behalf of the SciPy development team I am pleased to announce the availability of Scipy 1.0.1. This is a maintenance release, no new features with respect to 1.0.0. See the release notes below for details. Wheels and sources can be found on PyPI (https://pypi.python.org/pypi/scipy) and on Github (https://github.com/scipy/scipy/releases/tag/v1.0.1). The conda-forge channel will be up to date within a couple of hours. Thanks to everyone who contributed to this release! Cheers, Ralf SciPy 1.0.1 Release Notes ==================== SciPy 1.0.1 is a bug-fix release with no new features compared to 1.0.0. Probably the most important change is a fix for an incompatibility between SciPy 1.0.0 and ``numpy.f2py`` in the NumPy master branch. Authors ======= * Saurabh Agarwal + * Alessandro Pietro Bardelli * Philip DeBoer * Ralf Gommers * Matt Haberland * Eric Larson * Denis Laxalde * Mihai Capot? + * Andrew Nelson * Oleksandr Pavlyk * Ilhan Polat * Anant Prakash + * Pauli Virtanen * Warren Weckesser * @xoviat * Ted Ying + A total of 16 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete. Issues closed for 1.0.1 ----------------------- - `#7493 `__: `ndimage.morphology` functions are broken with numpy 1.13.0 - `#8118 `__: minimize_cobyla broken if `disp=True` passed - `#8142 `__: scipy-v1.0.0 pdist with metric=`minkowski` raises `ValueError:... - `#8173 `__: `scipy.stats.ortho_group` produces all negative determinants... - `#8207 `__: gaussian_filter seg faults on float16 numpy arrays - `#8234 `__: `scipy.optimize.linprog` `interior-point` presolve bug with trivial... 
- `#8243 `__: Make csgraph importable again via `from scipy.sparse import*` - `#8320 `__: scipy.root segfaults with optimizer 'lm' Pull requests for 1.0.1 ----------------------- - `#8068 `__: BUG: fix numpy deprecation test failures - `#8082 `__: BUG: fix solve_lyapunov import - `#8144 `__: MRG: Fix for cobyla - `#8150 `__: MAINT: resolve UPDATEIFCOPY deprecation errors - `#8156 `__: BUG: missing check on minkowski w kwarg - `#8187 `__: BUG: Sign of elements in random orthogonal 2D matrices in "ortho_group_gen"... - `#8197 `__: CI: uninstall oclint - `#8215 `__: Fixes Numpy datatype compatibility issues - `#8237 `__: BUG: optimize: fix bug when variables fixed by bounds are inconsistent... - `#8248 `__: BUG: declare "gfk" variable before call of terminate() in newton-cg - `#8280 `__: REV: reintroduce csgraph import in scipy.sparse - `#8322 `__: MAINT: prevent scipy.optimize.root segfault closes #8320 - `#8334 `__: TST: stats: don't use exact equality check for hdmedian test - `#8477 `__: BUG: signal/signaltools: fix wrong refcounting in PyArray_OrderFilterND - `#8530 `__: BUG: linalg: Fixed typo in flapack.pyf.src. - `#8566 `__: CI: Temporarily pin Cython version to 0.27.3 - `#8573 `__: Backports for 1.0.1 - `#8581 `__: Fix Cython 0.28 build break of qhull.pyx -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Sun Mar 25 16:14:23 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sun, 25 Mar 2018 20:14:23 +0000 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: Message-ID: To reiterate my comments in the issue - I'm in favor of this. It seems seem especially valuable for identity-less functions (`min`, `max`, `lcm`), and the argument name is consistent with `functools.reduce`. too. The only argument I can see against merging this would be `kwarg`-creep of `reduce`, and I think this has enough use cases to justify that. I'd like to merge in a few days, if no one else has any opinions. Eric On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi wrote: > Hello, everyone. I?ve submitted a PR to add a initializer kwarg to > ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply > a ?default? value for identity-less ufunc reductions, and specify an > initial value for reductions such as sum (other than zero.) > > Please feel free to review or leave feedback, (although I think Eric and > Marten have picked it apart pretty well). > > https://github.com/numpy/numpy/pull/10635 > > Thanks, > > Hameer > Sent from Astro for Mac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Mar 26 03:16:28 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 26 Mar 2018 07:16:28 +0000 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: Message-ID: This looks like a very logical addition to the reduce interface. It has my support! I would have preferred the more descriptive name "initial_value", but consistency with functools.reduce makes a compelling case for "initializer". On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser wrote: > To reiterate my comments in the issue - I'm in favor of this. 
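(For context on the "identity-less" part: reductions whose ufunc has no identity cannot currently handle empty input at all, which is part of what the proposed kwarg would address. A quick check against a recent NumPy, independent of the PR:)

    import numpy as np

    print(np.add.reduce(np.array([])))   # 0.0, add has an identity, so an empty reduce is defined
    try:
        np.minimum.reduce(np.array([]))  # minimum has no identity ...
    except ValueError as exc:
        print(exc)                       # ... so an empty reduce currently raises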
> > It seems seem especially valuable for identity-less functions (`min`, > `max`, `lcm`), and the argument name is consistent with `functools.reduce`. > too. > > The only argument I can see against merging this would be `kwarg`-creep of > `reduce`, and I think this has enough use cases to justify that. > > I'd like to merge in a few days, if no one else has any opinions. > > Eric > > On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi > wrote: > >> Hello, everyone. I?ve submitted a PR to add a initializer kwarg to >> ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply >> a ?default? value for identity-less ufunc reductions, and specify an >> initial value for reductions such as sum (other than zero.) >> >> Please feel free to review or leave feedback, (although I think Eric and >> Marten have picked it apart pretty well). >> >> https://github.com/numpy/numpy/pull/10635 >> >> Thanks, >> >> Hameer >> Sent from Astro for Mac >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Mon Mar 26 03:54:10 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Mon, 26 Mar 2018 07:54:10 +0000 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: Message-ID: It turns out I mispoke - functools.reduce calls the argument `initial` On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer wrote: > This looks like a very logical addition to the reduce interface. It has my > support! > > I would have preferred the more descriptive name "initial_value", but > consistency with functools.reduce makes a compelling case for "initializer". > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > wrote: > >> To reiterate my comments in the issue - I'm in favor of this. >> >> It seems seem especially valuable for identity-less functions (`min`, >> `max`, `lcm`), and the argument name is consistent with `functools.reduce`. >> too. >> >> The only argument I can see against merging this would be `kwarg`-creep >> of `reduce`, and I think this has enough use cases to justify that. >> >> I'd like to merge in a few days, if no one else has any opinions. >> >> Eric >> >> On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi >> wrote: >> >>> Hello, everyone. I?ve submitted a PR to add a initializer kwarg to >>> ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply >>> a ?default? value for identity-less ufunc reductions, and specify an >>> initial value for reductions such as sum (other than zero.) >>> >>> Please feel free to review or leave feedback, (although I think Eric and >>> Marten have picked it apart pretty well). 
>>> >>> https://github.com/numpy/numpy/pull/10635 >>> >>> Thanks, >>> >>> Hameer >>> Sent from Astro for Mac >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Mon Mar 26 05:57:14 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Mon, 26 Mar 2018 05:57:14 -0400 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: Message-ID: It calls it `initializer` - See https://docs.python.org/3.5/library/functools.html#functools.reduce Sent from Astro for Mac On Mar 26, 2018 at 09:54, Eric Wieser wrote: It turns out I mispoke - functools.reduce calls the argument `initial` On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer wrote: > This looks like a very logical addition to the reduce interface. It has my > support! > > I would have preferred the more descriptive name "initial_value", but > consistency with functools.reduce makes a compelling case for "initializer". > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > wrote: > >> To reiterate my comments in the issue - I'm in favor of this. >> >> It seems seem especially valuable for identity-less functions (`min`, >> `max`, `lcm`), and the argument name is consistent with `functools.reduce`. >> too. >> >> The only argument I can see against merging this would be `kwarg`-creep >> of `reduce`, and I think this has enough use cases to justify that. >> >> I'd like to merge in a few days, if no one else has any opinions. >> >> Eric >> >> On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi >> wrote: >> >>> Hello, everyone. I?ve submitted a PR to add a initializer kwarg to >>> ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply >>> a ?default? value for identity-less ufunc reductions, and specify an >>> initial value for reductions such as sum (other than zero.) >>> >>> Please feel free to review or leave feedback, (although I think Eric and >>> Marten have picked it apart pretty well). >>> >>> https://github.com/numpy/numpy/pull/10635 >>> >>> Thanks, >>> >>> Hameer >>> Sent from Astro for Mac >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian at sipsolutions.net Mon Mar 26 06:06:26 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 26 Mar 2018 12:06:26 +0200 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: Message-ID: <1522058786.15711.5.camel@sipsolutions.net> Initializer or this sounds fine to me. As an other data point which I think has been mentioned before, `sum` uses start and min/max use default. `start` does not work, unless we also change the code to always use the identity if given (currently that is not the case), in which case it might be nice. However, "start" seems a bit like solving a different issue in any case. Anyway, mostly noise. I really like adding this, the only thing worth discussing a bit is the name :). - Sebastian On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > It calls it `initializer` - See https://docs.python.org/3.5/library/f > unctools.html#functools.reduce > > Sent from Astro for Mac > > > On Mar 26, 2018 at 09:54, Eric Wieser > > wrote: > > > > It turns out I mispoke - functools.reduce calls the argument > > `initial` > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > wrote: > > > This looks like a very logical addition to the reduce interface. > > > It has my support! > > > > > > I would have preferred the more descriptive name "initial_value", > > > but consistency with functools.reduce makes a compelling case for > > > "initializer". > > > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > > ail.com> wrote: > > > > To reiterate my comments in the issue - I'm in favor of this. > > > > > > > > It seems seem especially valuable for identity-less functions > > > > (`min`, `max`, `lcm`), and the argument name is consistent with > > > > `functools.reduce`. too. > > > > > > > > The only argument I can see against merging this would be > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > cases to justify that. > > > > > > > > I'd like to merge in a few days, if no one else has any > > > > opinions. > > > > > > > > Eric > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi > > > il.com> wrote: > > > > > Hello, everyone. I?ve submitted a PR to add a initializer > > > > > kwarg to ufunc.reduce. This is useful in a few cases, e.g., > > > > > it allows one to supply a ?default? value for identity-less > > > > > ufunc reductions, and specify an initial value for reductions > > > > > such as sum (other than zero.) > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > think Eric and Marten have picked it apart pretty well). 
> > > > > > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > > > > > > Thanks, > > > > > > > > > > Hameer > > > > > Sent from Astro for Mac > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From einstein.edison at gmail.com Mon Mar 26 08:20:52 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Mon, 26 Mar 2018 08:20:52 -0400 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: <1522058786.15711.5.camel@sipsolutions.net> References: <1522058786.15711.5.camel@sipsolutions.net> Message-ID: Actually, the behavior right now isn?t that of `default` but that of `initializer` or `start`. This was discussed further down in the PR but to reiterate: `np.sum([10], initializer=5)` becomes `15`. Also, `np.min([5], initializer=0)` becomes `0`, so it isn?t really the default value, it?s the initial value among which the reduction is performed. This was the reason to call it initializer in the first place. I like `initial` and `initial_value` as well, and `start` also makes sense but isn?t descriptive enough. Hameer Sent from Astro for Mac On Mar 26, 2018 at 12:06, Sebastian Berg wrote: Initializer or this sounds fine to me. As an other data point which I think has been mentioned before, `sum` uses start and min/max use default. `start` does not work, unless we also change the code to always use the identity if given (currently that is not the case), in which case it might be nice. However, "start" seems a bit like solving a different issue in any case. Anyway, mostly noise. I really like adding this, the only thing worth discussing a bit is the name :). - Sebastian On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: It calls it `initializer` - See https://docs.python.org/3.5/library/f unctools.html#functools.reduce Sent from Astro for Mac On Mar 26, 2018 at 09:54, Eric Wieser wrote: It turns out I mispoke - functools.reduce calls the argument `initial` On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer wrote: This looks like a very logical addition to the reduce interface. It has my support! I would have preferred the more descriptive name "initial_value", but consistency with functools.reduce makes a compelling case for "initializer". On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser wrote: To reiterate my comments in the issue - I'm in favor of this. 
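(The proposed semantics can be sketched with plain functools.reduce, which folds an initial value into the reduction in the same way; this is only an illustration of the behaviour described above, not how the PR implements it:)

    from functools import reduce
    import operator

    reduce(operator.add, [10], 5)   # -> 15, like the proposed np.sum([10], initializer=5)
    reduce(min, [5], 0)             # -> 0,  like the proposed np.min([5], initializer=0)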
It seems seem especially valuable for identity-less functions (`min`, `max`, `lcm`), and the argument name is consistent with `functools.reduce`. too. The only argument I can see against merging this would be `kwarg`-creep of `reduce`, and I think this has enough use cases to justify that. I'd like to merge in a few days, if no one else has any opinions. Eric On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi wrote: Hello, everyone. I?ve submitted a PR to add a initializer kwarg to ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply a ?default? value for identity-less ufunc reductions, and specify an initial value for reductions such as sum (other than zero.) Please feel free to review or leave feedback, (although I think Eric and Marten have picked it apart pretty well). https://github.com/numpy/numpy/pull/10635 Thanks, Hameer Sent from Astro for Mac _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Mon Mar 26 11:06:01 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Mon, 26 Mar 2018 15:06:01 +0000 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: Message-ID: Huh, looks like it has different names in different places. `help(functools.reduce)` shows "initial". On Mon, Mar 26, 2018, 02:57 Hameer Abbasi wrote: > It calls it `initializer` - See > https://docs.python.org/3.5/library/functools.html#functools.reduce > > > Sent from Astro for Mac > > On Mar 26, 2018 at 09:54, Eric Wieser wrote: > > > It turns out I mispoke - functools.reduce calls the argument `initial` > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer wrote: > >> This looks like a very logical addition to the reduce interface. It has >> my support! >> >> I would have preferred the more descriptive name "initial_value", but >> consistency with functools.reduce makes a compelling case for "initializer". >> >> On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser >> wrote: >> >>> To reiterate my comments in the issue - I'm in favor of this. >>> >>> It seems seem especially valuable for identity-less functions (`min`, >>> `max`, `lcm`), and the argument name is consistent with `functools.reduce`. >>> too. >>> >>> The only argument I can see against merging this would be `kwarg`-creep >>> of `reduce`, and I think this has enough use cases to justify that. >>> >>> I'd like to merge in a few days, if no one else has any opinions. 
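(The naming mismatch is easy to reproduce: the functools documentation says "initializer" while the C docstring says "initial", and since reduce() does not appear to accept keyword arguments in CPython the name never has to match a keyword. A small check, exact wording varies across Python versions:)

    import functools
    import operator

    print(functools.reduce.__doc__.splitlines()[0])   # e.g. 'reduce(function, iterable[, initial]) -> value'
    try:
        functools.reduce(operator.add, [1, 2], initial=0)
    except TypeError as exc:
        print(exc)   # reduce() rejects keyword arguments here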
>>> >>> Eric >>> >>> On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi >>> wrote: >>> >>>> Hello, everyone. I?ve submitted a PR to add a initializer kwarg to >>>> ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply >>>> a ?default? value for identity-less ufunc reductions, and specify an >>>> initial value for reductions such as sum (other than zero.) >>>> >>>> Please feel free to review or leave feedback, (although I think Eric >>>> and Marten have picked it apart pretty well). >>>> >>>> https://github.com/numpy/numpy/pull/10635 >>>> >>>> Thanks, >>>> >>>> Hameer >>>> Sent from Astro for Mac >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Mar 26 11:16:34 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 26 Mar 2018 17:16:34 +0200 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: <1522058786.15711.5.camel@sipsolutions.net> Message-ID: <1522077394.4888.10.camel@sipsolutions.net> OK, the new documentation is actually clear: initializer : scalar, optional The value with which to start the reduction. Defaults to the `~numpy.ufunc.identity` of the ufunc. If ``None`` is given, the first element of the reduction is used, and an error is thrown if the reduction is empty. If ``a.dtype`` is ``object``, then the initializer is _only_ used if reduction is empty. I would actually like to say that I do not like the object special case much (and it is probably the reason why I was confused), nor am I quite sure this is what helps a lot? Logically, I would argue there are two things: 1. initializer/start (always used) 2. default (oly used for empty reductions) For example, I might like to give `np.nan` as the default for some empty reductions, this will not work. I understand that this is a minimal invasive PR and I am not sure I find the solution bad enough to really dislike it, but what do other think? My first expectation was the default behaviour (in all cases, not just object case) for some reason. To be honest, for now I just wonder a bit: How hard would it be to do both, or is that too annoying? It would at least get rid of that annoying thing with object ufuncs (which currently have a default, but not really an identity/initializer). Best, Sebastian On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi wrote: > Actually, the behavior right now isn?t that of `default` but that of > `initializer` or `start`. > > This was discussed further down in the PR but to reiterate: > `np.sum([10], initializer=5)` becomes `15`. 
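(A tiny pure-Python sketch of the two concepts being separated here; the helper name reduce_with_default is made up for illustration and is not part of the PR. An initial value always participates in the fold, while a default would only answer the empty case:)

    from functools import reduce
    import math

    def reduce_with_default(func, seq, initial=None, default=None):
        seq = list(seq)
        if not seq:
            # only here does a 'default' matter
            return initial if initial is not None else default
        return reduce(func, seq, initial) if initial is not None else reduce(func, seq)

    reduce_with_default(min, [5], initial=0)        # -> 0: the initial value is folded in
    reduce_with_default(min, [], default=math.nan)  # -> nan: the default only fills the empty case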
> > Also, `np.min([5], initializer=0)` becomes `0`, so it isn?t really > the default value, it?s the initial value among which the reduction > is performed. > > This was the reason to call it initializer in the first place. I like > `initial` and `initial_value` as well, and `start` also makes sense > but isn?t descriptive enough. > > Hameer > Sent from Astro for Mac > > > On Mar 26, 2018 at 12:06, Sebastian Berg > t> wrote: > > > > Initializer or this sounds fine to me. As an other data point which > > I > > think has been mentioned before, `sum` uses start and min/max use > > default. `start` does not work, unless we also change the code to > > always use the identity if given (currently that is not the case), > > in > > which case it might be nice. However, "start" seems a bit like > > solving > > a different issue in any case. > > > > Anyway, mostly noise. I really like adding this, the only thing > > worth > > discussing a bit is the name :). > > > > - Sebastian > > > > > > On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > > > It calls it `initializer` - See https://docs.python.org/3.5/libra > > > ry/f > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac > > > > > > > On Mar 26, 2018 at 09:54, Eric Wieser > > > com> > > > > wrote: > > > > > > > > It turns out I mispoke - functools.reduce calls the argument > > > > `initial` > > > > > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > wrote: > > > > > This looks like a very logical addition to the reduce > > > > > interface. > > > > > It has my support! > > > > > > > > > > I would have preferred the more descriptive name > > > > > "initial_value", > > > > > but consistency with functools.reduce makes a compelling case > > > > > for > > > > > "initializer". > > > > > > > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > > > > y at gm > > > > > ail.com> wrote: > > > > > > To reiterate my comments in the issue - I'm in favor of > > > > > > this. > > > > > > > > > > > > It seems seem especially valuable for identity-less > > > > > > functions > > > > > > (`min`, `max`, `lcm`), and the argument name is consistent > > > > > > with > > > > > > `functools.reduce`. too. > > > > > > > > > > > > The only argument I can see against merging this would be > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > > cases to justify that. > > > > > > > > > > > > I'd like to merge in a few days, if no one else has any > > > > > > opinions. > > > > > > > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi > > > > > @gma > > > > > > il.com> wrote: > > > > > > > Hello, everyone. I?ve submitted a PR to add a initializer > > > > > > > kwarg to ufunc.reduce. This is useful in a few cases, > > > > > > > e.g., > > > > > > > it allows one to supply a ?default? value for identity- > > > > > > > less > > > > > > > ufunc reductions, and specify an initial value for > > > > > > > reductions > > > > > > > such as sum (other than zero.) > > > > > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > > > think Eric and Marten have picked it apart pretty well). 
> > > > > > > > > > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Hameer > > > > > > > Sent from Astro for Mac > > > > > > > > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ben.v.root at gmail.com Mon Mar 26 11:35:32 2018 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 26 Mar 2018 11:35:32 -0400 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: <1522077394.4888.10.camel@sipsolutions.net> References: <1522058786.15711.5.camel@sipsolutions.net> <1522077394.4888.10.camel@sipsolutions.net> Message-ID: Hmm, this is neat. I imagine it would finally give some people a choice on what np.nansum([np.nan]) should return? It caused a huge hullabeloo a few years ago when we changed it from returning NaN to returning zero. Ben Root On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg wrote: > OK, the new documentation is actually clear: > > initializer : scalar, optional > The value with which to start the reduction. > Defaults to the `~numpy.ufunc.identity` of the ufunc. > If ``None`` is given, the first element of the reduction is used, > and an error is thrown if the reduction is empty. If ``a.dtype`` is > ``object``, then the initializer is _only_ used if reduction is > empty. > > I would actually like to say that I do not like the object special case > much (and it is probably the reason why I was confused), nor am I quite > sure this is what helps a lot? Logically, I would argue there are two > things: > > 1. initializer/start (always used) > 2. default (oly used for empty reductions) > > For example, I might like to give `np.nan` as the default for some > empty reductions, this will not work. I understand that this is a > minimal invasive PR and I am not sure I find the solution bad enough to > really dislike it, but what do other think? 
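(For reference, this is the behaviour being referred to: with NumPy as of this thread, nansum treats NaNs as zero, so an all-NaN input currently sums to 0.0:)

    import numpy as np

    print(np.nansum([np.nan]))       # 0.0
    print(np.nansum([np.nan, 2.0]))  # 2.0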
My first expectation was > the default behaviour (in all cases, not just object case) for some > reason. > > To be honest, for now I just wonder a bit: How hard would it be to do > both, or is that too annoying? It would at least get rid of that > annoying thing with object ufuncs (which currently have a default, but > not really an identity/initializer). > > Best, > > Sebastian > > > On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi wrote: > > Actually, the behavior right now isn?t that of `default` but that of > > `initializer` or `start`. > > > > This was discussed further down in the PR but to reiterate: > > `np.sum([10], initializer=5)` becomes `15`. > > > > Also, `np.min([5], initializer=0)` becomes `0`, so it isn?t really > > the default value, it?s the initial value among which the reduction > > is performed. > > > > This was the reason to call it initializer in the first place. I like > > `initial` and `initial_value` as well, and `start` also makes sense > > but isn?t descriptive enough. > > > > Hameer > > Sent from Astro for Mac > > > > > On Mar 26, 2018 at 12:06, Sebastian Berg > > t> wrote: > > > > > > Initializer or this sounds fine to me. As an other data point which > > > I > > > think has been mentioned before, `sum` uses start and min/max use > > > default. `start` does not work, unless we also change the code to > > > always use the identity if given (currently that is not the case), > > > in > > > which case it might be nice. However, "start" seems a bit like > > > solving > > > a different issue in any case. > > > > > > Anyway, mostly noise. I really like adding this, the only thing > > > worth > > > discussing a bit is the name :). > > > > > > - Sebastian > > > > > > > > > On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > > > > It calls it `initializer` - See https://docs.python.org/3.5/libra > > > > ry/f > > > > unctools.html#functools.reduce > > > > > > > > Sent from Astro for Mac > > > > > > > > > On Mar 26, 2018 at 09:54, Eric Wieser > > > > com> > > > > > wrote: > > > > > > > > > > It turns out I mispoke - functools.reduce calls the argument > > > > > `initial` > > > > > > > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > > wrote: > > > > > > This looks like a very logical addition to the reduce > > > > > > interface. > > > > > > It has my support! > > > > > > > > > > > > I would have preferred the more descriptive name > > > > > > "initial_value", > > > > > > but consistency with functools.reduce makes a compelling case > > > > > > for > > > > > > "initializer". > > > > > > > > > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > > > > > y at gm > > > > > > ail.com> wrote: > > > > > > > To reiterate my comments in the issue - I'm in favor of > > > > > > > this. > > > > > > > > > > > > > > It seems seem especially valuable for identity-less > > > > > > > functions > > > > > > > (`min`, `max`, `lcm`), and the argument name is consistent > > > > > > > with > > > > > > > `functools.reduce`. too. > > > > > > > > > > > > > > The only argument I can see against merging this would be > > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > > > cases to justify that. > > > > > > > > > > > > > > I'd like to merge in a few days, if no one else has any > > > > > > > opinions. > > > > > > > > > > > > > > Eric > > > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi > > > > > > @gma > > > > > > > il.com> wrote: > > > > > > > > Hello, everyone. 
I?ve submitted a PR to add a initializer > > > > > > > > kwarg to ufunc.reduce. This is useful in a few cases, > > > > > > > > e.g., > > > > > > > > it allows one to supply a ?default? value for identity- > > > > > > > > less > > > > > > > > ufunc reductions, and specify an initial value for > > > > > > > > reductions > > > > > > > > such as sum (other than zero.) > > > > > > > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > > > > think Eric and Marten have picked it apart pretty well). > > > > > > > > > > > > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > Hameer > > > > > > > > Sent from Astro for Mac > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > NumPy-Discussion mailing list > > > > > > > > NumPy-Discussion at python.org > > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Mon Mar 26 11:39:13 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Mon, 26 Mar 2018 11:39:13 -0400 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: <1522058786.15711.5.camel@sipsolutions.net> <1522077394.4888.10.camel@sipsolutions.net> Message-ID: That is the idea, but NaN functions are in a separate branch for another PR to be discussed later. You can see it on my fork, if you're interested. On 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat. I imagine it would finally give some people a choice on what np.nansum([np.nan]) should return? It caused a huge hullabeloo a few years ago when we changed it from returning NaN to returning zero. Ben Root On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg wrote: OK, the new documentation is actually clear: initializer : scalar, optional The value with which to start the reduction. Defaults to the `~numpy.ufunc.identity` of the ufunc. 
If ``None`` is given, the first element of the reduction is used, and an error is thrown if the reduction is empty. If ``a.dtype`` is ``object``, then the initializer is _only_ used if reduction is empty. I would actually like to say that I do not like the object special case much (and it is probably the reason why I was confused), nor am I quite sure this is what helps a lot? Logically, I would argue there are two things: 1. initializer/start (always used) 2. default (oly used for empty reductions) For example, I might like to give `np.nan` as the default for some empty reductions, this will not work. I understand that this is a minimal invasive PR and I am not sure I find the solution bad enough to really dislike it, but what do other think? My first expectation was the default behaviour (in all cases, not just object case) for some reason. To be honest, for now I just wonder a bit: How hard would it be to do both, or is that too annoying? It would at least get rid of that annoying thing with object ufuncs (which currently have a default, but not really an identity/initializer). Best, Sebastian On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi wrote: > Actually, the behavior right now isn?t that of `default` but that of > `initializer` or `start`. > > This was discussed further down in the PR but to reiterate: > `np.sum([10], initializer=5)` becomes `15`. > > Also, `np.min([5], initializer=0)` becomes `0`, so it isn?t really > the default value, it?s the initial value among which the reduction > is performed. > > This was the reason to call it initializer in the first place. I like > `initial` and `initial_value` as well, and `start` also makes sense > but isn?t descriptive enough. > > Hameer > Sent from Astro for Mac > > > On Mar 26, 2018 at 12:06, Sebastian Berg > t> wrote: > > > > Initializer or this sounds fine to me. As an other data point which > > I > > think has been mentioned before, `sum` uses start and min/max use > > default. `start` does not work, unless we also change the code to > > always use the identity if given (currently that is not the case), > > in > > which case it might be nice. However, "start" seems a bit like > > solving > > a different issue in any case. > > > > Anyway, mostly noise. I really like adding this, the only thing > > worth > > discussing a bit is the name :). > > > > - Sebastian > > > > > > On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > > > It calls it `initializer` - See https://docs.python.org/3.5/libra > > > ry/f > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac > > > > > > > On Mar 26, 2018 at 09:54, Eric Wieser > > > com> > > > > wrote: > > > > > > > > It turns out I mispoke - functools.reduce calls the argument > > > > `initial` > > > > > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > wrote: > > > > > This looks like a very logical addition to the reduce > > > > > interface. > > > > > It has my support! > > > > > > > > > > I would have preferred the more descriptive name > > > > > "initial_value", > > > > > but consistency with functools.reduce makes a compelling case > > > > > for > > > > > "initializer". > > > > > > > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > > > > y at gm > > > > > ail.com> wrote: > > > > > > To reiterate my comments in the issue - I'm in favor of > > > > > > this. 
> > > > > > > > > > > > It seems seem especially valuable for identity-less > > > > > > functions > > > > > > (`min`, `max`, `lcm`), and the argument name is consistent > > > > > > with > > > > > > `functools.reduce`. too. > > > > > > > > > > > > The only argument I can see against merging this would be > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > > cases to justify that. > > > > > > > > > > > > I'd like to merge in a few days, if no one else has any > > > > > > opinions. > > > > > > > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi > > > > > @gma > > > > > > il.com> wrote: > > > > > > > Hello, everyone. I?ve submitted a PR to add a initializer > > > > > > > kwarg to ufunc.reduce. This is useful in a few cases, > > > > > > > e.g., > > > > > > > it allows one to supply a ?default? value for identity- > > > > > > > less > > > > > > > ufunc reductions, and specify an initial value for > > > > > > > reductions > > > > > > > such as sum (other than zero.) > > > > > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > > > think Eric and Marten have picked it apart pretty well). > > > > > > > > > > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Hameer > > > > > > > Sent from Astro for Mac > > > > > > > > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Mon Mar 26 11:45:56 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 26 Mar 2018 17:45:56 +0200 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: <1522058786.15711.5.camel@sipsolutions.net> <1522077394.4888.10.camel@sipsolutions.net> Message-ID: <1522079156.4888.12.camel@sipsolutions.net> On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi wrote: > That is the idea, but NaN functions are in a separate branch for > another PR to be 
discussed later. You can see it on my fork, if > you're > interested. Except that as far as I understand I am not sure it will help much with it, since it is not a default, but an initializer. Initializing to NaN would just make all results NaN. - Sebastian > On 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat. > I imagine it would finally give some people a choice on what > np.nansum([np.nan]) should return? It caused a huge hullabeloo a few > years ago when we changed it from returning NaN to returning zero. > Ben > Root On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg > wrote: OK, the new documentation is > actually clear: initializer : scalar, optional The value with which > to > start the reduction. Defaults to the `~numpy.ufunc.identity` of the > ufunc. If ``None`` is given, the first element of the reduction is > used, and an error is thrown if the reduction is empty. If > ``a.dtype`` > is ``object``, then the initializer is _only_ used if reduction is > empty. I would actually like to say that I do not like the object > special case much (and it is probably the reason why I was confused), > nor am I quite sure this is what helps a lot? Logically, I would > argue > there are two things: 1. initializer/start (always used) 2. default > (oly used for empty reductions) For example, I might like to give > `np.nan` as the default for some empty reductions, this will not > work. > I understand that this is a minimal invasive PR and I am not sure I > find the solution bad enough to really dislike it, but what do other > think? My first expectation was the default behaviour (in all cases, > not just object case) for some reason. To be honest, for now I just > wonder a bit: How hard would it be to do both, or is that too > annoying? It would at least get rid of that annoying thing with > object > ufuncs (which currently have a default, but not really an > identity/initializer). Best, Sebastian On Mon, 2018-03-26 at 08:20 > -0400, Hameer Abbasi wrote: > Actually, the behavior right now isn?t > that of `default` but that of > `initializer` or `start`. > > This > was > discussed further down in the PR but to reiterate: > `np.sum([10], > initializer=5)` becomes `15`. > > Also, `np.min([5], initializer=0)` > becomes `0`, so it isn?t really > the default value, it?s the initial > value among which the reduction > is performed. > > This was the > reason to call it initializer in the first place. I like > `initial` > and `initial_value` as well, and `start` also makes sense > but isn?t > descriptive enough. > > Hameer > Sent from Astro for Mac > > > On Mar > 26, 2018 at 12:06, Sebastian Berg > t> > wrote: > > > > Initializer or this sounds fine to me. As an other > data > point which > > I > > think has been mentioned before, `sum` uses > start and min/max use > > default. `start` does not work, unless we > also change the code to > > always use the identity if given > (currently that is not the case), > > in > > which case it might be > nice. However, "start" seems a bit like > > solving > > a different > issue in any case. > > > > Anyway, mostly noise. I really like adding > this, the only thing > > worth > > discussing a bit is the name :). 
> > > > > - Sebastian > > > > > > On Mon, 2018-03-26 at 05:57 -0400, > > Hameer Abbasi wrote: > > > It calls it `initializer` - See > https://docs.python.org/3.5/libra > > > ry/f > > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac > > > > > > > > > On Mar 26, 2018 at 09:54, Eric Wieser > > > > > com> > > > > wrote: > > > > > > > > > It turns out I mispoke - functools.reduce calls the argument > > > > > `initial` > > > > > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > > wrote: > > > > > This looks like a very > logical addition to the reduce > > > > > interface. > > > > > It has > my support! > > > > > > > > > > I would have preferred the more > descriptive name > > > > > "initial_value", > > > > > but consistency > with functools.reduce makes a compelling case > > > > > for > > > > > > "initializer". > > > > > > > > > > On Sun, Mar 25, 2018 at 1:15 PM > Eric Wieser > > > > y at gm > > > > > ail.com> > wrote: > > > > > > > To reiterate my comments in the issue - I'm in favor of > > > > > > > > > > > > > this. > > > > > > > > > > > > It seems seem especially > > valuable for identity-less > > > > > > functions > > > > > > (`min`, > `max`, `lcm`), and the argument name is consistent > > > > > > with > > > > > > > `functools.reduce`. too. > > > > > > > > > > > > The only > > argument I can see against merging this would be > > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > > > > cases to justify that. > > > > > > > > > > > > I'd like to merge in a > few days, if no one else has any > > > > > > opinions. > > > > > > > > > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 > > Hameer Abbasi > > > > > @gma > > > > > > il.com> > wrote: > > > > > > > Hello, everyone. I?ve submitted a PR to add a > initializer > > > > > > > kwarg to ufunc.reduce. This is useful in a > few cases, > > > > > > > e.g., > > > > > > > it allows one to supply > a > ?default? value for identity- > > > > > > > less > > > > > > > ufunc > reductions, and specify an initial value for > > > > > > > reductions > > > > > > > > such as sum (other than zero.) > > > > > > > > > > > > > > > > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > > > > think Eric and Marten have picked it apart pretty well). 
> > > > > > > > > > > > > > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Hameer > > > > > > > > > > > > > > > > > > > > > Sent from Astro for Mac > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > NumPy-Discussion mailing list > > > > > > > > NumPy-Discussion at python.org > > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > NumPy-Discussion mailing list > > > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion > mailing list > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion > mailing list > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From einstein.edison at gmail.com Mon Mar 26 11:53:01 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Mon, 26 Mar 2018 11:53:01 -0400 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: <1522079156.4888.12.camel@sipsolutions.net> References: <1522077394.4888.10.camel@sipsolutions.net> <1522079156.4888.12.camel@sipsolutions.net> Message-ID: It'll need to be thought out for object arrays and subclasses. But for Regular numeric stuff, Numpy uses fmin and this would have the desired effect. On 26/03/2018 at 17:45, Sebastian wrote: On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi wrote: That is the idea, but NaN functions are in a separate branch for another PR to be discussed later. You can see it on my fork, if you're interested. Except that as far as I understand I am not sure it will help much with it, since it is not a default, but an initializer. Initializing to NaN would just make all results NaN. - Sebastian On 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat. I imagine it would finally give some people a choice on what np.nansum([np.nan]) should return? It caused a huge hullabeloo a few years ago when we changed it from returning NaN to returning zero. 
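(The distinction being drawn can be seen directly in the ufuncs themselves: minimum propagates NaN, so a NaN starting value would poison the whole reduction, while fmin ignores NaN. A quick check, independent of the PR:)

    import numpy as np

    print(np.minimum.reduce([np.nan, 5.0]))  # nan, minimum propagates NaN
    print(np.fmin.reduce([np.nan, 5.0]))     # 5.0, fmin ignores NaN where it can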
Ben Root On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg wrote: OK, the new documentation is actually clear: initializer : scalar, optional The value with which to start the reduction. Defaults to the `~numpy.ufunc.identity` of the ufunc. If ``None`` is given, the first element of the reduction is used, and an error is thrown if the reduction is empty. If ``a.dtype`` is ``object``, then the initializer is _only_ used if reduction is empty. I would actually like to say that I do not like the object special case much (and it is probably the reason why I was confused), nor am I quite sure this is what helps a lot? Logically, I would argue there are two things: 1. initializer/start (always used) 2. default (oly used for empty reductions) For example, I might like to give `np.nan` as the default for some empty reductions, this will not work. I understand that this is a minimal invasive PR and I am not sure I find the solution bad enough to really dislike it, but what do other think? My first expectation was the default behaviour (in all cases, not just object case) for some reason. To be honest, for now I just wonder a bit: How hard would it be to do both, or is that too annoying? It would at least get rid of that annoying thing with object ufuncs (which currently have a default, but not really an identity/initializer). Best, Sebastian On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi wrote: > Actually, the behavior right now isn?t that of `default` but that of > `initializer` or `start`. > > This was discussed further down in the PR but to reiterate: > `np.sum([10], initializer=5)` becomes `15`. > > Also, `np.min([5], initializer=0)` becomes `0`, so it isn?t really > the default value, it?s the initial value among which the reduction > is performed. > > This was the reason to call it initializer in the first place. I like > `initial` and `initial_value` as well, and `start` also makes sense > but isn?t descriptive enough. > > Hameer > Sent from Astro for Mac > > > On Mar 26, 2018 at 12:06, Sebastian Berg > t> wrote: > > > > Initializer or this sounds fine to me. As an other data point which > > I > > think has been mentioned before, `sum` uses start and min/max use > > default. `start` does not work, unless we also change the code to > > always use the identity if given (currently that is not the case), > > in > > which case it might be nice. However, "start" seems a bit like > > solving > > a different issue in any case. > > > > Anyway, mostly noise. I really like adding this, the only thing > > worth > > discussing a bit is the name :). > - Sebastian > > > > > > On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > > > It calls it `initializer` - See https://docs.python.org/3.5/libra > > > ry/f > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac > On Mar 26, 2018 at 09:54, Eric Wieser > > > com> > > > > wrote: > > > > > > > > It turns out I mispoke - functools.reduce calls the argument > > > > `initial` > > > > > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > wrote: > > > > > This looks like a very logical addition to the reduce > > > > > interface. > > > > > It has my support! > > > > > > > > > > I would have preferred the more descriptive name > > > > > "initial_value", > > > > > but consistency with functools.reduce makes a compelling case > > > > > for > > > > > "initializer". > > > > > > > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > > > > y at gm > > > > > ail.com> wrote: To reiterate my comments in the issue - I'm in favor of > this. 
> > > > > > > > > > > > It seems seem especially valuable for identity-less > > > > > > functions > > > > > > (`min`, `max`, `lcm`), and the argument name is consistent > > > > > > with > `functools.reduce`. too. > > > > > > > > > > > > The only argument I can see against merging this would be > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > cases to justify that. > > > > > > > > > > > > I'd like to merge in a few days, if no one else has any > > > > > > opinions. > > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi > > > > > @gma > > > > > > il.com> wrote: > > > > > > > Hello, everyone. I?ve submitted a PR to add a initializer > > > > > > > kwarg to ufunc.reduce. This is useful in a few cases, > > > > > > > e.g., > > > > > > > it allows one to supply a ?default? value for identity- > > > > > > > less > > > > > > > ufunc reductions, and specify an initial value for > > > > > > > reductions such as sum (other than zero.) > > > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > think Eric and Marten have picked it apart pretty well). > > > > https://github.com/numpy/numpy/pull/10635 > > > > > Thanks, > > > > > > > > > > > > > > Hameer > > > > Sent from Astro for Mac > > > > > > > > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion From allanhaldane at gmail.com Mon Mar 26 12:07:59 2018 From: allanhaldane at gmail.com (Allan Haldane) Date: Mon, 26 Mar 2018 12:07:59 -0400 Subject: [Numpy-discussion] nditer as a context manager (reformatted?) In-Reply-To: <1522036d-a561-ba14-8dc3-48e329266827@gmail.com> References: <1522036d-a561-ba14-8dc3-48e329266827@gmail.com> Message-ID: <99f982c1-146f-44f0-8695-07bc197746b5@gmail.com> Given the lack of objections, we are probably going forward with this change to nditer. 
Anyone who uses nditers may have to update their code slightly if they want
to avoid deprecation warnings, but otherwise old nditer code should work for
a long time from now.

Allan

On 03/22/2018 01:43 PM, Matti Picus wrote:
> Hello all, PR #9998 (https://github.com/numpy/numpy/pull/9998/) proposes
> an update to the nditer API, both C and python. The issue
> (https://github.com/numpy/numpy/issues/9714) is that sometimes nditer
> uses temp arrays via the "writeback" mechanism, the data is copied back
> to the original arrays "when finished". However "when finished" was
> implemented using nditer deallocation.
>
> This mechanism is implicit and unclear, and relies on refcount semantics
> which do not work on non-refcount python implementations like PyPy. It
> also leads to lines of code like "iter=None" to trigger the writeback
> resolution.
>
> On the c-api level the agreed upon solution is to add a new
> `NpyIter_Close` function in C; this is to be called before
> `NpyIter_Dealloc`.
>
> The reviewers and I would like to ask the wider NumPy community for
> opinions about the proposed python-level solution: turning the python
> nditer object into a context manager. This way "writeback" occurs at
> context manager exit via a call to `NpyIter_Close`, instead of like
> before when it occurred at `nditer` deallocation (which might not happen
> until much later in PyPy, and could be delayed by GC even in CPython).
>
> Another solution that was rejected
> (https://github.com/numpy/numpy/pull/10184) was to add an nditer.close()
> python-level function that would not require a context manager. It was
> felt that this is more error-prone since it requires users to add the
> line for each iterator created.
>
> The back-compat issues are that:
>
> 1. We are adding a new function to the numpy API, `NpyIter_Close`
> (pretty harmless)
>
> 2. We want people to update their C code using nditer, to call
> `NpyIter_Close` before they call `NpyIter_Dealloc`, and will start
> raising a deprecation warning if misuse is detected
>
> 3. We want people to update their Python code to use the nditer object
> as a context manager, and will warn if they do not.
>
> We tried to minimize back-compat issues, in the sense that old code
> (which didn't work in PyPy anyway) will still work, although it will now
> emit deprecation warnings. In the future we also plan to raise an error
> if an nditer is used in Python without a context manager (when it should
> have been). For C code, we plan to leave the deprecation warning in
> place probably forever, as we can only detect the deprecated behavior in
> the deallocator, where exceptions cannot be raised.
>
> Anybody who uses nditers should take a look and please reply if it seems
> the change will be too painful.
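For concreteness, a minimal sketch of the proposed Python-level pattern,
based only on the PR description above; the exact flags and deprecation
behaviour are whatever the PR finally merges, so treat this as illustrative:

    import numpy as np

    a = np.arange(6.0).reshape(2, 3)

    # Previously one had to rely on deallocation (e.g. `it = None`) to
    # trigger writeback resolution; the proposal makes the end of the
    # iteration explicit:
    with np.nditer(a, op_flags=['readwrite']) as it:
        for x in it:
            x[...] = 2 * x
    # Any temporary/writeback buffers are resolved when the `with` block
    # exits, so `a` is guaranteed to be updated here, on CPython and PyPy.
    print(a)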
>
> For more details, please see the updated docs in that PR
>
> Matti (and reviewers)
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From sebastian at sipsolutions.net Mon Mar 26 12:48:47 2018
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Mon, 26 Mar 2018 18:48:47 +0200
Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
In-Reply-To:
References: <1522077394.4888.10.camel@sipsolutions.net> <1522079156.4888.12.camel@sipsolutions.net>
Message-ID: <1522082927.4888.24.camel@sipsolutions.net>

On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote:
> It'll need to be thought out for object arrays and subclasses. But for
> regular numeric stuff, NumPy uses fmin and this would have the desired
> effect.

I do not want to block this, but I would like a clearer opinion about
this issue; `np.nansum` as Benjamin noted would require something like:

    np.nansum([np.nan], default=np.nan)

because

    np.sum([1], initializer=np.nan)
    np.nansum([1], initializer=np.nan)

would both give NaN if the logic is the same as the current `np.sum`.
And yes, I guess for fmin/fmax NaN happens to work. And then there are
many nonsense reduces which could make sense with `initializer`.

Now nansum is not implemented in a way that could make use of the new
kwarg anyway, so maybe it does not matter in some sense. We can in
principle use `default` in nansum and at some point possibly add
`default` to the normal ufuncs. If we argue like that, the only
annoying thing is the `object` dtype which confuses the two use cases
currently.

This confusion IMO is not harmless, because I might want to use it
(e.g. sum with initializer=5), and I would expect things like dropping
in `decimal.Decimal` to work most of the time, while here it would give
silently bad results.

- Sebastian

> On 26/03/2018 at 17:45, Sebastian wrote: On Mon, 2018-03-26 at 11:39
> -0400, Hameer Abbasi wrote: That is the idea, but NaN functions are in
> a separate branch for another PR to be discussed later. You can see it
> on my fork, if you're interested. Except that as far as I understand I
> am not sure it will help much with it, since it is not a default, but
> an initializer. Initializing to NaN would just make all results NaN.
> - Sebastian On 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat.
> I imagine it would finally give some people a choice on what
> np.nansum([np.nan]) should return? It caused a huge hullabaloo a few
> years ago when we changed it from returning NaN to returning zero.
> Ben Root On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg
> wrote: OK, the new documentation is actually clear:
>
>     initializer : scalar, optional
>         The value with which to start the reduction. Defaults to the
>         `~numpy.ufunc.identity` of the ufunc. If ``None`` is given,
>         the first element of the reduction is used, and an error is
>         thrown if the reduction is empty. If ``a.dtype`` is
>         ``object``, then the initializer is _only_ used if reduction
>         is empty.
>
> I would actually like to say that I do not like the object special
> case much (and it is probably the reason why I was confused), nor am I
> quite sure this is what helps a lot? Logically, I would argue there
> are two things:
>
> 1. initializer/start (always used)
> 2. default (only used for empty reductions)
>
> For example, I might like to give `np.nan` as the default for some
> empty reductions, this will not work.
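As a concrete reference point for the two meanings being distinguished
here, the Python standard library already separates them; this is only an
analogy, not the NumPy API under discussion:

    from functools import reduce
    import operator

    # "initializer"/"start": always folded into the reduction
    reduce(operator.add, [10], 5)   # -> 15
    reduce(operator.add, [], 5)     # -> 5 (also rescues the empty case)

    # "default": only consulted when the reduction is empty
    min([5], default=0)             # -> 5 (the default is ignored)
    min([], default=0)              # -> 0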
> I understand that this is a minimal invasive PR and I am not sure I > find the solution bad enough to really dislike it, but what do other > think? My first expectation was the default behaviour (in all cases, > not just object case) for some reason. To be honest, for now I just > wonder a bit: How hard would it be to do both, or is that too > annoying? It would at least get rid of that annoying thing with > object > ufuncs (which currently have a default, but not really an > identity/initializer). Best, Sebastian On Mon, 2018-03-26 at 08:20 > -0400, Hameer Abbasi wrote: > Actually, the behavior right now isn?t > that of `default` but that of > `initializer` or `start`. > > This > was > discussed further down in the PR but to reiterate: > `np.sum([10], > initializer=5)` becomes `15`. > > Also, `np.min([5], initializer=0)` > becomes `0`, so it isn?t really > the default value, it?s the initial > value among which the reduction > is performed. > > This was the > reason to call it initializer in the first place. I like > `initial` > and `initial_value` as well, and `start` also makes sense > but isn?t > descriptive enough. > > Hameer > Sent from Astro for Mac > > > On Mar > 26, 2018 at 12:06, Sebastian Berg > t> > wrote: > > > > Initializer or this sounds fine to me. As an other > data > point which > > I > > think has been mentioned before, `sum` uses > start and min/max use > > default. `start` does not work, unless we > also change the code to > > always use the identity if given > (currently that is not the case), > > in > > which case it might be > nice. However, "start" seems a bit like > > solving > > a different > issue in any case. > > > > Anyway, mostly noise. I really like adding > this, the only thing > > worth > > discussing a bit is the name :). > > - Sebastian > > > > > > On Mon, 2018-03-26 at 05:57 -0400, Hameer > Abbasi wrote: > > > It calls it `initializer` - See > https://docs.python.org/3.5/libra > > > ry/f > > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac > > On Mar 26, 2018 at 09:54, Eric Wieser > > > > com> > > > > wrote: > > > > > > > > It turns out I mispoke - > > functools.reduce calls the argument > > > > `initial` > > > > > > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > > wrote: > > > > > This looks like a very logical addition to the > reduce > > > > > > interface. > > > > > It has my support! > > > > > > > > > > > > I would have preferred the more descriptive name > > > > > > "initial_value", > > > > > but consistency with functools.reduce > makes > a compelling case > > > > > for > > > > > "initializer". > > > > > > > > > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > > > > > y at gm > > > > > ail.com> wrote: To reiterate my comments in > > > > > the > > issue - I'm in favor of > this. > > > > > > > > > > > > It seems seem > especially valuable for identity-less > > > > > > functions > > > > > > > (`min`, `max`, `lcm`), and the argument name is consistent > > > > > > > > > with > `functools.reduce`. too. > > > > > > > > > > > > The only > > argument I can see against merging this would be > > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > > cases to justify that. > > > > > > > > > > > > I'd like to merge in a > few days, if no one else has any > > > > > > opinions. > > > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer > Abbasi > > > > > @gma > > > > > > il.com> wrote: > > > > > > > > Hello, everyone. 
I?ve submitted a PR to add a initializer > > > > > > > > kwarg to ufunc.reduce. This is useful in a few cases, > > > > > > > > > > > > > > > e.g., > > > > > > > it allows one to supply a ?default? > > value for identity- > > > > > > > less > > > > > > > ufunc > reductions, > and specify an initial value for > > > > > > > reductions such as sum > (other than zero.) > > > > > > > > > > > > Please feel free to review > or leave feedback, (although I > > > > > think Eric and Marten have > picked it apart pretty well). > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > Thanks, > > > > > > > > > > > > > > > Hameer > > > > Sent from Astro for Mac > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > NumPy-Discussion mailing list > > > > > > > > > NumPy-Discussion at python.org > > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > > NumPy- > Discussion > mailing list > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion > mailing list > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion > mailing list > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Mon Mar 26 12:54:51 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 26 Mar 2018 18:54:51 +0200 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: <1522082927.4888.24.camel@sipsolutions.net> References: <1522077394.4888.10.camel@sipsolutions.net> <1522079156.4888.12.camel@sipsolutions.net> <1522082927.4888.24.camel@sipsolutions.net> Message-ID: <1522083291.8319.3.camel@sipsolutions.net> On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg wrote: > On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote: > > It'll need to be thought out for object arrays and subclasses. But > > for > > Regular numeric stuff, Numpy uses fmin and this would have the > > desired > > effect. > > I do not want to block this, but I would like a clearer opinion about > this issue, `np.nansum` as Benjamin noted would require something > like: > > np.nansum([np.nan], default=np.nan) > > because > > np.sum([1], initializer=np.nan) > np.nansum([1], initializer=np.nan) > > would both give NaN if the logic is the same as the current `np.sum`. > And yes, I guess for fmin/fmax NaN happens to work. And then there > are > many nonsense reduces which could make sense with `initializer`. > > Now nansum is not implemented in a way that could make use of the new > kwarg anyway, so maybe it does not matter in some sense. We can in > principle use `default` in nansum and at some point possibly add > `default` to the normal ufuncs. If we argue like that, the only > annoying thing is the `object` dtype which confuses the two use cases > currently. > > This confusion IMO is not harmless, because I might want to use it > (e.g. sum with initializer=5), and I would expect things like > dropping > in `decimal.Decimal` to work most of the time, while here it would > give > silently bad results. > In other words: I am very very much in favor if you get rid that object dtype special case. I frankly not see why not (except that it needs a bit more code change). If given explicitly, we might as well force the use and not do the funny stuff which is designed to be more type agnostic! If it happens to fail due to not being type agnostic, it will at least fail loudly. If you leave that object special case I am *very* hesitant about it. That I think I would like a `default` argument as well, is another issue and it can wait to another day. - Sebastian > - Sebastian > > > > > > > On 26/03/2018 at 17:45, Sebastian wrote: On Mon, 2018-03-26 at > > 11:39 -0400, Hameer Abbasi wrote: That is the idea, but NaN > > functions > > are in a separate branch for another PR to be discussed later. You > > can > > see it on my fork, if you're interested. Except that as far as I > > understand I am not sure it will help much with it, since it is not > > a > > default, but an initializer. Initializing to NaN would just make > > all > > results NaN. - Sebastian On 26/03/2018 at 17:35, Benjamin wrote: > > Hmm, > > this is neat. I imagine it would finally give some people a choice > > on > > what np.nansum([np.nan]) should return? It caused a huge hullabeloo > > a > > few years ago when we changed it from returning NaN to returning > > zero. 
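For reference on the nansum point just quoted, the behaviour being
discussed looks roughly like this today; the `default=` spelling is
hypothetical and is not part of the patch:

    import numpy as np

    np.nansum([np.nan])        # -> 0.0 today: an all-NaN input sums to zero
    np.fmin.reduce([np.nan])   # -> nan: fmin/fmax already propagate an
                               #    all-NaN input

    # A `default` would only apply to the empty/all-NaN case, e.g.
    # np.nansum([np.nan], default=np.nan)   # -> nan (hypothetical)
    # whereas an always-applied initializer of NaN would poison every sum,
    # since np.nan + 1.0 is nan.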
> > Ben Root On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg > > wrote: OK, the new documentation is > > actually clear: initializer : scalar, optional The value with which > > to > > start the reduction. Defaults to the `~numpy.ufunc.identity` of the > > ufunc. If ``None`` is given, the first element of the reduction is > > used, and an error is thrown if the reduction is empty. If > > ``a.dtype`` > > is ``object``, then the initializer is _only_ used if reduction is > > empty. I would actually like to say that I do not like the object > > special case much (and it is probably the reason why I was > > confused), > > nor am I quite sure this is what helps a lot? Logically, I would > > argue > > there are two things: 1. initializer/start (always used) 2. default > > (oly used for empty reductions) For example, I might like to give > > `np.nan` as the default for some empty reductions, this will not > > work. > > I understand that this is a minimal invasive PR and I am not sure I > > find the solution bad enough to really dislike it, but what do > > other > > think? My first expectation was the default behaviour (in all > > cases, > > not just object case) for some reason. To be honest, for now I just > > wonder a bit: How hard would it be to do both, or is that too > > annoying? It would at least get rid of that annoying thing with > > object > > ufuncs (which currently have a default, but not really an > > identity/initializer). Best, Sebastian On Mon, 2018-03-26 at 08:20 > > -0400, Hameer Abbasi wrote: > Actually, the behavior right now > > isn?t > > that of `default` but that of > `initializer` or `start`. > > This > > was > > discussed further down in the PR but to reiterate: > `np.sum([10], > > initializer=5)` becomes `15`. > > Also, `np.min([5], > > initializer=0)` > > becomes `0`, so it isn?t really > the default value, it?s the > > initial > > value among which the reduction > is performed. > > This was the > > reason to call it initializer in the first place. I like > > > `initial` > > and `initial_value` as well, and `start` also makes sense > but > > isn?t > > descriptive enough. > > Hameer > Sent from Astro for Mac > > > On > > Mar > > 26, 2018 at 12:06, Sebastian Berg > t> > > wrote: > > > > Initializer or this sounds fine to me. As an other > > data > > point which > > I > > think has been mentioned before, `sum` uses > > start and min/max use > > default. `start` does not work, unless we > > also change the code to > > always use the identity if given > > (currently that is not the case), > > in > > which case it might be > > nice. However, "start" seems a bit like > > solving > > a different > > issue in any case. > > > > Anyway, mostly noise. I really like > > adding > > this, the only thing > > worth > > discussing a bit is the name :). > > > > > - Sebastian > > > > > > On Mon, 2018-03-26 at 05:57 -0400, Hameer > > Abbasi wrote: > > > It calls it `initializer` - See > > https://docs.python.org/3.5/libra > > > ry/f > > > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac > > > > > On Mar 26, 2018 at 09:54, Eric Wieser > > > > > > > com> > > > > wrote: > > > > > > > > It turns out I mispoke - > > > > functools.reduce calls the argument > > > > `initial` > > > > > > > > > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > > > > > wrote: > > > > > This looks like a very logical addition to the > > reduce > > > > > > > interface. > > > > > It has my support! 
> > > > > > > > > > > > > > > > > > > > > I would have preferred the more descriptive name > > > > > > > "initial_value", > > > > > but consistency with functools.reduce > > makes > > a compelling case > > > > > for > > > > > "initializer". > > > > > > > > > > > > > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > > > > > > > > > > > y at gm > > > > > ail.com> wrote: To reiterate my comments in > > > > > > the > > > > issue - I'm in favor of > this. > > > > > > > > > > > > It seems > > seem > > especially valuable for identity-less > > > > > > functions > > > > > > > > > > (`min`, `max`, `lcm`), and the argument name is consistent > > > > > > > > > > > > > > > > > with > `functools.reduce`. too. > > > > > > > > > > > > The only > > > > argument I can see against merging this would be > > > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > > > > > cases to justify that. > > > > > > > > > > > > I'd like to merge in > > a > > few days, if no one else has any > > > > > > opinions. > > > > > > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer > > Abbasi > > > > > @gma > > > > > > il.com> wrote: > > > > > > > > > > > Hello, everyone. I?ve submitted a PR to add a > > > > > > > > initializer > > > > > > > > > kwarg to ufunc.reduce. This is useful in a few cases, > > > > > > > > > > > > > > > > > > > > > > > > > > e.g., > > > > > > > it allows one to supply a ?default? > > > > value for identity- > > > > > > > less > > > > > > > ufunc > > reductions, > > and specify an initial value for > > > > > > > reductions such as > > sum > > (other than zero.) > > > > > > > > > > > > Please feel free to > > review > > or leave feedback, (although I > > > > > think Eric and Marten have > > picked it apart pretty well). 
> > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > Thanks, > > > > > > > > > > > > > > > > > > Hameer > > > > Sent from Astro for Mac > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > > > > > > > > > > > > NumPy-Discussion mailing list > > > > > > > > > > > NumPy-Discussion at python.org > > > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.o > > rg > > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussi > > > > > > > > on > > > > > > > > _______________________________________________ > > > > > > > > > > > > > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy- > > Discussion > > mailing list > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy- > > Discussion > > mailing list > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion > > mailing list > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion > > mailing list NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion > > mailing list NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From einstein.edison at gmail.com Mon Mar 26 12:59:38 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Mon, 26 Mar 2018 12:59:38 -0400 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: <1522083291.8319.3.camel@sipsolutions.net> References: <1522079156.4888.12.camel@sipsolutions.net> <1522082927.4888.24.camel@sipsolutions.net> <1522083291.8319.3.camel@sipsolutions.net> Message-ID: That may be complicated. Currently, the identity isn't used in object dtype reductions. We may need to change that, which could cause a whole lot of other backwards incompatible changes. For example, sum actually including zero in object reductions. Or we could pass in a flag saying an initializer was passed in to change that behaviour. 
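The object-dtype behaviour referred to here can be seen directly; strings
are only summable at all because the integer identity 0 is currently left
out of object reductions (illustrative only):

    import numpy as np

    arr = np.array(['spam', 'eggs'], dtype=object)
    np.add.reduce(arr)   # -> 'spameggs': starts from the first element,
                         #    not from the identity 0

    # If the identity (or a passed-in initial value) were always folded in,
    # the same call would effectively compute 0 + 'spam' + 'eggs' and raise
    # TypeError, which is the backwards-compatibility concern above.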
If this is agreed upon and someone is kind enough to point me to the code, I'd be willing to make this change. On 26/03/2018 at 18:54, Sebastian wrote: On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg wrote: On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote: It'll need to be thought out for object arrays and subclasses. But for Regular numeric stuff, Numpy uses fmin and this would have the desired effect. I do not want to block this, but I would like a clearer opinion about this issue, `np.nansum` as Benjamin noted would require something like: np.nansum([np.nan], default=np.nan) because np.sum([1], initializer=np.nan) np.nansum([1], initializer=np.nan) would both give NaN if the logic is the same as the current `np.sum`. And yes, I guess for fmin/fmax NaN happens to work. And then there are many nonsense reduces which could make sense with `initializer`. Now nansum is not implemented in a way that could make use of the new kwarg anyway, so maybe it does not matter in some sense. We can in principle use `default` in nansum and at some point possibly add `default` to the normal ufuncs. If we argue like that, the only annoying thing is the `object` dtype which confuses the two use cases currently. This confusion IMO is not harmless, because I might want to use it (e.g. sum with initializer=5), and I would expect things like dropping in `decimal.Decimal` to work most of the time, while here it would give silently bad results. In other words: I am very very much in favor if you get rid that object dtype special case. I frankly not see why not (except that it needs a bit more code change). If given explicitly, we might as well force the use and not do the funny stuff which is designed to be more type agnostic! If it happens to fail due to not being type agnostic, it will at least fail loudly. If you leave that object special case I am *very* hesitant about it. That I think I would like a `default` argument as well, is another issue and it can wait to another day. - Sebastian - Sebastian On 26/03/2018 at 17:45, Sebastian wrote: On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi wrote: That is the idea, but NaN functions are in a separate branch for another PR to be discussed later. You can see it on my fork, if you're interested. Except that as far as I understand I am not sure it will help much with it, since it is not a default, but an initializer. Initializing to NaN would just make all results NaN. - Sebastian On 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat. I imagine it would finally give some people a choice on what np.nansum([np.nan]) should return? It caused a huge hullabeloo a few years ago when we changed it from returning NaN to returning zero. Ben Root On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg wrote: OK, the new documentation is actually clear: initializer : scalar, optional The value with which to start the reduction. Defaults to the `~numpy.ufunc.identity` of the ufunc. If ``None`` is given, the first element of the reduction is used, and an error is thrown if the reduction is empty. If ``a.dtype`` is ``object``, then the initializer is _only_ used if reduction is empty. I would actually like to say that I do not like the object special case much (and it is probably the reason why I was confused), nor am I quite sure this is what helps a lot? Logically, I would argue there are two things: 1. initializer/start (always used) 2. 
default (oly used for empty reductions) For example, I might like to give `np.nan` as the default for some empty reductions, this will not work. I understand that this is a minimal invasive PR and I am not sure I find the solution bad enough to really dislike it, but what do other think? My first expectation was the default behaviour (in all cases, not just object case) for some reason. To be honest, for now I just wonder a bit: How hard would it be to do both, or is that too annoying? It would at least get rid of that annoying thing with object ufuncs (which currently have a default, but not really an identity/initializer). Best, Sebastian On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi wrote: > Actually, the behavior right now isn?t that of `default` but that of > `initializer` or `start`. > > This was discussed further down in the PR but to reiterate: > `np.sum([10], initializer=5)` becomes `15`. > > Also, `np.min([5], initializer=0)` becomes `0`, so it isn?t really > the default value, it?s the initial value among which the reduction > is performed. > > This was the reason to call it initializer in the first place. I like > `initial` and `initial_value` as well, and `start` also makes sense > but isn?t descriptive enough. > > Hameer > Sent from Astro for Mac > > > On Mar 26, 2018 at 12:06, Sebastian Berg > t> wrote: > > > > Initializer or this sounds fine to me. As an other data point which > > I > > think has been mentioned before, `sum` uses start and min/max use > > default. `start` does not work, unless we also change the code to > > always use the identity if given (currently that is not the case), > > in > > which case it might be nice. However, "start" seems a bit like > > solving > > a different issue in any case. > > > > Anyway, mostly noise. I really like adding this, the only thing > > worth > > discussing a bit is the name :). - Sebastian > > > > > > On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > > > It calls it `initializer` - See https://docs.python.org/3.5/libra > > > ry/f > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac On Mar 26, 2018 at 09:54, Eric Wieser > com> > > > > wrote: > > > > > > > > It turns out I mispoke - functools.reduce calls the argument > > > > `initial` > > > > > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > wrote: > > > > > This looks like a very logical addition to the reduce interface. > > > > > It has my support! > > > > > > > > > I would have preferred the more descriptive name > > > > > "initial_value", > > > > > but consistency with functools.reduce makes a compelling case > > > > > for > > > > > "initializer". > > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser > > > > ail.com> wrote: To reiterate my comments in the issue - I'm in favor of > this. > > > > > > > > > > > > It seems seem especially valuable for identity-less > > > > > > functions > > > > (`min`, `max`, `lcm`), and the argument name is consistent > > > with > `functools.reduce`. too. > > > > > > > > > > > > The only argument I can see against merging this would be > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > cases to justify that. > > > > > > > > > > > > I'd like to merge in a few days, if no one else has any > > > > > > opinions. > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi > > > > > @gma > > > > > > il.com> wrote: Hello, everyone. I?ve submitted a PR to add a initializer kwarg to ufunc.reduce. 
This is useful in a few cases, e.g., > > > > > > > it allows one to supply a ?default? value for identity- > > > > > > > less > > > > > > > ufunc reductions, and specify an initial value for > > > > > > > reductions such as sum (other than zero.) > > > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > think Eric and Marten have picked it apart pretty well). > > > > https://github.com/numpy/numpy/pull/10635 > > > > > Thanks, > > > > Hameer > > > > Sent from Astro for Mac > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.o rg https://mail.python.org/mailman/listinfo/numpy-discussi on _______________________________________________ > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > > NumPy- Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy- Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Mon Mar 26 13:09:52 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 26 Mar 2018 19:09:52 +0200 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: <1522079156.4888.12.camel@sipsolutions.net> <1522082927.4888.24.camel@sipsolutions.net> <1522083291.8319.3.camel@sipsolutions.net> Message-ID: <1522084192.8883.6.camel@sipsolutions.net> On Mon, 2018-03-26 at 12:59 -0400, Hameer Abbasi wrote: > That may be complicated. Currently, the identity isn't used in object > dtype reductions. We may need to change that, which could cause a > whole lot of other backwards incompatible changes. For example, sum > actually including zero in object reductions. Or we could pass in a > flag saying an initializer was passed in to change that behaviour. If > this is agreed upon and someone is kind enough to point me to the > code, I'd be willing to make this change. 
I realize the implication, I am not suggesting to change the default behaviour (when no initial=... is passed), I would think about deprecating it, but probably only if we also have the `default` argument, since otherwise you cannot replicate the old behaviour. What I think I would like to see is to change how it works if (and only if) the initializer is passed in. Yes, this will require holding on to some extra information since you will have to know/remember whether the "identity" was passed in or defined otherwise. I did not check the code, but I would hope that it is not awfully tricky to do that. - Sebastian PS: A side note, but I see your emails as a single block of text with no/broken new-lines. > On 26/03/2018 at 18:54, > Sebastian wrote: On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg > wrote: On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote: It'll > need to be thought out for object arrays and subclasses. But for > Regular numeric stuff, Numpy uses fmin and this would have the > desired > effect. I do not want to block this, but I would like a clearer > opinion about this issue, `np.nansum` as Benjamin noted would require > something like: np.nansum([np.nan], default=np.nan) because > np.sum([1], initializer=np.nan) np.nansum([1], initializer=np.nan) > would both give NaN if the logic is the same as the current `np.sum`. > And yes, I guess for fmin/fmax NaN happens to work. And then there > are > many nonsense reduces which could make sense with `initializer`. Now > nansum is not implemented in a way that could make use of the new > kwarg anyway, so maybe it does not matter in some sense. We can in > principle use `default` in nansum and at some point possibly add > `default` to the normal ufuncs. If we argue like that, the only > annoying thing is the `object` dtype which confuses the two use cases > currently. This confusion IMO is not harmless, because I might want > to > use it (e.g. sum with initializer=5), and I would expect things like > dropping in `decimal.Decimal` to work most of the time, while here it > would give silently bad results. In other words: I am very very much > in favor if you get rid that object dtype special case. I frankly not > see why not (except that it needs a bit more code change). If given > explicitly, we might as well force the use and not do the funny stuff > which is designed to be more type agnostic! If it happens to fail due > to not being type agnostic, it will at least fail loudly. If you > leave > that object special case I am *very* hesitant about it. That I think > I > would like a `default` argument as well, is another issue and it can > wait to another day. - Sebastian - Sebastian On 26/03/2018 at 17:45, > Sebastian wrote: On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi > wrote: That is the idea, but NaN functions are in a separate branch > for another PR to be discussed later. You can see it on my fork, if > you're interested. Except that as far as I understand I am not sure > it > will help much with it, since it is not a default, but an > initializer. > Initializing to NaN would just make all results NaN. - Sebastian On > 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat. I imagine it > would finally give some people a choice on what np.nansum([np.nan]) > should return? It caused a huge hullabeloo a few years ago when we > changed it from returning NaN to returning zero. 
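Coming back to the "know/remember whether the identity was passed in"
point above: one way to express that distinction at the Python level is a
sentinel default. This is only a sketch of the idea; the names
`_NoValue` and `reduce_sketch` are made up here and the real work would
happen in the C reduce machinery:

    import numpy as np

    _NoValue = object()  # sentinel: tells "not passed" apart from any real value

    def reduce_sketch(ufunc, arr, initial=_NoValue):
        if initial is _NoValue:
            # Old behaviour: fall back to ufunc.identity, keeping the
            # existing object-dtype special case discussed in this thread.
            return ufunc.reduce(arr)
        # New behaviour: an explicitly passed value is always folded in,
        # even if it happens to equal ufunc.identity (emulated here by
        # prepending it to the flattened input).
        flat = np.concatenate(([initial], np.ravel(arr)))
        return ufunc.reduce(flat)

NumPy already uses a similar `np._NoValue` sentinel for optional keyword
arguments elsewhere (e.g. `keepdims`), so the pattern is not foreign to the
codebase.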
Ben Root On Mon, Mar > 26, 2018 at 11:16 AM, Sebastian Berg > wrote: OK, the new documentation is actually clear: initializer : > scalar, optional The value with which to start the reduction. > Defaults > to the `~numpy.ufunc.identity` of the ufunc. If ``None`` is given, > the > first element of the reduction is used, and an error is thrown if the > reduction is empty. If ``a.dtype`` is ``object``, then the > initializer > is _only_ used if reduction is empty. I would actually like to say > that I do not like the object special case much (and it is probably > the reason why I was confused), nor am I quite sure this is what > helps > a lot? Logically, I would argue there are two things: 1. > initializer/start (always used) 2. default (oly used for empty > reductions) For example, I might like to give `np.nan` as the default > for some empty reductions, this will not work. I understand that this > is a minimal invasive PR and I am not sure I find the solution bad > enough to really dislike it, but what do other think? My first > expectation was the default behaviour (in all cases, not just object > case) for some reason. To be honest, for now I just wonder a bit: How > hard would it be to do both, or is that too annoying? It would at > least get rid of that annoying thing with object ufuncs (which > currently have a default, but not really an identity/initializer). > Best, Sebastian On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi > wrote: > Actually, the behavior right now isn?t that of `default` but > that of > `initializer` or `start`. > > This was discussed further > down in the PR but to reiterate: > `np.sum([10], initializer=5)` > becomes `15`. > > Also, `np.min([5], initializer=0)` becomes `0`, so > it isn?t really > the default value, it?s the initial value among > which the reduction > is performed. > > This was the reason to call > it > initializer in the first place. I like > `initial` and > `initial_value` > as well, and `start` also makes sense > but isn?t descriptive enough. > > > Hameer > Sent from Astro for Mac > > > On Mar 26, 2018 at 12:06, > > Sebastian Berg > t> wrote: > > > > > Initializer or this sounds fine to me. As an other data point which > > > I > > think has been mentioned before, `sum` uses start and min/max > > use > > default. `start` does not work, unless we also change the > code > to > > always use the identity if given (currently that is not the > case), > > in > > which case it might be nice. However, "start" seems > a bit like > > solving > > a different issue in any case. > > > > > Anyway, mostly noise. I really like adding this, the only thing > > > worth > > discussing a bit is the name :). - Sebastian > > > > > > On > Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > > > It calls > it > `initializer` - See https://docs.python.org/3.5/libra > > > ry/f > > > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac On > Mar 26, 2018 at 09:54, Eric Wieser > com> > > > > > wrote: > > > > > > > > It turns out I mispoke - > > functools.reduce calls the argument > > > > `initial` > > > > > > > > On > Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > wrote: > > > > > This looks like a very logical addition to the > reduce > interface. > > > > > It has my support! > > > > > > > > > I would > have > preferred the more descriptive name > > > > > "initial_value", > > > > > > > but consistency with functools.reduce makes a compelling case > > > > > > for > > > > > "initializer". 
> > > > > On Sun, Mar 25, 2018 at > > 1:15 PM Eric Wieser > > > > ail.com> wrote: > To reiterate my comments in the issue - I'm in favor of > this. > > > > > > > > > > > > > It seems seem especially valuable for identity-less > > > > > > > > > > > > > > functions > > > > (`min`, `max`, `lcm`), and the argument > > name is consistent > > > with > `functools.reduce`. too. > > > > > > > > > > > > > > The only argument I can see against merging this would be > > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > > > > > > > > > > > cases to justify that. > > > > > > > > > > > > I'd like to > > > > merge > > in a few days, if no one else has any > > > > > > opinions. > > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer > > Abbasi > > > > > @gma > > > > > > il.com> wrote: > Hello, everyone. I?ve submitted a PR to add a initializer kwarg to > ufunc.reduce. This is useful in a few cases, e.g., > > > > > > > it > allows one to supply a ?default? value for identity- > > > > > > > > less > > > > > > > ufunc reductions, and specify an initial value for > > > > > > > > reductions such as sum (other than zero.) > > > > > > > > > > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > > > > > > > think Eric and Marten have picked it apart pretty well). > > > > > > > > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > Thanks, > > > > > > Hameer > > > > Sent from Astro for Mac > > > > > > _______________________________________________ > > > NumPy- > Discussion > mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.o > rg > https://mail.python.org/mailman/listinfo/numpy-discussi on > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > > NumPy- > Discussion mailing list > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy- Discussion > mailing list > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion > mailing list > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at python.org > 
https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From wieser.eric+numpy at gmail.com Mon Mar 26 13:40:34 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Mon, 26 Mar 2018 17:40:34 +0000 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: <1522084192.8883.6.camel@sipsolutions.net> References: <1522079156.4888.12.camel@sipsolutions.net> <1522082927.4888.24.camel@sipsolutions.net> <1522083291.8319.3.camel@sipsolutions.net> <1522084192.8883.6.camel@sipsolutions.net> Message-ID: The difficulty in supporting object arrays is that func.reduce(arr, initial=func.identity) and func.reduce(arr) have different meanings - whereas with the current patch, they are equivalent. ? On Mon, 26 Mar 2018 at 10:10 Sebastian Berg wrote: > On Mon, 2018-03-26 at 12:59 -0400, Hameer Abbasi wrote: > > That may be complicated. Currently, the identity isn't used in object > > dtype reductions. We may need to change that, which could cause a > > whole lot of other backwards incompatible changes. For example, sum > > actually including zero in object reductions. Or we could pass in a > > flag saying an initializer was passed in to change that behaviour. If > > this is agreed upon and someone is kind enough to point me to the > > code, I'd be willing to make this change. > > I realize the implication, I am not suggesting to change the default > behaviour (when no initial=... is passed), I would think about > deprecating it, but probably only if we also have the `default` > argument, since otherwise you cannot replicate the old behaviour. > > What I think I would like to see is to change how it works if (and only > if) the initializer is passed in. Yes, this will require holding on to > some extra information since you will have to know/remember whether the > "identity" was passed in or defined otherwise. > > I did not check the code, but I would hope that it is not awfully > tricky to do that. > > - Sebastian > > > PS: A side note, but I see your emails as a single block of text with > no/broken new-lines. > > > > On 26/03/2018 at 18:54, > > Sebastian wrote: On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg > > wrote: On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote: It'll > > need to be thought out for object arrays and subclasses. But for > > Regular numeric stuff, Numpy uses fmin and this would have the > > desired > > effect. I do not want to block this, but I would like a clearer > > opinion about this issue, `np.nansum` as Benjamin noted would require > > something like: np.nansum([np.nan], default=np.nan) because > > np.sum([1], initializer=np.nan) np.nansum([1], initializer=np.nan) > > would both give NaN if the logic is the same as the current `np.sum`. > > And yes, I guess for fmin/fmax NaN happens to work. And then there > > are > > many nonsense reduces which could make sense with `initializer`. Now > > nansum is not implemented in a way that could make use of the new > > kwarg anyway, so maybe it does not matter in some sense. We can in > > principle use `default` in nansum and at some point possibly add > > `default` to the normal ufuncs. 
If we argue like that, the only > > annoying thing is the `object` dtype which confuses the two use cases > > currently. This confusion IMO is not harmless, because I might want > > to > > use it (e.g. sum with initializer=5), and I would expect things like > > dropping in `decimal.Decimal` to work most of the time, while here it > > would give silently bad results. In other words: I am very very much > > in favor if you get rid that object dtype special case. I frankly not > > see why not (except that it needs a bit more code change). If given > > explicitly, we might as well force the use and not do the funny stuff > > which is designed to be more type agnostic! If it happens to fail due > > to not being type agnostic, it will at least fail loudly. If you > > leave > > that object special case I am *very* hesitant about it. That I think > > I > > would like a `default` argument as well, is another issue and it can > > wait to another day. - Sebastian - Sebastian On 26/03/2018 at 17:45, > > Sebastian wrote: On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi > > wrote: That is the idea, but NaN functions are in a separate branch > > for another PR to be discussed later. You can see it on my fork, if > > you're interested. Except that as far as I understand I am not sure > > it > > will help much with it, since it is not a default, but an > > initializer. > > Initializing to NaN would just make all results NaN. - Sebastian On > > 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat. I imagine it > > would finally give some people a choice on what np.nansum([np.nan]) > > should return? It caused a huge hullabeloo a few years ago when we > > changed it from returning NaN to returning zero. Ben Root On Mon, Mar > > 26, 2018 at 11:16 AM, Sebastian Berg > > wrote: OK, the new documentation is actually clear: initializer : > > scalar, optional The value with which to start the reduction. > > Defaults > > to the `~numpy.ufunc.identity` of the ufunc. If ``None`` is given, > > the > > first element of the reduction is used, and an error is thrown if the > > reduction is empty. If ``a.dtype`` is ``object``, then the > > initializer > > is _only_ used if reduction is empty. I would actually like to say > > that I do not like the object special case much (and it is probably > > the reason why I was confused), nor am I quite sure this is what > > helps > > a lot? Logically, I would argue there are two things: 1. > > initializer/start (always used) 2. default (oly used for empty > > reductions) For example, I might like to give `np.nan` as the default > > for some empty reductions, this will not work. I understand that this > > is a minimal invasive PR and I am not sure I find the solution bad > > enough to really dislike it, but what do other think? My first > > expectation was the default behaviour (in all cases, not just object > > case) for some reason. To be honest, for now I just wonder a bit: How > > hard would it be to do both, or is that too annoying? It would at > > least get rid of that annoying thing with object ufuncs (which > > currently have a default, but not really an identity/initializer). > > Best, Sebastian On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi > > wrote: > Actually, the behavior right now isn?t that of `default` but > > that of > `initializer` or `start`. > > This was discussed further > > down in the PR but to reiterate: > `np.sum([10], initializer=5)` > > becomes `15`. 
> > Also, `np.min([5], initializer=0)` becomes `0`, so > > it isn?t really > the default value, it?s the initial value among > > which the reduction > is performed. > > This was the reason to call > > it > > initializer in the first place. I like > `initial` and > > `initial_value` > > as well, and `start` also makes sense > but isn?t descriptive enough. > > > > Hameer > Sent from Astro for Mac > > > On Mar 26, 2018 at 12:06, > > > > Sebastian Berg > t> wrote: > > > > > > Initializer or this sounds fine to me. As an other data point which > > > > I > > think has been mentioned before, `sum` uses start and min/max > > > > use > > default. `start` does not work, unless we also change the > > code > > to > > always use the identity if given (currently that is not the > > case), > > in > > which case it might be nice. However, "start" seems > > a bit like > > solving > > a different issue in any case. > > > > > > Anyway, mostly noise. I really like adding this, the only thing > > > > worth > > discussing a bit is the name :). - Sebastian > > > > > > On > > Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote: > > > It calls > > it > > `initializer` - See https://docs.python.org/3.5/libra > > > ry/f > > > > > > > unctools.html#functools.reduce > > > > > > Sent from Astro for Mac On > > Mar 26, 2018 at 09:54, Eric Wieser > com> > > > > > > wrote: > > > > > > > > It turns out I mispoke - > > > > functools.reduce calls the argument > > > > `initial` > > > > > > > > > On > > Mon, 26 Mar 2018 at 00:17 Stephan Hoyer > > > > > wrote: > > > > > This looks like a very logical addition to the > > reduce > > interface. > > > > > It has my support! > > > > > > > > > I would > > have > > preferred the more descriptive name > > > > > "initial_value", > > > > > > > > > but consistency with functools.reduce makes a compelling case > > > > > > > for > > > > > "initializer". > > > > > On Sun, Mar 25, 2018 at > > > > 1:15 PM Eric Wieser > > > > ail.com> wrote: > > To reiterate my comments in the issue - I'm in favor of > this. > > > > > > > > > > > > > > It seems seem especially valuable for identity-less > > > > > > > > > > > > > > > > functions > > > > (`min`, `max`, `lcm`), and the argument > > > > name is consistent > > > with > `functools.reduce`. too. > > > > > > > > > > > > > > > > The only argument I can see against merging this would be > > > > > > > > `kwarg`-creep of `reduce`, and I think this has enough use > > > > > > > > > > > > > > > > > > cases to justify that. > > > > > > > > > > > > I'd like to > > > > > merge > > > > in a few days, if no one else has any > > > > > > opinions. > > > > > > > > Eric > > > > > > > > > > > > On Fri, 16 Mar 2018 at 10:13 Hameer > > > > Abbasi > > > > > @gma > > > > > > il.com> wrote: > > Hello, everyone. I?ve submitted a PR to add a initializer kwarg to > > ufunc.reduce. This is useful in a few cases, e.g., > > > > > > > it > > allows one to supply a ?default? value for identity- > > > > > > > > > less > > > > > > > ufunc reductions, and specify an initial value for > > > > > > > > > reductions such as sum (other than zero.) > > > > > > > > > > > > > > > > > > > > > Please feel free to review or leave feedback, (although I > > > > > > > > > > > > > think Eric and Marten have picked it apart pretty well). 
> > > > > > > > > > > > > > > https://github.com/numpy/numpy/pull/10635 > > > > > Thanks, > > > > > > > > Hameer > > > > Sent from Astro for Mac > > > > > > > _______________________________________________ > > > NumPy- > > Discussion > > mailing list > > > > > > > NumPy-Discussion at python.org > > > > > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at python.o > > rg > > https://mail.python.org/mailman/listinfo/numpy-discussi on > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy- > > Discussion mailing list > > > NumPy-Discussion at python.org > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy- Discussion > > mailing list > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion > > mailing list > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion > > mailing list NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion > > mailing list NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion > > mailing list NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion > > mailing list NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Mar 26 14:09:00 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 26 Mar 2018 20:09:00 +0200 Subject: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions) In-Reply-To: References: <1522079156.4888.12.camel@sipsolutions.net> <1522082927.4888.24.camel@sipsolutions.net> <1522083291.8319.3.camel@sipsolutions.net> <1522084192.8883.6.camel@sipsolutions.net> Message-ID: <1522087740.11797.7.camel@sipsolutions.net> On Mon, 2018-03-26 at 17:40 +0000, Eric Wieser wrote: > The difficulty in supporting object arrays is that func.reduce(arr, > initial=func.identity) and func.reduce(arr) have different meanings - > whereas with the current patch, they are equivalent. 
True, but the current meaning is:

    func.reduce(arr, initial=, default=func.identity)

in the case of object dtype. Luckily for normal dtypes, func.identity is
both the correct default "default" and a no-op for initial. Thus the name
"identity" kinda works there.

I am also not really sure that both kwargs would make real sense (plus
initial probably disallows default...), but I got some feeling that the
"default" meaning may be even more useful to simplify special casing the
empty case.

Anyway, still just pointing out that it gives me some headaches to see
such a special case for objects :(.

- Sebastian

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ben.v.root at gmail.com Mon Mar 26 14:24:27 2018 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 26 Mar 2018 14:24:27 -0400 Subject: [Numpy-discussion] Right way to do fancy indexing from argsort() result? Message-ID: I seem to be losing my mind... I can't seem to get this to work right. I have a (N, k) array `distances` (along with a bunch of other arrays of the same shape). I need to resort the rows, so I do: indexs = np.argsort(distances, axis=1) How do I use this index array correctly to get back distances sorted along rows? Note, telling me to use `np.sort()` isn't going to work because I need to apply the same indexing to a couple of other arrays. new_dists = distances[indexs] gives me a (N, k, k) array, while new_dists = np.take(indexs, axis=1) gives me a (N, N, k) array. What am I missing? Thanks! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Mar 26 14:28:53 2018 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 Mar 2018 11:28:53 -0700 Subject: [Numpy-discussion] Right way to do fancy indexing from argsort() result? In-Reply-To: References: Message-ID: On Mon, Mar 26, 2018 at 11:24 AM, Benjamin Root wrote: > > I seem to be losing my mind... I can't seem to get this to work right. > > I have a (N, k) array `distances` (along with a bunch of other arrays of the same shape). I need to resort the rows, so I do: > > indexs = np.argsort(distances, axis=1) > > How do I use this index array correctly to get back distances sorted along rows? Note, telling me to use `np.sort()` isn't going to work because I need to apply the same indexing to a couple of other arrays. > > new_dists = distances[indexs] > > gives me a (N, k, k) array, while > > new_dists = np.take(indexs, axis=1) > > gives me a (N, N, k) array. > > What am I missing? Broadcasting! new_dists = distances[np.arange(N)[:, np.newaxis], indexs] -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Mon Mar 26 14:34:21 2018 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 26 Mar 2018 14:34:21 -0400 Subject: [Numpy-discussion] Right way to do fancy indexing from argsort() result? In-Reply-To: References: Message-ID: Ah, yes, I should have thought about that. Kind of seems like something that we could make `np.take()` do, somehow, for something that is easier to read. Thank you! Ben Root On Mon, Mar 26, 2018 at 2:28 PM, Robert Kern wrote: > On Mon, Mar 26, 2018 at 11:24 AM, Benjamin Root > wrote: > > > > I seem to be losing my mind... I can't seem to get this to work right. > > > > I have a (N, k) array `distances` (along with a bunch of other arrays of > the same shape). I need to resort the rows, so I do: > > > > indexs = np.argsort(distances, axis=1) > > > > How do I use this index array correctly to get back distances sorted > along rows? Note, telling me to use `np.sort()` isn't going to work because > I need to apply the same indexing to a couple of other arrays. > > > > new_dists = distances[indexs] > > > > gives me a (N, k, k) array, while > > > > new_dists = np.take(indexs, axis=1) > > > > gives me a (N, N, k) array. > > > > What am I missing? > > Broadcasting! 
> > new_dists = distances[np.arange(N)[:, np.newaxis], indexs] > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Mon Mar 26 14:36:36 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Mon, 26 Mar 2018 18:36:36 +0000 Subject: [Numpy-discussion] Right way to do fancy indexing from argsort() result? In-Reply-To: References: Message-ID: https://github.com/numpy/numpy/issues/8708 is a proposal to add such a function, with an implementation in https://github.com/numpy/numpy/pull/8714 Eric On Mon, 26 Mar 2018 at 11:35 Benjamin Root wrote: > Ah, yes, I should have thought about that. Kind of seems like something > that we could make `np.take()` do, somehow, for something that is easier to > read. > > Thank you! > Ben Root > > > On Mon, Mar 26, 2018 at 2:28 PM, Robert Kern > wrote: > >> On Mon, Mar 26, 2018 at 11:24 AM, Benjamin Root >> wrote: >> > >> > I seem to be losing my mind... I can't seem to get this to work right. >> > >> > I have a (N, k) array `distances` (along with a bunch of other arrays >> of the same shape). I need to resort the rows, so I do: >> > >> > indexs = np.argsort(distances, axis=1) >> > >> > How do I use this index array correctly to get back distances sorted >> along rows? Note, telling me to use `np.sort()` isn't going to work because >> I need to apply the same indexing to a couple of other arrays. >> > >> > new_dists = distances[indexs] >> > >> > gives me a (N, k, k) array, while >> > >> > new_dists = np.take(indexs, axis=1) >> > >> > gives me a (N, N, k) array. >> > >> > What am I missing? >> >> Broadcasting! >> >> new_dists = distances[np.arange(N)[:, np.newaxis], indexs] >> >> -- >> Robert Kern >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 26 21:24:49 2018 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Mar 2018 18:24:49 -0700 Subject: [Numpy-discussion] round(numpy.float64(0.0)) is a numpy.float64 In-Reply-To: References: <422941419.2737564.1521718689632.JavaMail.zimbra@laposte.net> Message-ID: Even knowing that, it's still confusing that round(np.float64(0.0)) isn't the same as round(0.0). The reason is a Python 2 / Python 3 thing: in Python 2, round returns a float, while on Python 3, it returns an integer ? but numpy still uses the python 2 behavior everywhere. I'm not sure if it's possible or worthwhile to change this. If we'd changed it when we first added python 3 support then it would have been easy (and obviously a good idea), but at this point it might be tricky? -n On Thu, Mar 22, 2018 at 12:32 PM, Nathan Goldbaum wrote: > numpy.float is an alias to the python float builtin. > > https://github.com/numpy/numpy/issues/3998 > > > On Thu, Mar 22, 2018 at 2:26 PM Olivier wrote: >> >> Hello, >> >> >> Is it normal, expected and desired that : >> >> >> round(numpy.float64(0.0)) is a numpy.float64 >> >> >> while >> >> round(numpy.float(0.0)) is an integer? 
>> >> >> I find it disturbing and misleading. What do you think? Has it already >> been >> discussed somewhere else? >> >> >> Best regards, >> >> >> Olivier >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Mon Mar 26 21:28:39 2018 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Mar 2018 18:28:39 -0700 Subject: [Numpy-discussion] round(numpy.float64(0.0)) is a numpy.float64 In-Reply-To: References: <422941419.2737564.1521718689632.JavaMail.zimbra@laposte.net> Message-ID: On Mon, Mar 26, 2018 at 6:24 PM, Nathaniel Smith wrote: > Even knowing that, it's still confusing that round(np.float64(0.0)) > isn't the same as round(0.0). The reason is a Python 2 / Python 3 > thing: in Python 2, round returns a float, while on Python 3, it > returns an integer ? but numpy still uses the python 2 behavior > everywhere. > > I'm not sure if it's possible or worthwhile to change this. If we'd > changed it when we first added python 3 support then it would have > been easy (and obviously a good idea), but at this point it might be > tricky? Oh right, and I forgot: part of the reason it's tricky is that it really would have to return a Python 'int', *not* any of numpy's integer types, because floats have a much larger range than numpy integers, e.g.: In [4]: round(1e50) Out[4]: 100000000000000007629769841091887003294964970946560 In [5]: round(np.float64(1e50)) Out[5]: 1e+50 In [6]: np.uint64(round(np.float64(1e50))) Out[6]: 0 (Actually that last case illustrates another weird inconsistency: np.uint64(1e50) -> OverflowError, but np.uint64(np.float64(1e50)) -> 0. I have no idea what's going on there.) -n -- Nathaniel J. Smith -- https://vorpus.org From robert.kern at gmail.com Mon Mar 26 22:29:10 2018 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 Mar 2018 19:29:10 -0700 Subject: [Numpy-discussion] round(numpy.float64(0.0)) is a numpy.float64 In-Reply-To: References: <422941419.2737564.1521718689632.JavaMail.zimbra@laposte.net> Message-ID: On Mon, Mar 26, 2018 at 6:28 PM, Nathaniel Smith wrote: > > On Mon, Mar 26, 2018 at 6:24 PM, Nathaniel Smith wrote: > > Even knowing that, it's still confusing that round(np.float64(0.0)) > > isn't the same as round(0.0). The reason is a Python 2 / Python 3 > > thing: in Python 2, round returns a float, while on Python 3, it > > returns an integer ? but numpy still uses the python 2 behavior > > everywhere. > > > > I'm not sure if it's possible or worthwhile to change this. If we'd > > changed it when we first added python 3 support then it would have > > been easy (and obviously a good idea), but at this point it might be > > tricky? > > Oh right, and I forgot: part of the reason it's tricky is that it > really would have to return a Python 'int', *not* any of numpy's > integer types, because floats have a much larger range than numpy > integers, e.g.: I don't think that's the tricky part. We don't have to change anything but our implementation of Python 3's __round__() special method for np.generic scalar types, which would be straightforward. 
The only issue, besides backwards compatibility, is that it would introduce
a new inconsistency between scalars and arrays (which can't use the Python
ints). However, that's "paid for" by the increased compatibility with the
rest of Python. For a special method that is used to interoperate with a
Python builtin function, that's probably the more important consistency to
worry about.

As for the backwards compatibility concern, I don't think it would matter
much. Everyone who has written code that expects round(np.float64(...)) to
return a np.float64 is probably already wrapping that with int() anyways.
Anyone who really wants to keep the scalar type of the output same as the
input can use np.around().

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com  Tue Mar 27 01:03:43 2018
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 27 Mar 2018 01:03:43 -0400
Subject: [Numpy-discussion] round(numpy.float64(0.0)) is a numpy.float64
In-Reply-To: 
References: <422941419.2737564.1521718689632.JavaMail.zimbra@laposte.net>
Message-ID: 

On Mon, Mar 26, 2018 at 10:29 PM, Robert Kern wrote:
> [...]
> As for the backwards compatibility concern, I don't think it would matter
> much. Everyone who has written code that expects round(np.float64(...)) to
> return a np.float64 is probably already wrapping that with int() anyways.
> Anyone who really wants to keep the scalar type of the output same as the
> input can use np.around().

same would need to apply for ceil, floor, trunc, I guess.

However, np.round has a decimal argument that I use pretty often and
that needs to return a float

>>> np.round(5.33333, 2)
5.3300000000000001

Python makes the return type conditional on whether ndigits is used or not
AFAICS.
>>> round(5.33333, 0)
5.0
>>> round(5.33333)
5

(I'm currently using Python 3.4.4)

Josef

From robert.kern at gmail.com  Tue Mar 27 01:52:35 2018
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 26 Mar 2018 22:52:35 -0700
Subject: [Numpy-discussion] round(numpy.float64(0.0)) is a numpy.float64
In-Reply-To: 
References: <422941419.2737564.1521718689632.JavaMail.zimbra@laposte.net>
Message-ID: 

On Mon, Mar 26, 2018 at 10:03 PM, wrote:

> same would need to apply for ceil, floor, trunc, I guess.

ceil and floor don't have __special__ methods for them; math.ceil() and
math.floor() do not defer their implementation to the type. math.trunc()
might (there is a __trunc__), but it looks like math.trunc(np.float64(...))
already returns an int.

I'm not suggesting changing np.ceil(), np.floor(), etc. Nor am I suggesting
that we change np.around(), np.round(), or the .round() method on scalar
types. Only .__round__().

> However, np.round has a decimal argument that I use pretty often and
> that needs to return a float
>
> >>> np.round(5.33333, 2)
> 5.3300000000000001
>
> Python makes the return type conditional on whether ndigits is used or not
> AFAICS.
> >>> round(5.33333, 0)
> 5.0
> >>> round(5.33333)
> 5

Sorry, I took that as a given. If someone followed my suggestion to
implement np.generic.__round__, yes, I intended that they handle both cases
correctly.
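For concreteness, a rough sketch of the semantics I have in mind
(illustrative only -- this is not the actual implementation, and the real
change would have to live in NumPy's C-level scalar types rather than in a
Python helper like this):

import numpy as np

def scalar_round(x, ndigits=None):
    # Hypothetical behaviour of np.generic.__round__:
    if ndigits is None:
        # Like builtin round() on Python 3: return a Python int, whose
        # unlimited range copes with values such as 1e50.
        return round(float(x))
    # With ndigits given, keep the scalar type, as np.around() does.
    return type(x)(np.around(x, decimals=ndigits))

print(scalar_round(np.float64(0.0)))         # 0 (a Python int)
print(scalar_round(np.float64(5.33333), 2))  # 5.33 (still a np.float64)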
But also, to reiterate, I'm not suggesting that we change np.round(). Only the behavior of numpy scalar types under the builtin round() function. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From Catherine.M.Moroney at jpl.nasa.gov Wed Mar 28 20:56:12 2018 From: Catherine.M.Moroney at jpl.nasa.gov (Moroney, Catherine M (398E)) Date: Thu, 29 Mar 2018 00:56:12 +0000 Subject: [Numpy-discussion] best way of speeding up a filtering-like algorithm Message-ID: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov> Hello, I have the following sample code (pretty simple algorithm that uses a rolling filter window) and am wondering what the best way is of speeding it up. I tried rewriting it in Cython by pre-declaring the variables but that didn?t buy me a lot of time. Then I rewrote it in Fortran (and compiled it with f2py) and now it?s lightning fast. But I would still like to know if I could rewrite it in pure python/numpy/scipy or in Cython and get a similar speedup. Here is the raw Python code: def mixed_coastline_slow(nsidc, radius, count, mask=None): nsidc_copy = numpy.copy(nsidc) if (mask is None): idx_coastline = numpy.where(nsidc_copy == NSIDC_COASTLINE_MIXED) else: idx_coastline = numpy.where(mask & (nsidc_copy == NSIDC_COASTLINE_MIXED)) for (irow0, icol0) in zip(idx_coastline[0], idx_coastline[1]): rows = ( max(irow0-radius, 0), min(irow0+radius+1, nsidc_copy.shape[0]) ) cols = ( max(icol0-radius, 0), min(icol0+radius+1, nsidc_copy.shape[1]) ) window = nsidc[rows[0]:rows[1], cols[0]:cols[1]] npoints = numpy.where(window != NSIDC_COASTLINE_MIXED, True, False).sum() nsnowice = numpy.where( (window >= NSIDC_SEAICE_LOW) & (window <= NSIDC_FRESHSNOW), \ True, False).sum() if (100.0*nsnowice/npoints >= count): nsidc_copy[irow0, icol0] = MISR_SEAICE_THRESHOLD return nsidc_copy and here is my attempt at Cython-izing it: import numpy cimport numpy as cnumpy cimport cython cdef int NSIDC_SIZE = 721 cdef int NSIDC_NO_SNOW = 0 cdef int NSIDC_ALL_SNOW = 100 cdef int NSIDC_FRESHSNOW = 103 cdef int NSIDC_PERMSNOW = 101 cdef int NSIDC_SEAICE_LOW = 1 cdef int NSIDC_SEAICE_HIGH = 100 cdef int NSIDC_COASTLINE_MIXED = 252 cdef int NSIDC_SUSPECT_ICE = 253 cdef int MISR_SEAICE_THRESHOLD = 6 def mixed_coastline(cnumpy.ndarray[cnumpy.uint8_t, ndim=2] nsidc, int radius, int count): cdef int irow, icol, irow1, irow2, icol1, icol2, npoints, nsnowice cdef cnumpy.ndarray[cnumpy.uint8_t, ndim=2] nsidc2 \ = numpy.empty(shape=(NSIDC_SIZE, NSIDC_SIZE), dtype=numpy.uint8) cdef cnumpy.ndarray[cnumpy.uint8_t, ndim=2] window \ = numpy.empty(shape=(2*radius+1, 2*radius+1), dtype=numpy.uint8) nsidc2 = numpy.copy(nsidc) idx_coastline = numpy.where(nsidc2 == NSIDC_COASTLINE_MIXED) for (irow, icol) in zip(idx_coastline[0], idx_coastline[1]): irow1 = max(irow-radius, 0) irow2 = min(irow+radius+1, NSIDC_SIZE) icol1 = max(icol-radius, 0) icol2 = min(icol+radius+1, NSIDC_SIZE) window = nsidc[irow1:irow2, icol1:icol2] npoints = numpy.where(window != NSIDC_COASTLINE_MIXED, True, False).sum() nsnowice = numpy.where( (window >= NSIDC_SEAICE_LOW) & (window <= NSIDC_FRESHSNOW), \ True, False).sum() if (100.0*nsnowice/npoints >= count): nsidc2[irow, icol] = MISR_SEAICE_THRESHOLD return nsidc2 Thanks in advance for any advice! Catherine -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wieser.eric+numpy at gmail.com Wed Mar 28 21:43:33 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Thu, 29 Mar 2018 01:43:33 +0000 Subject: [Numpy-discussion] best way of speeding up a filtering-like algorithm In-Reply-To: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov> References: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov> Message-ID: Well, one tip to start with: numpy.where(some_comparison, True, False) is the same as but slower than some_comparison Eric On Wed, 28 Mar 2018 at 18:36 Moroney, Catherine M (398E) < Catherine.M.Moroney at jpl.nasa.gov> wrote: > Hello, > > > > I have the following sample code (pretty simple algorithm that uses a > rolling filter window) and am wondering what the best way is of speeding it > up. I tried rewriting it in Cython by pre-declaring the variables but that > didn?t buy me a lot of time. Then I rewrote it in Fortran (and compiled it > with f2py) and now it?s lightning fast. But I would still like to know if > I could rewrite it in pure python/numpy/scipy or in Cython and get a > similar speedup. > > > > Here is the raw Python code: > > > > def mixed_coastline_slow(nsidc, radius, count, mask=None): > > > > nsidc_copy = numpy.copy(nsidc) > > > > if (mask is None): > > idx_coastline = numpy.where(nsidc_copy == NSIDC_COASTLINE_MIXED) > > else: > > idx_coastline = numpy.where(mask & (nsidc_copy == > NSIDC_COASTLINE_MIXED)) > > > > for (irow0, icol0) in zip(idx_coastline[0], idx_coastline[1]): > > > > rows = ( max(irow0-radius, 0), min(irow0+radius+1, > nsidc_copy.shape[0]) ) > > cols = ( max(icol0-radius, 0), min(icol0+radius+1, > nsidc_copy.shape[1]) ) > > window = nsidc[rows[0]:rows[1], cols[0]:cols[1]] > > > > npoints = numpy.where(window != NSIDC_COASTLINE_MIXED, True, > False).sum() > > nsnowice = numpy.where( (window >= NSIDC_SEAICE_LOW) & (window <= > NSIDC_FRESHSNOW), \ > > True, False).sum() > > > > if (100.0*nsnowice/npoints >= count): > > nsidc_copy[irow0, icol0] = MISR_SEAICE_THRESHOLD > > > > return nsidc_copy > > > > and here is my attempt at Cython-izing it: > > > > import numpy > > cimport numpy as cnumpy > > cimport cython > > > > cdef int NSIDC_SIZE = 721 > > cdef int NSIDC_NO_SNOW = 0 > > cdef int NSIDC_ALL_SNOW = 100 > > cdef int NSIDC_FRESHSNOW = 103 > > cdef int NSIDC_PERMSNOW = 101 > > cdef int NSIDC_SEAICE_LOW = 1 > > cdef int NSIDC_SEAICE_HIGH = 100 > > cdef int NSIDC_COASTLINE_MIXED = 252 > > cdef int NSIDC_SUSPECT_ICE = 253 > > > > cdef int MISR_SEAICE_THRESHOLD = 6 > > > > def mixed_coastline(cnumpy.ndarray[cnumpy.uint8_t, ndim=2] nsidc, int > radius, int count): > > > > cdef int irow, icol, irow1, irow2, icol1, icol2, npoints, nsnowice > > cdef cnumpy.ndarray[cnumpy.uint8_t, ndim=2] nsidc2 \ > > = numpy.empty(shape=(NSIDC_SIZE, NSIDC_SIZE), dtype=numpy.uint8) > > cdef cnumpy.ndarray[cnumpy.uint8_t, ndim=2] window \ > > = numpy.empty(shape=(2*radius+1, 2*radius+1), dtype=numpy.uint8) > > > > nsidc2 = numpy.copy(nsidc) > > > > idx_coastline = numpy.where(nsidc2 == NSIDC_COASTLINE_MIXED) > > > > for (irow, icol) in zip(idx_coastline[0], idx_coastline[1]): > > > > irow1 = max(irow-radius, 0) > > irow2 = min(irow+radius+1, NSIDC_SIZE) > > icol1 = max(icol-radius, 0) > > icol2 = min(icol+radius+1, NSIDC_SIZE) > > window = nsidc[irow1:irow2, icol1:icol2] > > > > npoints = numpy.where(window != NSIDC_COASTLINE_MIXED, True, > False).sum() > > nsnowice = numpy.where( (window >= NSIDC_SEAICE_LOW) & (window > <= NSIDC_FRESHSNOW), \ > > True, False).sum() > > > > if (100.0*nsnowice/npoints >= count): > > 
nsidc2[irow, icol] = MISR_SEAICE_THRESHOLD > > > > return nsidc2 > > > > Thanks in advance for any advice! > > > > Catherine > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Thu Mar 29 00:10:08 2018 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Thu, 29 Mar 2018 00:10:08 -0400 Subject: [Numpy-discussion] best way of speeding up a filtering-like algorithm In-Reply-To: References: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov> Message-ID: It looks like you are creating a coastline mask (or a coastline mask + some other mask), and computing the ratio of two quantities in a particular window around each point. If your coastline covers a sufficiently large portion of the image, you may get quite a bit of mileage using an efficient convolution instead of summing the windows directly. For example, you could use scipy.signal.convolve2d with inputs being (nsidc_copy != NSIDC_COASTLINE_MIXED), (nsidc_copy == NSIDC_SEAICE_LOW & nsdic_copy == NSIDC_FRESHSNOW) for the frst array, and a (2*radius x 2*radius) array of ones for the second. You may have to center the block of ones in an array of zeros the same size as nsdic_copy, but I am not sure about that. Another option you may want to try is implementing your window movement more efficiently. If you step your window center along using an algorithm like flood-fill, you can insure that there will be very large overlap between successive steps (even if there is a break in the coastline). That means that you can reuse most of the data you've extracted. You will only need to subtract off the non-overlapping portion of the previous window and add in the non-overlapping portion of the updated window. If radius is 16, giving you a 32x32 window, you go from summing ~1000 pixels per quantity of interest, to summing only ~120 if the window moves along a diagonal, and only 64 if it moves vertically or horizontally. While an algorithm like this will probably give you the greatest boost, it is a pain to implement. If I had to guess, this looks like L2 processing for a multi-spectral instrument. If you don't mind me asking, what mission is this for? I'm working on space-looking detectors at the moment, but have spent many years on the L0, L1b and L1 portions of the GOES-R ground system. - Joe On Wed, Mar 28, 2018 at 9:43 PM, Eric Wieser wrote: > Well, one tip to start with: > > numpy.where(some_comparison, True, False) > > is the same as but slower than > > some_comparison > > Eric > > On Wed, 28 Mar 2018 at 18:36 Moroney, Catherine M (398E) > wrote: >> >> Hello, >> >> >> >> I have the following sample code (pretty simple algorithm that uses a >> rolling filter window) and am wondering what the best way is of speeding it >> up. I tried rewriting it in Cython by pre-declaring the variables but that >> didn?t buy me a lot of time. Then I rewrote it in Fortran (and compiled it >> with f2py) and now it?s lightning fast. But I would still like to know if I >> could rewrite it in pure python/numpy/scipy or in Cython and get a similar >> speedup. 
>> >> >> >> Here is the raw Python code: >> >> >> >> def mixed_coastline_slow(nsidc, radius, count, mask=None): >> >> >> >> nsidc_copy = numpy.copy(nsidc) >> >> >> >> if (mask is None): >> >> idx_coastline = numpy.where(nsidc_copy == NSIDC_COASTLINE_MIXED) >> >> else: >> >> idx_coastline = numpy.where(mask & (nsidc_copy == >> NSIDC_COASTLINE_MIXED)) >> >> >> >> for (irow0, icol0) in zip(idx_coastline[0], idx_coastline[1]): >> >> >> >> rows = ( max(irow0-radius, 0), min(irow0+radius+1, >> nsidc_copy.shape[0]) ) >> >> cols = ( max(icol0-radius, 0), min(icol0+radius+1, >> nsidc_copy.shape[1]) ) >> >> window = nsidc[rows[0]:rows[1], cols[0]:cols[1]] >> >> >> >> npoints = numpy.where(window != NSIDC_COASTLINE_MIXED, True, >> False).sum() >> >> nsnowice = numpy.where( (window >= NSIDC_SEAICE_LOW) & (window <= >> NSIDC_FRESHSNOW), \ >> >> True, False).sum() >> >> >> >> if (100.0*nsnowice/npoints >= count): >> >> nsidc_copy[irow0, icol0] = MISR_SEAICE_THRESHOLD >> >> >> >> return nsidc_copy >> >> >> >> and here is my attempt at Cython-izing it: >> >> >> >> import numpy >> >> cimport numpy as cnumpy >> >> cimport cython >> >> >> >> cdef int NSIDC_SIZE = 721 >> >> cdef int NSIDC_NO_SNOW = 0 >> >> cdef int NSIDC_ALL_SNOW = 100 >> >> cdef int NSIDC_FRESHSNOW = 103 >> >> cdef int NSIDC_PERMSNOW = 101 >> >> cdef int NSIDC_SEAICE_LOW = 1 >> >> cdef int NSIDC_SEAICE_HIGH = 100 >> >> cdef int NSIDC_COASTLINE_MIXED = 252 >> >> cdef int NSIDC_SUSPECT_ICE = 253 >> >> >> >> cdef int MISR_SEAICE_THRESHOLD = 6 >> >> >> >> def mixed_coastline(cnumpy.ndarray[cnumpy.uint8_t, ndim=2] nsidc, int >> radius, int count): >> >> >> >> cdef int irow, icol, irow1, irow2, icol1, icol2, npoints, nsnowice >> >> cdef cnumpy.ndarray[cnumpy.uint8_t, ndim=2] nsidc2 \ >> >> = numpy.empty(shape=(NSIDC_SIZE, NSIDC_SIZE), dtype=numpy.uint8) >> >> cdef cnumpy.ndarray[cnumpy.uint8_t, ndim=2] window \ >> >> = numpy.empty(shape=(2*radius+1, 2*radius+1), dtype=numpy.uint8) >> >> >> >> nsidc2 = numpy.copy(nsidc) >> >> >> >> idx_coastline = numpy.where(nsidc2 == NSIDC_COASTLINE_MIXED) >> >> >> >> for (irow, icol) in zip(idx_coastline[0], idx_coastline[1]): >> >> >> >> irow1 = max(irow-radius, 0) >> >> irow2 = min(irow+radius+1, NSIDC_SIZE) >> >> icol1 = max(icol-radius, 0) >> >> icol2 = min(icol+radius+1, NSIDC_SIZE) >> >> window = nsidc[irow1:irow2, icol1:icol2] >> >> >> >> npoints = numpy.where(window != NSIDC_COASTLINE_MIXED, True, >> False).sum() >> >> nsnowice = numpy.where( (window >= NSIDC_SEAICE_LOW) & (window >> <= NSIDC_FRESHSNOW), \ >> >> True, False).sum() >> >> >> >> if (100.0*nsnowice/npoints >= count): >> >> nsidc2[irow, icol] = MISR_SEAICE_THRESHOLD >> >> >> >> return nsidc2 >> >> >> >> Thanks in advance for any advice! >> >> >> >> Catherine >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From jfoxrabinovitz at gmail.com Thu Mar 29 02:31:22 2018 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Thu, 29 Mar 2018 02:31:22 -0400 Subject: [Numpy-discussion] PR adding support for object arrays to np.isinf, np.isnan, np.isfinite Message-ID: I have opened PR #10820 to add support for `dtype=object` to `np.isinf`, `np.isnan`, `np.isfinite`. 
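For illustration, a small example of the intended behaviour (expected
output under the PR as described below -- on current releases these ufuncs
raise TypeError for object arrays like this, and I have not pasted actual
output from the branch):

import numpy as np
from decimal import Decimal
from fractions import Fraction

# An object array mixing plain floats with "simulated numerical" types
# that implement __float__:
a = np.array([1.0, float("nan"), Decimal("2.5"), Fraction(1, 3)],
             dtype=object)

print(np.isnan(a))     # expected: [False  True False False]
print(np.isfinite(a))  # expected: [ True False  True  True]
print(np.isinf(a))     # expected: [False False False False]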
The PR is a fairly minor change, but I would like to make sure that I understand at least the basics of ufuncs before I start adding support for datetimes and timedeltas to `np.isfinite` and eventually to `np.histogram`. I have left a few comments in areas I am not sure about, and would greatly appreciate feedback, even if the PR is not found suitable for merging. With this PR, object arrays containing any numerical or simulated numerical types (implementing `__float__` or `__complex__` methods) are processed as would be expected. While working on PR, I came up with two questions for the gurus: 1. Am I correct in understanding that `isinf`, `isnan` and `isfinite` currently cast integer inputs to float to process them? Why are integer inputs not optimized to return arrays of all False, False, True, respectively for those functions? 2. Why are `isneginf` and `isposinf` not ufuncs? Is there any reason not to make them ufuncs (besides the renaming of the `y` parameter to `out`, which technically breaks some backward compatibility)? Regards, - Joe From stuart at stuartreynolds.net Thu Mar 29 11:14:16 2018 From: stuart at stuartreynolds.net (Stuart Reynolds) Date: Thu, 29 Mar 2018 15:14:16 +0000 Subject: [Numpy-discussion] best way of speeding up a filtering-like algorithm In-Reply-To: References: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov> Message-ID: Install snakeviz to visualize what?s taking all the time. You might want to check out numba.jit(nopython) for optimizing specific sections. On Wed, Mar 28, 2018 at 9:10 PM Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > It looks like you are creating a coastline mask (or a coastline mask + > some other mask), and computing the ratio of two quantities in a > particular window around each point. If your coastline covers a > sufficiently large portion of the image, you may get quite a bit of > mileage using an efficient convolution instead of summing the windows > directly. For example, you could use scipy.signal.convolve2d with > inputs being (nsidc_copy != NSIDC_COASTLINE_MIXED), (nsidc_copy == > NSIDC_SEAICE_LOW & nsdic_copy == NSIDC_FRESHSNOW) for the frst array, > and a (2*radius x 2*radius) array of ones for the second. You may have > to center the block of ones in an array of zeros the same size as > nsdic_copy, but I am not sure about that. > > Another option you may want to try is implementing your window > movement more efficiently. If you step your window center along using > an algorithm like flood-fill, you can insure that there will be very > large overlap between successive steps (even if there is a break in > the coastline). That means that you can reuse most of the data you've > extracted. You will only need to subtract off the non-overlapping > portion of the previous window and add in the non-overlapping portion > of the updated window. If radius is 16, giving you a 32x32 window, you > go from summing ~1000 pixels per quantity of interest, to summing only > ~120 if the window moves along a diagonal, and only 64 if it moves > vertically or horizontally. While an algorithm like this will probably > give you the greatest boost, it is a pain to implement. > > If I had to guess, this looks like L2 processing for a multi-spectral > instrument. If you don't mind me asking, what mission is this for? I'm > working on space-looking detectors at the moment, but have spent many > years on the L0, L1b and L1 portions of the GOES-R ground system. 
> - Joe

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov  Thu Mar 29 13:23:36 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 29 Mar 2018 10:23:36 -0700
Subject: [Numpy-discussion] best way of speeding up a filtering-like algorithm
In-Reply-To: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov>
References: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov>
Message-ID: 

sorry, not enough time to look closely, but a couple general comments:

On Wed, Mar 28, 2018 at 5:56 PM, Moroney, Catherine M (398E) <
Catherine.M.Moroney at jpl.nasa.gov> wrote:

> I have the following sample code (pretty simple algorithm that uses a
> rolling filter window) and am wondering what the best way is of speeding
> it up.  I tried rewriting it in Cython by pre-declaring the variables but
> that didn't buy me a lot of time.  Then I rewrote it in Fortran (and
> compiled it with f2py) and now it's lightning fast.

if done right, Cython should be almost as fast as Fortran, and just as fast
if you use the "restrict" correctly (which I hope can be done in Cython):

https://en.wikipedia.org/wiki/Pointer_aliasing

> But I would still like to know if I could rewrite it in pure
> python/numpy/scipy

you can use stride_tricks to make arrays "appear" to be N+1 D, to implement
windows without actually duplicating the data, and then use array
operations on them. This can buy a lot of speed, but will not be as fast
(by a factor of 10 or so) as Cython or Fortran

see:

https://github.com/PythonCHB/IRIS_Python_Class/blob/master/Numpy/code/filter_example.py

for an example in 1D

> or in Cython and get a similar speedup.

see above -- a direct port of your Fortran code to Cython should get you
within a factor of two or so of the Fortran, and then using "restrict" to
let the compiler know your pointers aren't aliased should get you the rest
of the way.

Here is an example of an Automatic Gain Control filter in 1D, implemented
in numpy with stride_tricks, and C and Cython and Fortran.

https://github.com/PythonCHB/IRIS_Python_Class/tree/master/Interfacing_C/agc_example

Note that in that example, I never got C or Cython as fast as Fortran --
but I think using "restrict" in the C would do it.
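(to make the stride_tricks idea above concrete, here is a rough 2-D
sketch -- untested, and the constants, array size and radius below are
just stand-ins based on the original post:)

import numpy as np
from numpy.lib.stride_tricks import as_strided

NSIDC_SEAICE_LOW, NSIDC_FRESHSNOW = 1, 103   # values from the original post
radius = 4
rng = np.random.RandomState(0)
nsidc = rng.randint(0, 254, size=(200, 200)).astype(np.uint8)  # stand-in data

def windowed_view(a, size):
    # Non-copying view of all overlapping (size x size) windows of a 2-D
    # array: only the shape and strides of the view change.
    rows, cols = a.shape
    r, c = a.strides
    return as_strided(a, shape=(rows - size + 1, cols - size + 1, size, size),
                      strides=(r, c, r, c))

# e.g. count the snow/ice pixels in every window in one vectorized step:
snowice = (nsidc >= NSIDC_SEAICE_LOW) & (nsidc <= NSIDC_FRESHSNOW)
nsnowice = windowed_view(snowice, 2 * radius + 1).sum(axis=(2, 3))
print(nsnowice.shape)  # (192, 192): one count per fully interior window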
HTH,

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov  Thu Mar 29 13:26:02 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 29 Mar 2018 10:26:02 -0700
Subject: [Numpy-discussion] best way of speeding up a filtering-like algorithm
In-Reply-To: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov>
References: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov>
Message-ID: 

one other note:

As a rule, using numpy array operations from Cython doesn't buy you much,
as you discovered.
You need to use numpy arrays as n-d containers, and write the loops yourself. You may want to check out numba as another alternative -- it DOES optimize numpy operations.
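Something along these lines is the kind of loop numba (or Cython with typed variables) can make fast. This is an untested sketch, with the constant values taken from your posted code, so double-check them:

from numba import njit

@njit
def mixed_coastline_numba(nsidc, radius, count):
    # Plain nested loops over the ndarray "container"; numba compiles them.
    nsidc2 = nsidc.copy()
    nrows, ncols = nsidc.shape
    for irow in range(nrows):
        for icol in range(ncols):
            if nsidc[irow, icol] != 252:          # NSIDC_COASTLINE_MIXED
                continue
            r1 = max(irow - radius, 0)
            r2 = min(irow + radius + 1, nrows)
            c1 = max(icol - radius, 0)
            c2 = min(icol + radius + 1, ncols)
            npoints = 0
            nsnowice = 0
            for r in range(r1, r2):
                for c in range(c1, c2):
                    v = nsidc[r, c]
                    if v != 252:                  # not mixed coastline
                        npoints += 1
                    if v >= 1 and v <= 103:       # NSIDC_SEAICE_LOW .. NSIDC_FRESHSNOW
                        nsnowice += 1
            # Guard added so an all-coastline window doesn't divide by zero.
            if npoints > 0 and 100.0 * nsnowice / npoints >= count:
                nsidc2[irow, icol] = 6            # MISR_SEAICE_THRESHOLD
    return nsidc2

The first call pays the compilation cost; after that the loops run at roughly C speed.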
-CHB

On Wed, Mar 28, 2018 at 5:56 PM, Moroney, Catherine M (398E) <Catherine.M.Moroney at jpl.nasa.gov> wrote:

> [...]

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jaime.frio at gmail.com  Thu Mar 29 17:34:36 2018
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Thu, 29 Mar 2018 21:34:36 +0000
Subject: [Numpy-discussion] best way of speeding up a filtering-like algorithm
In-Reply-To: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov>
References: <43BD456C-B5E0-4203-B069-13A49A53E5F6@jpl.nasa.gov>
Message-ID:

Hi Catherine,

One problem with sliding window algorithms is that the straightforward approach can be very inefficient. Ideally you would not recompute your windowed quantity from all points in the window, but instead reuse the result from an overlapping window, only taking into account the points that changed as the window slid. In your case this can be done efficiently using a summed area table.

Consider these two auxiliary functions:

def summed_area_table(array):
    rows, cols = array.shape
    out = np.zeros((rows + 1, cols + 1), np.intp)
    np.cumsum(array, axis=0, out=out[1:, 1:])
    np.cumsum(out[1:, 1:], axis=1, out=out[1:, 1:])
    return out

def windowed_sum_from_summed_area_table(array, size):
    sat = summed_area_table(array)
    return (sat[:-size, :-size] + sat[size:, size:] -
            sat[:-size, size:] - sat[size:, :-size])

Using these, you can compute npoints and nsnowice for all points in your input nsidc array as:

mask_coastline = nsidc == NSIDC_COASTLINE_MIXED
mask_not_coastline = ~mask_coastline
mask_snowice = (nsidc >= NSIDC_SEAICE_LOW) & (nsidc <= NSIDC_FRESHSNOW)
nsnowice = windowed_sum_from_summed_area_table(mask_snowice, 2*radius + 1)
npoints = windowed_sum_from_summed_area_table(mask_not_coastline, 2*radius + 1)

From here it should be more or less straightforward to reproduce the rest of your calculations.

As written, this code only handles points a distance of at least radius from an array edge. If the edges are important to you, they can also be extracted from the summed area table, but the expressions get ugly: it may be cleaner, even if slower, to pad the masks with zeros before summing them up. Also, if the fraction of points that are in mask_coastline is very small, you may be doing way too many unnecessary calculations.
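A quick way to convince yourself the windowed sums are right is to compare them against a direct computation on a small made-up mask (illustrative sizes only):

import numpy as np

rng = np.random.RandomState(0)
mask = rng.rand(8, 8) > 0.5
size = 3
fast = windowed_sum_from_summed_area_table(mask, size)
slow = np.array([[mask[i:i + size, j:j + size].sum()
                  for j in range(mask.shape[1] - size + 1)]
                 for i in range(mask.shape[0] - size + 1)])
assert (fast == slow).all()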
Good luck!

Jaime

On Thu, Mar 29, 2018 at 3:36 AM Moroney, Catherine M (398E) <Catherine.M.Moroney at jpl.nasa.gov> wrote:

> [...]

--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his plans for world domination.

From fernanvieira at gmail.com  Fri Mar 30 09:45:34 2018
From: fernanvieira at gmail.com (Fernando Fernandes Vieira)
Date: Fri, 30 Mar 2018 10:45:34 -0300
Subject: [Numpy-discussion] NEUROLAB - SPYDER
Message-ID:

Hello everyone,

How do I install neurolab in Spyder? Can someone help me?

Att..
_______________________________________________________
FERNANDO FERNANDES VIEIRA
Departamento de Engenharia Sanitária e Ambiental - DESA
Centro de Ciências e Tecnologia - CCT
Universidade Estadual da Paraíba - UEPB
Tel: (83) 3315-3333 (DESA) - (83) 98852-1461 (Pessoal)
e-mail: fernando at uepb.edu.br (fernanvieira at gmail.com)
Campina Grande - PB - Brasil
_______________________________________________________

From solarjoe at posteo.org  Fri Mar 30 11:56:00 2018
From: solarjoe at posteo.org (Joe)
Date: Fri, 30 Mar 2018 17:56:00 +0200
Subject: [Numpy-discussion] NEUROLAB - SPYDER
In-Reply-To:
References:
Message-ID: <068ca20e-baf4-4a2f-e013-fade965f0415@posteo.org>

Hi,

Download here:
https://pypi.python.org/pypi/neurolab

Though, I can't recommend using it. I used it a while ago and it is a pretty basic project that seems to be no longer maintained.

I use Keras / Theano now instead, which is a mature and widely used package.
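To get it into Spyder specifically, it should be enough to install the package into whatever Python environment Spyder is running in -- for example by running "pip install neurolab" from that environment's terminal or the Anaconda Prompt -- and then restart Spyder. (That assumes Spyder and pip are using the same Python environment.)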
Kind regards,
Joe

On 30.03.2018 at 15:45, Fernando Fernandes Vieira wrote:

> [...]

From fernanvieira at gmail.com  Fri Mar 30 15:42:14 2018
From: fernanvieira at gmail.com (Fernando Fernandes Vieira)
Date: Fri, 30 Mar 2018 16:42:14 -0300
Subject: [Numpy-discussion] NEUROLAB - SPYDER
In-Reply-To: <068ca20e-baf4-4a2f-e013-fade965f0415@posteo.org>
References: <068ca20e-baf4-4a2f-e013-fade965f0415@posteo.org>
Message-ID:

Hi Joe,

Thanks for your help.

Att..

_______________________________________________________
FERNANDO FERNANDES VIEIRA
Departamento de Engenharia Sanitária e Ambiental - DESA
Centro de Ciências e Tecnologia - CCT
Universidade Estadual da Paraíba - UEPB
Tel: (83) 3315-3333 (DESA) - (83) 98852-1461 (Pessoal)
e-mail: fernando at uepb.edu.br (fernanvieira at gmail.com)
Campina Grande - PB - Brasil
_______________________________________________________

2018-03-30 12:56 GMT-03:00 Joe:

> [...]

From ralf.gommers at gmail.com  Fri Mar 30 20:03:08 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 30 Mar 2018 17:03:08 -0700
Subject: [Numpy-discussion] ANN: numpydoc 0.8.0 release
Message-ID:

Hi all,

I'm pleased to announce that a new release of numpydoc is available:

- package: https://pypi.python.org/pypi/numpydoc
- documentation (new in this release): https://numpydoc.readthedocs.io/en/latest/

This is a maintenance release with many small improvements. Your documentation will likely render with fewer warnings; for NumPy, for example, it removed ~300 irrelevant warnings while improving the rendered results.

Thanks to everyone who contributed to this release!

Cheers,
Ralf