[Numpy-discussion] Add sliding_window_view method to numpy
Zimmermann Klaus
klaus.zimmermann at smhi.se
Fri Nov 27 10:10:24 EST 2020
Hi Ralf,
On 26/11/2020 15:17, Ralf Gommers wrote:
> On Fri, Nov 6, 2020 at 4:03 PM Zimmermann Klaus
> <klaus.zimmermann at smhi.se <mailto:klaus.zimmermann at smhi.se>> wrote:
> I was just wondering if, of the top your head, an existing, better fit
> comes to mind?
>
>
> Not really. Outside of stride_tricks there's nothing that quite fits.
> This function is more in scope for something like scipy.signal.
Alright, let's keep it as is then.
Thanks and cheers
Klaus
>
> Cheers,
> Ralf
>
>
>
> > The reason
> > from my point of view is that stride tricks is really a
> technical (and
> > slightly ominous) name that might throw of more application
> oriented
> > programmers from finding and using this function. Thinking of my
> > scientist colleagues, I think those are exactly the kind of
> users that
> > could benefit from such a prototyping tool.
> >
> >
> > That phrasing is one of a number of concerns. NumPy is normally not in
> > the business of providing things that are okay as a prototyping tool,
> > but are potentially extremely slow (as pointed out in the Notes
> section
> > of the docstring). A function like that would basically not be the
> right
> > tool for almost anything in, e.g., SciPy - it requires an iterative
> > algorithm. In NumPy we don't prefer performance at all costs, but in
> > general it's pretty decent rather than "Numba or Cython may gain you
> > 100x here".
>
> I still think that the performance concern is a bit overblown. Yes,
> application with large windows can need more FLOPs by an equally large
> factor. But most such applications will use small to moderate windows.
> Furthermore, this view focuses only on FLOPs. In my current field of
> climate science (and many others), that is almost never the limiting
> factor. Memory demands are far more problematic and incidentally, those
> are more likely to increase in other methods that require the storage of
> ancillary, temporary data.
>
> > Other issues include:
> > 2) It is very specific to NumPy's memory model (as pointed out by you
> > and Sebastian) - just like the rest of stride_tricks
> Not wrong, but on the other hand, that memory model is not exotic. C,
> Fortran, and any number of other languages play very nicely with this,
> just as important downstream libraries like dask.
>
> > 3) It has "view" in the name, which doesn't quite make sense for the
> > main namespace (also connected to point 2 above).
> Ok.
>
> > 4) The cost of putting something in the main namespace for other
> > array/tensor libraries is large. Maybe other libraries, e.g. CuPy,
> Dask,
> > TensorFlow, PyTorch, JAX, MXNet, aim to reimplement part or all of the
> > main NumPy namespace as well as possible. This would trigger
> discussions
> > and likely many person-weeks of work for others.
> Agreed. Though I have to say that my whole motivation comes from
> corresponding issues in dask that where specifically waiting for (the
> older version of) this PR (see [1, 2,...]). But I understand that dask
> is effectively much closer to the numpy memory model than, say, CuPy, so
> don't take this to mean it should be in the main namespace.
>
> > 5) It's a useful function, but it's very much on the margins of
> NumPy's
> > scope. It could easily have gone into, for example, scipy.signal. At
> > this point the bar for functions going into the main namespace
> should be> (and is) high.
> I agree that the bar for the main namespace should be high!
>
> > All this taken together means it's not even a toss-up for me. If
> it were
> > just one or two of these points, maybe. But given all the above, I'm
> > pretty confident saying "it does not belong in the main namespace".
> Again, I am happy with that.
>
>
> Thanks for your thoughts and work! I really appreciate it!
>
> Cheers
> Klaus
>
> [1] https://github.com/dask/dask/issues/4659
> <https://github.com/dask/dask/issues/4659>
> [2] https://github.com/pydata/xarray/issues/3608
> <https://github.com/pydata/xarray/issues/3608>
> [3] https://github.com/pandas-dev/pandas/issues/26959
> <https://github.com/pandas-dev/pandas/issues/26959>
>
>
> >
> >
> > Cheers
> > Klaus
> >
> >
> >
> > [1]
> https://github.com/numpy/numpy/pull/17394#issuecomment-700998618
> <https://github.com/numpy/numpy/pull/17394#issuecomment-700998618>
> >
> <https://github.com/numpy/numpy/pull/17394#issuecomment-700998618
> <https://github.com/numpy/numpy/pull/17394#issuecomment-700998618>>
> > [2]
> https://github.com/numpy/numpy/pull/17394#discussion_r498215468
> <https://github.com/numpy/numpy/pull/17394#discussion_r498215468>
> >
> <https://github.com/numpy/numpy/pull/17394#discussion_r498215468
> <https://github.com/numpy/numpy/pull/17394#discussion_r498215468>>
> > [3]
> https://github.com/numpy/numpy/pull/17394#discussion_r498724340
> <https://github.com/numpy/numpy/pull/17394#discussion_r498724340>
> >
> <https://github.com/numpy/numpy/pull/17394#discussion_r498724340
> <https://github.com/numpy/numpy/pull/17394#discussion_r498724340>>
> >
> > On 06/11/2020 01:39, Sebastian Berg wrote:
> > > On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote:
> > >> On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote:
> > >>> On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers <
> > >>> ralf.gommers at gmail.com <mailto:ralf.gommers at gmail.com>
> <mailto:ralf.gommers at gmail.com <mailto:ralf.gommers at gmail.com>>>
> > >>> wrote:
> > >>>
> > >>>> On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg <
> > >>>> sebastian at sipsolutions.net
> <mailto:sebastian at sipsolutions.net>
> <mailto:sebastian at sipsolutions.net <mailto:sebastian at sipsolutions.net>>>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi all,
> > >>>>>
> > >>>>> just a brief note that I merged this proposal:
> > >>>>>
> > >>>>> https://github.com/numpy/numpy/pull/17394
> <https://github.com/numpy/numpy/pull/17394>
> > <https://github.com/numpy/numpy/pull/17394
> <https://github.com/numpy/numpy/pull/17394>>
> > >>>>>
> > >>>>> adding `np.sliding_window_view` into the 1.20 release of
> NumPy.
> > >>>>>
> > >>>>> There was only one public API change, and that is that the
> > >>>>> `shape`
> > >>>>> argument is now called `window_shape`.
> > >>>>>
> > >>>>> This is still a good time for feedback in case you have a
> > >>>>> better
> > >>>>> idea
> > >>>>> e.g. for the function or parameter names.
> > >>>>>
> > >>>>
> > >>>> The old PR had this in the lib.stride_tricks namespace.
> Seeing it
> > >>>> in the
> > >>>> main namespace is unexpected and likely will lead to
> > >>>> issues/questions,
> > >>>> given that such an overlapping view is going to do behave
> in ways
> > >>>> the
> > >>>> average user will be surprised by. It may also lead to
> requests
> > >>>> for
> > >>>> other
> > >>>> array/tensor libraries to implement this. I don't see any
> > >>>> discussion on
> > >>>> this in PR 17394, it looks like a decision by the PR
> author that
> > >>>> no
> > >>>> one
> > >>>> commented on - reconsider that?
> > >>>>
> > >>>> Cheers,
> > >>>> Ralf
> > >>>>
> > >>>
> > >>> +1 let's keep this in the lib.stride_tricks namespace.
> > >>>
> > >>
> > >> I have no reservations against having it in the main
> namespace and am
> > >> happy either way (it can still be exposed later in any
> case). It is
> > >> the
> > >> conservative choice and maybe it is an uncommon enough
> function that
> > >> it
> > >> deserves being a bit hidden...
> > >
> > >
> > > In any case, its the safe bet for NumPy 1.20 at least so I
> opened
> > a PR:
> > >
> > > https://github.com/numpy/numpy/pull/17720
> <https://github.com/numpy/numpy/pull/17720>
> > <https://github.com/numpy/numpy/pull/17720
> <https://github.com/numpy/numpy/pull/17720>>
> > >
> > > Name changes, etc. are also possible of course.
> > >
> > > I still think it might be nice to find a better place for
> this type of
> > > function that `np.lib.stride_tricks` though, but dunno...
> > >
> > > - Sebastian
> > >
> > >
> > >
> > >>
> > >> But I am curious, it sounds like you have both very strong
> > >> reservations, and I would like to understand them better.
> > >>
> > >> The behaviour can be surprising, but that is why the
> default is a
> > >> read-
> > >> only view. I do not think it is worse than
> `np.broadcast_to` in this
> > >> regard. (It is nowhere near as dangerous as `as_strided`.)
> > >>
> > >> It is true that it is specific to NumPy (memory model). So
> that is
> > >> maybe a good enough reason right now. But I am not sure that
> > >> stuffing
> > >> things into a pretty hidden `np.lib.*` namespaces is a
> great long
> > >> term
> > >> solution either. There is very little useful functionality
> hidden
> > >> away
> > >> in `np.lib.*` currently.
> > >>
> > >> Cheers,
> > >>
> > >> Sebastian
> > >>
> > >>>>
> > >>>>
> > >>>>> Cheers,
> > >>>>>
> > >>>>> Sebastian
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote:
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> I would like to draw the attention of this list to PR
> #17394
> > >>>>>> [1] that
> > >>>>>> adds the implementation of a sliding window view to numpy.
> > >>>>>>
> > >>>>>> Having a sliding window view in numpy is a longstanding
> open
> > >>>>>> issue
> > >>>>>> (cf
> > >>>>>> #7753 [2] from 2016). A brief summary of the discussions
> > >>>>>> surrounding
> > >>>>>> it
> > >>>>>> can be found in the description of the PR.
> > >>>>>>
> > >>>>>> This PR implements a sliding window view based on stride
> > >>>>>> tricks.
> > >>>>>> Following the discussion in issue #7753, a first
> > >>>>>> implementation
> > >>>>>> was
> > >>>>>> provided by Fanjin Zeng in PR #10771. After some
> discussion,
> > >>>>>> that PR
> > >>>>>> stalled and I picked up the issue in the present PR #17394.
> > >>>>>> It
> > >>>>>> is
> > >>>>>> based
> > >>>>>> on the first implementation, but follows the changed API as
> > >>>>>> suggested
> > >>>>>> by
> > >>>>>> Eric Wieser.
> > >>>>>>
> > >>>>>> Code reviews have been provided by Bas van Beek, Stephen
> > >>>>>> Hoyer,
> > >>>>>> and
> > >>>>>> Eric
> > >>>>>> Wieser. Sebastian Berg added the "62 - Python API" label.
> > >>>>>>
> > >>>>>>
> > >>>>>> Do you think this is suitable for inclusion in numpy?
> > >>>>>>
> > >>>>>> Do you consider the PR ready?
> > >>>>>>
> > >>>>>> Do you have suggestions or requests?
> > >>>>>>
> > >>>>>>
> > >>>>>> Thanks for your time and consideration!
> > >>>>>> Klaus
> > >>>>>>
> > >>>>>>
> > >>>>>> [1] https://github.com/numpy/numpy/pull/17394
> <https://github.com/numpy/numpy/pull/17394>
> > <https://github.com/numpy/numpy/pull/17394
> <https://github.com/numpy/numpy/pull/17394>>
> > >>>>>> [2] https://github.com/numpy/numpy/issues/7753
> <https://github.com/numpy/numpy/issues/7753>
> > <https://github.com/numpy/numpy/issues/7753
> <https://github.com/numpy/numpy/issues/7753>>
> > >>>>>> _______________________________________________
> > >>>>>> NumPy-Discussion mailing list
> > >>>>>> NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>
> <mailto:NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>>
> > >>>>>>
> https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
> > <https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>>
> > >>>>>>
> > >>>>>
> > >>>>> _______________________________________________
> > >>>>> NumPy-Discussion mailing list
> > >>>>> NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>
> <mailto:NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>>
> > >>>>>
> https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
> > <https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>>
> > >>>>>
> > >>>> _______________________________________________
> > >>>> NumPy-Discussion mailing list
> > >>>> NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>
> <mailto:NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>>
> > >>>> https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
> > <https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>>
> > >>>>
> > >>>
> > >>> _______________________________________________
> > >>> NumPy-Discussion mailing list
> > >>> NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>
> <mailto:NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>>
> > >>> https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
> > <https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>>
> > >>
> > >> _______________________________________________
> > >> NumPy-Discussion mailing list
> > >> NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>
> <mailto:NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>>
> > >> https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
> > <https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>>
> > >
> > >
> > > _______________________________________________
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>
> <mailto:NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>>
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
> > <https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>>
> > >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>
> <mailto:NumPy-Discussion at python.org
> <mailto:NumPy-Discussion at python.org>>
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
> > <https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>>
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org <mailto:NumPy-Discussion at python.org>
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org <mailto:NumPy-Discussion at python.org>
> https://mail.python.org/mailman/listinfo/numpy-discussion
> <https://mail.python.org/mailman/listinfo/numpy-discussion>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
More information about the NumPy-Discussion
mailing list