[Numpy-discussion] Views of a different dtype

Jaime Fernández del Río jaime.frio at gmail.com
Tue Feb 3 10:18:19 EST 2015


On Tue, Feb 3, 2015 at 1:28 AM, Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> On Mo, 2015-02-02 at 06:25 -0800, Jaime Fernández del Río wrote:
> > On Sat, Jan 31, 2015 at 1:17 AM, Sebastian Berg
> > <sebastian at sipsolutions.net> wrote:
> >         On Fr, 2015-01-30 at 19:52 -0800, Jaime Fernández del Río
> >         wrote:
> >         > On Thu, Jan 29, 2015 at 8:57 AM, Nathaniel Smith
> >         <njs at pobox.com>
> >         > wrote:
> >         >         On Thu, Jan 29, 2015 at 12:56 AM, Jaime Fernández
> >         del Río
> >         >         <jaime.frio at gmail.com> wrote:
> >         >         [...]
> >
> >         <snip>
> >
> >         >
> >         >         Could we make it more like: check to see if the last
> >         dimension
> >         >         works.
> >         >         If not, raise an error (and let the user transpose
> >         some other
> >         >         dimension there if that's what they wanted)? Or
> >         require the
> >         >         user to
> >         >         specify which dimension will absorb the shape
> >         change? (If we
> >         >         were
> >         >         doing this from scratch, then it would be tempting
> >         to just say
> >         >         that we
> >         >         always add a new dimension at the end with
> >         newtype.itemsize /
> >         >         oldtype.itemsize entries, or absorb such a dimension
> >         if
> >         >         shrinking. As
> >         >         a bonus, this would always work, regardless of
> >         contiguity!
> >         >         Except that
> >         >         when shrinking the last dimension would have to be
> >         contiguous,
> >         >         of
> >         >         course.)
> >         >
> >         >
> >         > When we roll @ in and people start working with stacks of
> >         matrices, we
> >         > will probably find ourselves having to create an alias,
> >         similar to .T,
> >         > for .swapaxes(-1, -2). Searching for the smallest stride
> >         allows us to
> >         > take views of such arrays, which does not work right now
> >         because the
> >         > array is no longer contiguous globally.
> >         >
> >
> >         That is true, but I agree with Nathaniel at least as far as
> >         that I would
> >         prefer a user to be able to safely use `view` even if he has not
> >         even an
> >         inkling about what his memory layout is. One option would be
> >         an
> >         `axis=-1` default (maybe FutureWarn this from `axis=None`
> >         which would
> >         look at order, see below -- or maybe have axis='A', 'C' and
> >         'F' and
> >         default to 'A' for starters).
> >
> >         This even now could start creating bugs when enabling relaxed
> >         strides :(, because your good old fortran order complex array
> >         being
> >         viewed as a float one could expand along the wrong axis, and
> >         even
> >         without such arrays swap order pretty fast when operating on
> >         them, which
> >         can create impossible-to-find bugs, because even a power user
> >         is likely
> >         to forget about such things.
> >
> >         Of course you could argue that view is a poweruser feature and
> >         a user
> >         using it should keep these things in mind.... Though if you
> >         argue that,
> >         you can almost just use `np.ndarray` directly ;) -- ok, not
> >         really
> >         considering how cumbersome it is, but still.
> >
> >
> > I have been giving this some thought, and am willing to concede that
> > my first proposal may have been too ambitious. So even though the knob
> > goes to 11, we can always do things incrementally. I am also wary of
> > adding new keywords when it seems obvious that we do not have the
> > functionality completely figured out, so here's my new proposal:
> >
> >
> >       * The objective is that a view of an array that is the result of
> >         slicing a contiguous array should be possible, if it remains
> >         "contiguous" (meaning stride == itemsize) along its original
> >         contiguous (first or last) dimension. This eliminates axis
> >         transposition from the previous proposal, although reversing
> >         the axes completely would also work.
> >       * To verify this, unless the C contiguous or Fortran contiguous
> >         flags are set, we would still need to look at the strides. An
> >         array would be C contiguous if its last stride is equal to the
> >         itemsize and, working backwards, every next stride is greater
> >         than or equal to the product of the previous stride and the
> >         previous dimension. Dimensions of size 1 would be ignored for
> >         this check, except for the last one, which would be taken to
> >         have stride = itemsize. The Fortran case is of course the same
> >         in reverse.
> >       * I think the above combined with the current preference of C
> >         contiguousness over Fortran, would actually allow the views to
> >         always be reversible, which is also a nice thing to have.
> > This eliminates most of the weirdness, but extends current
> > functionality to cover cases like Jens reported a few days back.
> >
> >
> > Does this sound better?
> >
>
> It seems fine as such, but I still worry about relaxed strides, though
> this is not really directly related to your efforts here. The problem I
> see is something like this (any numpy version):
>
> import numpy as np
>
> arr = np.array([[1, 2]], dtype=np.float64, order='C').T
> # note that arr is Fortran contiguous, with shape (2, 1)
> view = arr.view(np.complex128)
> not_arr = view.view(np.float64)
> np.array_equal(arr, not_arr)  # False!
>

Yes, dimensions of size one can be a pain...
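
A minimal sketch of why size-one dimensions are ambiguous, reusing the array
from the example above. The as_strided call is only an illustration added
here, not something proposed in this thread: it shows that the stride of a
length-1 axis can be changed without changing which elements the array sees,
so the strides alone cannot say which memory order was "meant".

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

arr = np.array([[1, 2]], dtype=np.float64).T
print(arr.shape)               # (2, 1)
print(arr.strides)             # (8, 16)
print(arr.flags.f_contiguous)  # True

# The stride of the length-1 axis is never used to step between elements,
# so rewriting it from 16 to 8 describes exactly the same data.
same = as_strided(arr, shape=(2, 1), strides=(8, 8))
print(np.array_equal(arr, same))  # True
```

Whether arr also reports itself as C contiguous depends on whether relaxed
strides checking is enabled in the NumPy build being used.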


>
> And with relaxed strides, the situation should become worse, because
> "Fortran order unless C order" logic is harder to predict, and here it
> makes an actual difference even for non-(1, 1) arrays. This creates the
> possibility of breaking currently working code.
>

Do you have a concrete example of what a non (1, 1) array that fails with
relaxed strides would look like?

If we used, as right now, the array flags as a first decision point, and only
if none is set tried to determine it from the strides/dimensions information,
I fail to imagine any situation where the end result would be worse than
now. I don't think that a little bit of predictable surprise in an
advanced feature is too bad. We could start raising "in the face of
ambiguity, we refuse to guess" errors, even for the current behavior you
show above, but that is more likely to trip people up by not giving them any
simple workaround, which it seems to me would be "add a .T if all dimensions
are 1" in some particular situations. Or are you thinking of something more
serious than a shape mismatch when you write about "breaking current code"?
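
As a rough sketch of the strides-based fallback from the proposal quoted
above (C order only; the Fortran case would walk the axes in the opposite
direction): the helper name last_axis_viewable is hypothetical, nothing like
it exists in NumPy, and treating 0-d arrays as out of scope is an assumption
made only for this example.

```python
import numpy as np

def last_axis_viewable(arr):
    """Hypothetical check: is arr 'contiguous' along its last axis in the
    relaxed sense of the proposal (C-order case only)?"""
    if arr.ndim == 0:
        return False  # 0-d arrays are out of scope for this sketch
    shape, strides = arr.shape, arr.strides
    # A last axis of size 1 is taken to have stride == itemsize.
    if shape[-1] != 1 and strides[-1] != arr.itemsize:
        return False
    min_stride = arr.itemsize * shape[-1]
    # Walk backwards: each stride must be at least the previous
    # stride times the previous dimension; size-1 axes are ignored.
    for dim, stride in zip(reversed(shape[:-1]), reversed(strides[:-1])):
        if dim == 1:
            continue
        if stride < min_stride:
            return False
        min_stride = stride * dim
    return True

base = np.zeros((4, 6))
print(last_axis_viewable(base[:, 1:5]))  # True: rows sliced, last stride intact
print(last_axis_viewable(base[:, ::2]))  # False: last stride is 2 * itemsize
print(last_axis_viewable(base.T))        # False here; Fortran check would apply
```

Note that base[:, 1:5] has neither contiguity flag set, which is exactly the
kind of array the flags-first, strides-second logic is meant to pick up.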

If there are any real loopholes in expanding this functionality, then let's
not do it, but we know we have at least one user unsatisfied with the
current performance, so I really think it is worth trying. Plus, I'll admit
that messing around with some of this stuff deep inside the guts of
the beast is lots of fun! ;)

Jaime


>
> - Sebastian
>
>
> >
> > Jaime
> >
> >
> >
> >         - Sebastian
> >
> >         >
> >         >         I guess the main consideration for this is that we
> >         may be
> >         >         stuck with
> >         >         stuff b/c of backwards compatibility. Can you maybe
> >         say a
> >         >         little bit
> >         >         about what is allowed now, and what constraints that
> >         puts on
> >         >         things?
> >         >         E.g. are we already grovelling around in strides and
> >         picking
> >         >         random
> >         >         dimensions in some cases?
> >         >
> >         >
> >         > Just to restate it: right now we only allow new views if the
> >         array is
> >         > globally contiguous, so either along the first or last
> >         dimension.
> >         >
> >         >
> >         > Jaime
> >         >
> >         >
> >         >         -n
> >         >
> >         >         --
> >         >         Nathaniel J. Smith
> >         >         Postdoctoral researcher - Informatics - University
> >         of
> >         >         Edinburgh
> >         >         http://vorpus.org
> >         >         _______________________________________________
> >         >         NumPy-Discussion mailing list
> >         >         NumPy-Discussion at scipy.org
> >         >
> >          http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >         >
> >         >
> >         >
> >         >
> >         >
> >         > --
> >         > (\__/)
> >         > ( O.o)
> >         > ( > <) This is Conejo. Copy Conejo into your signature and help
> >         > him with his plans for world domination.
> >
> >
> >
> >
> >
> >
> >
> >
>
>
>
>


-- 
(\__/)
( O.o)
( > <) This is Conejo. Copy Conejo into your signature and help him with his
plans for world domination.