[Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d
Benjamin Root
ben.v.root at gmail.com
Wed Jul 6 09:51:16 EDT 2016
While atleast_1d/2d/3d predate my involvement in numpy, I am probably
partly to blame for popularizing them as I helped to fix them up a fair
amount. I wouldn't call their use "guessing". Rather, I would treat them as
useful input sanitizers. If your function is going to be doing 2d indexing
on an input, then it is very convenient to have atleast_2d() at the top of
your function, not only to sanitize the input, but to make it clear that
your code expects at least two dimensions.
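As a minimal sketch of that sanitizing pattern (the function name first_column is hypothetical and used only for illustration):

```python
import numpy as np

def first_column(data):
    # Hypothetical helper: promote scalars and 1-D inputs to 2-D up front,
    # which also documents that the body relies on 2-D indexing.
    data = np.atleast_2d(data)
    return data[:, 0]

from_1d = first_column([1.0, 2.0, 3.0])            # input is treated as one row
from_2d = first_column([[1.0, 2.0], [3.0, 4.0]])   # already 2-D, left unchanged
```

Note that a 1-D input is treated as a single row here, since atleast_2d prepends the new axis.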
One place where it is used is in np.loadtxt(..., ndmin=N) to protect
against the situation of a single row of data becoming a 1-D array rather
than a 2-D array (or an empty text file returning something completely
useless).
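A small sketch of the single-row case, reading from an in-memory string rather than a real file (the sample data is made up for illustration):

```python
import io
import numpy as np

one_row = "1.0 2.0 3.0\n"

# Without ndmin, a file holding a single row comes back as a 1-D array.
default_result = np.loadtxt(io.StringIO(one_row))

# With ndmin=2, the same data keeps its row/column structure.
forced_2d = np.loadtxt(io.StringIO(one_row), ndmin=2)

print(default_result.shape)  # (3,)
print(forced_2d.shape)       # (1, 3)
```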
I have previously pointed out the oddity with atleast_3d(). I can't
remember the explanation I got though. Maybe someone can find the old
thread that has the explanation, if any?
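For reference, the oddity shows up when comparing output shapes; a minimal sketch:

```python
import numpy as np

vec = np.ones(3)        # 1-D input
mat = np.ones((2, 3))   # 2-D input

shape_2d_from_vec = np.atleast_2d(vec).shape  # (1, 3): new axis prepended
shape_3d_from_vec = np.atleast_3d(vec).shape  # (1, 3, 1): one prepended, one appended
shape_3d_from_mat = np.atleast_3d(mat).shape  # (2, 3, 1): new axis appended

print(shape_2d_from_vec, shape_3d_from_vec, shape_3d_from_mat)
```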
I think a keyword argument for controlling the behavior might be a good
approach, provided that a suitable design could be devised. One and two
dimensions are fairly trivial to control, but 3+ dimensions have too many
degrees of freedom for me to consider.
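One possible shape for such a keyword, purely as a sketch: the function name atleast_nd_sketch and the keyword append are hypothetical, and are not necessarily what the PR implements. It only covers the "all before" or "all after" cases, not arbitrary placement.

```python
import numpy as np

def atleast_nd_sketch(arr, ndim, *, append=False):
    # Hypothetical illustration, not the API proposed in the PR.
    arr = np.asanyarray(arr)
    missing = ndim - arr.ndim
    if missing <= 0:
        # Already has enough dimensions; return unchanged.
        return arr
    if append:
        new_shape = arr.shape + (1,) * missing   # size-1 axes after existing ones
    else:
        new_shape = (1,) * missing + arr.shape   # size-1 axes before existing ones
    return arr.reshape(new_shape)

prepended = atleast_nd_sketch(np.ones((2, 3)), 4)
appended = atleast_nd_sketch(np.ones((2, 3)), 4, append=True)
print(prepended.shape)  # (1, 1, 2, 3)
print(appended.shape)   # (2, 3, 1, 1)
```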
Cheers!
Ben Root
On Wed, Jul 6, 2016 at 9:12 AM, Joseph Fox-Rabinovitz <
jfoxrabinovitz at gmail.com> wrote:
> I can add a keyword-only argument that lets you put the new dims
> before or after the existing ones. I am not sure how to specify
> arbitrary patterns for the new dimensions, but that should take care
> of most use cases.
>
> The use case that motivated this function in the first place is that I
> am doing some processing on 4D arrays and I need to reduce them but
> return a result with the original dimensionality (but not shape).
> atleast_nd seemed like a better solution than atleast_4d.
>
> -Joe
>
>
> On Wed, Jul 6, 2016 at 3:41 AM, <josef.pktd at gmail.com> wrote:
> >
> >
> > On Wed, Jul 6, 2016 at 3:29 AM, <josef.pktd at gmail.com> wrote:
> >>
> >>
> >>
> >> On Wed, Jul 6, 2016 at 2:21 AM, Ralf Gommers <ralf.gommers at gmail.com>
> >> wrote:
> >>>
> >>>
> >>>
> >>> On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith <njs at pobox.com> wrote:
> >>>
> >>>> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz"
> >>>> <jfoxrabinovitz at gmail.com> wrote:
> >>>> >
> >>>> > Hi,
> >>>> >
> >>>> > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
> >>>> > function np.atleast_nd in PR#7804
> >>>> > (https://github.com/numpy/numpy/pull/7804).
> >>>> >
> >>>> > As a result of this PR, I have a couple of questions about
> >>>> > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
> >>>> > the dimensions: If the input is 1D, it prepends and appends a size-1
> >>>> > dimension. If the input is 2D, it appends a size-1 dimension. This is
> >>>> > inconsistent with `np.atleast_2d`, which always prepends (as does
> >>>> > `np.atleast_nd`).
> >>>> >
> >>>> > - Is there any reason for this behavior?
> >>>> > - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
> >>>> > terms of `np.atleast_nd`, which is actually much simpler)? This would
> >>>> > be a slight API change since the output would not be exactly the same.
> >>>>
> >>>> Changing atleast_3d seems likely to break a bunch of stuff...
> >>>>
> >>>> Beyond that, I find it hard to have an opinion about the best design for
> >>>> these functions, because I don't think I've ever encountered a situation
> >>>> where they were actually what I wanted. I'm not a big fan of coercing
> >>>> dimensions in the first place, for the usual "refuse to guess" reasons. And
> >>>> then generally if I do want to coerce an array to another dimension, then I
> >>>> have some opinion about where the new dimensions should go, and/or I have
> >>>> some opinion about the minimum acceptable starting dimension, and/or I have
> >>>> a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix;
> >>>> 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that
> >>>> requirements list.)
> >>>>
> >>>> I don't know how typical I am in this. But it does make me wonder if the
> >>>> atleast_* functions act as an attractive nuisance, where new users take
> >>>> their presence as an implicit recommendation that they are actually a useful
> >>>> thing to reach for, even though they... aren't that. And maybe we should be
> >>>> recommending folk move away from them rather than trying to extend them
> >>>> further?
> >>>>
> >>>> Or maybe they're totally useful and I'm just missing it. What's your use
> >>>> case that motivates atleast_nd?
> >>>
> >>> I think you're just missing it:) atleast_1d/2d are used quite a bit in
> >>> Scipy and Statsmodels (those are the only ones I checked), and in the large
> >>> majority of cases it's the best thing to use there. There's a bunch of
> >>> atleast_2d calls with a transpose appended because the input needs to be
> >>> treated as columns instead of rows, but that's still efficient and readable
> >>> enough.
> >>
> >>
> >>
> >> As Ralf pointed out, they are used in statsmodels. I do find them useful as a
> >> replacement for several lines of ifs and reshapes.
> >>
> >> We still need, in many cases, atleast_2d_cols, which appends the newaxis
> >> if necessary.
> >>
> >> roughly the equivalent of
> >>
> >>     if x.ndim == 1:
> >>         x = x[:, None]
> >>     else:
> >>         x = np.atleast_2d(x)
> >>
> >> Josef
> >>
> >>>
> >>>
> >>> For 3D/nD I can see that you'd need more control over where the
> >>> dimensions go, but 1D/2D are fine.
> >
> >
> >
> > statsmodels currently has very little code with ndim > 2, so I have no
> > overview of possible use cases, but it would be necessary to have full
> > control over the added axis, since axes have a strict meaning and stats still
> > prefers Fortran order to the default numpy/C ordering.
> >
> > Josef
> >
> >
> >>>
> >>>
> >>>
> >>> Ralf
> >>>
> >>>
> >>> _______________________________________________
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion at scipy.org
> >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>
> >>
> >
> >