[Numpy-discussion] Augment unique method

Stephan Hoyer shoyer at gmail.com
Thu Jul 16 16:14:05 EDT 2020


On Thu, Jul 16, 2020 at 1:04 PM <aminthefresh at gmail.com> wrote:

> I see your point. How about passing the number of significant figures
> instead of atol?
>
>
>
> In fact, that’s what I originally intended, but I thought that it could be
> expressed via atol and rtol. The number of significant figures doesn’t
> seem to suffer from the ambiguity you pointed out.
>

This can already be expressed clearly* with a separate function call, e.g.,
np.unique(np.round(x, 3))

In general, it's a better software design practice to have separate
composable functions rather than adding more features into a single
function. So I don't think this would be an improvement for np.unique().
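
As a minimal sketch of the composition suggested above (the sample values in
x are made up for illustration), rounding first and then calling np.unique()
collapses values that agree to three decimal places:

```python
import numpy as np

# Illustrative data: 0.1 + 0.2 and 0.3 differ only by floating-point error,
# and the other pairs differ beyond the third decimal place.
x = np.array([0.1 + 0.2, 0.3, 0.1234, 0.1231, 0.5678, 0.5684])

# np.unique compares floats for exact equality, so nothing is merged here.
exact_unique = np.unique(x)

# Rounding to 3 decimals first makes "approximately equal" values identical.
approx_unique = np.unique(np.round(x, 3))

print(exact_unique.size)   # number of exactly distinct values
print(approx_unique)       # distinct values after rounding
```

The rounding step is explicit, so the caller decides how values are grouped
and np.unique() keeps its exact-equality semantics.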

* Note: this rounds to a fixed number of decimal places rather than to a
fixed number of significant figures. I can see a case for adding a helper
function for rounding to a number of significant digits, but that should be
a separate change from np.unique(). You can certainly do this currently in
NumPy, but it's a bit of work:
https://stackoverflow.com/questions/18915378/rounding-to-significant-figures-in-numpy
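
For illustration, here is one possible way to write such a helper. The name
round_to_sig_figs is hypothetical (it is not part of NumPy), and this is only
a sketch that assumes finite floating-point input:

```python
import numpy as np

def round_to_sig_figs(x, sig):
    """Round each element of x to `sig` significant figures.

    Hypothetical helper, not a NumPy function. Assumes finite inputs.
    """
    x = np.asarray(x, dtype=float)
    # Decimal exponent of each nonzero element; zeros keep exponent 0
    # so that log10(0) is never evaluated.
    nonzero = x != 0
    magnitude = np.zeros_like(x)
    magnitude[nonzero] = np.floor(np.log10(np.abs(x[nonzero])))
    # Scale so that `sig` digits sit left of the decimal point, round, undo.
    scale = 10.0 ** (sig - 1 - magnitude)
    return np.round(x * scale) / scale

# Values that agree to 3 significant figures at very different magnitudes.
values = np.array([1231.0, 1234.0, 0.001231, 0.001234])
sig_unique = np.unique(round_to_sig_figs(values, 3))
print(sig_unique)
```

Unlike np.round(values, 3), which would reduce the two small values to 0.001
and leave the large ones untouched, this groups values by relative rather
than absolute precision.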


>
>
> *From:* NumPy-Discussion <numpy-discussion-bounces+aminthefresh=
> gmail.com at python.org> *On Behalf Of *Stephan Hoyer
> *Sent:* Thursday, July 16, 2020 3:06 PM
> *To:* Discussion of Numerical Python <numpy-discussion at python.org>
> *Subject:* Re: [Numpy-discussion] Augment unique method
>
>
>
> On Thu, Jul 16, 2020 at 11:41 AM Roman Yurchak <rth.yurchak at gmail.com>
> wrote:
>
> One issue with adding a tolerance to np.unique for floats: say you have
>   [0, 0.1, 0.2, 0.3, 0.4, 0.5] with atol=0.15
>
> Should this return a single element or multiple ones? On the one hand, each
> consecutive float is closer than the tolerance to the next one, but on the
> other, the first one and the last one are clearly not within atol.
>
> Generally this is similar to what the DBSCAN clustering algorithm does
> (e.g. in scikit-learn) and that would probably be out of scope for np.unique.
>
>
>
> I agree, I don't think there's an easy answer for selecting "approximately
> unique" floats in the case of overlap.
>
>
>
> np.unique() does actually have well-defined behavior for floats: it
> compares them for exact equality. This isn't always directly useful, but it
> definitely is well defined.
>
>
>
> My suggestion for this use-case would be to round floats to the desired
> precision before passing them into np.unique().
>
>
>
>
>
> Roman
>
> On 16/07/2020 20:27, Amin Sadeghi wrote:
> > It would be handy to add "atol" and "rtol" optional arguments to the
> > "unique" method. I'm proposing this since uniqueness is a bit vague for
> > floats. This change would be clearly backwards-compatible.
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>