[Numpy-discussion] efficient way to manage a set of floats?

Benjamin Root ben.root at ou.edu
Wed May 12 22:06:50 EDT 2010


On Wed, May 12, 2010 at 8:37 PM, <josef.pktd at gmail.com> wrote:

> On Wed, May 12, 2010 at 9:27 PM, Robert Kern <robert.kern at gmail.com>
> wrote:
> > On Wed, May 12, 2010 at 20:09, Dr. Phillip M. Feldman
> > <pfeldman at verizon.net> wrote:
> >>
> >> Warren Weckesser-3 wrote:
> >>>
> >>> A couple questions:
> >>>
> >>> How many floats will you be storing?
> >>>
> >>> When you test for membership, will you want to allow for a numerical
> >>> tolerance, so that if the value 1 - 0.7 is added to the set, a test for
> >>> the value 0.3 returns True?  (0.3 is actually 0.29999999999999999, while
> >>> 1-0.7 is 0.30000000000000004)
> >>>
> >>> Warren
> >>>
> >>
> >> Anne- Thanks for that absolutely beautiful explanation!!
> >>
> >> Warren- I had not initially thought about numerical tolerance, but this
> >> could potentially be an issue, in which case the management of the data
> >> would have to be completely different.  Thanks for pointing this out!  I
> >> might have as many as 50,000 values.
> >
> > You may want to explain your higher-level problem. Maintaining sets of
> > floating point numbers is almost never the right approach. With sets,
> > comparison must necessarily be by exact equality because fuzzy
> > equality is not transitive.
>
> with consistent scaling, shouldn't something like rounding to a fixed
> precision be enough?
>
> >>> round(1 - 0.7,14) == round(0.3, 14)
> True
> >>> 1 - 0.7 == 0.3
> False
>
> or approx_equal instead of almost_equal
>
> Josef
>
I have to agree with Robert.  Whenever a fellow student comes to me
describing an issue where they needed to find a floating point number in an
array, the problem can usually be restated in a way that makes much more
sense.
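
As a rough illustration of one such restatement (an assumption about what
the underlying question might be, not something from this thread): if the
real need is "is there a stored value within some tolerance of x", a sorted
sequence plus a binary search answers it without exact equality.  The name
contains_within and the default tolerance are hypothetical; for NumPy
arrays, np.searchsorted plays the role that bisect_left plays here.

```python
from bisect import bisect_left

def contains_within(sorted_values, x, tol=1e-9):
    """Return True if some value in sorted_values lies within tol of x.

    sorted_values must already be sorted in ascending order.
    """
    i = bisect_left(sorted_values, x)
    # Only the elements just before and at the insertion point can be nearest.
    neighbours = sorted_values[max(i - 1, 0):i + 1]
    return any(abs(v - x) <= tol for v in neighbours)

values = sorted([1 - 0.7, 0.5, 2.0])
print(contains_within(values, 0.3))   # 1 - 0.7 is stored, 0.3 is queried
print(contains_within(values, 0.4))   # nothing stored near 0.4
```

Note that this is a nearness query, not set membership: it makes no claim
that "within tol" behaves like equality.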

There are many issues with doing a naive comparison using round(), largely
because approximate equality is intransitive, as Robert already stated:
values that are arbitrarily close can still round to different results.  As
a quick and dirty solution to very specific issues, such comparisons work,
but they are almost never left as a final solution.
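
A small sketch of the two failure modes, using made-up values and two
hypothetical helpers (close and same_bucket are names chosen here for
illustration, not functions from NumPy):

```python
def close(x, y, tol=0.01):
    """Tolerance-based ("fuzzy") equality."""
    return abs(x - y) <= tol

def same_bucket(x, y, ndigits=1):
    """Equality after rounding both values to ndigits decimal places."""
    return round(x, ndigits) == round(y, ndigits)

# Fuzzy equality is not transitive: x ~ y and y ~ z, but x !~ z.
x, y, z = 0.0, 0.008, 0.016
print(close(x, y), close(y, z), close(x, z))

# Rounding puts two nearby values on opposite sides of a boundary.
a, b = 0.149, 0.151
print(close(a, b), same_bucket(a, b))
```

In the first case a set cannot decide consistently whether x and z are the
same element; in the second, two values only 0.002 apart end up as distinct
elements after rounding.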

Ben