[Python-Dev] nice()

Greg Ewing greg.ewing at canterbury.ac.nz
Tue Feb 14 11:52:59 CET 2006


Smith wrote:

> computing the bin boundaries for a histogram
> where bins are a width of 0.1:
>
> >>> for i in range(20):
> ...  if (i*.1==i/10.)<>(nice(i*.1)==nice(i/10.)):
> ...   print i,repr(i*.1),repr(i/10.),i*.1,i/10.

I don't see how that has any relevance to the way bin boundaries
would be used in practice, which is to say something like

   i = int(value / 0.1)
   bin[i] += 1 # modulo appropriate range checks

which doesn't require comparing floats for equality at all.
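
A minimal sketch of that approach, written in present-day Python 3
rather than the Python 2 of the quoted session. The function name
histogram and its parameters are made up for illustration, bins are
assumed to start at 0, and math.floor is used instead of int() so
that small negative values are not truncated into bin 0:

```python
import math


def histogram(values, width, nbins):
    """Count values into nbins bins of the given width, starting at 0.

    Hypothetical helper for illustration only. No float is ever
    compared for equality; each value is mapped straight to an
    integer bin index.
    """
    counts = [0] * nbins
    for value in values:
        i = math.floor(value / width)
        # The "appropriate range checks": out-of-range values are skipped.
        if 0 <= i < nbins:
            counts[i] += 1
    return counts


if __name__ == "__main__":
    print(histogram([0.05, 0.15, 0.17, 1.95, -0.05, 2.5], 0.1, 20))
```

A value lying almost exactly on a boundary may land in either
neighbouring bin, but no comparison against a computed boundary such
as 0.1*i is involved.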

> For, say, garden variety numbers that aren't full of garbage digits
> resulting from fp computation, the boundaries computed as 0.1*i are
> not going to agree with such simple numbers as 1.4 and 0.7.

Because the arithmetic is binary rather than decimal. But even in
decimal, you get the same sort of problems with a bin width of
1.0/3.0. The solution is to use an algorithm that isn't sensitive
to those problems; then it doesn't matter what base your arithmetic
is done in.
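
To illustrate the point about bases, here is a small sketch using
the standard decimal module with its default context (28 significant
digits is assumed). The variable names are only for illustration.
Decimal arithmetic makes tenths exact, but thirds are rounded just
as tenths are in binary:

```python
from decimal import Decimal

# Binary floating point: 0.1 has no exact representation.
binary_tenths_exact = (0.1 * 3 == 0.3)

# Decimal arithmetic: tenths are exact, so this comparison holds.
decimal_tenths_exact = (Decimal("0.1") * 3 == Decimal("0.3"))

# But 1/3 has no finite decimal expansion, so it is rounded,
# and multiplying back by 3 does not recover 1.
third = Decimal(1) / Decimal(3)
decimal_thirds_exact = (third * 3 == Decimal(1))

print(binary_tenths_exact, decimal_tenths_exact, decimal_thirds_exact)
```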

> I understand that the above really is just a patch over the problem,
> but I'm wondering if it moves the problem far enough away that most
> users wouldn't have to worry about it.

No, it doesn't. The problems are not conveniently grouped together
in some place you can get away from; they're scattered all over the
place where you can stumble upon one at any time.
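
As a rough illustration of how scattered they are, the following
Python 3 sketch lists the values of i below 20 for which 0.1*i and
i/10 disagree on an IEEE 754 double platform (assumed here). The
name mismatches is just for the example:

```python
# Collect every i for which the two ways of computing the same
# bin boundary give different floats.
mismatches = [i for i in range(20) if i * 0.1 != i / 10]

for i in mismatches:
    # repr() shows enough digits to tell the two floats apart.
    print(i, repr(i * 0.1), repr(i / 10))
```

The disagreements do not cluster at one end of the range; they turn
up in between values that compare equal.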

> So perhaps this brings us back to the original comment that "fp issues
> are a learning opportunity." They are. The question I have is "how
> soon do they need to run into them?" Is decreasing the likelihood that
> they will see the problem (but not eliminating it) a good thing for the
> python community or not?

I don't think you're doing anyone any favours by trying to protect
them from having to know about these things, because they *need* to
know about them if they're not to write algorithms that seem to
work fine on tests but mysteriously start producing garbage when
run on real data, possibly without it even being obvious that it is
garbage.

Greg

