Comparing float and decimal

Terry Reedy tjreedy at udel.edu
Tue Sep 30 10:21:13 CEST 2008


Gabriel Genellina wrote:
> En Thu, 25 Sep 2008 08:02:49 -0300, Mark Dickinson <dickinsm at gmail.com> 
> escribió:
>> On Sep 23, 1:58 pm, Robert Lehmann <stargam... at gmail.com> wrote:
>>> I don't see why transitivity should apply to Python objects in general.
>>
>> Hmmm.  Lack of transitivity does produce some, um, interesting
>> results when playing with sets and dicts.  Here are sets s and
>> t such that the unions s | t and t | s have different sizes:
>>
>>>>> from decimal import Decimal
>>>>> s = set([Decimal(2), 2.0])
>>>>> t = set([2])
>>>>> len(s | t)
>> 2
>>>>> len(t | s)
>> 1
> 
> Ouch!
> 
>> This opens up some wonderful possibilities for hard-to-find bugs...
> 
> And I was thinking all this thread was just a theoretical question 
> without practical consequences...

To  explain this anomaly more clearly, here is a recursive definition of 
set union.

if b: a|b = a.add(x)|(b-x) where x is arbitrary member of b
else: a|b = a

Since Python only defines set-set and not set-ob, we would have to 
subtract {x} to directly implement the above.  But b.pop() subtracts an 
arbitrary members and returns it so we can add it.  So here is a Python 
implementation of the definition.

def union(a,b):
   a = set(a) # copy to preserve original
   b = set(b) # ditto
   while b:
     a.add(b.pop())
   return a

from decimal import Decimal
d1 = Decimal(1)
fd = set((1.0, d1))
i  = set((1,))
print(union(fd,i))
print(union(i,fd))

# prints

{1.0, Decimal('1')}
{1}

This is a bug in relation to the manual:
"union(other, ...)
set | other | ...
Return a new set with elements from both sets."


Transitivity is basic to logical deduction:
   equations: a == b == c ... == z implies a == z
   implications: (a implies b) and (b implies c)implies (a implies c)
The latter covers syllogism and other deduction rules.

The induction part of an induction proof of set union commutivity is a 
typical equality chain:

if b:
a | b
= a.add(x)| b-x for x in b  # definition for non-empty b
= b-x | a.add(x)  # induction hypothesis
= (b-x).add(x) | a.add(x)-x  # definition for non-empty a
= b | a.add(x)-x  # definitions of - and .add
   if x not in a:
   = b | a  # .add and -
   if x in a:
   = b | a-x    # .add and -
   = b.add(x) | a-x  # definition of .add for x in b
   = b | a  # definition for non-empty a
= b | a  # in either case, by case analysis

By transitivity of =, a | b = b | a !

So where does this go wrong for our example?  This shows the problems.
 >>> fd - i
set()

This pretty much says that 2-1=0, or that 2=1.  Not good.

The fundamental idea of a set is that it only contains something once. 
This definition assumes that equality is defined sanely, with the usual 
properties.  So, while fd having two members implies d1 != 1.0, the fact 
that f1 == 1 and 1.0 == 1 implies that they are really the same thing, 
so that d1 == 1.0, a contradiction.

To put this another way: The rule of substitution is that if E, F, and G 
are expressions and E == F and E is a subexpression of G and we 
substitute F for E in G to get H, then G == H.  Again, this rule, which 
is a premise of all formal expressional systems I know of, assumes the 
normal definition of =.  When we apply this,

fd == {f1, 1.0} == {1,1.0} == {1} == i

But Python says
 >>> fd == i
False

Conclusion: fd is not a mathematical set.

Yet another anomaly:

 >>> f = set((1.0,))
 >>> i == f
True
 >>> i.add(d1)
 >>> f.add(d1)
 >>> i == f
False

So much for "adding the same thing to equals yields equals", which is a 
special case of "doing the same thing to equals, where the thing done 
only depends on the properties that make the things equal, yields equals."


And another

 >>> d1 in i
True
 >>> 1.0 in i
True
 >>> fd <= i
False

Manual: "set <= other
Test whether every element in the set is in other"

I bet Python first tests the sizes because the implementer *assumed* 
that every member of a larger set could not be in a smaller set.  I 
presume the same assumption is used for equality testing.

Or

Manual: "symmetric_difference(other)
set ^ other
Return a new set with elements in either the set or other but not both."

 >>> d1 in fd
True
 >>> d1 in i
True
 >>> d1
Decimal('1')
 >>> fd ^ i
{Decimal('1')}

If no one beats me to it, I will probably file a bug report or two, but 
I am still thinking about what to say and to suggest.

Terry Jan Reedy







More information about the Python-list mailing list