Comparisons and sorting of a numeric class....

Andrew Robinson andrew3 at r3dsolutions.com
Wed Jan 14 01:12:54 CET 2015


On 01/12/2015 09:32 PM, Steven D'Aprano wrote:
> On Mon, 12 Jan 2015 17:59:42 -0800, Andrew Robinson wrote:
>
> [...]
>> What I am wanting to know is WHY did Guido think it so important to do
>> that ?   Why was he so focused on a strict inability to have any
>> instances of a bool subclass at all -- that he made a very arbitrary
>> exception to the general rule that base types in Python can be
>> subclassed ?
> It's not arbitrary. All the singleton (doubleton in the case of bool)
> classes cannot be subclassed. E.g. NoneType:
>
> py> class X(type(None)):
> ...     pass
> ...
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: Error when calling the metaclass bases
>      type 'NoneType' is not an acceptable base type
>
>
> Likewise for the NotImplemented and Ellipsis types.
>
> The reason is the same: if a type promises that there is one and only one
> instance (two in the case of bool), then allowing subtypes will break
> that promise in the 99.99% of cases where the subtype is instantiated.
Ok. That's something I did not know.  So much for the remark someone 
else made that just four classes can't be subtyped...

At least Guido is consistent.
But that doesn't give me any idea of why he thought it important.
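
For reference, the restriction is easy to probe directly. The following 
is a minimal sketch, assuming Python 3 (the tracebacks quoted above are 
from Python 2, where the error wording differs).  The helper name 
can_subclass is purely illustrative and not part of any library.

```python
from types import FunctionType

def can_subclass(base):
    """Return True if `base` is accepted as a base class, False otherwise."""
    try:
        type("Probe", (base,), {})   # same effect as: class Probe(base): pass
    except TypeError:                # "... is not an acceptable base type"
        return False
    return True

candidates = (bool, type(None), type(NotImplemented), type(Ellipsis),
              FunctionType, int, tuple)
results = {t.__name__: can_subclass(t) for t in candidates}

for name, ok in results.items():
    print("%-20s %s" % (name, "subclassable" if ok else "not subclassable"))
```

The singleton types and the function type are rejected, while ordinary 
built-ins such as int and tuple are accepted.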

> I suppose in principle Python could allow you to subclass singleton
> classes to your heart's content, and only raise an error if you try to
> instantiate them, but that would probably be harder and more error-prone
> to implement, and would *definitely* be harder to explain.
>
>
> There may be others too:
>
> py> from types import FunctionType
> py> class F(FunctionType):
> ...     pass
> ...
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: Error when calling the metaclass bases
>      type 'function' is not an acceptable base type
>
>
> My guess here is that functions are so tightly coupled to the Python
> interpreter that allowing you to subclass them, and hence break required
> invariants, could crash the interpreter. Crashing the interpreter from
> pure Python code is *absolutely not allowed*, so anything which would
> allow that is forbidden.
>
>
>> There's no reason in object oriented programming principles in general
>> that requires a new subclass instance to be a COMPLETELY DISTINCT
>> instance from an already existing superclass instance....
> True. But what is the point of such a subclass? I don't think you have
> really thought this through in detail.

I have. Such a subclass allows refining of the meaning/precision of a 
previous type while improving compatibility with existing applications 
-- and without being so easy to do that everyone will abuse it.  Those 
are the standard kinds of things which go into deciding that a singleton 
is appropriate...

Subclassing singletons is a tried and true way of improving 
productivity regardless of how it is implemented.

> Suppose we allowed bool subclasses, and we implement one which *only*
> returns True and False, without adding a third instance:
>
> class MyBool(bool):
>      def __new__(cls, arg):
>          if cls.condition(arg):
>              return True
>          else:
>              return False
>      @classmethod
>      def condition(cls, obj):
>          # decide whether obj is true-ish or false-ish.
>          pass
>      def spam(self):
>          return self.eggs()
>      def eggs(self):
>          return 23
>
>
> And then you do this:
>
> flag = MyBool(something)
> flag.spam()
>
>
> What do you expect to happen?
Flag is not an instance of MyBool, so it's going to generate an exception.
>
> Since flag can *only* be a regular bool, True or False, it won't have
> spam or eggs methods.
Correct.  It would generate an exception.
> You might think of writing the code using unbound methods:
>
> MyBool.spam(flag)
>
> (assuming that methods don't enforce the type restriction that "self"
> must be an instance of their class), but that fails when the spam method
> calls "self.eggs". So you have to write your methods like this:
>
>      def spam(self):
>          return MyBool.eggs(self)
>
> hard-coding the class name! You can't use type(self), because that's
> regular bool, not MyBool.
>
> This is a horrible, error-prone, confusing mess of a system. If you're
> going to write code like this, you are better off making MyBool a module
> with functions instead of a class with methods.
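
To make the scenario concrete, here is a minimal sketch, assuming 
Python 3.  Since bool cannot be a base class, int stands in as the base, 
and the body of condition is a made-up rule; neither of those comes from 
the quoted example.

```python
class MyBool(int):                 # stand-in base: bool itself cannot be subclassed
    def __new__(cls, arg):
        # Only ever hand back one of the two existing bool instances.
        return True if cls.condition(arg) else False

    @classmethod
    def condition(cls, obj):
        return len(obj) > 0        # hypothetical rule, purely for illustration

    def spam(self):
        return self.eggs()

    def eggs(self):
        return 23

def raises_attribute_error(call):
    """Run `call` and report whether it raised AttributeError."""
    try:
        call()
    except AttributeError:
        return True
    return False

flag = MyBool("something")

print(type(flag) is bool)                                 # a plain bool comes back
print(raises_attribute_error(lambda: flag.spam()))        # bool has no spam()
print(raises_attribute_error(lambda: MyBool.spam(flag)))  # fails inside, at self.eggs()
print(MyBool.eggs(flag))                                  # works only with the class name spelled out
```

The last call is the hard-coded form described in the quoted text: it 
works, but only because the class is named explicitly.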

Every design method has its trade-offs... but how you organize your code 
will affect whether it is messy or clean.

In terms of information encoding -- both an instance of a type,  or a 
class definition held in a type variable -- eg: a class name -- are 
pretty much interchangeable when it comes to being able to tell two 
items are not the same one, or are the same one.

So -- even a cursory thought shows that the information could be encoded 
in a very few lines even without an instance of a subclass:

class CAllFalse():
     @classmethod
     def __nonzero__(Kls): return False

class CPartFalse():
     @classmethod
     def __nonzero__(Kls): return False

...

class cmp():
     lex=( CAllFalse,  CPartFalse, True )
     @staticmethod
     def __cmp__( left, right ):
         if type(left) == type(right): # advanced statistical compare
             return cmp.lex.index( right ).__cmp__( cmp.lex.index( left ) )
         else: # legacy bool compare...
            return right.__nonzero__().__cmp__( left.__nonzero__() )

 >>> CAllFalse.__nonzero__()
False
 >>> cmp.__cmp__( CPartFalse, CAllFalse )
-1
 >>> cmp.__cmp__( CPartFalse, False )
0

This is not optimal code, but I could definitely refine it until it was 
compact and clean.
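
For readers on Python 3, where __cmp__ and __nonzero__ are no longer 
consulted by the interpreter, the same encoding can be approximated with 
plain functions.  This is only a rough sketch under that assumption; the 
names LEX, three_way, legacy_truth, compare and the truth attribute are 
invented here for illustration.

```python
class CAllFalse(object):
    truth = False          # legacy truth value carried by this marker class

class CPartFalse(object):
    truth = False

# Position in this tuple is the ordering used for the "advanced" compare.
LEX = (CAllFalse, CPartFalse, True)

def three_way(a, b):
    """Return -1, 0 or 1, like Python 2's built-in cmp(a, b)."""
    return (a > b) - (a < b)

def legacy_truth(x):
    """Collapse a marker class or a real bool to a plain bool."""
    return x if isinstance(x, bool) else x.truth

def compare(left, right):
    if isinstance(left, type) and isinstance(right, type):
        # Both operands are marker classes: compare their positions in LEX.
        # The operand order (right first) mirrors the Python 2 code above.
        return three_way(LEX.index(right), LEX.index(left))
    # At least one plain bool is involved: legacy compare on truth values.
    return three_way(legacy_truth(right), legacy_truth(left))

print(compare(CPartFalse, CAllFalse))
print(compare(CPartFalse, False))
```

One deliberate difference: the branch test checks that both operands are 
classes rather than comparing type(left) with type(right), so that two 
plain bools take the legacy path instead of being looked up in LEX.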

I do admit that, being more used to instantiated variables, I can make 
proxy wrappers far more easily and robustly than subclassed ones with no 
instances at all.  Both methods are traditionally used for making the 
equivalent of a subclass of a singleton...

And it's pretty clear that the return values of comparison operators are 
what have to be preserved in legacy operations vs. advanced ones 
regardless of how the singletons are implemented.

Essentially, any type of if statement that mixes bool with an advanced 
class must be wanting a legacy result -- so basically, I just need a 
simple way to encode and carry out a complicated compare, and always 
return a True or False final value at the appropriate time...

The most basic wrapper class is merely one that extends the idea of 
comparison to rich comparison; eg: like this one...

def _op(op,other): # Call an operation, to match data sizes...
         try: other[0]
         except: return op((other,))
         return op(other)

class ETuple(tuple):
     """
     Enhanced Tuple.
     A tuple which does rich comparisons even when the item being
     compared against is not an iterable by converting non iterables to
     tuples of length, exactly one.
     """
     def __new__(cls, data):return super(ETuple,cls).__new__(cls,data)
     def __eq__(self,other):return _op(super(ETuple,self).__eq__,other)
     def __ne__(self,other):return _op(super(ETuple,self).__ne__,other)
     def __lt__(self,other):return _op(super(ETuple,self).__lt__,other)
     def __le__(self,other):return _op(super(ETuple,self).__le__,other)
     def __gt__(self,other):return _op(super(ETuple,self).__gt__,other)
     def __ge__(self,other):return _op(super(ETuple,self).__ge__,other)
     def __cmp__(self,other):return _op(super(ETuple,self).__cmp__,other)
     def __nonzero__(self,other):return _op(super(ETuple,self).__cmp__,other)

 >>> PartFalse = ETuple( (False,) )
 >>> True > PartFalse
True
 >>>  PartFalse < True
True
 >>> PartFalse > False
False
 >>> False < PartFalse
False
 >>> False == PartFalse
True
 >>> False is PartFalse
False
 >>> False == True
False
 >>> print PartFalse
(False,)

It's pretty well behaved, and does everything I want.
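
The reason an expression such as True > PartFalse reaches the wrapper at 
all is the reflected-operator fallback: the left operand's method 
declines, and Python then tries the mirrored method on the right 
operand.  A minimal sketch, assuming Python 3 (where int defines __gt__ 
directly); the class name Wrapped is hypothetical and implements only 
__lt__ to keep the example short.

```python
class Wrapped(tuple):
    def __lt__(self, other):
        # Promote a bare value to a one-element tuple before comparing.
        if not isinstance(other, tuple):
            other = (other,)
        return tuple.__lt__(self, other)

part_false = Wrapped((False,))

# bool inherits int.__gt__, which does not know how to compare with a tuple
# and therefore declines by returning NotImplemented.
declined = True.__gt__(part_false)
print(declined is NotImplemented)

# The operator form falls back to part_false.__lt__(True).
result = True > part_false
print(result)
```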

A full implementation, that distinguishes between legacy and 
probabilistic interpretations for the math operators could be done in 
several ways;  Here's one that I think is reasonably clean.

class RBool(tuple):
     """
     Rich compare bool.
     A tuple which contains a bool, and a relative certainty value.
     Whenever a RBool is compared against a bool or single valued object
     tuple, it will default to legacy mode and compare only against the
     bool.  However, whenever compared against another RBool, or
     multi-element tuple, it will do a full rich comparison.
     """
     def __new__(cls, data):return super(RBool,cls).__new__(cls,data)
     def _legacy(self,other):
         'Resize local data for proper legacy compares as needed.'
         try:
             # legacy compare in tuples.
             if len(other)==1: return (self[0],)
         except TypeError: return self[0] # non iterables are a legacy compare.
         return tuple(self) # A full rich non-recursive compare is warranted
     def __eq__(self,other):return self._legacy(other) == other
     def __ne__(self,other):return self._legacy(other) != other
     def __lt__(self,other):return self._legacy(other) <  other
     def __le__(self,other):return self._legacy(other) <= other
     def __gt__(self,other):return self._legacy(other) >  other
     def __ge__(self,other):return self._legacy(other) >= other
     # def __cmp__( #TODO:
     # def __nonzero__( #TODO:

AllTrue   = True
PartTrue  = RBool((False,3))
Uncertain = RBool((False,2))
PartFalse = RBool((False,1))
AllFalse  = RBool((False,0))
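
A short usage sketch, assuming Python 3, showing the legacy and rich 
paths side by side and that the values sort by their certainty element.  
The class is restated in condensed form only so that the block runs on 
its own; the variable name ranked is illustrative.

```python
class RBool(tuple):
    """Condensed restatement of RBool so this block is self-contained."""
    def _legacy(self, other):
        try:
            if len(other) == 1:
                return (self[0],)      # one-element tuple: legacy compare
        except TypeError:
            return self[0]             # non-iterable (e.g. a bool): legacy compare
        return tuple(self)             # anything else: full rich compare
    def __eq__(self, other): return self._legacy(other) == other
    def __ne__(self, other): return self._legacy(other) != other
    def __lt__(self, other): return self._legacy(other) <  other
    def __le__(self, other): return self._legacy(other) <= other
    def __gt__(self, other): return self._legacy(other) >  other
    def __ge__(self, other): return self._legacy(other) >= other

PartTrue  = RBool((False, 3))
Uncertain = RBool((False, 2))
PartFalse = RBool((False, 1))
AllFalse  = RBool((False, 0))

# Legacy path: only the bool element takes part in the comparison.
print(PartFalse == False)
print(PartFalse < True)

# Rich path: the certainty element breaks the tie between two RBools.
print(PartFalse < Uncertain)
print(PartFalse == AllFalse)

# Sorting relies on __lt__, so it follows the rich ordering.
ranked = sorted([Uncertain, AllFalse, PartTrue, PartFalse])
print([tuple(r) for r in ranked])
```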

>> nor, have I
>> ever seen Guido say that Python is designed intentionally to force this
>> to always be the case... so I'm not sure that it's anything more than a
>> non guaranteed implementation detail that Python acts the way you say it
>> does....
> It is a documented restriction on bool. Whether you agree with the
> decision or not, it is not an implementation detail, it is a language
> promise.
>
> https://docs.python.org/2/library/functions.html#bool

It was documented in the link I gave too. So -- it's not like I didn't 
know that already.  It just looks like he had not thought it out 
carefully, and he might change his mind in the future.  He is great at 
saying what will happen -- but not so good at clearly stating why, and 
he does change his mind.  I do recall reading about a time when Python 
had a bool factory function ... with a plethora of values...

And even if he doesn't change his mind --  it's still a language 
implementation detail; because Proxy objects are still possible even if 
subclassing is not.  And, it is therefore possible to get an object 
which is not bool -- to return isinstance( myvar, bool ) True -- when it 
proxies a bool (I have actually done this...)
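
One way such a proxy might be built (a sketch under assumptions, not 
necessarily the approach referred to above) relies on isinstance 
consulting an object's __class__ attribute when the object's real type 
does not match.  The class name BoolProxy is hypothetical, and the code 
assumes Python 3.

```python
class BoolProxy(object):
    """Wraps a bool and reports bool as its class without inheriting from it."""

    def __init__(self, value):
        self._value = bool(value)

    # isinstance() falls back to the __class__ attribute when type(obj) is
    # not a subclass of the class being tested, so this makes the check pass.
    __class__ = property(lambda self: bool)

    def __bool__(self):
        return self._value

proxy = BoolProxy(True)

print(isinstance(proxy, bool))     # the proxy passes the isinstance check
print(type(proxy) is bool)         # but its real type is still BoolProxy
print(bool(proxy))
```

Code that checks type(obj) is bool rather than using isinstance is not 
fooled, which is one of the limits of the proxy approach.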

Besides which, if you really want to get down to it -- I can redefine 
class bool from main and make it subclassable for selected libraries I 
want to import; it's hairy, but I certainly can make a replacement class 
that will allow subclassing and which is compatible with the present 
definition of bool.

So -- it's not like Guido's choices (including saying that duck-typing 
ruins the whole point of having bools) -- are carved in stone for all 
eternity.

There are always options....




