[Python-Dev] total ordering.

Vladimir Yu. Stepanov root at renet.ru
Sat May 6 22:59:24 CEST 2006


On Sat, May 06, 2006 at 03:12:11AM -0700, Josiah Carlson wrote:
> 
> "Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> > Josiah Carlson wrote:
> > > This problem has nothing to do with dictionaries and hashing, it has to
> > > do with the fact that there may not be a total ordering on the elements
> > > of a sequence.
> > 
> > It is sad. I did not know it. Therefore and have not understood.
> > 
> > I have looked in Google on "python dev total ordering". On intentions to
> > correct this problem I have not found anything. It already earlier was
> > discussed ?
> 
> Not really.  It has to do with how non-comparable instances compare to
> each other.  Generally speaking, when instances aren't comparable,
> Python compares their type, which is actually comparing the names of
> types. This wouldn't be a big deal, except that:
> 
> >>> str < tuple < unicode
> True
> 
> And you can actually compare str and unicode, so, if you have a str that
> is greater than the unicode, you run into this issue.  With unicode
> becoming str in Py3k, we may not run into this issue much then, unless
> bytes are comparable to str, in which case we end up witht the same
> problems.
> 
> Actually, according to my google of "python dev total ordering", it
> gives me...
> 
> http://mail.python.org/pipermail/python-dev/2003-March/034169.html
> http://mail.python.org/pipermail/python-list/2003-March/154142.html
> 
> Which were earlier discussions on this topic, which are quite topical. 
> The ultimate solution is to choose a total ordering on types and
> consider the problem solved.  Then list.sort() and binary trees will
> work properly.
> 

Thanks for so detailed answer.

Under these references discussion is conducted is presented in the form
of "so is", instead of "as it to correct". Therefore I would like to
mention a question "as it to correct".

In case of incomparability of types can be it is meaningful
compare the identifiers compared to each concrete type. More truly
on them to calculate weight which will play a direct role in case
of impossibility of comparison of types.

Below one of variants of the decision of this problem is resulted.

To each type of data in Python to add three fields:
...
        int     tp_id;
        int     tp_qualifier;
        int     tp_weight;
...

Rigidly to appoint some built in and external to types
identifiers (tp_id). The others receive free identifiers on their
initialization. Except for that types should be classified on
their basic properties - tp_qualifier. The type can be
classified as:
0 = None
1 = integer (bool, int, long, etc..)
2 = float (decimal, float, etc..)
3 = math (complex, matrix may be ?? )
4 = string (str, unicode, splice, etc..)
5 = sequence (iterators, deque, xrange, enumerate, etc..)
6 = array (tuple, list, bytes, array, etc..)
7 = dictionary (dict, defdict, etc..)
8 = set (set, etc..)
9 = file (file, socket.socket, cStringIO.StringIO)

...
127 = non qualifier (sre.*, datetime.*, ClassType, IntsenceType, etc..)

It is the approximate list on which it will be convenient
to classify types (in my opinion).

The weight can be precisely set in structure or if be equal -1
should is calculated under the formula:

    ob_type->tp_weight = (ob_type->tp_qualifier<<24) + (ob_type->tp_id<<8)

Thus objects with similar properties will follow one after
another. A problem can make external types. But if to develop
further classification, problems to arise should not.


-- 
SY. Vladimir Stepanov


More information about the Python-Dev mailing list