[Tutor] True and 1 [was Re: use of the newer dict types]

Fri Aug 2 01:58:52 CEST 2013

On Thu, Aug 1, 2013 at 3:28 PM, Albert-Jan Roskam <fomcl at yahoo.com> wrote:
>
> Six comparisons for each operator (6 x 3) and the 4 calls to __coerce__
> seems much. So you arrive at 8 calls by 3 operator methods called
> "bidirectionally" (A-B and B-A) + two calls to __coerce__?

If the types are the same, cmp uses __cmp__ if it's defined and
implemented. Otherwise CPython uses the rich comparisons EQ, LT, and
GT, with the operands both unswapped and swapped (EQ swaps with
itself, and LT swaps with GT). It returns the first implemented result
(but not the default comparison at this step). Also, if the 2nd
operand's type is a subclass of the first, the swapped operation is
done first, so the subclass can override the parent.
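
To illustrate the subclass rule, here is a minimal sketch that should run unchanged on Python 2.7 and 3.x. It uses the < operator directly instead of cmp() (which is gone in Python 3), and Base, Derived, and calls are made-up names for the example:

```python
calls = []  # records which comparison methods the interpreter invokes

class Base(object):
    def __lt__(self, other):
        calls.append('Base.__lt__')
        return NotImplemented

class Derived(Base):
    def __gt__(self, other):
        # Swapped counterpart of Base.__lt__.
        calls.append('Derived.__gt__')
        return True

# The right operand's type is a subclass of the left operand's type,
# so the swapped operation (Derived.__gt__) is tried first.
result = Base() < Derived()
print(result, calls)
```

Base.__lt__ is never reached here, because the swapped call on the subclass already produced a result.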

My example returns NotImplemented for all of the rich comparisons, so
the interpreter should try a classic comparison using __coerce__
(unswapped and swapped) to get comparable objects. PyPy skips coercing
the objects. It also allows the default result from the LT rich
comparison (comparing type names in this case) to trump attempting a
GT comparison. CPython only uses the default comparison after
exhausting all other options. That seems right to me, but this is
language lawyer territory...

As to the excessive calls in CPython, (1) it doesn't remember that
it's already tried the swapped operation if the 2nd operand's type is
a subclass, so it does it again. (2) The API calls such as
PyObject_RichCompare try each comparison swapped and unswapped, but
the slot function (e.g. slot_tp_richcompare) *also* tries it swapped
and unswapped. The same problem exists with PyNumber_CoerceEx vs.
slot_nb_coerce. CPython 3.x doesn't have this problem with its rich
comparison implementation.
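
For what it's worth, the 3.x behaviour can be checked with a small logging sketch. This is written for Python 3 only (2.7 would log more calls, and I'm not reproducing that output here), and the class names are hypothetical:

```python
calls = []  # every __eq__ invocation is logged here

class Base(object):
    def __eq__(self, other):
        calls.append('Base.__eq__')
        return NotImplemented

class Derived(Base):
    def __eq__(self, other):
        calls.append('Derived.__eq__')
        return NotImplemented

# Both methods decline, so the interpreter falls back to an identity test.
result = (Base() == Derived())
print(result, calls)
```

On Python 3 the swapped operation on the subclass is tried once, then the unswapped one, and nothing is repeated.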

> I did "help(coerce)" and it seems straightforward, however, I can't connect
> it to your remark about coercion. I thought pypy would behave differently
> with e.g cmp(1, "x"), compared to cpython

In CPython 2.x numbers are less than other types, except None is less
than anything. I think PyPy implements the same default behavior.

int 1 and str "x" won't be coerced to a comparable type. str doesn't
implement __coerce__. The number types have rich comparisons defined
with each other and won't coerce with non-number types. For example,
long can coerce an int; float can coerce an int or long (maybe with
loss of precision); and complex can coerce an int, long, or float.
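
A quick sketch of those numeric conversions, using mixed-type arithmetic rather than coerce() so that it also runs on Python 3 (where coerce() no longer exists). The variable names are just for illustration:

```python
# Mixed-type arithmetic converts the narrower operand to the wider type.
int_plus_float = 1 + 2.5       # int is converted to float
int_plus_complex = 1 + 2j      # int is converted to complex
float_plus_complex = 1.5 + 2j  # float is converted to complex

print(type(int_plus_float).__name__,
      type(int_plus_complex).__name__,
      type(float_plus_complex).__name__)

# Converting a large integer to float can lose precision: a double has
# a 53-bit significand, so 2**53 + 1 cannot be represented exactly.
big = 2**53 + 1
round_trips = (int(float(big)) == big)
print(round_trips)
```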

Here's a simple example that returns the default result in PyPy 2.0.2:

    class A(object):
        def __coerce__(self, other):
            return 10, other

    >>>> cmp(A(), 9), cmp(A(), 10), cmp(A(), 11)
    (1, 1, 1)
    >>>> cmp(9, A()), cmp(10, A()), cmp(11, A())
    (-1, -1, -1)

Here's the result in CPython 2.7.5:

    >>> cmp(A(), 9), cmp(A(), 10), cmp(A(), 11)
    (1, 0, -1)
    >>> cmp(9, A()), cmp(10, A()), cmp(11, A())
    (-1, 0, 1)
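
The CPython result can be modelled roughly in pure Python. This is only a simplified sketch of the coercion step, not the interpreter's actual algorithm; three_way, try_coerce, and cmp_via_coerce are hypothetical helpers, and __coerce__ is called here as an ordinary method so the code also runs on Python 3:

```python
class A(object):
    def __coerce__(self, other):
        return 10, other

def three_way(x, y):
    """Return -1, 0, or 1, like Python 2's cmp() for ordered values."""
    return (x > y) - (x < y)

def try_coerce(obj, other):
    """Return obj.__coerce__(other) as a pair, or None if unavailable."""
    method = getattr(type(obj), '__coerce__', None)
    if method is None:
        return None
    pair = method(obj, other)
    if pair is None or pair is NotImplemented:
        return None
    return pair

def cmp_via_coerce(a, b):
    # Unswapped attempt: a.__coerce__(b) yields (a', b').
    pair = try_coerce(a, b)
    if pair is not None:
        x, y = pair
        return three_way(x, y)
    # Swapped attempt: b.__coerce__(a) yields (b', a').
    pair = try_coerce(b, a)
    if pair is not None:
        y, x = pair
        return three_way(x, y)
    return None  # a real implementation would fall back to the default

left = tuple(cmp_via_coerce(A(), n) for n in (9, 10, 11))
right = tuple(cmp_via_coerce(n, A()) for n in (9, 10, 11))
print(left, right)
```

Under these assumptions the sketch reproduces the CPython 2.7.5 tuples shown above: the A instance always coerces to 10, whichever side it is on.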

> # I was hoping this, I mean dis, would shed some light on this
> >>> import dis
> >>> dis.dis("cmp(0, 0)")
>           0 DUP_TOPX        28781
>           3 STORE_SLICE+0
>           4 <48>
>           5 <44>
>           6 SLICE+2
>           7 <48>
>           8 STORE_SLICE+1

First, disassembling a function *call* won't tell you anything about
its implementation. You'd have to disassemble the function, which you
can't do for a built-in function or method (it isn't bytecode). You
can't even disassemble cmp in PyPy.
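
A short sketch of the difference, using len as the built-in since cmp doesn't exist in Python 3; three_way is a hypothetical pure-Python stand-in:

```python
import dis

def three_way(a, b):
    return (a > b) - (a < b)

# A Python function carries a code object; a built-in does not.
has_bytecode = hasattr(three_way, '__code__')
builtin_has_bytecode = hasattr(len, '__code__')

# Asking dis to disassemble a built-in fails, since there is no bytecode.
try:
    dis.dis(len)
    builtin_error = None
except TypeError as exc:
    builtin_error = exc

print(has_bytecode, builtin_has_bytecode, builtin_error)

# Disassembling the function itself (not a call to it) shows its bytecode.
dis.dis(three_way)
```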

Second, you just disassembled a string as bytecode. Let's do that
manually. First get the op names, but remove the 'mp' from 'cmp',
which is treated as an arg to DUP_TOPX:

    >>> ops = [dis.opname[ord(c)] for c in 'c(0, 0)']

Now unpack 'mp' as a little-endian, unsigned short and insert it at index 1:

    >>> import struct
    >>> ops.insert(1, struct.unpack('<H', 'mp')[0])

    >>> ops
    ['DUP_TOPX', 28781, 'STORE_SLICE+0', '<48>', '<44>',
     'SLICE+2', '<48>', 'STORE_SLICE+1']
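
The same walk can be written as a small loop that doesn't depend on the interpreter's opcode table. This is a sketch that assumes the 2.7 bytecode layout (one opcode byte, followed by a 2-byte little-endian argument when the opcode is >= 90, the value of dis.HAVE_ARGUMENT in 2.7) and well-formed input; decode is a hypothetical helper:

```python
import struct

HAVE_ARGUMENT = 90  # assumed: dis.HAVE_ARGUMENT in CPython 2.7

def decode(raw):
    """Split raw bytes into (offset, opcode, arg) triples, 2.7 style."""
    data = bytearray(raw)
    out = []
    i = 0
    while i < len(data):
        op = data[i]
        if op >= HAVE_ARGUMENT:
            # The next two bytes form a little-endian unsigned short.
            arg = struct.unpack('<H', bytes(data[i + 1:i + 3]))[0]
            out.append((i, op, arg))
            i += 3
        else:
            out.append((i, op, None))
            i += 1
    return out

decoded = decode(b'cmp(0, 0)')
for offset, op, arg in decoded:
    print(offset, op, arg)
```

The offsets come out as 0, 3, 4, 5, 6, 7, 8, which lines up with the left-hand column of the dis output quoted above.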