A macro for easier rich comparisons

Hello, Rich comparison functions of many builtin types include a block of boilerplate code with a switch on the "op" argument, like this example from tupleobject: int cmp; PyObject *res; switch (op) { case Py_LT: cmp = vlen < wlen; break; case Py_LE: cmp = vlen <= wlen; break; case Py_EQ: cmp = vlen == wlen; break; case Py_NE: cmp = vlen != wlen; break; case Py_GT: cmp = vlen > wlen; break; case Py_GE: cmp = vlen >= wlen; break; default: return NULL; /* cannot happen */ } if (cmp) res = Py_True; else res = Py_False; Py_INCREF(res); return res; Each time it's implemented slightly differently: sometimes there are two values to compare, sometimes there's a cmp-style 0/1/-1 result to process; invalid "op" values are handled in various ways. Such code is also required in third-party extensions. I propose adding a public macro to ease this. My version takes two C-orderable values and the operation, similar to richcmpfunc: #define Py_RICHCOMPARE(val1, val2, op) ( \ ((op) == Py_EQ) ? PyBool_FromLong((val1) == (val2)) : \ ((op) == Py_NE) ? PyBool_FromLong((val1) != (val2)) : \ ((op) == Py_LT) ? PyBool_FromLong((val1) < (val2)) : \ ((op) == Py_GT) ? PyBool_FromLong((val1) > (val2)) : \ ((op) == Py_LE) ? PyBool_FromLong((val1) <= (val2)) : \ ((op) == Py_GE) ? PyBool_FromLong((val1) >= (val2)) : \ (Py_INCREF(Py_NotImplemented), Py_NotImplemented)) (As for the behavior for unknown op: for most cases the best thing to do is setting an error and returning NULL, but that doesn't fit into a macro. A surprising number of richcmpfunc's in CPython either return NULL without error (e.g. tupleobject), or fall through and return True/False arbitrarily (as in bytearrayobject). Datetime does an assert. I think Py_NotImplemented is a good value to return) I've made a patch with the macro and its use in CPython itself. Read it as examples of usage: I'm not saying that all these modules should start using it immediately (or ever, e.g. for speed reasons), and if they do, many could use some refactoring in other code paths. The patch is here: https://github.com/encukou/cpython/compare/master...richcmp Why I'm interested in this: When porting old code to Python 3, rich comparisons a major cause of added boilerplate – especially when wrapping a C library that exposes either trivially comparable types or "cmp" functions. With this macro, a typical trivial rich comparison function reduces to: static PyObject* mytype_richcmp(PyObject *obj1, PyObject *obj2, int op) { if (mytype_Check(obj1) && mytype_Check(obj2)) { return PY_RICHCOMPARE(mytype_get_data(obj1), mytype_get_data(obj2), op); } Py_RETURN_NOTIMPLEMENTED; } ... or with `PY_RICHCOMPARE(mytype_cmp(obj1, obj2), 0, op)`. (Note that even adding this to 3.5/3.6, would help in porting efforts: the macro itself is be easy to backport, but there are big benefits in having a standard name, known semantics, and official documentation.) Is this a PEP-worthy idea?

On 18.03.2015 11:53, Petr Viktorin wrote:
... I propose adding a public macro to ease this. My version takes two C-orderable values and the operation, similar to richcmpfunc:
#define Py_RICHCOMPARE(val1, val2, op) ( \ ((op) == Py_EQ) ? PyBool_FromLong((val1) == (val2)) : \ ((op) == Py_NE) ? PyBool_FromLong((val1) != (val2)) : \ ((op) == Py_LT) ? PyBool_FromLong((val1) < (val2)) : \ ((op) == Py_GT) ? PyBool_FromLong((val1) > (val2)) : \ ((op) == Py_LE) ? PyBool_FromLong((val1) <= (val2)) : \ ((op) == Py_GE) ? PyBool_FromLong((val1) >= (val2)) : \ (Py_INCREF(Py_NotImplemented), Py_NotImplemented))
(As for the behavior for unknown op: for most cases the best thing to do is setting an error and returning NULL, but that doesn't fit into a macro. A surprising number of richcmpfunc's in CPython either return NULL without error (e.g. tupleobject), or fall through and return True/False arbitrarily (as in bytearrayobject). Datetime does an assert. I think Py_NotImplemented is a good value to return) ... Is this a PEP-worthy idea?
This is a great idea, no need for a PEP :-) Please submit the patch on the bug tracker and also include a patch for the C API documentation. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 18 2015)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2015-03-12: Released mxODBC 3.3.2 ... http://egenix.com/go71 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On Wed, Mar 18, 2015 at 12:01 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 18.03.2015 11:53, Petr Viktorin wrote:
... I propose adding a public macro to ease this. My version takes two C-orderable values and the operation, similar to richcmpfunc:
#define Py_RICHCOMPARE(val1, val2, op) ( \ ((op) == Py_EQ) ? PyBool_FromLong((val1) == (val2)) : \ ((op) == Py_NE) ? PyBool_FromLong((val1) != (val2)) : \ ((op) == Py_LT) ? PyBool_FromLong((val1) < (val2)) : \ ((op) == Py_GT) ? PyBool_FromLong((val1) > (val2)) : \ ((op) == Py_LE) ? PyBool_FromLong((val1) <= (val2)) : \ ((op) == Py_GE) ? PyBool_FromLong((val1) >= (val2)) : \ (Py_INCREF(Py_NotImplemented), Py_NotImplemented))
(As for the behavior for unknown op: for most cases the best thing to do is setting an error and returning NULL, but that doesn't fit into a macro. A surprising number of richcmpfunc's in CPython either return NULL without error (e.g. tupleobject), or fall through and return True/False arbitrarily (as in bytearrayobject). Datetime does an assert. I think Py_NotImplemented is a good value to return) ... Is this a PEP-worthy idea?
This is a great idea, no need for a PEP :-) Please submit the patch on the bug tracker and also include a patch for the C API documentation.
Issue with patches is here: http://bugs.python.org/issue23699

On Wed, Mar 18, 2015 at 2:48 PM, Petr Viktorin <encukou@gmail.com> wrote:
On Wed, Mar 18, 2015 at 12:01 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 18.03.2015 11:53, Petr Viktorin wrote:
... I propose adding a public macro to ease this. My version takes two C-orderable values and the operation, similar to richcmpfunc:
#define Py_RICHCOMPARE(val1, val2, op) ( \ ((op) == Py_EQ) ? PyBool_FromLong((val1) == (val2)) : \ ((op) == Py_NE) ? PyBool_FromLong((val1) != (val2)) : \ ((op) == Py_LT) ? PyBool_FromLong((val1) < (val2)) : \ ((op) == Py_GT) ? PyBool_FromLong((val1) > (val2)) : \ ((op) == Py_LE) ? PyBool_FromLong((val1) <= (val2)) : \ ((op) == Py_GE) ? PyBool_FromLong((val1) >= (val2)) : \ (Py_INCREF(Py_NotImplemented), Py_NotImplemented))
(As for the behavior for unknown op: for most cases the best thing to do is setting an error and returning NULL, but that doesn't fit into a macro. A surprising number of richcmpfunc's in CPython either return NULL without error (e.g. tupleobject), or fall through and return True/False arbitrarily (as in bytearrayobject). Datetime does an assert. I think Py_NotImplemented is a good value to return) ... Is this a PEP-worthy idea?
This is a great idea, no need for a PEP :-) Please submit the patch on the bug tracker and also include a patch for the C API documentation.
Issue with patches is here: http://bugs.python.org/issue23699
Notes from the patch review: it is not so clear-cut after all. A sequence of comparisons is less efficient than a switch statement. Same with PyBool_FromLong vs. Py_RETURN_* Also, it might be too complex for a macro. These points could be solved by making it a function instead. Another issue raised is that the use cases are too limited limited for this to be a part of the Python API. I don't agree with that; I think boilerplate-reducing features are even if their usage is limited (see Py_RETURN_NOTIMPLEMENTED or PyModule_AddIntConstant).

On Mar 20, 2015, at 7:24 AM, Petr Viktorin <encukou@gmail.com> wrote:
A sequence of comparisons is less efficient than a switch statement. Same with PyBool_FromLong vs. Py_RETURN_* Also, it might be too complex for a macro. These points could be solved by making it a function instead.
Or by just making the macro a statement instead of a function; then the code can be exactly the same code you're abstracting out, and instead of using it like this: return PY_RICHCOMPARE(mytype_get_data(obj1), mytype_get_data(obj2), op); ... you use it like this: PY_RETURNRICHCOMPARE(mytype_get_data(obj1), mytype_get_data(obj2), op);

On Fri, Mar 20, 2015 at 11:19 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Mar 20, 2015, at 7:24 AM, Petr Viktorin <encukou@gmail.com> wrote:
A sequence of comparisons is less efficient than a switch statement. Same with PyBool_FromLong vs. Py_RETURN_* Also, it might be too complex for a macro. These points could be solved by making it a function instead.
Or by just making the macro a statement instead of a function; then the code can be exactly the same code you're abstracting out, and instead of using it like this:
return PY_RICHCOMPARE(mytype_get_data(obj1), mytype_get_data(obj2), op);
... you use it like this:
PY_RETURNRICHCOMPARE(mytype_get_data(obj1), mytype_get_data(obj2), op);
Thanks Andrew. This makes it harder to do cleanup after the comparison (e.g. jump to a common block of Py_DECREFs, or do PyFPE_END_PROTECT in float code). But those should be rare enough, or can be easily worked around, so a "return" macro is still useful.
participants (3)
-
Andrew Barnert
-
M.-A. Lemburg
-
Petr Viktorin