[Python-ideas] INSANE FLOAT PERFORMANCE!!!

Elliot Gorokhovsky elliot.gorokhovsky at gmail.com
Wed Oct 12 17:19:13 EDT 2016


On Wed, Oct 12, 2016 at 9:20 AM Tim Peters <tim.peters at gmail.com> wrote:

> > What I'm *not* quite clear on is why Python 3's change to reject
> > comparisons between unrelated types makes this optimisation possible.
>
> It doesn't.  It would also apply in Python 2.  I simply expect the
> optimization will pay off more frequently in Python 3 code.  For
> example, in Python 2 I used to create lists with objects of wildly
> mixed types, and sort them merely to bring objects of the same type
> next to each other.  Things "like that" don't work at all in Python 3.
>
>
> > Surely you have to check either way? It's not that it's a particularly
> > important question - if the optimisation works, it's not that big a
> > deal what triggered the insight. It's just that I'm not sure if
> > there's some other point that I've not properly understood.
>

Yup. Actually, the initial version of this work was with Python 2. What
happened was this: I had posted earlier something along the lines of "hey
everybody let's radix sort strings instead of merge sort because that will
be more fun ok". And everyone wrote me back "no please don't are you
kidding". Tim Peters wrote back "try it but just fyi it's not gonna work".
So I set off to try it. I had never used the C API before, but luckily I
found some Python 2 documentation that gives an example of subclassing
list, so I was able to mostly just copy-paste to get a working list
extension module. I then copied over the implementation of listsort.

My first question was how expensive python compares are vs C compares. And
since python 2 has PyString_AS_STRING, which just gives you a char* pointer
to a C string, I went in and replaced PyObject_RichCompareBool with strcmp
and did a simple benchmark. And I was just totally blown away; it turns out
you get something like a 40-50% improvement (at least on my simple
benchmark).

So that was the motivation for all this. Actually, if I wrote this for
python 2, I might be able to get even better numbers (at least for
strings), since we can't use strcmp in python 3. (Actually, I've heard
UTF-8 strings are strcmp-able, so maybe if we go through and verify all the
strings are UTF-8 we can strcmp them? I don't know enough about how
PyUnicode stuff works to do this safely). My string special case currently
just bypasses the typechecks and goes to unicode_compare(), which is still
wayyy overkill for the common case of ASCII or Latin-1 strings, since it
uses a for loop to go through and check characters, and strcmp uses
compiler magic to do it in like, negative time or something. I even
PyUnicode_READY the strings before comparing; I'm not sure if that's really
necessary, but that's how PyUnicode_Compare does it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20161012/cef5f8b0/attachment.html>


More information about the Python-ideas mailing list