[Python-Dev] Should standard library modules optimize for CPython?
Steven D'Aprano
steve at pearwood.info
Sun Jun 1 10:11:39 CEST 2014
I think I know the answer to this, but I'm going to ask it anyway...
I know that there is a general policy of trying to write code in the
standard library that does not disadvantage other implementations. How
far does that go the other way? Should the standard library accept
slower code because it will be much faster in other implementations?
Briefly, I have a choice of algorithm for the median function in the
statistics module. If I target CPython, I will use a naive but simple
O(N log N) implementation based on sorting the list and returning the
middle item. (That's what the module currently does.) But if I target
PyPy, I will use an O(N) algorithm which knocks the socks off the naive
version even for small lists. In CPython that's typically 2-5 times
slower; in PyPy it's typically 3-8 times faster, and the bigger the data
set, the greater the advantage.
For the specific details, see http://bugs.python.org/issue21592
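To make the trade-off concrete, here is a rough sketch of the two approaches: the current sort-based median, and an average-case O(N) median built on quickselect. This is only an illustration of the general technique, not the actual patch from the tracker issue; the function names and details here are my own.

```python
import random

def median_sorted(data):
    """Naive O(N log N) median: sort and take the middle item(s)."""
    data = sorted(data)
    n = len(data)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2

def _select(data, k):
    """Return the k-th smallest element (0-based), average O(N),
    by partitioning around a random pivot and recursing one side."""
    while True:
        pivot = random.choice(data)
        lows = [x for x in data if x < pivot]
        pivots = [x for x in data if x == pivot]
        if k < len(lows):
            data = lows
        elif k < len(lows) + len(pivots):
            return pivot
        else:
            k -= len(lows) + len(pivots)
            data = [x for x in data if x > pivot]

def median_select(data):
    """Average-case O(N) median using selection instead of a full sort."""
    data = list(data)
    n = len(data)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        return _select(data, mid)
    return (_select(data, mid - 1) + _select(data, mid)) / 2
```

The selection version avoids sorting the whole list, but the Python-level partitioning loop is exactly the kind of code that CPython interprets slowly and PyPy's JIT compiles well, whereas `sorted()` runs in C under CPython - which is the crux of the question above.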
My feeling is that the CPython standard library should be written for
CPython, that is, it should stick to the current naive implementation of
median, and if PyPy wants to speed the function up, they can provide
their own version of the module. I should *not* complicate the
implementation by trying to detect which Python the code is running
under and changing algorithms accordingly. However, I should put a
comment in the module pointing at the tracker issue. Does this sound
right to others?
Thanks,
--
Steve