[Python-3000] _heapq.c, etc. (was Re: Heaptypes)
Guido van Rossum
guido at python.org
Fri Jul 20 16:44:09 CEST 2007
On 7/20/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Guido van Rossum" <guido at python.org> wrote:
> > On 7/19/07, Guido van Rossum <guido at python.org> wrote:
> > > How about instead you help with fixing pickling of datetime objects?
> > > This broke when I fixed test_pickle. Rolling back your changes to
> > > datetime pickling didn't seem to help.
> >
> > Never mind; this was shallow -- cPickle doesn't pickle bytes
> > correctly. I've decided to get rid of cPickle -- someone is writing a
> > replacement for the summer of code anyway. The new approach will be
> > that you always write "import pickle" and this transparently attempts
> > to use the C accelerator if it can be imported, like heapq.py and
> > _heapq.c.
>
> On a related note, since I had been supporting only Python 2.3 for quite
> a while, I didn't notice the fact that Python's _heapq.c (in 2.4 at
> least, I haven't tested on 2.5) only supported lists as containers, and
> not a list-like object with all methods that heapq calls (which was an
> issue for a pure-Python pair heap implementation I posted last December
> or so).
>
> What made it really annoying is that there was no way to tell the heapq
> module not to load the C version so that I could use a generic container.
> I ended up just commenting out the C module heapq import and moving on.
>
> I don't know if we want to make it possible to disable the loading of
> certain C modules that *don't* offer all of the same features, or if we
> want to limit the Python versions to what the C versions support, or
> even if we want to expand the C versions to handle all cases that the
> Python versions support. While the pickle/cPickle, StringIO/cStringIO,
> etc., naming can be a bit annoying, it does give me the choice whether I
> want it to be fast or flexible.

This was an example of a performance improvement that changed the
specs of an API in an incompatible way. Breaking your code was an
unintended side effect of the speedup.

We're going to do a few more of these in Py3k, and this time breaking
the specs is the name of the game. I think going forward (post 3.0) we
should be more careful to write specs that can easily be optimized
without breaking existing usage, or to write speedups that can handle
all the argument types that the original code supported.
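
As a rough illustration of a speedup that still handles every argument
type the original code supported, here is a minimal sketch. It assumes
the accelerated path only accepts real lists, as described for _heapq.c
earlier in this thread. The names generic_heappush, heappush, and
ListLike are hypothetical and are not part of heapq.

```python
import heapq


def generic_heappush(heap, item):
    """Pure-Python push; needs only append, len() and item get/set."""
    heap.append(item)
    pos = len(heap) - 1
    # Sift the new item up until its parent is no larger than it.
    while pos > 0:
        parent = (pos - 1) >> 1
        if heap[pos] < heap[parent]:
            heap[pos], heap[parent] = heap[parent], heap[pos]
            pos = parent
        else:
            break


def heappush(heap, item):
    """Use the accelerated routine for real lists, the generic one otherwise."""
    if type(heap) is list:
        heapq.heappush(heap, item)
    else:
        generic_heappush(heap, item)


class ListLike:
    """Hypothetical container that is list-like but not a list."""

    def __init__(self):
        self._items = []

    def append(self, value):
        self._items.append(value)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value
```

With this shape the caller never chooses between fast and flexible: a
plain list gets the fast path, and a ListLike instance still works.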

I definitely *don't* want to continue the old habit of having a slow
and a fast module with different names; the experience, especially with
cPickle and cStringIO, is that everyone believes their code is
performance-critical and hence uses the C version if it exists,
thereby repeating the same idiom over and over.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)