[Python-Dev] SF patch 864863: Bisect C implementation

Tim Peters tim.one at comcast.net
Thu Jan 1 19:01:53 EST 2004


[Raymond]
> I would be happy to bring back the python version and make it coexist
> with the C version.

string.py was the first poster child for this, but it's changed a lot over
the years.  Originally, string.py was pure Python.  As the years went by, it
*remained* pure Python, but more and more of the functions got overridden, at
the bottom of the file, by importing C functions with the same names.
That's kinda lost now, because the module functions turned into string
methods, and most of the Python code that remains in string.py is trivial
delegation to the string methods now.  Several years ago, it all looked a
lot like maketrans() looks today (a complete implementation in Python, which
is then thrown away at the bottom of the file).
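
For readers who haven't seen that layout, here is a minimal sketch of the
pattern, using bisect_right as the example.  It is an illustration only, not
the actual string.py or bisect.py source; the Python body is written from
scratch, and the only thing it relies on is that CPython's _bisect extension
module exports a function named bisect_right.

```python
# Hypothetical module layout: full pure-Python implementation first,
# then an optional override by the C version at the bottom of the file.

def bisect_right(a, x, lo=0, hi=None):
    """Return the index at which to insert x in sorted list a,
    to the right of any existing entries equal to x."""
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid          # x belongs in the left half
        else:
            lo = mid + 1      # x belongs in the right half
    return lo

# Bottom of the file: if the C accelerator is importable, its function
# rebinds the name and the Python definition above is discarded.
try:
    from _bisect import bisect_right
except ImportError:
    pass  # no C module available; keep the pure-Python version
```

Either way, callers see one module and one name, and the Python text stays
in the file where Python users can read it.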

I don't really care whether the Python implementations get tested.  And I
definitely don't want more messes like pickle+cPickle and
StringIO+cStringIO -- they're *always* out of synch in some way, and I doubt
that will ever be repaired.

Putting them under Demo doesn't do any good, because nobody maintains that
directory; we don't even ship that directory in the Windows installer, so
putting stuff there renders it invisible to most of Python's users.

> IMO, it was a programming pearl unto itself and efforts were already
> made to preserve the extensive module docstring and make the
> code match the original line for line with the variable names and
> logic intact.

Most Python users don't know C, and will never see the C implementation.

> ...
> Also, with the PyPy project, there is a chance that cleanly coded pure
> python modules will survive the test of time longer than their C
> counterparts.  In the itertools module, I documented pure python
> equivalents to make the documentation more specific, to enhance
> understanding, and to make it easier to backport.  For the most
> part, that effort was successful.

I love the Python-workalike docs for itertools!  That's a very helpful form
of documentation, building on Python knowledge to supply a more precise
explanation than is possible with informal English.  OTOH, the itertools
functions have (unlike, e.g., pickle) pretty simple, well-defined jobs.
Needing to present half a page (or more) of Python code instead wouldn't
make for pleasant docs.  For that reason, I was happy to leave datetime.py
to die in the sandbox.
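
To illustrate the kind of Python-workalike documentation meant here, this is
a short sketch in the style of the itertools docs.  The name
takewhile_sketch is made up for this example, and the body is an
approximation of what itertools.takewhile does rather than a quotation of
its documentation.

```python
def takewhile_sketch(predicate, iterable):
    """Yield items from iterable for as long as predicate(item) is true.

    Roughly what itertools.takewhile does; stops at the first item
    for which the predicate is false and yields nothing after it.
    """
    for item in iterable:
        if predicate(item):
            yield item
        else:
            break

# Example use: take the leading run of values below 3.
leading = list(takewhile_sketch(lambda n: n < 3, [1, 2, 3, 1]))
```

A handful of lines like that pin down the behavior (including what happens
to later items that would have passed the test) more precisely than a
paragraph of English would.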

> The only downside is the double maintenance problem of keeping the two
> in sync.  Given the stability of bisect, that is not much of an issue.

The double-maintenance problem is severe for non-trivial code.  The current
string module, and bisect module, functions are trivial.  heapq is somewhere
in the middle, more like the string module used to be.  I'd be perfectly
happy if heapq.py were restored just as it was, and then a new section added
at the bottom to replace it all with C functions.  I don't think putting the
Python implementations of heapq functions in the docs would be satisfying,
because the details of their implementations don't shed light on the
*purpose* of the functions.  In contrast, the Python implementations of most
itertools functions define the purpose of those functions more succinctly
than English.  The implementations of heapq functions are much more about a
specific way of implementing a full binary tree than they are about heap
operations.
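
As a rough illustration of that last point, here is a sketch of a push
operation on a list-based min-heap.  It is not the heapq.py source; the name
heappush_sketch and the swap-based loop are choices made for this example.
Note how nearly every line is about index arithmetic on the array layout of
the tree, and none of it says what a heap is for.

```python
def heappush_sketch(heap, item):
    """Add item to a list-based min-heap, keeping heap[0] the smallest.

    The tree is stored implicitly: the parent of the element at
    index pos lives at index (pos - 1) // 2.
    """
    heap.append(item)
    pos = len(heap) - 1
    # Move the new item up toward the root until its parent is not larger.
    while pos > 0:
        parent = (pos - 1) // 2
        if heap[pos] < heap[parent]:
            heap[pos], heap[parent] = heap[parent], heap[pos]
            pos = parent
        else:
            break

# Example use: build a heap from a few unsorted values.
demo_heap = []
for value in [5, 3, 8, 1, 9, 2]:
    heappush_sketch(demo_heap, value)
```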



