[Python-Dev] DRAFT: python-dev summary for 2006-05-16 to 2006-05-31

Steven Bethard steven.bethard at gmail.com
Thu Jun 15 19:28:28 CEST 2006

Ok, for the first time in a few months, you're getting this summary
before the next one is due.  Woo-hoo!  (Yes, I know I'm not even a day
ahead.  Let me enjoy my temporary victory.) =)

Here's the draft summary for the second half of May.  Let me know what
comments/corrections you have.  Thanks!


QOTF: Quote of the Fortnight

Martin v. Löwis on what kind of questions are appropriate for python-dev:

    ... [python-dev] is the list where you say "I want to help", not
so much "I need your help".

Contributing thread:

- `Segmentation fault of Python if build on Solaris 9 or 10 with Sun
Studio 11 <http://mail.python.org/pipermail/python-dev/2006-May/065493.html>`__

Python 2.5 schedule

Python 2.5 is moving steadily towards its next release.  See `PEP
356`_ for more details and the full schedule.  You may start to see a
few warnings at import time if you've named non-package directories
with the same names as your modules/packages.  Python-dev suggests
renaming these directories -- though the warnings won't give you any
real trouble in Python 2.5, there's a chance that a future version of
Python will drop the need for __init__.py.

.. _PEP 356: http://www.python.org/dev/peps/pep-0356/

Contributing thread:

- `2.5 schedule
- `warnings about missing __init__.py in toplevel directories

Restructured library reference

Thanks to work by A.M. Kuchling and Michael Spencer, the organization
of the `development Library Reference documentation`_ is much improved
over the `old one`_.  Thanks for your hard work, guys!

.. _development Library Reference documentation:
.. _old one: http://docs.python.org/lib/lib.html

Contributing thread:

- `[Python-3000] stdlib reorganization

Need for Speed Sprint results

The results of the `Need for Speed Sprint`_ are all posted on the
wiki.  In particular, you should check out a number of `successes`_ they
had in speeding up various parts of Python including function calls,
string and Unicode operations, and string<->integer conversions.

.. _Need for Speed Sprint: http://wiki.python.org/moin/NeedForSpeed/
.. _successes: http://wiki.python.org/moin/NeedForSpeed/Successes

Contributing threads:

- `[Python-checkins] r46043 - peps/trunk/pep-0356.txt
- `Need for Speed Sprint status

Python old-timer memories

Guido's been collecting `memories of old-timers`_ who have been using
Python for 10 years or more.  Be sure to check 'em out and add your own!

.. _memories of old-timers:

Contributing thread:

- `Looking for Memories of Python Old-Timers


Struct module inconsistencies

Changes to the struct module to do proper range checking resulted in a
few bugs showing up where the stdlib depended on the old, undocumented
behavior.  As a compromise, Bob Ippolito added code to do the proper
range checking and issue DeprecationWarnings, and then made sure that
all the struct results were calculated with appropriate bit masking.
The warnings are expected to become errors in Python 2.6 or 2.7.

Bob also updated the struct module to return ints instead of longs
whenever possible, even for the format codes that had previously
guaranteed longs (I, L, q and Q).
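
As a rough illustration of what "proper range checking" means in practice, here is a minimal sketch run against a current Python 3, where the warnings have long since become errors.  The format codes are real; the values and variable names are just picked for the example, and this is not Bob's patch.

```python
import struct

# "B" is an unsigned byte (0..255), so 256 is out of range.
# Current Pythons raise struct.error here rather than masking silently.
try:
    struct.pack("B", 256)
    rejected = False
except struct.error:
    rejected = True

# Code that relied on the old wrap-around can mask explicitly instead.
packed = struct.pack("B", 256 & 0xFF)

# Unpacking an unsigned 32-bit value gives back an ordinary integer.
(value,) = struct.unpack("<I", b"\xff\xff\xff\xff")
print(rejected, packed, value)
```

Masking at the call site keeps the intent visible, which is roughly what the stdlib fixes had to do.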

Contributing threads:

- `Returning int instead of long from struct when possible for
performance <http://mail.python.org/pipermail/python-dev/2006-May/065199.html>`__
- `test_gzip/test_tarfile failure om AMD64
- `Converting crc32 functions to use unsigned
- `test_struct failure on 64 bit platforms

Using epoll for the select module

Ross Cohen implemented a `drop-in replacement for select.poll`_ using
Linux's epoll (a more efficient I/O notification system than poll).  The
select interface is already much closer to the epoll API than the
poll API, and people liked the idea of using epoll silently when
available.  Ross promised to look into merging his code with the
current select module (though it wasn't clear whether or not he would
do this using ctypes instead of an extension module as some people had
suggested).

.. _drop-in replacement for select.poll: http://sourceforge.net/projects/pyepoll
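
For a feel of what "using epoll silently when available" looks like from the caller's side, here is a small sketch using the ``selectors`` module from much later versions of the standard library.  This is not Ross's code, and the socket pair and variable names are only there for the demonstration; ``DefaultSelector`` simply picks the most efficient mechanism the platform offers.

```python
import selectors
import socket

# DefaultSelector resolves to an epoll-based selector on Linux and
# falls back to other mechanisms (kqueue, poll, select) elsewhere.
sel = selectors.DefaultSelector()
left, right = socket.socketpair()
try:
    sel.register(right, selectors.EVENT_READ)
    left.sendall(b"ping")
    # Wait up to one second for the registered socket to become readable.
    events = sel.select(timeout=1.0)
    ready = [key.fileobj for key, mask in events]
    data = right.recv(4)
finally:
    sel.close()
    left.close()
    right.close()

print(type(sel).__name__, data)
```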

Contributing thread:

- `epoll implementation

Negatives and sequences

Fredrik Lundh pointed out that using a negative sign and multiplying
by -1 do not always produce the same behavior, e.g.::

    >>> -1 * (1, 2, 3)
    ()
    >>> -(1, 2, 3)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: bad operand type for unary -

Though no one seemed particularly concerned about the discrepancy, the
thread did spend some time discussing the behavior of sequences
multiplied by negatives.  A number of folks were pushing for this to
become an error until Uncle Timmy showed some use-cases like::

    # right-justify to 80 columns, padding with spaces
    s = " " * (80 - len(s)) + s

The rest of the thread turned into a (mostly humorous) competition for
the best way to incomprehensibly alter sequence multiplication.

Contributing thread:

- `A Horrible Inconsistency


Georg Brandl asked about removing METH_OLDARGS which has been
deprecated since 2.2.  Unfortunately, there are still a bunch of uses
of it in Modules, and it's still the default if no flag is specified.
Georg promised to work on removing the ones in Python core, and there
was some discussion of trying to mark the API as deprecated.  Issuing
a DeprecationWarning seemed too heavy-handed, so Georg looked into
generating C compile time warnings by marking PyArg_Parse as

Contributing thread:


Propagating exceptions in dict lookup

Armin Rigo offered up `a patch to stop dict lookup from hiding
exceptions`_ in user-defined __eq__ methods.  The PyDict_GetItem() API
gives no way of propagating such an exception, so previously the
exceptions were just swallowed.  Armin moved the exception-swallowing
part out of lookdict() and into PyDict_GetItem() so that even though
PyDict_GetItem() will still swallow the exceptions, all other ways of
invoking dict lookup (e.g. ``value = d[key]`` in Python code) will now
propagate the exception properly.  Scott Dial brought up an odd corner
case where the old behavior would cause insertion of a value into the
dict because the exception was assumed to indicate a new key, but
people didn't seem too worried about breaking this behavior.
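
The Python-level effect can be seen with a small sketch like the following.  ``CollidingKey`` is a made-up class for the demonstration: every instance hashes alike, so a lookup with a second instance has to fall back on ``__eq__``, which raises.

```python
class CollidingKey:
    """Hypothetical key type: all instances collide and refuse comparison."""

    def __hash__(self):
        return 1

    def __eq__(self, other):
        raise RuntimeError("comparison failed")


# Inserting into an empty dict needs no comparison, so this succeeds.
d = {CollidingKey(): "stored"}

try:
    d[CollidingKey()]  # same hash, different object, so __eq__ is called
    outcome = "found"
except KeyError:
    outcome = "swallowed (reported as a missing key)"
except RuntimeError as exc:
    outcome = "propagated: %s" % exc

print(outcome)
```

On a current Python the ``RuntimeError`` reaches the caller instead of being turned into a ``KeyError``.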

.. _a patch to stop dict lookup from hiding exceptions:

Contributing thread:

- `Let's stop eating exceptions in dict lookup

String/unicode inconsistencies

After the Need for Speed Sprint unified some of the string and unicode
code, some tests started failing where string and unicode objects had
different behavior, e.g. ``'abc'.find('', 100)`` used to return -1 but
started returning 100.  There was some discussion about what was the
right behavior here and Fredrik Lundh promised to implement whatever
was decided.
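
For reference, here is how the empty-substring case behaves on a current Python 3, where str and bytes agree.  This only records today's behavior; it says nothing about which way the thread leaned at the time.

```python
# A start index past the end of the string never matches, even for "".
past_end = "abc".find("", 100)

# A start index exactly at the end still finds the empty string there.
at_end = "abc".find("", 3)

# bytes objects follow the same rule as str.
bytes_past_end = b"abc".find(b"", 100)

print(past_end, at_end, bytes_past_end)
```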

Contributing thread:

- `replace on empty strings
- `Let's stop eating exceptions in dict lookup

Allowing inline "if" with for-loops

Heiko Wundram presented a brief PEP suggesting that if-statements in
the first line of a for-loop could be optionally inlined, so for
example instead of::

    for node in tree:
        if node.haschildren():
            <do something with node>

you could write::

    for node in tree if node.haschildren():
        <do something with node>

Most people seemed to feel that saving a colon character and a few
indents was not a huge gain.  Some also worried that this change would
encourage code that was harder to read, particularly if the for-clause
or if-clause got long.  Guido rejected it, and Heiko promised to
submit it as a full PEP so that the rejection would be properly

Contributing thread:

- `PEP-xxx: Unification of for statement and list-comp syntax

Splitting strings with embedded quoted strings

Dave Cinege proposed augmenting str.split to allow a non-split
delimiter to be specified so that splits would not happen within
particular substrings, e.g.::

    >>> 'Here is "a phrase that should not get split"'.split(None,-1,'"')
    ['Here', 'is', 'a phrase that should not get split']

Most people were opposed to complicating the API of str.split, but
even as a separate method, people didn't seem to think that the need
was that great, particularly since the most common needs for such
functionality were already covered by ``shlex.split()`` and the csv module.
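
As a quick sketch of the existing alternatives, both of the following handle the example string from the proposal.  The csv variant assumes a single space as the delimiter, which is a choice made for this example only.

```python
import csv
import shlex

text = 'Here is "a phrase that should not get split"'

# shlex.split follows shell-like quoting rules.
shlex_parts = shlex.split(text)

# csv.reader takes an iterable of lines; here a space is the delimiter
# and the default double-quote character protects the phrase.
csv_parts = next(csv.reader([text], delimiter=" "))

print(shlex_parts)
print(csv_parts)
```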

Contributing thread:

- `New string method - splitquoted

Deadlocks with fork() and multithreading

Rotem Yaari ran into some deadlocks using the subprocess module in a
multithreaded environment.  If a thread other than the thread calling
fork() is holding the import lock, then since POSIX only replicates the
calling thread, the new child process ends up with an import lock that
is locked by a no-longer-existing thread.  Ronald Oussoren offered up
a repeatable test case, and a number of strategies for solving the
problem were discussed, including releasing the import lock during a
fork and throwing away the old import lock after a fork.

Contributing threads:

- `pthreads, fork, import, and execvp
- `pthreads, fork, import, and execvp


Fredrik Lundh asked about the status of string.partition, and there
was a brief discussion about whether or not to return real string
objects or lazy objects that would only make a copy if the original
string disappeared.  Guido opted for the simpler approach using real
string objects, and Fredrik implemented it.
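
For anyone who hasn't tried it yet, a short sketch of the method as it behaves in released Pythons; the sample strings are arbitrary.

```python
# partition always returns a 3-tuple: (head, separator, tail).
head, sep, tail = "key=value=more".partition("=")

# When the separator is missing, the whole string lands in the first
# slot and the other two are empty strings.
missing = "no separator here".partition("=")

# rpartition splits at the last occurrence instead of the first.
from_right = "key=value=more".rpartition("=")

print((head, sep, tail), missing, from_right)
```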

Contributing threads:

- `whatever happened to string.partition ?
- `[Python-checkins] whatever happened to string.partition ?
- `partition() variants

Speeding up parsing of longs

Runar Petursson asked about speeding up parsing of longs from a slice
of a string, e.g. ``long(mystring[x:y])``.  He initially proposed
adding start= and end= keyword arguments to the long constructor, but
that seemed like a slippery slope where every function that took a
string would eventually need the same arguments.  Tim Peters pointed
out that a buffer object would solve the problem if
``PyLong_FromString()`` supported buffer's "offset & length" view of
the world instead of only seeing the start index.  While adding a
``PyLong_FromStringAndSize()`` would solve this particular problem,
all the internal parsing routines have a similar problem -- none of
them support a slice-based API.

As an alternate approach, Martin Blais was working on a "hot" buffer
class, based on the design of the Java NIO ByteBuffer class, which
would work without an intermediate string creation or memory copy.

Contributing thread:

- `Cost-Free Slice into FromString constructors--Long

Speeding up try/except

After Steve Holden noticed a ~60% slowdown between Python 2.4.3 and
the Python trunk on the pybench try/except test, Sean Reifschneider
and Richard Jones looked into the problem and found that the slowdown
was due to creation of Exception objects.  Exceptions had been
converted to new-style objects by using PyType_New() as the
constructor and then adding magic methods with PyMethodDef().  By
changing BaseException to use a PyType_Type definition and the proper
C struct to associate methods with the class, Sean and Richard Jones
were able to speed up try/except to 30% faster than it was in Python

Contributing thread:

- `2.5a2 try/except slow-down: Convert to type?

Supporting zlib's inflateCopy

Guido noticed that the zlib module was failing with libz 1.1.4.  Even
though Python has its own copy of libz 1.2.3, it tries to use the
system libraries on Unix, so when the zlib module's compress and
decompress objects were updated with a copy() method (using libz's
inflateCopy() function), this broke compatibility for any system that
used a zlib older than 1.2.0.  Chris AtLee provided a `patch
conditionalizing the addition of the copy() method`_ on the version of
libz available.
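
To show what the copy() method is for, here is a minimal sketch that compresses a shared prefix once and then finishes two different streams from it.  The data and names are invented for the example; code that must run against a very old libz could guard the call with ``hasattr(obj, "copy")``.

```python
import zlib

prefix = b"shared header " * 10

# Compress the common prefix once.
base = zlib.compressobj()
head = base.compress(prefix)

# copy() duplicates the compressor's internal state at this point.
branch = base.copy()

# Finish each stream independently with a different ending.
stream_a = head + base.compress(b"ending A") + base.flush()
stream_b = head + branch.compress(b"ending B") + branch.flush()

print(len(stream_a), len(stream_b))
```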

.. _patch conditionalizing the addition of the copy() method:

Contributing thread:

- `zlib module doesn't build - inflateCopy() not found

Potential ssize_t values

Neal Norwitz looked through the Python codebase for longs that should
potentially be declared as ssize_t instead.  There was a brief
discussion about changing int's ob_ival to ssize_t, but this would
have been an enormous change this late in the release cycle and would
have slowed down operations on short ints.  Hash values were
also discussed, but since there's no natural correlation between a
hash value and the size of a collection, most people thought it was
unnecessary for the moment.  Martin v. Löwis suggested upping the
recursion limit to ssize_t, and formalizing a 16-bit and 31-bit limit
on line and column numbers, respectively.

Contributing threads:

- `ssize_t question: longs in header files
- `ssize_t: ints in header files


Torsten Marek proposed adding a windowing function to itertools like::

    >>> list(iwindow(range(0,5), 3))
    [[0, 1, 2], [1, 2, 3], [2, 3, 4]]

Raymond Hettinger pointed him to a `previous discussion`_ on
comp.lang.python where he had explained that ``collections.deque()``
was usually a better solution.  Nick Coghlan suggested putting the
deque example in the collections module docs, but the thread trailed
off after that.
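
One way the deque-based approach can be sketched is shown below.  This is an illustration only, not Raymond's comp.lang.python code nor Torsten's proposed implementation; it borrows the name ``iwindow`` from the proposal and relies on deque's ``maxlen`` argument, which arrived in a later Python.

```python
from collections import deque
from itertools import islice


def iwindow(iterable, size):
    """Yield successive overlapping windows of `size` items as lists."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(iterable)
    # Fill the first window; maxlen makes later appends drop the oldest item.
    window = deque(islice(it, size), maxlen=size)
    if len(window) == size:
        yield list(window)
    for item in it:
        window.append(item)
        yield list(window)


windows = list(iwindow(range(0, 5), 3))
print(windows)
```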

.. _previous discussion:

Contributing thread:

- `Proposal for a new itertools function: iwindow

Problems with buildbots and files left around

Neal Norwitz discovered some problems with the buildbots after finding
a few tests that didn't properly clean up, leaving a few files around
afterwards.  Martin v. Löwis explained that forcing a build on a
non-existing branch will remove the build tree (which should clean up
a lot of the files) and also that "make distclean" could be added to
the clean step of Makefile.pre.in and master.cfg.

Contributing thread:

- `fixing buildbots

PEP 3101: Advanced String Formatting

The discussion of `PEP 3101`_'s string formatting continued this
fortnight.  Guido generally liked the proposal, though he suggested
following .NET's quoting syntax of doubling the braces, and maybe
allowing all formatting errors to pass silently so that rarely raised
exceptions don't hide themselves if their format string has an error.
The discussion was then moved to the `python-3000 list`_.
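
For the curious, the brace-doubling idea looks like this in ``str.format`` as it eventually shipped.  Note that the released method does raise on a missing argument, so this sketch shows the quoting syntax, not the pass-silently suggestion.

```python
# Doubled braces are literal braces; single braces mark a field.
template = "{{{0}}} costs {1}"
rendered = template.format("key", 3)

# A field with no matching argument raises in the released str.format.
try:
    "{0} {1}".format("only one")
    raised = False
except IndexError:
    raised = True

print(rendered, raised)
```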

.. _PEP 3101: http://www.python.org/dev/peps/pep-3101/
.. _python-3000 list: http://mail.python.org/mailman/listinfo/python-3000

Contributing thread:

- `PEP 3101 Update

DONT_HAVE_* vs. HAVE_* macros

Neal Norwitz asked whether some recently checked-in DONT_HAVE_* macros
should be replaced with HAVE_* macros instead.  Martin v. Löwis
indicated that these were probably written this way because Luke
Dunstan (the contributor) didn't want to modify configure.in and run
autoconf.  Luke noted that the configure.in and autoconf effort is
greater for Windows developers, but also agreed to convert things to
autoconf anyway.

Contributing thread:

- `[Python-checkins] r46064 - in python/trunk: Include/Python.h
Include/pyport.h Misc/ACKS Misc/NEWS Modules/_localemodule.c
Modules/main.c Modules/posixmodule.c Modules/sha512module.c
PC/pyconfig.h Python/thread_nt.h

Changing python int to long long

Sean Reifschneider looked into converting the Python int type to long
long.  Though for simple math he saw speedups of around 25%, for ints
that fit entirely within 32-bits, the slowdown was around 11%.  Sean
was considering changing the int->long automatic conversion so that
ints would first be up-converted to long longs and then to Python
longs.  Guido said that it would be okay to standardize all ints as
64-bits everywhere, but only for Python 2.6.

Contributing thread:

- `Changing python int to "long long".

C-level exception invariants

Tim Peters was looking at what kind of invariants could be promised
for C-level exceptions.  In particular, he was hoping to promise that
for PyFrameObject's f_exc_type, f_exc_value, and f_exc_traceback,
either all are NULL or none are NULL.  In his investigation, he found
a number of errors, including that _PyExc_Init() tries to raise an
AttributeError before the exception pointers have been initialized.

Contributing thread:

- `Low-level exception invariants?

C-code style

Martin Blais asked about the policy for C code in Python core.  `PEP
7`_ explains that for old code, the most important thing is to be
consistent with the surrounding style.  For new C files (and for
Python 3000 code) indentation should be 4 spaces per indent, all
spaces (no tabs in any file).  There was a short discussion about
reformatting the current C code, but that would unnecessarily break
svn blame and make merging more difficult.

.. _PEP 7: http://www.python.org/dev/peps/pep-0007/

Contributing thread:

- `A can of worms... (does Python C code have a new C style?)
