[Python-Dev] DRAFT: python-dev summary for 2005-11-16 to 2005-11-31

Steven Bethard steven.bethard at gmail.com
Sun Dec 18 00:13:05 CET 2005

Here's the summary for the second half of November -- sorry for the bit
of a delay.  As always, let me or Tony know if you have any

Summary Announcements

Reminder: Python is now on Subversion!

Don't forget that the Python source code is now hosted on
svn.python.org as a Subversion (rather than CVS) repository.

Note that because of the way the Subversion conversion was done,
by-date revision specifications for dates prior to the switchover
won't work.  To work around this, you can use svn diff (find the
changes since some date), svn up (check out the revision at some date),
and svn annotate (aka svn blame).

Removing the CVS repository from SourceForge isn't possible without
hacks (as a result of their "open source never goes away" policy).
However, it's no longer available from the project page, and the
repository is now filled with files pointing people to the new
repository.
Contributing threads:

- Is some magic required to check out new files from svn?
- svn diff -r {2001-01-01}
- CVS repository mostly closed now



Memory management in the AST

Thomas Lee's attempt to implement `PEP 341`_ brought up some issues
about working with the new AST code. Because the AST code used its own
custom objects instead of PyObjects, it also introduced its own set of
allocation/deallocation functions instead of the existing Py_INCREF
and Py_DECREF. There was some discussion about how best to simplify
the scheme, with the two main suggestions being:

(1) Convert all AST objects into PyObjects so Py_INCREF and Py_DECREF work
(2) Create an arena API, where objects are added to the arena and then
can be freed in one shot when the arena is freed
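
As a rough illustration of suggestion (2), the arena idea can be
sketched in a few lines of Python.  This is only a conceptual model:
the ``Arena`` class and its ``add``/``free`` methods are hypothetical
names invented here, not the C API that was under discussion.

```python
class Arena:
    """Toy arena: tracks objects and releases all of them in one call.

    Hypothetical illustration only; the names are not from CPython.
    """

    def __init__(self):
        self._entries = []  # (object, release callback) pairs

    def add(self, obj, release):
        """Register obj with the arena and hand it back to the caller."""
        self._entries.append((obj, release))
        return obj

    def free(self):
        """Release every registered object, newest first."""
        while self._entries:
            obj, release = self._entries.pop()
            release(obj)


released = []
arena = Arena()
arena.add("stmt node", released.append)
arena.add("expr node", released.append)
arena.free()  # one call cleans up everything that was registered
```

The point of the design is that callers never release individual
nodes; they only decide when the whole arena goes away.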

Neal Norwitz presented an example from the current AST code using the
various asdl_*_free functions, and he, Greg Ewing and Martin v. Löwis
compared how the code would look with the various API suggestions. 
While using ref-counting had the benefit of being consistent with the
rest of Python, there were still some who felt that the arena API
would simplify things enough to make the extra learning curve
worthwhile.  It seemed likely that branches or patches for the various
APIs would appear shortly.

While the C API is still undergoing these changes, and thus the Python
API is still a ways off, a few implementations for the Python API were
suggested.  If the AST code ends up using PyObjects, these could be
passed directly to Python code, though they would probably have to be
immutable.  Brett Cannon suggested that another route would be a
simple PyString marshalling like the current parser module, so that
Python code and C code would never share the same objects.
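
For readers who want to see what a Python-level AST interface can look
like, here is a small sketch using the ``ast`` module from current
Python 3 releases.  That module postdates this discussion, so treat it
as an assumption about where things ended up rather than as the design
proposed in the thread.  The source string is made up for illustration.

```python
import ast

# A made-up snippet using the combined try/except/finally form.
source = (
    "try:\n"
    "    risky()\n"
    "except ValueError:\n"
    "    pass\n"
    "finally:\n"
    "    cleanup()\n"
)

tree = ast.parse(source)   # parsing only; nothing is executed
try_node = tree.body[0]

node_type = type(try_node).__name__
handler_count = len(try_node.handlers)
has_finally = bool(try_node.finalbody)
print(node_type, handler_count, has_finally)
```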

.. _PEP 341: http://www.python.org/peps/pep-0341.html

Contributing threads:

- Memory management in the AST parser & compiler
- ast status, memory leaks, etc
- a Python interface for the AST (WAS: DRAFT: python-dev...)


Profilers in the stdlib

Armin Rigo summarised the current Python profiler situation, which
includes profile.Profile (ages-old, slow, pure Python profiler with
limited support for profiling C calls), hotshot (Python 2.2+, faster
than profile.py, but very slow to convert the log file to the
pstats.Stats format, possibly inaccurate, doesn't know about C calls),
and `lsprof`_ (Brett Rosen, Ted Czotter, Michael Hudson, Armin Rigo;
doesn't support C calls, incompatible interface with
profile.py/hotshot, can record detailed stats about children).  He
suggested that lsprof be added to the standard library, keeping
profile.py as a pure Python implementation and replacing hotshot with

There was concern about maintenance of the library; however, since
Armin and Michael are core developers, this seems covered.  Martin
suggested that lsprof be distributed separately for some time, and
then included when it is more mature.  Many people were concerned
about having so many profilers included (with the preference for a
single profiler that would suit beginners, since advanced users can
easily install third-party modules, which could be referenced in the

Tim Peters explained that the aim of hotshot wasn't to reduce total
time overhead, but to be less disruptive (than profile.py) to the code
being profiled, while that code is running, via tiny little C
functions that avoid memory allocation/deallocation.  Hotshot can do
much more than the minimalistic documentation says (e.g. it could be
used as the basis of a tracing tool to debug software, to measure test
coverage); you won't find them discussed in the documentation, which
makes user experience mostly negative, but you do find them in Tim's

Discussion centered around whether lsprof should be added to the
standard distribution, and whether hotshot and/or profile.py should be
removed.  Armin indicated that he favours removing hotshot, adding
lsprof, which would be added as "cProfile" (cf. cPickle/pickle,
cStringIO/StringIO), and possibly rewriting profile.py as a pure
Python version of lsprof.
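
To make the proposal a little more concrete, the sketch below shows how
a C-accelerated profiler exposed as ``cProfile`` can be driven from
Python.  It assumes the ``cProfile`` and ``pstats`` modules as they
exist in current Python releases; ``busy_work`` is a made-up function
that only exists to give the profiler something to measure.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Made-up workload so the profiler has something to measure."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
result = profiler.runcall(busy_work, 1000)

# Render the collected statistics into a string instead of stdout.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats()
report = buffer.getvalue()
```

The ``report`` string holds the usual pstats table, so the same
post-processing code works regardless of which profiler collected the
data.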

Floris Bruynooghe (for Google's Summer of Code) wrote a `replacement
for profile.py`_ that uses hotshot directly.  This replacement didn't
fix the problems with hotshot, but did result in pstats loading
hotshot data 30% faster, and would mean that profile.py could be

There was a little debate about whether any profiler should even be
included in the standard library, but there were several people who
opined that it was an important 'battery'.  A few people also liked
the idea of adding a statistical profiler to the standard library at
some point (e.g. http://wingolog.org/archives/2005/10/28/profiling).

Aahz suggested that Armin write a PEP for this, which seems the likely
way that this will progress.

Contributing thread:

- s/hotshot/lsprof

 .. _lsprof: http://codespeak.net/svn/user/arigo/hack/misc/lsprof
 .. _replacement for profile.py:  http://savannah.nongnu.org/projects/pyprof/


The tp_free slot and multiple inheritance in C

Travis Oliphant started a thread discussing a memory problem in some
new scipy core code where a huge number of objects were not being
freed.  Making the allocation code use malloc and free instead of
PyObject_New and PyObject_Del made these problems go away.  After an
intense discussion, Armin Rigo figured out that the problem arose in a
type that inherited both from int and from another scipy type.  The
tp_free slot of this type was being inherited from its second parent
(int) instead of its first parent (the scipy type), and thus
"deallocated" objects were put on the CPython free list of integers
instead of being freed.  It was unclear as to whether the code in
typeobject.c which made this decision could be "fixed", so Armin
suggested forcing the appropriate tp_alloc/tp_free functions in the
static types instead.

Contributing threads:

- Problems with the Python Memory Manager
- Problems with mro for dual inheritance in C [Was: Problems with the
  Python Memory Manager]


Patches for porting Python to a new OS

Ben Decker asked for some feedback on patches porting Python to
DOS/DJGPP.  This led to a discussion of what the requirements for
accepting a porting patch were.  Guido made it clear that he wanted
porting patches included in Python whenever reasonable so that the
various obscure ports would be able to upgrade to new versions of
Python when they were released.  The basic conditions were that the
submission came from a reputable platform maintainer, and that if the
patches caused problems in future Python versions, the maintainer
would either need to update the patch appropriately, or have it
removed from Python.

Contributing thread:

- Patch Req. # 1351020 & 1351036: PythonD modifications


Making StringIO behave more like a file

Walter Dörwald identified a number of situations where StringIO (but
not cStringIO) does not behave like a normal file:

- next() after close() raises StopIteration instead of ValueError
- isatty() after close() returns False instead of raising ValueError
- truncate() with a negative argument doesn't raise an IOError

These were determined to be bugs in StringIO and will likely be fixed
in an upcoming Python release.
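
The three checks above can be tried against a file-like string buffer
with a short script.  This sketch uses ``io.StringIO`` from current
Python 3, which is an assumption on my part: it is not the ``StringIO``
module discussed in the thread.  ``error_name`` is a helper invented
for this example.

```python
import io


def error_name(operation):
    """Run operation and return the name of the exception it raises, if any."""
    try:
        operation()
    except Exception as exc:
        return type(exc).__name__
    return None


closed = io.StringIO("line one\nline two\n")
closed.close()

# Both operations are attempted on an already-closed buffer.
after_close_next = error_name(lambda: next(closed))
after_close_isatty = error_name(closed.isatty)

# A negative size is passed to truncate() on a buffer that is still open.
still_open = io.StringIO("data")
negative_truncate = error_name(lambda: still_open.truncate(-1))
```

On CPython 3 all three calls are expected to report ``ValueError``;
note that this differs from the ``IOError`` mentioned above for a
negative ``truncate()`` argument.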

Contributing threads:

- Iterating a closed StringIO
- isatty() on closed StringIO (was: Iterating a closed StringIO)
- Another StringIO/cStringIO discrepancy
- isatty() on closed StringIO


User-defined data for logging calls

Vinay Sajip explained that on numerous occasions, requests have been
made for the ability to easily add user-defined data to logging
events. For example, a multi-threaded server application may want to
output specific information to a particular server thread (e.g. the
identity of the client, specific protocol options for the client

While this is currently possible, you have to subclass the Logger
class and override its makeRecord method to put custom attributes in
the LogRecord; the approach is usable but requires more work than

Vinay proposed a simpler way of achieving the same result, which
requires use of an additional optional keyword argument ("extra") in
logging calls.  The "extra" argument will be passed to
Logger.makeRecord, which extends the LogRecord's __dict__ with this
argument; however, if any of the keys are already present (values
calculated by the logging package), then a KeyError will be raised.
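
A minimal sketch of how the proposed keyword reads in practice, using
the ``extra`` argument as it exists in the ``logging`` module of
current Python releases.  The logger name ``extra_demo``, the
``client`` key, and the address are made up for illustration.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
# %(client)s is filled in from the "extra" dict on each call.
handler.setFormatter(logging.Formatter("%(client)s %(message)s"))

logger = logging.getLogger("extra_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.info("connection opened", extra={"client": "10.0.0.7"})

# Keys that collide with attributes computed by logging are rejected.
try:
    logger.info("oops", extra={"message": "clobbered"})
except KeyError as exc:
    collision = str(exc)
else:
    collision = None

output = stream.getvalue()
```

Because the format string refers to ``%(client)s``, every call routed
through this handler has to supply that key, which is worth keeping in
mind when mixing such handlers with third-party loggers.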

Contributing thread:

- Proposed additional keyword argument in logging calls


Updating urlparse to support RFC 3986

Paul Jimenez complained that urlparse uses a table of url schemes to
determine whether a protocol (e.g. http or ftp) supports specifying a
username and password in the url (e.g. https://user:pass@host:port). 
He suggested that all protocols should be capable of using this

Guido pointed out that the main purpose of urlparse is to be
RFC-compliant.  Paul explained that the current code is valid
according to `RFC 1808`_ (1995-1998), but that this was superseded by
`RFC 2396`_ (1998-2004) and `RFC 3986`_ (2005-).  Guido was convinced,
and asked for a new API (for backwards compatibility) and a patch to
be submitted via SourceForge.
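
For comparison, here is a sketch of scheme-independent parsing using
``urllib.parse.urlsplit`` from current Python 3.  This is the modern
descendant of urlparse and is shown as an assumption about present-day
behaviour, not as the API requested in the thread.  The ``foo`` scheme,
host, and credentials are made up.

```python
from urllib.parse import urlsplit

# "foo" is a made-up scheme that no built-in table would know about.
parts = urlsplit("foo://user:secret@example.org:8042/some/path")

credentials = (parts.username, parts.password)
endpoint = (parts.hostname, parts.port)
```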

Contributing thread:

- urlparse brokenness

 .. _RFC 1808: http://www.ietf.org/rfc/rfc1808.txt
 .. _RFC 2396: http://www.ietf.org/rfc/rfc2396.txt
 .. _RFC 3986: http://www.ietf.org/rfc/rfc3986.txt


Magic methods on the instance and on the type

Nick Coghlan pointed out that the current semantics of `PEP 343`_ look
up methods on the instance instead of on the type, and noted that
slots are generally invoked as ``type(obj).__slot__(obj)`` instead. 
Guido explained that in general, using ``__xxxx__`` methods in an
undocumented way (e.g. relying on them being looked up in the
instance) was not supported, and code relying on that could be
expected to break if the ``__xxxx__`` method was ever upgraded to a
slot.  So, it was okay that the `PEP 343`_ support looked up methods
on the instance, but anyone depending on this behavior was asking for

.. _PEP 343: http://www.python.org/peps/pep-0343.html

Contributing thread:

- `Metaclass problem in the "with" statement semantics in
  PEP 343 <http://mail.python.org/pipermail/python-dev/2005-November/058360.html>`__


Releasing the GIL in the re module

Duncan Grisby has a multi-threaded program that does a lot of complex
regular expression searching, and has trouble with threads blocking
because the GIL is not released while the re engine is running.  He
wanted to know whether there was any fundamental reason why the re
engine could not release the interpreter lock.

Fredrik Lundh pointed out that SRE can operate on anything that
implements the buffer interface.  This means that the objects that the
engine is accessing might be mutable, which could cause problems.
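
A small sketch of the concern, assuming Python 3 and a bytes pattern:
the ``re`` module will happily search a ``bytearray``, which is a
mutable object exposing its memory through the buffer protocol.  The
data and pattern are made up.

```python
import re

# A bytearray supports the buffer protocol and is mutable.
data = bytearray(b"aabbbcc")
pattern = re.compile(rb"b+")

first_span = pattern.search(data).span()

# Mutating the buffer between searches changes what the engine sees;
# doing so from another thread *during* a search is the hazard in question.
data[2:5] = b"xxx"
second_match = pattern.search(data)
```

Here the mutation happens safely between two searches.  If the lock
were released while the engine was running, another thread could make
the same change in the middle of a match.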

Several people suggested that a better solution would be using more
efficient regular expressions; Duncan explained that the expressions
are user-entered, which makes this difficult.

Eric Noyau put together a `patch to release the GIL`_ when the engine
performs a low level search, if (and only if) the object searched is a
[unicode] string.

 .. _patch to release the GIL: http://python.org/sf/1366311

Contributing threads:

- (no subject)
- Re: Regular expressions
- SRE should release the GIL (was: no subject)
- Regular expressions


Skipped Threads

- `str.dedent <http://mail.python.org/pipermail/python-dev/2005-November/058148.html>`__
- Behavoir question.
- Conclusion: Event loops, PyOS_InputHook, and Tkinter
- DRAFT: python-dev Summary for 2005-09-16 to 2005-09-30
- DRAFT: python-dev Summary for 2005-10-01 to 2005-10-15
- DRAFT: python-dev Summary for 2005-10-16 to 2005-10-31
- Coroutines (PEP 342)
- Enjoy a week without me
- Weekly Python Patch/Bug Summary
- `How to stay almost backwards compatible with all these new cool
features <http://mail.python.org/pipermail/python-dev/2005-November/058211.html>`__
- test_cmd_line on Windows
- Fwd: [Python-checkins] commit of r41497 - python/trunk/Lib/test
- [Python-checkins] commit of r41497 -python/trunk/Lib/test
- DRAFT: python-dev Summary for 2005-11-01 through 2005-11-15
- something is wrong with test___all__
- PEP 302, PEP 338 and imp.getloader (was Re: a Python interface for
  the AST (WAS: DRAFT: python-dev...)
- registering unicode codecs
- reference leaks
- Bug bz2.BZ2File(...).seek(0,2) + patch
- `Python 3 <http://mail.python.org/pipermail/python-dev/2005-November/058346.html>`__
- For Python 3k, drop default/implicit hash, and comparison
- Bug day this Sunday?
- Short-circuiting iterators
- Standalone email package in the sandbox
