At the LSB meeting, Jeff Licquia asked whether Python could provide
binary compatibility with previous versions by use of ELF symbol
versioning. In ELF symbol versioning, you can have multiple
definitions for the same symbol; clients (e.g. extension modules)
would refer to a specific version. During static linking, the most
recent (?) version of the symbol is coded into the client object.
With symbol versioning, you can change the implementation of a
function and even its interface, and it will be compatible as
long as you keep the original version around.
My first reaction is that this is difficult due to the usage of
function-like macros. However, if we replaced those with C functions
(perhaps as a compile-time choice), and if we also hid
the layout of structures, I think providing binary compatibility
(with a certain baseline version, or several such baselines) would
be feasible.
Of course, several things need to be considered, e.g.
- making Py_INCREF/Py_DECREF functions is likely a bad idea
for performance reasons. OTOH, it seems safe that
Py_INCREF/Py_DECREF can remain as-is for the rest of 2.x.
- hiding PyTypeObject is a bad idea for source compatibility.
OTOH, we already try to make only compatible changes to
it (except when we don't :-), so exposing this as stable
ABI might actually work.
- certain kinds of modules likely need to be ruled out, e.g.
modules that extend the layout of existing types (it would
fail to compile because the base struct becomes an
incomplete type).
All in all, I think providing binary compatibility would
be feasible, and should be attempted. What do you think?
Over at my working copy of the Python language reference, Adrian Holovaty
asked about the exact semantics of the __str__ hook:
"The return value must be a string object." Does this mean it can be a
*Unicode* string object? This distinction is ambiguous to me because
unicode objects and string objects are both subclasses of basestring.
May a __str__() return a Unicode object?
I seem to remember earlier discussions on this topic, but don't recall when
and what. From what I can tell, __str__ may return a Unicode object, but
only if it can be converted to an 8-bit string using the default encoding. Is
this on purpose or by accident? Do we have a plan for improving the situation
in future 2.x releases?
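For readers who want to see the coercion rule in action, here is a small sketch (``py2_str_of_unicode`` is a made-up helper that mimics Python 2's implicit unicode-to-str conversion; it runs on modern Pythons too):

```python
# Simulation of the Python 2 rule under discussion: when __str__ returns
# a unicode object, the interpreter implicitly encodes it with the
# default encoding (normally 'ascii'), so non-ASCII data fails.
def py2_str_of_unicode(u, default_encoding='ascii'):
    """Mimic Python 2's implicit unicode -> str coercion."""
    return u.encode(default_encoding)

print(py2_str_of_unicode(u'hello'))  # ASCII-only: the conversion succeeds
try:
    py2_str_of_unicode(u'h\xe9llo')  # contains 'é': the conversion fails
except UnicodeEncodeError:
    print('UnicodeEncodeError')
```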
At the LSB meeting, there was a brief discussion of what Python
version should be incorporated into LSB. This is more an issue
if ABI compatibility for the C ABI is desired, but even if only
script portability is a goal, application developers will need
to know what APIs they have available in LSB 3.2 and which they
don't.
LSB codifies existing practice rather than anticipating future
standards, so things get into LSB only if they are already
commonplace in distributions. Currently, Python 2.4 is widely
available in Linux distributions; Python 2.5 is not and (apparently)
will not be included in the next RHEL release (which apparently
is coming soon).
So it looks like LSB will standardize on Python 2.4 (which will also
be the default version in the next Debian release).
I've been looking once again over the docs for distutils and setuptools,
and thinking to myself "this seems a lot more complicated than it ought
to be."
Before I get into detail, however, I want to explain carefully the scope
of my critique - in particular, why I am talking about setuptools on the
python-dev list. You see, in my mind, the process of assembling,
distributing, and downloading a package is, or at least ought to be, a
unified process. It ought to be a fundamental part of the system, and
not split into separate tools with separate docs that have to be
mentally assembled in order to understand it.
Moreover, setuptools is the de facto standard these days - a novice
programmer who googles for 'python install tools' will encounter
setuptools long before they learn about distutils; and if you read the
various mailing lists and blogs, you'll sense a subtle aura of
deprecation and decay that surrounds distutils.
I would claim, then, that regardless of whether setuptools is officially
blessed or not, it is an intrinsic part of the "Python experience".
(I'd also like to put forward the disclaimer that there are probably
factual errors in this post, or errors of misunderstanding; all I can
claim as an excuse is that it's not for lack of trying, and corrections
are welcome as always.)
Think about the idea of module distribution from a pedagogical
standpoint - when does a newbie Python programmer start learning about
module distribution and what do they learn first? A novice Python user
will begin by writing scripts for themselves, and not thinking about
distribution at all. However, once they reach the point where they begin
to think about packaging up their module, the Python documentation ought
to be able to lead them, step by step, towards the goal of making a
finished, distributable package:
-- It should teach them how to organize their code into packages and
modules.
-- It should show them how to write the proper setup scripts.
-- If there is C code involved, it should explain how that fits into
the picture.
-- It should explain how to write unit tests and where they should go.
So how does the current system fail in this regard? The docs for each
component - distutils, setuptools, unit test frameworks, and so on, only
talk about that specific module - not how it all fits together.
For example, the docs for distutils start by telling you how to build a
setup script. It never explains why you need a setup script, or why
Python programs need to be "installed" in the first place. 
The distutils docs never describe how your directory structure ought to
look. In fact, they never tell you how to *write* a distributable
package; rather, it seems to be more oriented towards taking an
already-working package and modifying it to be distributable.
The setuptools docs are even worse in this regard. If you look carefully
at the docs for setuptools, you'll notice that each subsection is
effectively a 'diff', describing how setuptools is different from
distutils. One section talks about the "new and changed keywords",
without explaining what the old keywords were or how to find them.
Thus, for the novice programmer, learning how to write a setup script
ends up being a process of flipping back and forth between the distutils
and setuptools docs, trying to hold in their minds enough of each to be
able to achieve some sort of understanding.
What we have now does a good job of explaining how the individual tools
work, but it doesn't do a good job of answering the question "Starting
from an empty directory, how do I create a distributable Python
package?" A novice programmer wants to know what to create first, what
to create next, and so on.
This is especially true if the novice programmer is creating an
extension module. Suppose I have a C library that I need to wrap. In
order to even compile and test it, I'm going to need a setup script.
That means I need to understand distutils before I even think about
distribution, before I even begin writing the code!
(Sure, I could write a Makefile, but I'd only end up throwing it away
later -- so why not cut to the chase and *start* with a setup script?
Ans: Because it's too hard!)
But it isn't just the docs that are at fault here - otherwise, I'd be
posting this on a different mailing list. It seems like the whole
architecture is 'diff'-based, a series of patches on top of patches,
which are in need of some serious refactoring.
Except that nobody can do this refactoring, because there's no formal
list of requirements. I look at distutils, and while some parts are
obvious, there are other parts where I go "what problem were they trying
to solve here?" In my experience, you *don't* go mucking with someone's
code and trying to fix it unless you understand what problem they were
trying to solve - otherwise you'll botch it and make a mess. Since few
people ever bother to write down what problem they were trying to solve
(although they tend to be better at describing their clever solution),
usually this ends up being done through a process of reverse engineering
the requirements from the code, unless you are lucky enough to have
someone around who knows the history of the thing.
Admittedly, I'm somewhat in ignorance here. My perspective is that of an
'end-user developer', someone who uses these tools but does not write
them. I don't know the internals of these tools, nor do I particularly
want to - I've got bigger fish to fry.
I'm posting this here because what I'd like folks to think about is the
whole process of Python development, not just the documentation. What is
the smoothest path from empty directory to a finished package on PyPI?
What can be changed about the current standard libraries that will ease
this process?
The answer, AFAICT, is that 'setup' is really a Makefile - in other
words, it's a platform-independent way of describing how to construct a
compiled module from sources, and making it available to all programs on
that system. Although this gets confusing when we start talking about
"pure python" modules that have no C component - because we have all
this language that talks about compiling and installing and such, when
all that is really going on underneath is a plain old file copy.
Here's the summary for the second half of November. As always,
corrections and comments are greatly appreciated. If you were
involved in the November portions of the LSB discussions, I'd
particularly appreciate your reviews of that section.
Python 2.5 malloc families
Remember that if you find your extension module is crashing with
Python 2.5 in malloc/free, there is a high chance that you have a
mismatch in malloc "families". Fredrik Lundh's FAQ has more:
- `2.5 portability problems
Roundup tracker schema discussion
If you'd like to be involved in the discussion of the setup for the
`new tracker`_, you can now file issues on the `meta tracker`_ or post
to the `tracker-discuss mailing list`_. Be sure to sign up for an
account so your comments don't show up as anonymous!
.. _new tracker: http://psf.upfronthosting.co.za/roundup/tracker/
.. _meta tracker: http://psf.upfronthosting.co.za/roundup/meta/
.. _tracker-discuss mailing list:
- `discussion of schema for new issue tracker starting
Python and the Linux Standard Base (LSB)
Ian Murdock, the chair of the Linux Standard Base (LSB), explained
that they wanted to add Python to `LSB 3.2`_. Martin v. Löwis promised
to go to their meeting in Berlin and report back to python-dev.
The discussion then turned to the various ways in which the different
Linux variants package Python. A number of people had been troubled by
Debian's handling of distutils. At one point, Debian had excluded
distutils completely, requiring users to install the "python-dev"
package to get distutils functionality. While current versions of
Debian had put distutils back in the stdlib, they had excluded the
``config`` directory, meaning that distutils worked only for pure
Python modules, not extension modules. And because Debian had no way
of knowing that a computer with both gcc and Python installed would
likely benefit from having the ``config`` directory installed, the
user still had to install "python-dev" separately.
There was also some discussion about how to handle third party modules
so that updating a module didn't break some application which was
expecting a different version. These kinds of problems were
particularly dangerous on distributions like Gentoo and Ubuntu which
relied heavily on their own system Python for the OS to work properly.
Guido suggested introducing a vendor-packages directory for the third
party modules required by the OS and Martin v. Löwis reopened an
`earlier patch`_ suggesting this. A number of folks also thought that
adding a ~/.local/lib/pythonX.X/site-packages directory for user
specific (not site wide) packages could be useful. Phillip J. Eby
pointed out that distutils and setuptools already allow you to install
packages this way by putting::
prefix = ~/.local
into ./setup.cfg, ~/.pydistutils.cfg, or
/usr/lib/python2.x/distutils/distutils.cfg. He also explained that
setuptools could address some of the application-level problems:
setuptools-generated scripts adjust their sys.path to include the
specific eggs they need, and can specify these eggs with an exact
version if necessary. Thus OS-level scripts would likely specify exact
versions and then users could feel free to install newer eggs without
worrying that the OS would try to use them instead.
.. _LSB 3.2: http://www.freestandards.org/en/LSB_Roadmap
.. _earlier patch: http://bugs.python.org/1298835
- `Python and the Linux Standard Base (LSB)
Fredrik Lundh has been working on a `new Python FAQ`_ and asked about
what kinds of operations could be considered "atomic" for the purposes
of thread-safety. While almost any statement in Python can invoke an
arbitrary special method (e.g. ``a = b`` can invoke ``a.__del__()``),
Fredrik was interested in situations where the objects involved were
either builtins or objects that didn't override special methods. In
situations like these, you can be guaranteed things like::
* If two threads execute ``L.append(x)``, two items will be added to
the list (though the order is unspecified)
* If two threads execute ``x.y = z``, the field ``y`` on the ``x``
object will exist and contain one of the values assigned by one of
the threads.
You get these guarantees mainly because the core operation in these
examples involves only a single Python bytecode.
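A quick sketch of the first guarantee (the names here are made up for the demonstration):

```python
import threading

L = []

def appender(n):
    # Each list.append on a builtin list is a single atomic operation
    # under the GIL, so no appends are lost even without a lock.
    for _ in range(n):
        L.append(1)

N = 10000
threads = [threading.Thread(target=appender, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(L))  # 20000: all appends landed, though their order is unspecified
```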
However, Martin v. Löwis pointed out that even the above examples are
not truly atomic in the strictest sense because they invoke bytecodes
to load the values of the variables in addition to the bytecode to
perform the operation. For example, if one thread does ``x = y`` while
another thread does ``y = x``, at the end of the code in an "atomic"
system, both ``x`` and ``y`` would have the same value. However, in
Python, the values could get swapped if a context switch occurred
between the loading of the values and the assignment operations.
Much of this discussion was also posted to `the FAQ item`_.
.. _new Python FAQ: http://effbot.org/pyfaq/
.. _the FAQ item:
- `PyFAQ: thread-safe interpreter operations
From an empty directory to a package on PyPI
Talin suggested that distutils/setuptools and their documentation
should be updated so that new users could more easily answer the
question: "What is the smoothest path from empty directory to a
finished package on PyPI?" In particular, Talin thought that having to
cross-reference between distutils/setuptools/unittest/etc. was
confusing, and that a more stand-alone version of the documentation
was necessary. A number of people agreed that the documentation could
use some reorganization and the addition of some more tutorial-like
sections. Mike Orr promised to put together an initial "Table of
Contents" that would have links to the most important information for
package distribution, and Talin promised to make his notes available
on the "baby steps" necessary to prepare a module for setuptools (e.g.
create the directory structure, write a setup.py file, create source
files in the appropriate directories, etc.)
- `Distribution tools: What I would like to see
Monitoring progress with urllib's reporthook
Martin v. Löwis looked at a `patch to urllib's reporthook`_ aimed at
more accurate progress reporting. The original code in urllib was
passing the ``read()`` block size as the second argument to the
reporthook. The patch would have instead passed as the second argument
the actual count of bytes read. Guido pointed out that the block size
and the actual count would always be identical except for the last
block because of how Python's ``file.read(n)`` works. Thus urllib was
already giving the reporthook as accurate a progress report as
possible given the implementation, and so the patch was rejected.
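A sketch of the hook contract being discussed (the ``progress`` helper is hypothetical; the ``(block_count, block_size, total_size)`` argument order is urllib's reporthook convention):

```python
# urllib calls hook(block_count, block_size, total_size) after each read.
# Since every block but the last is exactly block_size bytes, the hook
# can reconstruct the byte count; only the final block needs clamping.
def progress(block_count, block_size, total_size):
    done = min(block_count * block_size, total_size)
    return done, total_size

# Simulate a 25-byte download fetched in 8-byte blocks.
for count in range(1, 5):
    print(progress(count, 8, 25))
```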
.. _patch to urllib's reporthook: http://bugs.python.org/849407
- `Passing actual read size to urllib reporthook
Infinity and NaN singletons
Tomer Filiba asked about making the positive-infinity,
negative-infinity and not-a-number (NaN) singletons available as
attributes of the ``float`` type, e.g. ``float.posinf``,
``float.neginf`` and ``float.nan``. Bob Ippolito pointed him to `PEP
754`_ and the fpconst_ module which addressed some of these issues
though in a separate module instead of the builtin ``float`` type.
When Tomer asked why `PEP 754`_ had not been accepted, Martin v. Löwis
explained that while people were interested in the feature, it was
difficult to implement in general, e.g. on platforms where the double
type was not IEEE-754.
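For illustration, the values PEP 754 wanted to expose can nowadays be spelled portably on IEEE-754 platforms (a sketch; the variable names are made up):

```python
# The three special values under discussion, constructed from strings.
posinf = float('inf')
neginf = float('-inf')
nan = float('nan')

print(posinf > 1e308)    # True: larger than any finite float
print(neginf == -posinf) # True
print(nan != nan)        # True: NaN compares unequal even to itself
```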
.. _PEP 754: http://www.python.org/dev/peps/pep-0754/
.. _fpconst: http://www.python.org/pypi/fpconst/
- `infinities <http://mail.python.org/pipermail/python-dev/2006-November/070020.html>`__
Line continuations and the tokenize module
Guido asked about modifying the tokenize module to allow a better
round-tripping of code with line continuations. While the tokenize
module was generating pseudo-tokens for things like comments and
"ignored" newlines, it was not generating anything for line
continuation backslashes. Adding the appropriate yield would have been
trivial, but would have been a (minor) backwards incompatible change.
Phillip J. Eby pointed Guido to `scale.dsl`_ which dealt with similar
issues, and suggested that even though the change was small, it might
cause problems for some existing tools. Guido proposed a somewhat more
backwards compatible version, where a NL pseudo-token was generated
with '\\\n' as its text value, and asked folks to try it out and see
if it gave them any trouble.
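A round-trip sketch with a backslash continuation: on current Pythons, ``untokenize`` can restore the continuation from token positions even though no explicit token is generated for the backslash itself (the behavior described in the thread predates this):

```python
import io
import tokenize

# Source containing a backslash line continuation.
src = "total = 1 +\\\n        2\n"
toks = list(tokenize.generate_tokens(io.StringIO(src).readline))
rebuilt = tokenize.untokenize(toks)

# Whatever the exact whitespace, the rebuilt source must tokenize
# to the same token types and strings as the original.
same = [(t.type, t.string) for t in toks] == [
    (t.type, t.string)
    for t in tokenize.generate_tokens(io.StringIO(rebuilt).readline)
]
print(same)  # True
```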
.. _scale.dsl: http://peak.telecommunity.com/DevCenter/scale.dsl#converting-tokens-back-...
- `Small tweak to tokenize.py?
Summer of Code projects
Georg Brandl asked about the status of the Google Summer of Code
projects and got a number of responses:
* Nilton Volpato reported the completion of the new ziparchive_
module, which includes file-like access to zip members, support for
BZIP2 compression, support for member file removal and support for
encryption. He explained that he was still doing a little work to
clean up the API, and that he would appreciate any feedback people had
on the module.
* Facundo Batista reported that the decimal Python-to-C
transliteration was completed successfully, but that they learned in
the process that a simple transliteration was not going to suffice and
the decimal module was going to have to undergo a structural redesign
to perform well in C.
* Jim Jewett reported that the work to make more stdlib modules use
the logging module was incomplete, and not ready for stdlib inclusion.
.. _ziparchive: http://ziparchive.sourceforge.net/
- `Summer of Code: zipfile?
- `Results of the SOC projects
- `Weekly Python Patch/Bug Summary
- `Python in first-year MIT core curriculum
- `POSIX Capabilities
- `Re: readline problem with python-2.5
- `DRAFT: python-dev summary for 2006-10-01 to 2006-10-15
- `Suggestion/ feature request
- `DRAFT: python-dev summary for 2006-10-16 to 2006-10-31
- `DRAFT: python-dev summary for 2006-11-01 to 2006-11-15
- `ctypes and powerpc
- `(no subject)
- `Cloning threading.py using processes
- `Objecttype of 'locals' argument in PyEval_EvalCode
many times writing somewhat complex loops over lists i've found the need
to sometimes delete an item from the list. currently there's no easy
way to do so; basically, you have to write something like
i = 0
while i < len(list):
    el = list[i]
    if el should be deleted:
        del list[i]
    else:
        i += 1
note that you can't write `for x in list:' and delete elements inside
the loop, since deleting from a list invalidates the forward iteration;
with `for i in xrange(len(list)):' you run off the end of the list once
elements have been deleted. note also that you need to do some
trickiness to adjust the index appropriately when deleting.
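for comparison, one existing idiom sidesteps the index bookkeeping entirely by walking the list backwards, so deletions never shift the part not yet visited (a sketch; `delete_in_place` and `should_delete` are made-up names):

```python
def delete_in_place(lst, should_delete):
    # Iterate indices in reverse: removing lst[i] only shifts elements
    # after position i, which have already been visited.
    for i in reversed(range(len(lst))):
        if should_delete(lst[i]):
            del lst[i]

nums = [1, 2, 3, 4, 5, 6]
delete_in_place(nums, lambda x: x % 2 == 0)
print(nums)  # [1, 3, 5]
```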
i'd much rather see something like:
for x:iter in list:
    if x should be deleted:
        iter.delete()
the idea is that you have a way of retrieving both the element itself
and the iterator for the element, so that you can then call methods on
the iterator. it shouldn't be too hard to implement iter.delete(), as
well as iter.insert() and similar functions. (the recent changes to the
generator protocol in 2.5 might help.)
the only question then is how to access the iterator. the syntax i've
proposed, with `x:iter', seems fairly logical (note that parallels with
slice notation, which also uses a colon) and doesn't introduce any new
operators. (comma is impossible since `for x,iter in list:' already has
a meaning.)
btw someone is probably going to come out and say "why don't you just
use a list comprehension with an `if' clause?" the problems are: (1) i'd
like this to be destructive; (2) i'd like this to work over non-lists
as well, e.g. hash-tables; (3) list comprehensions work well in simple
cases, but in more complicated cases when you may be doing various
things on each step, and might not know whether you need to delete or
insert an element until after you've done various other things, a list
comprehension would not fit naturally; (4) this mechanism is extendible
to insertions, replacements and other such changes as well as just
deletions.
PyPy Leysin Winter Sports Sprint (8-14th January 2007)
.. image:: http://www.ermina.ch/002.JPG
The next PyPy sprint will be in Leysin, Switzerland, for the fourth
time. This sprint will be the final public sprint of our EU-funded
period, and a kick-off for the final work on the upcoming PyPy 1.0
release (scheduled for mid-February).
The sprint is the last chance for students looking for a "summer" job
with PyPy this winter! If you have a proposal and would like to
work with us in the mountains please send it in before 15th December
to "pypy-tb at codespeak.net" and cross-read this page:
Goals and topics of the sprint
* As in previous winters, the main side goal is to have fun in winter
sports :-) We can take a couple of days off for ski; at this time of
year, ski days end before 4pm, which still leaves plenty of time
to recover (er, I mean hack).
* Progress on the `JIT compiler`_, which we are just starting to scale
to the whole of PyPy.
* Polish the code and documentation of the py lib.
* Work on prototypes that use the new features that PyPy enables:
distribution_ (based on `transparent proxying`_), security_,
persistence_, and other `project ideas`_.
.. _JIT compiler: http://codespeak.net/pypy/dist/pypy/doc/jit.html
.. _distribution: http://codespeak.net/svn/pypy/dist/pypy/lib/distributed/
.. _transparent proxying: http://codespeak.net/pypy/dist/pypy/doc/proxy.html
.. _security: http://codespeak.net/pypy/dist/pypy/doc/project-ideas.html#security
.. _persistence: http://codespeak.net/pypy/dist/pypy/doc/project-ideas.html#persistence
.. _project ideas: http://codespeak.net/pypy/dist/pypy/doc/project-ideas.html
Location & Accommodation
Leysin, Switzerland, "same place as before". Let me refresh your
memory: both the sprint venue and the lodging will be in a very spacious
pair of chalets built specifically for bed & breakfast:
http://www.ermina.ch/. The place has a baseline ADSL Internet connection
(600Kb) with wireless installed. You can of course arrange your own
lodging anywhere (so long as you are in Leysin, you cannot be more than a
15 minute walk away from the sprint venue), but I definitely recommend
lodging there too -- you won't find a better view anywhere else (though
you probably won't easily get a much worse one, either :-)
I made pre-reservations in the Chalet, so please *confirm* quickly that
you are coming so that we can adjust the reservations as appropriate.
The rate so far has been 60 CHF a night all included in 2-person rooms,
with breakfast. There are larger rooms too (less expensive, possibly
more available too) and the possibility to get a single room if you
really want to.
Please register by svn:
or on the pypy-sprint mailing list if you do not yet have check-in rights:
You need a Swiss-to-(insert country here) power adapter. There will be
some Swiss-to-EU adapters around - bring an EU-format power strip if
you have one.
Officially, 8th-14th January 2007. Both dates are flexible, you can
arrive or leave earlier or later, though it is recommended to arrive on
the 7th (if many people arrive on the 6th we need to check for
accommodation availability first). We will give introductions and
tutorials on the 8th, in the morning if everybody is there, or in the
afternoon otherwise.
> [private, but feel free to respond on-list]
>> - Allow Python scripts to run unmodified across Linux distributions
>> - Optional: Allow extension modules to be used in binary form across
>> - Optional: Allow extension modules to be used in binary form across
> Was this duplication of last two points cut'n'paste error or what?
Oops, yes, it should have read
- Optional: Allow installation of binary foreign Python add-on
packages.
At 11:38 AM 12/4/2006 -0800, Mike Orr wrote:
>The other approaches work fine for giving each user a private install
>dir, but don't address the case of the same user wanting different
>install dirs for different projects. For instance, when exploring
>Pylons or TurboGears which install a lot of packages I may not
>otherwise want, I create a Virtual Python for each of them. If I'm
>developing an application under Virtual Python, I can see at a glance
>which packages my project needs installed. I can't think of any other
>way except Virtual Python to do this.
Simply install the packages to an arbitrary directory using -m
(--multi-version), and allow the scripts to be installed to the same
directory. When the scripts are run, they'll find their eggs in the script
directory. Neither PYTHONPATH manipulation nor virtual Pythons are needed
to achieve this - it's a self-contained single-application directory. (It
will still have to import pkg_resources from a copy of setuptools installed
somewhere else, e.g. on PYTHONPATH or in site-packages, or someday
perhaps from the stdlib.)
>Another point. Setuptools seems to have Two Ways To Do Things
>regarding package activation. easy_install puts the latest-installed
>egg version in its .pth file so it's automatically available via a
>naive "import". This happens to clutter sys.path with more
>entries than some people desire.
Using -m (--multi-version) suppresses this behavior. The "just works"
behavior of automatically adding to sys.path is just the default.
> Meanwhile, programs can use
>pkg_resources to activate a package or version that may not already be
>in sys.path. Is this the Way Of The Future? Should people start
>using pkg_resources for all packages they import?
No. Setuptools automatically wraps generated scripts with
pkg_resources-based activation, so there's almost never a need to request
packages. So, if you need dynamic access for a project, just create
yourself a setup.py for it and put the requirements in there. Then run
"setup.py develop" to generate wrappers for your scripts. The wrappers
will do all the activation for you when they're run.
>Finally, one can use ~/.pydistutils.cfg to specify an install
>location, but that again allows only one location per user.
One *default* location per user... per current directory. Since it's based
on distutils, easy_install always reads setup.cfg from the current
directory, which can set the defaults. And of course, using -d
(--install-dir) on the command line lets you specify a one-off target.
>Do the PYTHONPATH improvements make it possible to just put a
>directory on your PYTHONPATH and have Python process .pth files in it
>without using the site.addsitedir() hack? That would probably be my
>biggest wishlist item.
Yes, it does -- but it only works for packages installed by easy_install; a
special 'site.py' is added to the directory that adds the necessary hooks
around the "real" 'site' module's processing.
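A minimal demonstration of the stdlib hook being referred to: unlike a plain PYTHONPATH entry, ``site.addsitedir()`` also processes any ``.pth`` files found in the directory (the file and directory names below are made up):

```python
import os
import site
import sys
import tempfile

# Build a throwaway site directory containing one .pth file.
d = tempfile.mkdtemp()
extra = os.path.join(d, 'extra')
os.mkdir(extra)
with open(os.path.join(d, 'demo.pth'), 'w') as f:
    f.write('extra\n')  # each .pth line names a further directory to add

site.addsitedir(d)
print(d in sys.path, extra in sys.path)  # True True
```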
Some people (Robert Schweikert) requested byte-code stability at the
LSB meeting: LSB should standardize a certain version of the byte code,
and then future versions of Python should continue to support this
byte code version.
I explained that this is currently not supported, but would be
technically possible; somebody would have to implement it.
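One way to see why this would take real work: every CPython feature release changes the byte-code magic number stamped into compiled files, deliberately marking the format as incompatible (shown here with the modern importlib API, which postdates this discussion):

```python
import importlib.util

# The four-byte .pyc magic tag; the trailing b'\r\n' guards against
# text-mode corruption.  A new value each release means old .pyc files
# are recompiled rather than reused.
print(importlib.util.MAGIC_NUMBER)
```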
What do you think?