Python-dev Summary for 2002-10-01 through 2002-10-13

Sun Oct 13 21:47:50 EDT 2002

This is a summary of traffic on the `python-dev mailing list`_ between
October 1, 2002 and October 13, 2002 (inclusive).  It is intended to
inform the wider Python community of ongoing developments on the list.
To comment on
anything mentioned here, just post to python-list at python.org or
comp.lang.python in the usual way; give your posting a meaningful
subject line, and if it's about a PEP, include the PEP number (e.g.
Subject: PEP 201 - Lockstep iteration). All python-dev members are
interested in seeing ideas discussed by the community, so don't
hesitate to take a stance on a PEP (or anything else for that matter)
if you have an opinion.  And if all of this really interests you then
get involved and join Python-dev!

This is the fourth summary written by Brett Cannon (with a partially
fried brain thanks to the GRE).

All summaries are now archived at http://www.python.org/dev/summary/
thanks to A.M. Kuchling.

Please note that this summary is written using reStructuredText_ which
can be found at http://docutils.sourceforge.net/rst.html .  Any
unfamiliar punctuation is probably markup for reST_; you can safely
ignore it (although I suggest learning reST; it's nice and is accepted
for PEP markup).  Also, because of the wonders of reformatting thanks
to whatever program you are using to read this, I cannot guarantee you
will be able to run this text through Docutils_ as-is.  If you want to
do that, get the original text version.

.. _python-dev mailing list:
http://mail.python.org/mailman/listinfo/python-dev
.. _Docutils:
.. _reST:
.. _reStructuredText: http://docutils.sf.net/

======================
Summary Announcements
======================

This is a new section to the summary that I have decided to introduce.
It is mainly going to serve to make any general announcements or
comments on this summary and this summary alone.  All universal
comments will stay at the top of the summaries.

Just to let everyone know, I am taking off for two weeks on vacation
starting 2002-10-14 and I will not return until 2002-10-30.  Now,
before you all start sobbing over the loss of one of my great
summaries, you should know that Raymond Hettinger has graciously taken
up the job of temp summarizer for me and will do the summary while I
am gone.

Michael Hudson has made the suggestion that I inject more of my
personality into the summary so as to liven it up a little.  I am
personally quite happy to do this.  But the real question is do you,
fine reader, mind the idea?  If I don't hear from throngs of people
going "your sarcastic tone takes away from the wonderfully drawl
summaries and that is a bad thing", then I will just go ahead and
write with personality.  Just don't complain later.  =)

2.2.2b1 has been occupying Python-dev during this summary period, and
so this summary is shorter than usual.  I left out a bunch of threads
that were discussing bugfixes that I either didn't find interesting or
didn't think the rest of the world would care about.

A.M. Kuchling has put all Python-dev summaries up at
http://www.python.org/dev/summary/ .  So now the archive is
centralized instead of spread out among three web sites.  I am sure
the original archive sites will stay (I will keep mine up), but all
future references will go to that page.

Now it's time for my personal favor of the month.  I am going to start
applying to grad school for computer (science | programming) when I
get back from vacation.  If anyone knows of Python-friendly schools
out there, let me know.  Heck, I am even willing to leave America to
go to school as long as the classes are in English.  So if you know of
any, please let me know!

And now on to the summary.

=========================================
`Python 2.2.2 beta release on Monday!`__
=========================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029204.html

splinter thread:
	- `RELEASED: Python 2.2.2b1`__

__ http://mail.python.org/pipermail/python-dev/2002-October/029337.html

Python 2.2.2b1 was released on Monday, October 7.  This is the reason
(or perhaps excuse is a better description) for the lighter summary
this time.  A good amount of the traffic on Python-dev was about
bugfixing 2.2.2b1.  Most of this probably would not interest the
average Python user, and thus I didn't summarize a bunch of threads.
Taking the GRE also cut into my free time, so I have trimmed the
summary while working through a huge backlog to get this out the door.

=========================================
`Dropping support for Tcl 8.0 and 8.1`__
=========================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029024.html

Martin v. Loewis asked if it would be okay to drop support for Tcl_
8.0 and 8.1 since `_tkinter.c`_ has special code in there just for
those outdated versions.  Guido ok'ed it, so if you are still using
a version of Tk from way back when, it's time to upgrade.

.. _Tcl: http://www.tcl.tk/
.. _`_tkinter.c`: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/_tkinter.c

===========================================
`*very* revised proposal for interfaces`__
===========================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029042.html

Previously John Williams came up with a proposal for implementing the
stuff from `PEP 245`_ (Python Interface Syntax) and `PEP 246`_ (Object
Adaptation).  From my understanding of what John has done, it appears
he has written an interface system in pure Python.  If you want the
backstory to the long and involved discussion about getting an
interface system into Python, read the Python-dev Summaries for
2002-08-16 to 2002-09-01 and 2002-09-01 to 2002-09-15.

Gerald Williams, Michael Chermside, and Esteban Castro all commented
on the implementation and made various suggestions.

John said that he was not done yet implementing this.  But if you were
interested in the whole previous discussion on interfaces you could
consider looking at what John has done.  As for it going into the
language, I suspect that will have to wait until John is done and has
convinced the PEP writers and Python-dev that his implementation fits
the bill.  Stay tuned.

If you want some more background info on interfaces, read the
previously mentioned summaries.  As for object adaptation, read
anything by Alex Martelli on the subject on c.l.py and Python-dev.  He
has become the main proponent of object adaptation and has written
several very extensive essays on the subject.

.. _PEP 245: http://www.python.org/peps/pep-0245.html
.. _PEP 246: http://www.python.org/peps/pep-0246.html

=====================
`perplexed by mro`__
=====================
__ http://mail.python.org/pipermail/python-dev/2002-October/029035.html

Splinter threads:
	- `Re: my proposals about mros (was: perplexed by mro)`__
	- `C3 implementation`__

__ http://mail.python.org/pipermail/python-dev/2002-October/029167.html
__ http://mail.python.org/pipermail/python-dev/2002-October/029230.html

Samuele Pedroni said he was "trying to wrap [his] head around the mro
computation in 2.2".  Apparently there is the algorithm mentioned at
http://www.python.org/2.2.1/descrintro.html (dubbed the naive
algorithm) and then the one implemented in `typeobject.c`_ (called the
2.2 algorithm).  Samuele discovered some inconsistencies with the
implemented algorithm that he desired some explanation about.

Guido responded, thankful that someone was giving this a look because
his "intuition about the equivalence between algorithms turned out to
be wrong".  Guido stated that he thought that he wrote the algorithm
from the book "Putting Metaclasses To Work" correctly sans raising an
error when major conflicts occur in the ordering.  In a later email
Guido explained that the naive algorithm came about from his attempt
to simplify the explanation of the 2.2 algorithm, which is not simple.
Unknowingly, though, he came up with a variant of the algorithm in
his explanation.

Greg Ewing pointed out that he thought the naive algorithm was nicer
since it seemed to work more intuitively and was easier to explain
(and remember kids, these are basic tenets in Python programming).
Guido ended up stating that "If Samuele agrees that the naive
algorithm works better, [Guido will] try to make it so in 2.3".  Well,
Samuele said that the "2.2 mro is the worst of our options".

There was a problem, though, with the naive algorithm; it is not
monotonic as pointed out by Samuele.  This led him to put out two
options:

1. Use the naive algorithm, which had the drawback of not being
   monotonic.  Samuele also believed that it didn't produce "the most
   natural results".

2. Adopt C3_ as described at
   http://www.webcom.com/haahr/dylan/linearization-oopsla96.html and
   apparently used by Goo_ .  This algorithm is monotonic and Samuele
   says it is more intuitive in its results.

Guido got around to reading the C3_ paper and agreed that "we should
adopt C3".  He thought that the 2.2 algorithm was like the L*[LOOPS]
algorithm mentioned in the paper, but he is not positive.  Samuele
then wrote a C implementation of the algorithm.  Guido said he would
get to the patch after 2.2.2b1 got out the door.
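
For readers who want a feel for what C3 does, here is a minimal pure-Python
sketch of the merge step described in the paper.  It is not Samuele's C
patch; the function names ``c3_merge`` and ``c3_linearize`` are made up for
this illustration, and it assumes classes that expose ``__bases__``.

```python
def c3_merge(sequences):
    """Merge several linearizations the C3 way.

    Raises TypeError when no consistent ordering exists.
    """
    sequences = [list(seq) for seq in sequences if seq]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A head is acceptable only if it appears in no tail.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy, no C3 linearization")
        result.append(head)
        # Drop the chosen head everywhere, then discard emptied sequences.
        sequences = [[c for c in seq if c is not head] for seq in sequences]
        sequences = [seq for seq in sequences if seq]
    return result


def c3_linearize(cls):
    """Return the C3 linearization of cls as a list of classes."""
    bases = list(cls.__bases__)
    parent_orders = [c3_linearize(base) for base in bases]
    return [cls] + c3_merge(parent_orders + [bases])


# A classic diamond: D inherits from B and C, which both inherit from A.
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

diamond_order = [cls.__name__ for cls in c3_linearize(D)]
```

The local precedence order (``B`` before ``C``) is preserved and ``A``
comes after both of its subclasses.  A conflicting hierarchy makes the
merge fail instead of silently picking an order.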

.. _C3: http://www.webcom.com/haahr/dylan/linearization-oopsla96.html
.. _Goo: http://www.ai.mit.edu/~jrb/goo/manual.43/goomanual_55.html

.. _typeobject.c: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/typeobject.c

===============================
`Psyco requests and patches`__
===============================
__ http://mail.python.org/pipermail/python-dev/2002-October/029038.html

Armin Rigo of psyco_ sent an email to Python-dev with some thoughts on
psyco and some requests.  He wished more people realized that psyco is
meant to be used by anyone and not just speed-hungry coders (it is a
cool app and if you have any interest in compilers you should take a
look), and that psyco could get some more advertising.  He then
mentioned three patches that he wrote (patches 617309_ , 617311_ , and
617312_ ) against 2.2.2 that he would like to see accepted so as to
ease maintenance of psyco.

Armin also mentioned how he would like to move psyco forward.  He
pointed out he would like to eventually write it all in Python.  This
would require tracking changes in the interpreter that psyco dealt
with.  He would like to keep this in mind when Python 3 discussions
kick up (and don't ask when that will happen; not for a VERY long
time).

In regards to Armin's patches, Martin v. Loewis thought they broke
binary compatibility (a big no-no between micro releases), but Armin
claimed they didn't.

After glancing over the patches it seems they have all been applied
against 2.2.2 and are being actively worked on by Armin for
application to 2.3.

.. _617309: http://www.python.org/sf/617309
.. _617311: http://www.python.org/sf/617311
.. _617312: http://www.python.org/sf/617312

.. _psyco: http://psyco.sf.net/

======================================================================
`PEP239 (Rational Numbers) Reference Implementation and new issues`__
======================================================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029065.html

splinter threads:
	- `Re: PEP239 (Rational Numbers) Reference Implementation and new
	  issues`__

__ http://mail.python.org/pipermail/python-dev/2002-October/029068.html

Christopher Craig uploaded `patch 617779`_ that implemented `PEP 239`_
(Rational Numbers).  He had some questions for Python-dev, though,
regarding a couple points.  These few points have become the bane of
my summarizing existence; this thread is huge.

One was whether division should return rationals instead of floats.
Since rationals keep precision in division they are the most accurate
way to perform division.  They also make the most output sense (e.g.
1/3).  The problem is that rational math is slow and this would
cause issues with any code that expected a float.

The next issue was about comparison.  Should a rational compare only
when it is exactly equal to a float or when the float is really close?

Lastly, Christopher wondered if rationals should hash the same as
floats.  The answer to the second issue would influence the answer to
this issue.
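
To make the comparison and hashing questions concrete, here is a small
sketch using the ``fractions`` module that entered the standard library
years after this discussion.  It is not Christopher's patch, and the
behaviour shown is only what that later module chose to do.

```python
from fractions import Fraction

half = Fraction(1, 2)
third = Fraction(1, 3)

# 0.5 is exactly representable in binary, so the comparison succeeds.
half_matches_float = (half == 0.5)

# The float 1/3 is only an approximation, so an exact comparison fails.
third_matches_float = (third == 1 / 3)

# Values that compare equal must hash equal to share a dict slot.
same_hash = (hash(half) == hash(0.5))

# Exact arithmetic: three thirds really are one.
thirds_sum_to_one = (third + third + third == 1)
```

Under exact comparison a rational equals a float only when the float
holds that precise value, which is why the hashing answer depends on the
comparison answer.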

Issue 1. Eric Raymond was for returning a rational.  Francois was kind
of on the fence.  Christian was +1 for returning rationals.  Guido
said ABC did this and that numeric processes thus ended up being slow.

Issue 2. Christian said "Let it grow!  Let the user feel what
precision he's carrying around, and how much they throw away when they
reduce down to a float."

Issue 3.  Eric Raymond suggested a global "fuzz" variable that defines
a "close-enough-for-equality range"; this idea was used by APL. 
Andrew Koenig was against this because you don't always want a fuzzy
comparison and it destroys substitutability: "If a==b, it is not
always true that f(a)==f(b)".  Andrew said he preferred Scheme's
numeric model.  To this, Guido said that "'It works in Scheme' doesn't
give me a warm fuzzy feeling that it's been tried in real life"; Tim
later laid the smack down on Scheme's numeric model and ended it with
"There's a reason the NumPy folks never bug you for Scheme features
<wink>".  Christian pointed out that keeping it as a rat would prevent
overflows from ever occurring from long division.  Tim was staunchly
against a fuzz variable.  Raymond suggested a fuzz comparison function
that took in a fuzz value.  Christopher said that the way it stands
now in the implementation is that rationals are coerced to floats and
then compared.  Oren Tirosh suggested a third boolean,
'Undetermined', that would be raised when the "difference between A
and B is below the error margin".  David Abrahams said that Boost
discussed this and said that the cost of adding ternary logic was not
worth it.
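
Andrew's substitutability objection can be sketched in a few lines.  The
helper ``fuzzy_eq`` and the function ``f`` below are hypothetical names
invented for this illustration; nobody on the list posted this code.

```python
from fractions import Fraction


def fuzzy_eq(a, b, fuzz):
    """Treat a and b as equal when they differ by at most fuzz."""
    return abs(a - b) <= fuzz


def f(x):
    """Any function that magnifies small differences will do."""
    return x * 10**9


fuzz = Fraction(1, 10**6)
a = Fraction(1)
b = Fraction(1) + Fraction(1, 10**9)

inputs_equal = fuzzy_eq(a, b, fuzz)          # differ by 1e-9
outputs_equal = fuzzy_eq(f(a), f(b), fuzz)   # differ by exactly 1
```

Here ``a`` and ``b`` are "equal" under the fuzz but ``f(a)`` and ``f(b)``
are not, which is the property Andrew was worried about losing.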

Andrew asked how rats could be optimized.  He suggested ditching
trailing zeros.  Tim wondered how much of a savings you would get from
this.  Raymond Hettinger suggested having a builtin variable that
would specify the "maximum denominator magnitude".  Christian liked
this idea.  Greg didn't think this would be a good solution because
people using rats are going to want them specifically because they are
exact.  So Raymond suggested the default be an unrestricted denominator.
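
The trade-off between a capped denominator and exactness can be sketched
with the later standard-library ``fractions`` module, whose
``limit_denominator()`` method works per value rather than through a
global setting.  This is only an analogy to Raymond's idea, not what was
proposed.

```python
import math
from fractions import Fraction

# The exact rational value of the float math.pi has a huge denominator.
exact = Fraction(math.pi)

# Capping the denominator gives a compact but inexact stand-in.
capped = exact.limit_denominator(1000)

still_exact = (capped == exact)

# Denominators also grow quickly under ordinary exact arithmetic.
harmonic = sum(Fraction(1, k) for k in range(1, 21))
```

The capped value is cheaper to carry around, but as Greg pointed out it
is no longer the number the user asked for.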

Guido brought up the question of how rats should be represented when
printed.

The syntax for rats came up.  Greg Ewing got the ball rolling by
suggesting the syntax for rat division as ``\\\``.  M.A. Lemburg
suggested just having a constructor like ``rat(2,3)``.  The discussion
then had a gamut of suggested syntax: ``2:3``, ``2r3``, ``{2/3}``
(Guido shot this down because he wants to leave the option open for
possible set notation), ``<2/3>``, ``2r/3``, something by Barry using
an extended character that Pine wouldn't display (it was a joke), and
finally ``2/3r`` by Guido.  People agreed that this last one suggested
by Guido was the best one.  Tim also pointed out that Scheme has
notation to specify whether a number is exact or not and using the 'r'
notation would basically provide the same functionality.

But regardless of what syntax people preferred, it was overwhelmingly
agreed that choosing the syntax should wait until rationals have been
in the language for a while and it is known how they are used.

If you only read one thing, read Tim's emails since he explains all of
this really well and is the resident math whiz on Python-dev.

.. _patch 617779: http://www.python.org/sf/617779
.. _PEP 239: http://www.python.org/peps/pep-0239.html

==================================================
`Non-ASCII characters in test_pep277.py in 2.3`__
==================================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029247.html

Guido pointed out that test_pep277.py_ uses an encoding cookie which
was not being recognized by his toolchain.  At some point he stated
that he is "still not 100% comfortable with using arbitrary coding
cookies in the Python distribution".
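
For anyone who has not met an encoding cookie yet, here is a small
sketch in present-day Python 3.  The source bytes and the name
``<cookie-demo>`` are made up for the example, and the thread does not
say which encoding test_pep277.py declares; Latin-1 is just an
assumption that makes the effect visible.

```python
import io
import tokenize

# A two-line module whose first line is the encoding cookie.  The byte
# 0xE9 is "e with acute accent" in Latin-1 but invalid on its own in UTF-8.
source = b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"

# The standard library can report which encoding the cookie declares.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)

# Compiling the raw bytes honours the cookie when decoding the literal.
namespace = {}
exec(compile(source, "<cookie-demo>", "exec"), namespace)
decoded_name = namespace["name"]
```

An editor that ignores the cookie would show or save that byte
differently, which is the kind of toolchain trouble Guido ran into.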

The reason I mention this thread (beyond the quote above) is that
info on how to get XEmacs to recognize the cookie came out.  Sjoerd
Mullender sent out the link
http://www.xemacs.org/Documentation/packages/html/mule-ucs_2.html
which helped some people.  As for Linux distro-specific problems, M.A.
Lemburg noticed that SuSE puts Mule support in a package named
'mule-ucs-xemacs' and once he got the package loaded XEmacs worked.

.. _test_pep277.py: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Lib/test/test_pep277.py

======================================================
`Unclear on the way forward with unsigned integers`__
======================================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029249.html

Splinter threads:
	- `PEP 263 in the works`__

__ http://mail.python.org/pipermail/python-dev/2002-October/029289.html

Mark Hammond was "a little confused by the new world order for working
with integers in extension modules".  Mark wanted to know how to create
objects that were more like a collection of bits than an integer.  Tim
suggested creating a Python long; that would act more like an unsigned
int in terms of its bits.

The FutureWarning for hexadecimal constants was brought up and it was
pointed out that to deal with those just stick an 'L' at the end. 
Remember folks, that in Python 2.3 ``0x80000000L == 2147483648``.
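
The difference between "an integer" and "a collection of 32 bits" can
be sketched as follows.  The helper names ``as_signed32`` and
``as_unsigned32`` are hypothetical, and the sketch assumes a C int that
is 32 bits wide.

```python
MASK32 = 0xFFFFFFFF


def as_unsigned32(value):
    """Keep only the low 32 bits, read as an unsigned quantity."""
    return value & MASK32


def as_signed32(value):
    """Read the low 32 bits as a two's-complement signed quantity."""
    value &= MASK32
    if value & 0x80000000:
        return value - 0x100000000
    return value


top_bit = 0x80000000
unsigned_view = as_unsigned32(top_bit)
signed_view = as_signed32(top_bit)
```

The same bit pattern is 2147483648 when read as unsigned and
-2147483648 when read as a signed 32-bit int, which is why a Python
long is the safer container when the bits are what matter.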

The usefulness of __future__ statements also came up.  Tim wondered
how useful they were.  Thomas Wouters, though, came to __future__'s
defense and explained how it helped him migrate people to newer
versions of Python without being yelled at for breaking their code.

=================================================
`segmentation fault with python2.3a0 from cvs`__
=================================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029286.html

Gersen Kurz was having problems importing a huge file.  The bug was
attributed to cygwin's malloc implementation, so people might want to
watch out for that.

It was also pointed out that a loop with a bunch of items in a dict
created a huge number of references.  It turns out that dicts use
dummy references in their implementation for when something is deleted.
So don't be alarmed by a huge reference count even after deleting an
immense dict.

==========================================
`Snapshot win32all builds of interest?`__
==========================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029352.html

As mentioned in the last summary, Mark Hammond wondered if anyone
would care to have access to compiled snapshots of CVS for Windows. 
He got enough of a response to give access at
http://starship.python.net/crew/mhammond .  This is not like the
standard Windows installer, though; "This version installs no
shortcuts, does not compile .pyc files etc - you are pretty much on
your own.  ``Pythonwin\start_pythonwin.pyw`` is installed to start
Pythonwin, but you must do so manually".  Mark would like to know if
you end up using this.

===========================================
`Set-next-statement in Python debuggers`__
===========================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029377.html

Richie Hindle wants to write a pure Python debugger; the problem is
that it would be difficult without certain C-level stuff exposed to
Python.  Specifically, he wants frame.f_lasti (to be found in
frameobject.c_ ) to be writable by Python so he "could implement
Set-Next-Statement".

Michael Hudson's is the first email I have from someone chiming in
against this.  He thought it would be better to make it a descriptor so
that "you can do some sanity checking on the values".  Guido then
chimed in saying that if it was writable that would open up a hole for
crashing a program.  Guido eventually said making it read-only would
be fine.
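
Michael's "descriptor with sanity checking" idea can be sketched in pure
Python.  The class ``CheckedPosition`` is a hypothetical stand-in, not a
real frame object and not anything that was proposed as a patch; it only
shows a property refusing offsets that do not start an instruction.

```python
import dis


class CheckedPosition:
    """Toy model of a writable 'last instruction' offset with validation."""

    def __init__(self, code):
        # Only offsets at which an instruction begins are legal targets.
        self._valid = {ins.offset for ins in dis.get_instructions(code)}
        self._lasti = min(self._valid)

    @property
    def lasti(self):
        return self._lasti

    @lasti.setter
    def lasti(self, offset):
        if offset not in self._valid:
            raise ValueError("offset %r is not the start of an instruction"
                             % (offset,))
        self._lasti = offset


def sample(x):
    y = x + 1
    return y * 2


position = CheckedPosition(sample.__code__)
target = max(position._valid)
position.lasti = target          # accepted: a real instruction boundary
```

A bogus value such as ``-1`` raises ``ValueError`` instead of corrupting
state, which is the kind of protection a bare writable C field would
not give.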

This led to Armin Rigo pointing out that you can crash the interpreter
already with the new module "or by writing crappy .pyc files".  Guido
acknowledged this, but said he didn't want to add anymore if it could
be helped.  He pointed out that he wants to stick with the idea that a
segfault is Python's fault unless proven otherwise.

.. _frameobject.c: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/frameobject.c

=====================
`Multibyte repr()`__
=====================
__ http://mail.python.org/pipermail/python-dev/2002-October/029442.html

A patch was applied that allowed repr() to return characters with the
high bit set; repr() used the "multibyte C library for printing string
if available".  This had caused a bug and made Guido wonder if this
was a good thing to do.  For an example::

  >>> u = u'\u1f40'  # Python 2.2
  >>> s = u.encode('utf8')
  >>> s
  '\xe1\xbd\x80'
  >>>

  >>> u = u'\u1f40'
  >>> s = u.encode('utf8')
  >>> s
  'á½\x80'  # Notice the extended characters
  >>>

"The latter output is not helpful, because the encoding of s is not
the locale's encoding".
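
For comparison, here is the same experiment in present-day Python 3,
where encoded text is a ``bytes`` object.  This is only an illustration
of locale-independent output, not what anyone in the thread ran.

```python
u = "\u1f40"
s = u.encode("utf8")

# repr() of bytes escapes every non-ASCII byte, whatever the locale is.
shown = repr(s)
only_ascii = all(ord(ch) < 128 for ch in shown)
```

The result is ``b'\xe1\xbd\x80'`` with the escapes spelled out, so the
output can be pasted back into source code regardless of terminal
settings.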

Martin v. Loewis said that he thought the patch author's intention
was "to get 'proper' output in interactive mode for strings".  Part of
the issue with all of this is that GNU readline calls setlocale()
automatically.  A patch came about to reset the locale in the extension
module back to its original state.

The issue of whether pickling would break because of this also came
up.  Guido tried it and had no issue.  Atsuo Ishimoto (who brought up
the possible problem) said it broke when using the ShiftJIS locale.

But the worry of having repr() be locale-specific still lingered. 
Martin said he was "convinced that having repr locale-specific is
unacceptable".  He said, though, that having the tp_print slot use a
locale-aware print function was fine and to have it differ from
tp_repr was fine.  But this was shot down by Guido; "tp_print only
gets invoked when sys.stdout is a real file; otherwise str() or repr()
get invoked".  Apparently tp_print is a performance optimization and
thus should be fully transparent and not be different in any way to
the user.

Guido said that the multibyte-string patch had to be backed out, given
the pickle issue and the different semantics for sys.stdout because of
tp_print.

As Tamito Kajiyama said, "one of the virtues of Python is that Python
has no language feature that is (automagically) affected by locale
settings".

=======================================
`tp_dictoffset calculation in 2.2.2`__
=======================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029502.html

Guido asked David Abrahams and Kevin Jacobs if a change in how
tp_dictoffset (found in `typeobject.c`_ ) was calculated would affect
them.  To give some background, tp_dictoffset "tells us where the
instance dict pointer is in the instance object layout".  This was a
three step process (all Guido's words):

1. (Line 1113) For dynamically created types (i.e. created by class
statements), if the new type has no __slots__, the dominant base class
has a zero tp_dictoffset, and the dominant base doesn't have a custom
tp_getattro, a __dict__ will be added to the instance layout, and its
tp_dictoffset is calculated by taking the end of the base class
instance struct (or in a more complicated way if tp_itemsize is
nonzero).

2. (Line 1941) If the dominant base has a nonzero tp_dictoffset, copy
the tp_dictoffset from the dominant base.

3. (Line 2090) The tp_dictoffset of the first non-dominant base class
that has a nonzero tp_dictoffset is copied.

That last rule had caused Guido and Jeremy Hylton some problems with
some code they were bugfixing.  Guido wanted to just get rid of that
rule since he thought "it is *always* wrong".  Both David and Kevin
said that nothing broke for them, so that is now all straightened out.
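
The effect of tp_dictoffset is visible from Python through the
``__dictoffset__`` attribute of a type.  The sketch below uses made-up
class names and only checks zero versus nonzero, since the actual
offset values are an implementation detail that varies between
versions.

```python
class Plain:
    """No __slots__, so instances get a __dict__ (step 1 above)."""


class Slotted:
    """__slots__ suppresses the per-instance __dict__."""
    __slots__ = ("x",)


class Child(Slotted):
    """No __slots__ here, so a __dict__ is added back for this subclass."""


plain_has_dict = hasattr(Plain(), "__dict__")
slotted_has_dict = hasattr(Slotted(), "__dict__")
child_has_dict = hasattr(Child(), "__dict__")

offsets_nonzero = {
    "Plain": Plain.__dictoffset__ != 0,
    "Slotted": Slotted.__dictoffset__ != 0,
    "Child": Child.__dictoffset__ != 0,
}
```

A zero ``__dictoffset__`` means the instance layout has no dict pointer
at all, which matches the "zero tp_dictoffset" wording in Guido's
description.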

==========================
`Memory size overflows`__
==========================
__ http://mail.python.org/pipermail/python-dev/2002-October/029535.html

Armin Rigo pointed out some potential overflow errors "with objects of
very large sizes".  The issue was that the calculation of the amount of
memory to allocate could itself overflow.  Armin suggested adding
macros to deal with various issues.
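
Python integers never overflow, so the sketch below simulates a C
``size_t`` to show the kind of check being discussed.  The constant
``SIZE_MAX`` (assumed 32 bits here) and the function names are
hypothetical; this is not Armin's macro.

```python
SIZE_MAX = 2**32 - 1  # assumption: a 32-bit size_t


def wrapped_size(header, nitems, itemsize):
    """What unchecked C arithmetic would produce: the result wraps."""
    return (header + nitems * itemsize) & SIZE_MAX


def checked_size(header, nitems, itemsize):
    """Refuse the request when header + nitems * itemsize cannot fit."""
    if itemsize and nitems > (SIZE_MAX - header) // itemsize:
        raise OverflowError("requested object size does not fit in size_t")
    return header + nitems * itemsize


small = checked_size(16, 10, 8)

# 2**29 items of 8 bytes is 2**32 bytes, one more than SIZE_MAX.
silently_tiny = wrapped_size(16, 2**29, 8)
```

The unchecked version asks the allocator for just 16 bytes, and the
caller then writes far past the end of them.  The checked version
divides before multiplying so that the test itself cannot overflow.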

Enter Tim Peters, creator of pymalloc.  He admitted that he "always
ignore[s] these [errors] until one pops up in real life" since
"Checking slows the code, and that causes [Tim] pain <0.5 wink>".  He
pointed out another possible overflow calculation problem.  But it was
basically a hopeless battle since malloc() has its own
cross-platform issues.

But Tim did say that if it was decided to go down the macro route,
then he wanted something like what Zope does:
``DO_SOMETHING_OR(RESULT_LVALUE, INPUT1, ..., ON_ERROR_BLOCK);``.  The
result goes into RESULT_LVALUE unless there is a problem, in which
case ON_ERROR_BLOCK is run.

Christian Tismer chimed in and said that he thought we should just
move completely over to 64 bit math.  Ruby had done it successfully so
it wasn't like we were taking a blind leap.  It would also save us the
hassle of doing it down the road when 64 bit processors become the
norm instead of the exception.