[Python-Dev] Python-dev Summary for 2002-10-01 through 2002-10-13

Brett Cannon drifty@bigfoot.com
Sun, 13 Oct 2002 01:03:08 -0700 (PDT)


I realize this is a little early, but I am going on vacation on Monday for
two weeks (Raymond Hettinger has agreed to do the summary while I am
gone), so I wanted to get this out the door before I leave.

Usual 24 hour correction period for the list to make my writing skills and
comprehension skills seem inferior is in effect.  =)

--------------------------------------------------------------------

This is a summary of traffic on the `python-dev mailing list`_ between
October 1, 2002 and October 13, 2002 (inclusive).  It is intended to
inform the wider Python community of on-going developments on the list
that might interest it.  To comment on anything
mentioned here, just post to python-list@python.org or comp.lang.python in
the usual way; give your posting a meaningful subject line, and if it's
about a PEP, include the PEP number (e.g. Subject: PEP 201 - Lockstep
iteration). All python-dev members are interested in seeing ideas
discussed by the community, so don't hesitate to take a stance on a PEP
(or anything else for that matter) if you have an opinion.  And if all of
this really interests you then get involved and join Python-dev!

This is the fourth summary written by Brett Cannon (with a partially fried
brain thanks to the GRE).

Summaries by me (2002-09-15 to ... when I burn out) are archived at:
    http://www.ocf.berkeley.edu/~bac/python-dev/summaries/index.php
You can find summaries by Michael Hudson (2002-02-01 to 2002-07-05) at:
    http://starship.python.net/crew/mwh/summaries/index.html
Summaries by A.M. Kuchling (2000-12-01 to 2001-01-31) are at:
    http://www.amk.ca/python/dev/

Please note that this summary is written using reStructuredText_ which can
be found at http://docutils.sourceforge.net/rst.html .  Any unfamiliar
punctuation is probably markup for reST; you can safely ignore it
(although I suggest learning reST; it's nice and is accepted for PEP
markup).  Also, because of the wonders of reformatting thanks to whatever
you are using to read this, I cannot guarantee you will be able to run
this text through DocUtils as-is.  If you want to do that, get the
original text from the archive.

.. _python-dev mailing list:
http://mail.python.org/mailman/listinfo/python-dev
.. _reST:
.. _reStructuredText: http://docutils.sf.net/

======================
Summary Announcements
======================

This is a new section to the summary that I have decided to introduce.  It
is mainly going to serve to make any general announcements or comments on
this summary and this summary alone.  All universal comments will stay at
the top of the summaries.

Just to let everyone know, I am taking off for two weeks on vacation
starting 2002-10-14 and I will not return until 2002-10-30.  Now, before
you all start sobbing over the loss of one of my great summaries, you
should know that Raymond Hettinger has graciously taken up the job of temp
summarizer for me and will do the summary while I am gone.

Michael Hudson has made the suggestion that I inject more of my
personality into the summary so as to liven it up a little.  I am
personally quite happy to do this.  But the real question is do you, fine
reader, mind the idea?  If I don't hear from throngs of people going "your
sarcastic tone takes away from the wonderfully drawl summaries and that is
a bad thing", then I will just go ahead and write with personality.  Just
don't complain later.  =)

2.2.2b1 has been occupying Python-dev during this summary period, and so
this summary is shorter than usual.  I left out a bunch of threads that
were discussing bugfixes that I either didn't find interesting or didn't
think the rest of the world would care about.

Now it's time for my personal favor of the month.  I am going to start
applying to grad school for computer (science | programming) when I get
back from vacation.  If anyone knows of Python-friendly schools out there,
let me know.  Heck, I am even willing to leave America to go to school as
long as the classes are in English.  So if you know of any, please let me
know!

And now on to the summary.

=====================================
`Python 2.2.2 beta release on Monday!`__
=====================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029204.html

splintered thread:

	RELEASED: Python 2.2.2b1 :
http://mail.python.org/pipermail/python-dev/2002-October/029337.html

Python 2.2.2b1 was released on Monday, October 7.  This is the reason (or
perhaps excuse is a better description) for the lighter summary this week.
A good amount of the traffic on Python-dev was about bugfixing 2.2.2b1.
Most of this probably would not interest the average Python user, and thus
I didn't summarize a bunch of threads.  Taking the GRE also didn't help
with my free time and thus has caused me to cut down on the summary since
I am having to go through a huge backlog to get this out the door.

=====================================
`Dropping support for Tcl 8.0 and 8.1`__
=====================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029024.html

Martin v. Loewis asked if it would be okay to drop support for Tcl_ 8.0
and 8.1 since `_tkinter.c`_ has special code in there just for those
outdated versions.  Guido ok'ed it, so if you are still using a version
of Tk from way back when, it's time to upgrade.

.. __tkinter.c:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/_tkinter.c

=======================================
`*very* revised proposal for interfaces`__
=======================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029042.html

Previously John Williams came up with a proposal for implementing the
stuff from `PEP 245`_ (Python Interface Syntax) and `PEP 246`_ (Object
Adaptation).  From my understanding of what John has done, it appears he
has written an interface system in pure Python.  If you want backstory to
this long and involved discussion to get an interface system into Python
read the Python-dev Summaries for 2002-08-16 to 2002-09-01 and 2002-09-01
to 2002-09-15 .

Gerald Williams, Michael Chermside, and Esteban Castro all commented on
the implementation and made various suggestions.

John said that he was not done yet implementing this.  But if you were
interested in the whole previous discussion on interfaces you could
consider looking at what John has done.  As for it going into the
language, I suspect that will have to wait until John is done and has
convinced the PEP writers and Python-dev that his implementation fits the
bill.  Stay tuned.

If you want some more background info on interfaces, read the previously
mentioned summaries.  As for object adaptation, read on c.l.py and on
Python-dev anything by Alex Martelli on the subject.  He has become the
main proponent of object adaptation and has written several very extensive
essays on the subject.
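
For readers who have not followed the adaptation threads, here is a rough
sketch of the idea behind PEP 246.  It is not John's code and not the
PEP's reference text: the hook names ``__conform__`` and ``__adapt__``
come from the PEP, but the body of ``adapt`` and the ``Celsius`` class are
simplified assumptions made up for illustration.

```python
def adapt(obj, protocol):
    """Simplified adaptation: return obj, or a stand-in for it, that fits protocol."""
    # 1. The object already is an instance of the requested protocol.
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    # 2. Ask the object's type whether it knows how to conform.
    conform = getattr(type(obj), "__conform__", None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    # 3. Ask the protocol whether it knows how to adapt the object.
    adapt_hook = getattr(protocol, "__adapt__", None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    raise TypeError("cannot adapt %r to %r" % (obj, protocol))


class Celsius:
    """Hypothetical class that knows how to present itself as a float."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __conform__(self, protocol):
        if protocol is float:
            return float(self.degrees)
        return None


print(adapt(Celsius(21.5), float))
print(adapt(3, int))
```

The point of the two hooks is that either side, the object or the
protocol, can supply the adaptation without the other knowing about it.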

.. _PEP 245: http://www.python.org/peps/pep-0245.html
.. _PEP 246: http://www.python.org/peps/pep-0246.html

=================
`perplexed by mro`__
=================
__ http://mail.python.org/pipermail/python-dev/2002-October/029035.html

Splinter threads:

	Re: my proposals about mros (was: perplexed by mro) :
http://mail.python.org/pipermail/python-dev/2002-October/029167.html
	C3 implementation :
http://mail.python.org/pipermail/python-dev/2002-October/029230.html

Samuele Pedroni said he was "trying to wrap [his] head around the mro
computation in 2.2".  Apparently there is the algorithm mentioned at
http://www.python.org/2.2.1/descrintro.html (dubbed the naive algorithm)
and then the one implemented in typeobject.c_ (called the 2.2 algorithm).
Samuele discovered some inconsistencies with the implemented algorithm
that he desired some explanation about.

Guido responded, thankful that someone was giving this a look because his
"intuition about the equivalence between algorithms turned out to be
wrong".  Guido stated that he thought that he wrote the algorithm from the
book "Putting Metaclasses To Work" correctly sans raising an error when
major conflicts occur in the ordering.  In a later email Guido explained
that the naive algorithm came about by his attempt to simplify the
explanation of the 2.2 algorithm.  Guido pretty much wrote the algorithm
from the aforementioned book.  Now the algorithm is not simple, so Guido
did his best to simplify the explanation.  Unknowingly, though, he came
up with a variant on the algorithm in his explanation.

Greg Ewing pointed out that he thought the naive algorithm was nicer since
it seemed to work more intuitively and was easier to explain (and remember
kids, these are basic tenets in Python programming).  Guido ended up
stating that "If Samuele agrees that the naive algorithm works better,
[Guido will] try to make it so in 2.3".  Well, Samuele said that the "2.2
mro is the worst of our options".

There was a problem, though, with the naive algorithm; it is not monotonic
as pointed out by Samuele.  This led him to put out two options:

1. Use the naive algorithm, which had the drawback of not being monotonic.
Samuele also believed that it didn't produce "the most natural results".

2. Adopt C3_ as described at
http://www.webcom.com/haahr/dylan/linearization-oopsla96.html and
apparently used by Dylan_ and Goo_ .  This algorithm is monotonic and
Samuele says it is more intuitive.

Guido got around to reading the C3_ paper and agreed that "we should adopt
C3".  He thought that the 2.2 algorithm was like the C*[LOOPS] algorithm
mentioned in the paper.  Samuele then wrote a C implementation of the
algorithm.  Guido said he would get to the patch after 2.2.2b1 got out the
door.
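
To give a feel for what C3 does, here is a minimal pure-Python sketch of
the merge step as I understand it from the paper.  This is not Samuele's C
patch; the function names ``c3_merge`` and ``c3_linearize`` and the small
diamond hierarchy are made up for illustration.  Current Python releases
compute ``__mro__`` with C3, so the sketch can be checked against it.

```python
def c3_merge(sequences):
    """Merge linearizations C3-style; raise TypeError if no consistent order exists."""
    sequences = [list(s) for s in sequences if s]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A head is acceptable only if it appears in no sequence's tail.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy: no C3 linearization")
        result.append(head)
        # Remove the chosen class everywhere and drop sequences that emptied.
        sequences = [[c for c in s if c is not head] for s in sequences]
        sequences = [s for s in sequences if s]
    return result


def c3_linearize(cls):
    """Return the C3 linearization of cls as a list of classes."""
    bases = list(cls.__bases__)
    return c3_merge([[cls]] + [c3_linearize(b) for b in bases] + [bases])


# A small diamond hierarchy to try it on.
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

print([c.__name__ for c in c3_linearize(D)])
print(c3_linearize(D) == list(D.__mro__))
```

The final list of direct bases passed to the merge is what keeps the local
precedence order (``B`` before ``C`` for ``D``) intact.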

.. _C3: http://www.webcom.com/haahr/dylan/linearization-oopsla96.html
.. _Dylan: http://monday.sourceforge.net/wiki/
.. _Goo: http://www.ai.mit.edu/~jrb/goo/manual.43/goomanual_55.html

.. _typeobject.c:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/typeobject.c

===========================
`Psyco requests and patches`__
===========================
__ http://mail.python.org/pipermail/python-dev/2002-October/029038.html

Armin Rigo of psyco_ sent an email to Python-dev with some thoughts on
psyco and some requests.  After mentioning how he wished more people
realized psyco is meant to be used by anyone and not just speed-hungry
coders (it is a cool app and if you have any interest in compilers you
should take a look) and that psyco could use some more advertising, he
mentioned three patches that he wrote (patches 617309_ , 617311_ , and
617312_ ) against 2.2.2 that he would like to see accepted so as to ease
maintenance of psyco.

Armin also mentioned how he would like to move psyco forward.  He pointed
out he would like to eventually write it all in Python.  This would
require tracking changes in the interpreter that psyco dealt with.  He
would like to keep this in mind when Python 3 discussions kick up (and
don't ask when that will happen; not for a VERY long time).

In regards to Armin's patches, Martin v. Loewis thought they broke binary
compatibility (big no-no between micro releases), but Armin claimed they
didn't.

After glancing over the patches it seems they have all been applied
against 2.2.2 and are being actively worked on by Armin for application to
2.3.

.. _617309: http://www.python.org/sf/617309
.. _617311: http://www.python.org/sf/617311
.. _617312: http://www.python.org/sf/617312

.. _psyco: http://psyco.sf.net/

===================================================================
`PEP239 (Rational Numbers) Reference Implementation and new issues`__
===================================================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029065.html

splintered threads:
	Re: PEP239 (Rational Numbers) Reference Implementation and new
issues :
http://mail.python.org/pipermail/python-dev/2002-October/029068.html

Christopher Craig uploaded `patch 617779`_ that implemented `PEP 239`_
(Rational Numbers).  He had some questions for Python-dev, though,
regarding a couple points.  These few points have become the bane of my
summarizing existence; this thread is huge.

One was whether division should return rationals instead of floats.  Since
rationals keep precision in division they are the most accurate way to
perform division.  They also make the most output sense (e.g. 1/3).
The problem with this is that rational math is slow and this would cause
issues with any code that expected a float.

The next issue was about comparison.  Should a rational compare only when
it is exactly equal to a float or when the float is really close?

Lastly, Christopher wondered if rationals should hash the same as floats.
The answer to the second issue would influence the answer to this issue.
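
To make the three questions concrete, here is a small sketch using the
``fractions.Fraction`` type from today's standard library.  That module is
an assumption on my part as a stand-in: it is not Christopher's patch, and
its answers (exact comparison, hashes that agree with equal floats) are
just one possible set of choices for the issues above.

```python
from fractions import Fraction

# Division: a rational keeps 1/3 exact, a float cannot.
third = Fraction(1, 3)
print(third + third + third == 1)      # exact arithmetic
print(0.1 + 0.1 + 0.1 == 0.3)          # float rounding shows through

# Comparison: only floats that are *exactly* the same value compare equal.
print(Fraction(1, 2) == 0.5)           # 0.5 is exactly representable
print(Fraction(1, 3) == 1 / 3)         # the float is only close to 1/3

# Hashing: values that compare equal must hash equal to work as dict keys.
print(hash(Fraction(1, 2)) == hash(0.5))
```

Note how the hashing answer follows from the comparison answer, which is
the dependency Christopher pointed out.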

Issue 1. Eric Raymond was for returning a rational.  Francois was kind of
on the fence.  Christian was +1 for returning rationals.  Guido said ABC
did this and that numeric processes thus ended up being slow.

Issue 2. Christian said "Let it grow!  Let the user feel what precision
he's carrying around, and how much they throw away when they reduce down
to a float."

Issue 3.  Eric Raymond suggested a global "fuzz" variable that defines a
"close-enough-for-equality range"; this idea was used by APL.  Andrew
Koenig was against this because you don't always want a fuzzy comparison
and it destroys substitutability: "If a==b, it is not always true that
f(a)==f(b)".  Andrew said he preferred Scheme's numeric model.  To this,
Guido said that "'It works in Scheme' doesn't give me a warm fuzzy feeling
that it's been tried in real life"; Tim later laid the smack down on
Scheme's numeric model and ended it with "There's a reason the NumPy folks
never bug you for Scheme features <wink>".  Christian pointed out that
keeping it as a rat would prevent overflows from ever occurring from long
division.  Tim was staunchly against a fuzz variable.  Raymond suggested a
fuzz comparison function that took in a fuzz value.  Christopher said that
the way it stands now in the implementation is that rationals are coerced
to floats and then compared.  Oren Tirosh suggested a third boolean,
'Undetermined', that would be raised when the "difference between A and B
is below the error margin".  David Abrahams said that Boost discussed this
and said that the cost of adding ternary logic was not worth it.
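
Andrew's substitutability objection is easy to see with a toy version of
a fuzz comparison function.  The helper name ``fuzzy_equal`` and the
sample values are hypothetical, picked so the floats involved are exact;
this is a sketch of the general idea, not anything posted in the thread.

```python
def fuzzy_equal(a, b, fuzz):
    """Hypothetical helper: a and b count as equal if they differ by at most fuzz."""
    return abs(a - b) <= fuzz


def times_ten(x):
    return x * 10


a, b, c = 0.0, 0.75, 1.5
fuzz = 1.0

# Fuzzy equality is not transitive: a ~ b and b ~ c, but not a ~ c.
print(fuzzy_equal(a, b, fuzz), fuzzy_equal(b, c, fuzz), fuzzy_equal(a, c, fuzz))

# It also breaks substitutability: a ~ b does not give f(a) ~ f(b).
print(fuzzy_equal(times_ten(a), times_ten(b), fuzz))
```

Passing the fuzz in explicitly, as Raymond suggested, at least keeps the
choice local to the caller instead of hiding it in a global.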

Andrew asked how rats could be optimized.  He suggested ditching trailing
zeros.  Tim wondered how much of a save you would get from this.  Raymond
Hettinger suggested having a builtin variable that would specify the
"maximum denominator magnitude".  Christian liked this idea.  Greg didn't
think this would be a good solution because people using rats are going to
want them specifically because they are exact.  So Raymond suggested the
default be an unrestricted denominator.
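
The trade-off of capping the denominator can be sketched with
``Fraction.limit_denominator`` from today's ``fractions`` module.  Be
aware this is a stand-in: it is a per-value method call, not the builtin
variable Raymond proposed, and it is not part of the patch discussed here.

```python
import math
from fractions import Fraction

# Unrestricted: the exact rational value of the float closest to pi.
exact = Fraction(math.pi)

# Restricted: the closest rational whose denominator is at most 1000.
approx = exact.limit_denominator(1000)

print(exact.denominator > 1000)
print(approx)
print(approx == exact)   # the cap buys small numbers at the cost of exactness
```

This is Greg's objection in miniature: once the denominator is capped,
the value is no longer the one you started with.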

Guido brought up the question of how rats should be represented when
printed.

The syntax for rats came up.  Greg Ewing got the ball rolling by
suggesting the syntax for rat division as ``\\\``.  M.A. Lemburg suggested
just having a constructor like ``rat(2,3)``.  The discussion then had a
gamut of suggested syntax: ``2:3``, ``2r3``, ``{2/3}`` (Guido shot this
down because he wants to leave the option open for possible set notation),
``<2/3>``, ``2r/3``, something by Barry using an extended character that
Pine wouldn't display (it was a joke), and finally ``2/3r`` by Guido.
People agreed that this last one suggested by Guido was the best one.  Tim
also pointed out that Scheme has notation to specify whether a number is
exact or not and using the 'r' notation would basically provide the same
functionality.

But regardless of what syntax people preferred, it was overwhelmingly
agreed that choosing the syntax should wait until rationals have been in
the language for a while and it is known how they are used.

If you only read one thing, read Tim's emails since he explains all of
this really well and is the resident math whiz on Python-dev.

.. _patch 617779: http://www.python.org/sf/617779
.. _PEP 239: http://www.python.org/peps/pep-0239.html

==============================================
`Non-ASCII characters in test_pep277.py in 2.3`__
==============================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029247.html

Guido pointed out that test_pep277.py_ uses an encoding cookie which was
not being recognized by his toolchain.  At some point he stated that he is
"still not 100% comfortable with using arbitrary coding cookies in the
Python distribution".

The reason I mention this thread (beyond for the quote above) is that info
on how to get XEmacs to recognize the cookie came out.  Sjoerd Mullender
sent out the link
http://www.xemacs.org/Documentation/packages/html/mule-ucs_2.html which
helped some people.  As for Linux distro-specific problems, M.A. Lemburg
noticed that SuSE puts Mule support in a package named 'mule-ucs-xemacs'
and once he got the package loaded XEmacs worked.

.. _test_pep277.py:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Lib/test/test_pep277.py

==================================================
`Unclear on the way forward with unsigned integers`__
==================================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029249.html

Splinter threads:
	PEP 263 in the works :
http://mail.python.org/pipermail/python-dev/2002-October/029289.html

Mark Hammond was "a little confused by the new world order for working
with integers in extension modules".  Mark wanted to know how to create
objects that were more like a collection of bits than an integer.  Tim
suggested creating a Python long; that would act more like an unsigned int
in terms of its bits.

The FutureWarning for hexadecimal constants was brought up and it was
pointed out that to deal with those just stick an 'L' at the end.
Remember folks, that in Python 2.3 ``0x80000000L == 2147483648``.
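
Here is a small sketch of treating a Python integer as a collection of 32
bits, in the spirit of Tim's suggestion.  It is written for current
Python, where every int is unbounded and the 'L' suffix no longer exists;
the helper names ``as_unsigned32`` and ``as_signed32`` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def as_unsigned32(n):
    """Keep only the low 32 bits, read as an unsigned value."""
    return n & MASK32


def as_signed32(n):
    """Read the low 32 bits of n as a two's-complement signed value."""
    n &= MASK32
    return n - 0x100000000 if n & 0x80000000 else n


print(0x80000000 == 2147483648)
print(as_signed32(0x80000000))   # what a signed 32-bit C int would see
print(hex(as_unsigned32(-1)))    # all 32 bits set
```

Masking on the way in and out of an extension module keeps the bit
pattern the same no matter which signedness the C side assumes.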

The usefulness of __future__ statements also came up.  Tim wondered how
useful they were.  Thomas Wouters, though, came to __future__'s defense
and explained how it helped him migrate people to newer versions of Python
without being yelled at for breaking their code.
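
If you are curious how a migration schedule is recorded, the
``__future__`` module can be inspected at runtime.  The sketch below looks
at the ``division`` feature; the choice of that feature is mine, and the
thread did not discuss this introspection API.

```python
import __future__

feature = __future__.division

# First release in which "from __future__ import division" was accepted.
optional = feature.getOptionalRelease()
# Release in which the new behaviour became the default.
mandatory = feature.getMandatoryRelease()

print(optional[:2], mandatory[:2])
```

The gap between the two releases is the window Thomas was talking about:
people can opt in file by file before the change is forced on them.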

=============================================
`segmentation fault with python2.3a0 from cvs`__
=============================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029286.html

Gerson Kurz was having problems importing a huge file.  The bug was
attributed to cygwin's malloc implementation, so people might want to
watch out for that.

It was also pointed out that a loop with a bunch of items in a dict
created a huge number of references.  It turns out that dicts use dummy
references in their implementation for when something is deleted.  So
don't be alarmed by huge reference counts even after deleting an immense
dict.

======================================
`Snapshot win32all builds of interest?`__
======================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029352.html

As mentioned in the last summary, Mark Hammond wondered if anyone would
care to have access to compiled snapshots of CVS for Windows.  He got
enough of a response to give access at
http://starship.python.net/crew/mhammond .  This is not like the standard
Windows installer, though; "This version installs no shortcuts, does not
compile .pyc files etc - you are pretty much on your own.
Pythonwin\start_pythonwin.pyw is installed to start Pythonwin, but you
must do so manually".  Mark would like to know if you end up using this.

=======================================
`Set-next-statement in Python debuggers`__
=======================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029377.html

Richie Hindle wants to write a pure Python debugger; the problem is that
it would be difficult without certain C-level stuff exposed to Python.
Specifically, he wants frame.f_lasti (to be found in frameobject.c_ ) to
be writable by Python so he "could implement Set-Next-Statement".

Michael Hudson's email is the first one I have of someone weighing in
against this.  He thought it would be better to make it a descriptor so
that "you can do some sanity checking on the values".  Guido then chimed
in saying that if it was writable that would open up a hole for crashing a
program.  Guido eventually said making it read-only would be fine.

This led to Armin Rigo pointing out that you can crash the interpreter
already with the new module "or by writing crappy .pyc files".  Guido
acknowledged this, but said he didn't want to add any more if it could be
helped.  He pointed out that he wants to stick with the idea that a
segfault is Python's fault unless proven otherwise.
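
For anyone who has never poked at a frame object, this short sketch reads
``f_lasti`` and then tries the assignment Richie was asking for.  The
helper names are hypothetical, and the behaviour shown is that of a
current CPython, where the attribute is read-only.

```python
import sys


def current_instruction_offset():
    """Return f_lasti (offset of the last executed bytecode) of the caller's frame."""
    return sys._getframe(1).f_lasti


def try_to_set_f_lasti():
    """Attempt to write f_lasti and report whether the interpreter allowed it."""
    frame = sys._getframe()
    try:
        frame.f_lasti = 0
    except AttributeError:
        return "read-only"
    return "writable"


print(current_instruction_offset())
print(try_to_set_f_lasti())
```

Refusing the write outright is the bluntest form of the sanity checking
Michael had in mind: no value gets through, so no bad value can.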

.. _frameobject.c:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/frameobject.c

=================
`Multibyte repr()`__
=================
__ http://mail.python.org/pipermail/python-dev/2002-October/029442.html

A patch was applied that allowed repr() to return characters with the high
bit set; repr() used the "multibyte C library for printing string if
available".  This had caused a bug and made Guido wonder if this was a
good thing to do.  For an example::

  >>> u = u'\u1f40'  # Python 2.2
  >>> s = u.encode('utf8')
  >>> s
  '\xe1\xbd\x80'
  >>>

  >>> u = u'\u1f40'  # with the patch applied
  >>> s = u.encode('utf8')
  >>> s
  'á½\x80'  # Notice the extended character
  >>>

"The latter output is not helpful, because the encoding of s is not the
locale's encoding".

Martin v. Loewis said that he thought the patch author's intention was
"to get 'proper' output in interactive mode for strings".  Part of the
issue with all of this is that GNU readline calls setlocale()
automatically.  A patch came about to reset it in the extension module
back to its original state.

The issue of whether pickling would break because of this also came up.
Guido tried it and had no issue.  Atsuo Ishimoto (who brought up the
possible problem) said it broke when using the Shift-JIS locale.

But the worry of having repr() be locale-specific still lingered.  Martin
said he was "convinced that having repr locale-specific is unacceptable".
He said, though, that having the tp_print slot use a locale-aware print
function was fine and to have it differ from tp_repr was fine.  But this
was shot down by Guido; "tp_print only gets invoked when sys.stdout is a
real file; otherwise str() or repr() get invoked".  Apparently tp_print is
a performance optimization and thus should be fully transparent and not be
different in any way to the user.

With the pickle issue and the different semantics for sys.stdout because
of tp_print, Guido said the multibyte-string patch had to be backed out.

As Tamito Kajiyama said, "one of the virtues of Python is that Python has
no language feature that is (automagically) affected by locale settings".
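
The locale-independent behaviour that won out can be checked in a current
Python, where the 8-bit string from the example above is a ``bytes``
object.  This sketch is written for Python 3, which is an assumption on my
part and obviously not what was being patched at the time.

```python
u = '\u1f40'
s = u.encode('utf-8')

# repr() of the encoded bytes escapes every byte with the high bit set,
# whatever the locale happens to be.
print(repr(s))

# ascii() gives the same kind of escaped, locale-independent view of text.
print(ascii(u))
```

Because the output never depends on the locale, it can be pasted back
into source code or compared in a test on any machine.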

===================================
`tp_dictoffset calculation in 2.2.2`__
===================================
__ http://mail.python.org/pipermail/python-dev/2002-October/029502.html

Guido asked David Abrahams and Kevin Jacobs if a change in how
tp_dictoffset (found in typeobject.c_ ) was calculated would affect them.
To give some background, tp_dictoffset "tells us where the instance dict
pointer is in the instance object layout".  This was a three step process
(all Guido's words):

1. (Line 1113) For dynamically created types (i.e. created by class
statements), if the new type has no __slots__, the dominant base class has
a zero tp_dictoffset, and the dominant base doesn't have a custom
tp_getattro, a __dict__ will be added to the instance layout, and its
tp_dictoffset is calculated by taking the end of the base class instance
struct (or in a more complicated way if tp_itemsize is nonzero).

2. (Line 1941) If the dominant base has a nonzero tp_dictoffset, copy the
tp_dictoffset from the dominant base.

3. (Line 2090) The tp_dictoffset of the first non-dominant base class that
has a nonzero tp_dictoffset is copied.

That last rule had caused Guido and Jeremy Hylton some problems with some
code they were bugfixing.  Guido wanted to just get rid of that rule since
he thought "it is *always* wrong".  Both David and Kevin said that nothing
broke for them, so that is now all straightened out.
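
You can watch the effect of these rules from Python, since type objects
expose the slot as ``__dictoffset__``.  The sketch below only checks zero
versus nonzero, because the actual numbers vary between interpreter
versions and platforms; the class names are made up.

```python
class Plain:
    pass                    # no __slots__: instances get a __dict__


class Slotted:
    __slots__ = ("x",)      # __slots__ and no base with a dict: no __dict__


class SlottedChild(Slotted):
    pass                    # no __slots__ here, so a __dict__ is added back


for cls in (int, Plain, Slotted, SlottedChild):
    has_dict_slot = cls.__dictoffset__ != 0
    print(cls.__name__, has_dict_slot)
```

``SlottedChild`` is the first rule in action: its base has a zero
tp_dictoffset, the subclass declares no ``__slots__``, and so a
``__dict__`` is appended to the instance layout.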

======================
`Memory size overflows`__
======================
__ http://mail.python.org/pipermail/python-dev/2002-October/029535.html

Armin Rigo pointed out some potential overflow errors "with objects of
very large sizes".  The issue was that when the amount of memory to
allocate was calculated, there was a chance that the calculation would
overflow.  Armin suggested adding macros to deal with various issues.

Enter Tim Peters, creator of pymalloc.  He admitted that he "always
ignore[s] these [errors] until one pops up in real life" since "Checking
slows the code, and that causes [Tim] pain <0.5 wink>".  He pointed out
another possible overflow calculation problem.  But it was basically a
hopeless battle since malloc() has its own cross-platform issues.

But Tim did say that if it was decided to go down the macro route, then he
wanted something like what Zope does: ``DO_SOMETHING_OR(RESULT_LVALUE,
INPUT1, ..., ON_ERROR_BLOCK);``.  The result goes into RESULT_LVALUE
unless there is a problem, in which case ON_ERROR_BLOCK is run.
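
The kind of check being discussed can be sketched in Python, where
integers never overflow and so the wraparound has to be simulated with a
mask.  Everything here is an assumption for illustration: the 32-bit
``SIZE_MAX``, the helper name ``checked_alloc_size`` and the use of an
exception in place of an ON_ERROR_BLOCK.  It is not one of Armin's macros.

```python
SIZE_MAX = 2**32 - 1   # assumption: a platform with a 32-bit size_t


def checked_alloc_size(nitems, itemsize, header=0):
    """Return header + nitems * itemsize, or raise OverflowError past SIZE_MAX."""
    if nitems < 0 or itemsize < 0 or header < 0:
        raise ValueError("sizes must be non-negative")
    if header > SIZE_MAX:
        raise OverflowError("header alone exceeds SIZE_MAX")
    # Divide instead of multiplying, so the check itself cannot overflow in C.
    if itemsize and nitems > (SIZE_MAX - header) // itemsize:
        raise OverflowError("allocation size exceeds SIZE_MAX")
    return header + nitems * itemsize


# What an unchecked 32-bit multiplication would silently produce:
wrapped = (2**30 * 8) & SIZE_MAX
print(wrapped)

print(checked_alloc_size(10, 8, header=16))
try:
    checked_alloc_size(2**30, 8)
except OverflowError as exc:
    print("refused:", exc)
```

The unchecked result wraps to a tiny number, so malloc() would happily
hand back a block far too small for the object; that is the bug class
Armin was worried about.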

Christian Tismer chimed in and said that he thought we should just move
completely over to 64 bit math.  Ruby had done it successfully so it
wasn't like we were taking a blind leap.  It would also save us the hassle
of doing it down the road when 64 bit processors become the norm instead
of the exception.