From tim.one@comcast.net  Sun Sep  1 08:04:44 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 01 Sep 2002 03:04:44 -0400
Subject: [Python-Dev] Re: [Python-checkins]
 python/nondist/sandbox/spambayes  GBayes.py,1.7,1.8
In-Reply-To: <20020824183542.GA22248@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEFFBBAB.tim.one@comcast.net>

[Neil Schemenauer]
> ...
> For whatever reason, setting HAMBIAS to 1.0 seems to produce worse
> results.

It's remarkable.  Graham's scheme is pasted together out of all sorts of
things that shouldn't work <wink>, but this one seems the most mysterious.

It has a huge effect in my 5x5 c.l.py test grid.  Combining all unique msgs
identified as false negative or false positive across all 20 test runs,

At HAMBIAS = 1.0
    total false negatives goes down by a factor of 2 (337 -> 166)
    total false positives goes up by a factor of 7.6 (23 -> 174)

and some of the false positives are just amazing -- David Ascher announcing
a Python conference, Laura Creighton pontificating about the GPL, ... it's
hard to fathom!  One innocuous example:

"""
Hello,
        I love all these speed debates but if speed were our only concern we
would all be writing in assembly for all non internet based programs...!

        Thank you,
        Vincent A. Primavera

prob = 0.99918657946
prob('only') = 0.645419
prob('would') = 0.349237
prob('hello,') = 0.342435
prob('assembly') = 0.34891
prob('thank') = 0.819611
prob('these') = 0.677099
prob('all') = 0.709966
prob('you,') = 0.803672
prob('concern') = 0.225352
prob('our') = 0.951928
prob('internet') = 0.942274
prob('speed') = 0.305927
prob('but') = 0.229635
prob('love') = 0.736116
prob('non') = 0.885065
prob('writing') = 0.150994
"""

There's not a lot going on in that msg!  *Perhaps* the primary effect of
boosting HAMBIAS is to take common glue words (like 'these' and 'all') out
of this uniquely "only look at smoking guns" scoring scheme altogether?  I
don't know what "sense" there is in letting 'these' vote in favor of spam,
for example.

At HAMBIAS = 3.0
    total false negatives goes up by a factor of 2.08 (337 -> 702)
    total false positives goes down by a factor of 4.6 (23 -> 5)

Somebody else think about this <wink>.  It's certainly the easiest knob to
twiddle to make a false-positive versus false-negative rate tradeoff.
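
For readers who want to see where the knob enters the arithmetic, here is
a minimal sketch.  It assumes the per-word formula and the naive
combining rule from Graham's "A Plan for Spam" essay, not the actual
GBayes.py code; the function names, the clamping constants and the toy
counts are illustrative assumptions.

```python
from math import prod

def word_prob(ham_count, spam_count, nham, nspam, hambias=2.0, spambias=1.0):
    """Graham-style spam probability for one word (hypothetical helper)."""
    if ham_count == 0 and spam_count == 0:
        raise ValueError("word never seen in training data")
    # Each ratio is the (biased) fraction of messages containing the word.
    ham_ratio = min(1.0, hambias * ham_count / nham)
    spam_ratio = min(1.0, spambias * spam_count / nspam)
    p = spam_ratio / (ham_ratio + spam_ratio)
    # Clamp so that no single word is ever treated as certain.
    return max(0.01, min(0.99, p))

def combine(probs):
    """Combine per-word probabilities as if the words were independent."""
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)

# Toy glue word: seen in 30 of 1000 ham and 40 of 1000 spam messages.
for bias in (1.0, 2.0, 3.0):
    print(bias, round(word_prob(30, 40, 1000, 1000, hambias=bias), 3))
```

With these made-up counts the word leans toward spam at a bias of 1.0 and
toward ham at 2.0 and 3.0, which is at least consistent with the guess
above about common glue words.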



From sholden@holdenweb.com  Sun Sep  1 10:14:25 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Sun, 1 Sep 2002 05:14:25 -0400
Subject: [Python-Dev] tiny optimization in ceval mainloop
References: <15726.52313.734491.272985@gargle.gargle.HOWL> <0ED9227E-BBF1-11D6-B9DE-0030655234CE@cwi.nl> <15727.31272.80804.453415@gargle.gargle.HOWL> <200208301413.g7UEDqZ07890@pcp02138704pcs.reston01.va.comcast.net> <15727.33074.324120.988215@gargle.gargle.HOWL> <200208301429.g7UETqQ08033@pcp02138704pcs.reston01.va.comcast.net> <15727.33451.698048.657655@slothrop.zope.com> <2m3csw5qu9.fsf@starship.python.net> <055301c2503a$e1cfea60$6300000a@holdenweb.com> <2mfzwwiaud.fsf@starship.python.net>
Message-ID: <003301c25197$f8522600$6300000a@holdenweb.com>

[Michael Hudson]
> "Steve Holden" <sholden@holdenweb.com> writes:
>
> > > A bunch of 0.5% improvements add up.  If there's not much cost in
> > > complexity, why not go for it?
> > >
> >
> > Yeah, right, we just need 200 of them and we're laughing. Computation in
> > infinitesimal time.
>
> Multiply up doesn't have the same ring to it, does it?
>
Indeed not. I try to keep my pedantry in control, but it escapes from time
to time.

regards
-----------------------------------------------------------------------
Steve Holden                                  http://www.holdenweb.com/
Python Web Programming                        pydish.holdenweb.com/pwp/
Previous .sig file retired to                    www.homeforoldsigs.com
-----------------------------------------------------------------------




From skip@manatee.mojam.com  Sun Sep  1 13:00:23 2002
From: skip@manatee.mojam.com (Skip Montanaro)
Date: Sun, 1 Sep 2002 07:00:23 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200209011200.g81C0NSH019331@manatee.mojam.com>

Bug/Patch Summary
-----------------

282 open / 2810 total bugs (+7)
119 open / 1676 total patches (+10)

New Bugs
--------

textwrap has problems wrapping hyphens (2002-08-17)
	http://python.org/sf/596434
Another dealloc stack killer (2002-08-25)
	http://python.org/sf/600007
Installing w/o admin generates key error (2002-08-27)
	http://python.org/sf/600952
bug in new execvpe (2002-08-27)
	http://python.org/sf/601077
weird header wrapping in email.Generator (2002-08-28)
	http://python.org/sf/601392
xmlrpclib ignores CDATA (2002-08-28)
	http://python.org/sf/601534
some int results that should be bool (2002-08-29)
	http://python.org/sf/601775
smtplib mishandles empty sender (2002-08-29)
	http://python.org/sf/602029
configure finds c++ w/o --with-cxx (2002-08-29)
	http://python.org/sf/602102
os.popen() negative error code IOError (2002-08-29)
	http://python.org/sf/602245
3rd parameter for Tkinter.scan_dragto (2002-08-30)
	http://python.org/sf/602259
Bgen should learn about booleans (2002-08-30)
	http://python.org/sf/602291
option for not writing .py[co] files (2002-08-30)
	http://python.org/sf/602345
Jaguar "install" does not overwrite (2002-08-30)
	http://python.org/sf/602398
non greedy match bug (2002-08-30)
	http://python.org/sf/602444
pydoc -g dumps core on Solaris 2.8 (2002-08-30)
	http://python.org/sf/602627
cgitb tracebacks not accessible (2002-08-31)
	http://python.org/sf/602893

New Patches
-----------

test_commands test fails under Cygwin (2002-04-16)
	http://python.org/sf/544740
email: RFC 2231 parameters encoding (2002-08-26)
	http://python.org/sf/600096
IDLE [Open module]: import submodules (2002-08-26)
	http://python.org/sf/600152
Robustness tweak to httplib.py (2002-08-26)
	http://python.org/sf/600488
Refactoring of difflib.Differ  (2002-08-27)
	http://python.org/sf/600984
build_ext forgets libraries par w MSVC (2002-08-28)
	http://python.org/sf/601314
obmalloc,structmodule: 64bit, big endian (2002-08-28)
	http://python.org/sf/601369
expose PYTHON_API_VERSION via sys (2002-08-28)
	http://python.org/sf/601456
replace_header method for Message class (2002-08-29)
	http://python.org/sf/601959
sys.path in user.py (2002-08-29)
	http://python.org/sf/602005
improper use of strncpy in getpath (2002-08-29)
	http://python.org/sf/602108
single shared ticker (2002-08-29)
	http://python.org/sf/602191

Closed Bugs
-----------

test_commands test fails under Cygwin (2002-04-16)
	http://python.org/sf/544740
Various Playstation 2 Linux Test Errors (2002-06-12)
	http://python.org/sf/567892
Core dump when using mmap. (2002-08-20)
	http://python.org/sf/597938
execfile() not show filename when IOErro (2002-08-23)
	http://python.org/sf/599163
SocketServer wrong about allow_reuse_add (2002-08-24)
	http://python.org/sf/599681
sub[n] not working as expected. (2002-08-24)
	http://python.org/sf/599757
httplib.connect broken in 2.1 branch (2002-08-25)
	http://python.org/sf/599838
NameError value is not the name error (2002-08-25)
	http://python.org/sf/599869

Closed Patches
--------------

"simplification" to ceval.c (2002-08-19)
	http://python.org/sf/597221
Failure building the documentation (2002-08-22)
	http://python.org/sf/598996


From martin@v.loewis.de  Sun Sep  1 22:25:39 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Sep 2002 23:25:39 +0200
Subject: [Python-Dev] mimetypes patch #554192
In-Reply-To: <3D5F9C2D.8010209@livinglogic.de>
References: <3D5BEBB8.7080904@livinglogic.de>
 <15707.61612.844119.819432@anthem.wooz.org>
 <3D5CE38D.9080905@livinglogic.de>
 <m3d6sizl50.fsf@mira.informatik.hu-berlin.de>
 <3D5F9C2D.8010209@livinglogic.de>
Message-ID: <m3sn0tcsh8.fsf@mira.informatik.hu-berlin.de>

Walter Dörwald <walter@livinglogic.de> writes:

> >>Even better would be, if we could assign priorities to the mappings,
> >>so that for e.g. image/jpeg the preferred extension is .jpeg.
> >>Then guess_type() and guess_extension() would return the preferred
> >>mimetype/extension.
> > Do you have a specific application for that in mind? It sounds like
> > overkill.
>
> I'm using a web mirror script which uses the extensions from
> guess_extension to save all downloaded resources, and I hate it
> when the HTML files are named .htm and JPEG images are named .jpe.

Then this is your preference - others might prefer jpg, just because
their file system can deal better with that. If you can agree that
this is your preference, you should put the preference mechanism into
the application.

Maybe your preference can be expressed algorithmically? It might be
that you always want the longest known extension (it is unlikely that
you prefer "jpeg" over "jpg" just because that contains a vowel :-).
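
A minimal sketch of such an application-side rule, assuming the
mimetypes.guess_all_extensions() function found in current Python's
standard library.  The helper name longest_extension is made up for
illustration, and the result depends on the MIME tables available on the
machine.

```python
import mimetypes

def longest_extension(mime_type):
    """Return the longest extension known for mime_type, or None."""
    candidates = mimetypes.guess_all_extensions(mime_type)
    if not candidates:
        return None
    # Sort first so that ties are broken alphabetically and the result
    # does not depend on the order of the underlying tables.
    return max(sorted(candidates), key=len)

print(longest_extension("image/jpeg"))
print(longest_extension("text/html"))
```

The preference lives entirely in the caller, so another application is
free to pick the shortest extension instead.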

Regards,
Martin


From martin@v.loewis.de  Sun Sep  1 22:31:26 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Sep 2002 23:31:26 +0200
Subject: [Python-Dev] PyString_DecodeEscape and PEP293
In-Reply-To: <3D60EA3B.7030008@livinglogic.de>
References: <3D60EA3B.7030008@livinglogic.de>
Message-ID: <m3ofbhcs7l.fsf@mira.informatik.hu-berlin.de>

Walter Dörwald <walter@livinglogic.de> writes:

> A recent checkin added a function PyString_DecodeEscape()
> to stringobject.c. To make this function PEP293 compatible
> it would need access to unicode_decode_call_errorhandler
> which is defined static in unicodeobject.c. Does
> PyString_DecodeEscape() really need an errors argument?

What do you mean, "really need"? The callers of this function pass the
argument, in particular escape_decode. Is that "real"?

> If yes, we could either move it to unicodeobject.c

No. It has to do little with Unicode.

> or make unicode_decode_call_errorhandler externally visible.

I don't know this function. What does this have to do with Unicode?

> Another problem that I noticed is that string-escape can't
> be used for encoding Unicode objects:

That is a feature. string-escape has nothing to do with Unicode.

Regards,
Martin


From martin@v.loewis.de  Sun Sep  1 22:22:29 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Sep 2002 23:22:29 +0200
Subject: [Python-Dev] PEP 277 (unicode filenames): please review
In-Reply-To: <p05111701b9869e037b33@[192.109.102.36]>
References: <p05111701b9869e037b33@[192.109.102.36]>
Message-ID: <m3wuq5csmi.fsf@mira.informatik.hu-berlin.de>

Matthias Urlichs <smurf@noris.de> writes:

> Linux and MacOSX use UTF-8 and should probably be treated as such,
> i.e. I want to open("äöü"), not open("äöü".encode("utf-8")).

What would be "äöü" in this context? Your message was encoded as
Latin-1 - was that deliberate?

You could expect that open(u"äöü") works well; for the way you write
it, somebody needs to know what encoding the string has.

Linux does *not* "use" UTF-8. On the file system API, it treats
arbitrary byte sequences as-is, i.e. when you pass "äöü" as Latin-1,
it will put those bytes on disk - if you later use "äöü" in UTF-8,
Linux won't find the file.

Instead, the convention seems to be that file names are in the
locale's encoding - which might be UTF-8, if you use a UTF-8 locale.
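
To make the mismatch concrete, here is a small illustration.  It is
written for current Python 3, where text and bytes are separate types,
so it is a sketch of the point rather than code from this thread.

```python
name = "\u00e4\u00f6\u00fc"             # the three characters ä, ö, ü

latin1_bytes = name.encode("latin-1")   # b'\xe4\xf6\xfc'
utf8_bytes = name.encode("utf-8")       # b'\xc3\xa4\xc3\xb6\xc3\xbc'

# Same text, different byte sequences: to a file system API that stores
# bytes as-is, these are two unrelated file names.
print(latin1_bytes == utf8_bytes)       # False
```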

> Byte strings are perfectly OK if they have a common encoding (meaning
> UTF-8, in some accepted normal form).

Unfortunately, that precondition is false. There is no common encoding
on Linux.

Regards,
Martin


From martin@v.loewis.de  Sun Sep  1 22:57:32 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Sep 2002 23:57:32 +0200
Subject: [Python-Dev] To commit or not to commit
In-Reply-To: <200208261847.g7QIlI806850@pcp02138704pcs.reston01.va.comcast.net>
References: <3D6A7742.1030005@livinglogic.de>
 <200208261847.g7QIlI806850@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3k7m5cr03.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> > Any objections against committing the patch?
> 
> What do MvL and MAL say?

I'm still concerned about the massive amounts of C code, most of which
could be expressed way more compact in Python code. Walter convinced
me that this (the aspect that I picked in a discussion) does have a
real performance impact for real data, so I guess I have to live with
that.

Because of the size, I'm sure there are still bugs in it. I couldn't
spot any by inspection, so I think the patch is ready to be installed.

Regards,
Martin


From tdelaney@avaya.com  Sun Sep  1 23:53:39 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 2 Sep 2002 08:53:39 +1000
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A553@natasha.auslabs.avaya.com>

> From: Tim Peters [mailto:tim.one@comcast.net]
> 
> Training GBayes is cheap, and the more you feed it the less need to do
> information-destroying transformations (like folding case or ignoring
> punctuation).

Speaking of which, I had a thought this morning (in the shower of course ;)
about a slightly more intelligent tokeniser.

Split on whitespace, then runs of punctuation at the end of "words" are
split off as a separate word.

So:

    a.b.c -> 'a.b.c' (main use: keeps file extensions with filenames)
    
    A phrase. -> 'A', 'phrase', '.'
    
    WTF??? -> 'WTF', '???'

    >>> import module -> '>>>', 'import', 'module'

Might this be useful? No code of course ;)
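
For anyone who wants to try it, a minimal sketch of the rule described
above.  It assumes that "punctuation" means the ASCII characters in
string.punctuation and that a word made up entirely of punctuation is
left alone; both choices are guesses, and the function name is made up.

```python
import string

def tokenize(text):
    """Split on whitespace, then split trailing punctuation off each word."""
    tokens = []
    for word in text.split():
        core = word.rstrip(string.punctuation)
        if core and core != word:
            # Word followed by a run of punctuation: emit both parts.
            tokens.append(core)
            tokens.append(word[len(core):])
        else:
            # No trailing punctuation, or the word is all punctuation.
            tokens.append(word)
    return tokens

for sample in ("a.b.c", "A phrase.", "WTF???", ">>> import module"):
    print(tokenize(sample))
```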

Tim Delaney


From drifty@bigfoot.com  Sun Sep  1 23:57:53 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Sun, 1 Sep 2002 15:57:53 -0700 (PDT)
Subject: [Python-Dev] Python-dev summary for 2002-08-15 - 2002-09-01
Message-ID: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>

Yes, with Michael's permission, I am attempting to start up the Python-dev
summaries again.  Below is my attempt at summarizing the last half of
August.  It's longer than normal summaries, but that is because I bothered
to include discussions on threads that were not directly related to the
Python core but are interesting nonetheless (e.g., the whole spambayes
thread).

I am posting to Python-dev first before posting to c.l.py, c.l.py.a (also
lwn.net and probably Slashdot) because I want to get the general okay from
the list that I have done a good enough job to send this out; I don't
want to have a summary that represents the goings-on here without the
general populace (or just the BDFL since he can overrule =) being okay
with it.  I am also curious as to whether I should go into more or less
detail, leave out the summaries that do not directly pertain to the Python
core, etc.

So please read the summary and let me know if you are okay with it.  If so
I will try to do semi-monthly summaries from now on.  Oh, and I am on
vacation right now and will be doing a lot of travelling in the next two
months, so I can't guarantee summaries will be this quick to come out for
a while.  I will do them, though, even if they are a week late.  =)

Oh, and if I do get the okay to do this, expect a lot of dumb questions
from me in the future in terms of clarifying things.  Just remember, it is
for the good of the Python community.  =)


=======================================


This is a summary of traffic on the python-dev mailing list between August
16, 2002 and September 1, 2002 (exclusive).  It is intended to inform the
wider Python community of ongoing developments.  To comment, just post to
python-list@python.org or comp.lang.python in the usual way. Give your
posting a meaningful subject line, and if it's about a PEP, include the
PEP number (e.g. Subject: PEP 201 - Lockstep iteration).  All python-dev
members are interested in seeing ideas discussed by the community, so
don't hesitate to take a stance on a PEP if you have an opinion.

This is the first summary written by Brett Cannon.
Summaries are archived nowhere at the moment.  =)   They will be, though,
so stay tuned for the URL in future summaries.



   Posting distribution (with apologies to mbm, but thanks to mwh for the
code)

   Number of articles in summary: 585

    80 |                     [|]
       |                     [|]
       |                     [|]
       |                     [|]
       | [|]                 [|]
    60 | [|]             [|] [|]
       | [|]             [|] [|]
       | [|]             [|] [|]
       | [|]             [|] [|]
       | [|]             [|] [|]                 [|]
    40 | [|]         [|] [|] [|]                 [|]
       | [|]         [|] [|] [|]         [|]     [|]         [|]
       | [|]         [|] [|] [|]         [|]     [|]         [|] [|]
       | [|]         [|] [|] [|] [|]     [|]     [|]     [|] [|] [|]
       | [|]         [|] [|] [|] [|]     [|]     [|] [|] [|] [|] [|]
    20 | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
       | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
     0 +-071-025-012-042-063-084-030-021-039-009-047-027-033-041-036-005
        Fri 16| Sun 18| Tue 20| Thu 22| Sat 24| Mon 26| Wed 28| Fri 30|
            Sat 17  Mon 19  Wed 21  Fri 23  Sun 25  Tue 27  Thu 29  Sat 31



================
Type Categories
================
This VERY long thread was sparked by Andrew Koenig asking if a discussion
of making type categories more explicit had ever occurred (Andrew meant for
category to mean "the set of all types that implement a particular marker
interface").  As Andrew later pointed out, he was asking about  "a way of
making notions such as 'file-like object' more formal and/or automatic".
The discussion quickly started using the term interface to mean defining a
way to specify that an object implemented certain methods (think of it in
terms of Java's 'implements' mechanism).  Once that was out of the way,
the discussion took off.  Zope's implementation was pointed out
(http://cvs.zope.org/Zope3/lib/python/Interface/) very quickly.  PEP 245
(Python Interface Syntax) was also brought to the attention of the list.
The idea of using inheritance to handle interfaces was brought up.  Guido
said that he hasn't "given up the hope that inheritance and interfaces
could use the same mechanisms.  But Jim Fulton, based on years of
experience in Zope, claims they really should be different" in terms of
how interfaces should be handled in objects.  Jeremy Hylton tried to
channel Jim's opinion by pointing out that "We'd like to use interfaces to
make fairly strong claims.  If a class A implements an interface I, then
we should be able to use an instance of A anywhere that an I is needed."
But "the inheritance mechanism is too general" because if a class A
implements interface I and then a class B, which does not implement I,
subclasses class A we end up with a class B that claims it has a certain
interface which it doesn't actually have.  Guido understood the point, but
still thought inheritance could be used "if there was a way to 'shut off'
inheritance as far as isinstance() (or issubclass())" is concerned.  Guido
asked the simple question, "Why do keep arguing for inheritance?  (a) the
need to deny inheritance from an interface, while essential, is relatively
rare IMO, and in *most* cases the inheritance rules work just fine; (b)
having two separate but similar mechanisms makes the language larger."
Samuele Pedroni asked that any implementation "allow also for refering to
anonymous super-interfaces of an interface in terms of the interface plus
a subset of its signatures, also e.g. FileLike and just 'write'.  [that
means an interface can be thought to correspond to a set of
(tag,signature) tuples, where tag identifies the interface, and one can
also just consider subsets of it]".  The thread finally seems to have
stopped (for now) with Guido saying he is mulling the whole thing over in
the back of his head.  This is a very sticky topic because of the number of
design decisions required and how it might change the way people program
in Python.

There was also a partial sub-thread in this whole discussion about
multimethods; basically a way to do overloading of methods based on
parameter signature.  Most of the discussion was over syntax and such and
how to handle resolution order.  It then seemed to fall by the wayside when
the main part of the thread took over again.

==============================
type categories -- an example
==============================
This thread was started when Andrew Koenig said that the reason he
brought up his type category question was that he wanted a way to
identify members of a type easily.  He now had an example in a program
he was writing where the type of the argument varied and thus what
needed to be done to the data changed accordingly.  Jeremy Hylton
suggested the isinstance(obj, type(re.compile(''))) idiom.  Andrew asked
if this was guaranteed to work, and Jeremy said no.  I asked why this
was not guaranteed, and Fredrik Lundh said it is because re.compile() is
a factory function and it is possible that a future version could return
a different type of object depending on the pattern.
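
A minimal sketch of the idiom in use, with the caveat from the thread
repeated in a comment.  The helper find_all and the name
compiled_pattern_type are made up for illustration.

```python
import re

# Capture the concrete type by asking the factory for an instance.
# As noted in the thread, nothing guarantees that every pattern will
# always produce an object of this same type.
compiled_pattern_type = type(re.compile(''))

def find_all(pattern, text):
    """Accept either a pattern string or an already compiled pattern."""
    if isinstance(pattern, compiled_pattern_type):
        return pattern.findall(text)
    return re.findall(pattern, text)

print(find_all(r'\d+', 'a1b22'))
print(find_all(re.compile(r'\d+'), 'a1b22'))
```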

===============================================
Python build trouble with the new gcc/binutils
===============================================
Andrew Koenig said that he couldn't compile Python using the newest gcc
(this was the day after the latest release hit servers).  With help from
Zack Weinberg of CodeSourcery (who also recently rewrote the tempfile
module), the problem was tracked down to binutils 2.13; Python was not
at fault.

===================================
Last call: mortal interned strings
===================================
The patch python.org/sf/576101 removes the default immortality of interned
strings.  I believe it was in early August (possibly spilled over from
late July) when Oren Tirosh proposed the idea and wrote the above
mentioned patch.  There had been some discussion over whether any 3rd
party code was reliant upon interned strings being immortal; none was
found (MacPython was reliant upon it, but since it is under Python core
control it was considered a moot point since it could be changed).  It has
been checked in.  With the patch the way to make a string immortal is to
call PyString_InternImmortal(); no code in the core uses this function.

=====================================
PEP 218 (sets); moving set.py to Lib
=====================================
Thanks to Greg Wilson (for writing the PEP), Alex Martelli (for writing
the module initially), and Guido (for refactoring Alex's code) the stdlib
has now gained a sets module.  It has both the notion of mutable and
immutable sets (the latter used when you have a set of sets).  There was
discussion about how sets should print (sorted or not; unsorted is default
but option is there to print sorted) and what operators should be
overloaded for working on sets (| and & were chosen).  The module is a
beautiful chunk of code and I highly recommend reading its source.
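
A short illustration of the two operators and of the mutable/immutable
split.  It uses the built-in set and frozenset types of current Python,
which later took over from the sets module, so it is a sketch of the
behaviour described here rather than of the module's own class names.

```python
vowels = set("aeiou")
word = set("python")

union = vowels | word      # members of either set
common = vowels & word     # members of both sets

# A set of sets needs the immutable (hashable) variant for its members.
family = {frozenset(vowels), frozenset(word)}

print(sorted(common))      # ['o']
print(len(union))          # 10
```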

===========================================
A few lessons from the tempfile.py rewrite
===========================================
Zack Weinberg, after rewriting the tempfile module, brought up three
points:
1) Lack of dummy threads, 2) lack of a pthread_once equivalent, and 3)
lack of a way to skip tests from unittest.py via some built-in method.
Guido responded accordingly: 1) since some code uses the idiom of trying
to import thread and catching the exception if it fails, Guido said he
would be willing to accept a dummy_thread.py that would allow:

try:
    import thread as _thread
except ImportError:
    import dummy_thread as _thread

to work.  No word on whether this is being written at the moment.  2)
Guido said the method was, in his opinion, overkill.  He said to "be
Pythonic, live dangerously, accept the risk that a ^C can screw you.  It
can anyway. :-)".  And as for 3) Guido deferred Zack to the PyUnit list
and Steve Purcell since Python just tracks Steve's code (pyunit.sf.net).
Guido's suggestion was to stick code that was reliant on some other code
in a separate testing suite that is only run when the reliant code is
available.

===========================
Standard datetime objects?
===========================
Kevin Jacobs asked what stage the new datetime object was at.  Guido said
it is in python/nondist/sandbox/datetime/ in CVS which also has comments
pointing to a wiki containing the current work on it.  Fred L. Drake, Jr.
is working on the C re-implementation and Guido expects a checkin at any
moment (hasn't happened as of this writing).

===================
PEP 269 versus 283
===================
Jonathan Riehl noticed that PEP 283 said PEP 269 was dead; not good
considering he was close to having a patch for PEP 269 (pgen module to
interface with the C version).  Guido said he will revive the PEP.  The
patch has since been put on SF at python.org/sf/599331 .

==============================
What is a backport candidate?
==============================
Since Python 2.2 is going to be around for a long time, the question was
brought up of what constitutes code that should be backported.  Guido made
the following three points:

1) code trivial to backport should always be backported

2) code patching 2.3 code should obviously not be backported

3) patches that apply to 2.2 only after the 2.2 code is changed fall in
between; gradients of this exist.

So please, when submitting patches, mention whether you think the patch
should be backported to the 2.2 tree and any possible dependencies it
might have in a backport.

=================================
python/nondist/sandbox/spambayes
=================================
In response to Paul Graham's spam filter written using Bayes' Rule
(the Slashdot post on it is at
http://developers.slashdot.org/article.pl?sid=02/08/16/1428238&tid=156), a
thread spawned around this checkin of code that followed that paper's
suggestions.  This thread quickly jumped into discussions on data
structures, Bayes' Rule, and a whole lot of talk about spam.  Very
interesting if spam filtering interests you.  Tim Peters has been leading
the drive on this chunk of code (and thanks to an illness that befell
him in late August, which he has since gotten over, he had a few days
of major hacking on it; Tim showed he is a performance stats whore
<wink>).

A very cool quote came out of this thread from Eric S. Raymond when
discussing the spam filter he has been working on: "This is actually the
first new program I've coded in C (rather than Python) in a good four
years or so".

====================
Parsing vs. lexing.
====================
In response to a question by Aahz about what the differences were between
a lexer, parser, and tokenizer, Eric Raymond posted a good overview of the
differences.  Guido later commented in an email mentioning SPARK and
explaining how Python's parser generator (pgen) works and why he wrote
it.  He also made some other comments on lexers.  Jeremy Hylton pointed
out a "neat new paper
about an old algorithm for recursive descent parsers with backtracking and
unlimited lookahead" by Bryan Ford at http://www.brynosaurus.com/pub.html
.  Alex Martelli pointed out that this discussion reminded him of "a
long-ago interview with Borland's techies"  in which they said they were
able to make Borland PASCAL fit on a floppy while MS PASCAL took multiple
floppies.  Their trick was "we just did everything by the Dragon Book --
except that the parser is a hand-written recursive descent parser [Aho &c
being adamant defenders of Yacc & the like], which buys us a lot".
Someone named Noah also emailed a discussion on lexers and parsers, pulling
in Finite State Machines, Pushdown Automata, and Turing Machines in his
discussion.

Martin Sj?n says that Haskell's pattern matching and lazy evaluation makes
lexers easy (even a Recursive-Descent parser), but unfortunately Haskell
does not play with other languages nicely.  Haskell is where Python got
its list comprehension idea.

=========================================
[Python-Dev] Fw: Security hole in rexec?
=========================================
It was brought to the attention of the list that deleting __builtins__
allowed a compromise in rexec.  Guido pointed out that
python.org/sf/577530 reports this.  He also said don't trust rexec.

A patch is going to be submitted to document the view that rexec is really
not that safe.

=================
A `cogen' module
=================
Francois Pinard asked about Cartesian products using the new sets module.
Guido didn't think people would in general need it.  Francois quickly
started this thread of discussing a cogen module to generate Cartesian
products and other ways of operating on sets.

=================
Mersenne Twister
=================
Raymond Hettinger volunteered to implement the Mersenne Twister algorithm
(one in Python exists at www.math.keio.ac.jp/~matumoto/emt.html).  While
discussing whether to implement it in C or Python, Guido noticed that
random.Random re-implements whrandom.  Guido then came up with the idea of
writing a base random class that is subclassed where .random() can be
implemented; Tim Peters agreed and suggested more methods to override.
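
To show what that design looks like from the user's side, here is a
minimal sketch against the random.Random class in current Python, where
overriding random() is enough for the derived methods to follow.  The
LCGRandom class and its constants are a toy made up for illustration; it
is not a good generator and is not anything proposed in the thread.

```python
import random

class LCGRandom(random.Random):
    """Toy generator: a linear congruential core behind the usual API."""

    def __init__(self, seed=1):
        super().__init__(seed)
        self._state = seed

    def random(self):
        # Advance the toy state and scale it into the range [0.0, 1.0).
        self._state = (1103515245 * self._state + 12345) % 2**31
        return self._state / 2**31

gen = LCGRandom(7)
print(gen.random())         # supplied by the subclass
print(gen.uniform(1, 2))    # inherited, built on top of random()
print(gen.choice("abc"))    # inherited as well
```

Only the core generator is replaced; everything layered on top of it
comes from the base class unchanged.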

=================================
New PEP Format: reStructuredText
=================================
David Goodger and Barry Warsaw have now gotten reST as a usable syntax for
PEPs.  Read the PEPs on the subject to learn more:

- PEP 12 -- Sample reStructuredText PEP Template
  (http://www.python.org/peps/pep-0012.html)

- PEP 258 -- Docutils Design Specification
  (http://www.python.org/peps/pep-0258.html)

- PEP 287 -- reStructuredText Docstring Format
  (http://www.python.org/peps/pep-0287.html)

====================================
tiny optimization in ceval mainloop
====================================
Jeremy Hylton noticed that in ceval there is a test of whether the
ticker is 0 or things_to_do is set to true (an explanation of the
ticker, checkinterval, and the GIL follows this paragraph).  Jeremy
wondered if we could just drop the ticker to 0 when things_to_do is true.
Jack Jansen, though, pointed out that clearing it is not guaranteed since
there may be an interrupt routine when "we fiddle things_to_do".  Skip
Montanaro then pointed out that since neither ticker nor things_to_do is
fiddled with unless the GIL is held, they could be made globals (a
single shared ticker); he did a patch
that implements this (python.org/sf/602191).  Guido then said that if
there wasn't a decent speed improvement, then no patch would be checked
in.  He then changed his mind when it was pointed out that it actually
simplified the code.  Skip tested anyway, though, and there is a speed
improvement.  This also brought up whether the default value of 10 for
checkinterval was reasonable.  It was then agreed to be bumped up to 100.
Jack ran some code and said he noticed a definite improvement.

Python's version of threading is not like in C.  There is something called
the GIL (Global Interpreter Lock) which any thread wishing to execute
Python code or play with Python objects must hold.  This means that when
you have Python threads running (using the thread or threading module)
they are usually all waiting in line to get the GIL.  Now for Python to
decide when to release the GIL for another thread to grab it, it uses the
ticker.  This variable counts down to zero by being decremented every time
a Python opcode is executed (originally defaulted to 10, now defaulted to
100).  The ticker's starting value after each release of the GIL is what
sys.setcheckinterval() sets.
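
The countdown can be pictured with a toy model.  This is a sketch of the
description above, not of the real code in ceval.c; the function name
and the opcode counts are made up for illustration.

```python
def count_gil_releases(opcodes, checkinterval=100):
    """Count how often a thread running `opcodes` opcodes offers up the GIL."""
    ticker = checkinterval
    releases = 0
    for _ in range(opcodes):
        ticker -= 1                 # one tick per executed opcode
        if ticker == 0:
            releases += 1           # let another thread grab the GIL
            ticker = checkinterval  # restart the countdown
    return releases

print(count_gil_releases(1000, checkinterval=10))    # 100
print(count_gil_releases(1000, checkinterval=100))   # 10
```

In this model, raising the interval from 10 to 100 cuts the number of
hand-offs for the same amount of work by a factor of ten.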

To get a better understanding of threading under Python I recommend
reading Aahz's tutorials on threading.




From tim.one@comcast.net  Mon Sep  2 00:40:38 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 01 Sep 2002 19:40:38 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A553@natasha.auslabs.avaya.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEGMBBAB.tim.one@comcast.net>

[Delaney, Timothy]
> Speaking of which, I had a thought this morning (in the shower of
> course ;) about a slightly more intelligent tokeniser.

"Intelligence" isn't necessarily helpful with a statistical scheme, and
always makes it harder to adapt to other languages.

> Split on whitespace, then runs of punctuation at the end of "words" are
> split off as a separate word.

For example <wink>, "free!!" never appears in a ham msg in my corpora, but
appears often in the spam samples.  OTOH, plain "free" is a weak spam
indicator on c.l.py, given the frequent supposedly on-topic arguments about
free beer versus free speech, etc.

>     a.b.c -> 'a.b.c' (main use: keeps file extensions with filenames)
>
>     A phrase. -> 'A', 'phrase', '.'
>
>     WTF??? -> 'WTF', '???'
>
>     >>> import module -> '>>>', 'import', 'module'

The first and last are the same as just splitting on whitespace.  The
2nd-last may lose the distinction between WTF??? and a solicitation to join
the World Trade Federation <wink>; WTF isn't likely to make it into a list
of smoking guns regardless.  Hard to guess about the 2nd.  The database
isn't large enough to worry about reducing its size, btw -- the only
gimmicks I care about are those that increase accuracy.
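
The splitting rule proposed in the quoted text can be sketched in a few
lines.  This is an illustration only, not code from timtest.py; the
function name tokenize and the regular expression are assumptions chosen
to reproduce the four quoted examples.

```python
import re

# A "word" that ends in a run of punctuation: everything up to the last
# word character, followed by one or more non-word, non-space characters.
_TRAILING_PUNCT = re.compile(r"(.*\w)([^\w\s]+)")


def tokenize(text):
    """Split on whitespace, then split trailing punctuation runs off."""
    tokens = []
    for word in text.split():
        m = _TRAILING_PUNCT.fullmatch(word)
        if m:
            tokens.extend(m.groups())   # e.g. 'WTF???' -> 'WTF', '???'
        else:
            tokens.append(word)         # e.g. 'a.b.c' or '>>>' stay whole
    return tokens
```

Note that under this rule "free!!" becomes 'free' plus '!!', which is
exactly the merging of a strong clue into a weak one that the reply above
worries about.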

> Might this be useful? No code of course ;)

It takes about an hour to run and evaluate tests for one change.  If you
want to motivate me to try, supply a patch against timtest.py (in the
sandbox), else I've already got far more ideas than time to test them
properly.  Anyone else want to test this one?



From tdelaney@avaya.com  Mon Sep  2 01:04:39 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 2 Sep 2002 10:04:39 +1000
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A555@natasha.auslabs.avaya.com>

> From: Tim Peters [mailto:tim.one@comcast.net]
> 
> For example <wink>, "free!!" never appears in a ham msg in my corpora, but
> appears often in the spam samples.  OTOH, plain "free" is a weak spam
> indicator on c.l.py, given the frequent supposedly on-topic arguments about
> free beer versus free speech, etc.

I'd actually thought of this limitation, and how it could be avoided. This
so-called "more intelligent" tokeniser would probably work best in a system
which scored word pairs as well as single words. For example:

    "I want free beer!!!"

would be split as

    'I' 'want' 'free' 'beer' '!!!'

This might then be scored as

    'I'          0.5
    'want'       0.5
    'free'       0.5
    'beer'       0.1 (beer is unlikely to be a spam indicator ;)
    '!!!'        0.9
    'I want'     0.3
    'want free'  0.99 (do you want free hot ...?)
    'free beer'  0.01 (free beer is never a spam indicator ;)
    'beer !!!'   0.5

Whether any weighting should be applied to single words or word pairs I
don't know - my gut feeling is that they should be weighted the same, but
guts are no replacement for empirical evidence.
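
A minimal sketch of the feature list this implies, assuming the text has
already been split into tokens (the function name singles_and_pairs is
made up for the example, and no weighting is applied):

```python
def singles_and_pairs(tokens):
    """Return every single token followed by every adjacent word pair."""
    pairs = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return list(tokens) + pairs
```

For the tokens 'I' 'want' 'free' 'beer' '!!!' this yields the five single
words and the four pairs scored in the table above.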

I just brought CVS python down at home and tried compiling with MinGW (no
success so far ...) but I'll have a look at the GBayes stuff sometime soon
and see if the above helps at all. Unfortunately, I just started my work day
...

Tim Delaney


From tdelaney@avaya.com  Mon Sep  2 01:38:10 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 2 Sep 2002 10:38:10 +1000
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A556@natasha.auslabs.avaya.com>

> From: Delaney, Timothy [mailto:tdelaney@avaya.com]
>
> Whether any weighting should be applied to single words or word pairs I
> don't know - my gut feeling is that they should be weighted the same, but
> guts are no replacement for empirical evidence.

On second thought - if a word-pair appears, then the separate parts should
not be checked as separate words.

So, if I had scores:

    'free'              0.1
    'beer'              0.1
    ('want', 'free',)   0.9
    ('free', 'beer',)   0.01
    ('free', '!!!',)    0.99

then the following phrases would match (case-folding) as:

    'I want free beer!!!':

    ('want', 'free',)   0.9
    ('free', 'beer',)   0.01

    'Get *** for free!!!'

    ('free', '!!!',)    0.99

    'I want free beer. Free the beer!!!'

    ('want', 'free',)   0.9
    ('free', 'beer',)   0.01
    'free'              0.1
    'beer'              0.1
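
The matching rule in these examples can be written down as a short sketch.
It is an illustration under assumptions: the input is already tokenised,
the names SCORES and match are invented for the example, and a word counts
as "used up" only at the positions where a known pair covers it.

```python
# Example scores taken from the table above.
SCORES = {
    "free": 0.1,
    "beer": 0.1,
    ("want", "free"): 0.9,
    ("free", "beer"): 0.01,
    ("free", "!!!"): 0.99,
}


def match(tokens, scores):
    """Score known pairs first; score single words only where no pair hit."""
    tokens = [t.lower() for t in tokens]        # case-folding
    covered = set()                             # positions used by a pair
    hits = []
    for i, pair in enumerate(zip(tokens, tokens[1:])):
        if pair in scores:
            hits.append((pair, scores[pair]))
            covered.update((i, i + 1))
    for i, tok in enumerate(tokens):
        if i not in covered and tok in scores:
            hits.append((tok, scores[tok]))
    return hits
```

Run on the three phrases above (split into words and punctuation runs),
this reproduces the three lists of matches.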

Damn I wish I was at home to try this out ... :(

Tim Delaney


From skip@pobox.com  Mon Sep  2 03:29:09 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 1 Sep 2002 21:29:09 -0500
Subject: [Python-Dev] Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
Message-ID: <15730.52469.604124.730029@localhost.localdomain>

    Brett> I am posting to Python-dev first before posting to c.l.py,
    Brett> c.l.py.a ... because I want to get the general okay from the
    Brett> list...

Looks good to me.  The only trivial nit I would like to raise is that any
URLs you embed in the text be true URLs.  I'd also prefer they be encased in
<...>, but that's slightly less important and generally only matters when
URLs are immediately followed by punctuation.  So, instead of

    Brett> Guido said he will revive the PEP.  The patch has since been put
    Brett> on SF at python.org/sf/599331 .

you'd have

    Brett> Guido said he will revive the PEP.  The patch has since been put
    Brett> on SF at <http://python.org/sf/599331>.

The two changes make it much more likely that email readers will be able to
highlight such URLs correctly.

Skip



From skip@pobox.com  Mon Sep  2 03:34:24 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 1 Sep 2002 21:34:24 -0500
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEGMBBAB.tim.one@comcast.net>
References: <B43D149A9AB2D411971300B0D03D7E8BF0A553@natasha.auslabs.avaya.com>
 <LNBBLJKPBEHFEDALKOLCAEGMBBAB.tim.one@comcast.net>
Message-ID: <15730.52784.407584.441515@localhost.localdomain>

    Tim> It takes about an hour to run and evaluate tests for one change.
    Tim> If you want to motivate me to try, supply a patch against
    Tim> timtest.py (in the sandbox), else I've already got far more ideas
    Tim> than time to test them properly.  Anyone else want to test this
    Tim> one?

Care to identify some of those ideas?

Skip



From tim.one@comcast.net  Mon Sep  2 03:43:01 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 01 Sep 2002 22:43:01 -0400
Subject: [Python-Dev] spambayes status
Message-ID: <LNBBLJKPBEHFEDALKOLCCEHCBBAB.tim.one@comcast.net>

This is a multi-part message in MIME format.

--Boundary_(ID_Uqs78No0Dj49zOTKCoTyzA)
Content-type: text/plain; charset=iso-8859-1
Content-transfer-encoding: 7BIT

I spent an enormous amount of time this weekend running tests against
various changes -- a "1% inspiration, 99% perspiration" kind of thing.
There are lots of words about the changes (both good and bad) in the comment
blocks and checkin msgs.  The biggest "conceptual" change is that I'm now
using (but only using) the Subject and From lines from the headers (my
earlier belief that the ham corpora Subject lines were too corrupted by
Mailman decorations turned out to be wrong).  Adding Subject lines gave a
remarkably small improvement, btw.  Most changes I tried either didn't
matter, or hurt.  Approximately 70 more blatant spams in the ham corpora
were identified and replaced with (randomly selected) legitimate msgs.

The f-p rate is too low now to measure changes with confidence.  Best guess
I can make from the evidence is that it's below 0.05% now.  The false
negative rate has improved more, and there's still plenty of those (so it's
still easy to be confident about whether changes do or don't help that).

Across all 20 runs (each training on 4000 ham + about 2750 spam, then
predicting against a different set with the same number of each), these are
the false positive and negative rates now (percentages; note that 0.025% is
a single message in the f-p column; a single msg in the f-n column is about
0.036%):

      f-p     f-n
    0.000   1.236
    0.000   1.164
    0.050   1.454
    0.000   1.599
    0.025   1.527
    0.025   1.236
    0.050   1.163
    0.025   1.309
    0.025   1.891
    0.000   1.418
    0.075   1.745
    0.050   1.708
    0.025   1.491
    0.000   0.836
    0.050   1.091
    0.025   1.309
    0.025   1.491
    0.000   1.127
    0.025   1.309
    0.050   1.636

The aggregate number of unique f-p across all runs is down to 8.
The aggregate number of unique f-n across all runs is 336.
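
The granularity of these percentages follows directly from the corpus
sizes.  A small worked sketch, assuming exactly 4000 ham and 2750 spam per
test set (the text says "about 2750"); the helper names are made up:

```python
def one_message_pct(corpus_size):
    """Percentage represented by a single misclassified message."""
    return 100.0 / corpus_size


def errors_in_run(rate_pct, corpus_size):
    """Convert a reported percentage back into a message count."""
    return round(rate_pct * corpus_size / 100.0)
```

So an f-p entry of 0.075 corresponds to 3 ham messages out of 4000, and an
f-n entry of 1.236 to 34 spam messages out of 2750.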

The 8 ham messages for which at least one run claimed it was spam are
attached.  Note that I finally removed the "If AOL were a car" spam from the
good corpus; while it may or may not be amusing, it *was* automated bulk
email, even to the extent of including large blocks of random characters at
the end.  The message consisting almost entirely of quoting a Nigerian scam
message looks like it would be a "false positive" under any scheme worth
using, but I left it in the good corpus (so it's still an f-p here), because
it wasn't bulk email (the original msg was, but the reply was not).

--Boundary_(ID_Uqs78No0Dj49zOTKCoTyzA)
Content-type: application/x-zip-compressed; name=fp.zip
Content-transfer-encoding: base64
Content-disposition: attachment; filename=fp.zip

[base64-encoded contents of fp.zip omitted]

--Boundary_(ID_Uqs78No0Dj49zOTKCoTyzA)--


From tim.one@comcast.net  Mon Sep  2 03:50:36 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 01 Sep 2002 22:50:36 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <15730.52784.407584.441515@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHDBBAB.tim.one@comcast.net>

>     Tim> It takes about an hour to run and evaluate tests for one change.
>     Tim> If you want to motivate me to try, supply a patch against
>     Tim> timtest.py (in the sandbox), else I've already got far more ideas
>     Tim> than time to test them properly.  Anyone else want to test this
>     Tim> one?

[Skip Montanaro]
> Care to identify some of those ideas?

Nope, I'm puking sick of this topic now.  Look for XXX comments in
timtest.py for some of them.  You can infer others from places where XXX
comments aren't <wink>.  The f-p rate can't be improved anymore (meaning
that it's too low for me to measure an improvement if one were made).  The
f-n rate is still high, but adding more headers is likely the most effective
way to cut f-n, and my testing corpora won't allow me to test that (the
header lines are too damned different since my ham and spam came from
entirely different sources).

It's somebody else's turn now ... and thank Barry for the email pkg!  It's
been a joy to use.



From oren-py-d@hishome.net  Mon Sep  2 05:22:05 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 2 Sep 2002 00:22:05 -0400
Subject: [Python-Dev] Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
Message-ID: <20020902042205.GA29553@hishome.net>

Nice work!

Some other threads you may want to include in your summary:

The 'str' in 'string' feature:
    http://mail.python.org/pipermail/python-dev/2002-August/027354.html

PEP 237 deprecation warnings and hex constants:
    http://mail.python.org/pipermail/python-dev/2002-August/027783.html

PEP 277 - unicode filenames
    http://mail.python.org/pipermail/python-dev/2002-August/027651.html

	Oren


From tim.one@comcast.net  Mon Sep  2 07:54:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 02 Sep 2002 02:54:35 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A556@natasha.auslabs.avaya.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHIBBAB.tim.one@comcast.net>

[Delaney, Timothy]
> On second thought - if a word-pair appears, then the separate parts should
> not be checked as separate words.
>
> So, if I had scores:
>
>     'free'              0.1
>     'beer'              0.1
>     ('want', 'free',)   0.9
>     ('free', 'beer',)   0.01
>     ('free', '!!!',)    0.99
>
> then the following phrases would match (case-folding) as:
>
>     'I want free beer!!!':
>
>     ('want', 'free',)   0.9
>     ('free', 'beer',)   0.01
>
>     'Get *** for free!!!'
>
>     ('free', '!!!',)    0.99
>
>     'I want free beer. Free the beer!!!'
>
>     ('want', 'free',)   0.9
>     ('free', 'beer',)   0.01
>     'free'              0.1
>     'beer'              0.1
>
> Damn I wish I was at home to try this out ... :(

I'm going to say a lot of stuff here, and then shut up <wink>.  I want to
move on to other things, but there's an opportunity to pass on some darned
good advice for those who can hear.

Combining pairs of words is called "word bigrams".  My intuition at the
start was that it would do better.  OTOH, my intuition also was that
character n-grams for a relatively large n would do better still.  The
latter may be so for "foreign" languages, but for this particular task using
Graham's scheme on the c.l.py tests, turns out they sucked.  A comment block
in timtest.py explains why.

I didn't try word bigrams because the f-p rate is already supernaturally
low, so there doesn't seem anything left to be gained there.  This echoes
what Graham sez on his web page:

    One idea that I haven't tried yet is to filter based on word pairs, or
    even triples, rather than individual words.  This should yield a much
    sharper estimate of the probability.

My comment with benefit of hindsight:  it doesn't.  Because the scoring
scheme throws away everything except about a dozen extremes, the
"probabilities" that come out are almost always very near 0 or very near 1;
only very short or (or especially "and") very bland msgs come out in
between.  This outcome is largely independent of the tokenization scheme --
the scoring scheme forces it, provided only that the tokenization scheme
produces stuff *some* of which *does* vary in frequency between spam and
ham.
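
Why the scores pile up at the extremes is easy to see in a sketch of the
combining step.  This assumes the rule from Graham's "A Plan for Spam":
keep only the clues farthest from 0.5 and combine them as
prod(p) / (prod(p) + prod(1-p)).  The cutoff of 15 and the clamping to
[0.01, 0.99] are Graham's published values, not necessarily what
timtest.py uses, and the function name is made up.

```python
from math import prod


def combined_probability(probs, max_clues=15):
    """Combine per-word spam probabilities using only the extremes."""
    # Clamp so no single clue can force the result to exactly 0 or 1.
    clamped = [min(0.99, max(0.01, p)) for p in probs]
    # Keep only the clues farthest from the neutral value 0.5.
    extremes = sorted(clamped, key=lambda p: abs(p - 0.5),
                      reverse=True)[:max_clues]
    spam = prod(extremes)
    ham = prod(1.0 - p for p in extremes)
    return spam / (spam + ham)
```

Even fifteen mildly spammy clues at 0.6 combine to better than 0.99 under
this rule, so any tokenization that yields a handful of lopsided clues
ends up near 0 or 1.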

    For example, in my current database, the word "offers" has a probability
    of .96. If you based the probabilities on word pairs, you'd end up with
    "special offers" and "valuable offers" having probabilities of .99 and,
    say, "approach offers" (as in "this approach offers") having a
    probability of .1 or less.

The theory is indeed appealing <wink>.

    The reason I haven't done this is that filtering based on individual
    words already works so well.

Which is also the reason I didn't pursue it.

    But it does mean that there is room to tighten the filters if spam gets
    harder to detect.

I expect it would also need a different scoring scheme then.

OK, I ran a full test using word bigrams.  It gets one strike against it at
the start because the database size grows by a factor between 2 and 3.
That's only justified if the results are better.  Before-and-after f-p
(false positive) percentages:

   before   bigrams
    0.000   0.025
    0.000   0.025
    0.050   0.050
    0.000   0.025
    0.025   0.050
    0.025   0.100
    0.050   0.075
    0.025   0.025
    0.025   0.050
    0.000   0.025
    0.075   0.050
    0.050   0.000
    0.025   0.050
    0.000   0.025
    0.050   0.075
    0.025   0.025
    0.025   0.025
    0.000   0.000
    0.025   0.050
    0.050   0.025

Lost on 12 runs
Tied on  5 runs
Won  on  3 runs

total # of unique fps across all runs rose from 8 to 17

The f-n percentages on the same runs:

   before   bigrams
    1.236   1.091
    1.164   1.091
    1.454   1.708
    1.599   1.563
    1.527   1.491
    1.236   1.127
    1.163   1.345
    1.309   1.309
    1.891   1.927
    1.418   1.382
    1.745   1.927
    1.708   1.963
    1.491   1.782
    0.836   0.800
    1.091   1.127
    1.309   1.309
    1.491   1.709
    1.127   1.018
    1.309   1.018
    1.636   1.672

Lost on  9 runs
Tied on  2 runs
Won  on  9 runs

total # of unique fns across all runs rose from 336 to 350
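
The lost/tied/won lines can be recomputed from the two tables.  A short
sketch, with the numbers copied from the columns above (the names FP, FN
and tally are made up for the example); lower is better, so "lost" means
the bigram rate was higher on that run:

```python
# (before, bigrams) false positive percentages, one pair per run.
FP = [
    (0.000, 0.025), (0.000, 0.025), (0.050, 0.050), (0.000, 0.025),
    (0.025, 0.050), (0.025, 0.100), (0.050, 0.075), (0.025, 0.025),
    (0.025, 0.050), (0.000, 0.025), (0.075, 0.050), (0.050, 0.000),
    (0.025, 0.050), (0.000, 0.025), (0.050, 0.075), (0.025, 0.025),
    (0.025, 0.025), (0.000, 0.000), (0.025, 0.050), (0.050, 0.025),
]

# (before, bigrams) false negative percentages, one pair per run.
FN = [
    (1.236, 1.091), (1.164, 1.091), (1.454, 1.708), (1.599, 1.563),
    (1.527, 1.491), (1.236, 1.127), (1.163, 1.345), (1.309, 1.309),
    (1.891, 1.927), (1.418, 1.382), (1.745, 1.927), (1.708, 1.963),
    (1.491, 1.782), (0.836, 0.800), (1.091, 1.127), (1.309, 1.309),
    (1.491, 1.709), (1.127, 1.018), (1.309, 1.018), (1.636, 1.672),
]


def tally(pairs):
    """Return (lost, tied, won) counts for the 'after' column."""
    lost = sum(after > before for before, after in pairs)
    tied = sum(after == before for before, after in pairs)
    won = sum(after < before for before, after in pairs)
    return lost, tied, won
```

The "won" share for f-p is 3 of 20 runs, which is how often a single
training/test pair would have made bigrams look like an improvement.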

This doesn't need deep analysis:  it costs more, and on the face of it
either doesn't help, or helps so little it's not worth the cost.

Now I'll tell you in confidence <wink> that the way to make a scheme like
this excellent is to keep your ego out of it and let the data *tell* you
what works:  getting the best test setup you can is the most important thing
you can possibly do, which must include multiple training and test corpora
(e.g., if I had used only one pair, I would have had a 3/20 chance of
erroneously concluding that bigrams might help the f-p rate, when running
across 20 pairs shows that they almost certainly do it harm; while I would
have had an even chance of drawing a wrong conclusion-- in either
direction --about the effect on the f-n rate).

The second most important thing is to run a fat test all the way to the end
before concluding anything.  A subtler point is that you should never keep a
change that doesn't *prove* itself a winner:  neutral changes bloat your
code with proven irrelevancies that will come back to make your life harder
later, in part because they'll randomly interfere with future changes in
ways that make it harder to recognize a significant change when you stumble
into one.

Most things you try won't help -- indeed, many of them will deliver worse
results.  I dare say my intuition for this kind of classification task is
better than most programmers' (in part because I had years of professional
experience in a related field), and most of the things I tried I had to
throw away.  BFD -- then you try something else.  When I find something that
works I can rationalize it, but when I try something that doesn't, no amount
of argument can change that the data said it sucked <wink>.

Two things about *this* task have fooled me repeatedly:

1. The "only look at smoking guns" nature of the scoring step makes many
   kinds of "on average" intuitions worthless:  "on average" almost
   everything is thrown away!  For example, you're not going to find bad
   results reported for n-grams (neither character- nor word-based) in the
   literature, because most scoring schemes throw much less away.  Graham's
   scheme strikes me as brilliant in this specific respect:  it's worth
   enduring the ego humiliation <wink> to get such a spectacularly low f-p
   rate from such simple and fast code.

2. Most mailing-list messages are much shorter than this one.  This
   systematically frustrates "well, averaged over enough words" intuitions
   too.

Cute:  In particular, word bigrams systematically hate conference
announcements.  The current word one-gram scheme hated them too, until I
started folding case.  Then their SCREAMING stopped acting against them.
But they're still using the language of advertisement, and word bigrams
can't help but notice that more strongly than individual words do.

Here from the TOOLS Europe '99 announcement:

prob('more information') = 0.916003
prob('web site') = 0.895518
prob('please write') = 0.99
prob('you wish') = 0.984494
prob('our web') = 0.985578
prob('visit our') = 0.99

Here from the XP2001 - FINAL CALL FOR PAPERS:

prob('web site:') = 0.926174
prob('receive this') = 0.945813
prob('you receive') = 0.987542
prob('most exciting') = 0.99
prob('alberta, canada') = 0.99
prob('e-mail to:') = 0.99

Here from the XP2002 - CALL FOR PRACTITIONER'S REPORTS ('BOM' is an
artificial token I made up for "beginning of message", to give something for
the first word in the message to pair up with):

prob('web site:') = 0.926174
prob('this announcement') = 0.94359
prob('receive this') = 0.945813
prob('forward this') = 0.99
prob('e-mail to:') = 0.99
prob('BOM *****') = 0.99
prob('you receive') = 0.987542

Here from the TOOLS Europe 2000 announcement:

prob('visit the') = 0.96
prob('you receive') = 0.967805
prob('accept our') = 0.99
prob('our apologies') = 0.99
prob('quality and') = 0.99
prob('receive more') = 0.99
prob('asia and') = 0.99
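
For reference, tokens of the kind shown in these lists could be generated
roughly as follows.  This is a sketch under assumptions, not the code from
the test run: it splits on whitespace, folds case, and prepends the
artificial 'BOM' token so the first word has something to pair with; the
function name word_bigrams is made up.

```python
def word_bigrams(text, bom="BOM"):
    """Return adjacent word pairs, with a beginning-of-message marker."""
    words = [bom] + text.lower().split()    # fold case, keep BOM as-is
    return [f"{a} {b}" for a, b in zip(words, words[1:])]
```

A message starting with a row of asterisks then produces 'BOM *****' as
its first token, as in the XP2002 list above.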

A vanilla f-p showing where bigrams can hurt was a short msg about setting
up a Python user's group.  Bigrams gave it large penalties for phrases like
"fully functional" (most often seen in spams for bootleg software, but here
applied to the proposed user group's web site -- and "web site" is also a
strong spam indicator!).  OTOH, the poster also said "Aahz rocks".  As a
bigram, that neither helped nor hurt (that 2-word phrase is unique in the
corpus); but as an individual word, "Aahz" is a strong non-spam indicator on
c.l.py (and will probably remain so until he starts spamming <wink>).

It did find one spam hiding in a ham corpus:

"""
NNTP-Posting-Host: 212.64.45.236
Newsgroups: comp.lang.python,comp.lang.rexx
Date: Thu, 21 Oct 1999 10:18:52 -0700
Message-ID: <67821AB23987D311ADB100A0241979E5396955@news.ykm.com>
From: znblrn@hetronet.com
Subject: Rudolph The Rednose Hooters Here
Lines: 4
Path:
news!uunet!ffx.uu.net!newsfeed.fast.net!howland.erols.net!newsfeed.cwix.com!
news.cfw.com!paxfeed.eni.net!DAIPUB.DataAssociatesInc..com
Xref: news comp.lang.python:74468 comp.lang.rexx:31946
To: python-list@python.org

THis IS it: The site where they talk about when you are 50 years old.

http://huizen.dds.nl/~jansen20
"""

there's-no-substitute-for-experiment-except-drugs-ly y'rs  - tim



From tdelaney@avaya.com  Mon Sep  2 08:43:06 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 2 Sep 2002 17:43:06 +1000
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A55B@natasha.auslabs.avaya.com>

> From: Tim Peters [mailto:tim.one@comcast.net]
> 
> I'm going to say a lot of stuff here, and then shut up <wink>.  I want to
> move on to other things, but there's an opportunity to pass on some darned
> good advice for those who can hear.

Pretty darned good advice too ... but you won't object if I waste some time
playing with this stuff anyway I hope. Only one way to accumulate experience
after all ;)

Personally, I considered that you were already well past the point of
diminishing returns, and anything further was of academic interest to those
who felt a desire to tinker ... (i.e. the hard work has been done, and
everything else is just fun and games :) If enough people (or just one
dedicated person) waste enough time, who knows what may come out. Hey - it
worked for timsort didn't it ...? ;)

Tim Delaney


From mal@lemburg.com  Mon Sep  2 09:02:27 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Sep 2002 10:02:27 +0200
Subject: [Python-Dev] To commit or not to commit
References: <3D6A7742.1030005@livinglogic.de>	<200208261847.g7QIlI806850@pcp02138704pcs.reston01.va.comcast.net> <m3k7m5cr03.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D731B13.9090909@lemburg.com>

Martin v. Loewis wrote:
> Guido van Rossum <guido@python.org> writes:
> 
> 
>>>Any objections against committing the patch?
>>
>>What do MvL and MAL say?
> 
> Because of the size, I'm sure there are still bugs in it. I couldn't
> spot any by inspection, so I think the patch is ready to be installed.

+1.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From tim.one@comcast.net  Mon Sep  2 09:09:54 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 02 Sep 2002 04:09:54 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A55B@natasha.auslabs.avaya.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHLBBAB.tim.one@comcast.net>

[Delaney, Timothy]
> Pretty darned good advice too ... but you won't object if I waste
> some time playing with this stuff anyway I hope. Only one way to
> accumulate experience after all ;)

Not at all!  Knock yourself out -- it's really a lot of fun, except when it
gets so tedious you start punching the wall just to watch your knuckles
bleed <wink>.

> Personally, I considered that you were already well past the point of
> diminishing returns,

Not yet -- false positives are a horrible thing, and the false negative rate
still lets a lot of spam through.  Cutting the f-n rate, e.g., in half,
would mean half as much spam to deal with; generalization left to the
reader.

> and anything further was of academic interest to those who felt a
> desire to tinker ...

The best hope for reducing f-n lies in exploiting more header lines than I
can test with my mixed corpora, and there's *tons* of room for improvement
there (note that the f-n rate is more than 20x greater than the f-p rate
now).  Anyone who wants to tackle that with tedious experiment should first
pick Neil Schemenauer's brain:  he had a good start on that early last week.

> (i.e. the hard work has been done, and everything else is just fun and
> games :) If enough people (or just one dedicated person) waste enough
> time, who knows what may come out. Hey - it worked for timsort didn't
> it ...? ;)

Indeed so, and it works for this too -- never underestimate the power of
working yourself sick.  If you also *write* about it, you can make everyone
else ill too by proxy <wink>.

sharing-the-pain-ly y'rs  - tim



From walter@livinglogic.de  Mon Sep  2 12:21:22 2002
From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Mon, 02 Sep 2002 13:21:22 +0200
Subject: [Python-Dev] PyString_DecodeEscape and PEP293
References: <3D60EA3B.7030008@livinglogic.de> <m3ofbhcs7l.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D7349B2.8010706@livinglogic.de>

Martin v. Loewis wrote:

> Walter Dörwald <walter@livinglogic.de> writes:
> 
> 
>>A recent checkin added a function PyString_DecodeEscape()
>>to stringobject.c. To make this function PEP293 compatible
>>it would need access to unicode_decode_call_errorhandler
>>which is defined static in unicodeobject.c. Does
>>PyString_DecodeEscape() really need an errors argument?
> 
> 
> What do you mean, "really need"? The callers of this function pass the
> argument, in particular escape_decode. Is that "real"?

So does escape_decode need an errors argument? AFAICT
escape_decode is used only in the context of reading pickles.
Will there ever be a need to call escape_decode with anything
other than errors="strict"?

>>If yes, we could either move it to unicodeobject.c 
> 
> 
> No. It has to do little with Unicode.
> 
> 
>>or make unicode_decode_call_errorhandler externally visible.
> 
> 
> I don't know this function.

It's a static function in unicodeobject.c in the PEP293 patch
that does the complete error handling for decoding.

> What does this have to do with Unicode?

I expected that all codecs do unicode<->8bit encoding/decoding;
"string-escape" seems to be an exception.

>>Another problem that I noticed is that string-escape can't
>>be used for encoding Unicode objects:
> 
> 
> That is a feature. string-escape has nothing to do with Unicode.

So it doesn't need the new PEP293 error handling?

Bye,
    Walter Dörwald



From walter@livinglogic.de  Mon Sep  2 12:22:25 2002
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon, 02 Sep 2002 13:22:25 +0200
Subject: [Python-Dev] To commit or not to commit
References: <3D6A7742.1030005@livinglogic.de>	<200208261847.g7QIlI806850@pcp02138704pcs.reston01.va.comcast.net> <m3k7m5cr03.fsf@mira.informatik.hu-berlin.de> <3D731B13.9090909@lemburg.com>
Message-ID: <3D7349F1.4090100@livinglogic.de>

M.-A. Lemburg wrote:
> Martin v. Loewis wrote:
> 
>> Guido van Rossum <guido@python.org> writes:
>>
>>>> Any objections against committing the patch?
>>>
>>> What do MvL and MAL say?
>>
>> Because of the size, I'm sure there are still bugs in it. I couldn't
>> spot any by inspection, so I think the patch is ready to be installed.
> 
> +1.

OK, I'll check it in then.

Bye,
    Walter Dörwald



From pinard@iro.umontreal.ca  Mon Sep  2 13:02:55 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Mon, 02 Sep 2002 08:02:55 -0400
Subject: [Python-Dev] Re: The first trustworthy <wink> GBayes results
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHIBBAB.tim.one@comcast.net> (Tim
 Peters's message of "Mon, 02 Sep 2002 02:54:35 -0400")
References: <LNBBLJKPBEHFEDALKOLCAEHIBBAB.tim.one@comcast.net>
Message-ID: <oqsn0sha4w.fsf@titan.progiciels-bpi.ca>

[Tim Peters]

[... extremely good work and stuff and comments, for a good while now ...]

Hi, Tim.  I read your messages, witnessing your work and progress in that
area, with great interest, and also saved them for later contemplation! :-)

Spam has always annoyed me, as it has most of us, and despite my many efforts it
is increasingly successful at getting through my filters -- so this idea of
Graham or Bayesian filters is timely and welcome.  Most previous filters I
observed are based on various (random) tests or events (you surely know all
this), and `procmail'-based filters, or even the popular SpamAssassin, are
either very slow or at least slow.  The tool I have used since 1998 is much
faster, especially after I rewrote it in Python!  It is also based on various
tests or events.

Your work concentrated on tuning the statistical formulas and lexical
analysis, and building operational data from preset corpora.  I'm sure all the
knowledge gleaned there will make its way everywhere, and reach me.  For a
tiny share, I decided to experiment with day-to-day user aspects of using such
a filter, and built a Gnus interface over Eric Raymond's Bogofilter.  There
are two functions to this program, one is about learning from messages known
to be ham or spam, the other is about classification of incoming messages.  By
the way, if there are Gnus users among you, just ask me for the recipe...

It goes pretty well for me, so far.  The principle, put forward by Paul
Graham, is to let the user have two delete commands: delete-as-ham or
delete-as-spam.  Eric pushed this idea a bit further by postponing learning
until the user quits the mail reader, `mutt' in his case.  As Gnus allows me
to have many mailgroups and folders and shuffle between them, I postpone
learning until the user switches mailgroups or quits, and only for the _final_
disposition of a message: that is, when a message is merely saved into another
folder, the decision will be taken when leaving that other folder, and not the
current one.  Messages marked as "saved" are _not_ sent, so as to avoid double
learning.
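
A rough sketch of that deferral rule, written for current Python 3.  The real
interface is Gnus code driving Bogofilter and is not shown here, so the class
`DeferredLearner`, its method names, and the `train` callback are all
hypothetical stand-ins:

```python
class DeferredLearner:
    """Queue ham/spam verdicts per folder and train only when a folder is left."""

    def __init__(self, train):
        self.train = train    # hypothetical callable(message_id, is_spam)
        self.pending = {}     # folder name -> list of (message_id, disposition)

    def mark(self, folder, message_id, disposition):
        # disposition is "ham", "spam", or "saved" (moved to another folder).
        self.pending.setdefault(folder, []).append((message_id, disposition))

    def leave(self, folder):
        # Train on final dispositions only.  "saved" messages are skipped here,
        # so they are learned once, when the folder they moved to is left.
        for message_id, disposition in self.pending.pop(folder, []):
            if disposition != "saved":
                self.train(message_id, disposition == "spam")


learned = []
learner = DeferredLearner(lambda mid, is_spam: learned.append((mid, is_spam)))
learner.mark("inbox", "m1", "spam")
learner.mark("inbox", "m2", "saved")    # filed elsewhere, not learned yet
learner.mark("archive", "m2", "ham")    # final disposition in the other folder
learner.leave("inbox")
```

Only "m1" has been learned at this point; "m2" waits until "archive" is left,
which is the short-term bias toward spam described in the next paragraph.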

The fact is that ham messages are more likely to be postponed than spam,
because ham is more often filed here and there.  Even if many or most ham
messages are deleted, this introduces a short term bias in the learning
statistics by which the percentage of spam seems to be higher (in my case,
1157 messages have been learned in about three days, 20% of which were spam),
but this percentage will later be lowered as filed messages get reprocessed.
Another effect is that the delay itself in ham learning may have a slight
effect on classification, but since both ham and spam are well represented,
the effect is likely negligible.

Tim's corpora are surely very clean, at least by now, while day-to-day learning
may yield slightly tainted learning.  In my case, when a thread does not
interest me, I often kill all articles it contains in one command, without
opening each of them to see if it would not be spam: the threading itself
makes it unlikely.  But it is nevertheless possible: you surely noticed that bad
guys now fetch and re-use already published subjects as a way to get through.
That means that if big corpora are thinkable in case of mailing lists having
existed for a while, those are probably not very usable for individual users.
GBayes, Bogofilter and others should ideally resist some amount of
ham-tainted-as-spam or spam-tainted-as-ham at learning time.

After adding Graham filtering as a supplementary method to my spam detection
tool, I gladly observe that it successfully detects many spam messages which
would otherwise fall through the cracks, so it really brings something to me.
But I also see many spam cases (are they?) that it does not detect and hardly
could: one simple example is that _for me_, invalidly structured MIME is
indicative of an un-interesting message, as interesting people know better!

One particular problem I observed is Tim's messages themselves, which are
undoubtedly very yummy ham messages, but discuss and quote many spams
inside them.  Should these be registered as ham or spam? :-) Would not these
defeat the learning to some extent?  Where should Tim add his own messages in
the corpora he uses, and what changes would result in `GBayes' effectiveness?

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From guido@python.org  Mon Sep  2 15:01:45 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 02 Sep 2002 10:01:45 -0400
Subject: [Python-Dev] Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: Your message of "Sun, 01 Sep 2002 21:29:09 CDT."
 <15730.52469.604124.730029@localhost.localdomain>
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <15730.52469.604124.730029@localhost.localdomain>
Message-ID: <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net>

> Looks good to me.  The only trivial nit I would like to raise is that any
> URLs you embed in the text be true URLs.  I'd also prefer they be encased in
> <...>, but that's slightly less important and generally only matters when
> URLs are immediately followed by punctuation.  So, instead of
> 
>     Brett> Guido said he will revive the PEP.  The patch has since been put
>     Brett> on SF at python.org/sf/599331 .
> 
> you'd have
> 
>     Brett> Guido said he will revive the PEP.  The patch has since been put
>     Brett> on SF at <http://python.org/sf/599331>.
> 
> The two changes make it much more likely that email readers will be able to
> successfully highlight such URLs correctly.

I think adding http:// alone should be sufficient.  Despite all the
official recommendations, I've always hated the <...> form.  However,
do keep a space after the URL if punctuation were to follow (which you
already did).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Sep  2 15:06:05 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 02 Sep 2002 10:06:05 -0400
Subject: [Python-Dev] To commit or not to commit
In-Reply-To: Your message of "Mon, 02 Sep 2002 10:02:27 +0200."
 <3D731B13.9090909@lemburg.com>
References: <3D6A7742.1030005@livinglogic.de> <200208261847.g7QIlI806850@pcp02138704pcs.reston01.va.comcast.net> <m3k7m5cr03.fsf@mira.informatik.hu-berlin.de>
 <3D731B13.9090909@lemburg.com>
Message-ID: <200209021406.g82E65b30667@pcp02138704pcs.reston01.va.comcast.net>

> >>>Any objections against committing the patch?
> >>
> >>What do MvL and MAL say?
> > 
> > Because of the size, I'm sure there are still bugs in it. I couldn't
> > spot any by inspection, so I think the patch is ready to be installed.
> 
> +1.

OK, anchors away then! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pinard@iro.umontreal.ca  Mon Sep  2 16:15:54 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Mon, 02 Sep 2002 11:15:54 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net> (Guido
 van Rossum's message of "Mon, 02 Sep 2002 10:01:45 -0400")
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <15730.52469.604124.730029@localhost.localdomain>
 <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oqd6rwh179.fsf@titan.progiciels-bpi.ca>

>> you'd have
>> 
>>     Brett> Guido said he will revive the PEP.  The patch has since been put
>>     Brett> on SF at <http://python.org/sf/599331>.
>> 
>> The two changes make it much more likely that email readers will be able to
>> successfully highlight such URLs correctly.
>
> I think adding http:// alone should be sufficient.  Despite all the
> official recommendations, I've always hated the <...> form.

Gnus highlights correctly with the `http://', and adds clickability.  The `<'
and '>' are not needed.  I do not know what other mail readers do.

To get the same effects with email addresses, I often prefer using `mailto:'
as a prefix over writing `<' and `>' around a quoted address in a message
body, even if not fully systematic about this.  In the message header itself,
`<' and '>' are the proper way to go, of course.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From tim.one@comcast.net  Mon Sep  2 16:41:00 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 02 Sep 2002 11:41:00 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <oqd6rwh179.fsf@titan.progiciels-bpi.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEIJBBAB.tim.one@comcast.net>

I usually add <> to http thingies when I remember to.  A couple people
yelled at me, claiming their readers couldn't recognize http thingies
otherwise.  This seems particularly odd, since I almost always put them on
their own line:

    http://www.python.org

OTOH, *my* reader doesn't recognize them in the

    <URL:http://www.python.org>

style, neither with nor without <>.



From barry@python.org  Mon Sep  2 16:48:47 2002
From: barry@python.org (Barry A. Warsaw)
Date: Mon, 2 Sep 2002 11:48:47 -0400
Subject: [Python-Dev] To commit or not to commit
References: <3D6A7742.1030005@livinglogic.de>
 <200208261847.g7QIlI806850@pcp02138704pcs.reston01.va.comcast.net>
 <m3k7m5cr03.fsf@mira.informatik.hu-berlin.de>
 <3D731B13.9090909@lemburg.com>
 <3D7349F1.4090100@livinglogic.de>
Message-ID: <15731.34911.231999.691324@anthem.wooz.org>

>>>>> "WD" == Walter Dörwald <walter@livinglogic.de> writes:

    WD> OK, I'll check it in then.

Does that mean it's time to mark PEP 293 as Final and move it to the
Finished PEPs category in PEP 0?

-Barry


From walter@livinglogic.de  Mon Sep  2 17:29:14 2002
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon, 02 Sep 2002 18:29:14 +0200
Subject: [Python-Dev] To commit or not to commit
References: <3D6A7742.1030005@livinglogic.de>	<200208261847.g7QIlI806850@pcp02138704pcs.reston01.va.comcast.net>	<m3k7m5cr03.fsf@mira.informatik.hu-berlin.de>	<3D731B13.9090909@lemburg.com>	<3D7349F1.4090100@livinglogic.de> <15731.34911.231999.691324@anthem.wooz.org>
Message-ID: <3D7391DA.6010306@livinglogic.de>

Barry A. Warsaw wrote:
>>>>>>"WD" == Walter Dörwald <walter@livinglogic.de> writes:
>>>>>
> 
>     WD> OK, I'll check it in then.
> 
> Does that mean it's time to mark PEP 293 as Final and move it to the
> Finished PEPs category in PEP 0?

Guido already changed PEP 283, so: yes. Only a few cleanup tasks remain
(Neal's comments, LaTeX documentation for the rest of the C functions).

Bye,
    Walter Dörwald



From barry@python.org  Mon Sep  2 17:48:47 2002
From: barry@python.org (Barry A. Warsaw)
Date: Mon, 2 Sep 2002 12:48:47 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
References: <oqd6rwh179.fsf@titan.progiciels-bpi.ca>
 <LNBBLJKPBEHFEDALKOLCOEIJBBAB.tim.one@comcast.net>
Message-ID: <15731.38511.160332.641594@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

    TP> I usually add <> to http thingies when I remember to.  A
    TP> couple people yelled at me, claiming their readers couldn't
    TP> recognize http thingies otherwise.  This seems particularly
    TP> odd, since I almost always put them on their own line:

    TP>     http://www.python.org

As do I.

    TP> OTOH, *my* reader doesn't recognize them in the

    TP>     <URL:http://www.python.org>

    TP> style, neither with nor without <>.

Mine does too, but it's not the <> that is the distinguishing feature,
AFAIK.  The <> seem to be most useful for inline urls where trailing
punctuation gets incorrectly attached to the url.
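
The trailing-punctuation problem is easy to reproduce.  This is a sketch in
current Python 3 with two simplistic patterns; `NAIVE_URL` and `BRACKETED_URL`
are names made up for illustration, and no real mail reader is claimed to use
these exact regexps:

```python
import re

# Greedy match up to the next whitespace: swallows a trailing period.
NAIVE_URL = re.compile(r"http://\S+")
# Match only what sits between the angle brackets.
BRACKETED_URL = re.compile(r"<(http://[^>\s]+)>")

plain = "The patch has since been put on SF at http://python.org/sf/599331."
bracketed = "The patch has since been put on SF at <http://python.org/sf/599331>."

naive_match = NAIVE_URL.search(plain).group(0)
bracketed_match = BRACKETED_URL.search(bracketed).group(1)
```

Here `naive_match` ends in the sentence's final period, while
`bracketed_match` is the bare URL.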

-Barry


From drifty@bigfoot.com  Mon Sep  2 18:11:40 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Mon, 2 Sep 2002 10:11:40 -0700 (PDT)
Subject: [Python-Dev] Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.SOL.4.44.0209021010170.7176-100000@death.OCF.Berkeley.EDU>

[Guido van Rossum]

> > you'd have
> >
> >     Brett> Guido said he will revive the PEP.  The patch has since been put
> >     Brett> on SF at <http://python.org/sf/599331>.
> >
> > The two changes make it much more likely that email readers will be able to
> > successfully highlight such URLs correctly.
>
> I think adding http:// alone should be sufficient.  Despite all the
> official recommendations, I've always hated the <...> form.  However,
> do keep a space after the URL if punctuation were to follow (which you
> already did).
>

I think I will go with adding http:// to all addresses and putting them on
their own line.

-Brett



From martin@v.loewis.de  Mon Sep  2 21:31:56 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 02 Sep 2002 22:31:56 +0200
Subject: [Python-Dev] PyString_DecodeEscape and PEP293
In-Reply-To: <3D7349B2.8010706@livinglogic.de>
References: <3D60EA3B.7030008@livinglogic.de>
 <m3ofbhcs7l.fsf@mira.informatik.hu-berlin.de>
 <3D7349B2.8010706@livinglogic.de>
Message-ID: <m3fzws9lqb.fsf@mira.informatik.hu-berlin.de>

Walter Dörwald <walter@livinglogic.de> writes:

> So does escape_decode need an errors argument. AFAICT
> escape_decode is used only in the context of reading pickles.
> Will there ever be a need to call escape_decode with anything
> other than errors="strict"?

It's a codec, so anybody is entitled to write

  "foo".decode("string-escape", "replace")

if they choose to. Suggesting that this is not supported is
only acceptable if you also suggest how it should fail. Silently
ignoring the "replace" argument is not acceptable.

> > What does this have to do with Unicode?
>
> I expected that all codecs to unicode<->8bit coding/decoding
> "string-escape" seems to be an exception.

That was my original expectation as well. By now, I have accepted
things like

>>> "foo".encode("base64")
'Zm9v\n'

So codecs can do way more things than converting between
unicode<->byte strings. Whether it is a good thing that they are that
flexible is still open to debate, however, it was convenient for
string-escape.

> So it doesn't need the new PEP293 error handling?

Probably not - just supporting "strict", "replace", "ignore", and
failing for any other error handling would be sufficient. If you
manage to make it fail for anything but "strict", that would be
acceptable as well (IMO).
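
A sketch of the "fail loudly instead of silently ignoring" rule, written for
current Python 3 rather than the 2.3-era C code under discussion.  The wrapper
`checked_escape_decode` is hypothetical; it leans on `codecs.escape_decode`,
an undocumented CPython helper, so treat that call as an assumption too:

```python
import codecs

# The three handlers named above; anything else is refused up front.
SUPPORTED_ERRORS = ("strict", "replace", "ignore")


def checked_escape_decode(data, errors="strict"):
    """Decode backslash escapes in bytes, rejecting unknown error handlers."""
    if errors not in SUPPORTED_ERRORS:
        # Never drop the argument silently: say exactly what was rejected.
        raise ValueError("unsupported error handling: %r" % (errors,))
    decoded, _consumed = codecs.escape_decode(data, errors)
    return decoded
```

Restricting the tuple to `("strict",)` would give the stricter variant, where
everything but "strict" fails.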

Regards,
Martin


From tdelaney@avaya.com  Tue Sep  3 00:25:19 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Tue, 3 Sep 2002 09:25:19 +1000
Subject: [Python-Dev] Python-dev summary for 2002-08-15 - 2002-09-01
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A55C@natasha.auslabs.avaya.com>

> From: Brett Cannon [mailto:bac@OCF.Berkeley.EDU]
> 
> I think I will go with adding http:// to all addresses and 
> putting them on their own line.

May I suggest that this may be a good test document for reStructuredText?
Especially if it is going to such places as slashdot ...

Tim Delaney


From skip@pobox.com  Tue Sep  3 02:37:48 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 2 Sep 2002 20:37:48 -0500
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <oqd6rwh179.fsf@titan.progiciels-bpi.ca>
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <15730.52469.604124.730029@localhost.localdomain>
 <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net>
 <oqd6rwh179.fsf@titan.progiciels-bpi.ca>
Message-ID: <15732.4716.629172.615326@12-248-11-90.client.attbi.com>

    >> I think adding http:// alone should be sufficient.  Despite all the
    >> official recommendations, I've always hated the <...> form.

    François> Gnus highlights correctly with the `http://', and adds
    François> clickability.  The `<' and '>' are not needed.

I use VM.  It highlights correctly as long as the leading "http://" is
there, provided the URL isn't followed immediately by punctuation which can
occur in a URL.  In Brett's summary he avoids that problem by adding a space
between the URL and the ambiguous punctuation.  I think that looks odder
than the <...> notation.

Skip


From barry@python.org  Tue Sep  3 05:16:32 2002
From: barry@python.org (Barry A. Warsaw)
Date: Tue, 3 Sep 2002 00:16:32 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <15730.52469.604124.730029@localhost.localdomain>
 <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net>
 <oqd6rwh179.fsf@titan.progiciels-bpi.ca>
 <15732.4716.629172.615326@12-248-11-90.client.attbi.com>
Message-ID: <15732.14240.941982.728027@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    >> I think adding http:// alone should be sufficient.  Despite all
    >> the official recommendations, I've always hated the <...> form.

    SM> In Brett's summary he avoids that problem by adding a space
    SM> between the URL and the ambiguous punctuation.  I think that
    SM> looks odder than the <...> notation.

I agree (and also use VM :), but putting the url on a separate line
looks fine, unless that greatly increases the vertical whitespace.

-Barry


From akim@epita.fr  Tue Sep  3 07:26:52 2002
From: akim@epita.fr (Akim Demaille)
Date: 03 Sep 2002 08:26:52 +0200
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: <oqvg6w5au9.fsf@titan.progiciels-bpi.ca>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
 <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
 <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
 <200207301539.g6UFdUS09930@odiug.zope.com>
 <200207301622.g6UGMBl17143@odiug.zope.com>
 <oqbs8pi2ct.fsf@titan.progiciels-bpi.ca>
 <mv44regqn2o.fsf@nostromo.lrde.epita.fr>
 <oqvg6w5au9.fsf@titan.progiciels-bpi.ca>
Message-ID: <mv4bs7feggj.fsf@nostromo.lrde.epita.fr>

>>>>> "François" == François Pinard <pinard@iro.umontreal.ca> writes:

François> [Akim Demaille]
>> I'm not sure I completely understand the question here: if
>> HAVE_CONFIG_H is specified, it means config.h is created.  So if
>> you use a config.h, why does it matter not to define HAVE_CONFIG_H?

François> Hi, Akim.  I hope life is still good to you! :-)

Hi François!

The new (school) year is starting now, so life is still good, but I'm
a bit afraid of what might happen in the near future :)

François> In the beginnings of Autoconf, the `config.h' file did not
François> exist.  David MacKenzie added it as a way to reduce the
François> `make' output clutter.  Nowadays, I suspect almost all
François> packages of at least moderate size use it.

Agreed.

François> Our traditional `lib/' modules have to work in many
François> packages, whether `config.h' has been created or not, this
François> being decided on a per package basis, and that is why there
François> is a conditional inclusion of `config.h' in each of these
François> `lib/' modules.  It took a good while before we got
François> stabilised on the exact stanza of this inclusion (I
François> especially remember the massive unilateral changes by Roland
François> McGrath introducing the BROKEN_BROKET define, or something
François> like that, and all the doing it later took to clean this
François> out.)

I understand.

François> Python (the distribution, which is what is in question here)
François> does not use any of our `lib/' things, it is not going to
François> use them, and it is not going to provide new such modules,
François> so the distribution includes `config.h' everywhere, by
François> permanent choice, without any need to use `HAVE_CONFIG_H' to
François> decide if that inclusion is needed or not.  So, even
François> `-DHAVE_CONFIG_H' is useless `make' clutter in this case,
François> and that's why the Python packagers wanted to get rid of it.

François> In fact, in practice `-DHAVE_CONFIG_H' is only needed for
François> packages using those common `lib/' modules, but many
François> packages do not.  Now that Autoconf is used with projects
François> who have a life outside GNU, this is less necessary.  Guido
François> found, and got me to remember, that `@DEFS@' is the culprit:
François> people just do not have to use it in their hand-crafted
François> Makefiles, which is the case for Python.  For away-from-GNU
François> packages using Automake, some Automake option might exist so
François> `@DEFS@' does not get generated?  The only goal here is to
François> get a cleaner `make' output.

I understand the goal, but much of the effort is devoted to making the
thing work cleanly, not to making it beautiful.  Another goal is to keep
it easy to maintain, i.e., not having too much to document, too
much to support, too much to test etc.  So, although I don't know what
the Automake team might think of this idea, I suspect they'll want to
focus on other features :(


From sholden@holdenweb.com  Tue Sep  3 11:52:36 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Tue, 3 Sep 2002 06:52:36 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
References: <LNBBLJKPBEHFEDALKOLCOEIJBBAB.tim.one@comcast.net>
Message-ID: <00cb01c25338$05d87120$6300000a@holdenweb.com>

> I usually add <> to http thingies when I remember to.  A couple people
> yelled at me, claiming their readers couldn't recognize http thingies
> otherwise.  This seems particularly odd, since I almost always put them on
> their own line:
>
>     http://www.python.org
>
> OTOH, *my* reader doesn't recognize them in the
>
>     <URL:http://www.python.org>
>
> style, neither with nor without <>.
>

But it nevertheless sends out something that *it* will recognise as a URL.
Both your references were correctly represented as hyperlinks in OE when I
read your message!

regards
-----------------------------------------------------------------------
Steve Holden                                  http://www.holdenweb.com/
Python Web Programming                        pydish.holdenweb.com/pwp/
Previous .sig file retired to                    www.homeforoldsigs.com
-----------------------------------------------------------------------




From greg@python.org  Tue Sep  3 14:41:12 2002
From: greg@python.org (Greg Ward)
Date: Tue, 3 Sep 2002 09:41:12 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEAMBBAB.tim.one@comcast.net>
References: <20020828194248.GA16407@cthulhu.gerg.ca> <LNBBLJKPBEHFEDALKOLCOEAMBBAB.tim.one@comcast.net>
Message-ID: <20020903134112.GC1227@cthulhu.gerg.ca>

[Tim, last week]
> What's an acceptable false positive rate?

[my response]
> Speaking as one of the people who reviews suspected spam for python.org
> and rescues false positives, I would say that the more relevant figure
> is: how much suspected spam do I have to review every morning?  < 10
> messages would be peachy; right now it's around 5-20 messages per day.

[Tim again]
> I must be missing something.  I would *hope* that you review *all* messages
> claimed to be spam, in which case the number of msgs to be reviewed would,
> in a perfectly accurate system, be equal to the number of spams received.

Good lord, certainly not!  Remember that Exim rejects a couple hundred
messages a day that never get near SpamAssassin -- that's mostly
Chinese/Korean junk that's rejected on the basis of 8-bit chars or
banned charsets in the headers.  Then, probably 50-75% of what SA gets
its hands on scores >= 10.0, so it too is rejected at SMTP time.  Only
messages that score < 10 are accepted, and those that score >= 5.0 are
set aside in /var/mail/spam for review.  That's 10-30 messages/day.
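
The two thresholds amount to a three-way split.  This is a sketch in current
Python 3, not the actual Exim/SpamAssassin configuration on mail.python.org;
the function `triage` and the constant names are made up for illustration:

```python
REJECT_AT = 10.0   # scores at or above this are rejected at SMTP time
REVIEW_AT = 5.0    # scores at or above this (but below REJECT_AT) are set aside


def triage(score):
    """Map a SpamAssassin score to one of 'reject', 'review', or 'accept'."""
    if score >= REJECT_AT:
        return "reject"
    if score >= REVIEW_AT:
        return "review"   # lands in the file a human reviews each morning
    return "accept"
```

Only the middle band costs any human time, which is why its daily volume is
the figure that matters to the reviewer.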

(I do occasionally scan Exim's reject log on mail.python.org to see
what's getting rejected today -- Exim kindly logs the full headers of
every message that is rejected after the DATA command.  I usually make
it to about 11am of a given day's logfile before my eyes glaze over from
the endless stream of spam and viruses.)

Note that we *used* to accept messages before passing them to
SpamAssassin, so never rejected anything on the basis of its SA score.
Back then, we saved and reviewed probably 50-70 messages/day.  Very,
very, very few (if any) false positives scored >= 10.0, which is why
that's the threshold for SMTP-time rejection.

> OTOH, the false positive rate doesn't have anything to do with the number of
> spams received, it has to do with the number of non-spams received.

Err, yeah, good point.  I make a point of talking about "suspected
spam", which is any message that scores between 5.0 and 10.0.  IMHO, the
true nature of those messages can only be determined by manual
inspection.

> Maybe you don't want this kind of approach at all.  The classifier doesn't
> have "gray areas" in practice:  it tends to give probabilities near 1, or
> near 0, and there's very little in between -- a msg either has a
> preponderance of spam indicators, or a preponderance of non-spam indicators.

That's a great improvement over SpamAssassin then: with SA, the grey
area (IMHO) is scores from 3 to 10... which is why several python.org
lists now have a little bit of Mailman configuration magic that makes MM
set aside messages with an SA score >= 3 for list admin review.  (It's
probably worth getting the list admin to do a bit more work in order to
avoid sending low-scoring spam to the list.)

However, as long as "very little" != "nothing", we still need to worry a
bit about that grey area.  What do you think we should do with a message
whose spam probability is between (say) 0.1 and 0.9?  Send it on, reject
it, or set it aside?  Just how many messages fall in that grey area
anyways?

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
MTV -- get off the air!
    -- Dead Kennedys


From mcherm@destiny.com  Tue Sep  3 15:11:45 2002
From: mcherm@destiny.com (Michael Chermside)
Date: Tue, 03 Sep 2002 10:11:45 -0400
Subject: [Python-Dev] Re: PEP 218 (sets); moving set.py to Lib
Message-ID: <3D74C321.7070103@destiny.com>

>> Hmm, I intended to have s1.refresh() return a new object for use in
>> s2 while leaving s1 alone (being immutable and all).  Now, I wonder
>> if that was the right thing to do.  The answer lies in use cases for
>> algorithms that need sets of sets.  If anyone knows off the top of
>> their head that would be great; otherwise, I seem to remember that
>> some of that business was found in compiler algorithms and graph
>> packages.
> 
> Let's call YAGNI on this one.
> 

Furthermore, what if I create a BIG set like this:

   s = ImmutableSet( range(2**x) )

Now, not only do I use lots of memory for s, I ALSO keep around lots of 
memory to preserve a temporary list which I never wanted to keep anyhow!

-- Michael Chermside




From walter@livinglogic.de  Tue Sep  3 17:05:21 2002
From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Tue, 03 Sep 2002 18:05:21 +0200
Subject: [Python-Dev] PyString_DecodeEscape and PEP293
References: <3D60EA3B.7030008@livinglogic.de>	<m3ofbhcs7l.fsf@mira.informatik.hu-berlin.de>	<3D7349B2.8010706@livinglogic.de> <m3fzws9lqb.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D74DDC1.7040609@livinglogic.de>

Martin v. Loewis wrote:

> Walter Dörwald <walter@livinglogic.de> writes:
> 
> 
>>So does escape_decode need an errors argument. AFAICT
>>escape_decode is used only in the context of reading pickles.
>>Will there ever be a need to call escape_decode with anything
>>other than errors="strict"?
> 
> 
> It's a codec, so anybody is entitled to write
> 
>   "foo".decode("string-escape", "replace")
> 
> if they choose to. Suggesting that this is not supported is
> only acceptable if you also suggest how it should fail. Silently
> ignoring the "replace" argument is not acceptable.

I won't suggest that. Let's keep PyString_DecodeEscape as it is now.
It should not be a problem for encoding, because encoding can't fail,
so there is no need for using "xmlcharrefreplace" etc. as the error
handling. Decoding can fail, but let's add custom error handling
only when the need for it arises (which hopefully won't).

> [...]
>>So it doesn't need the new PEP293 error handling?
> 
> Probably not - just supporting "strict", "replace", "ignore", and
> failing for any other error handling would be sufficient. If you
> manage to make it fail for anything but "strict", that would be
> acceptable as well (IMO).

OK, let's keep PyString_DecodeEscape as it is now (i.e. "strict",
"ignore", "replace" implemented inline with no
custom error handling).
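
As a rough illustration of the three built-in error handlers named above,
here is a sketch in modern Python 3 (an assumption: the thread itself is
about Python 2's "string-escape" codec, which Python 3 no longer has).  It
uses the ASCII codec only because its failure mode is easy to trigger, and
decode_with() is a hypothetical helper, not part of any library:

```python
# Illustration only: Python 3 syntax, ASCII codec instead of "string-escape".
data = b"abc\xffdef"  # \xff is not valid ASCII, so decoding must handle an error


def decode_with(errors):
    """Decode the sample bytes using the named error handler."""
    return data.decode("ascii", errors)


# "strict" refuses to guess and raises.
try:
    decode_with("strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

replaced = decode_with("replace")  # the bad byte becomes U+FFFD
ignored = decode_with("ignore")    # the bad byte is silently dropped
```

The point of the exchange is that a codec has to do something well defined
for each handler name it is given, rather than quietly ignoring the argument.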

Bye,
    Walter Dörwald



From tim.one@comcast.net  Tue Sep  3 17:27:57 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 03 Sep 2002 12:27:57 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <00cb01c25338$05d87120$6300000a@holdenweb.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGEBPDKAA.tim.one@comcast.net>

[Tim]
> OTOH, *my* reader doesn't recognize them in the
>
>     <URL:http://www.python.org>
>
> style, neither with nor without <>.

[Steve Holden]
> But it nevertheless sends out something that *it* will recognise as a URL.

I think you're assuming I use Outlook Express.  I don't; "my reader" is
usually Outlook 2000.

> Both your references were correctly represented as hyperlinks in OE when I
> read your message!

Yes, OE and Outlook differ in this respect.



From guido@python.org  Tue Sep  3 17:53:45 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 03 Sep 2002 12:53:45 -0400
Subject: [Python-Dev] Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: Your message of "Sun, 01 Sep 2002 15:57:53 PDT."
 <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
Message-ID: <200209031653.g83GrjQ01929@odiug.zope.com>

> Yes, with Michael's permission, I am attempting to start up the Python-dev
> summaries again.  Below is my attempt at summarizing the last half of
> August.  It's longer than normal summaries, but that is because I bothered
> to include discussions on threads that were not directly relating to the
> Python core but are interesting nonetheless (e.g., the whole spambayes
> thread).
> 
> I am posting to Python-dev first before posting to c.l.py, c.l.py.a (also
> lwn.net and probably Slashdot) because I want to get the general okay from
> the list that I have done a good enough job to send this out; I don't
> want to have a summary that represents the goings-on here without the
> general populace (or just the BDFL since he can overrule =) being okay
> with it.  I am also curious as to whether I should go into more or less
> detail, leave out the summaries that do not directly pertain to the Python
> core, etc.
> 
> So please read the summary and let me know if you are okay with it.  If so
> I will try to do semi-monthly summaries from now on.  Oh, and I am on
> vacation right now and will be doing a lot of travelling in the next two
> months, so I can't guarantee summaries will be this quick to come out for
> a while.  I will do them, though, even if they are a week late.  =)
> 
> Oh, and if I do get the okay to do this, expect a lot of dumb questions
> from me in the future in terms of clarifying things.  Just remember, it is
> for the good of the Python community.  =)

Thanks, Brett.  Minor comments ahead; but basically, go ahead --
don't let striving for perfection keep you from posting something good!

> 
> =======================================
> 
> 
> This is a summary of traffic on the python-dev mailing list between August
> 16, 2002 and September 1, 2002 (exclusive).  It is intended to inform the
> wider Python community of ongoing developments.  To comment, just post to
> python-list@python.org or comp.lang.python in the usual way. Give your
> posting a meaningful subject line, and if it's about a PEP, include the
> PEP number (e.g. Subject: PEP 201 - Lockstep iteration).  All python-dev
> members are interested in seeing ideas discussed by the community, so
> don't hesitate to take a stance on a PEP if you have an opinion.
> 
> This is the first summary written by Brett Cannon.
> Summaries are archived nowhere at the moment.  =)   They will be, though,
> so stay tuned for the URL in future summaries.
> 
> 
> 
>    Posting distribution (with apologies to mbm, but thanks to mwh for the
> code)
> 
>    Number of articles in summary: 585
> 
>     80 |                     [|]
>        |                     [|]
>        |                     [|]
>        |                     [|]
>        | [|]                 [|]
>     60 | [|]             [|] [|]
>        | [|]             [|] [|]
>        | [|]             [|] [|]
>        | [|]             [|] [|]
>        | [|]             [|] [|]                 [|]
>     40 | [|]         [|] [|] [|]                 [|]
>        | [|]         [|] [|] [|]         [|]     [|]         [|]
>        | [|]         [|] [|] [|]         [|]     [|]         [|] [|]
>        | [|]         [|] [|] [|] [|]     [|]     [|]     [|] [|] [|]
>        | [|]         [|] [|] [|] [|]     [|]     [|] [|] [|] [|] [|]
>     20 | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
>        | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
>        | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
>        | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
>        | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
>      0 +-071-025-012-042-063-084-030-021-039-009-047-027-033-041-036-005
>         Fri 16| Sun 18| Tue 20| Thu 22| Sat 24| Mon 26| Wed 28| Fri 30|
>             Sat 17  Mon 19  Wed 21  Fri 23  Sun 25  Tue 27  Thu 29  Sat 31

I'm not sure I care about this diagram.  It's also kind of hard to
read.  I would mind less if it was at the end of the summary.

> 
> 
> ================
> Type Categories
> ================
> This VERY long thread was sparked by Andrew Koenig asking if a discussion
> of making type categories more explicit had ever occurred (Andrew meant for
> category to mean "the set of all types that implement a particular marker
> interface").  As Andrew later pointed out, he was asking about  "a way of
> making notions such as 'file-like object' more formal and/or automatic".
> The discussion quickly started using the term interface to mean defining a
> way to specify that an object implemented certain methods (think of it in
> terms of Java's 'implements' mechanism).  Once that was out of the way,
> the discussion took off.  Zope's implementation was pointed out
> (http://cvs.zope.org/Zope3/lib/python/Interface/) very quickly.  PEP 245
> (Python Interface Syntax) was also brought to the attention of the list.
> The idea of using inheritance to handle interfaces was brought up.  Guido
> said that he hasn't "given up the hope that inheritance and interfaces
> could use the same mechanisms.  But Jim Fulton, based on years of
> experience in Zope, claims they really should be different" in terms of
> how interfaces should be handled in objects.  Jeremy Hylton tried to
> channel Jim's opinion by pointing out that "We'd like to use interfaces to
> make fairly strong claims.  If a class A implements an interface I, then
> we should be able to use an instance of A anywhere that an I is needed."
> But "the inheritance mechanism is too general" because if a class A
> implements interface I and then a class B, which does not implement I,
> subclasses class A we end up with a class B that claims it has a certain
> interface which it doesn't actually have.  Guido understood the point, but
> still thought inheritance could be used "if there was a way to "shut off"
> inheritance as far as isinstance() (or issubclass())" is concerned.  Guido
> asked the simple question, "Why do keep arguing for inheritance?  (a) the
> need to deny inheritance from an interface, while essential, is relatively
> rare IMO, and in *most* cases the inheritance rules work just fine; (b)
> having two separate but similar mechanisms makes the language larger."
> Samuele Pedroni asked that any implementation "allow also for referring to
> anonymous super-interfaces of an interface in terms of the interface plus
> a subset of its signatures, also e.g. FileLike and just 'write'.  [that
> means an interface can be thought to correspond to a set of
> (tag,signature) tuples, where tag identifies the interface, and one can
> also just consider subsets of it]".  The thread has finally seemed to have
> stopped (for now) with Guido saying he is mulling the whole thing in the
> back of his head.  This is a very sticky topic because of the number of
> design decisions required and how it might change the way people program
> in Python.

Please break up that paragraph into pieces shorter than 12 lines
each. :-)

> There was also a partial sub-thread in this whole discussion about
> multimethods; basically a way to do overloading of methods based on
> parameter signature.  Most of the discussion was over syntax and such and
> how to handle resolution order.  It then seemed to fall by the wayside when
> the main part of the thread took over again.
> 
> ==============================
> type categories -- an example
> ==============================
> This thread was started when Andrew Koenig said that the reason he
> brought up his type category question was because he wanted a way so as to
> be able to identify members of a type easily.  He now had an example in a
> program he was writing where what the type of the argument was varied and
> thus what needed to be done to the data changed accordingly.  Jeremy
> Hylton suggested the isinstance(obj, type(re.compile(''))) idiom.  Andrew
> asked if this was guaranteed to work, to which Jeremy said no.  I asked why
> this was not guaranteed, and Fredrik Lundh said because re.compile() is
> a factory function and it is possible that a future version could return a
> different object based on the pattern.
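
For readers who have not seen the idiom, a minimal sketch of how it might be
used follows.  It is written for Python 3, and find_all() is a hypothetical
helper invented for this illustration; as noted above, nothing guarantees
that re.compile() always returns objects of one single type:

```python
import re

# Type of a compiled pattern, obtained the way the thread suggests.
PatternType = type(re.compile(""))


def find_all(pattern, text):
    """Accept either a pattern string or an already compiled pattern."""
    if isinstance(pattern, PatternType):
        compiled = pattern
    elif isinstance(pattern, str):
        compiled = re.compile(pattern)
    else:
        raise TypeError("pattern must be a str or a compiled pattern")
    return compiled.findall(text)
```

Both find_all("a+", text) and find_all(re.compile("a+"), text) take the same
path after the type check, which is the kind of dispatch on "type category"
the thread was about.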
> 
> ===============================================
> Python build trouble with the new gcc/binutils
> ===============================================
> Andrew Koenig said that he couldn't compile Python using the newest gcc
> (this was the day after the latest release hit servers).  With help from
> Zack Weinberg of Code Sourcery (who also recently rewrote the tempfile
> module), the problem was tracked down to binutils 2.13; it
> was not Python's fault.
> 
> ===================================
> Last call: mortal interned strings
> ===================================
> The patch python.org/sf/576101 removes the default immortality of interned
> strings.  I believe it was in early August (possibly spilled over from
> late July) when Oren Tirosh proposed the idea and wrote the above
> mentioned patch.  There had been some discussion over whether any 3rd
> party code was reliant upon interned strings being immortal; none was
> found (MacPython was reliant upon it, but since it is under Python core
> control it was considered a moot point since it could be changed).  It has
> been checked in.  With the patch the way to make a string immortal is to
> call PyString_InternImmortal(); no code in the core uses this function.
> 
> =====================================
> PEP 218 (sets); moving set.py to Lib
> =====================================
> Thanks to Greg Wilson (for writing the PEP), Alex Martelli (for writing
> the module initially), and Guido (for refactoring Alex's code) the stdlib

You might add Raymond Hettinger who wrote the docs and did significant
work on the code after me.  Also Tim Peters who added some good speedups.

> has now gained a sets module.  It has both the notion of mutable and
> immutable sets (the latter used when you have a set of sets).  There was
> discussion about how sets should print (sorted or not; unsorted is default
> but option is there to print sorted)

This option is no longer documented though.  It may yet disappear.

>                                      and what operators should be
> overloaded for working on sets (| and & were chosen).  The module is a
> beautiful chunk of code and I highly recommend reading its source.

Thanks.

> ===========================================
> A few lessons from the tempfile.py rewrite
> ===========================================
> Zack Weinberg, after rewriting the tempfile module, brought up three
> points:
> 1) Lack of dummy threads, 2) lack of a pthreads_once equivalent, and 3)
> lack of a way to skip tests from unittest.py via some built-in method.
> Guido responded accordingly: 1) since some code uses the idiom of trying
> to import thread and catching the exception if it fails, Guido said he
> would be willing to accept a dummy_thread.py that would allow:
> 
> try:
>     import thread as _thread
> except ImportError:
>     import dummy_thread as _thread
> 
> to work.  No word on whether this is being written at the moment.  2)
> Guido said the method was, in his opinion, overkill.  He said to "be
> Pythonic, live dangerously, accept the risk that a ^C can screw you.  It
> can anyway. :-)".  And as for 3) Guido deferred Zack to the PyUnit list
> and Steve Purcell since Python just tracks Steve's code (pyunit.sf.net).
> Guido's suggestion was to stick code that was reliant on some other code
> in a separate testing suite that is only run when the reliant code is
> available.
> 
> ===========================
> Standard datetime objects?
> ===========================
> Kevin Jacobs asked what stage the new datetime object was at.  Guido said
> it is in python/nondist/sandbox/datetime/ in CVS which also has comments
> pointing to a wiki containing the current work on it.  Fred L. Drake, Jr.
> is working on the C re-implementation and Guido expects a checkin at any
> moment (hasn't happened as of this writing).

Has now, in the sandbox (more to come).

> ===================
> PEP 269 versus 283
> ===================
> Jonathan Riehl noticed that PEP 283 said PEP 269 was dead; not good
> considering he was close to having a patch for PEP 269 (pgen module to
> interface with the C version).  Guido said he will revive the PEP.  The
> patch has since been put on SF at python.org/sf/599331 .
> 
> ==============================
> What is a backport candidate?
> ==============================
> Since Python 2.2 is going to be around for a long time, the question was
> brought up of what constitutes code that should be backported.  Guido made
> the following three points:
> 
> 1) code trivial to backport should always be backported
> 
> 2) code patcheing 2.3 code should obviously not be backported
               x

> 
> 3) 2.2 code requires changes to use patch, but applies; gradients of this
> exist.
> 
> So please, when submitting patches, mention whether you think the patch
> should be backported to the 2.2 tree and any possible dependencies it
> might have in a backport.
> 
> =================================
> python/nondist/sandbox/spambayes
> =================================
> In response to Paul Graham's spam filter written using Bayes' Rule
> (Slashdot post on it is at
> http://developers.slashdot.org/article.pl?sid=02/08/16/1428238&tid=156), a
> thread spawned around this checkin of code that followed that paper's
> suggestions.  This thread quickly jumped into discussions on data
> structures, Bayes' Rule, and a whole lot of talk about spam.  Very
> interesting if spam filtering interests you.  Tim Peters has been leading
> the drive on this chunk of code (and thanks to an illness that befell
> him in late August, which he has since gotten over, he had a few days
> of major hacking on it; Tim showed he is a performance stats whore
> <wink>).
> 
> A very cool quote came out of this thread from Eric S. Raymond when
> discussing the spam filter he has been working on: "This is actually the
> first new program I've coded in C (rather than
> Python) in a good four years or so".

(Several of us think even this didn't have to be coded in C after
all. :-)

> ====================
> Parsing vs. lexing.
> ====================
> In response to a question by Aahz about what the differences were between
> a lexer, parser, and tokenizer, Eric Raymond posted a good overview of the
> differences.  Guido later commented in an email mentioning SPARK and about
> how Python's lexer (pgen) works and why he wrote it.  He also made some
> other comments on lexers.  Jeremy Hylton pointed out a "neat new paper
> about an old algorithm for recursive descent parsers with backtracking and
> unlimited lookahead" by Bryan Ford at http://www.brynosaurus.com/pub.html
> .  Alex Martelli pointed out that this discussion reminded him of "a
> long-ago interview with Borland's techies"  in which they said they were
> able to make Borland PASCAL fit on a floppy while MS PASCAL took multiple
> floppies.  Their trick was "we just did everything by the Dragon Book --
> except that the parser is a hand-written recursive descent parser [Aho &c
> being adamant defenders of Yacc & the like], which buys us a lot".
> Someone named Noah also emailed a discussion on lexers and parsers pulling
> in Finite State Machines, Pushdown Automata, and Turing Machines in his
> discussion.
> 
> Martin Sj?n says that Haskell's pattern matching and lazy evaluation makes

Come on, you know his real name is Sjögren. :-)

> lexers easy (even a Recursive-Descent parser), but unfortunately Haskell
> does not play with other languages nicely.  Haskell is where Python got
> its list comprehension idea.
> 
> =========================================
> [Python-Dev] Fw: Security hole in rexec?
> =========================================
> It was brought to the attention of the list that deleting __builtins__
> allowed a compromise in rexec.  Guido pointed out that
> python.org/sf/577530 reports this.  He also said don't trust rexec.
> 
> A patch is going to be submitted to document the view that rexec is really
> not that safe.

It was checked in.

> =================
> A `cogen' module
> =================
> Francois Pinard asked about Cartesian products using the new sets module.
> Guido didn't think people would in general need it.  Francois quickly
> started this thread of discussing a cogen module to generate Cartesian
> products and other ways of operating on sets.

Tim Peters quickly posted *his* elaborate state-of-the-art code, which
ended the discussion (as usual, posting code is a good way to stop
discussion :-).

> =================
> Mersenne Twister
> =================
> Raymond Hettinger volunteered to implement the Mersenne Twister algorithm
> (one in Python exists at www.math.keio.ac.jp/~matumoto/emt.html).  While
> discussing whether to implement it in C or Python, Guido noticed that random.Random
> re-implements whrandom.  Guido then came up with the idea of writing a
> base random class that is subclassed where .random() can be implemented;
> Tim Peters agreed and suggested more methods to subclass.
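
A minimal sketch of the "subclass and supply .random()" idea, using today's
Python 3 random.Random as the base class.  The generator below is a toy
linear congruential generator swapped in for brevity, not the Mersenne
Twister, and the class name and constants are assumptions made for this
illustration only:

```python
import random


class LCGRandom(random.Random):
    """Toy generator: an LCG core behind the standard Random interface."""

    _A = 1664525        # multiplier (illustrative constants)
    _C = 1013904223     # increment
    _M = 2 ** 32        # modulus

    def seed(self, a=None, version=2):
        # Called by random.Random.__init__; an integer seed is assumed here.
        self._state = 12345 if a is None else int(a) % self._M

    def random(self):
        # The only method a subclass strictly has to supply: a float in [0, 1).
        self._state = (self._A * self._state + self._C) % self._M
        return self._state / self._M


gen = LCGRandom(42)
sample = gen.uniform(10, 20)  # inherited method, driven by our random()
```

Derived methods such as uniform() come from the base class unchanged, which
is the design benefit being discussed: a new core generator only has to
replace one small piece.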
> 
> =================================
> New PEP Format: reStructuredText
> =================================
> David Goodger and Barry Warsaw have now gotten reST as a usable syntax for
> PEPs.  Read the PEPs on the subject to learn more:
> 
> - PEP 12 -- Sample reStructuredText PEP Template
>   (http://www.python.org/peps/pep-0012.html)
> 
> - PEP 258 -- Docutils Design Specification
>   (http://www.python.org/peps/pep-0258.html)
> 
> - PEP 287 -- reStructuredText Docstring Format
>   (http://www.python.org/peps/pep-0287.html)
> 
> ====================================
> tiny optimization in ceval mainloop
> ====================================
> Jeremy Hylton noticed that in ceval there is a test of whether the
> ticker was 0 or if things_to_do was set to true (explanation of the
> ticker, checkinterval, and the GIL follow this paragraph).  Jeremy
> wondered if we could just drop the ticker to 0 when things_to_do is true.
> Jack Jansen, though, pointed out that clearing it is not guaranteed since
> there may be an interrupt routine when "we fiddle things_to_do".  Skip
> Montanaro then pointed out that since neither ticker nor things_to_do is
> fiddled with unless the GIL is held that instead of causing each thread to
> execute this test that they could be made globals instead; he did a patch
> that implements this (python.org/sf/602191).  Guido then said that if
> there wasn't a decent speed improvement, then no patch would be checked
> in.  He then changed his mind when it was pointed out that it actually
> simplified the code.  Skip tested anyway, though, and there is a speed
> improvement.  This also brought up whether the default value of 10 for
> checkinterval was reasonable.  It was then agreed to be bumped up to 100.
> Jack ran some code and said he noticed a definite improvement.
> 
> Python's version of threading is not like in C.  There is something called
> the GIL (Global Interpreter Lock) which any thread wishing to execute
> Python code or play with Python objects must hold.  This means that when
> you have Python threads running (using the thread or threading module)
> they are usually all waiting in line to get the GIL.  Now for Python to
> decide when to release the GIL for another thread to grab it, it uses the
> ticker.  This variable counts down to zero by being decremented every time
> a Python opcode is executed (originally defaulted to 10, now defaulted to
> 100).  The ticker's starting value after each release of the GIL is what
> sys.setcheckinterval() sets.
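
To make the effect of the check interval concrete, here is a deliberately
simplified model of the countdown described above.  It is a sketch written
for illustration, not the actual ceval loop, and count_switch_points() is a
hypothetical name:

```python
def count_switch_points(n_opcodes, checkinterval):
    """Count how often the ticker reaches zero while running n_opcodes."""
    ticker = checkinterval
    switch_points = 0
    for _ in range(n_opcodes):
        ticker -= 1                  # one decrement per executed opcode
        if ticker == 0:
            switch_points += 1       # here the GIL could be handed over
            ticker = checkinterval   # restart the countdown
    return switch_points


old_default = count_switch_points(1000, 10)    # interval of 10
new_default = count_switch_points(1000, 100)   # interval of 100
```

Under this model, raising the interval from 10 to 100 cuts the number of
potential hand-over points per 1000 opcodes by a factor of ten, which is
where the reduced overhead would come from.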
> 
> To get a better understanding of threading under Python I recommend
> reading Aahz's tutorials on threading.
> 
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev

All in all, please keep this up!!!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Tue Sep  3 18:53:36 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 03 Sep 2002 13:53:36 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <20020903134112.GC1227@cthulhu.gerg.ca>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKECEDKAA.tim.one@comcast.net>

[Tim again]
>> I must be missing something.  I would *hope* that you review
>> *all* messages claimed to be spam, in which case the number of msgs
>> to be reviewed would, in a perfectly accurate system, be equal to the
>> number of spams received.

[Greg Ward]
> Good lord, certainly not!  Remember that Exim rejects a couple hundred
> messages a day that never get near SpamAssassin -- that's mostly
> Chinese/Korean junk that's rejected on the basis of 8-bit chars or
> banned charsets in the headers.  Then, probably 50-75% of what SA gets
> its hands on scores >= 10.0, so it too is rejected at SMTP time.  Only
> messages that score < 10 are accepted, and those that score >= 5.0 are
> set aside in /var/mail/spam for review.  That's 10-30 messages/day.
>
> (I do occasionally scan Exim's reject log on mail.python.org to see
> what's getting rejected today -- Exim kindly logs the full headers of
> every message that is rejected after the DATA command.  I usually make
> it to about 11am of a given day's logfile before my eyes glaze over from
> the endless stream of spam and viruses.)

I get about 200 spams per day on my own email accounts, and look at all of
them.  I don't look at the headers at all, I just look at the msgs in a
capable HTML-aware mail reader, as a matter of course while dealing with all
the day's email.  It's rare that it takes more than a second to recognize a
spam by eyeball and hit the delete key.  At about 200 per day, it's just now
reaching my "hmm, this is becoming a nuisance sometimes" threshold.  Our
tolerance levels for manual review seem to differ by a factor of 100 or more
<wink>.

> Note that we *used* to accept messages before passing them to
> SpamAssassin, so never rejected anything on the basis of its SA score.
> Back then, we saved and reviewed probably 50-70 messages/day.  Very,
> very, very few (if any) false positives scored >= 10.0, which is why
> that's the threshold for SMTP-time rejection.

I can tell you the mean false negative and false positive rates on what I've
been working on, and even measure their variance across both training and
prediction sets.  (The fn rate is well under 2% now (adding in more headers
should improve that a lot), and the fp rate under 0.05% (but I doubt that
adding in more headers will improve this)).  So long as we don't know the
rates for the scheme you're using now, there's no objective basis for
comparison.

...

>> Maybe you don't want this kind of approach at all.  The classifier doesn't
>> have "gray areas" in practice:  it tends to give probabilities near 1, or
>> near 0, and there's very little in between -- a msg either has a
>> preponderance of spam indicators, or a preponderance of non-spam
>> indicators.

> That's a great improvement over SpamAssassin then: with SA, the grey
> area (IMHO) is scores from 3 to 10... which is why several python.org
> lists now have a little bit of Mailman configuration magic that makes MM
> set aside messages with an SA score >= 3 for list admin review.  (It's
> probably worth getting the list admin to do a bit more work in order to
> avoid sending low-scoring spam to the list.)
>
> However, as long as "very little" != "nothing", we still need to worry a
> bit about that grey area.  What do you think we should do with a message
> whose spam probability is between (say) 0.1 and 0.9?  Send it on, reject
> it, or set it aside?

Under Graham's scheme, send it on.  It doesn't have grey areas in a useful
sense, because the scoring step only looks at a handful of extremes:
extremes in, extremes out, and when it's wrong it's *spectacularly* wrong
(e.g., the very rare (< 0.05%) false positives generally have "probabilities"
exceeding 0.99, and a false negative often has a "probability" less than
0.01).

> Just how many messages fall in that grey area anyways?

I can't get at my testing setup now and don't know the answer offhand.  I'll
try to make time tonight to determine the answer.  I guess the interesting
stats are what percent of hams have probs in (0.1, 0.9), and what percent of
spams.  In general, it's only very brief messages that don't score near 0.0
or 1.0, so this *may* turn out to be the same thing as asking what
percentages of hams and spams are very brief.

Note too that adding the headers in *should* catch a lot more spam under
this scheme.  But, even as is, and even if I strip all the HTML tags out of
spam, fewer than 1 spam in 50 scores less than 0.9.  The ones that are
passed on now include all spams with empty bodies (a message with an empty
body scores 0.5).
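
A small sketch of why the scores cluster at the extremes, and why an empty
body lands on exactly 0.5.  It follows the combining rule from Graham's
essay as I understand it; the function name and the max_tokens default are
assumptions for this illustration, not the GBayes.py code:

```python
from math import prod


def graham_score(token_probs, max_tokens=15):
    """Combine per-token spam probabilities, keeping only the extremes."""
    # Keep the tokens whose probabilities lie furthest from the neutral 0.5.
    extremes = sorted(token_probs, key=lambda p: abs(p - 0.5), reverse=True)
    extremes = extremes[:max_tokens]
    spam = prod(extremes)                     # evidence for spam
    ham = prod(1.0 - p for p in extremes)     # evidence for ham
    # Assumes probabilities are clamped away from exactly 0.0 and 1.0,
    # otherwise both products can be zero.
    return spam / (spam + ham)


empty_body = graham_score([])            # no tokens: both products are 1
mostly_spammy = graham_score([0.99] * 5)
mostly_hammy = graham_score([0.01] * 5)
```

With no tokens both products are 1, giving 1 / (1 + 1) = 0.5.  A few strong
indicators on one side multiply into a score very close to 0 or 1, which is
why there is so little in between.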



From tismer@tismer.com  Tue Sep  3 19:26:01 2002
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 03 Sep 2002 20:26:01 +0200
Subject: [Python-Dev] Get rid of etype struct
Message-ID: <3D74FEB9.5060406@tismer.com>

This is a multi-part message in MIME format.
--------------080306050703000101060801
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Hi Guido,
I think I have a solution for this one, see the attached diff.
I did what you suggested: Make the addressing of the
members dependent on the metatype.
The etype struct has lost its members[1] field, to make it
easier to extend the structure. Instead, the allocator always
adds one to the size, to have the sentinel in place.

I did not yet publish the etype structure, since I didn't find
a good name and place for it.
Testing was also not very thorough. I just checked that types
work from Python and that I can add __slots__ to them.
Will re-port this stuff to my Py2.2 Stackless base and try it
out as base type for my own C types.

It took me the whole day to understand how it must work, and
then just an hour to get it to work. This is quite some stuff :-)
Can somebody please have a look, if there are subtle errors?

ciao - chris

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=591586&group_id=5470


--------------080306050703000101060801
Content-Type: text/plain; charset=us-ascii;
 name="typeobject.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="typeobject.diff"

cvs -z9 diff -u dist/src/Objects/typeobject.c 
Index: dist/src/Objects/typeobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/typeobject.c,v
retrieving revision 2.179
diff -u -r2.179 typeobject.c

--- dist/src/Objects/typeobject.c	16 Aug 2002 17:01:08 -0000	2.179
+++ dist/src/Objects/typeobject.c	3 Sep 2002 18:04:39 -0000
@@ -20,9 +20,12 @@
 					  see add_operators() below. */
 	PyBufferProcs as_buffer;
 	PyObject *name, *slots;
-	PyMemberDef members[1];
+	/* here are optional user slots, followed by the members. */
 } etype;
 
+#define GET_MEMBERS(etype) \
+	((PyMemberDef *)(((char *)etype) + (etype)->type.ob_type->tp_basicsize)-1)
+
 static PyMemberDef type_members[] = {
 	{"__basicsize__", T_INT, offsetof(PyTypeObject,tp_basicsize),READONLY},
 	{"__itemsize__", T_INT, offsetof(PyTypeObject, tp_itemsize), READONLY},
@@ -213,7 +216,8 @@
 PyType_GenericAlloc(PyTypeObject *type, int nitems)
 {
 	PyObject *obj;
-	const size_t size = _PyObject_VAR_SIZE(type, nitems);
+	const size_t size = _PyObject_VAR_SIZE(type, nitems+1);
+	/* note that we need to add one, for the sentinel */
 
 	if (PyType_IS_GC(type))
 		obj = _PyObject_GC_Malloc(size);
@@ -253,7 +257,7 @@
 	PyMemberDef *mp;
 
 	n = type->ob_size;
-	mp = ((etype *)type)->members;
+	mp = GET_MEMBERS((etype *)type);
 	for (i = 0; i < n; i++, mp++) {
 		if (mp->type == T_OBJECT_EX) {
 			char *addr = (char *)self + mp->offset;
@@ -318,7 +322,7 @@
 	PyMemberDef *mp;
 
 	n = type->ob_size;
-	mp = ((etype *)type)->members;
+	mp = GET_MEMBERS((etype *)type);
 	for (i = 0; i < n; i++, mp++) {
 		if (mp->type == T_OBJECT_EX && !(mp->flags & READONLY)) {
 			char *addr = (char *)self + mp->offset;
@@ -1125,7 +1129,8 @@
 
 		/* Are slots allowed? */
 		nslots = PyTuple_GET_SIZE(slots);
-		if (nslots > 0 && base->tp_itemsize != 0) {
+		if (nslots > 0 && base->tp_itemsize != 0 && !PyType_Check(base)) {
+			/* for the special case of meta types, allow slots */
 			PyErr_Format(PyExc_TypeError,
 				     "nonempty __slots__ "
 				     "not supported for subtype of '%s'",
@@ -1334,7 +1339,7 @@
 	}
 
 	/* Add descriptors for custom slots from __slots__, or for __dict__ */
-	mp = et->members;
+	mp = GET_MEMBERS(et);
 	slotoffset = base->tp_basicsize;
 	if (slots != NULL) {
 		for (i = 0; i < nslots; i++, mp++) {
@@ -1366,7 +1371,7 @@
 	}
 	type->tp_basicsize = slotoffset;
 	type->tp_itemsize = base->tp_itemsize;
-	type->tp_members = et->members;
+	type->tp_members = GET_MEMBERS(et);
 	type->tp_getset = subtype_getsets;
 
 	/* Special case some slots */


*****CVS exited normally with code 1*****



--------------080306050703000101060801--



From nas@python.ca  Tue Sep  3 19:34:47 2002
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 3 Sep 2002 11:34:47 -0700
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHKECEDKAA.tim.one@comcast.net>
References: <20020903134112.GC1227@cthulhu.gerg.ca> <BIEJKCLHCIOIHAGOKOLHKECEDKAA.tim.one@comcast.net>
Message-ID: <20020903183447.GA13310@glacier.arctrix.com>

Tim Peters wrote:
> Under Graham's scheme, send it on.  It doesn't have grey areas in a useful
> sense, because the scoring step only looks at a handful of extremes:
> extremes in, extremes out, and when it's wrong it's *spectacularly* wrong
> (e.g., the very rare (< 0.05%) false positives generally have "probabilities"
> exceeding 0.99, and a false negative often has a "probability" less than
> 0.01).

I noticed that as well.  When the classifier goes wrong it goes badly
wrong and using different thresholds would not help.  It seems that
increasing the number of discriminators doesn't really help either.  Too
bad because otherwise you could flag those messages for human
classification.

On the bright side, based on the number of mis-classified messages in my
corpus, it looks like a human would have a very hard time doing a better
job.  Perhaps all that is needed is a bypass mechanism for that small
fraction of non-spammers.  That way if their initial message is rejected
they would still have some way of getting through.

Erik Naggum made an interesting comment.  He said that spam should be
handled at the transport level.  Greg's work on doing filtering at SMTP
time accomplishes this and makes a lot of sense.  When a message is
rejected, the sending mail server is the one that has to deal with it.
In the case of spam, the sending server is often an open relay.  Letting
it handle the bounces is sweet justice. :-)

I bring this up because "SMTP time filtering" makes a bypass mechanism
work much better.  With a system like TMDA, confirmation notices usually
generate double-bounces.  Instead, we could reject the message with a
5xx error that includes instructions on how to bypass the filter (e.g.
include a cookie in the body of the message).
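
A minimal sketch of the reject-with-instructions idea, written in
present-day Python.  The names smtp_verdict and BYPASS_COOKIE and the 0.9
cutoff are hypothetical, chosen only for illustration; the one thing taken
from the text above is the use of a 5xx (permanent failure) reply.

```python
# Hypothetical sketch: decide the SMTP reply for one scored message.
BYPASS_COOKIE = "let-me-through-42"   # assumed shared secret, illustration only
SPAM_CUTOFF = 0.9                     # assumed threshold, not taken from GBayes.py


def smtp_verdict(spam_prob, body):
    """Return an (SMTP code, reply text) pair for a scored message."""
    if BYPASS_COOKIE in body:
        # The sender followed the instructions from an earlier rejection.
        return 250, "OK (bypass cookie accepted)"
    if spam_prob > SPAM_CUTOFF:
        # 5xx is a permanent failure, so the sending server owns the bounce.
        return 550, ("Message looks like spam; to bypass the filter, resend "
                     "with the line %r in the body" % BYPASS_COOKIE)
    return 250, "OK"
```

Because the rejection happens during the SMTP conversation, no separate
confirmation notice is generated, so there is nothing to double-bounce.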

  Neil


From python@discworld.dyndns.org  Tue Sep  3 19:39:14 2002
From: python@discworld.dyndns.org (Charles Cazabon)
Date: Tue, 3 Sep 2002 12:39:14 -0600
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHKECEDKAA.tim.one@comcast.net>; from tim.one@comcast.net on Tue, Sep 03, 2002 at 01:53:36PM -0400
References: <20020903134112.GC1227@cthulhu.gerg.ca> <BIEJKCLHCIOIHAGOKOLHKECEDKAA.tim.one@comcast.net>
Message-ID: <20020903123914.B30532@twoflower.internal.do>

Tim Peters <tim.one@comcast.net> wrote:
> 
> Under Graham's scheme, send it on.  It doesn't have grey areas in a useful
> sense, because the scoring step only looks at a handful of extremes:
> extremes in, extremes out, and when it's wrong it's *spectacularly* wrong
> (e.g., the very rare (< 0.05%) false positives generally have "probabilities"
> exceeding 0.99, and a false negative often has a "probability" less than
> 0.01).

I would love to see how the results would be affected by applying the scoring
scheme to the entire content of the message, instead of just the 15 (or 16 in
your case) most extreme samples.  By the way, you never said why you increased
that number by one; did it make that much difference?

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python@discworld.dyndns.org>
GPL'ed software available at:     http://www.qcc.ca/~charlesc/software/
-----------------------------------------------------------------------


From guido@python.org  Tue Sep  3 18:50:31 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 03 Sep 2002 13:50:31 -0400
Subject: [Python-Dev] Proposed Mixins for Wide Interfaces
In-Reply-To: Your message of "Sat, 31 Aug 2002 12:44:06 EDT."
 <001101c2510d$9fce0920$5f66accf@othello>
References: <001101c2510d$9fce0920$5f66accf@othello>
Message-ID: <200209031750.g83HoVq05812@odiug.zope.com>

> How about adding some mixins to simplify the
> implementation of some of the fatter interfaces?

Can you suggest implementations for these, to be absolutely clear what
you mean?

> class CompareMixin:
>     """
>     Given an __eq__ method in a subclass, adds a __ne__ method
>     Given __eq__ and __lt__, adds !=, <=, >, >=.
>     """

What if the "natural" thing to implement is __le__ instead of __lt__?
That's the case for sets.  Or __gt__ (less likely)?
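
To make the question concrete, here is one possible reading of the proposal,
written as a sketch in present-day Python.  It is an assumption about what
such a mixin might look like, not the proposer's implementation; the Version
class is a made-up client.

```python
class CompareMixin:
    """Hypothetical sketch: derive !=, <=, >, >= from __eq__ and __lt__."""

    def __ne__(self, other):
        return not self == other

    def __le__(self, other):
        return self < other or self == other

    def __gt__(self, other):
        return not (self < other or self == other)

    def __ge__(self, other):
        return not self < other


class Version(CompareMixin):
    """Example client that supplies only __eq__ and __lt__."""

    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return self.n == other.n

    def __lt__(self, other):
        return self.n < other.n
```

Note that the > and >= derivations assume a total order.  For a partial
order such as the subset relation on sets, "not a < b" does not imply
"a >= b", which is one reason the choice of primitive matters.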

> class MappingMixin:
>     """
>     Given __setitem__, __getitem__,  and keys,
>     implements values, items, update, get, setdefault, len,
>     iterkeys, iteritems, itervalues, has_key, and __contains__.
> 
>     If __delitem__ is also supplied, implements clear, pop,
>     and popitem.
> 
>     Takes advantage of __iter__ if supplied (recommended).

Does that mean that if you have __iter__, you don't use keys()?  In
that case it should implement keys() out of __iter__.  Maybe this
should be required.

>     Takes advantage of __contains__ or has_key if supplied
>     (recommended).
>     """

Let's standardize on __contains__, not has_key().  I guess you could
provide __contains__ as follows:

  def __contains__(self, key):
      try:
          self[key]
      except KeyError:
          return 0
      else:
          return 1

I don't mind if there are some recursions amongst the various
implementations; if you don't supply the minimum, the implementation
will raise "RuntimeError: maximum recursion depth exceeded".
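
A small sketch of how this could fit together, in present-day Python.  The
class below is hypothetical and covers only a few of the listed methods; it
assumes the subclass supplies __getitem__ and __iter__, and builds keys()
out of __iter__ as suggested above.

```python
class MappingMixin:
    """Hypothetical sketch: a subclass supplies __getitem__ and __iter__."""

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True

    def keys(self):
        # keys() is built from __iter__.
        return list(self)

    def values(self):
        return [self[key] for key in self]

    def items(self):
        return [(key, self[key]) for key in self]

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default


class Squares(MappingMixin):
    """Example client: maps 0..n-1 to their squares."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, key):
        if key not in range(self.n):
            raise KeyError(key)
        return key * key

    def __iter__(self):
        return iter(range(self.n))
```

With these definitions, Squares(4).items() is built entirely from the two
methods the subclass provides.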

> The idea is to make it easier to implement these interfaces.
> Also, if the interfaces get expanded, the clients automatically
> updated.  

A similar thing for sequences would be useful too, right?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Tue Sep  3 20:08:57 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 03 Sep 2002 15:08:57 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <20020903123914.B30532@twoflower.internal.do>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAECMDKAA.tim.one@comcast.net>

[Charles Cazabon]
> I would love to see how the results would be affected by applying
> the scoring scheme to the entire content of the message, instead of
> just the 15 (or 16 in your case) most extreme samples.

Then it would be close to a classic Bayesian classifier, and like any such
would need entirely different scoring code to avoid catastrophic
floating-point errors (right now an intermediate result can't become smaller
than 0.01**16 = 1e-32, so fp troubles are impossible; raise the exponent to
a measly 200 and you're already out of the range of IEEE double precision;
classic classifiers work in logarithm space instead for this reason).  You
can read lots of papers on how those do; all evidence suggests they do worse
than this scheme on the spam versus non-spam task.
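
The floating-point point can be checked directly.  The following is a
sketch in present-day Python, assuming IEEE double-precision floats; the
variable names are made up for illustration.

```python
import math

probs = [0.01] * 200          # 200 clues, each as extreme as the scheme allows

# Naive product: 0.01**200 is 1e-400, below the smallest positive double,
# so the running product underflows to zero.
naive = 1.0
for p in probs:
    naive *= p

# Log space: add logarithms instead of multiplying probabilities.
log_product = sum(math.log(p) for p in probs)
```

Here naive ends up as exactly 0.0, while log_product stays an ordinary
finite number (200 times the log of 0.01), which is why scoring over the
whole message needs different arithmetic.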

> By the way, you never said why you increased that number by one;

It's explained in the comment block preceding the MAX_DISCRIMINATORS
definition.

BTW, in an unreported experiment I boosted MAX_DISCRIMINATORS to 36.  I
don't recall what happened now, but it was a disaster for at least one of
the error rates.

> did it make that much difference?

Not on average.  It helped eliminate a narrow class of false positives,
where previously the first 15 extremes the classifier saw had 8 probs of .99
and 7 of .01.  That works out to "spam".  Making the # of classifiers even
instead allowed for graceful ties, which favor ham in this scheme.  All
previous decisions "should be" revisited after each new change, though, and
in this particular case it could well be that stripping HTML tags out of
plain-text messages also addressed the same narrow issue but in a more
effective way (without some special gimmick, virtually every message
including so much as an example of HTML got scored as spam).
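
The 8-versus-7 arithmetic can be worked through with a short sketch.  It
assumes the combining rule from Graham's scheme is P / (P + Q), where P is
the product of the clue probabilities and Q is the product of their
complements; graham_combine is a name made up here, not a function from
GBayes.py.

```python
def graham_combine(probs):
    """Combine clue probabilities as P / (P + Q), P = prod(p), Q = prod(1 - p)."""
    p_spam = 1.0
    p_ham = 1.0
    for p in probs:
        p_spam *= p
        p_ham *= 1.0 - p
    return p_spam / (p_spam + p_ham)


odd = graham_combine([0.99] * 8 + [0.01] * 7)    # 15 clues: one extra spam vote
even = graham_combine([0.99] * 8 + [0.01] * 8)   # 16 clues: an exact tie
```

With 15 clues the seven opposing pairs cancel and the one leftover .99
decides the result, so odd comes out at about 0.99.  With 16 clues
everything cancels and even comes out at about 0.5, which stays below any
spam cutoff set above one half.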



From guido@python.org  Tue Sep  3 20:41:10 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 03 Sep 2002 15:41:10 -0400
Subject: [Python-Dev] Should KeyError use repr() on its argument?
Message-ID: <200209031941.g83JfAK07542@odiug.zope.com>

(SF bug 598451.)

The KeyError exception doesn't apply repr() to its argument.  That's
annoying in cases like this:

  >>> a = {}
  >>> a['']
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  KeyError
  >>> 

Should this be fixed?  How?  (I guess we could add a KeyError__str__
method to exceptions.c that applies repr().)

I've got a feeling this is a feature, but not a very useful one.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Sep  3 20:54:48 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 03 Sep 2002 15:54:48 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: Your message of "Tue, 03 Sep 2002 11:34:47 PDT."
 <20020903183447.GA13310@glacier.arctrix.com>
References: <20020903134112.GC1227@cthulhu.gerg.ca> <BIEJKCLHCIOIHAGOKOLHKECEDKAA.tim.one@comcast.net>
 <20020903183447.GA13310@glacier.arctrix.com>
Message-ID: <200209031954.g83Jsmw07797@odiug.zope.com>

> Erik Naggum made an interesting comment.  He said that spam should be
> handled at the transport level.  Greg's work on doing filtering at SMTP
> time accomplishes this and makes a lot of sense.  When a message is
> rejected, the sending mail server is the one that has to deal with it.
> In the case of spam, the sending server is often an open relay.  Letting
> it handle the bounces is sweet justice. :-)

In the case of a false positive, it has the added advantage that at
least the poor sender, falsely accused of sending spam, gets a bounce
and may try again.

> I bring this up because "SMTP time filtering" makes a bypass mechanism
> work much better.  With a system like TMDA, confirmation notices usually
> generate double-bounces.  Instead, we could reject the message with a
> 5xx error that includes instructions on how to bypass the filter (e.g.
> include a cookie in the body of the message).

Do you still believe that TMDA is the only answer to spam?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Sep  3 20:57:00 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 03 Sep 2002 15:57:00 -0400
Subject: [Python-Dev] Should KeyError use repr() on its argument?
In-Reply-To: Your message of "Tue, 03 Sep 2002 15:41:10 EDT."
 <200209031941.g83JfAK07542@odiug.zope.com>
References: <200209031941.g83JfAK07542@odiug.zope.com>
Message-ID: <200209031957.g83Jv0k07810@odiug.zope.com>

> The KeyError exception doesn't apply repr() to its argument.  That's
> annoying in cases like this:
> 
>   >>> a = {}
>   >>> a['']
>   Traceback (most recent call last):
>     File "<stdin>", line 1, in ?
>   KeyError
>   >>> 
> 
> Should this be fixed?  How?  (I guess we could add a KeyError__str__
> method to exceptions.c that applies repr().)
> 
> I've got a feeling this is a feature, but not a very useful one.

I take it back.  args[0] being the actual key that failed is a
feature.  str() not using repr() on args[0] is a bug.  I'll fix it.
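
For comparison, a small check of how the two pieces behave in present-day
Python 3.  This is an observation about a current interpreter, offered as
an assumption about what the reader runs, not a description of the
2.2-era code discussed here.

```python
try:
    {}[""]
except KeyError as exc:
    key = exc.args[0]      # the actual key that failed
    message = str(exc)     # in current Python this is repr(key)
```

The empty-string key is still available unchanged as args[0], while the
printed message shows it in quotes instead of showing nothing.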

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pinard@iro.umontreal.ca  Tue Sep  3 20:54:27 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Tue, 03 Sep 2002 15:54:27 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <200209031653.g83GrjQ01929@odiug.zope.com> (Guido van Rossum's
 message of "Tue, 03 Sep 2002 12:53:45 -0400")
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <200209031653.g83GrjQ01929@odiug.zope.com>
Message-ID: <oqlm6i6e8c.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

>>     80 |                     [|]
>>        |                     [|]
>>        |                     [|]
>>        |                     [|]
>>        | [|]                 [|]
>>     60 | [|]             [|] [|]
>>        | [|]             [|] [|]
>>        | [|]             [|] [|]
>>        | [|]             [|] [|]
>>        | [|]             [|] [|]                 [|]
>>     40 | [|]         [|] [|] [|]                 [|]
>>        | [|]         [|] [|] [|]         [|]     [|]         [|]
>>        | [|]         [|] [|] [|]         [|]     [|]         [|] [|]
>>        | [|]         [|] [|] [|] [|]     [|]     [|]     [|] [|] [|]
>>        | [|]         [|] [|] [|] [|]     [|]     [|] [|] [|] [|] [|]
>>     20 | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
>>        | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
>>        | [|] [|]     [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
>>        | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
>>        | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
>>      0 +-071-025-012-042-063-084-030-021-039-009-047-027-033-041-036-005
>>         Fri 16| Sun 18| Tue 20| Thu 22| Sat 24| Mon 26| Wed 28| Fri 30|
>>             Sat 17  Mon 19  Wed 21  Fri 23  Sun 25  Tue 27  Thu 29  Sat 31
>
> [...] It's also kind of hard to read. [...]

True.  But not so difficult to improve.  Adding a bit of simplicity yields:

       |                     84
    80 |                     [] 
       |                     [] 
       |                     [] 
       | 71                  [] 
       | []              63  [] 
    60 | []              []  [] 
       | []              []  [] 
       | []              []  [] 
       | []              []  []                  47
       | []          42  []  []                  [] 
    40 | []          []  []  []          39      []          41
       | []          []  []  []          []      []          []  36
       | []          []  []  []  30      []      []      33  []  [] 
       | []          []  []  []  []      []      []  27  []  []  [] 
       | []  25      []  []  []  []  21  []      []  []  []  []  [] 
    20 | []  []      []  []  []  []  []  []      []  []  []  []  [] 
       | []  []      []  []  []  []  []  []      []  []  []  []  [] 
       | []  []  12  []  []  []  []  []  []   9  []  []  []  []  [] 
       | []  []  []  []  []  []  []  []  []  []  []  []  []  []  []   5
       | []  []  []  []  []  []  []  []  []  []  []  []  []  []  []  []
     0 +----------------------------------------------------------------
        Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat
         16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From pinard@iro.umontreal.ca  Tue Sep  3 20:57:50 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Tue, 03 Sep 2002 15:57:50 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <200209031653.g83GrjQ01929@odiug.zope.com> (Guido van Rossum's
 message of "Tue, 03 Sep 2002 12:53:45 -0400")
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <200209031653.g83GrjQ01929@odiug.zope.com>
Message-ID: <oqheh66e2p.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

>> =================
>> A `cogen' module
>> =================
>> Francois Pinard asked about Cartesian products using the new sets module.
>> Guido didn't think people would in general need it.  Francois quickly
>> started this thread of discussing a cogen module to generate Cartesian
>> products and other ways of operating on sets.
>
> Tim Peters quickly posted *his* elaborate state-of-the-art code, which
> ended the discussion (as usual, posting code is a good way to stop
> discussion :-).

I'll be back!  (Not that I especially look like Arnold Schwarzenegger!)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From guido@python.org  Tue Sep  3 21:18:03 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 03 Sep 2002 16:18:03 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: Your message of "Tue, 03 Sep 2002 15:54:27 EDT."
 <oqlm6i6e8c.fsf@titan.progiciels-bpi.ca>
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU> <200209031653.g83GrjQ01929@odiug.zope.com>
 <oqlm6i6e8c.fsf@titan.progiciels-bpi.ca>
Message-ID: <200209032018.g83KI3q08343@odiug.zope.com>

> > [...] It's also kind of hard to read. [...]
> 
> True.  But not so difficult to improve.  Adding a bit of simplicity yields:
> 
>        |                     84
>     80 |                     [] 
>        |                     [] 
>        |                     [] 
>        | 71                  [] 
>        | []              63  [] 
>     60 | []              []  [] 
>        | []              []  [] 
>        | []              []  [] 
>        | []              []  []                  47
>        | []          42  []  []                  [] 
>     40 | []          []  []  []          39      []          41
>        | []          []  []  []          []      []          []  36
>        | []          []  []  []  30      []      []      33  []  [] 
>        | []          []  []  []  []      []      []  27  []  []  [] 
>        | []  25      []  []  []  []  21  []      []  []  []  []  [] 
>     20 | []  []      []  []  []  []  []  []      []  []  []  []  [] 
>        | []  []      []  []  []  []  []  []      []  []  []  []  [] 
>        | []  []  12  []  []  []  []  []  []   9  []  []  []  []  [] 
>        | []  []  []  []  []  []  []  []  []  []  []  []  []  []  []   5
>        | []  []  []  []  []  []  []  []  []  []  []  []  []  []  []  []
>      0 +----------------------------------------------------------------
>         Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat
>          16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31

Ooh, much better.  Still, put this at the end instead of at the top of
the message.  It's not *that* interesting.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Tue Sep  3 21:32:55 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 03 Sep 2002 16:32:55 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <20020903183447.GA13310@glacier.arctrix.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEDFDKAA.tim.one@comcast.net>

[Neil Schemenauer]
> I noticed that as well.  When the classifier goes wrong it goes badly
> wrong and using different thresholds would not help.  It seems that
> increasing the number of discriminators doesn't really help either.  Too
> bad because otherwise you could flag those messages for human
> classification.

I think it's worse than just that:  suppose any scheme says "OK, this is
spam, with probability 0.9995".  If it's reporting accurate probabilities,
then another way to read that claim is "On average, one time in 2000 this
message actually isn't spam".  In real life we have to accept that there's
no scheme with a 0% false positive rate-- not even human review --short of
the scheme that never calls anything spam.  Since deciding on the largest
acceptable false positive rate is far more a social than a technical issue,
a group of nerds will do anything rather than face it <wink>.
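
The "one time in 2000" reading is plain arithmetic, sketched here in
present-day Python.  The function name is made up, and the calculation
assumes the reported probabilities are accurately calibrated.

```python
def expected_false_positives(reported_prob, flagged):
    """Expected number of non-spam messages among `flagged` messages that
    all received the same, accurately calibrated spam probability."""
    return (1.0 - reported_prob) * flagged


# 2000 messages, each called spam with probability 0.9995.
mistakes = expected_false_positives(0.9995, 2000)
```

Up to floating-point rounding, mistakes is 1: a perfectly honest 0.9995
still means about one innocent message per 2000 flagged.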



From David Abrahams" <dave@boost-consulting.com  Tue Sep  3 22:06:53 2002
From: David Abrahams" <dave@boost-consulting.com (David Abrahams)
Date: Tue, 3 Sep 2002 17:06:53 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU> <200209031653.g83GrjQ01929@odiug.zope.com>              <oqlm6i6e8c.fsf@titan.progiciels-bpi.ca>  <200209032018.g83KI3q08343@odiug.zope.com>
Message-ID: <17d001c2538d$f82650f0$1c86db41@boostconsulting.com>

Turn it sideways and it'll get smaller...

From: "Guido van Rossum" <guido@python.org>


> > > [...] It's also kind of hard to read. [...]
> >
> > True.  But not so difficult to improve.  Adding a bit of simplicity yields:
> >
> >        |                     84
> >     80 |                     []
> >        |                     []
> >        |                     []
> >        | 71                  []
> >        | []              63  []
> >     60 | []              []  []
> >        | []              []  []
> >        | []              []  []
> >        | []              []  []                  47
> >        | []          42  []  []                  []
> >     40 | []          []  []  []          39      []          41
> >        | []          []  []  []          []      []          []  36
> >        | []          []  []  []  30      []      []      33  []  []
> >        | []          []  []  []  []      []      []  27  []  []  []
> >        | []  25      []  []  []  []  21  []      []  []  []  []  []
> >     20 | []  []      []  []  []  []  []  []      []  []  []  []  []
> >        | []  []      []  []  []  []  []  []      []  []  []  []  []
> >        | []  []  12  []  []  []  []  []  []   9  []  []  []  []  []
> >        | []  []  []  []  []  []  []  []  []  []  []  []  []  []  []   5
> >        | []  []  []  []  []  []  []  []  []  []  []  []  []  []  []  []
> >      0 +----------------------------------------------------------------
> >         Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat
> >          16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31
>
> Ooh, much better.  Still, put this at the end instead of at the top of
> the message.  It's not *that* interesting.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev



From skip@pobox.com  Tue Sep  3 22:39:01 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 3 Sep 2002 16:39:01 -0500
Subject: [Python-Dev] Two random and nearly unrelated ideas
Message-ID: <15733.11253.743055.864572@12-248-11-90.client.attbi.com>

While adding a blurb to Misc/NEWS about the change to the thread ticker and
check interval, it occurred to me that perhaps Misc/NEWS would benefit from
conversion to ReST format.  You could pump an HTML version out to the
website periodically.

Second (also considered during the above edit), it would be nice to get rid
of the ticker altogether in systems with proper signal support.  On those
platforms couldn't an alarm replace polling for the ticker?  I know signals
are tricky devils, but it still seems it would be a win if you could use it.
You'd have to install a SIGALRM handler which would trip periodically.  It
would also have to keep track of any alarm handler the programmer installed.

Just for the heck of it I recompiled ceval.c with the (--_Py_Ticker < 0)
block ifdef'd out.  Got a 1.7% increase in pystones over the now default
checkinterval == 100 situation.

Skip



From nas@python.ca  Tue Sep  3 22:52:51 2002
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 3 Sep 2002 14:52:51 -0700
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHCEDFDKAA.tim.one@comcast.net>
References: <20020903183447.GA13310@glacier.arctrix.com> <BIEJKCLHCIOIHAGOKOLHCEDFDKAA.tim.one@comcast.net>
Message-ID: <20020903215251.GA14101@glacier.arctrix.com>

Tim Peters wrote:
> Since deciding on the largest acceptable false positive rate is far
> more a social than a technical issue, a group of nerds will do
> anything rather than face it <wink>.

I think we pretty much ran out of things to do. :-)  Still, I think the
acceptable rate depends heavily on what happens to the rejects.  If they
go to /dev/null then it would have to be very low.  If there are bounces
and a way for the innocent victims to bypass the filter then I consider
0.5% good enough for most situations.  The major remaining problem would
be handling legitimate automated email.  For mailing lists that probably
isn't an issue.

I'm probably not the guy to listen to about acceptable rates, though.  I
currently use TMDA and therefore am a heartless bastard. :-)

  Neil


From jeremy@alum.mit.edu  Tue Sep  3 22:53:46 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Tue, 3 Sep 2002 17:53:46 -0400
Subject: [Python-Dev] mysterious hangs in socket code
Message-ID: <15733.12138.568668.562013@slothrop.zope.com>

I've been running a small, multi-threaded program to retrieve web
pages today.  The entire program appears to hang when I perform a slow
DNS operation, even though there is no application-level coordination between
the threads.

The motivation comes from http://www.python.org/sf/591349, but I ended
up writing a similar small test script, which I've attached.

When I run this program with Python 2.1, it produces a steady stream
of output -- urls and the time it took to load them.  Most of the
pages take less than a second, but some take a very long time.

If I run this program with Python 2.2 or 2.3, it produces little
bursts of output, then pauses for a long time, then repeats.

I believe that the problem relates to DNS lookups, but not in a way I
fully understand.  If I connect gdb to any of the threads while the
program is hung, it is always inside getaddrinfo().  My first
realization was that the socketmodule stopped wrapping DNS lookups in
Py_BEGIN/END_ALLOW_THREADS calls when the IPv6 changes were
integrated.  But if I restore these calls --
    see http://www.python.org/sf/604210 --
I don't see any change in behavior.  The program still hangs
periodically.

One possibility is that the Linux getaddrinfo() is thread-safe, but
only by way of a lock that only allows one request to be outstanding
at a time.
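
To illustrate what that possibility would look like from the outside, here
is a sketch in present-day Python.  It does not reproduce the hang and does
not touch getaddrinfo(); time.sleep() stands in for the network wait, and
resolver_lock is a made-up stand-in for a lock inside the C library.

```python
import threading
import time

resolver_lock = threading.Lock()   # stands in for a lock inside the C library


def slow_lookup(serialized, delay=0.05):
    """Pretend DNS lookup; time.sleep() stands in for the network wait."""
    if serialized:
        with resolver_lock:
            time.sleep(delay)
    else:
        time.sleep(delay)


def run_lookups(serialized, nthreads=4):
    """Run nthreads lookups concurrently and return the elapsed wall time."""
    threads = [threading.Thread(target=slow_lookup, args=(serialized,))
               for _ in range(nthreads)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start
```

With the lock, the elapsed time grows with the number of threads, because
only one lookup is outstanding at a time; without it, the waits overlap.
One slow lookup holding such a lock would stall every other thread that
needs a lookup, which would match the bursts and pauses described above.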

Not sure what the other possibilities are, but the current behavior is
awful.

Jeremy

---------------------------------------------------------------------
import httplib
import Queue
import random
import sys
import threading
import time
import traceback
import urlparse

headers = {"Accept":
           "text/plain, text/html, image/jpeg, image/jpg, "
           "image/gif, image/png, */*"}

class URLThread(threading.Thread):

    def __init__(self, queue):
        threading.Thread.__init__(self)
        self._queue = queue
        self._stopevent = threading.Event()

    def stop(self):
        self._stopevent.set()
    
    def run(self):
        while not self._stopevent.isSet():
            self.fetch()

    def fetch(self):
        url = self._queue.get()
        t0 = time.time()
        try:
            self._fetch(url)
        except:
            etype, value, tb = sys.exc_info()
            L = ["Error occurred fetching %s\n" % url,
                 "%s: %s\n" % (etype, value),
                 ]
            L += traceback.format_tb(tb)
            sys.stderr.write("".join(L))
        t1 = time.time()
        print url, round(t1 - t0, 2)

    def _fetch(self, url):
        parts = urlparse.urlparse(url)
        host = parts[1]
        path = parts[2]
        h = httplib.HTTPConnection(host)
        h.connect()
        h.request("GET", path, headers=headers)
        r = h.getresponse()
        r.read()
        h.close()

urls = """\
http://www.andersen.com/
http://www.google.com/
http://www.google.com/images/logo.gif
http://www.microsoft.com/
http://www.microsoft.com/homepage/gif/bnr-microsoft.gif
http://www.microsoft.com/homepage/gif/1ptrans.gif
http://www.microsoft.com/library/toolbar/images/curve.gif
http://www.yahoo.com/
http://www.sourceforge.net/
http://www.slashdot.org/
http://www.kuro5hin.org/
http://www.intel.com/
http://www.aol.com/
http://www.amazon.com/
http://www.cnn.com/
http://money.cnn.com/
http://www.expedia.com/
http://www.tripod.com/
http://www.hotmail.com/
http://www.angelfire.com/
http://www.excite.com/
http://www.verisign.com/
http://www.riaa.com/
http://www.enron.com/
http://www.securityspace.com/
http://www.directv.com/
http://www.att.com/
http://www.qwest.com/
http://www.covad.com/
http://www.sprint.com/
http://www.mci.com/
http://www.worldcom.com/
"""
urls = [u for u in urls.split("\n") if u]

REPEAT = 10
THREADS = 8

class RandomQueue:

    def __init__(self, L):
        self.list = L

    def get(self):
        return random.choice(self.list)
        
if __name__ == "__main__":
    urlq = RandomQueue(urls)

    sys.setcheckinterval(10)

    threads = []
    for i in range(THREADS):
        t = URLThread(urlq)
        t.start()
        threads.append(t)

    while 1:
        try:
            time.sleep(30)
        except:
            break

    print "Shutting down threads..."
    for t in threads:
        t.stop()
    for t in threads:
        t.join()




From drifty@bigfoot.com  Wed Sep  4 00:00:52 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Tue, 3 Sep 2002 16:00:52 -0700 (PDT)
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <200209032018.g83KI3q08343@odiug.zope.com>
Message-ID: <Pine.SOL.4.44.0209031558070.19982-100000@death.OCF.Berkeley.EDU>

[Guido van Rossum]

> > > [...] It's also kind of hard to read. [...]
> >
> > True.  But not so difficult to improve.  Adding a bit of simplicity yields:
> >
> >        |                     84
> >     80 |                     []
> >        |                     []
> >        |                     []
> >        | 71                  []
> >        | []              63  []
> >     60 | []              []  []
> >        | []              []  []
> >        | []              []  []
> >        | []              []  []                  47
> >        | []          42  []  []                  []
> >     40 | []          []  []  []          39      []          41
> >        | []          []  []  []          []      []          []  36
> >        | []          []  []  []  30      []      []      33  []  []
> >        | []          []  []  []  []      []      []  27  []  []  []
> >        | []  25      []  []  []  []  21  []      []  []  []  []  []
> >     20 | []  []      []  []  []  []  []  []      []  []  []  []  []
> >        | []  []      []  []  []  []  []  []      []  []  []  []  []
> >        | []  []  12  []  []  []  []  []  []   9  []  []  []  []  []
> >        | []  []  []  []  []  []  []  []  []  []  []  []  []  []  []   5
> >        | []  []  []  []  []  []  []  []  []  []  []  []  []  []  []  []
> >      0 +----------------------------------------------------------------
> >         Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat
> >          16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31
>
> Ooh, much better.  Still, put this at the end instead of at the top of
> the message.  It's not *that* interesting.
>

How about I just get rid of it?  It is only in there because Michael had
it in his summaries.

Actually, the entire header (from the first line to first summary) is
there just because Michael had it there.  I personally am happy keeping
the header as it is sans this count; I know I had to read a lot of emails
but I don't think anyone else cares.  =)

-Brett



From jason-exp-1031786493.04d3ca@mastaler.com  Wed Sep  4 00:28:24 2002
From: jason-exp-1031786493.04d3ca@mastaler.com (jason-exp-1031786493.04d3ca@mastaler.com)
Date: Tue, 03 Sep 2002 17:28:24 -0600
Subject: [Python-Dev] Re: The first trustworthy <wink> GBayes results
References: <20020903134112.GC1227@cthulhu.gerg.ca> <BIEJKCLHCIOIHAGOKOLHKECEDKAA.tim.one@comcast.net>
 <20020903183447.GA13310@glacier.arctrix.com>
Message-ID: <hhbs7ed55z.fsf@hrothgar.la.mastaler.com>

Neil Schemenauer <nas@python.ca> writes:

> I bring this up because "SMTP time filtering" makes a bypass
> mechanism work much better.  With a system like TMDA, confirmation
> notices usually generate double-bounces.  Instead, we could reject
> the message with a 5xx error that includes instructions on how to
> bypass the filter (e.g.  include a cookie in the body of the
> message).

TMDA doesn't do this because it would make more work for the sender to
get his message delivered.  Because TMDA stores the incoming messages
in a local queue, the sender just has to reply to a confirmation
request, and his original message gets delivered.  As opposed to
having to cut and paste his message from the body of a bounce and then
resend it.  So, not operating at the transport level saves your
correspondents some work at the expense of some bandwidth.

-- 
(http://tmda.net/)




From aahz@pythoncraft.com  Wed Sep  4 00:49:01 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 3 Sep 2002 19:49:01 -0400
Subject: [Python-Dev] mysterious hangs in socket code
In-Reply-To: <15733.12138.568668.562013@slothrop.zope.com>
References: <15733.12138.568668.562013@slothrop.zope.com>
Message-ID: <20020903234901.GA29756@panix.com>

On Tue, Sep 03, 2002, Jeremy Hylton wrote:
>
> I've been running a small, multi-threaded program to retrieve web
> pages today.  The entire program appears to hang when I perform a slow
> DNS operation, even though there is no application-level coordination
> between the threads.

IIRC, gethostbyname() has frequently been non-reentrant; that might be
related.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From pinard@iro.umontreal.ca  Wed Sep  4 01:31:50 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Tue, 03 Sep 2002 20:31:50 -0400
Subject: [Python-Dev] Nit about `setdefault' documentation
Message-ID: <oqd6ruoart.fsf@titan.progiciels-bpi.ca>

Quite a small nit.  Reading:

---------------------------------------------------------------------->
>>> help({}.setdefault)
Help on built-in function setdefault:

setdefault(...)
    D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if not D.has_key(k)
----------------------------------------------------------------------<

I wonder if writing the last line as:

---------------------------------------------------------------------->
    D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D
----------------------------------------------------------------------<

would not better represent current Python fashion. :-)
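
To make the equivalence concrete, here is a minimal sketch in present-day
Python 3 (where has_key() no longer exists).  The helper name
setdefault_sketch is hypothetical, made up for illustration; it only spells
out what the proposed docstring says.

```python
def setdefault_sketch(D, k, d=None):
    """Hypothetical pure-Python spelling of D.setdefault(k, d)."""
    if k not in D:      # the membership test used in the proposed docstring
        D[k] = d
    return D[k]         # the value D.get(k, d) would give at this point


counts = {}
setdefault_sketch(counts, "spam", []).append(1)
setdefault_sketch(counts, "spam", []).append(2)

builtin = {}
builtin.setdefault("spam", []).append(1)
builtin.setdefault("spam", []).append(2)

print(counts == builtin)
```

The built-in method and the sketch leave both dictionaries in the same state.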

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From sholden@holdenweb.com  Wed Sep  4 01:31:52 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Tue, 3 Sep 2002 20:31:52 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU> <200209031653.g83GrjQ01929@odiug.zope.com>              <oqlm6i6e8c.fsf@titan.progiciels-bpi.ca>  <200209032018.g83KI3q08343@odiug.zope.com> <17d001c2538d$f82650f0$1c86db41@boostconsulting.com>
Message-ID: <008201c253aa$780144d0$6300000a@holdenweb.com>

[Guido]
> >
> > Ooh, much better.  Still, put this at the end instead of at the top of
> > the message.  It's not *that* interesting.
> >
[David]
> Turn it sideways and it'll get smaller...
>

... but no more interesting. Couldn't we just have a web page where this
statistic was available sliced and diced according to requirements? It looks
especially bad in my standard mailreader variable-pitch font.

The summary itself, however, looks excellent.

regards
-----------------------------------------------------------------------
Steve Holden                                  http://www.holdenweb.com/
Python Web Programming                        pydish.holdenweb.com/pwp/
Previous .sig file retired to                    www.homeforoldsigs.com
-----------------------------------------------------------------------





From tim.one@comcast.net  Wed Sep  4 02:06:43 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 03 Sep 2002 21:06:43 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <20020903134112.GC1227@cthulhu.gerg.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEOFBBAB.tim.one@comcast.net>

[Greg Ward]
> ...
> Just how many messages fall in that grey area anyways?

Heh.  Here's the probability distribution for the 4000 ham messages in my
first test pair:

Ham distribution for this pair:
* = 67 items
  0.00 4000 ************************************************************
  2.50    0
  5.00    0
  7.50    0
 10.00    0
 12.50    0
 15.00    0
 17.50    0
 20.00    0
 22.50    0
 25.00    0
 27.50    0
 30.00    0
 32.50    0
 35.00    0
 37.50    0
 40.00    0
 42.50    0
 45.00    0
 47.50    0
 50.00    0
 52.50    0
 55.00    0
 57.50    0
 60.00    0
 62.50    0
 65.00    0
 67.50    0
 70.00    0
 72.50    0
 75.00    0
 77.50    0
 80.00    0
 82.50    0
 85.00    0
 87.50    0
 90.00    0
 92.50    0
 95.00    0
 97.50    0

That is, they *all* got a "probability score" less than 2.5% (0.025).
Here's the spam probability distribution across the same run:

Spam distribution for this pair:
* = 46 items
  0.00    5 *
  2.50    2 *
  5.00    1 *
  7.50    0
 10.00    0
 12.50    0
 15.00    1 *
 17.50    0
 20.00    1 *
 22.50    0
 25.00    2 *
 27.50    1 *
 30.00    0
 32.50    1 *
 35.00    0
 37.50    0
 40.00    0
 42.50    0
 45.00    1 *
 47.50    1 *
 50.00    1 *
 52.50    0
 55.00    0
 57.50    1 *
 60.00    3 *
 62.50    0
 65.00    2 *
 67.50    0
 70.00    0
 72.50    0
 75.00    1 *
 77.50    1 *
 80.00    0
 82.50    0
 85.00    0
 87.50    0
 90.00    3 *
 92.50    1 *
 95.00    6 *
 97.50 2715 ************************************************************

IOW, a spam usually scored at least 0.975 on this run, but some spams scored
under 0.025.  There's very little "in the middle".
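
For anyone wanting to produce the same kind of picture from their own scores,
here is a rough sketch in current Python 3.  It is not the code that made the
listings above: the function name histogram, the 40-bucket and 60-star
defaults, and the rounding rules are assumptions chosen because they
reproduce the layout shown.

```python
import math


def histogram(scores, nbuckets=40, width=60):
    """Bucket scores in [0, 1] and return the listing as a list of lines."""
    counts = [0] * nbuckets
    for s in scores:
        # A score of exactly 1.0 goes into the last bucket.
        counts[min(int(s * nbuckets), nbuckets - 1)] += 1
    # Scale so the fullest bucket gets at most `width` stars.
    per_star = max(1, math.ceil(max(counts) / width))
    lines = ["* = %d items" % per_star]
    for i, n in enumerate(counts):
        stars = "*" * math.ceil(n / per_star)   # any non-empty bucket shows a star
        lines.append(("%6.2f %4d %s" % (100.0 * i / nbuckets, n, stars)).rstrip())
    return lines


for line in histogram([0.001] * 4000)[:3]:
    print(line)
```

Feeding it 4000 scores near zero gives the "* = 67 items" scale and a full
60-star first row, as in the ham listing.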

I've got 19 more sets like this if you care a lot <wink>.  Here's the
aggregate across all 20 runs (each msg is counted 4 times here, once for
each of the runs in which it served in the prediction set against training
on one of the 4 spam+ham collection pairs it doesn't belong to):

Ham distribution for all runs:
* = 1333 items
  0.00 79938 ************************************************************
  2.50     8 *
  5.00     3 *
  7.50     0
 10.00     3 *
 12.50     1 *
 15.00     3 *
 17.50     1 *
 20.00     1 *
 22.50     0
 25.00     0
 27.50     0
 30.00     1 *
 32.50     4 *
 35.00     2 *
 37.50     0
 40.00     2 *
 42.50     0
 45.00     1 *
 47.50     1 *
 50.00     1 *
 52.50     0
 55.00     0
 57.50     0
 60.00     0
 62.50     1 *
 65.00     0
 67.50     0
 70.00     2 *
 72.50     0
 75.00     1 *
 77.50     1 *
 80.00     0
 82.50     0
 85.00     1 *
 87.50     1 *
 90.00     0
 92.50     1 *
 95.00     1 *
 97.50    21 *

Spam distribution for all runs:
* = 905 items
  0.00   215 *
  2.50    18 *
  5.00     8 *
  7.50    12 *
 10.00     6 *
 12.50     6 *
 15.00    14 *
 17.50     6 *
 20.00    10 *
 22.50     8 *
 25.00     9 *
 27.50     9 *
 30.00     3 *
 32.50     3 *
 35.00     5 *
 37.50     3 *
 40.00     7 *
 42.50    24 *
 45.00     3 *
 47.50    29 *
 50.00    34 *
 52.50     8 *
 55.00     6 *
 57.50    18 *
 60.00    64 *
 62.50    12 *
 65.00     7 *
 67.50     5 *
 70.00     3 *
 72.50     7 *
 75.00     4 *
 77.50    18 *
 80.00    10 *
 82.50    23 *
 85.00    13 *
 87.50    20 *
 90.00    27 *
 92.50    18 *
 95.00    57 *
 97.50 54256 ************************************************************

In percentage terms, very little lives outside the tips of the tail ends.

Note that calling the spam cutoff 0.975 instead of 0.90 would save 2 false
positives, at the expense of letting an additional 27+18+57 = 102 spams go
thru.
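
That arithmetic can be checked from the top four buckets of the two aggregate
listings above.  A small sketch in current Python 3; the tuple layout and the
name errors_between are illustrative assumptions, only the counts come from
the tables.

```python
# (bucket lower bound in percent, ham count, spam count), copied from the
# aggregate ham and spam distributions above.
TOP_BUCKETS = [
    (90.0, 0, 27),
    (92.5, 1, 18),
    (95.0, 1, 57),
    (97.5, 21, 54256),
]


def errors_between(buckets, old_cutoff, new_cutoff):
    """Return (ham no longer flagged, spam no longer flagged) when the
    spam cutoff is raised from old_cutoff to new_cutoff (both in percent)."""
    moved = [(h, s) for lo, h, s in buckets if old_cutoff <= lo < new_cutoff]
    return sum(h for h, s in moved), sum(s for h, s in moved)


fp_saved, fn_added = errors_between(TOP_BUCKETS, 90.0, 97.5)
print(fp_saved, fn_added)
```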


Here's the first example of a low-prob spam:

"""
Low prob spam! 0.0133104753792
Data/Spam/Set2/8007.txt
prob('from:email name:<janet691') = 0.5
prob('the') = 0.5
prob('subject:Fred') = 0.5
prob('you') = 0.5
prob('was') = 0.305052
prob('bool:noorg') = 0.614515
prob('proposal') = 0.100629
prob('will') = 0.557569
prob('talk') = 0.507463
prob('send') = 0.858078
prob('nice') = 0.227838
prob('from:email addr:ac') = 0.0754717
prob('from:email addr:uk>') = 0.0488301
prob('thanks,') = 0.0300188
prob('subject:Hey') = 0.99
prob('today') = 0.852792

Return-Path: <janet691@cranfield.ac.uk>
Delivered-To: bruce-spam@localhost
Received: (qmail 14409 invoked by alias); 6 Mar 2002 20:07:42 -0000
Delivered-To: spam@bruce-guenter.dyndns.org
Received: (qmail 14405 invoked from network); 6 Mar 2002 20:07:42 -0000
Received: from agamemnon.bfsmedia.com (204.83.201.2)
  by lorien.untroubled.org (192.168.1.3) with SMTP; 06 Mar 2002
20:07:42 -0000
Received: (qmail 13063 invoked by uid 500); 6 Mar 2002 20:02:05 -0000
Delivered-To: em-ca-spam@em.ca
Received: (qmail 13057 invoked by uid 502); 6 Mar 2002 20:02:05 -0000
Delivered-To: bfsmedia-goose.kennels@bfsmedia.com
Received: (qmail 13051 invoked from network); 6 Mar 2002 20:02:05 -0000
Received: from unknown (HELO smtp2.forserve.com) (63.170.11.221)
  by agamemnon.bfsmedia.com with SMTP; 6 Mar 2002 20:02:05 -0000
Date: Wed, 6 Mar 2002 15:12:41 -0500
Message-Id: <200203062012.g26KCfn08192@smtp2.forserve.com>
X-Mailer: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.1)
Gecko/20010607
Reply-To: <janet691@cranfield.ac.uk>
From: <janet691@cranfield.ac.uk>
To: <goose01977@bellsouth.net>
Subject: Hey Fred
Content-Length: 95
Lines: 9

Fred,


  It was nice to talk to you today I will send the proposal tonight.



Thanks,
 Heidi
"""

You figure it out <wink>.  I suspect bfsmedia would have added a high spam
score if I looked at Received lines, but even several additional strong spam
indicators wouldn't be enough to nail this one.  BTW, this msg shows up many
times in the spam corpora, varying the "Fred" and "Heidi" with other male
and female names; I assume this is a harvester that's trying to provoke the
recipient into replying.

Several others are damaged in ways such that the email pkg can't create a
msg out of them.  I could easily enough add code to force such a msg to be
considered spam.

Some are wildly embarrassing failures:

"""
Low prob spam! 0.000102019995919
Data/Spam/Set3/681.txt
prob('common,') = 0.01
prob('definately') = 0.01
prob('logic') = 0.01
prob('hell,') = 0.01
prob('it".') = 0.01
prob('obvious.') = 0.01
prob('theory') = 0.01
prob('whilst') = 0.01
prob('earning') = 0.99
prob('same,') = 0.01
prob('$500,000') = 0.99
prob('"bull",') = 0.99
prob('year!!!') = 0.99
prob('internet!') = 0.99
prob('tv:') = 0.99
prob('*this') = 0.99

Return-Path: <ihrockrat3213@hotmail.com>
Delivered-To: em-ca-bruceg@em.ca
Received: (qmail 25721 invoked from network); 17 Aug 2002 01:05:07 -0000
Received: from unknown (HELO 65.102.48.161) (65.102.48.161)
  by churchill.factcomp.com with SMTP; 17 Aug 2002 01:05:07 -0000
Received: from unknown (149.89.93.47) by rly-xr02.mx.aol.com with NNFMP;
Aug, 17 2002 1:50:22 AM -0800
Received: from anther.webhostingtalk.com ([88.58.121.118]) by
da001d2020.lax-ca.osd.concentric.net with QMQP; Aug, 17 2002 12:40:13
AM -0700
Received: from 34.57.158.148 ([34.57.158.148]) by rly-xr02.mx.aol.com with
local; Aug, 17 2002 12:02:05 AM +0300
From: rnpyjohn <ihrockrat3213@hotmail.com>
To: Undisclosed Recipients
Cc:
Subject: Please read this letter carefully, it works 100%
Sender: rnpyjohn <ihrockrat3213@hotmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Date: Sat, 17 Aug 2002 02:03:28 +0100
X-Mailer: The Bat! (v1.52f) Business
X-Priority: 1
Content-Length: 15985

*This is a one time mailing and this list will never be used again.*

Hi,

SEEN THIS MAIL BEFORE?,  SICK OF FINDING IT IN YOUR INBOX?   ME TOO, HONEST
I was exactly the same, till one day whilst i was complaining about how
tired
i was of seeing ...
"""

The first 16 most extreme indicators are split 9 highly in favor of ham
(.01) and 7 highly in favor of spam (.99).  If I hadn't folded case away to
let stinking conference announcements through <wink>, I expect it would have
latched on to the SCREAMING at the start instead of looking deeper.  Looking
at the To: line probably would nail this one too, as "Undisclosed
Recipients" has two 0.99 spam indicators right there.

Whatever, you *don't* want to look at msgs with a mix of just 0.99 and 0.01
thingies:  it's not all that unusual to get such an extreme mix, in spam or
ham.
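
The score for that last msg can be reproduced by hand.  The sketch below
assumes the clues are combined the way Graham's "A Plan for Spam" does it,
prod(p) / (prod(p) + prod(1-p)); the function name graham_combine is
illustrative, and this is current Python 3 rather than the code used in the
test runs.

```python
def graham_combine(probs):
    """Combine per-word spam probabilities, Graham style (assumed rule)."""
    prod_spam = 1.0
    prod_ham = 1.0
    for p in probs:
        prod_spam *= p
        prod_ham *= 1.0 - p
    return prod_spam / (prod_spam + prod_ham)


# Nine clues at 0.01 and seven at 0.99, as in the listing above.
clues = [0.01] * 9 + [0.99] * 7
score = graham_combine(clues)
print(score)
```

Fourteen of the sixteen clues cancel in pairs, so the whole verdict rests on
the two left-over 0.01 clues: the result is 1 / (1 + 99**2), which agrees
with the 0.000102 reported for that msg.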

this-isn't-your-father's-idea-of-probability<wink>-ly y'rs  - tim



From barry@python.org  Wed Sep  4 02:35:27 2002
From: barry@python.org (Barry A. Warsaw)
Date: Tue, 3 Sep 2002 21:35:27 -0400
Subject: [Python-Dev] mysterious hangs in socket code
References: <15733.12138.568668.562013@slothrop.zope.com>
Message-ID: <15733.25439.461968.51583@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy@alum.mit.edu> writes:

    JH> I've been running a small, multi-threaded program to retrieve
    JH> web pages today.  The entire program appears to hang when I
    JH> perform a slow DNS operation, even though there is no
    JH> application-level coordination between the threads.

Does strace'ing the program provide any clues?  Also, if it's a DNS
thing, you should definitely try to run it on different networks (or
at least pointing to different DNS servers).

<type> <type>

Ok, running it now as "strace python foo.py" (Py2.2.1) and I see
similar behavior.  It seems to mostly be sitting in select() calls and
rt_sigsuspend() which I guess is a wrapper around sigsuspend().  When
I use Python 2.1.3 I never see it sit in sigsuspend().

-Barry


From pinard@iro.umontreal.ca  Wed Sep  4 02:39:44 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Tue, 03 Sep 2002 21:39:44 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <008201c253aa$780144d0$6300000a@holdenweb.com> ("Steve
 Holden"'s message of "Tue, 3 Sep 2002 20:31:52 -0400")
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <200209031653.g83GrjQ01929@odiug.zope.com>
 <oqlm6i6e8c.fsf@titan.progiciels-bpi.ca>
 <200209032018.g83KI3q08343@odiug.zope.com>
 <17d001c2538d$f82650f0$1c86db41@boostconsulting.com>
 <008201c253aa$780144d0$6300000a@holdenweb.com>
Message-ID: <oqsn0qr0rj.fsf@titan.progiciels-bpi.ca>

[Steve Holden]

> It looks especially bad in my standard mailreader variable-pitch font.

Oh!  You are touching a sensitive nerve! :-)

There are many cases where people do ASCII art in messages, and I'm not
speaking of signatures here.  People often insert ASCII tables or simple
explanatory drawings; these capabilities are too useful to be dismissed.
You should use fixed width fonts when receiving, and even when sending
email.  (And people should limit their messages to 79 columns.)

If something looks bad because of your variable-pitch fonts, the problem is
emphatically _not_ in the sent message, and does not justify any alteration to
the format of those messages.

Another example is the fact that many fonts nowadays decided to improve over
ASCII, and have an apostrophe which is not symmetrical to a grave accent.  By
design and since ASCII 1, long ago, they should be symmetrical.  A few people
push for everybody to stop `quoting' like this.  I strongly believe that for
displaying ASCII text, people should use ASCII fonts.  If fonts are wrong, even
if many fonts are wrong, this should not be seen as the sender's problem.

The push is sometimes accompanied with the suggestion of switching to Unicode
all over, as a way to avoid the problem.  It is surely a good idea, but we are
not there yet.  In the meantime, ASCII stays ASCII.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From fredrik@pythonware.com  Wed Sep  4 06:41:42 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 4 Sep 2002 07:41:42 +0200
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU><200209031653.g83GrjQ01929@odiug.zope.com><oqlm6i6e8c.fsf@titan.progiciels-bpi.ca><200209032018.g83KI3q08343@odiug.zope.com><17d001c2538d$f82650f0$1c86db41@boostconsulting.com><008201c253aa$780144d0$6300000a@holdenweb.com> <oqsn0qr0rj.fsf@titan.progiciels-bpi.ca>
Message-ID: <005601c253d5$d0a63c50$ced241d5@hagrid>

François Pinard wrote:

> [Steve Holden]
>
> > It looks especially bad in my standard mailreader variable-pitch font.
>
> Oh!  You are touching a sensitive nerve! :-)
>
> There are many cases where people do ASCII art in messages, and I'm not
> speaking of signatures here.  People often insert ASCII tables or simple
> explanatory drawings; these capabilities are too useful to be dismissed.
> You should use fixed width fonts when receiving, and even when sending
> email.

loser.

if python really was all about "everything computers did when
I learned to use them will always be the best way to do it", it
would probably never have been invented.

and this mailing list is about python.

</F>



From oren-py-d@hishome.net  Wed Sep  4 10:49:47 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 4 Sep 2002 05:49:47 -0400
Subject: [Python-Dev] Two random and nearly unrelated ideas
In-Reply-To: <15733.11253.743055.864572@12-248-11-90.client.attbi.com>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com>
Message-ID: <20020904094947.GA56953@hishome.net>

On Tue, Sep 03, 2002 at 04:39:01PM -0500, Skip Montanaro wrote:
> Second (also considered during the above edit), it would be nice to get rid
> of the ticker altogether in systems with proper signal support.  On those
> platforms couldn't an alarm replace polling for the ticker?  

Not before all Python I/O calls are converted to be EINTR-safe.

After running into some problems with I/O interrupted by signals I tried to
fix it myself but it requires a lot of work in some of the hairiest places 
in the Python codebase.

	Oren


From fredrik@pythonware.com  Wed Sep  4 12:22:26 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 4 Sep 2002 13:22:26 +0200
Subject: [Python-Dev] Two random and nearly unrelated ideas
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net>
Message-ID: <001b01c25405$5a6da520$0900a8c0@spiff>

oren wrote:

> Not before all Python I/O calls are converted to be EINTR-safe.
>
> After running into some problems with I/O interrupted by signals I tried to
> fix it myself but it requires a lot of work in some of the hairiest places
> in the Python codebase.

sounds like a good topic for a "here's what I learned when
trying to fix this problem" PEP.

</F>



From guido@python.org  Wed Sep  4 12:24:16 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 07:24:16 -0400
Subject: [Python-Dev] Should KeyError use repr() on its argument?
In-Reply-To: Your message of "Tue, 03 Sep 2002 16:29:32 PDT."
 <Pine.SOL.4.44.0209031628020.19982-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209031628020.19982-100000@death.OCF.Berkeley.EDU>
Message-ID: <200209041124.g84BOHY03377@pcp02138704pcs.reston01.va.comcast.net>

> > > The KeyError exception doesn't apply repr() to its argument.  That's
> > > annoying in cases like this:
> > >
> > >   >>> a = {}
> > >   >>> a['']
> > >   Traceback (most recent call last):
> > >     File "<stdin>", line 1, in ?
> > >   KeyError
> > >   >>>
> > >
> > > Should this be fixed?  How?  (I guess we could add a KeyError__str__
> > > method to exceptions.c that applies repr().)
> > >
> > > I've got a feeling this is a feature, but not a very useful one.
> >
> > I take it back.  args[0] being the actual key that failed is a
> > feature.  str() not using repr() on args[0] is a bug.  I'll fix it.
> >
> 
> What is args[0]?

args is the name of the instance variable that most exceptions use to
store the arguments that were passed to them in the raise statement
(or equivalent C API).  It is a tuple.  Examples:

  >>> a = KeyError()
  >>> a.args
  ()
  >>> a = KeyError(1)
  >>> a.args
  (1,)
  >>> a = KeyError(1,2,3)
  >>> a.args
  (1, 2, 3)
  >>> try:
          {}['']
      except KeyError, k:
          print k.args

  ('',)
  >>> 

> Are you saying that dicts use repr() instead of str() to
> get the key value when accessing?

No, I'm saying that str(KeyError('foo')) should return repr('foo')
rather than 'foo' as it does now.  See current CVS. :-)
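
For comparison, a short sketch of how this looks in a current Python 3, where
the change has long been in place (note the "except ... as" spelling, which
replaced the comma form used above):

```python
# KeyError with a single argument shows the repr() of that argument,
# so an empty-string key is still visible in the message.
try:
    {}['']
except KeyError as k:
    message = str(k)
    args = k.args

print(repr(message))
print(args)

# Most other exceptions use str() of a single argument.
print(repr(str(ValueError('foo'))))
print(repr(str(KeyError('foo'))))
```

The key itself is still available unchanged as args[0]; only the string form
of the exception goes through repr().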

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Sep  4 12:44:32 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 07:44:32 -0400
Subject: [Python-Dev] Two random and nearly unrelated ideas
In-Reply-To: Your message of "Wed, 04 Sep 2002 05:49:47 EDT."
 <20020904094947.GA56953@hishome.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com>
 <20020904094947.GA56953@hishome.net>
Message-ID: <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net>

> > Second (also considered during the above edit), it would be nice to get rid
> > of the ticker altogether in systems with proper signal support.  On those
> > platforms couldn't an alarm replace polling for the ticker?  
> 
> > Not before all Python I/O calls are converted to be EINTR-safe.
> 
> After running into some problems with I/O interrupted by signals I tried to
> fix it myself but it requires a lot of work in some of the hairiest places 
> in the Python codebase.

Signals: just say no.  It is impossible to write correct code in the
presence of signals.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Sep  4 12:49:15 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 07:49:15 -0400
Subject: [Python-Dev] mysterious hangs in socket code
In-Reply-To: Your message of "Tue, 03 Sep 2002 17:53:46 EDT."
 <15733.12138.568668.562013@slothrop.zope.com>
References: <15733.12138.568668.562013@slothrop.zope.com>
Message-ID: <200209041149.g84BnFV05659@pcp02138704pcs.reston01.va.comcast.net>

> One possibility is that the Linux getaddrinfo() is thread-safe, but
> only by way of a lock that only allows one request to be outstanding
> at a time.

The next step should be to get the getaddrinfo() source code from
glibc and see what it does.  It's open source, hey. :-)
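
Short of reading the glibc source, the one-request-at-a-time hypothesis can
also be probed from Python by timing lookups run in parallel threads.  What
follows is a sketch in current Python 3, not a diagnosis: time_concurrent and
the two stand-in resolvers are hypothetical names, and the stand-ins only
simulate a resolver with and without a global lock so the example runs
without a network.

```python
import threading
import time


def time_concurrent(resolve, names):
    """Call resolve(name) for every name, one thread each; return seconds taken."""
    threads = [threading.Thread(target=resolve, args=(n,)) for n in names]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


DELAY = 0.05                      # pretend each lookup takes this long
_resolver_lock = threading.Lock()


def fake_parallel_lookup(name):
    time.sleep(DELAY)             # lookups overlap freely


def fake_serialized_lookup(name):
    with _resolver_lock:          # only one lookup may be outstanding
        time.sleep(DELAY)


names = ["a.example", "b.example", "c.example", "d.example"]
print("parallel:   %.2fs" % time_concurrent(fake_parallel_lookup, names))
print("serialized: %.2fs" % time_concurrent(fake_serialized_lookup, names))
```

Passing socket.gethostbyname (or a small wrapper around socket.getaddrinfo)
and a list of slow-to-resolve names in place of the stand-ins would show
which of the two patterns the real resolver follows: total time close to one
lookup, or close to the sum of all of them.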

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Sep  4 12:51:10 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 07:51:10 -0400
Subject: [Python-Dev] Two random and nearly unrelated ideas
In-Reply-To: Your message of "Tue, 03 Sep 2002 16:39:01 CDT."
 <15733.11253.743055.864572@12-248-11-90.client.attbi.com>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com>
Message-ID: <200209041151.g84BpAg05683@pcp02138704pcs.reston01.va.comcast.net>

> While adding a blurb to Misc/NEWS about the change to the thread
> ticker and check interval, it occurred to me that perhaps Misc/NEWS
> would benefit from conversion to ReST format.  You could pump an
> HTML version out to the website periodically.

Nice idea.  How much additional mark-up would this add to quote the
occasional reST meta-character?  Can you convert a section for test
and show me?

> Second (also considered during the above edit), it would be nice to
> get rid of the ticker altogether in systems with proper signal
> support.  On those platforms couldn't an alarm replace polling for
> the ticker?  I know signals are tricky devils, but it still seems it
> would be a win if you could use it.  You'd have to install a SIGALRM
> handler which would trip periodically.  It would also have to keep
> track of any alarm handler the programmer installed.

-1,000,000.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From praveen.patil@silver-software.com  Wed Sep  4 13:31:00 2002
From: praveen.patil@silver-software.com (Praveen Patil)
Date: Wed, 4 Sep 2002 13:31:00 +0100
Subject: [Python-Dev] Please help in  calling python fucntion from 'c'
Message-ID: <NFBBLJFBNMKMLGNLJMFCGELNCCAA.praveen.patil@silver-software.com>

This is a multi-part message in MIME format.

------=_NextPart_000_0011_01C25417.4EC8F910
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

Hi,

I have written a C DLL (MY_DLL.DLL), which I import from a Python file
(example.py).  I want to call a Python function from a C function.
For your reference, I have attached the C and Python files to this mail.
On my PC:
the Python code is at D:\test\example.py
the DLL is at C:\Program Files\Python\DLLs\MY_DLL.pyd

Here are the steps I am following.

step(1): I call the C function (RECEIVE_FROM_IL_S) from Python.
         This C function lives in the imported DLL (MY_DLL).
step(2): I want to call the Python function (TestFunction) from the C
         function (RECEIVE_FROM_IL_S).


Python code is(example.py)  :-
----------------------------
import MY_DLL

G_Logfile          = None

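def TestFunction():
    global G_Logfile   # rebind the module-level name rather than a local
    G_Logfile = open('Pytestfile.txt', 'w')
    G_Logfile.write("%s \n"%'I am writing python created text file')
    G_Logfile.close()  # close() must be called, not merely referenced
    G_Logfile = None
#end def TestFunction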

if __name__ == "__main__":

   MY_DLL.RECEIVE_FROM_IL_S(10,50)


'C' code is (MY_DLL.c) :-
---------------------
#include <windows.h>
#include <stdio.h>
#include <Python.h>

PyObject* _wrap_RECEIVE_FROM_IL_S(PyObject *self, PyObject *args)
{
    FILE* fp;
    PyObject* _resultobj;
    int i,j;

    if( !(PyArg_ParseTuple(args, "ii",&i,&j)))
    {
       return NULL;
    }
    fp= fopen("RECEIVE_IL_S.txt", "w");
    fprintf(fp, "i=%d   j=%d" , i,j);
    fclose(fp);

    /* Here I want to call the Python function (TestFunction).  Please
       suggest a solution. */

    Py_INCREF(Py_None);  /* the caller receives its own reference to None */
    _resultobj = Py_None;
    return _resultobj;
}


static PyMethodDef MY_DLL_methods[] = {
      { "RECEIVE_FROM_IL_S", _wrap_RECEIVE_FROM_IL_S, METH_VARARGS },
      { NULL , NULL}
      };

__declspec(dllexport) void __cdecl initMY_DLL(void)
  {
    Py_InitModule("MY_DLL",MY_DLL_methods);
  }


Could anybody please help me solve this problem?


Cheers,

Praveen.

------=_NextPart_000_0011_01C25417.4EC8F910--


From oren-py-d@hishome.net  Wed Sep  4 13:46:46 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 4 Sep 2002 08:46:46 -0400
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020904124646.GA79746@hishome.net>

On Wed, Sep 04, 2002 at 07:44:32AM -0400, Guido van Rossum wrote:
> > > Second (also considered during the above edit), it would be nice to get rid
> > > of the ticker altogether in systems with proper signal support.  On those
> > > platforms couldn't an alarm replace polling for the ticker?  
> > 
> > > Not before all Python I/O calls are converted to be EINTR-safe.
> > 
> > After running into some problems with I/O interrupted by signals I tried to
> > fix it myself but it requires a lot of work in some of the hairiest places 
> > in the Python codebase.
> 
> Signals: just say no.  It is impossible to write correct code in the
> presence of signals.

Wrapping all I/O calls with PyOS_ wrappers would be a good start. After
that the wrappers can be modified to retry the call on EINTR. This should 
solve all the problems I have encountered with interference to Python code 
by signals. Any other problems I should be aware of?
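
The retry idea, expressed at the Python level rather than in C, might look
like the sketch below.  It is only an illustration in current Python 3: the
name retry_on_eintr is hypothetical, flaky_read is a stand-in that fakes two
interrupted calls, and no PyOS_ wrapper of this kind is being quoted from the
Python source.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), repeating it while it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except InterruptedError:    # OSError whose errno is EINTR
            continue


attempts = []


def flaky_read():
    """Stand-in for a blocking call that gets interrupted twice."""
    attempts.append(1)
    if len(attempts) < 3:
        raise OSError(errno.EINTR, "Interrupted system call")
    return b"data"


print(retry_on_eintr(flaky_read), len(attempts))
```

Recent Python 3 releases already retry most interrupted system calls
internally, so a wrapper like this is mainly useful as a picture of the
proposal.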

	Oren



From guido@python.org  Wed Sep  4 14:25:01 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 09:25:01 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Wed, 04 Sep 2002 08:46:46 EDT."
 <20020904124646.GA79746@hishome.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net>
 <20020904124646.GA79746@hishome.net>
Message-ID: <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net>

> > Signals: just say no.  It is impossible to write correct code in the
> > presence of signals.
> 
> Wrapping all I/O calls with PyOS_ wrappers would be a good start.

And what should those wrappers do?

> After that the wrappers can be modified to retry the call on EINTR.

But that's not always what you want to happen!  E.g. if an app is
blocked on a read and uses an alarm to bail out of the read.
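
A sketch of that pattern, in current Python 3 and Unix only, and itself an
example of code that leans on signals: the names ReadTimeout and
read_with_timeout are hypothetical, and the handler must be installed from
the main thread.  A wrapper that blindly retried on EINTR would make this
read block forever; here the handler raises, which is what ends the read.

```python
import os
import signal


class ReadTimeout(Exception):
    """Raised by the alarm handler to abandon a blocked read."""


def read_with_timeout(fd, nbytes, seconds):
    """Read from fd, giving up after `seconds` by way of SIGALRM."""
    def on_alarm(signum, frame):
        raise ReadTimeout("read timed out")

    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return os.read(fd, nbytes)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)       # cancel a pending alarm
        signal.signal(signal.SIGALRM, old_handler)    # restore previous handler


r, w = os.pipe()
try:
    read_with_timeout(r, 10, 0.1)       # nothing written yet: blocks, then bails out
    timed_out = False
except ReadTimeout:
    timed_out = True
print(timed_out)
```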

> This should solve all the problems I have encountered with
> interference to Python code by signals. Any other problems I should
> be aware of?

There's no way to sufficiently test a program that uses signals.  The
signal handler cannot touch *any* data, which makes it pretty useless.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Wed Sep  4 15:45:51 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 4 Sep 2002 09:45:51 -0500
Subject: [Python-Dev] Two random and nearly unrelated ideas
In-Reply-To: <20020904094947.GA56953@hishome.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com>
 <20020904094947.GA56953@hishome.net>
Message-ID: <15734.7327.163001.51042@12-248-11-90.client.attbi.com>

    >> On those platforms couldn't an alarm replace polling for the ticker?

    Oren> Not before all Python I/O calls are converted to be
    Oren> EINTR-safe.

Ah, yes.  Thanks for pointing out that little stumbling block...

Skip


From oren-py-d@hishome.net  Wed Sep  4 17:01:43 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 4 Sep 2002 12:01:43 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net> <20020904124646.GA79746@hishome.net> <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020904160143.GA1483@hishome.net>

On Wed, Sep 04, 2002 at 09:25:01AM -0400, Guido van Rossum wrote:
> > After that the wrappers can be modified to retry the call on EINTR.
> 
> But that's not always what you want to happen!  E.g. if an app is
> blocked on a read and uses an alarm to bail out of the read.

If I use a module that spawns an external process and uses SIGCHLD to be 
informed of its termination why should my innocent code that just reads 
lines from a file suddenly break?  In C I can at least restart the 
operation after an EINTR but file.readline cannot even be properly 
restarted because the buffering and file position is all messed up.

The example you gave of bailing out of a read with a signal can be done
using other techniques such as non-blocking I/O (which is, IMHO, a much
cleaner way to do it). Getting a notification of a child process 
terminating or other asynchronous events can only be done using signals 
and is currently dangerous because it will break code using I/O.
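
A minimal sketch of the non-blocking alternative mentioned above, written in
present-day Python 3 and assuming a POSIX system where select() accepts any
file descriptor.  The helper name read_with_timeout is hypothetical; it is
not an existing Python API.

```python
import os
import select


def read_with_timeout(fd, count, timeout):
    """Return up to `count` bytes from `fd`, or None if nothing arrives in time.

    Hypothetical helper: it bails out of a read without using an alarm signal.
    """
    # select() blocks until fd is readable or `timeout` seconds have passed.
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return None  # timed out; the caller decides what to do next
    # fd is readable, so this read will not block (b"" means end of file).
    return os.read(fd, count)
```

The caller gets None on a timeout instead of having a signal handler break
into the middle of the read.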

> > interference to Python code by signals. Any other problems I should
> > be aware of?
> 
> There's no way to sufficiently test a program that uses signals.  The
> signal handler cannot touch *any* data, which makes it pretty useless.

In order to be useful a signal handler needs to be able to set one bit.
The next time the ticker expires this bit will be checked. If an I/O 
operation was interrupted the Python signal handler can be executed 
immediately from the wrapper. When it returns the wrapper will resume the 
interrupted operation.

	Oren


I/O, I/O, it's off to work we go...

	The seven dwarfs



From oren-py-d@hishome.net  Wed Sep  4 19:51:31 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 4 Sep 2002 21:51:31 +0300
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <Pine.SOL.4.44.0209041104020.25909-100000@death.OCF.Berkeley.EDU>; from bac@OCF.Berkeley.EDU on Wed, Sep 04, 2002 at 11:04:27AM -0700
References: <20020904094947.GA56953@hishome.net> <Pine.SOL.4.44.0209041104020.25909-100000@death.OCF.Berkeley.EDU>
Message-ID: <20020904215131.A12898@hishome.net>

On Wed, Sep 04, 2002 at 11:04:27AM -0700, Brett Cannon wrote:
> [Oren Tirosh]
> 
> <snip>
> >
> > Not before all Python I/O calls are converted to be EINTR-safe.
> 
> what is EINTR-safe?

When an I/O operation is interrupted by an unmasked signal it returns 
with errno==EINTR.  The state of the file is not affected and repeating
the operation should recover and continue with no loss of data.

Here is an EINTR-safe version of read:

ssize_t safe_read(int fd, void *buf, size_t count) {
	ssize_t result;
	do {
		result = read(fd, buf, count);
	} while (result == -1 && errno == EINTR);
	return result;
}

When exposing the C I/O calls to Python you can either:

1. Use EINTR-safe I/O and hide this from the user.
2. Pass on EINTR to the user.

Python currently does #2 with a big caveat - the internal buffering 
of functions like file.read or file.readline is messed up and cannot be 
cleanly restarted. This makes signals unusable for delivery of asynchronous 
events in the background without affecting the state of the main program.

	Oren
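
For comparison, the same retry loop can be sketched in present-day Python 3.
The name retry_on_eintr is hypothetical and only illustrates option 1 above.
Since Python 3.5 (PEP 475) the interpreter already retries most system calls
itself when the signal handler does not raise, so a wrapper like this is
rarely needed there.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), repeating it while it fails with EINTR.

    Hypothetical helper mirroring the C safe_read() loop.
    """
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as exc:
            # Only an interrupted call is retried; other errors propagate.
            if exc.errno != errno.EINTR:
                raise
```

Usage would look like retry_on_eintr(os.read, fd, 4096).  As in the C
version, this is only safe for calls that leave no partial state behind when
they are interrupted.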



From guido@python.org  Wed Sep  4 20:10:15 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 15:10:15 -0400
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Wed, 04 Sep 2002 21:51:31 +0300."
 <20020904215131.A12898@hishome.net>
References: <20020904094947.GA56953@hishome.net> <Pine.SOL.4.44.0209041104020.25909-100000@death.OCF.Berkeley.EDU>
 <20020904215131.A12898@hishome.net>
Message-ID: <200209041910.g84JAGR08004@pcp02138704pcs.reston01.va.comcast.net>


From guido@python.org  Wed Sep  4 20:16:25 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 15:16:25 -0400
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Wed, 04 Sep 2002 21:51:31 +0300."
 <20020904215131.A12898@hishome.net>
References: <20020904094947.GA56953@hishome.net> <Pine.SOL.4.44.0209041104020.25909-100000@death.OCF.Berkeley.EDU>
 <20020904215131.A12898@hishome.net>
Message-ID: <200209041916.g84JGPd08031@pcp02138704pcs.reston01.va.comcast.net>

> > what is EINTR-safe?
> 
> When an I/O operation is interrupted by an unmasked signal it returns 
> with errno==EINTR.  The state of the file is not affected and repeating
> the operation should recover and continue with no loss of data.

What if the operation is a select() call?  Is restarting the right
thing?  How to take into account the consumed portion of the timeout,
if given?

> Here is an EINTR-safe version of read:
> 
> ssize_t safe_read(int fd, void *buf, size_t count) {
> 	ssize_t result;
> 	do {
> 		result = read(fd, buf, count);
> 	} while (result == -1 && errno == EINTR);
> 	return result;
> }
> 
> When exposing the C I/O calls to Python you can either:
> 
> 1. Use EINTR-safe I/O and hide this from the user.
> 2. Pass on EINTR to the user.
> 
> Python currently does #2 with a big caveat - the internal buffering 
> of functions like file.read or file.readline is messed up and cannot be 
> cleanly restarted. This makes signals unusable for delivery of asynchronous 
> events in the background without affecting the state of the main program.

Can you point to a place in the code where this is happening?

Or is this a stdio problem?  I believe that calls like fgets() and
getchar() don't lose data, but maybe I misunderstand your observation.

As I said before, I'm very skeptical that making the I/O ops
EINTR-safe would be enough to allow the use of signals as suggested by
Skip, but that might still be useful for other purposes, *if* we can
decide when to honor EINTR and when not.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas@python.ca  Wed Sep  4 20:22:47 2002
From: nas@python.ca (Neil Schemenauer)
Date: Wed, 4 Sep 2002 12:22:47 -0700
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209041916.g84JGPd08031@pcp02138704pcs.reston01.va.comcast.net>
References: <20020904094947.GA56953@hishome.net> <Pine.SOL.4.44.0209041104020.25909-100000@death.OCF.Berkeley.EDU> <20020904215131.A12898@hishome.net> <200209041916.g84JGPd08031@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020904192247.GA16797@glacier.arctrix.com>

Guido van Rossum wrote:
> What if the operation is a select() call?  Is restarting the right
> thing?  How to take into account the consumed portion of the timeout,
> if given?

I think you would not restart select().  It's only a hint anyhow.

  Neil


From oren-py-d@hishome.net  Wed Sep  4 21:07:09 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 4 Sep 2002 23:07:09 +0300
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209041916.g84JGPd08031@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Sep 04, 2002 at 03:16:25PM -0400
References: <20020904094947.GA56953@hishome.net> <Pine.SOL.4.44.0209041104020.25909-100000@death.OCF.Berkeley.EDU> <20020904215131.A12898@hishome.net> <200209041916.g84JGPd08031@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020904230709.A24623@hishome.net>

On Wed, Sep 04, 2002 at 03:16:25PM -0400, Guido van Rossum wrote:
> > When an I/O operation is interrupted by an unmasked signal it returns 
> > with errno==EINTR.  The state of the file is not affected and repeating
> > the operation should recover and continue with no loss of data.
> 
> What if the operation is a select() call?  Is restarting the right
> thing?  How to take into account the consumed portion of the timeout,
> if given?

Some versions of select update the timeout structure to the remainder if 
they are interrupted by a signal. It's probably not a good idea to rely 
on this so gettimeofday could be used to calculate the remainder.
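
A minimal sketch of that remainder calculation in present-day Python 3, with
time.monotonic() standing in for gettimeofday.  The function name and the
_select/_clock parameters are hypothetical hooks added so the loop can be
exercised without real signals.

```python
import select
import time


def select_with_deadline(rlist, wlist, xlist, timeout,
                         _select=select.select, _clock=time.monotonic):
    """Restart select() after EINTR, waiting only for the time still left."""
    deadline = _clock() + timeout
    while True:
        # Recompute the remainder ourselves rather than trusting select()
        # to have updated the timeout.
        remaining = max(0.0, deadline - _clock())
        try:
            return _select(rlist, wlist, xlist, remaining)
        except InterruptedError:
            continue  # interrupted by a signal: retry with the shorter timeout
```

Once the deadline has passed, remaining is 0.0 and select() just polls, so
the total wait does not grow with the number of interruptions.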

> Or is this a stdio problem?  I believe that calls like fgets() and
> getchar() don't lose data, but maybe I misunderstand your observation.

This is not the point - even if Python I/O calls were fully restartable
would you actually expect people to check for EINTR and restart for
*every* I/O operation in the program just in case some module happens to
use signals?

Instead of

    for line in file:
        do_something_with(line)

we would need to write

    while 1:
        try:
            line = file.next()
        except IOError, exc:
            if exc.errno == errno.EINTR:
                continue
            else:
                raise
        except StopIteration:
            break
        do_something_with(line)

> As I said before, I'm very skeptical that making the I/O ops
> EINTR-safe would be enough to allow the use of signals as suggested by
> Skip

If it's good enough for other purposes it should be good enough for Skip's
proposal, too.

> Skip, but that might still be useful for other purposes, *if* we can
> decide when to honor EINTR and when not.

Only low-level functions like os.read and os.write that map directly to
stdio functions should ever return EINTR.  To make Python signal-safe all
other calls that can return EINTR should have a retry loop. On EINTR they
should check if there are things to do and if so grab the GIL, make
pending calls, release the GIL and retry the operation (unless an
exception has been raised by the signal handler, of course).

This way I could finally write a Python daemon that reloads its 
configuration files on getting the customary SIGHUP :-)
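
A rough sketch of such a daemon loop in present-day Python 3, using the
"handler only sets one bit" approach.  ReloadRequest, service_once and the
reload_config/do_work callbacks are hypothetical names for illustration;
only signal.signal() is a real library call.

```python
import signal


class ReloadRequest:
    """Records that a reload signal arrived; the main loop acts on it later."""

    def __init__(self):
        self.pending = False

    def handler(self, signum, frame):
        # Do as little as possible in the handler: just set the flag.
        self.pending = True

    def install(self, signum):
        # Must be called from the main thread, e.g. install(signal.SIGHUP).
        signal.signal(signum, self.handler)


def service_once(request, reload_config, do_work):
    """One pass of the main loop: reload first if asked to, then work."""
    if request.pending:
        request.pending = False
        reload_config()
    do_work()
```

The reload runs at a point the main loop chooses, not in the middle of
whatever I/O the signal happened to interrupt.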

	Oren



From guido@python.org  Wed Sep  4 21:05:22 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 16:05:22 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Wed, 04 Sep 2002 12:01:43 EDT."
 <20020904160143.GA1483@hishome.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net> <20020904124646.GA79746@hishome.net> <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net>
 <20020904160143.GA1483@hishome.net>
Message-ID: <200209042005.g84K5Ms08177@pcp02138704pcs.reston01.va.comcast.net>

> If I use a module that spawns an external process and uses SIGCHLD to be 
> informed of its termination why should my innocent code that just reads 
> lines from a file suddenly break?  In C I can at least restart the 
> operation after an EINTR but file.readline cannot even be properly 
> restarted because the buffering and file position is all messed up.

I have never understood why a child dying should send a signal.  You
can poll for the child with waitpid() instead.

But if you have a suggestion for how to fix this particular issue, I'd
be happy to look it over, since this *is* something some people do.

> The example you gave of bailing out of a read with a signal can be done
> using other techniques such as non-blocking I/O (which is, IMHO, a much
> cleaner way to do it).

Yes.

> Getting a notification of a child process terminating or other
> asynchronous events can only be done using signals and is currently
> dangerous because it will break code using I/O.

See above.  I see half your point; people wanting this tend to use
signals and it causes breakage.

> > > interference to Python code by signals. Any other problems I should
> > > be aware of?
> > 
> > There's no way to sufficiently test a program that uses signals.  The
> > signal handler cannot touch *any* data, which makes it pretty useless.
> 
> In order to be useful a signal handler needs to be able to set one bit.
> The next time the ticker expires this bit will be checked.

OK.

> If an I/O operation was interrupted the Python signal handler can be
> executed immediately from the wrapper. When it returns the wrapper
> will resume the interrupted operation.

Is calling the Python signal handler from the wrapper always safe?
What if the Python signal handler e.g. closes the file or reads from
it?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Sep  4 21:24:04 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 16:24:04 -0400
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Wed, 04 Sep 2002 23:07:09 +0300."
 <20020904230709.A24623@hishome.net>
References: <20020904094947.GA56953@hishome.net> <Pine.SOL.4.44.0209041104020.25909-100000@death.OCF.Berkeley.EDU> <20020904215131.A12898@hishome.net> <200209041916.g84JGPd08031@pcp02138704pcs.reston01.va.comcast.net>
 <20020904230709.A24623@hishome.net>
Message-ID: <200209042024.g84KO4G08242@pcp02138704pcs.reston01.va.comcast.net>

> > What if the operation is a select() call?  Is restarting the right
> > thing?  How to take into account the consumed portion of the timeout,
> > if given?
> 
> Some versions of select update the timeout structure to the remainder if 
> they are interrupted by a signal. It's probably not a good idea to rely 
> on this so gettimeofday could be used to calculate the remainder.

I like Neil's suggestion: simply return.  The timeout is a hint.

> > Or is this a stdio problem?  I believe that calls like fgets() and
> > getchar() don't lose data, but maybe I misunderstand your observation.
> 
> This is not the point - even if Python I/O calls were fully restartable
> would you actually expect people to check for EINTR and restart for
> *every* I/O operation in the program just in case some module happens to
> use signals?
> 
> Instead of
> 
>     for line in file:
>         do_something_with(line)
> 
> we would need to write
> 
>     while 1:
>         try:
>             line = file.next()
>         except IOError, exc:
>             if exc.errno == errno.EINTR:
>                 continue
>             else:
>                 raise
>         except StopIteration:
>             break
>         do_something_with(line)

OK, but you're changing your tune here.  I agree that this is bad, but
I still don't believe (or understand) your previous remark about
readline losing track of buffering.  But let's forget about this, I
trust that you really meant what you showed here.

> > As I said before, I'm very skeptical that making the I/O ops
> > EINTR-safe would be enough to allow the use of signals as
> > suggested by Skip
> 
> If it's good enough for other purposes it should be good enough for
> Skip's proposal, too.

Well, it has to be *perfect* for Skip's proposal, since it means we'd
be generating signals probably at a rate of 100 per second.

> > Skip, but that might still be useful for other purposes, *if* we can
> > decide when to honor EINTR and when not.
> 
> Only low-level functions like os.read and os.write that map directly
> to stdio functions should ever return EINTR.

Um, os.read/write are the ones that *don't* map to stdio.  Maybe you
meant "that map directly to file descriptors"?  But I doubt this would
be acceptable -- if we were generating 100 signals per second,
os.read/write become much harder to use if they could raise EINTR
(currently they only raise EINTR if the app uses signal handlers,
which isn't that common).

> To make Python signal-safe all other calls that can return EINTR
> should have a retry loop. On EINTR they should check if there are
> things to do and if so grab the GIL, make pending calls, release the
> GIL and retry the operation (unless an exception has been raised by
> the signal handler, of course).
> 
> This way I could finally write a Python daemon that reloads its 
> configuration files on getting the customary SIGHUP :-)

If you really want that, maybe you could see if you can produce a
working design and patch?  Even if it's not perfect enough to use
signals to replace the ticker, people who like to use signals would
probably be happy.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Jack.Jansen@oratrix.com  Wed Sep  4 21:45:30 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Wed, 4 Sep 2002 13:45:30 -0700
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <20020904215131.A12898@hishome.net>
Message-ID: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com>

On woensdag, sep 4, 2002, at 11:51 US/Pacific, Oren Tirosh wrote:
> When an I/O operation is interrupted by an unmasked signal it returns
> with errno==EINTR.  The state of the file is not affected and repeating
> the operation should recover and continue with no loss of data.
>
I'm not sure about modern unixen (it's been a long time since I was 
interested in such lowlevel details) but historically this has been one 
complete mess.

Aside from some unix variations that basically didn't do restart at all 
there have always been problems with signal restart semantics. For 
sockets and various devices (raw ttys, I think) you could definitely 
lose data.

Hmm, and when I think of it I don't think it's even possible to restart 
safely. What if I do a read() on a socket, and I request more bytes 
than the available physical memory (but less than VM, of course)? The 
kernel simply doesn't have anywhere to store the bytes other than my 
buffer, and if it has to return EINTR then >POOF< these bytes are gone 
forever.



From guido@python.org  Wed Sep  4 21:48:11 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 16:48:11 -0400
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Wed, 04 Sep 2002 13:45:30 PDT."
 <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com>
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com>
Message-ID: <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net>

[Jack]
> Hmm, and when I think of it I don't think it's even possible to restart 
> safely. What if I do a read() on a socket, and I request more bytes 
> than the available physical memory (but less than VM, of course)? The 
> kernel simply doesn't have anywhere to store the bytes other than my 
> buffer, and if it has to return EINTR then >POOF< these bytes are gone 
> forever.

I think that if any bytes have already been copied into your buffer,
you don't get an EINTR, you get a short read.
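
If reads can come back short as described, a caller that needs a fixed
number of bytes has to loop.  A minimal sketch in present-day Python 3; the
name read_exactly is hypothetical, and os.read() is the only library call
used.

```python
import os


def read_exactly(fd, count):
    """Read until `count` bytes have arrived or end of file is reached."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:
            break  # end of file: return what we have
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Bytes already delivered stay in chunks, so a short read caused by a signal
does not lose data.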

--Guido van Rossum (home page: http://www.python.org/~guido/)


From walter@livinglogic.de  Wed Sep  4 22:21:40 2002
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 04 Sep 2002 23:21:40 +0200
Subject: [Python-Dev] mimetypes patch #554192
References: <3D5BEBB8.7080904@livinglogic.de>	<15707.61612.844119.819432@anthem.wooz.org>	<3D5CE38D.9080905@livinglogic.de>	<m3d6sizl50.fsf@mira.informatik.hu-berlin.de>	<3D5F9C2D.8010209@livinglogic.de> <m3sn0tcsh8.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D767964.4090405@livinglogic.de>

Martin v. Loewis wrote:

> Walter Dörwald <walter@livinglogic.de> writes:
> 
> 
>>>>Even better would be, if we could assign priorities to the mappings,
>>>>so that for e.g. image/jpeg the preferred extension is .jpeg.
>>>>Then guess_type() and guess_extension() would return the preferred
>>>>mimetype/extension.
>>>
>>>Do you have a specific application for that in mind? It sounds like
>>>overkill.
>>
>>I'm using a web mirror script which uses the extensions from
>>guess_extension to save all downloaded resources, and I hate it
>>when the HTML files are named .htm and JPEG images are named .jpe.
> 
> Then this is your preference - others might prefer jpg, just because
> their file system can deal better with that. If you can agree that
> this is your preference, you should put the preference mechanism into
> the application.

Agreed, other applications might have other priorities.

> Maybe your preference can be expressed algorithmically? It might be
> that you always want the longest known extension (it is unlikely that
> you prefer "jpeg" over "jpg" just because that contains a vowel :-).

I guess it's "longest one" or "the one most unencumbered by filesystem
limitations".
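
The "longest one" rule is easy to keep in the application.  A small sketch
in present-day Python 3, assuming mimetypes.guess_all_extensions() as found
in the current standard library; longest_extension and preferred_extension
are hypothetical application-side helpers.

```python
import mimetypes


def longest_extension(candidates):
    """Pick the longest extension; ties are broken alphabetically."""
    if not candidates:
        return None
    return sorted(candidates, key=lambda ext: (-len(ext), ext))[0]


def preferred_extension(mime_type):
    """Apply the 'longest known extension' preference to a MIME type."""
    return longest_extension(mimetypes.guess_all_extensions(mime_type))
```

What preferred_extension() returns depends on the MIME tables installed on
the machine, so only the selection rule itself is fixed here.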

OK, so let's drop the priority idea. What do we do with the patch
as it is now?

Bye,
    Walter Dörwald



From pinard@iro.umontreal.ca  Wed Sep  4 22:21:44 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Wed, 04 Sep 2002 17:21:44 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated
 ideas)
In-Reply-To: <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net> (Guido
 van Rossum's message of "Wed, 04 Sep 2002 16:48:11 -0400")
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com>
 <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oq1y89jvrr.fsf@carouge.sram.qc.ca>

[Guido van Rossum]

> [Jack]
>> Hmm, and when I think of it I don't think it's even possible to restart 
>> safely. What if I do a read() on a socket, and I request more bytes 
>> than the available physical memory (but less than VM, of course)? The 
>> kernel simply doesn't have anywhere to store the bytes other than my 
>> buffer, and if it has to return EINTR then >POOF< these bytes are gone 
>> forever.
>
> I think that if any bytes have already been copied into your buffer,
> you don't get an EINTR, you get a short read.

I'm not fully familiar with all the details of this problem, it surely has
been in the air for quite a long time now (I might have first heard of it
while Taylor UUCP was being developed).  It might be dependent on the
underlying system.  If I'm not mistaken, it was Ian Taylor who introduced the
following Autoconf macro:


 - Macro: AC_SYS_RESTARTABLE_SYSCALLS
     If the system automatically restarts a system call that is
     interrupted by a signal, define `HAVE_RESTARTABLE_SYSCALLS'.


In GNU file utilities (now merged within the new GNU coreutils), Jim Meyering
uses restart wrappers for many I/O functions, so the idea of wrappers has been
maturing for a while, and is used in basic, heavily used programs.  However, I
have not looked at such wrappers recently.  Python could probably wrap calls when
these are restartable, or transmit the error upwards for systems where calls
are not restartable.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From python@rcn.com  Wed Sep  4 22:40:34 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 4 Sep 2002 17:40:34 -0400
Subject: [Python-Dev] Proposed Mixins for Wide Interfaces
References: <001101c2510d$9fce0920$5f66accf@othello>  <200209031750.g83HoVq05812@odiug.zope.com>
Message-ID: <001801c2545b$b43aba60$e8ea7ad1@othello>

[RH]
> > How about adding some mixins to simplify the
> > implementation of some of the fatter interfaces?

[GvR]
> Can you suggest implementations for these, to be absolutely clear what
> you mean?
 -- snip --
> What if the "natural" thing to implement is __le__ instead of __lt__?
> That's the case for sets.  Or __gt__ (less likely)?

Yes.  Here is some code
---------------------------
class CompareMixin:
    """
    Given an __eq__ method in a subclass, adds a __ne__ method
    Given __eq__ and __lt__, adds !=, <=, >, >=.
    If supplied, takes advantage of __le__ for speed.
    """

    def __eq__(self, other):
        raise NotImplementedError

    def __ne__(self, other):
        return not (self == other)

    def __lt__(self, other):
        raise NotImplementedError

    # The rich comparison hooks are spelled __le__ and __ge__.
    def __le__(self, other):
        return self < other or self == other

    def __gt__(self, other):
        return not (self <= other)

    def __ge__(self, other):
        return not (self < other)
        
## Example from sets
import mixins

class BaseSet(object, mixins.CompareMixin):
    """Common base class for mutable and immutable sets."""

    __slots__ = ['_data']

    # . . .

    def issubset(self, other):
        """Report whether another set contains this set."""
        self._binary_sanity_check(other)
        if len(self) > len(other):  # Fast check for obvious cases
            return False
        otherdata = other._data
        for elt in self:
            if elt not in otherdata:
                return False
        return True

    def __eq__(self, other):
        self._binary_sanity_check(other)
        return self._data == other._data

    def __lt__(self, other):
        self._binary_sanity_check(other)
        return len(self) < len(other) and self.issubset(other)

    __le__ = issubset   # optional, but recommended for speed.


# Example where gt is the most natural implementation
class Anyhoo(CompareMixin):
    __eq__ = someBigEqualityTest
    __gt__ = someBigComplexOrderingFunction
    def __lt__(self, other):
        return not(self>other or self==other)
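
For reference, a self-contained sketch of the same idea that runs on
present-day Python 3.  The Version class is a hypothetical example.  Note
that deriving > as "not <=" assumes a total order; for a partial order such
as subset it would give wrong answers, so a set class would still need its
own __gt__ and __ge__.  The standard library's functools.total_ordering
class decorator later filled a similar role.

```python
class CompareMixin:
    """Derive !=, <=, >, >= from __eq__ and __lt__ supplied by a subclass."""

    def __ne__(self, other):
        return not (self == other)

    def __le__(self, other):
        return self < other or self == other

    def __gt__(self, other):
        # Valid only for total orders.
        return not (self <= other)

    def __ge__(self, other):
        # Valid only for total orders.
        return not (self < other)


class Version(CompareMixin):
    """Hypothetical client class: supplies only __eq__ and __lt__."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        return self.key == other.key

    def __lt__(self, other):
        return self.key < other.key
```

With this, Version(1, 0) <= Version(1, 2) and the other derived operators
work without further code in Version.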



[RH]
> > class MappingMixin:
> >     """
> >     Given __setitem__, __getitem__,  and keys,
> >     implements values, items, update, get, setdefault, len,
> >     iterkeys, iteritems, itervalues, has_key, and __contains__.
> > 
> >     If __delitem__ is also supplied, implements clear, pop,
> >     and popitem.
> > 
> >     Takes advantage of __iter__ if supplied (recommended).

[GvR]
> Does that mean that if you have __iter__, you don't use keys()?  In
> that case it should implement keys() out of __iter__.  Maybe this
> should be required.

Not really.  keys() is always required.  If __iter__ is supplied,
then things like iterkeys(), iteritems(), and itervalues() get computed
from __iter__ rather than keys().  

My thought on using keys() as part of the minimum specification is
that database style interfaces always supply some type of list method.
For instance, shelve can be instantly widened with the mixin, no 
other coding is required.

OTOH, I'm not glued to the idea of using keys() as part of the minimum spec.
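
To make the proposal concrete, here is a partial sketch in present-day
Python 3, assuming the minimum spec of __getitem__, __setitem__ and keys().
The method bodies and the UpperCaseStore client are hypothetical, not the
code that was proposed.  The standard library's
collections.abc.MutableMapping later offered a comparable mixin built on a
slightly different minimum interface.

```python
class MappingMixin:
    """Widen a class that supplies __getitem__, __setitem__ and keys()."""

    def __iter__(self):
        # Default iteration goes through keys(); a subclass may override
        # __iter__ and everything below will use it instead.
        return iter(self.keys())

    def __len__(self):
        return len(self.keys())

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def setdefault(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            self[key] = default
            return default

    def values(self):
        return [self[key] for key in self]

    def items(self):
        return [(key, self[key]) for key in self]

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]


class UpperCaseStore(MappingMixin):
    """Hypothetical client: a mapping with case-insensitive string keys."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key.upper()]

    def __setitem__(self, key, value):
        self._data[key.upper()] = value

    def keys(self):
        return list(self._data)
```

UpperCaseStore writes three methods and gets get(), setdefault(), items(),
update(), len() and the "in" operator from the mixin.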

[RH]
> >     Takes advantage of __contains__ or has_key if supplied
> >     (recommended).
> >     """

[GvR]
> Let's standardize on __contains__, not has_key().  I guess you could
> provide __contains__ as follows:

Makes sense.

[RH]
> > The idea is to make it easier to implement these interfaces.
> > Also, if the interfaces get expanded, the clients automatically
> > updated.  

[GvR]
> A similar thing for sequences would be useful too, right?

Hmm, listing and concatenation beget repetition;
len() and __getitem__() beget slicing.
iteration and __cmp__ beget min(), max()

For mutable sequences, supplying __setitem__ begets
appending, extending, and slice assignment.

Supplying __delitem__ begets pop(), remove() and slice deletion.

For overachievers, the above are all that are needed
for sort(), reverse(), index(), insert(), and count()
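
A sketch of the read-only half of this, in present-day Python 3: given only
__len__ and __getitem__, a mixin can supply iteration, membership, index(),
count() and reversed().  SequenceMixin as written here and the Squares
client are hypothetical; collections.abc.Sequence in today's standard
library provides a similar set of derived methods.

```python
class SequenceMixin:
    """Derive read-only sequence helpers from __len__ and __getitem__."""

    def __iter__(self):
        for i in range(len(self)):
            yield self[i]

    def __contains__(self, value):
        return any(item == value for item in self)

    def __reversed__(self):
        for i in range(len(self) - 1, -1, -1):
            yield self[i]

    def index(self, value):
        for i, item in enumerate(self):
            if item == value:
                return i
        raise ValueError("value not in sequence")

    def count(self, value):
        return sum(1 for item in self if item == value)


class Squares(SequenceMixin):
    """Hypothetical client: the first n square numbers."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i
```

Because iteration is derived, the builtins min() and max() work on Squares
as well.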


Would you like me to create a mixin module
and put it in the sandbox?


Raymond Hettinger





From pinard@iro.umontreal.ca  Wed Sep  4 23:25:24 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Wed, 04 Sep 2002 18:25:24 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <005601c253d5$d0a63c50$ced241d5@hagrid> ("Fredrik Lundh"'s
 message of "Wed, 4 Sep 2002 07:41:42 +0200")
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <200209031653.g83GrjQ01929@odiug.zope.com>
 <oqlm6i6e8c.fsf@titan.progiciels-bpi.ca>
 <200209032018.g83KI3q08343@odiug.zope.com>
 <17d001c2538d$f82650f0$1c86db41@boostconsulting.com>
 <008201c253aa$780144d0$6300000a@holdenweb.com>
 <oqsn0qr0rj.fsf@titan.progiciels-bpi.ca>
 <005601c253d5$d0a63c50$ced241d5@hagrid>
Message-ID: <oqr8g9ie97.fsf@carouge.sram.qc.ca>

[Fredrik Lundh]

> [...] and this mailing list is about python.

Why did you reply to the mailing list, then? :-)
 
> François Pinard wrote:
>> [Steve Holden]

>> > It looks especially bad in my standard mailreader variable-pitch font.

>> [...] People often insert ASCII tables or simple explicative drawings,
>> these capabilities are useful enough for not being dismissed.  You should
>> use fixed width fonts [...]
>
> loser.
>
> if python really was all about "everything computers did when I learned to
> use them will always be the best way to do it", it would probably never have
> been invented.

Python did not build its success by trying to convince people that everyone else
is wrong.  It rather offered an environment in which participants happily
considered they were gaining a lot.

If someone breaks his screen appearance through selection of inappropriate
fonts, he might gain some pleasure indeed while losing the ability to read
many existing messages.  That's really his choice and preference, he has to
live with the drawbacks, without trying to convince senders that they are all
wrong.  Considering others as losers does not efficiently trigger progress.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From hu.peress@mail.mcgill.ca  Sat Sep  7 23:30:59 2002
From: hu.peress@mail.mcgill.ca (Hunter Peress)
Date: 07 Sep 2002 17:30:59 -0500
Subject: [Python-Dev] Call for clarity
Message-ID: <1031437860.636.29.camel@HillCountryPeress>

I've been using python for a good few months.

And I'm really bothered by some aspects of the documentation.

I think that there should be a clear effort to provide API style
information, rather than the mixed state that things currently are.

There are tools for C++/Java...that are part of the official
distributions that provide API style docs.

Here's what gets me: when you look up something in pydoc, you have no idea
what it returns/expects in terms of types. Now, since python is not an
explicitly typed language, I ask rhetorically, how can you have good docs
that tell you the return/input types without making the language
explicitly typed?

Make the documentation system explicitly typed.

The clarification needs to happen somewhere along the lines, and I
really think that the world would rather not have it happening at
runtime.

This could clear up a lot of confusion and further python's
effectiveness.

-Hunter.




From bkc@murkworks.com  Wed Sep  4 23:39:01 2002
From: bkc@murkworks.com (Brad Clements)
Date: Wed, 04 Sep 2002 18:39:01 -0400
Subject: [Python-Dev] Getting started with GBayes testing
Message-ID: <3D7653AD.14352.14F391B6@localhost>

Hi,

I'm interested in contributing to GBayes ..

I'm thinking of trying word stemming and adding other types of token indicators. How 
can I contribute?

Btw, I have been saving up my spam for a year or so.. I have about 31,238 spam 
messages saved up now. These are categorized as spam based on my reading of the 
subject, or examining the body when in doubt. There are probably 10% dups in the 
corpus. Some of them have viruses, likely klez.

I'd like to replicate Tim's test rig so I can compare my results with existing ones. My 
spam isn't in mbox format, but I can convert it.. 

I'm particularly interested in how to allow HTML-only messages (reduce false positives). 
I'm getting a lot of personal mail in that format, unfortunately.



Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From martin@v.loewis.de  Wed Sep  4 23:56:30 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 05 Sep 2002 00:56:30 +0200
Subject: [Python-Dev] Call for clarity
In-Reply-To: <1031437860.636.29.camel@HillCountryPeress>
References: <1031437860.636.29.camel@HillCountryPeress>
Message-ID: <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>

Hunter Peress <hu.peress@mail.mcgill.ca> writes:

> This could clear up a lot of confusion and further python's
> effectiveness.

It's not clear, to me, from reading your message, what kind of change
you are requesting (that you are requesting a change, rather than
offering help, or asking for advice, appears to be clear).

Could you kindly provide a small patch that gives an idea of what you
would like to see changed, and how?

TIA,
Martin



From drifty@bigfoot.com  Thu Sep  5 00:25:29 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Wed, 4 Sep 2002 16:25:29 -0700 (PDT)
Subject: [Python-Dev] Proposed Mixins for Wide Interfaces
In-Reply-To: <001801c2545b$b43aba60$e8ea7ad1@othello>
Message-ID: <Pine.SOL.4.44.0209041616430.18662-100000@death.OCF.Berkeley.EDU>

[Raymond Hettinger]

> [RH]
> > > How about adding some mixins to simplify the
> > > implementation of some of the fatter interfaces?
>

This is a spur-of-the-moment thought, so it might not be a reasonable
comment, but do we care that all of these methods will show up when
using dir() or any other introspective check?  While I think the idea is
great, it might give this sense that they are really, truly implemented
for the class instead of reliant on the other implementations; the side
effects of changing one of the required methods might have unexpected
consequences for the user.

But since I think this is a great idea, I don't want to see it disappear
because of this; I guess I better solve my issue.  =)  Perhaps we can just
make sure that this gets documented in both the API and in the doc strings
saying that it is from the mixin and what methods it is dependent upon.
That should be enough to squash my worry.

And yes, if this gets into the core and Raymond does not want to do it, I
will help with the doc patches.

-Brett C.



From hu.peress@mail.mcgill.ca  Sun Sep  8 00:47:43 2002
From: hu.peress@mail.mcgill.ca (Hunter Peress)
Date: 07 Sep 2002 18:47:43 -0500
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
In-Reply-To: <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>
References: <1031437860.636.29.camel@HillCountryPeress>
 <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>
Message-ID: <1031442464.644.68.camel@HillCountryPeress>

OK, here's some more detail.

I have no idea how pydoc works right now. I assume you call some program
on a Python file, and it simply looks for all """   """.

It seems to do SOME lexical/scoping analysis of where to look for """
""", and consequently, how to display that information in the final doc
form; but I'm asking for more.

As I said, Python methods/functions are not explicitly typed. So what I
propose is this:

When the pydoc generator comes across a function/method, there should
remain a normal """ """ area for any comments. What I'm asking for now is
that when the generator sees it's in a method/function, it does a NEW
check for a set of docs that document the type of each input argument
and of the output.

EG (theoretical, and off the top of my head):

in a file you have a function:

def something(a,b,c="lalal"):
   """This will find its way into the pydocs because it's a comment"""
   ##Here is the new stuff I'm proposing
   ##note, a clearer syntax can surely be devised.
   """file"""    #documents the type of the first arg
   """string"""  #              ""          second 
   """list"""    #              ""          third
   """string"""  #documents the return type.

Then the pydoc generator will check the number of arguments to the
func/meth and verify that the correct number of these new comments (which
only supply the type) is provided. I do think that it would help to
actually enforce this. I think it's fine that docs NOT be generated if
they don't supply this information. This provides for better docs and
shouldn't get that many complaints.
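
A minimal sketch of the check proposed above, written for a current
Python. It assumes a ":type NAME:" docstring line as the type marker;
that convention, the helper name missing_type_docs, and the example
function body are illustrative assumptions, not something pydoc does.

```python
import inspect


def missing_type_docs(func):
    """Return the parameter names that have no ':type NAME:' docstring line."""
    doc = inspect.getdoc(func) or ""
    documented = set()
    for line in doc.splitlines():
        line = line.strip()
        # A type line looks like ":type a: file" (assumed convention).
        if line.startswith(":type ") and ":" in line[6:]:
            documented.add(line[6:].split(":", 1)[0].strip())
    params = inspect.signature(func).parameters
    return [name for name in params if name not in documented]


def something(a, b, c="lalal"):
    """Hypothetical example function.

    :type a: file
    :type b: string
    :rtype: string
    """
    return b
```

Calling missing_type_docs(something) reports the undocumented
parameter c, which a strict generator could treat as an error.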

Then: If the docs are generated into web pages, links to the known types
that are checked are provided. And if the docs are going into shell
format then I don't know if links are necessary.

There are lots of cases and issues that I haven't discussed for this
proposed implementation. So I would like to continue this thread for the
purposes of detailing this idea further.

> > This could clear up a lot of confusion and further python's
> > effectiveness.
As we know, Python is not an explicitly typed language, but enforcing
some level of typing at the documentation level will see a lot of people
falling into line (depending on how rigidly it's enforced, and I do
suggest a pretty rigid level).


I have no patch ATM because I tend to design software before writing it,
and I'm looking for support from the developers first.

PS: what does TIA mean?



From python@rcn.com  Thu Sep  5 01:19:04 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 4 Sep 2002 20:19:04 -0400
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
References: <1031437860.636.29.camel@HillCountryPeress> <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de> <1031442464.644.68.camel@HillCountryPeress>
Message-ID: <003d01c25471$d83fe960$2fd8accf@othello>

From: "Hunter Peress" <hu.peress@mail.mcgill.ca>

> def something(a,b,c="lalal"):
>    """This will find its way into the pydocs because its a comment"""
>    ##Here is the new stuff Im proposing
>    ##note, a clearer sytnax can surely be devised.
>    """file"""    #documents the type of the first arg
>    """string"""  #              ""          second 
>    """list"""    #              ""          third
>    """string"""  #documents the return type.
> 
> Then the pydoc generator will do a check on the #  arguments to the
> func/meth, verify that the correct amount of these new comments (which
> only supply the type) are provided. I do think that it would help to
> actually enforce this. I think its fine that doc's NOT be generated if
> they don't supply this information. This provides for better docs and
> shouldnt get that many complaints. 

Thanks for the clarification. I see what you're trying to do;
however, I think that any gains are more than offset by the new
level of complexity and lengthier code.

The current docs make a pretty good effort at describing what is
needed for each argument.  At the same time, they allow flexibility
for dynamic arguments that share a similar interface (such as
substituting a StringIO object for a File object).

In your example, the docs strings could be made clear
using existing tools:

def something(file, promptstring, optionlist):
     """Returns a string extracted from the file
          for any line matching the promptstring.
          The optionlist can include any of the
          following:  IGNORECASE, VERBOSE,
          MULTILINE, or ADDLINENUMBER."""

I can't see that a tool like you described would add any
more clarity than the above docstring.

> PS whats TIA mean?

"Thanks In Advance"

Do you have any examples of current python docstrings that are
not clear enough?


Raymond Hettinger



From greg@cosc.canterbury.ac.nz  Thu Sep  5 01:30:40 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 05 Sep 2002 12:30:40 +1200 (NZST)
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <oq1y89jvrr.fsf@carouge.sram.qc.ca>
Message-ID: <200209050030.g850UeI2026648@kuku.cosc.canterbury.ac.nz>

pinard@iro.umontreal.ca:

> - Macro: AC_SYS_RESTARTABLE_SYSCALLS
>     If the system automatically restarts a system call that is
>     interrupted by a signal, define `HAVE_RESTARTABLE_SYSCALLS'.
> 
> Python might probably wrap calls when
> these are restartable, or transmit the error upwards for systems where calls
> are not restartable.

I think that macro means that you *don't* have to use a wrapper
to restart syscalls, because it happens automatically.
So if it's not defined it means you have to restart them
manually, not that they can't be restarted at all.
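
A minimal sketch of the manual restart described above, in Python. The
helper name retry_on_eintr and the flaky_read stand-in are hypothetical.
Current Python (3.5 and later, PEP 475) already retries most
interrupted system calls itself, so this only illustrates the idea.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func until it finishes without being interrupted by a signal."""
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as exc:
            # Only EINTR means "try again"; any other error is real.
            if exc.errno != errno.EINTR:
                raise


calls = []


def flaky_read():
    """Stand-in for a system call that is interrupted twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise OSError(errno.EINTR, "Interrupted system call")
    return b"data"
```

With these definitions, retry_on_eintr(flaky_read) swallows the two
simulated interruptions and returns the data from the third attempt.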

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Thu Sep  5 01:24:29 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 20:24:29 -0400
Subject: [Python-Dev] Getting started with GBayes testing
In-Reply-To: Your message of "Wed, 04 Sep 2002 18:39:01 EDT."
 <3D7653AD.14352.14F391B6@localhost>
References: <3D7653AD.14352.14F391B6@localhost>
Message-ID: <200209050024.g850OTd08824@pcp02138704pcs.reston01.va.comcast.net>

> I'm interested in contributing to GBayes ..
> 
> I'm thinking of trying word stemming and adding other types of token
> indicators. How can I contribute?

Pretty soon, a SF project will be created (Barry has already gotten
the request in).  We'll gladly add you to the list of developers.

> Btw, I have been saving up my spam for a year or so.. I have about
> 31,238 spam messages saved up now. These are categorized as spam
> based on my reading of the subject, or examining the body when in
> doubt. There are probably 10% dups in the corpus. Some of them have
> viruses, likely klez.

Cool.

> I'd like to replicate Tim's test rig so I can compare my results
> with existing ones. My spam isn't in mbox format, but I can convert
> it..

If you can't wait for the SF project, you can find all the code in the
Python CVS tree:

  http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/sandbox/spambayes/

> I'm particularly intersted in how to allow html only messages
> (reduce false positives).  I'm getting a lot of personal mail in
> that format, unfortunately.

You train it with an equal number of spam and non-spam ("ham") that
you received.  Just make sure the ham training messages contain enough
representatives of the html-only mail.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Thu Sep  5 01:36:18 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 05 Sep 2002 12:36:18 +1200 (NZST)
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com>
Message-ID: <200209050036.g850aIkU026656@kuku.cosc.canterbury.ac.nz>

Jack Jansen <Jack.Jansen@oratrix.com>:

> Aside from some unix variations that basically didn't do restart at all 
> there have always been problems with signal restart semantics. For 
> sockets and various devices (raw ttys, I think) you could definitely 
> lose data.

Sockets? Are you sure? I find it unlikely that such a severe
problem could persist in many Unix variants for so long. I've
never heard of any mention of such a thing.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Thu Sep  5 01:38:09 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 05 Sep 2002 12:38:09 +1200 (NZST)
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209042005.g84K5Ms08177@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209050038.g850c9mY026662@kuku.cosc.canterbury.ac.nz>

Guido van Rossum <guido@python.org>:

> I have never understood why a child dying should send a signal.  You
> can poll for the child with waitpid() instead.

Because child termination might not be the only thing
you want to wait for.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Thu Sep  5 01:32:21 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 04 Sep 2002 20:32:21 -0400
Subject: [Python-Dev] Proposed Mixins for Wide Interfaces
In-Reply-To: Your message of "Wed, 04 Sep 2002 16:25:29 PDT."
 <Pine.SOL.4.44.0209041616430.18662-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209041616430.18662-100000@death.OCF.Berkeley.EDU>
Message-ID: <200209050032.g850WLG08875@pcp02138704pcs.reston01.va.comcast.net>

> This is a spur-of-the-moment thought, so it might not be a
> reasonable comment, but do we care that all of these methods will
> show up in when using dir() or any other introspective check?  While
> I think the idea is great, it might give this sense that they are
> really, truly implemented for the class instead of reliant on the
> other implementations; the side effects of changing one of the
> required methods might have unexpected consequences for the user.

dir() *intends* to show methods regardless of whether they are
implemented in the class or in a base class.  So this doesn't sound
like a valid objection.  Pydoc shows inherited methods separately.
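
A small sketch of the distinction, assuming a hypothetical mixin
(ReadOnlyMappingMixin and Config are made-up names): dir() lists the
inherited methods, while vars() on the class shows only what that
class defines itself.

```python
class ReadOnlyMappingMixin:
    """Hypothetical mixin: derives extra methods from __getitem__ and keys()."""

    def items(self):
        return [(k, self[k]) for k in self.keys()]

    def values(self):
        return [self[k] for k in self.keys()]


class Config(ReadOnlyMappingMixin):
    """Supplies only the two required methods; the mixin adds the rest."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def keys(self):
        return list(self._data)


cfg = Config({"debug": True})
shown_by_dir = "items" in dir(cfg)                        # inherited, so listed
defined_on_config = "items" in vars(Config)               # not defined here
defined_on_mixin = "items" in vars(ReadOnlyMappingMixin)  # defined here
```

A documentation tool that wants to say where a method comes from can
use the class __dict__ (via vars()) in this way.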

--Guido van Rossum (home page: http://www.python.org/~guido/)


From whisper@oz.net  Thu Sep  5 01:48:07 2002
From: whisper@oz.net (David LeBlanc)
Date: Wed, 4 Sep 2002 17:48:07 -0700
Subject: [Python-Dev] Getting started with GBayes testing
In-Reply-To: <200209050024.g850OTd08824@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <GCEDKONBLEFPPADDJCOEEENPEMAA.whisper@oz.net>

I would like to be in on that project too please.

David LeBlanc
Seattle, WA USA 



From barry@python.org  Thu Sep  5 01:48:48 2002
From: barry@python.org (Barry A. Warsaw)
Date: Wed, 4 Sep 2002 20:48:48 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/nondist/sandbox/spambayes classifier.py,1.8,1.9
References: <E17mk6R-0000P9-00@usw-pr-cvs1.sourceforge.net>
 <004301c25472$78f62f40$2fd8accf@othello>
Message-ID: <15734.43504.641800.957590@anthem.wooz.org>

>>>>> "RH" == Raymond Hettinger <python@rcn.com> writes:

    >> A now-rare pure win, changing spamprob() to work harder to find
    >> more evidence when competing 0.01 and 0.99 clues appear

    RH> I hope these victories make it back to the world outside of
    RH> Python (assuming there is one).  The world needs good spam
    RH> filters.

Indeed, I too hope they will.  I just got approved for a SF project
called "spambayes" and plan to move the code there.  I'll try to
coordinate that with Tim, and then make a more detailed announcement
tomorrow.

-Barry


From tim.one@comcast.net  Thu Sep  5 01:52:14 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 04 Sep 2002 20:52:14 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEOFBBAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECIBCAB.tim.one@comcast.net>

[Tim]
> ...
> The first 16 most extreme indicators are split 9 highly in favor of ham
> (.01) and 7 highly in favor of spam (.99).  If I hadn't folded
> case away to let stinking conference announcements through <wink>, I
> expect it would have latched on to the SCREAMING at the start instead of
> looking deeper.  Looking at the To: line probably would nail this one too,
> as "Undisclosed Recipients" has two 0.99 spam indicators right there.
>
> Whatever, you *don't* want to look at msgs with a mix of just
> 0.99 and 0.01 thingies:  it's not all that unusual to get such an
> extreme mix, in spam or ham.

I should have added that it usually gets the right result when this happens.
It's the exceptions to that rule that are mondo embarrassing, because it's
making a mistake then while sitting on a mountain of strong evidence (albeit
pointing as extremely as possible in both directions at once <wink>).

"A problem" is that when a MIN_SPAMPROB and MAX_SPAMPROB clue both appear,
the math is such that they cancel out exactly.  It's *almost* as if neither
existed, but not quite:  they also keep two lower-probability words *out* of
the computation (only a grand total of the MAX_DISCRIMINATORS most extreme
clues are retained).
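
A minimal sketch of that cancellation. It assumes the combining rule
from Paul Graham's "A Plan for Spam", P = prod(p) / (prod(p) +
prod(1 - p)); the function name combine is illustrative and is not the
classifier's actual code.

```python
def combine(probs):
    """Graham-style combination of per-clue spam probabilities."""
    spam = 1.0
    ham = 1.0
    for p in probs:
        spam *= p          # evidence for spam
        ham *= 1.0 - p     # evidence for ham
    return spam / (spam + ham)


# A 0.01 clue and a 0.99 clue contribute the same factor (0.0099) to both
# products, so together they leave the result where it was.
neutral = combine([0.01, 0.99])
with_extremes = combine([0.01, 0.99, 0.8, 0.3])
without_extremes = combine([0.8, 0.3])
```

Up to floating-point rounding, neutral is 0.5 and the last two values
agree, yet the cancelling pair still occupies two of the limited clue
slots.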

So I changed spamprob() to keep accepting more clues when MIN/MAX
cancellations are inevitable, and to use the best of those in lieu of the
cancelling extremes.  This turned out to be a pure win:

false positive percentages
    0.000  0.000  tied
    0.000  0.000  tied
    0.050  0.050  tied
    0.000  0.000  tied
    0.025  0.025  tied
    0.025  0.025  tied
    0.050  0.050  tied
    0.025  0.025  tied
    0.025  0.025  tied
    0.025  0.025  tied
    0.075  0.075  tied
    0.025  0.025  tied
    0.025  0.025  tied
    0.025  0.025  tied
    0.075  0.025  won
    0.025  0.025  tied
    0.025  0.025  tied
    0.000  0.000  tied
    0.025  0.025  tied
    0.050  0.050  tied

won   1 times
tied 19 times
lost  0 times

total unique fp went from 9 to 7

false negative percentages
    0.909  0.764  won
    0.800  0.691  won
    1.091  0.981  won
    1.381  1.309  won
    1.491  1.418  won
    1.055  0.873  won
    0.945  0.800  won
    1.236  1.163  won
    1.564  1.491  won
    1.200  1.200  tied
    1.454  1.381  won
    1.599  1.454  won
    1.236  1.164  won
    0.800  0.655  won
    0.836  0.655  won
    1.236  1.163  won
    1.236  1.200  won
    1.055  0.982  won
    1.127  0.982  won
    1.381  1.236  won

won  19 times
tied  1 times
lost  0 times

total unique fn went from 284 to 260
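
The won/tied/lost tallies above can be recomputed from the two columns.
A minimal sketch using the false positive percentages listed in this
message (the function name tally is illustrative; lower is better):

```python
def tally(before, after):
    """Count runs where the new rate is lower (won), equal (tied), or higher (lost)."""
    won = sum(1 for b, a in zip(before, after) if a < b)
    lost = sum(1 for b, a in zip(before, after) if a > b)
    tied = len(before) - won - lost
    return won, tied, lost


fp_before = [0.000, 0.000, 0.050, 0.000, 0.025, 0.025, 0.050, 0.025, 0.025, 0.025,
             0.075, 0.025, 0.025, 0.025, 0.075, 0.025, 0.025, 0.000, 0.025, 0.050]
fp_after = [0.000, 0.000, 0.050, 0.000, 0.025, 0.025, 0.050, 0.025, 0.025, 0.025,
            0.075, 0.025, 0.025, 0.025, 0.025, 0.025, 0.025, 0.000, 0.025, 0.050]

fp_result = tally(fp_before, fp_after)
```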



From sholden@holdenweb.com  Thu Sep  5 01:55:59 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Wed, 4 Sep 2002 20:55:59 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU><200209031653.g83GrjQ01929@odiug.zope.com><oqlm6i6e8c.fsf@titan.progiciels-bpi.ca><200209032018.g83KI3q08343@odiug.zope.com><17d001c2538d$f82650f0$1c86db41@boostconsulting.com><008201c253aa$780144d0$6300000a@holdenweb.com><oqsn0qr0rj.fsf@titan.progiciels-bpi.ca><005601c253d5$d0a63c50$ced241d5@hagrid> <oqr8g9ie97.fsf@carouge.sram.qc.ca>
Message-ID: <006901c25477$01b9cb30$6300000a@holdenweb.com>

[François Pinard]
> [Fredrik Lundh]
>
> > [...] and this mailing list is about python.
>
> Why did you reply to the mailing list, then? :-)
>
The effbot is a law unto itself :-)

> > François Pinard wrote:
> >> [Steve Holden]
>
> >> > It looks especially bad in my standard mailreader variable-pitch font.
>
[...]
>
> Python did not build its success by trying to convince people that every
> else is wrong.  It rather offered an environment in which participants happily
> considered they were gaining a lot.
>
erm, ...

> If someone breaks its screen appearance through selection of inappropriate
> fonts, he might gain some pleasure indeed while loosing the ability to read
> many existing messages.  That's really his choice and preferences, he has to
> live with the drawbacks, without trying to convince senders that they are all
> wrong.  Considering others as losers does not efficiently trigger progress.
>
I don't really consider """It looks especially bad in my standard mailreader
variable-pitch font""" to be sufficiently evangelical to deserve this
rebuke, but then I didn't really consider your rebuke deserved either, so I
guess we should just terminate this thread now.

regards
-----------------------------------------------------------------------
Steve Holden                                  http://www.holdenweb.com/
Python Web Programming                        pydish.holdenweb.com/pwp/
Previous .sig file retired to                    www.homeforoldsigs.com
-----------------------------------------------------------------------




From oren-py-d@hishome.net  Thu Sep  5 06:27:37 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 5 Sep 2002 08:27:37 +0300
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Sep 04, 2002 at 04:48:11PM -0400
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com> <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020905082737.A31267@hishome.net>

On Wed, Sep 04, 2002 at 04:48:11PM -0400, Guido van Rossum wrote:
> [Jack]
> > Hmm, and when I think of it I don't think it's even possible to restart 
> > safely. What if I do a read() on a socket, and I request more bytes 
> > than the available physical memory (but less than VM, of course)? The 
> > kernel simply doesn't have anywhere to store the bytes other than my 
> > buffer, and if it has to return EINTR then >POOF< these bytes are gone 
> > forever.
> 
> I think that if any bytes have already been copied into your buffer,
> you don't get an EINTR, you get a short read.

From read(2) man page:

   EINTR  The call was interrupted by a signal before any data was read.

Same applies to write, recv, fcntl with locks, semop, etc. They're all 
designed to be restartable. The keyword in all cases is "before".
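
Since an interrupted call that has already moved data returns a short
count rather than EINTR, the caller's side of the contract is a loop
over short reads. A minimal sketch (read_exactly is a hypothetical
helper name, not a standard library function):

```python
import os


def read_exactly(fd, n):
    """Read up to n bytes from fd, looping over short reads until EOF."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:              # empty result means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

No bytes are lost this way: each call either transfers some data and
reports how much, or transfers nothing and can simply be repeated.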

	Oren



From goodger@users.sourceforge.net  Thu Sep  5 03:44:23 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 04 Sep 2002 22:44:23 -0400
Subject: [Python-Dev] Misc/NEWS (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209041151.g84BpAg05683@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <B99C3D46.28677%goodger@users.sourceforge.net>

[Skip]
>> While adding a blurb to Misc/NEWS about the change to the thread
>> ticker and check interval, it occurred to me that perhaps Misc/NEWS
>> would benefit from conversion to ReST format.  You could pump an
>> HTML version out to the website periodically.

I have the Docutils site auto-regenerated via a small cron script.
Any time any of the source text files change, within an hour the site
reflects the change.  It makes site maintenance easy.

(BTW, Skip, thanks for the bug report.  I'll be looking into it ASAP.)

[Guido]
> Nice idea.  How much additional mark-up would this add to quote the
> occasional reST meta-character?

Very little, depending on the desired effect.  The extreme case would
be if you want to mark up everything possible.  The result may look
too busy in the source text form though, especially because there are
so many Python identifiers, expressions, code snippets, and file names
that *could* be marked up.  It's a trade-off.

The nice thing is that Misc/NEWS is already almost valid
reStructuredText (which shouldn't be surprising, since
reStructuredText is based on common usage).  In fact, most (if not
all) of the standalone text files are almost there: README, PLAN.txt,
etc.  It wouldn't be much work to bring them up to spec.

Here are the areas of Misc/NEWS that would require editing:

* Sections: The two-line titles aren't supported.  Either they should
  be combined into one line, or the "Release date" line should become
  part of the section body.  Either::

      What's New in Python 2.2 final?  Release date: 21-Dec-2001
      ==========================================================

  or::

      What's New in Python 2.2 final?
      ===============================

      Release date: 21-Dec-2001

* Subsections (like "Core and builtins", "Library", "Extension
  modules", etc.): These could be made into true subsections by
  underlining them with dashes (and changing to title case)::

      Core and Builtins
      -----------------

  I notice that there are many headers for empty subsections (such as
  "Tools/Demos" and "Build" in "What's New in Python 2.2 final?").
  Should they be removed?

* Inline literals (filenames, identifiers, expressions and code
  snippets): Surround with double-backquotes to get monospaced,
  uninterpreted text (like HTML TT tags).  There are so many of these
  that it may be best to be selective.

* Literal blocks: Example code should be indented and prefaced with
  double-colons ("::" at the end of the preceding paragraph).  Doctest
  blocks (interactive sessions, begin with ">>> " and end with a blank
  line) don't need this, although it wouldn't hurt.
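
The title rules above are mechanical enough to script. A minimal
sketch, assuming only that a reST section underline must be at least
as long as its title text (rest_section is a hypothetical helper, not
part of Docutils):

```python
def rest_section(title, underline="="):
    """Return a one-line reST section title followed by a matching underline."""
    return f"{title}\n{underline * len(title)}"


section = rest_section("What's New in Python 2.2 final?")
subsection = rest_section("Core and Builtins", "-")
```

Here section uses "=" for the top-level title and subsection uses "-",
matching the two levels shown in the examples above.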

> Can you convert a section for test and show me?

I'll be happy to help.  Hmm.  Looking at the 2.2.1 Misc/NEWS file, I
see sections for 2.2.1 final, 2.2.1c2, etc., but they're missing from
the CVS Misc/NEWS file.  Is this normal because of separate development
branches or is something amiss?

Following is a converted section from the current Misc/NEWS.

Minimally marked up:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

What's New in Python 2.3 alpha 1?
=================================

XXX Release date: DD-MMM-2002 XXX

Type/class unification and new-style classes
--------------------------------------------

- Assignment to __class__ is disallowed if either the old or the new
  class is a statically allocated type object (such as defined by an
  extension module).  This prevents anomalies like ``2 .__class__ =
  bool``.

- New-style object creation and deallocation have been sped up
  significantly; they are now faster than classic instance creation
  and deallocation.

- The __slots__ variable can now mention "private" names, and the
  right thing will happen (e.g. ``__slots__ = ["__foo"]``).

- The built-ins slice() and buffer() are now callable types.  The
  types classobj (formerly class), code, function, instance, and
  instancemethod (formerly instance-method), which have no built-in
  names but are accessible through the types module, are now also
  callable.  The type dict-proxy is renamed to dictproxy.

- Cycles going through the __class__ link of a new-style instance are
  now detected by the garbage collector.

- Classes using __slots__ are now properly garbage collected.
  [SF bug 519621]

- Tightened the __slots__ rules: a slot name must be a valid Python
  identifier.

- The constructor for the module type now requires a name argument and
  takes an optional docstring argument.  Previously, this constructor
  ignored its arguments.  As a consequence, deriving a class from a
  module (not from the module type) is now illegal; previously this
  created an unnamed module, just like invoking the module type did.
  [SF bug 563060]

- A new type object, 'basestring', is added.  This is a common base
  type for 'str' and 'unicode', and can be used instead of
  ``types.StringTypes``, e.g. to test whether something is "a string":
  ``isinstance(x, basestring)`` is True for Unicode and 8-bit strings.
  This is an abstract base class and cannot be instantiated directly.

- Changed new-style class instantiation so that when C's __new__
  method returns something that's not a C instance, its __init__ is
  not called.  [SF bug #537450]

- Fixed super() to work correctly with class methods.  [SF bug #535444]

- If you try to pickle an instance of a class that has __slots__ but
  doesn't define or override __getstate__, a TypeError is now raised.
  This is done by adding a bozo __getstate__ to the class that always
  raises TypeError.  (Before, this would appear to be pickled, but the
  state of the slots would be lost.)

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Maximally marked up:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

What's New in Python 2.3 alpha 1?
=================================

XXX Release date: DD-MMM-2002 XXX

Type/class unification and new-style classes
--------------------------------------------

- Assignment to ``__class__`` is disallowed if either the old or the
  new class is a statically allocated type object (such as defined by
  an extension module).  This prevents anomalies like ``2 .__class__ =
  bool``.

- New-style object creation and deallocation have been sped up
  significantly; they are now faster than classic instance creation
  and deallocation.

- The ``__slots__`` variable can now mention "private" names, and the
  right thing will happen (e.g. ``__slots__ = ["__foo"]``).

- The built-ins ``slice()`` and ``buffer()`` are now callable types.
  The types classobj (formerly class), code, function, instance, and
  instancemethod (formerly instance-method), which have no built-in
  names but are accessible through the ``types`` module, are now also
  callable.  The type dict-proxy is renamed to dictproxy.

- Cycles going through the ``__class__`` link of a new-style instance
  are now detected by the garbage collector.

- Classes using ``__slots__`` are now properly garbage collected.
  [SF bug 519621]

- Tightened the ``__slots__`` rules: a slot name must be a valid
  Python identifier.

- The constructor for the module type now requires a name argument and
  takes an optional docstring argument.  Previously, this constructor
  ignored its arguments.  As a consequence, deriving a class from a
  module (not from the module type) is now illegal; previously this
  created an unnamed module, just like invoking the module type did.
  [SF bug 563060]

- A new type object, ``basestring``, is added.  This is a common base
  type for ``str`` and ``unicode``, and can be used instead of
  ``types.StringTypes``, e.g. to test whether something is "a string":
  ``isinstance(x, basestring)`` is ``True`` for Unicode and 8-bit
  strings.  This is an abstract base class and cannot be instantiated
  directly.

- Changed new-style class instantiation so that when C's ``__new__``
  method returns something that's not a C instance, its ``__init__``
  is not called.  [SF bug #537450]

- Fixed ``super()`` to work correctly with class methods.  [SF bug #535444]

- If you try to pickle an instance of a class that has ``__slots__``
  but doesn't define or override ``__getstate__``, a ``TypeError`` is
  now raised.  This is done by adding a bozo ``__getstate__`` to the
  class that always raises ``TypeError``.  (Before, this would appear
  to be pickled, but the state of the slots would be lost.)

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/



From mwh@python.net  Thu Sep  5 10:34:34 2002
From: mwh@python.net (Michael Hudson)
Date: 05 Sep 2002 10:34:34 +0100
Subject: [Python-Dev] Please help in  calling python fucntion from 'c'
In-Reply-To: "Praveen Patil"'s message of "Wed, 4 Sep 2002 13:31:00 +0100"
References: <NFBBLJFBNMKMLGNLJMFCGELNCCAA.praveen.patil@silver-software.com>
Message-ID: <2m8z2gok45.fsf@starship.python.net>

"Praveen Patil" <praveen.patil@silver-software.com> writes:

> Hi,
> 
> I have written a 'C' dll (MY_DLL.DLL). I am importing the 'C' dll in a
> Python file (example.py).
> I want to call a Python function from a 'C' function.
> For your reference I have attached the 'C' and Python files to this mail.
> In my pc:
> python code is under the directory D:\test\example.py
> dll is under the directory C:\Program Files\Python\DLLs\MY_DLL.pyd
> 
> Here are the steps I am following.
> 
> step(1): I am calling the 'C' function (RECEIVE_FROM_IL_S) from Python.
>          This 'C' function exists in the imported dll (MY_DLL).
> step(2): I want to call the Python function (TestFunction) from the 'C'
> function (RECEIVE_FROM_IL_S).
> 
> 
> Python code is(example.py)  :-
> ----------------------------
> import MY_DLL
> 
> G_Logfile          = None
> 
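> def TestFunction():
>     global G_Logfile  # rebind the module-level name, not a local
>     G_Logfile = open('Pytestfile.txt', 'w')
>     G_Logfile.write("%s \n"%'I am writing python created text file')
>     G_Logfile.close()  # the call parentheses were missing
>     G_Logfile = None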
> #end def TestFunction
> 
> if __name__ == "__main__":
> 
>    MY_DLL.RECEIVE_FROM_IL_S(10,50)
> 
> 
> 'C' code is (MY_DLL.c) :-
> ---------------------
> #include <windows.h>
> #include <stdio.h>
> #include <Python.h>
> 
> PyObject* _wrap_RECEIVE_FROM_IL_S(PyObject *self, PyObject *args)
> {
>     FILE* fp;
>     PyObject* _resultobj;
>     int i,j;
> 
>     if( !(PyArg_ParseTuple(args, "ii",&i,&j)))
>     {
>        return NULL;
>     }
>     fp= fopen("RECEIVE_IL_S.txt", "w");
>     fprintf(fp, "i=%d   j=%d" , i,j);
>     fclose(fp);
> 
>     /* Here I want to call the Python function (TestFunction).
>        Please suggest a solution. */
> 
>     _resultobj = Py_None;
>     Py_INCREF(_resultobj);  /* the caller receives a new reference */
>     return _resultobj;
> }
> 
> 
> static PyMethodDef MY_DLL_methods[] = {
>       { "RECEIVE_FROM_IL_S", _wrap_RECEIVE_FROM_IL_S, METH_VARARGS },
>       { NULL , NULL}
>       };
> 
> __declspec(dllexport) void __cdecl initMY_DLL(void)
>   {
>     Py_InitModule("MY_DLL",MY_DLL_methods);
>   }
> 
> 
> Could anybody please help me solve this problem?
> 
> 
> Cheers,
> 
> Praveen.
> 

-- 


From mwh@python.net  Thu Sep  5 10:33:04 2002
From: mwh@python.net (Michael Hudson)
Date: 05 Sep 2002 10:33:04 +0100
Subject: [Python-Dev] Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Oren Tirosh's message of "Wed, 4 Sep 2002 08:46:46 -0400"
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net> <20020904124646.GA79746@hishome.net>
Message-ID: <2mbs7cok6n.fsf@starship.python.net>

Oren Tirosh <oren-py-d@hishome.net> writes:

> Any other problems I should be aware of?

Wildly unpredictable x-platform behaviour in the presence of threads.

M.

-- 
  Indeed, when I design my killer language, the identifiers "foo" and
  "bar" will be reserved words, never used, and not even mentioned in
  the reference manual. Any program using one will simply dump core
  without comment. Multitudes will rejoice. -- Tim Peters, 29 Apr 1998


From mgilfix@eecs.tufts.edu  Thu Sep  5 05:45:03 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Thu, 5 Sep 2002 00:45:03 -0400
Subject: [apug] Re: [Python-Dev] Call for clarity ( clarification ;-) )
In-Reply-To: <1031451760.644.97.camel@HillCountryPeress>; from hu.peress@mail.mcgill.ca on Sat, Sep 07, 2002 at 09:22:40PM -0500
References: <1031437860.636.29.camel@HillCountryPeress> <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de> <1031442464.644.68.camel@HillCountryPeress> <003d01c25471$d83fe960$2fd8accf@othello> <1031451760.644.97.camel@HillCountryPeress>
Message-ID: <20020905004503.A9680@eecs.tufts.edu>

  While I understand what you're trying to do here (and think it would
be quite nice), I'm not sure how you're going to accomplish it.  How
will parsing python using a syntax-tree help? It's not going to tell
you what the function does in all cases or the various types it could
handle. Perhaps you could make educated guesses by looking at the
types of operations on the objects (a 'has_key' is a sure indicator of
a hash), but that would be sketchy at best.

  For a ready example, imagine having a module that contains useful
helper functions. How are you going to identify the type requirements
of those functions if you don't have context? How can you be sure that
you've covered all contexts (including conversions)?

  Such is the nature of dynamic languages. It's very hard to do
what you'd like to do here.

                       -- Mike

On Sat, Sep 07 @ 21:22, Hunter Peress wrote:
> I think it's easier to enforce this from the level I describe than to have
> Guido saying "ok guys please be more explicit in your documentation". I
> mean, both of those documents above are somewhat explicit, but they are
> not COMPLETE.
> 
> Could you provide me with some links on parsing Python (from a
> compilation/syntax-tree analysis POV), so that I can get to work on
> writing a patch for the pydoc generation program?

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html


From hu.peress@mail.mcgill.ca  Sun Sep  8 08:11:32 2002
From: hu.peress@mail.mcgill.ca (Hunter Peress)
Date: 08 Sep 2002 02:11:32 -0500
Subject: [apug] Re: [Python-Dev] Call for clarity ( clarification ;-) )
In-Reply-To: <20020905004503.A9680@eecs.tufts.edu>
References: <1031437860.636.29.camel@HillCountryPeress>
 <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>
 <1031442464.644.68.camel@HillCountryPeress>
 <003d01c25471$d83fe960$2fd8accf@othello>
 <1031451760.644.97.camel@HillCountryPeress>
 <20020905004503.A9680@eecs.tufts.edu>
Message-ID: <1031469093.644.196.camel@HillCountryPeress>

Actually, all of the thinking I did WAS taking into account the "dynamic"
nature of Python.

But it's not as though the actual code is being rewritten fast enough to
make this infeasible or unnecessary.

I'm glad to get all of this feedback, as it's helping me formulate and
further specify my plans (or eventually healthily debunk them, as the
past 3 responders have helped do).

Instead of just thinking:

"arguments are not explicitly anything, therefore it makes no sense to
even attempt to document them explicitly".

I think this: simply add the capability for multiple definitions for
each argument. E.g., going back to my original sample, here is an updated
version:

def something(a,b,c="lalal"): 
   """This will find its way into the pydocs because its a comment""" 
   ##Here is the new stuff I'm proposing
   ##note, a clearer syntax can surely be devised.
   """file,socket"""  #documents the type(s) of the first arg 
   """string,list"""  #              ""             second 
   """list,hash"""    #              ""             third 
   """string,hash"""  #documents the return type(s). 

That's quite a simple solution, and it still provides far better
exactness and clarity than the current system allows.

Onto more of your concerns: 
On Wed, 2002-09-04 at 23:45, Michael Gilfix wrote: 
>   While I understand what you're trying to do here (and think it would
> be quite nice), I'm not sure how you're going to accomplish it.  How
> will parsing python using a syntax-tree help? It's not going to tell
> you what the function does in all cases or the various types it could
> handle. Perhaps you could make educated guesses by looking at the
> types of operations on the objects (a 'has_key' is a sure indicator of
> a hash), but that would be sketchy at best.
Actually, I wasn't suggesting intelligent guesses AT ALL, and for now
this proposal leans away from them.

Rather there are only 2 simple things that I wanted to obtain from the
parse-tree: the number of arguments, and if possible to see if there 

Assume for now that my whole proposal will simply be another option
(instead of the default) to the pydoc-generator program. If invoked, it
will fail (if the super strict option is specified) if you don't supply
a definition for each arg of a given method.
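
A minimal sketch of that strict check, under stated assumptions: it uses the standard `ast` module of current Python 3 (not the `parser` module mentioned in this thread), and the names `describe` and `MissingTypeDocs` are hypothetical, invented only for illustration. It counts a function's arguments and expects one extra string literal per argument, plus one for the return type(s), right after the docstring.

```python
import ast

SOURCE = '''
def something(a, b, c="lalal"):
    """Summary line that ends up in the pydocs."""
    """file,socket"""
    """string,list"""
    """list,hash"""
    """string,hash"""
'''


class MissingTypeDocs(Exception):
    """Hypothetical error raised in strict mode when definitions are missing."""


def describe(func, strict=False):
    """Pair each argument of an ast.FunctionDef with its type string."""
    names = [arg.arg for arg in func.args.args]
    # Collect the leading run of bare string statements in the body.
    strings = []
    for stmt in func.body:
        if (isinstance(stmt, ast.Expr)
                and isinstance(stmt.value, ast.Constant)
                and isinstance(stmt.value.value, str)):
            strings.append(stmt.value.value)
        else:
            break
    docstring = strings[0] if strings else ""
    type_docs = strings[1:]
    # One definition per argument, plus one for the return type(s).
    if len(type_docs) != len(names) + 1:
        if strict:
            raise MissingTypeDocs(func.name)
        return None
    return {
        "name": func.name,
        "doc": docstring,
        "args": list(zip(names, type_docs[:-1])),
        "returns": type_docs[-1],
    }


info = describe(ast.parse(SOURCE).body[0], strict=True)
```

Without the strict flag the sketch simply returns `None` for an under-documented function, which matches the idea of this being an opt-in check rather than the default.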

This brings up your "dynamic" language issue again. 

When you have lots of args being used as different things, my program
then introduces another level of complexity to deciphering the docs in a
meaningful way.

Eg: a sample output of this program based on my example: 

------------output-----------------
  method: something(a (file,socket),b (string,list),c="lalal" (list,hash))
  return type :string,hash 
  
  This will find its way into the pydocs because its a comment 
-----------------------------------
  
  Now in HTML format it would be even nicer, as there will be links to
the types listed.

And now looking at it, I think it's much clearer than nothing at all.

Of course there is going to be the type of code where you have no need to
document every method because their names are self-explanatory and
such explicit documentation isn't necessary; that's not what this is
really intended for.

If the specific argument arises that "since Python is a dynamic language
your approach doesn't make sense", then I have to respond:

an attempt at specifying things is FAR better than nothing, and
moreover, this is only my first attempt. Allowing it to become a part of
the generator as an option will open it up to user input, and hence
improvement, AND!

***
it might just turn out that a "dynamic" approach will be necessary to
document a "dynamic" language. 
***

So I'm still looking for more design tips, and a place where I could find
out how to get into the meat of the Python parser, but I think
"http://python.org/doc/2.2/lib/module-parser.html" is probably what I'll
be using.



From mal@egenix.com  Thu Sep  5 10:14:06 2002
From: mal@egenix.com (M.-A. Lemburg)
Date: Thu, 05 Sep 2002 11:14:06 +0200
Subject: [Python-Dev] utf8 issue
References: <200208232105.g7NL5RE16863@pcp02138704pcs.reston01.va.comcast.net>              <2mznv9c1k4.fsf@starship.python.net> <200208261405.g7QE5Of05199@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D77205E.8080103@lemburg.com>

Guido van Rossum wrote:
>>Guido van Rossum <guido@python.org> writes:
>>
>>
>>>This might belong on SF, except it's already been solved in Python
>>>2.3, and I need guidance about what to do for Python 2.2.2.
>>>
>>>In 2.2.1, a lone surrogate encoded into utf8 gives a utf8 string that
>>>cannot be decoded back.  In 2.3, this is fixed.  Should this be fixed
>>>in 2.2.2 as well?
>>
>>I think this was discussed really quite a long time ago, like six
>>months or so.
>>
>>
>>>I'm asking because it caused problems with reading .pyc files: if
>>>there's a Unicode literal containing a lone surrogate, reading the
>>>.pyc file causes an exception:
>>>
>>>UnicodeError: UTF-8 decoding error: unexpected code byte
>>>
>>>It looks like revision 2.128 fixed this for 2.3, but that patch
>>>doesn't cleanly apply to the 2.2 maintenance branch.  Can someone
>>>help?
>>
>>I think the reason this didn't get fixed in 2.2.1 is that it
>>necessitates bumping MAGIC.
>>
>>I can probably dig up more references if you want.
> 
> 
> Please do.  Bumping MAGIC is a no-no between dot releases.  But I
> don't understand why that is necessary?

It would be necessary since marshal uses UTF-8 for storing
Unicode literals. Even though it's highly unlikely that the
problem cases are used in Python Unicode literals, there's
a tiny chance. Without the MAGIC change this could result
in PYC files failing to load.
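
For illustration only, here is how the lone-surrogate round trip looks on a current Python 3 interpreter, which is an assumption: its codecs and `marshal` behave differently from the 2.2 and 2.3 code under discussion. The strict UTF-8 codec refuses a lone surrogate, the `surrogatepass` error handler lets it through in both directions, and `marshal` round-trips the string.

```python
import marshal

lone = "\ud800"  # a lone high surrogate, not part of a valid pair


def strict_utf8_fails(text):
    """Return True if the strict UTF-8 codec rejects the text."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


# The "surrogatepass" handler encodes and decodes the surrogate as-is.
encoded = lone.encode("utf-8", "surrogatepass")
round_tripped = encoded.decode("utf-8", "surrogatepass")

# marshal is the serializer behind .pyc files.
via_marshal = marshal.loads(marshal.dumps(lone))
```

The point of the sketch is only that encoder and decoder have to agree on how lone surrogates are treated; it says nothing about what the 2.2 patch should look like.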

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From guido@python.org  Thu Sep  5 14:51:49 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Sep 2002 09:51:49 -0400
Subject: [Python-Dev] utf8 issue
In-Reply-To: Your message of "Thu, 05 Sep 2002 11:14:06 +0200."
 <3D77205E.8080103@lemburg.com>
References: <200208232105.g7NL5RE16863@pcp02138704pcs.reston01.va.comcast.net> <2mznv9c1k4.fsf@starship.python.net> <200208261405.g7QE5Of05199@pcp02138704pcs.reston01.va.comcast.net>
 <3D77205E.8080103@lemburg.com>
Message-ID: <200209051351.g85Dpnk12649@odiug.zope.com>

> > Please do.  Bumping MAGIC is a no-no between dot releases.  But I
> > don't understand why that is necessary?
> 
> It would be necessary since marshal uses UTF-8 for storing
> Unicode literals.

Do you mean that in 2.2 it doesn't?

> Even though it's highly unlikely that the problem cases are used in
> Python Unicode literals, there's a tiny chance. Without the MAGIC
> change this could result in PYC files failing to load.

Ha.  You may have missed the start of this thread, but the whole
problem was that a PYC file *did* fail to load!  (The .py file had a
lone surrogate in it.)  So I'm not sure this argument holds much
water.

Can someone please explain what change would be necessary to what part
of the code to prevent a PYC file from blowing up when a string literal
contains a lone surrogate?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From oren-py-d@hishome.net  Thu Sep  5 05:54:14 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 5 Sep 2002 00:54:14 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209042005.g84K5Ms08177@pcp02138704pcs.reston01.va.comcast.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net> <20020904124646.GA79746@hishome.net> <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net> <20020904160143.GA1483@hishome.net> <200209042005.g84K5Ms08177@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020905045414.GA26104@hishome.net>

On Wed, Sep 04, 2002 at 04:05:22PM -0400, Guido van Rossum wrote:
> > If I use a module that spawns an external process and uses SIGCHLD to be 
> > informed of its termination why should my innocent code that just reads 
> > lines from a file suddenly break?  In C I can at least restart the 
> > operation after an EINTR but file.readline cannot even be properly 
> > restarted because the buffering and file position is all messed up.
> 
> I have never understood why a child dying should send a signal.  You
> can poll for the child with waitpid() instead.

You're assuming too much about the structure of the program using child
processes. The code that starts the child process may not be in control of
the Python program counter by the time it ends. It's useful to be able to 
leave a signal handler to clean up the zombie process by waitpid(). 

> But if you have a suggestion for how to fix this particular issue, I'd
> be happy to look it over, since this *is* something some people do.

Of course people do it - it's documented and it works.  Signal handling 
may have had some historical problems on some Unixes but I've never had any 
problem with it under Linux.  My previous messages more or less outline
my suggestion. I'll write a better summary.

> > Getting a notification of a child process terminating or other
> > asynchronous events can only be done using signals and is currently
> > dangerous because it will break code using I/O.
> 
> See above.  I see half your point; people wanting this tend to use
> signals and it causes breakage.

Polling is not what I'd call "getting notification of asynchronous events".
If it causes breakage it could be because people either use it incorrectly
or the signal support on the underlying system is broken. In Linux it isn't
broken. If it's broken on other Python platforms I don't see why it
shouldn't be well-supported on the platforms that aren't.

Has anyone here actually tried to use signal.signal ?

> > > > interference to Python code by signals. Any other problems I should
> > > > be aware of?
> > > 
> > > There's no way to sufficiently test a program that uses signals.  The
> > > signal handler cannot touch *any* data, which makes it pretty useless.
> > 
> > In order to be useful a signal handler needs to be able to set one bit.
> > The next time the ticker expires this bit will be checked.
> 
> OK.
> 
> > If an I/O operation was interrupted the Python signal handler can be
> > executed immediately from the wrapper. When it returns the wrapper
> > will resume the interrupted operation.
> 
> Is calling the Python signal handler from the wrapper always safe?
> What if the Python signal handler e.g. closes the file or reads from
> it?

Code in signal handlers is executed at some arbitrary point in the program 
and the programmer should be aware of this and only do simple things
like setting a flag or appending to a list.
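
A minimal sketch of that discipline, assuming a Unix platform (for `SIGUSR1`) and that the code runs in the main thread; the names `note_signal`, `drain_pending` and `demo` are hypothetical. The handler only appends to a list, and the real work happens later in ordinary code.

```python
import os
import signal
import time

pending = []  # signal numbers recorded by the handler


def note_signal(signum, frame):
    # Do the bare minimum here: record the event and return.
    pending.append(signum)


def drain_pending():
    """Process recorded signals from ordinary code, outside the handler."""
    handled = []
    while pending:
        handled.append(pending.pop(0))
    return handled


def demo():
    previous = signal.signal(signal.SIGUSR1, note_signal)
    try:
        os.kill(os.getpid(), signal.SIGUSR1)  # signal ourselves
        deadline = time.time() + 2.0
        while not pending and time.time() < deadline:
            time.sleep(0.01)  # give the handler a chance to run
        return drain_pending()
    finally:
        signal.signal(signal.SIGUSR1, previous)  # restore the old handler
```

Nothing in the interpreter enforces this restraint; the handler could do arbitrary work, so keeping it this small is a convention, not a guarantee.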

	Oren


From bkc@murkworks.com  Thu Sep  5 15:13:50 2002
From: bkc@murkworks.com (Brad Clements)
Date: Thu, 05 Sep 2002 10:13:50 -0400
Subject: [Python-Dev] Getting started with GBayes testing
In-Reply-To: <200209050024.g850OTd08824@pcp02138704pcs.reston01.va.comcast.net>
References: Your message of "Wed, 04 Sep 2002 18:39:01 EDT." <3D7653AD.14352.14F391B6@localhost>
Message-ID: <3D772EC2.30217.184B6C78@localhost>

On 4 Sep 2002 at 20:24, Guido van Rossum wrote:

> Pretty soon, a SF project will be created (Barry has already gotten
> the request in).  We'll gladly add you to the list of developers.

I look forward to it.

> > I'm particularly interested in how to allow html only messages
> > (reduce false positives).  I'm getting a lot of personal mail in
> > that format, unfortunately.
> 
> You train it with an equal number of spam and non-spam ("ham") that
> you received.  Just make sure the ham training messages contain enough
> representatives of the html-only mail.

This is one way to do it, but I was planning on experimenting with tokenizer methods 
that strip out HTML tags, leaving only the text. 

My feeling is that the presentation of "the message" is independent of the message
itself, so if I get a message in text, HTML, or RTF, only the actual content is important, not
the markup method. Though I suppose using lots of red and large fonts might be an
indicator of spam, the text of the message should still suffice.

Tim's comments in timtest.py hint that stripping tags isn't a catastrophe for f-n's, but 
he's not planning on doing that for use on technical lists.

I would like to pursue general client-side filtering of spam, so I do need to contend with 
that.

btw, Tim's comment:


> # So if a message is multipart/alternative with both text/plain and text/html
> # branches, we ignore the latter, else newbies would never get a message
> # through.  If a message is just HTML, it has virtually no chance of getting
> # through

Tells me (spammer hat on) that I can send a message with a non-spammish text-only
part and a spam html part, since most "non-techie" email client users automatically
display the html version when available; however, Tim's implementation will ignore it.

Most "average users" never even see the text-only part of multipart messages. In Tim's 
application, that's okay since he's going to use the text-only part anyway. But for my 
purposes, I need to consider both portions. So it's simpler for me to strip html and 
combine that text with the text-only part and then "test" the combined parts.
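
A rough sketch of that "strip and combine" idea, assuming the modern standard-library `email` and `html.parser` modules and parts that need no transfer decoding; `TextOnly`, `strip_tags` and `text_to_score` are hypothetical names, not anything from timtest.py. It keeps the text/plain part as is, strips the tags from the text/html part, and joins the two for scoring.

```python
from email import message_from_string
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data and drop every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Normalize whitespace left behind by the removed markup.
    return " ".join(" ".join(parser.chunks).split())


def text_to_score(msg):
    """Combine the plain-text part with the tag-stripped HTML part."""
    parts = []
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype == "text/plain":
            parts.append(part.get_payload().strip())
        elif ctype == "text/html":
            parts.append(strip_tags(part.get_payload()))
    return "\n".join(parts)


RAW = """\
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XX"

--XX
Content-Type: text/plain

Meeting notes attached.
--XX
Content-Type: text/html

<html><body><font color="red">BUY <b>NOW</b></font></body></html>
--XX--
"""

msg = message_from_string(RAW)
combined = text_to_score(msg)
```

Note that this simple parser also passes through the contents of script and style elements as text, so a real tokenizer would probably want to skip those.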

Well these are just musings, I'll be looking for the SF project. 

-Brad


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From Anthony Baxter <anthony@interlink.com.au>  Thu Sep  5 15:28:25 2002
From: Anthony Baxter <anthony@interlink.com.au> (Anthony Baxter)
Date: Fri, 06 Sep 2002 00:28:25 +1000
Subject: [Python-Dev] Getting started with GBayes testing
In-Reply-To: <3D772EC2.30217.184B6C78@localhost>
Message-ID: <200209051428.g85ESPR24749@localhost.localdomain>

>>> "Brad Clements" wrote
> This is one way to do it, but I was planning on experimenting with tokenizer
> methods that strip out HTML tags, leaving only the text.

With the set I'm working with, I found I needed to strip out everything
but the src="" and href="" attributes of tags. There's too much goodness
in them for the system to get its teeth into.
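
One way this could look, as a sketch rather than the code actually in use: a hypothetical `LinkKeeper` class built on the standard `html.parser` module that throws away all markup except the values of src and href attributes and keeps the surrounding text as word tokens.

```python
from html.parser import HTMLParser


class LinkKeeper(HTMLParser):
    """Drop all markup except src= and href= attribute values."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names before calling this.
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.tokens.append("%s=%s" % (name, value))

    def handle_data(self, data):
        self.tokens.extend(data.split())


def tokenize_html(html):
    parser = LinkKeeper()
    parser.feed(html)
    parser.close()
    return parser.tokens


sample = ('<a HREF="http://example.com/buy">Click <b>here</b></a>'
          '<img src="http://example.com/x.gif" width="1">')
tokens = tokenize_html(sample)
```

Prefixing each kept value with its attribute name is an assumption of this sketch; it just keeps link tokens distinct from ordinary words.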


> Tells me (spammer hat on) that I can send a message with a non-spammish text 
> only part, and a spam html part since most "non-techie" email client users 
> automatically display the html version when available, however Tim's 
> implementation will ignore it.

I've actually got a bunch of spam like that. The text/plain is something
like 

**This is a HTML message** 

and nothing else.


Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>   
It's never too late to have a happy childhood.



From guido@python.org  Thu Sep  5 15:33:34 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Sep 2002 10:33:34 -0400
Subject: [Python-Dev] Proposed Mixins for Wide Interfaces
In-Reply-To: Your message of "Wed, 04 Sep 2002 17:40:34 EDT."
 <001801c2545b$b43aba60$e8ea7ad1@othello>
References: <001101c2510d$9fce0920$5f66accf@othello> <200209031750.g83HoVq05812@odiug.zope.com>
 <001801c2545b$b43aba60$e8ea7ad1@othello>
Message-ID: <200209051433.g85EXY612883@odiug.zope.com>

> [RH]
> > > How about adding some mixins to simplify the
> > > implementation of some of the fatter interfaces?

On second thought, I don't think there's enough here to warrant
putting this in the standard library.  E.g. the example from BaseSet
actually strikes me as indirect: because <= is the natural operation
to provide for sets, hanging everything off __lt__ looks forced.

Maybe this could go into the Demo directory or in some example or
HOWTO.

We'll revisit this issue when we introduce a standard type or interface
hierarchy (not for Python 2.3).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From python@rcn.com  Thu Sep  5 15:43:50 2002
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 5 Sep 2002 10:43:50 -0400
Subject: [Python-Dev] Proposed Mixins for Wide Interfaces
References: <001101c2510d$9fce0920$5f66accf@othello> <200209031750.g83HoVq05812@odiug.zope.com>              <001801c2545b$b43aba60$e8ea7ad1@othello>  <200209051433.g85EXY612883@odiug.zope.com>
Message-ID: <003901c254ea$a752d820$f6eb7ad1@othello>

> > [RH]
> > > > How about adding some mixins to simplify the
> > > > implementation of some of the fatter interfaces?

[GvR]
> On second thought, I don't think there's enough here to warrant
> putting this in the standard library.  E.g. the example from BaseSet
> actually strikes me as indirect: because <= is the natural operation
> to provide for sets, hanging everything off __lt__ looks forced.

Agreed.

How about the MappingMixin and SequenceMixin?  These both
provide much more meat and have more natural attach points
(getitem, setitem, delitem).
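
To make the idea concrete, here is a small sketch of what such a mixin might look like. It is not the proposed implementation: the class body, the extra `keys()` requirement and the `LowerDict` example are all assumptions made for illustration. (Later Python versions provide `collections.abc.MutableMapping` for much the same purpose.)

```python
class MappingMixin:
    """Hypothetical mixin deriving the wide mapping interface.

    Subclasses supply __getitem__, __setitem__, __delitem__ and keys().
    """

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def __iter__(self):
        return iter(self.keys())

    def __len__(self):
        return len(self.keys())

    def values(self):
        return [self[key] for key in self.keys()]

    def items(self):
        return [(key, self[key]) for key in self.keys()]

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]

    def clear(self):
        for key in list(self.keys()):
            del self[key]


class LowerDict(MappingMixin):
    """Toy mapping whose string keys are case-insensitive."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def keys(self):
        return list(self._data)
```

The subclass writes four short methods and gets the rest of the interface for free, which is the "more meat" being argued for here.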



From guido@python.org  Thu Sep  5 15:53:08 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Sep 2002 10:53:08 -0400
Subject: [Python-Dev] Proposed Mixins for Wide Interfaces
In-Reply-To: Your message of "Thu, 05 Sep 2002 10:43:50 EDT."
 <003901c254ea$a752d820$f6eb7ad1@othello>
References: <001101c2510d$9fce0920$5f66accf@othello> <200209031750.g83HoVq05812@odiug.zope.com> <001801c2545b$b43aba60$e8ea7ad1@othello> <200209051433.g85EXY612883@odiug.zope.com>
 <003901c254ea$a752d820$f6eb7ad1@othello>
Message-ID: <200209051453.g85Er8j12983@odiug.zope.com>

> How about the MappingMixin and SequenceMixin?  These both
> provide much more meat and have more natural attach points
> (getitem, setitem, delitem).

I'd much rather have a howto that explains all the issues.  This stuff
is vastly underdocumented.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Sep  5 16:01:14 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Sep 2002 11:01:14 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Thu, 05 Sep 2002 00:54:14 EDT."
 <20020905045414.GA26104@hishome.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net> <20020904124646.GA79746@hishome.net> <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net> <20020904160143.GA1483@hishome.net> <200209042005.g84K5Ms08177@pcp02138704pcs.reston01.va.comcast.net>
 <20020905045414.GA26104@hishome.net>
Message-ID: <200209051501.g85F1EY13017@odiug.zope.com>

> > I have never understood why a child dying should send a signal.
> > You can poll for the child with waitpid() instead.
> 
> You're assuming too much about the structure of the program using
> child processes. The code that starts the child process may not be
> in control of the Python program counter by the time it ends. It's
> useful to be able to leave a signal handler to clean up the zombie
> process by waitpid().

I admit that I hate signals so badly that whenever I needed to wait
for a child to finish I would always structure the program around this
need (even when coding in C).

> > But if you have a suggestion for how to fix this particular issue, I'd
> > be happy to look it over, since this *is* something some people do.
> 
> Of course people do it - it's documented and it works.

Barely.  This thread started when you pointed out the problems with
using signals.  I've always been reluctant about the fact that we had
a signal module at all -- it's not portable (no non-Unix system
supports it well), doesn't interact well with threads, etc., etc.;
however, C programmers have demanded some sort of signal support and I
caved in long ago when someone contributed a reasonable approach.  I
don't regret it like lambda, but I think it should only be used by
people who really know about the caveats.

> > See above.  I see half your point; people wanting this tend to use
> > signals and it causes breakage.
> 
> Polling is not what I'd call "getting notification of asynchronous events".
> If it causes breakage it could be because people either use it incorrectly
> or the signal support on the underlying system is broken. In Linux it isn't
> broken. If it's broken on other Python platforms I don't see why it
> shouldn't be well-supported on the platforms that aren't.

I meant in Python.  The I/O problems make signals hard to use.

> Has anyone here actually tried to use signal.signal ?

Yes.

> Code in signal handlers is executed at some arbitrary point in the
> program and the programmer should be aware of this and only do
> simple things like setting a flag or appending to a list.

Unfortunately the mechanism doesn't enforce this.  I wish we could
invent a Python signal API that only lets you do one of these simple
things.
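
For illustration, the "only set a flag or append to a list" discipline might look like the sketch below.  The names pending, note_signal, install and drain are assumptions made up for this example, and nothing here enforces the discipline; it is only a convention the programmer has to follow:

```python
import signal

# Hypothetical module-level state: signal numbers seen but not yet handled.
pending = []

def note_signal(signum, frame):
    """Signal handler that does nothing except record the signal number."""
    pending.append(signum)

def install(signum=signal.SIGTERM):
    """Install the handler.  signal.signal() must be called from the main thread."""
    return signal.signal(signum, note_signal)

def drain():
    """Called by the main loop, at a point of its own choosing."""
    seen = []
    while pending:
        seen.append(pending.pop(0))
    return seen
```

All real work happens in whatever code calls drain(), at a known point in the program, rather than at the arbitrary point where the handler ran.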

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Thu Sep  5 02:34:21 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 04 Sep 2002 21:34:21 -0400
Subject: [Python-Dev] Getting started with GBayes testing
In-Reply-To: <3D7653AD.14352.14F391B6@localhost>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECKBCAB.tim.one@comcast.net>

Guido addressed most points, so I'll just cover a few:

[Brad Clements]
> ...
> I'd like to replicate Tim's test rig so I can compare my results
> with existing ones. My spam isn't in mbox format, but I can convert it.

Mine isn't either <wink>.  Barry gave me mboxes, but the spam corpus I got
off the web had one spam per file, and it only took two days of extreme pain
to realize that one msg per file is enormously easier to work with when
testing:  you want to split these at random into random collections, you may
need to replace some at random when testing reveals spam mistakenly called
ham (and vice versa), etc -- even pasting examples into email is much easier
when it's one msg per file (and the test driver makes it easy to print a
msg's file path).

My test driver and tokenizer are checked in (timtest.py), and also a little
utility or two.  The directory structure under my spambayes directory looks
like so:

Data/
    Spam/
        Set1/ (contains 2750 spam .txt files)
        Set2/            ""
        Set3/            ""
        Set4/            ""
        Set5/            ""
    Ham/
        Set1/ (contains 4000 ham .txt files)
        Set2/            ""
        Set3/            ""
        Set4/            ""
        Set5/            ""
        reservoir/ (contains "backup ham")

If you use the same names and structure, huge mounds of the tedious testing
code will work as-is.  The more Set directories the merrier, although you'll
hit a point of diminishing returns if you exceed 10.  The "reservoir"
directory contains a few thousand other random hams.  When a ham is found
that's really spam, I delete it, and then the rebal.py utility moves in a
message at random from the reservoir to replace it.  If I had it to do over
again, I think I'd move such spam into a Spam set (chosen at random),
instead of deleting it.
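
The replace-from-reservoir step can be sketched as follows.  This is not rebal.py itself; the function name and its arguments are assumptions chosen for illustration, and only the directory layout comes from the description above:

```python
import os
import random
import shutil

def replace_from_reservoir(set_dir, reservoir_dir, bad_name, rng=random):
    """Delete bad_name from set_dir and move in one random reservoir file.

    Hypothetical helper: returns the name of the file that was moved in.
    """
    os.remove(os.path.join(set_dir, bad_name))
    candidates = sorted(os.listdir(reservoir_dir))
    if not candidates:
        raise ValueError("reservoir is empty")
    pick = rng.choice(candidates)
    shutil.move(os.path.join(reservoir_dir, pick),
                os.path.join(set_dir, pick))
    return pick
```

With one msg per file, keeping every Set directory at a constant size is a delete plus a move, which is much of why that layout is easier to work with than an mbox.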

> I'm particularly interested in how to allow html only messages
> (reduce false positives).  I'm getting a lot of personal mail in that
> format, unfortunately.

It will learn about that -- not a problem.  It's a problem in *my* tests
because HTML mail is so strongly hated on tech lists, but newbies use it
there anyway, and it would be horrid to block newbies just because they're
normal people who enjoy creating visually attractive messages <0.9 wink>.
Read the "What about HTML?" section in timtest.py.

You may also wish to remove the guard from

        if part.get_content_type() == "text/plain":
            text = html_re.sub(' ', text)

in tokenize().  Once you have a good test setup, you can try it both ways,
and the data will tell you which way works best for your normal mix.
Details of runs both ways on my c.l.py corpora are given in the "What about
HTML?" section mentioned before, and even there stripping HTML decorations
out of HTML-only messages had an insignificant effect on the f-p rate.  It
increased the f-n rate, though, and precisely because HTML messages are so
very rare on c.l.py that they're *almost* certainly spam.
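
A sketch of what trying it "both ways" could look like, written against the standard email package.  The regexp here is an assumed stand-in for html_re (the real pattern in timtest.py may differ), and text_parts and its plain_only switch are hypothetical names:

```python
import email
import re

# Assumed stand-in for html_re; the pattern in timtest.py may differ.
html_re = re.compile(r"<[^>]*>")

def text_parts(msg, plain_only=True):
    """Yield the text of each text/* part of msg.

    With plain_only=True, tags are stripped only from text/plain parts
    (the guarded behavior).  With plain_only=False the guard is removed
    and tags are stripped from every text part.
    """
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # Decoding as latin-1 is an assumption made to keep the sketch short.
        text = part.get_payload(decode=True).decode("latin-1")
        if not plain_only or part.get_content_type() == "text/plain":
            text = html_re.sub(" ", text)
        yield text
```

Running the same test grid once with each setting gives the data needed to choose between them for a given mix of mail.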



From python@rcn.com  Thu Sep  5 16:43:20 2002
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 5 Sep 2002 11:43:20 -0400
Subject: [Python-Dev] GBayes design
Message-ID: <002b01c254f2$f6c7c020$71b53bd0@othello>

Is it too late to challenge a core design decision?

Instead of multiplying probabilities, use fuzzy logic methods.
Classify the indicators into damning, strong, weak, neutral, ...

After counting the number of indicators in each class, make
a spam/ham decision that can be easily tweaked.  This would
make it easy to implement variations of Tim's recent clear
win, where additional indicators are gathered until the
balance shifts sharply to one side.

Some other advantages are:
-- easily interpreted score vectors (6 damning, 7 strong, 4 weak, ... )
-- avoids mathematical issues with indicators not being independent
-- allows the addition of non-token based indicators.  for instance,
    a preponderance of caps would be a weak indicator.  the presence
    of caps separated by spaces would be a strong indicator.
-- the decision logic would be more intuitive
-- avoids the issue of having equal amounts of spam and ham in
    the sample

The core concept would stay the same -- it's really just a shift from
continuous to discrete.
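
A rough sketch of such a score vector, to make the idea concrete.  The cut-off values and the function names are assumptions (the proposal gives none), and only spam-side evidence is classified here:

```python
from collections import Counter

# Assumed cut-offs; the proposal does not specify any.
CLASSES = [(0.95, "damning"), (0.75, "strong"), (0.60, "weak")]

def indicator_class(prob):
    """Map a per-token spam probability onto a discrete class name."""
    for cutoff, name in CLASSES:
        if prob >= cutoff:
            return name
    return "neutral"

def score_vector(probs):
    """Count how many indicators fall into each class."""
    return Counter(indicator_class(p) for p in probs)
```

For example, score_vector([0.99, 0.99, 0.8, 0.65, 0.5, 0.2]) counts 2 damning, 1 strong, 1 weak and 2 neutral indicators.  The decision rule applied to that vector is left open, as it is in the proposal.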


of-course-this-is-entirely-outside-my-fields-of-knowledge-ly yours,


Raymond Hettinger



From mgilfix@eecs.tufts.edu  Thu Sep  5 18:23:05 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Thu, 5 Sep 2002 13:23:05 -0400
Subject: [apug] Re: [Python-Dev] Call for clarity ( clarification ;-) )
In-Reply-To: <1031469093.644.196.camel@HillCountryPeress>; from hu.peress@mail.mcgill.ca on Sun, Sep 08, 2002 at 02:11:32AM -0500
References: <1031437860.636.29.camel@HillCountryPeress> <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de> <1031442464.644.68.camel@HillCountryPeress> <003d01c25471$d83fe960$2fd8accf@othello> <1031451760.644.97.camel@HillCountryPeress> <20020905004503.A9680@eecs.tufts.edu> <1031469093.644.196.camel@HillCountryPeress>
Message-ID: <20020905132305.A19681@eecs.tufts.edu>

  Ok. I think I understand better what you're trying to accomplish.  I
got the impression earlier (and I think others did as well) that you
were hoping to have pydoc automatically label types on the function
call. A new convention might very well be welcomed. You might want
to post a couple of examples and the corresponding documentation for
feedback here before you start the hard work on the patch :)

  More below...


On Sun, Sep 08 @ 02:11, Hunter Peress wrote:
> Actually all of the thinking I did WAS taking into account the "dynamic"
> nature of Python.
>
> But it's not like the actual code is being rewritten fast enough to make
> this unfeasible or unnecessary.
> 
> I'm glad to get all of this feedback as it's helping me formulate, and
> further specify, my plans (or eventually healthily debunk them (as the
> past 3 responders have helped do)).
> 
> Instead of just thinking:
>
> "arguments are not explicitly anything, therefore it makes no sense to
> even attempt to document them explicitly".
> 
> I think this: simply add the capability for multiple definitions per
> each argument. eg going back to my original sample here is an updated
> version: 
> 
> def something(a,b,c="lalal"): 
>    """This will find its way into the pydocs because its a comment""" 
>    ##Here is the new stuff I'm proposing
>    ##note, a clearer syntax can surely be devised.
>    """file,socket"""  #documents the type(s) of the first arg 
>    """string,list"""  #              ""             second 
>    """list,hash"""    #              ""             third 
>    """string,hash"""  #documents the return type(s). 
> 
> That's quite a simple solution, and it still provides far better
> exactness and clarity than the current system allows.
> 
> Onto more of your concerns: 
> On Wed, 2002-09-04 at 23:45, Michael Gilfix wrote: 
> >   While I understand what you're trying to do here (and think it would
> > be quite nice), I'm not sure how you're going to accomplish it.  How
> > will parsing python using a syntax-tree help? It's not going to tell
> > you what the function does in all cases or the various types it could
> > handle.  Perhaps you could make educated guesses by looking at the
> > types of operations on the objects (a 'has_key' is a sure indicator of
> > a hash), but that would be sketchy at best.
> Actually I wasn't suggesting this AT ALL wrt intelligent guesses, and for
> now this proposal leans away from it.
> 
> Rather there are only 2 simple things that I wanted to obtain from the
> parse-tree: the number of arguments, and if possible to see if there 

  Agreed now that things are clearer.

> Assume for now that my whole proposal will simply be another option
> (instead of the default) to the pydoc-generator program. If invoked, it
> will fail (if the super strict option is specified) if you don't supply
> definitions for number of args for a given method. 
> 
> This brings up your "dynamic" language issue again. 
> 
> When you have lots of args being used as different things, my program then
> introduces another level of complexity to deciphering the docs in a
> meaningful way.
> 
> Eg: a sample output of this program based on my example: 
> 
> ------------output-----------------
>   method: something(a (file,socket),b (string,list),c="lalal"
> (list,hash)) 
>   return type :string,hash 
>   
>   This will find its way into the pydocs because its a comment 
> -----------------------------------
>   
>   Now in html format it would be even nicer as there will be links to
> the types listed. 

  I agree. I know that I'd welcome an extra added option to enable
some extra pydoc functionality. Developing a schema is tricky though
and you should probably engage in some more debate first :)
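
  To make the sample output above concrete, here is a small sketch that checks the number of documented arguments against the function's signature and renders a similar listing.  The describe helper, its arguments and its output format are all assumptions for illustration; this is not an existing pydoc feature:

```python
import inspect

def describe(func, arg_types, return_types):
    """Render a listing of per-argument types for func.

    arg_types holds one tuple of type names per parameter, in order.
    Raises ValueError if the counts differ (the "super strict" check).
    """
    params = list(inspect.signature(func).parameters.values())
    if len(arg_types) != len(params):
        raise ValueError("%s takes %d args but %d were documented"
                         % (func.__name__, len(params), len(arg_types)))
    pieces = []
    for param, types in zip(params, arg_types):
        if param.default is param.empty:
            name = param.name
        else:
            name = "%s=%r" % (param.name, param.default)
        pieces.append("%s (%s)" % (name, ",".join(types)))
    return "method: %s(%s)\nreturn type: %s" % (
        func.__name__, ", ".join(pieces), ",".join(return_types))
```

  Only the argument count is taken from the function itself; the type names are whatever the author chose to write down, which keeps the scheme compatible with arguments that accept several types.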

> And now looking at it, I think it's much clearer than nothing at all.
>
> Of course there is going to be that type of code where you have no need of
> documenting every method because their names are self-explanatory, and
> such explicit documentation isn't necessary; that's not what this is
> really intended for.
>
> If the specific argument arises that "since Python is a dynamic language
> your approach doesn't make sense", say, then I have to respond:

  Of course it applies. Because of my misunderstanding, I was under
the impression that you wanted to generate the equivalent of function
calls, not develop a scheme like javadoc. The dynamic nature of Python
means that such specifications become even more important as project
sizes increase.

> an attempt at specifying things is FAR better than nothing, and
> moreover, this is only my first attempt. Allowing it to become a part of
> the generator as an option will open it up to user input, and hence
> improvement, AND! 
> 
> ***
> it might just turn out that a "dynamic" approach will be necessary to
> document a "dynamic" language. 
> ***
> 
> So I'm still looking for more design tips, and a place where I could find
> out how to get into the meat of the Python parser, but I think the
> "http://python.org/doc/2.2/lib/module-parser.html" is probably what I'll
> be using.

  Shouldn't there be code in the existing pydoc to do much of what
you want for you? It seems like it might be nice to re-engineer pydoc
to take some handlers that allow you to do further customization
after it's done its thing. That way, we can add extensions into the
existing code and all that integration stuff might be a little easier.

  Good luck n' keep us posted :)

                  -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html


From spambayes@python.org  Thu Sep  5 18:57:17 2002
From: spambayes@python.org (Tim Peters)
Date: Thu, 05 Sep 2002 13:57:17 -0400
Subject: [Python-Dev] Getting started with GBayes testing
In-Reply-To: <3D772EC2.30217.184B6C78@localhost>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEENDKAA.tim.one@comcast.net>

[Followups directed to spambayes@python.org
 http://mail.python.org/mailman-21/listinfo/spambayes
]

[Brad Clements]
> ...
> My feeling is that the presentation of "the message" is independent of the
> message itself, so if I get a message in Text, HTML, RTF only the actual
> content is important, not the markup method.

Everything's A Clue.  Everything that gets ignored partly blinds the
classifier, so the question isn't whether there's a difference, it's how
much of a difference it makes.

> Though I suppose using lots of red and large fonts might be an
> indicator of spam, the text of the message should still suffice.

Indeed, Graham reported that the hex color code for bright red was one of
the strongest spam indicators in his database.

> Tim's comments in timtest.py hint that stripping tags isn't a
> catastrophe for f-n's, but he's not planning on doing that for use on
> technical lists.

When HTML-only email is a 99.99% spam indicator on a tech list, it would be
crazy to ignore that clue.  But note that the comments *also* say I'd be
delighted to remove HTML tags even there if some other way of slashing the
f-n rate is proven to work (and most people who have tried it say that
mining more header lines does do it -- but then I haven't seen anything from
them about how they do when they ignore the header lines.  I was happy to
ignore header lines in order to get *some* kind of handle on how well one
could do on "pure content", and it turned out that works remarkably well).

>> # So if a message is multipart/alternative with both text/plain
>> # and text/html branches, we ignore the latter, else newbies would never
>> # get a message through.  If a message is just HTML, it has virtually no
>> # chance of getting through

> Tells me (spammer hat on) that I can send message with a
> non-spammish text only part, and a spam html part since most
> "non-techie" email client users automatically display the html
> version when available, however Tim's implementation will ignore it.

Sure.  It *certainly* isn't a problem on my test data (as witnessed by the
measured error rates).  If the nature of the world changes, the code has to
adapt along with it.  But 90% of the spam I receive (and I get a lot) is
still trivial to recognize from a mere glance at the subject line, and I
don't buy that spammers are a class of ubergeek with formidable skill.
Response rates are a percentage game, and more so than anti-spammers I
expect spammers are keen to go for high-percentage wins at the expense of
esoterica.

> Most "average users" never even see the text-only part of
> multipart messages. In Tim's application, that's okay since he's going
> to use the text-only part anyway. But for my  purposes, I need to consider
> both portions. So it's simpler for me to strip html and combine that text
> with the text-only part and then "test" the combined parts.

Not unreasonable <wink>, but testing remains the only way to decide.  It's
rare you can out-think a fraction of a percent!



From oren-py-d@hishome.net  Thu Sep  5 10:30:02 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 5 Sep 2002 05:30:02 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <oq1y89jvrr.fsf@carouge.sram.qc.ca>
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com> <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net> <oq1y89jvrr.fsf@carouge.sram.qc.ca>
Message-ID: <20020905093002.GA61136@hishome.net>

On Wed, Sep 04, 2002 at 05:21:44PM -0400, François Pinard wrote:
> I'm not fully familiar with all the details of this problem, it surely has
> been in the air for quite a long time now (I might have first heard of it
> while Taylor UUCP was being developed).  It might be dependent on the
> underlying system.  If I'm not mistaken, it was Ian Taylor who introduced the
> following Autoconf macro:
> 
> 
>  - Macro: AC_SYS_RESTARTABLE_SYSCALLS
>      If the system automatically restarts a system call that is
>      interrupted by a signal, define `HAVE_RESTARTABLE_SYSCALLS'.

The name of this macro is misleading. It doesn't check whether system calls
are restartABLE but whether they are restartED automatically by libc. It 
forks a subprocess that sends a signal to the parent. The parent waits for
the child and checks if the wait() was interrupted.

If this macro is defined you will never get EINTR so there's no need to
worry about this. If it isn't defined you need to restart system calls 
yourself.
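
Restarting a call "yourself" amounts to a loop like the one below, shown in Python for illustration.  The wrapper name restarting is hypothetical; note that Python 3.5 and later already retry most system calls on EINTR (PEP 475), so this is a sketch of the technique rather than something modern code needs:

```python
import errno

def restarting(func, *args, **kwargs):
    """Call func(*args, **kwargs), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as exc:
            # Any error other than "interrupted by a signal" is passed on.
            if exc.errno != errno.EINTR:
                raise
```

A wrapper like this is only safe for calls that can be repeated without data loss, which is the open question for I/O calls that may have transferred part of their data before being interrupted.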

If a platform really has interruptible I/O calls that cannot be continued or
restarted without data loss there is no way to use signal handlers on that
system. I doubt that such totally broken platforms are common these days.

> In GNU file utilities (now merged within the new GNU coreutils), Jim Meyering
> uses restart wrappers for many I/O functions, so the idea of wrappers has been
> maturing for a while, and is used in basic, heavily used programs.  

I'll check the sources.

	Oren



From spambayes@python.org  Thu Sep  5 18:39:36 2002
From: spambayes@python.org (Tim Peters)
Date: Thu, 05 Sep 2002 13:39:36 -0400
Subject: [Python-Dev] Getting started with GBayes testing
In-Reply-To: <200209051428.g85ESPR24749@localhost.localdomain>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGEEMDKAA.tim.one@comcast.net>

[Followups directed to spambayes@python.org
 http://mail.python.org/mailman-21/listinfo/spambayes
]

[Anthony Baxter]
> ...
> I've actually got a bunch of spam like that. The text/plain is something
> like
>
> **This is a HTML message**
>
> and nothing else.

Are you sure that's in a text/plain MIME section?  I've seen that many times
myself, but it's always been in the prologue (*between* MIME sections -- so
it's something a non-MIME aware reader will show you).



From spambayes@python.org  Thu Sep  5 19:30:03 2002
From: spambayes@python.org (Tim Peters)
Date: Thu, 05 Sep 2002 14:30:03 -0400
Subject: [Python-Dev] GBayes design
In-Reply-To: <002b01c254f2$f6c7c020$71b53bd0@othello>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGEEPDKAA.tim.one@comcast.net>

[Followups directed to spambayes@python.org
 http://mail.python.org/mailman-21/listinfo/spambayes
]

[Raymond Hettinger]
> Is it too late to challenge a core design decision?

Never too late, but somebody has to do real work to prove that a change is
justified.  Plausible ideas are cheaper than dirt, alas.

> Instead of multiplying probabilities, use fuzzy logic methods.
> Classify the indicators into damning, strong, weak, neutral, ...

Think about how that differs from 0.99, 0.80, 0.20 and 0.50.  Does it?

> After counting the number of indicators in each class, make
> a spam/ham decision that can be easily tweaked.  This would
> make it easy to implement variations of Tim's recent clear
> win, where additional indicators are gathered until the
> balance shifts sharply to one side.
>
> Some other advantages are:
> -- easily interpreted score vectors (6 damning, 7 strong, 4 weak, ... )

I've seen people see the current prob("TV") = 0.99 style cold and pick it up
at once.  With character n-grams I think it's frustrating, but word-like
tokenization gives easily recognized clues.

> -- avoids mathematical issues with indicators not being independent

How do you know this?

> -- allows the addition of non-token based indicators.  for instance,
>     a preponderance of caps would be a weak indicator.  the presence
>     of caps separated by spaces would be a strong indicator.

As far as the current classifier is concerned, "a token" is any Python
object usable as a dict key.  There are already several ways in which the
current tokenization scheme in timtest.py uses strings to *represent*
non-textual indicators.  For example, if the headers lack an Organization
line, a 'bool:noorg' "token" is generated.  For large blobs of text that get
skipped, a token is generated that records both the first character in that
blob and the number of bytes skipped (chopped to the nearest multiple of
10).  And so on -- you can inject anything you like into the scheme,
including stuff like

    "number of caps separated by spaces: more than 10"

(BTW, I happen to know that this particular "clue" acts to block relevant
conference announcements, not just spam)

I got some interesting results by injecting a crude characters/word
statistic:

    yield "cpw:%.1g" % (float(len(text)) / len(text.split()))

There are certain values of that statistic that turned out to be
killer-strong spam indicators, but there's a potential problem I've
mentioned before:  if you have an unbounded number of free parameters you
can fiddle, you can train a system to fit any given dataset exactly.  That's
in part why replication of results by others is necessary to make schemes
like this superb (I can only make one merely excellent on my own <wink>).
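
A small sketch of how such non-textual "tokens" can be generated.  The function names and the exact "skip:" spelling are assumptions for illustration; only 'bool:noorg' and the cpw format string come from the text above:

```python
def synthetic_tokens(header_names, text):
    """Yield string tokens that represent non-textual clues.

    header_names is any iterable of header field names (assumed input shape).
    """
    if "organization" not in (name.lower() for name in header_names):
        yield "bool:noorg"
    words = text.split()
    if words:
        # Crude characters-per-word statistic, kept to one significant digit.
        yield "cpw:%.1g" % (float(len(text)) / len(words))

def skip_token(blob):
    """Token for a skipped blob: its first character plus its length,
    chopped down to a multiple of 10.  The 'skip:' prefix is an assumption."""
    return "skip:%s %d" % (blob[0], len(blob) // 10 * 10)
```

Because the classifier only needs dict keys, these strings take part in scoring exactly like ordinary words, and each new one is another free parameter of the kind warned about above.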

> -- the decision logic would be more intuitive
> -- avoids the issue of having equal amounts of spam and ham in
>     the sample

It's not clear that this matters; some results of preliminary experiments
are written up in the code comments.  The way Graham computes P(Spam | Word)
is via ratios, *as if* there were an equal number of each; and that's
consistent with the other bogus <wink> equality assumption in the scorer.  I
haven't yet changed all these guys at the same time to take P(Spam) and
P(Ham) into account.
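
For reference, the per-word computation as described in Graham's paper can be sketched like this.  The function and argument names are assumptions, and this is an illustration rather than the code in the repository:

```python
def word_prob(spamcount, hamcount, nspam, nham, hambias=2.0):
    """Per-word spam probability computed from ratios, after Graham.

    spamcount/hamcount: occurrences of the word in each corpus.
    nspam/nham: number of messages in each corpus.
    """
    spamratio = min(1.0, spamcount / float(nspam))
    hamratio = min(1.0, hambias * hamcount / float(nham))
    total = spamratio + hamratio
    if total == 0:
        raise ValueError("word was never seen in either corpus")
    # Clamp so that no single word is treated as absolute proof.
    return max(0.01, min(0.99, spamratio / total))
```

Each count is divided by the size of its own corpus before the two are compared, so P(Spam) and P(Ham) never enter the result.  With hambias=1.0, a word seen 10 times in 2750 spams and 10 times in 4000 hams scores 400/675, about 0.59, rather than 0.5.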

BTW, note that all the results I've reported had a ham/spam training ratio
of 4000/2750.  I left that non-unity on purpose.

> The core concept would stay the same -- it's really just a shift from
> continuous to discrete.

Let us know how it turns out <wink>.



From barry@zope.com  Thu Sep  5 19:06:59 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 5 Sep 2002 14:06:59 -0400
Subject: [Python-Dev] New `spambayes' project on SourceForge
Message-ID: <15735.40259.117828.402419@anthem.wooz.org>

There's been a ton of press about applying Bayesian classifiers to
spam detection lately, spurred on by Paul Graham's recent paper "A
Plan for Spam"

    http://www.paulgraham.com/spam.html

Tim Peters has done an incredible amount of work on our Python
implementation of this idea.  Some of the reasons why I think Tim's
work is so cool is that he's brought along his deep knowledge of
speech recognition's related issues, and his obsessive devotion to
reducing the amount of spam I ultimately have to delete <wink>.

In order to encourage more participation from the wider open source
community, we've moved the code from a backwater of the Python cvs
tree to its own project on SourceForge.  The hope is that more people
will be able to contribute to ideas, testing, and integration of the
basic algorithms with other systems such as mail daemons, mailing list
managers, and mail clients.

The project is called "spambayes" (for lack of creativity on our part
:) and is hosted here:

    http://sf.net/projects/spambayes

If you're interested in becoming a developer on the project, let me
know.  Otherwise you can of course get anonymous checkouts of the code.

There are also two mailing lists related to the spambayes project.
The first is a general discussion list:

    http://mail.python.org/mailman-21/listinfo/spambayes

and the other is a list for cvs checkin message notices:

    http://mail.python.org/mailman-21/listinfo/spambayes-checkins

Feel free to join those lists (and help be a guinea pig for Mailman
2.1 :).

Enjoy,
-Barry

PS to Python-devers: the code has been removed from
nondist/sandbox/spambayes, so you won't be able to hack on it there.
Also, please move discussion about this from python-dev@python.org to
spambayes@python.org


From nas@python.ca  Thu Sep  5 19:52:28 2002
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 5 Sep 2002 11:52:28 -0700
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <20020905093002.GA61136@hishome.net>
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com> <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net> <oq1y89jvrr.fsf@carouge.sram.qc.ca> <20020905093002.GA61136@hishome.net>
Message-ID: <20020905185228.GA19726@glacier.arctrix.com>

Oren Tirosh wrote:
> If this macro is defined you will never get EINTR so there's no need to
> worry about this. If it isn't defined you need to restart system calls 
> yourself.

I don't think that is correct.  Only certain system calls will be
restarted (for BSD 4.2 it's ioctl, read, readv, write, writev, wait, and
waitpid).  I think the set of system calls restarted varies depending on
the OS.

Signals are a gigantic mess.  I'm starting to doubt that you realize the
extent of the brain damage.  While I would be pleased if there was some
way Python could hide the mess, I'm not convinced it is possible.

  Neil


From guido@python.org  Thu Sep  5 19:19:02 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Sep 2002 14:19:02 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Thu, 05 Sep 2002 05:30:02 EDT."
 <20020905093002.GA61136@hishome.net>
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com> <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net> <oq1y89jvrr.fsf@carouge.sram.qc.ca>
 <20020905093002.GA61136@hishome.net>
Message-ID: <200209051819.g85IJ2113867@odiug.zope.com>

> >  - Macro: AC_SYS_RESTARTABLE_SYSCALLS
> >      If the system automatically restarts a system call that is
> >      interrupted by a signal, define `HAVE_RESTARTABLE_SYSCALLS'.
> 
> The name of this macro is misleading. It doesn't check whether system calls
> are restartABLE but whether they are restartED automatically by libc. It 
> forks a subprocess that sends a signal to the parent. The parent waits for
> the child and checks if the wait() was interrupted.
> 
> If this macro is defined you will never get EINTR so there's no need to
> worry about this. If it isn't defined you need to restart system calls 
> yourself.

This was a feature introduced by BSD Unix in a distant past, as a
change from v7 Unix (which had only the EINTR behavior).  For b/w
compatibility, BSD had a system call to disable the restart feature.
I'm guessing that over the years the feature has been found less than
helpful, so POSIX defaults to off.  POSIX sigaction() has a flag
SA_RESTART to enable restarting.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Sep  5 20:15:54 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Sep 2002 15:15:54 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Thu, 05 Sep 2002 11:52:28 PDT."
 <20020905185228.GA19726@glacier.arctrix.com>
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com> <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net> <oq1y89jvrr.fsf@carouge.sram.qc.ca> <20020905093002.GA61136@hishome.net>
 <20020905185228.GA19726@glacier.arctrix.com>
Message-ID: <200209051915.g85JFsR14171@odiug.zope.com>

> Signals are a gigantic mess.  I'm starting to doubt that you realize the
> extent of the brain damage.  While I would be pleased if there was some
> way Python could hide the mess, I'm not convinced it is possible.

Thanks for the support Neil.  That's exactly how I think about it.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Thu Sep  5 15:57:45 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 5 Sep 2002 09:57:45 -0500
Subject: [Python-Dev] Getting started with GBayes testing
In-Reply-To: <3D772EC2.30217.184B6C78@localhost>
References: <3D7653AD.14352.14F391B6@localhost>
 <3D772EC2.30217.184B6C78@localhost>
Message-ID: <15735.28905.730200.821228@12-248-11-90.client.attbi.com>

    Brad> My feeling is that the presentation of "the message" is
    Brad> independent of the message itself, so if I get a message in Text,
    Brad> HTML, RTF only the actual content is important, not the markup
    Brad> method. Though I suppose using lots of red and large fonts might
    Brad> be an indicator of spam, the text of the message should still
    Brad> suffice.

You might be surprised.  In Paul Graham's "A Plan for Spam" he writes:

    I don't know why I avoided trying the statistical approach for so
    long.  I think it was because I got addicted to trying to identify
    spam features myself, as if I were playing some kind of
    competitive game with the spammers.  (Nonhackers don't often
    realize this, but most hackers are very competitive.)  When I did
    try statistical analysis, I found immediately that it was much
    cleverer than I had been.  It discovered, of course, that terms
    like "virtumundo" and "teens" were good indicators of spam.  But
    it also discovered that "per" and "FL" and "ff0000" are good
    indicators of spam.  In fact, "ff0000" (html for bright red) turns
    out to be as good an indicator of spam as any pornographic term.

As Tim has pointed out several times, intuition and hunches about this
stuff often turn out to be incorrect.

Skip


From jason-exp-1031947065.5eb24b@mastaler.com  Thu Sep  5 21:01:37 2002
From: jason-exp-1031947065.5eb24b@mastaler.com (jason-exp-1031947065.5eb24b@mastaler.com)
Date: Thu, 05 Sep 2002 14:01:37 -0600
Subject: [Python-Dev] Re: New `spambayes' project on SourceForge
References: <15735.40259.117828.402419@anthem.wooz.org>
Message-ID: <hhofbc19zy.fsf@hrothgar.la.mastaler.com>

barry@zope.com (Barry A. Warsaw) writes:

> There are also two mailing lists related to the spambayes project.
> The first is a general discussion list:
>
>     http://mail.python.org/mailman-21/listinfo/spambayes
>
> and the other is a list for cvs checkin message notices:
>
>     http://mail.python.org/mailman-21/listinfo/spambayes-checkins

These lists have now been added to Gmane (http://gmane.org) as well:

spambayes@python.org <==> news://news.gmane.org/gmane.mail.spam.spambayes.general

spambayes-checkins@python.org <==> news://news.gmane.org/gmane.mail.spam.spambayes.cvs

-- 
(http://tmda.net/)




From oren-py-d@hishome.net  Thu Sep  5 21:27:16 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 5 Sep 2002 23:27:16 +0300
Subject: [Python-Dev] Re: Signal-resistant code
In-Reply-To: <200209051501.g85F1EY13017@odiug.zope.com>; from guido@python.org on Thu, Sep 05, 2002 at 11:01:14AM -0400
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net> <20020904124646.GA79746@hishome.net> <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net> <20020904160143.GA1483@hishome.net> <200209042005.g84K5Ms08177@pcp02138704pcs.reston01.va.comcast.net> <20020905045414.GA26104@hishome.net> <200209051501.g85F1EY13017@odiug.zope.com>
Message-ID: <20020905232716.A8225@hishome.net>

On Thu, Sep 05, 2002 at 11:01:14AM -0400, Guido van Rossum wrote:
> > > I have never understood why a child dying should send a signal.
> > > You can poll for the child with waitpid() instead.
> > 
> > You're assuming too much about the structure of the program using
> > child processes. The code that starts the child process may not be
> > in control of the Python program counter by the time it ends. It's
> > useful to be able to leave a signal handler to clean up the zombie
> > process by waitpid().
> 
> I admit that I hate signals so badly that whenever I needed to wait
> for a child to finish I would always structure the program around this
> need (even when coding in C).

Ummm... if you really hate signals that much perhaps you ought to step aside
from this particular discussion? Naturally, you will get to pronounce on 
the results that come out of it (if any ;-)


    Westley:  No, no. We have already succeeded.  I mean, what are 
        the three terrors of the fire swamp?  One, the flame spurt - no 
        problem - there's a popping sound preceding each.  We can avoid 
        that.  Two, the lightning sand which you were clever enough to 
        discover what that looks like, so in the future we can avoid that 
        too.

                     (from "The Princess Bride" by William Goldman)

So what are the three problems of signals?  

One - what calls are allowed by the platform inside a signal handler.
No problem.  Nobody suggested actually executing Python code inside a
signal handler so we don't need to be worried about user code. The C
handler doesn't call anything unusual, just sets flags.  This should work 
on all platforms.
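
A rough Python-level analogue of the "handler just sets a flag" idea might
look like the sketch below. It is not the C handler in signalmodule.c; the
choice of SIGUSR1 and the names `pending`, `note_signal` and `run_once` are
illustrative assumptions, and it presumes a POSIX system with the code
running in the main thread.

```python
import os
import signal
import time

# Shared state touched by the handler; a plain assignment is all it does.
pending = {"usr1": False}

def note_signal(signum, frame):
    # Do the bare minimum here: record that the signal arrived.
    pending["usr1"] = True

def run_once(timeout=2.0):
    """Install the handler, signal ourselves, then poll the flag."""
    signal.signal(signal.SIGUSR1, note_signal)
    os.kill(os.getpid(), signal.SIGUSR1)
    deadline = time.time() + timeout
    while not pending["usr1"] and time.time() < deadline:
        time.sleep(0.01)
    handled = pending["usr1"]
    pending["usr1"] = False  # reset so the sketch can be run again
    return handled
```

The real work (cleanup, waitpid() and so on) would then happen in ordinary
code that checks the flag, not inside the handler itself.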

Two - Interruptible system calls.  If all Python I/O calls are wrapped
inside restarting wrappers this should be solved. If the system's libc
wraps them it can be disabled by SA_RESTART (posix) or siginterrupt (BSD).
On some systems read and recv return a short buffer instead of EINTR. This 
can be safely ignored because it only happens for pipes and sockets where 
this is a valid result. AFAIR it's guaranteed not to happen on regular 
files so we won't be tricked into believing they reached EOF.  Are there 
any systems where system calls are interruptible but not restartable 
in any way without data loss? 

Three - Threads playing "who gets the signal".  The Python signal module
has a hack that appears to work on all relevant platforms - ignore the
signal if getpid() isn't the main thread's pid.

	Oren


Buttercup:  Westley, what about the R.O.U.S.'s?
Westley:  Rodents Of Unusual Size? I don't think they exist.

	...



From stephen@ixokai.net  Thu Sep  5 21:27:47 2002
From: stephen@ixokai.net (Stephen Hansen)
Date: 05 Sep 2002 13:27:47 -0700
Subject: [Python-Dev] SF patch#555779, "import user" and Apache... *humble*
Message-ID: <1031257667.16739.5.camel@jeremy>

*cough*

So. Hi. Python-Gods. Um. So. Anyways. *embarrassed*

I submitted a really tiny patch to SF awhile back, #555779, which would
make "import user" actually useful in a certain specific CGI situation.
The BDFL seemed to have no problems and said anyone could commit it.. no
one has. :) Now, i'm not impatient at all, it's already patched into all
the machines i'm working on... however, i'm just sending this little
reminder in the hopes that it won't be forgotten until after 2.3 comes
out. :) I don't want to re-patch everything again later, i've got quite
a few machines currently using it. :)

*cough* So. Yes. Well. Thank you for your time. :)

*runs away*

--Stephen



From mal@egenix.com  Thu Sep  5 18:19:57 2002
From: mal@egenix.com (M.-A. Lemburg)
Date: Thu, 05 Sep 2002 19:19:57 +0200
Subject: [Python-Dev] GBayes design
References: <002b01c254f2$f6c7c020$71b53bd0@othello>
Message-ID: <3D77923D.8060108@lemburg.com>

Raymond Hettinger wrote:
> Is it too late to challenge a core design decision?
> 
> Instead of multiplying probabilities, use fuzzy logic methods.
> Classify the indicators into damning, strong, weak, neutral, ...
> 
> After counting the number of indicators in each class, make
> a spam/ham decision that can be easily tweaked.  This would
> make it easy to implement variations of Tim's recent clear
> win, where additional indicators are gathered until the
> balance shifts sharply to one side.
> 
> Some other advantages are:
> -- easily interpreted score vectors (6 damning, 7 strong, 4 weak, ... )
> -- avoids mathematical issues with indicators not being independent
> -- allows the addition of non-token based indicators.  for instance,
>     a preponderance of caps would be a weak indicator.  the presence
>     of caps separated by spaces would be a strong indicator.
> -- the decision logic would be more intuitive
> -- avoids the issue of having equal amounts of spam and ham in
>     the sample
> 
> The core concept would stay the same -- it's really just a shift from
> continuous to discrete.

Hmm, there's nothing discrete about fuzzy logic (ok, this
claim is 0.65% true ;-)

The problem is more about multi-dimensional optimization where
you are interested in distilling several different inputs
into one value.

A weighted average is the simplest form to use here and there
are various multi-dimensional optimization algorithms around
to aid in finding the "optimal" weights.

Another approach would be using a shallow neural network.

The only "problem" with these is that Tim generates a
variable number of inputs, AFAICT, so that you'd have
to use some preprocessing to make the number of inputs
constant.
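
One possible reading of the two ideas above (preprocess to a constant
number of inputs, then take a weighted average) is sketched here. The
helper names `fixed_inputs` and `weighted_score`, the choice of keeping the
values farthest from 0.5, and the padding value are all assumptions made
for illustration; nothing here is taken from GBayes.py.

```python
def fixed_inputs(probs, n=4, neutral=0.5):
    """Reduce a variable-length list of probabilities to exactly n values.

    Keeps the n values farthest from `neutral` and pads with `neutral`
    when fewer than n are available.
    """
    ranked = sorted(probs, key=lambda p: abs(p - neutral), reverse=True)[:n]
    return ranked + [neutral] * (n - len(ranked))

def weighted_score(inputs, weights):
    """Weighted average of the inputs; weights need not sum to 1."""
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, inputs)) / total
```

Finding good weights is the optimization problem mentioned above; the
sketch only shows where they would plug in.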

Would make a nice internship project, I guess :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From hu.peress@mail.mcgill.ca  Sun Sep  8 03:22:40 2002
From: hu.peress@mail.mcgill.ca (Hunter Peress)
Date: 07 Sep 2002 21:22:40 -0500
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
In-Reply-To: <003d01c25471$d83fe960$2fd8accf@othello>
References: <1031437860.636.29.camel@HillCountryPeress>
 <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>
 <1031442464.644.68.camel@HillCountryPeress>
 <003d01c25471$d83fe960$2fd8accf@othello>
Message-ID: <1031451760.644.97.camel@HillCountryPeress>

On Wed, 2002-09-04 at 19:19, Raymond Hettinger wrote:
> From: "Hunter Peress" <hu.peress@mail.mcgill.ca>
> 
> > def something(a,b,c="lalal"):
> >    """This will find its way into the pydocs because it's a comment"""
> >    ##Here is the new stuff Im proposing
> >    ##note, a clearer syntax can surely be devised.
> >    """file"""    #documents the type of the first arg
> >    """string"""  #              ""          second 
> >    """list"""    #              ""          third
> >    """string"""  #documents the return type.
> > 
> > Then the pydoc generator will do a check on the #  arguments to the
> > func/meth, verify that the correct amount of these new comments (which
> > only supply the type) are provided. I do think that it would help to
> > actually enforce this. I think its fine that doc's NOT be generated if
> > they don't supply this information. This provides for better docs and
> > shouldnt get that many complaints. 
> 
> Thanks for the clarification. I see what you're trying to do;
> however, I think that any gains are more than offset by the new
> level of complexity and lengthier code.
> 
> The current docs make a pretty good effort at describing what is
> needed for each argument.  At the same time, they allow flexibility
> for dynamic arguments that share a similar interface (such as
> substituting a StringIO object for a File object).
> 
> In your example, the docs strings could be made clear
> using existing tools:
> 
> def something(file, promptstring, optionlist):
>      """Returns a string extracted from the file
>           for any line matching the promptstring.
>           The optionlist can include any of the
>           following:  IGNORECASE, VERBOSE,
>           MULTILINE, or ADDLINENUMBER."""
> 
> I can't see that a tool like you described would add any
> more clarity than the above docstring.
> 
> > PS whats TIA mean?
> 
> "Thanks In Advance"
> 
> Do you have any examples of current python docstrings that are
> not clear enough?
this was the impetus behind my whole thinking here.

I need not search far.
example 1) pydoc os.fork
Python Library Documentation: built-in function fork in os
fork(...)
    fork() -> pid
    Fork a child process.
    
    Return 0 to child process and PID of child to parent process.

example2) pydoc string.index
Python Library Documentation: function index in string
index(s, *args)
    index(s, sub [,start [,end]]) -> int
    
    Like find but raises ValueError when the substring is not found.

From these two, I have no idea what BOTH the input and return types are.

I found those examples in 10 seconds (literally). The state of the
python documentation is caca. And your complacency is a cause for
concern. 

I think it's easier to enforce this from the level I describe, than have
Guido saying "ok guys please be more explicit in your documentation". I
mean, both of those documents above are somewhat explicit, but they are
not COMPLETE.

Could you provide me with some links on parsing Python (from a
compilation/syntax-tree analysis POV), so that I can get to work on
writing a patch for the pydoc generation program?
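
For the syntax-tree side of this, later Python versions ship an `ast`
module in the standard library. A minimal sketch of pulling argument names
and docstrings out of source text follows; the `SOURCE` sample and the
`describe_functions` helper are hypothetical, and this is not how pydoc
itself gathers its information.

```python
import ast

SOURCE = '''
def something(a, b, c="lalal"):
    """This will find its way into the pydocs."""
    return a
'''

def describe_functions(source):
    """Return (name, argument names, docstring) for each function found."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            arg_names = [arg.arg for arg in node.args.args]
            found.append((node.name, arg_names, ast.get_docstring(node)))
    return found
```

A checker of the kind proposed above could compare `len(arg_names)` with
the number of type annotations it finds in the docstring.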


> 
> 
> Raymond Hettinger
> 
> 




From guido@python.org  Thu Sep  5 21:46:27 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Sep 2002 16:46:27 -0400
Subject: [Python-Dev] Re: Signal-resistant code
In-Reply-To: Your message of "Thu, 05 Sep 2002 23:27:16 +0300."
 <20020905232716.A8225@hishome.net>
References: <15733.11253.743055.864572@12-248-11-90.client.attbi.com> <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net> <20020904124646.GA79746@hishome.net> <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net> <20020904160143.GA1483@hishome.net> <200209042005.g84K5Ms08177@pcp02138704pcs.reston01.va.comcast.net> <20020905045414.GA26104@hishome.net> <200209051501.g85F1EY13017@odiug.zope.com>
 <20020905232716.A8225@hishome.net>
Message-ID: <200209052046.g85KkR714802@odiug.zope.com>

> > I admit that I hate signals so badly that whenever I needed to wait
> > for a child to finish I would always structure the program around this
> > need (even when coding in C).
> 
> Ummm... if you really hate signals that much perhaps you should step aside 
> from this particular discussion? Naturally, you will get to pronounce on 
> the results that come out of it (if any ;-)

Why?  I don't think hating signals disqualifies me from understanding
their problems.

> So what are the three problems of signals?  
> 
> One - what calls are allowed by the platform inside a signal handler.
> No problem.  Nobody suggested actually executing Python code inside a
> signal handler so we don't need to be worried about user code. The C
> handler doesn't call anything unusual, just sets flags.  This should work 
> on all platforms.
> 
> Two - Interruptible system calls.  If all Python I/O calls are wrapped
> inside restarting wrappers this should be solved.

I asked what the Python code called by the wrapper when a signal
arrives is allowed to do (e.g. close the file?).  If you replied to
that, I missed it.

> If the system's libc wraps them it can be disabled by SA_RESTART
> (posix) or siginterrupt (BSD).  On some systems read and recv return
> a short buffer instead of EINTR.

This latter sentence shows that you don't understand signals, or
you're being very sloppy.  You get *either* a short buffer *or* EINTR
depending on whether some data was already transferred to user space.

> This can be safely ignored because it only happens for pipes and
> sockets where this is a valid result. AFAIR it's guaranteed not to
> happen on regular files so we won't be tricked into believing they
> reached EOF.

I don't believe that a short read on a regular file can be used
reliably to infer EOF anyway.  The file could be growing while we
read.

> Are there any systems where system calls are interruptible but not
> restartable in any way without data loss?

Not AFAIK.

> Three - Threads playing "who gets the signal".  The Python signal module
> has a hack that appears to work on all relevant platforms - ignore the
> signal if getpid() isn't the main thread's pid.

Doesn't that make signals unreliable?  What if thread 4 has forked a
child, and the child exits?  Won't the SIGCHLD be sent to thread 4?
AFAIK there's no standard for this, or if there is, not all systems
comply.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Thu Sep  5 21:45:52 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 5 Sep 2002 16:45:52 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <20020905185228.GA19726@glacier.arctrix.com>
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com> <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net> <oq1y89jvrr.fsf@carouge.sram.qc.ca> <20020905093002.GA61136@hishome.net> <20020905185228.GA19726@glacier.arctrix.com>
Message-ID: <20020905204552.GA51795@hishome.net>

On Thu, Sep 05, 2002 at 11:52:28AM -0700, Neil Schemenauer wrote:
> 
> Signals are a gigantic mess.  I'm starting to doubt that you realize the
> extent of the brain damage.  While I would be pleased if there was some
> way Python could hide the mess, I'm not convinced it is possible.
> 
>   Neil

Ah... I can almost hear the pain, frustration and despair in your voice.
Obviously Guido and you got burned by this. I know other old-time Unix 
hackers with the same attitude. From my experience signals on Linux work 
just fine - I don't carry any signal scars. I can show off my Oracle 
scars, though. They're really gnarly. I can't hear that name mentioned 
without turning completely irrational about it. Certain embedded software 
and hardware makers also make me want to scream.

	Oren



From guido@python.org  Thu Sep  5 21:51:57 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 05 Sep 2002 16:51:57 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Thu, 05 Sep 2002 16:45:52 EDT."
 <20020905204552.GA51795@hishome.net>
References: <3FE2540C-C047-11D6-89C6-000A27B19B96@oratrix.com> <200209042048.g84KmCK08365@pcp02138704pcs.reston01.va.comcast.net> <oq1y89jvrr.fsf@carouge.sram.qc.ca> <20020905093002.GA61136@hishome.net> <20020905185228.GA19726@glacier.arctrix.com>
 <20020905204552.GA51795@hishome.net>
Message-ID: <200209052059.g85Kxop14949@odiug.zope.com>

> From my experience signals on Linux work just fine - I don't carry
> any signal scars.

That just shows you haven't written enough signal code. :-)

Seriously, let's please not confuse Linux with portable.  The issues
here are about the cross-platform viability of your suggested
approach.  If you've only used signals on Linux, maybe you should
withdraw yourself on account of lack of experience with the real
issues.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paul-python@svensson.org  Thu Sep  5 22:08:54 2002
From: paul-python@svensson.org (Paul Svensson)
Date: Thu, 5 Sep 2002 17:08:54 -0400 (EDT)
Subject: [Python-Dev] Re: Signal-resistant code
In-Reply-To: <20020905232716.A8225@hishome.net>
Message-ID: <Pine.LNX.4.44.0209051657330.6840-100000@familjen.svensson.org>

On Thu, 5 Sep 2002, Oren Tirosh wrote:

>So what are the three problems of signals?

>Two - Interruptible system calls.  If all Python I/O calls are wrapped
>inside restarting wrappers this should be solved. If the system's libc
>wraps them it can be disabled by SA_RESTART (posix) or siginterrupt (BSD).
>On some systems read and recv return a short buffer instead of EINTR. This
>can be safely ignored because it only happens for pipes and sockets where
>this is a valid result. AFAIR it's guaranteed not to happen on regular
>files so we won't be tricked into believing they reached EOF.  Are there
>any systems where system calls are interruptible but not restartable
>in any way without data loss?

I don't see any guarantee against short reads in my documentation (Linux,
HP-UX); indeed both state explicitly that only a 0 return from read()
indicates EOF.

	/Paul




From neal@metaslash.com  Thu Sep  5 22:10:10 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 05 Sep 2002 17:10:10 -0400
Subject: [Python-Dev] SF patch#555779, "import user" and Apache... *humble*
References: <1031257667.16739.5.camel@jeremy>
Message-ID: <3D77C832.FD342FCD@metaslash.com>

Stephen Hansen wrote:
> 
> I submitted a really tiny patch to SF awhile back, #555779, which would
> make "import user" actually useful in a certain specific CGI situation.
> The BDFL seemed to have no problems and said anyone could commit it.. 

Done.

Neal


From python@rcn.com  Thu Sep  5 21:49:21 2002
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 5 Sep 2002 16:49:21 -0400
Subject: [Python-Dev] SF patch#555779, "import user" and Apache... *humble*
References: <1031257667.16739.5.camel@jeremy>
Message-ID: <006801c2551d$b6e0e600$3961accf@othello>

I'll check it in for you when I get back from class this evening.

Raymond Hettinger 


BTW, no need for humility around here.


----- Original Message ----- 
From: "Stephen Hansen" <stephen@ixokai.net>
To: <python-dev@python.org>
Sent: Thursday, September 05, 2002 4:27 PM
Subject: [Python-Dev] SF patch#555779, "import user" and Apache... *humble*


> *cough*
> 
> So. Hi. Python-Gods. Um. So. Anyways. *embarrassed*
> 
> I submitted a really tiny patch to SF awhile back, #555779, which would
> make "import user" actually useful in a certain specific CGI situation.
> The BDFL seemed to have no problems and said anyone could commit it.. no
> one has. :) Now, i'm not impatient at all, it's already patched into all
> the machines i'm working on... however, i'm just sending this little
> reminder in the hopes that it won't be forgotten until after 2.3 comes
> out. :) I don't want to re-patch everything again later, i've got quite
> a few machines currently using it. :)
> 
> *cough* So. Yes. Well. Thank you for your time. :)
> 
> *runs away*
> 
> --Stephen
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> 



From fredrik@pythonware.com  Thu Sep  5 23:06:22 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 6 Sep 2002 00:06:22 +0200
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
References: <1031437860.636.29.camel@HillCountryPeress><m3it1l8iu9.fsf@mira.informatik.hu-berlin.de><1031442464.644.68.camel@HillCountryPeress> <003d01c25471$d83fe960$2fd8accf@othello> <1031451760.644.97.camel@HillCountryPeress>
Message-ID: <004701c25528$7c8b4530$ced241d5@hagrid>

hunter wrote:

> I need not search far.
> example 1) pydoc os.fork
> Python Library Documentation: built-in function fork in os
> fork(...)
>     fork() -> pid
>     Fork a child process.
>     
>     Return 0 to child process and PID of child to parent process.

why do you care about the type of a PID object?  in most
cases, all you need to know is that a PID isn't 0, which is
exactly what the documentation says.

and if you know what a PID is, you already know what type
it is...

> example2) pydoc string.index
> Python Library Documentation: function index in string
> index(s, *args)
>     index(s, sub [,start [,end]]) -> int
>     
>     Like find but raises ValueError when the substring is not found.
> 
> From these two, I have no idea what BOTH the input and return
> types are.

the index documentation refers to the documentation
for "find", which tells you that:

>>> help(string.find)
Help on function find in module string:

find(s, *args)
    find(s, sub [,start [,end]]) -> int

    Return the lowest index in s where substring sub is found,
    such that sub is contained within s[start,end].  Optional
    arguments start and end are interpreted as in slice notation.

    Return -1 on failure.

which, given that you know how indexes and slices work in
python, is all you need to know.
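
a small illustration of the find/index contract described by those
docstrings, written against the string methods rather than the string
module functions (an assumption made so that it also runs on later
Pythons; the sample text is made up):

```python
text = "spam and eggs"

# find returns the lowest index, or -1 on failure
found_at = text.find("eggs")
missing = text.find("ham")

# start and end are interpreted as in slice notation
after_three = text.find("a", 3)       # searches text[3:]
in_window = text.find("a", 3, 5)      # searches text[3:5]

# index behaves like find but raises ValueError instead of returning -1
try:
    text.index("ham")
    index_raised = False
except ValueError:
    index_raised = True
```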

> I found those examples in 10 seconds (literally). The state of the
> python documentation is caca.

how long have you been using Python?

</F>



From oren-py-d@hishome.net  Thu Sep  5 23:23:30 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 6 Sep 2002 01:23:30 +0300
Subject: [Python-Dev] Re: Signal-resistant code
In-Reply-To: <200209052046.g85KkR714802@odiug.zope.com>; from guido@python.org on Thu, Sep 05, 2002 at 04:46:27PM -0400
References: <20020904094947.GA56953@hishome.net> <200209041144.g84BiXZ05244@pcp02138704pcs.reston01.va.comcast.net> <20020904124646.GA79746@hishome.net> <200209041325.g84DP1o06695@pcp02138704pcs.reston01.va.comcast.net> <20020904160143.GA1483@hishome.net> <200209042005.g84K5Ms08177@pcp02138704pcs.reston01.va.comcast.net> <20020905045414.GA26104@hishome.net> <200209051501.g85F1EY13017@odiug.zope.com> <20020905232716.A8225@hishome.net> <200209052046.g85KkR714802@odiug.zope.com>
Message-ID: <20020906012330.A10575@hishome.net>

On Thu, Sep 05, 2002 at 04:46:27PM -0400, Guido van Rossum wrote:
> > > I admit that I hate signals so badly that whenever I needed to wait
> > > for a child to finish I would always structure the program around this
> > > need (even when coding in C).
> > 
> > Ummm... if you really hate signals that much perhaps you should step aside 
> > from this particular discussion? Naturally, you will get to pronounce on 
> > the results that come out of it (if any ;-)
> 
> Why?  I don't think hating signals disqualifies me from understanding
> their problems.

In the past I have disqualified myself from making technical decisions on 
issues where I have been burned and knew that my opinion would not be a
calm rational decision.

> I asked what the Python code called by the wrapper when a signal
> arrives is allowed to do (e.g. close the file?).  If you replied to
> that, I missed it.

Anything that a Python thread is allowed to do without grabbing a 
lock, i.e. anything that involves only exclusive resources or atomic 
Python operations on shared resources like setting a variable (but 
not read-modify-write). A signal handler is also allowed to raise an 
exception that will get delivered to the main thread.

> > If the system's libc wraps them it can be disabled by SA_RESTART
> > (posix) or siginterrupt (BSD).  On some systems read and recv return
> > a short buffer instead of EINTR.
> 
> This latter sentence shows that you don't understand signals, or
> you're being very sloppy.  You get *either* a short buffer *or* EINTR
> depending on whether some data was already transferred to user space.

Did I say that you get *both* a short buffer *and* EINTR?  What I meant 
is that it's really quite simple - if errno==EINTR I retry and if I get 
a short buffer I continue from whatever I got and ask for the remainder
and this should work regardless of the differences in behavior between 
different systems, sockets and files, etc.
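
The retry-and-continue loop described here could be sketched in Python
roughly as follows. The name `read_exact` is made up for illustration. Note
also that later Python versions (3.5 and up, per PEP 475) already retry
system calls that fail with EINTR, so the `except` branch is kept only to
show the shape of the logic.

```python
import os

def read_exact(fd, n):
    """Read exactly n bytes from fd unless EOF arrives first."""
    chunks = []
    remaining = n
    while remaining > 0:
        try:
            chunk = os.read(fd, remaining)
        except InterruptedError:
            # errno == EINTR: nothing was transferred, so just retry.
            continue
        if not chunk:
            # Only a zero-length result means EOF, not a short one.
            break
        # Short read: keep what we got and ask for the remainder.
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller can tell EOF apart from success by comparing the length of the
result with `n`.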

> > This can be safely ignored because it only happens for pipes and
> > sockets where this is a valid result. AFAIR it's guaranteed not to
> > happen on regular files so we won't be tricked into believing they
> > reached EOF.
> 
> I don't believe that a short read on a regular file can be used
> reliably to infer EOF anyway.  The file could be growing while we read.

You're right, only a zero result on read should be interpreted as EOF,
not a short result. I got confused by fread where a short read does mark
an end of file condition.  I don't see how the growing file case is
relevant, though.

> > Three - Threads playing "who gets the signal".  The Python signal module
> > has a hack that appears to work on all relevant platforms - ignore the
> > signal if getpid() isn't the main thread's pid.
> 
> Doesn't that make signals unreliable?  What if thread 4 has forked a
> child, and the child exits?  Won't the SIGCHLD be sent to thread 4?
> AFAIK there's no standard for this, or if there is, not all systems
> comply.

I've never actually tried this one. I just went by the comments in 
signalmodule.c which claim that this works for all cases of how different 
implementations deliver signals to threads. I guess that was a bit hasty.

	Oren



From list-python@ccraig.org  Fri Sep  6 07:20:51 2002
From: list-python@ccraig.org (Christopher A. Craig)
Date: 06 Sep 2002 02:20:51 -0400
Subject: [Python-Dev] Documentation inconsistency in re
Message-ID: <t1w65xjy6yk.fsf@kermit.wreck.org>

From the Library Reference (2.2.1):

\b    Matches the empty string, but only at the beginning or end of a
      word. A word is defined as a sequence of alphanumeric characters, so
      the end of a word is indicated by whitespace or a non-alphanumeric
      character. Inside a character range, \b represents the backspace
      character, for compatibility with Python's string literals.

Now reality:

Python 2.2.1 (#2, Apr 22 2002, 17:53:10) 
[GCC 2.95.4 20011002 (Debian prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> t = re.compile(r'\bbag\b')
>>> t.search('test bag')
<_sre.SRE_Match object at 0x812aad0>
>>> t.search('test+bag')
<_sre.SRE_Match object at 0x815d528>
>>> t.search('test_bag')
>>> [ chr(i) for i in xrange(256) if not t.search('test' + chr(i) + 'bag') ]
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D',
'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R',
'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '_', 'a', 'b', 'c', 'd', 'e',
'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's',
't', 'u', 'v', 'w', 'x', 'y', 'z']
>>> 


So the implementation appears to define a word as a sequence of
alphanumeric characters or underscores, which means either the
documentation, or the library is wrong.  Now it happens that this was
found while a friend of mine and I were looking to get the exact
behavior that is implemented, so I'd prefer it if the documentation
were updated to meet the implementation <.8 wink>.

-- 
Christopher A. Craig <list-python@ccraig.org>
I develop for Linux for a living, I used to develop for DOS.  Going from
DOS to Linux is like trading a glider for an F117. - Lawrence Foard


From fredrik@pythonware.com  Fri Sep  6 07:47:12 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 6 Sep 2002 08:47:12 +0200
Subject: [Python-Dev] Documentation inconsistency in re
References: <t1w65xjy6yk.fsf@kermit.wreck.org>
Message-ID: <00e401c25571$3e6bc1f0$ced241d5@hagrid>

Christopher A. Craig wrote:

> From the Library Reference (2.2.1):
> 
> \b    Matches the empty string, but only at the beginning or end of a
>       word. A word is defined as a sequence of alphanumeric characters, so
>       the end of a word is indicated by whitespace or a non-alphanumeric
>       character. Inside a character range, \b represents the backspace
>       character, for compatibility with Python's string literals.

as you suspected, the documentation is flawed: \b is defined
in terms of \w and \W.
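
a quick way to see this, written as a sketch for later Python versions
(restricted to the ASCII range; the variable names are illustrative): the
characters that stop \b from matching next to 'bag' are exactly the ones
that \w matches, underscore included.

```python
import re

pattern = re.compile(r'\bbag\b')

# Characters that, placed directly before 'bag', prevent a word boundary.
blockers = [chr(i) for i in range(128)
            if not pattern.search('test' + chr(i) + 'bag')]

# Every ASCII character that \w matches.
word_chars = [chr(i) for i in range(128) if re.fullmatch(r'\w', chr(i))]
```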

</F>



From mal@lemburg.com  Fri Sep  6 08:55:13 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 06 Sep 2002 09:55:13 +0200
Subject: [Python-Dev] utf8 issue
References: <200208232105.g7NL5RE16863@pcp02138704pcs.reston01.va.comcast.net> <2mznv9c1k4.fsf@starship.python.net> <200208261405.g7QE5Of05199@pcp02138704pcs.reston01.va.comcast.net>              <3D77205E.8080103@lemburg.com> <200209051351.g85Dpnk12649@odiug.zope.com>
Message-ID: <3D785F61.1090301@lemburg.com>

Guido van Rossum wrote:
>>>Please do.  Bumping MAGIC is a no-no between dot releases.  But I
>>>don't understand why that is necessary?
>>
>>It would be necessary since marshal uses UTF-8 for storing
>>Unicode literals.
> 
> 
> Do you mean that in 2.2 it doesn't?

Marshal has used it since 1.6. The point is that the fix to the
lone surrogate problem resulted in a change of the UTF codec
output. PYCs from unpatched and patched versions wouldn't
interop if they use lone surrogates in Unicode literals. We
usually bump the PYC magic in such a case, to avoid these
issues. Since it's not possible for a patch level release,
we have two choices:

1. leave things as they are

2. apply the fix and live with the consequences of having
    to regenerate PYCs by hand

Just to give an example of the problem:

Python 2.2:
-------------
u'\ud800'.encode('utf-8') == '\xa0\x80'

 >>> unicode('\xa0\x80', 'utf-8')
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: unexpected code byte

 >>> unicode('\xed\xa0\x80', 'utf-8')
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: illegal encoding

Current CVS Python:
---------------------
u'\ud800'.encode('utf-8') == '\xed\xa0\x80'

 >>> unicode('\xed\xa0\x80', 'utf-8')
u'\ud800'
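
For comparison, a sketch of the same round trip on later Python 3
versions, where the behaviour differs from both cases above: the strict
UTF-8 codec refuses a lone surrogate, and the three-byte form has to be
requested with the 'surrogatepass' error handler. The variable names are
illustrative.

```python
lone = '\ud800'

# The strict codec rejects the lone surrogate outright.
try:
    lone.encode('utf-8')
    strict_encodes = True
except UnicodeEncodeError:
    strict_encodes = False

# 'surrogatepass' produces the three-byte sequence and reads it back.
encoded = lone.encode('utf-8', 'surrogatepass')
decoded = encoded.decode('utf-8', 'surrogatepass')
```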

>>Even though it's highly unlikely that the problem cases are used in
>>Python Unicode literals, there's a tiny chance. Without the MAGIC
>>change this could result in PYC files failing to load.
> 
> 
> Ha.  You may have missed the start of this thread, but the whole
> problem was that a PYC file *did* fail to load!  (The .py file had a
> lone surrogate in it.)  So I'm not sure this argument holds much
> water.

Interesting. I wouldn't have expected that.

> Can someone please explain what change would be necessary to what part
> of the code to prevent a lone surrogate in a string literal from
> making the resulting PYC file blow up?

One possibility would be to:

1. change the UTF-8 encoder in Python 2.2 to produce correct
    output

2. let the UTF-8 decoder in Python 2.2 accept the correct
    output *and* the malformed output

I am not sure whether 2. would introduce a security problem.
Perhaps there is a way to restrict the work-around so that
we don't run into UTF-8 encoding attack problems.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From guido@python.org  Fri Sep  6 15:06:21 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 06 Sep 2002 10:06:21 -0400
Subject: [Python-Dev] utf8 issue
In-Reply-To: Your message of "Fri, 06 Sep 2002 09:55:13 +0200."
 <3D785F61.1090301@lemburg.com>
References: <200208232105.g7NL5RE16863@pcp02138704pcs.reston01.va.comcast.net> <2mznv9c1k4.fsf@starship.python.net> <200208261405.g7QE5Of05199@pcp02138704pcs.reston01.va.comcast.net> <3D77205E.8080103@lemburg.com> <200209051351.g85Dpnk12649@odiug.zope.com>
 <3D785F61.1090301@lemburg.com>
Message-ID: <200209061406.g86E6Lu14230@pcp02138704pcs.reston01.va.comcast.net>

[MAL, on UTF-8 for unicode]
> Marshal has used it since 1.6. The point is that the fix to the
> lone surrogate problem resulted in a change of the UTF codec
> output. PYCs from unpatched and patched versions wouldn't
> interop if they use lone surrogates in Unicode literals. We
> usually bump the PYC magic in such a case, to avoid these
> issues. Since it's not possible for a patch level release,
> we have two choices:
> 
> 1. leave things as they are
> 
> 2. apply the fix and live with the consequences of having
>     to regenerate PYCs by hand

[but then later]

> One possibility would be to:
> 
> 1. change the UTF-8 encoder in Python 2.2 to produce correct
>     output
> 
> 2. let the UTF-8 decoder in Python 2.2 accept the correct
>     output *and* the malformed output

This sounds like the right solution.  I hope you can produce a patch
against the release22-maint branch.

> I am not sure whether 2. would introduce a security problem.
> Perhaps there is a way to restrict the work-around so that
> we don't run into UTF-8 encoding attack problems.

I don't see what this vulnerability (if it is one) adds to the already
laughable security of marshal and .pyc files.  If someone you don't
trust can write your .pyc files, they can cause your interpreter to
crash by inserting bogus bytecode.  So I'd say this is a non-issue.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From loewis@informatik.hu-berlin.de  Fri Sep  6 15:12:25 2002
From: loewis@informatik.hu-berlin.de (Martin v. =?iso-8859-1?q?L=F6wis?=)
Date: 06 Sep 2002 16:12:25 +0200
Subject: [Python-Dev] Subsecond time stamps
Message-ID: <j47khzcily.fsf@informatik.hu-berlin.de>

A number of systems provide subsecond time stamp resolution for
files. In particular:

- NFS v3 has nanosecond time stamps.

- Solaris 9 has nanosecond time stamps in stat(2), and microsecond
  time stamps in utimes(2). In addition, they have microsecond time
  stamps on ufs. It appears that other Unices have also extended
  stat(2), as does OS X.

- NTFS has 100ns resolution for time stamps.

I'd like to expose at least the stat extensions to Python. Adding new
fields to stat_result is easy enough, but there are a number of
alternatives:

A. Add an additional field to hold the nanoseconds, i.e. st_mtimensec,
   st_atimensec, st_ctimensec. This is the BSD Posix extension.

B. Follow the Unix API (Solaris and others). They define a
     struct timespec_t {
       time_t tv_sec;
       unsigned long tv_nsec;
     };

  and fields st_mtim, st_ctim, st_atim of timespec_t. For
  compatibility, they

  #define st_mtime st_mtim.tv_sec

  So to get at the seconds, you can write either st_mtim.tv_sec, or
  st_mtime. For the nanoseconds, you write st_mtim.tv_nsec.

  This requires adding a new type.

C. Make st_mtime a floating point number. This won't offer nanosecond
   resolution, as C doubles are not dense enough.
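
To make the trade-off between options A and C concrete, here is a
minimal sketch in present-day Python 3. The helper names as_float()
and nanos_from_float() are made up for illustration and belong to no
stat API; the sample time stamp is arbitrary. It shows that an integer
(seconds, nanoseconds) pair stays exact, while folding the same value
into one double loses the low-order nanoseconds.

```python
# Sketch only: as_float() and nanos_from_float() are hypothetical helpers,
# not part of any stat API.

def as_float(sec, nsec):
    """Collapse a (seconds, nanoseconds) pair into one float (option C)."""
    return sec + nsec * 1e-9

def nanos_from_float(stamp, sec):
    """Recover the nanosecond part from the float, rounded to an integer."""
    return round((stamp - sec) * 1e9)

sec, nsec = 1031326478, 123456789      # option A keeps these two integers exact
stamp = as_float(sec, nsec)
recovered = nanos_from_float(stamp, sec)
error_ns = abs(recovered - nsec)       # nonzero: the float dropped low-order bits
```

For time stamps of this magnitude the error stays below roughly 60 ns,
which is why a double is close to, but not quite at, 100 ns resolution.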

What do you think?

Regards,
Martin


From paul-python@svensson.org  Fri Sep  6 15:31:25 2002
From: paul-python@svensson.org (Paul Svensson)
Date: Fri, 6 Sep 2002 10:31:25 -0400 (EDT)
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <j47khzcily.fsf@informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.44.0209061026500.6840-100000@familjen.svensson.org>

On 6 Sep 2002, Martin v. Löwis wrote:

>A number of systems provide subsecond time stamp resolution for
>files. In particular:
>
>- NFS v3 has nanosecond time stamps.
>
>- Solaris 9 has nanosecond time stamps in stat(2), and microsecond
>  time stamps in utimes(2). In addition, they have microsecond time
>  stamps on ufs. It appears that other Unices have also extended
>  stat(2), as does OS X.
>
>- NTFS has 100ns resolution for time stamps.

(---)

>C. Make st_mtime a floating point number. This won't offer nanosecond
>   resolution, as C doubles are not dense enough.

This seems to me the most Pythonic way.
Are C doubles dense enough to offer 100 ns resolution ?

	/Paul



From skip@pobox.com  Fri Sep  6 15:39:02 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 6 Sep 2002 09:39:02 -0500
Subject: [Python-Dev] Documentation inconsistency in re
In-Reply-To: <t1w65xjy6yk.fsf@kermit.wreck.org>
References: <t1w65xjy6yk.fsf@kermit.wreck.org>
Message-ID: <15736.48646.910216.93578@12-248-11-90.client.attbi.com>

    Christopher> So the implementation appears to define a word as a
    Christopher> sequence of alphanumeric characters or underscores, which
    Christopher> means either the documentation, or the library is wrong.

Documentation has been fixed.

Skip


From erik@pythonware.com  Fri Sep  6 15:44:16 2002
From: erik@pythonware.com (erik heneryd)
Date: Fri, 06 Sep 2002 16:44:16 +0200
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
References: <1031437860.636.29.camel@HillCountryPeress>	<m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>	<1031442464.644.68.camel@HillCountryPeress> 	<003d01c25471$d83fe960$2fd8accf@othello> <1031451760.644.97.camel@HillCountryPeress>
Message-ID: <3D78BF40.1030609@pythonware.com>

Hunter Peress wrote:

>example 1) pydoc os.fork
>Python Library Documentation: built-in function fork in os
>fork(...)
>    fork() -> pid
>    Fork a child process.
>    
>    Return 0 to child process and PID of child to parent process.
>  
>

my only objection is that the case where fork fails isn't documented.
with a C background one expects a negative number, when in fact an
exception is raised...
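
A short sketch of what that looks like in practice, written for
present-day Python 3 on a Unix system (os.fork is not available
everywhere). The wrapper name run_child() is hypothetical. The point
is only that a failed fork surfaces as OSError rather than as a
negative return value.

```python
import os

def run_child(exit_code):
    """Fork, let the child exit at once, and return its exit status."""
    try:
        pid = os.fork()             # C's fork() would return -1 on failure
    except OSError as exc:          # Python reports the same failure as an exception
        raise RuntimeError("fork failed: %s" % os.strerror(exc.errno))
    if pid == 0:
        os._exit(exit_code)         # child: leave immediately, skipping cleanup
    _, status = os.waitpid(pid, 0)  # parent: reap the child
    return os.WEXITSTATUS(status)

status = run_child(7)
```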

erik




From guido@python.org  Fri Sep  6 15:41:54 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 06 Sep 2002 10:41:54 -0400
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
In-Reply-To: Your message of "Fri, 06 Sep 2002 16:44:16 +0200."
 <3D78BF40.1030609@pythonware.com>
References: <1031437860.636.29.camel@HillCountryPeress> <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de> <1031442464.644.68.camel@HillCountryPeress> <003d01c25471$d83fe960$2fd8accf@othello> <1031451760.644.97.camel@HillCountryPeress>
 <3D78BF40.1030609@pythonware.com>
Message-ID: <200209061441.g86EfsV14529@pcp02138704pcs.reston01.va.comcast.net>

> my only objection is that the case where fork fails isn't documented.
> with a C background one expects a negative number, when in fact an 
> exception is raised...

Ah jeez.  Even with only half a day of Python you should've figured
out that Python nearly always raises an exception where the
corresponding C code returns an error value.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Fri Sep  6 16:03:06 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 6 Sep 2002 17:03:06 +0200
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
References: <1031437860.636.29.camel@HillCountryPeress> <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de> <1031442464.644.68.camel@HillCountryPeress> <003d01c25471$d83fe960$2fd8accf@othello> <1031451760.644.97.camel@HillCountryPeress>              <3D78BF40.1030609@pythonware.com>  <200209061441.g86EfsV14529@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <00f201c255b6$82458c40$0900a8c0@spiff>

guido wrote:

> > my only objection is that the case where fork fails isn't documented.
> > with a C background one expects a negative number, when in fact an
> > exception is raised...
>
> Ah jeez.  Even with only half a day of Python you should've figured
> out that Python nearly always raises an exception where the
> corresponding C code returns an error value.

otoh, it doesn't hurt to spell it out for functions like fork, which
almost always succeed...

(can you write a portable test that is guaranteed to raise an
exception, and does that without locking up the system?)

</F>



From Jack.Jansen@oratrix.com  Fri Sep  6 16:09:16 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Fri, 6 Sep 2002 17:09:16 +0200
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209051501.g85F1EY13017@odiug.zope.com>
Message-ID: <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com>

On Thursday, September 5, 2002, at 05:01, Guido van Rossum wrote:
>> Code in signal handlers is executed at some arbitrary point in the
>> program and the programmer should be aware of this and only do
>> simple things like setting a flag or appending to a list.
>
> Unfortunately the mechanism doesn't enforce this.  I wish we could
> invent a Python signal API that only lets you do one of these simple
> things.

Could we connect signals to semaphores or locks or something 
like that? That would allow you to do the two things that i 
think are worth doing in a signal handler: setting a flag and/or 
making some other part of the code wake up.

Only problem is that for completeness you would really want to 
wire up select-like functionality too, so that you could really 
have a single waiting mechanism.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From md9ms@mdstud.chalmers.se  Fri Sep  6 16:10:21 2002
From: md9ms@mdstud.chalmers.se (Martin Sjögren)
Date: 06 Sep 2002 17:10:21 +0200
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
In-Reply-To: <200209061441.g86EfsV14529@pcp02138704pcs.reston01.va.comcast.net>
References: <1031437860.636.29.camel@HillCountryPeress>
 <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>
 <1031442464.644.68.camel@HillCountryPeress>
 <003d01c25471$d83fe960$2fd8accf@othello>
 <1031451760.644.97.camel@HillCountryPeress>
 <3D78BF40.1030609@pythonware.com>
 <200209061441.g86EfsV14529@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1031325022.587.1.camel@winterfell>

On Fri, 2002-09-06 at 16:41, Guido van Rossum wrote:
> > my only objection is that the case where fork fails isn't documented.
> > with a C background one expects a negative number, when in fact an
> > exception is raised...
>
> Ah jeez.  Even with only half a day of Python you should've figured
> out that Python nearly always raises an exception where the
> corresponding C code returns an error value.

It would, however, be extremely useful if the documentation spelled out
*which* exceptions can be raised! Kind of hard to write a decent
try/except clause if you don't know what to expect.


Regards,
Martin



From guido@python.org  Fri Sep  6 16:12:14 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 06 Sep 2002 11:12:14 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Fri, 06 Sep 2002 17:09:16 +0200."
 <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com>
References: <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com>
Message-ID: <200209061512.g86FCF314849@pcp02138704pcs.reston01.va.comcast.net>

> Could we connect signals to semaphores or locks or something 
> like that? That would allow you to do the two things that i 
> think are worth doing in a signal handler: setting a flag and/or 
> making some other part of the code wake up.

But that mixes signals with threads, which is even more poorly
standardized than signals in general.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Sep  6 16:13:22 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 06 Sep 2002 11:13:22 -0400
Subject: [Python-Dev] Call for clarity ( clarification ;-) )
In-Reply-To: Your message of "Fri, 06 Sep 2002 17:10:21 +0200."
 <1031325022.587.1.camel@winterfell>
References: <1031437860.636.29.camel@HillCountryPeress> <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de> <1031442464.644.68.camel@HillCountryPeress> <003d01c25471$d83fe960$2fd8accf@othello> <1031451760.644.97.camel@HillCountryPeress> <3D78BF40.1030609@pythonware.com> <200209061441.g86EfsV14529@pcp02138704pcs.reston01.va.comcast.net>
 <1031325022.587.1.camel@winterfell>
Message-ID: <200209061513.g86FDXi14877@pcp02138704pcs.reston01.va.comcast.net>

> It would, however, be extremely useful if the documentation spelled out
> *which* exceptions can be raised! Kind of hard to write a decent
> try/except clause if you don't know what to expect.

Yes, *this* is a deficiency in the Python docs that ought to be
fixed.  It's a lot of work though, and it's not always clear what to
document (e.g. *everything* can raise MemoryError -- so it's not
useful to mention that everywhere).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Sep  6 16:30:40 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 06 Sep 2002 11:30:40 -0400
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: Your message of "Fri, 06 Sep 2002 16:12:25 +0200."
 <j47khzcily.fsf@informatik.hu-berlin.de>
References: <j47khzcily.fsf@informatik.hu-berlin.de>
Message-ID: <200209061530.g86FUeq15029@pcp02138704pcs.reston01.va.comcast.net>

> C. Make st_mtime a floating point number. This won't offer nanosecond
>    resolution, as C doubles are not dense enough.

This is the most Pythonic approach.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From neal@metaslash.com  Fri Sep  6 16:36:40 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 06 Sep 2002 11:36:40 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly
 unrelated ideas)
References: <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com> <200209061512.g86FCF314849@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D78CB88.9E642F78@metaslash.com>

Guido van Rossum wrote:
> 
> > Could we connect signals to semaphores or locks or something
> > like that? That would allow you to do the two things that i
> > think are worth doing in a signal handler: setting a flag and/or
> > making some other part of the code wake up.
> 
> But that mixes signals with threads, which is even more poorly
> standardized than signals in general.

Python can open a pipe to itself.  When a signal arrives, write
a character on the pipe in addition to setting a flag.  
Then select() on the pipe.

I doubt this is worth the effort, though.

Neal


From martin@v.loewis.de  Fri Sep  6 16:40:51 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 06 Sep 2002 17:40:51 +0200
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <Pine.LNX.4.44.0209061026500.6840-100000@familjen.svensson.org>
References: <Pine.LNX.4.44.0209061026500.6840-100000@familjen.svensson.org>
Message-ID: <m3lm6funwc.fsf@mira.informatik.hu-berlin.de>

Paul Svensson <paul-python@svensson.org> writes:

> This seems to me the most Pythonic way.
> Are C doubles dense enough to offer 100 ns resolution ?

It looks like they are:

>>> time.time()
1031326478.373606
>>> 1031326478 + 1e-6
1031326478.000001
>>> 1031326478 + 1e-7
1031326478.0000001
>>> 1031326478 + 1e-8
1031326478.0

but only just so:
>>> 1031326478 + 2e-7
1031326478.0000002
>>> 1031326478 + 3e-7
1031326478.0000004
>>> 1031326478 + 4e-7
1031326478.0000004

I admit that this looks tempting, but I'm worried about applications
that break because they expect time stamps in struct stat to be
integers.
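
The interpreter session above can be explained by looking at the gap
between adjacent doubles near the current time. This is a small sketch
in present-day Python 3, assuming IEEE 754 doubles; spacing() is a
helper name invented for the example.

```python
import math

def spacing(x):
    """Gap between x and the next larger double (assumes IEEE 754 doubles)."""
    mantissa, exponent = math.frexp(x)   # x == mantissa * 2**exponent, 0.5 <= mantissa < 1
    return math.ldexp(1.0, exponent - 53)

now = 1031326478.0
gap = spacing(now)                        # 2**-23 seconds, roughly 119 ns
same = (now + 3e-7) == (now + 4e-7)       # both round to the same double
```

With a gap of about 119 ns, offsets of 3e-7 and 4e-7 land on the same
representable value, which matches the two identical results shown.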

Regards,
Martin


From guido@python.org  Fri Sep  6 16:42:33 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 06 Sep 2002 11:42:33 -0400
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: Your message of "Fri, 06 Sep 2002 17:40:51 +0200."
 <m3lm6funwc.fsf@mira.informatik.hu-berlin.de>
References: <Pine.LNX.4.44.0209061026500.6840-100000@familjen.svensson.org>
 <m3lm6funwc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200209061542.g86FgXt15105@pcp02138704pcs.reston01.va.comcast.net>

> > This seems to me the most Pythonic way.
> 
> I admit that this looks tempting, but I'm worried about applications
> that break because they expect time stamps in struct stat to be
> integers.

Hm, so maybe new field names is still the way to go.  E.g. st_mtime
gives an int, st_mtimef gives a float.  The tuple version only gives
the int.  If the system doesn't support subsecond resolution, the
st_mtimef field still exists but is an int (no point allocating a
float and converting the int).
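
For comparison only, and not as a statement of what was decided in
this thread: present-day Python 3 exposes several views of the same
time stamp on the stat result. The sketch below assumes Python 3.3 or
later, where os.stat() results carry st_mtime (a float) and
st_mtime_ns (an int) and os.utime() accepts an ns argument;
mtime_views() is a hypothetical helper.

```python
import os
import tempfile

def mtime_views(path):
    """Return the modification time as whole seconds, float seconds, and nanoseconds."""
    st = os.stat(path)
    return int(st.st_mtime), st.st_mtime, st.st_mtime_ns

stamp_ns = 1031326478 * 10**9 + 500000000
with tempfile.NamedTemporaryFile() as tmp:
    os.utime(tmp.name, ns=(stamp_ns, stamp_ns))   # set atime and mtime explicitly
    whole, as_float, as_ns = mtime_views(tmp.name)
```

Whether the fractional part survives depends on the file system, so
code that needs it should not assume more than whole seconds.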

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Fri Sep  6 16:50:45 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 06 Sep 2002 11:50:45 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly
 unrelated ideas)
In-Reply-To: <3D78CB88.9E642F78@metaslash.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHLBCAB.tim.one@comcast.net>

[Neal Norwitz]
> Python can open a pipe to itself.  When a signal arrives, write
> a character on the pipe in addition to setting a flag.
> Then select() on the pipe.

Of course you meant to say it should do WaitForSingleObject(), so that this
scheme is portable <wink>.

> I doubt this is worth the effort, though.

Few things are.



From tim.one@comcast.net  Fri Sep  6 17:01:43 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 06 Sep 2002 12:01:43 -0400
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <Pine.LNX.4.44.0209061026500.6840-100000@familjen.svensson.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHNBCAB.tim.one@comcast.net>

[Paul Svensson]
> Are C doubles dense enough to offer 100 ns resolution ?

The question can't be answered unless you also specify how many years you
want to cover.  It takes about 25 bits to distinguish a year's worth of
seconds, and an IEEE double has 53 bits to play with.  So if you were only
interested in representing one year, you've got about 28 bits left to play
with.  If you want to cover an N-year span, you've got about 28 - log2(N)
bits to play with.  It takes a bit over 23 bits to distinguish the number of
100 ns slices in a second, so N has to be small enough that 5 - log2(N)
doesn't go negative.  So if you count the start of the epoch at 1970, you've
just created a year 2003 problem <wink>.
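
The bit budget above can be checked with a few lines of arithmetic.
This is a sketch in present-day Python 3; the constant names are
invented for the example and the year length is an assumed average of
365.25 days.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600     # assumption: average Julian year
TICKS_PER_SECOND = 10 ** 7                # 100 ns slices in one second
MANTISSA_BITS = 53                        # IEEE 754 double

bits_per_year = math.log2(SECONDS_PER_YEAR)        # about 24.9
bits_per_tick = math.log2(TICKS_PER_SECOND)        # about 23.25
spare_bits = MANTISSA_BITS - bits_per_year - bits_per_tick
max_years = 2 ** spare_bits                        # span where 100 ns still resolves
```

With the rounded figures used in the text (25 and 23 bits) the spare
budget is 5 bits, or 32 years. Without rounding it comes out a little
lower, somewhere between 28 and 29 years. Either way, a 1970 epoch
runs out of 100 ns resolution at about the time of this thread.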



From oren-py-d@hishome.net  Fri Sep  6 17:54:49 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 6 Sep 2002 19:54:49 +0300
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com>; from Jack.Jansen@oratrix.com on Fri, Sep 06, 2002 at 05:09:16PM +0200
References: <200209051501.g85F1EY13017@odiug.zope.com> <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com>
Message-ID: <20020906195449.A23347@hishome.net>

On Fri, Sep 06, 2002 at 05:09:16PM +0200, Jack Jansen wrote:
> Could we connect signals to semaphores or locks or something 
> like that? That would allow you to do the two things that i 
> think are worth doing in a signal handler: setting a flag and/or 
> making some other part of the code wake up.

Signal handlers and locks don't mix well. A signal handler can't grab a
lock. The signal handler can't wait for the lock to be released because
it has interrupted the code holding it. The traditional way this has been
handled is with a global "interrupt enable" flag.  Just like the good old
days of 8 bit micros and DOS when any application could clear the
interrupt flag :-)

If Queue.Queue sets up a signal critical section as well as getting the
queue lock a signal could write to a Queue and wake up a thread waiting
on the other end.

> Only problem is that for completeness you would really want to 
> wire up select-like functionality too, so that you could really 
> have a single waiting mechanism.

If the program uses select as the central dispatcher you can set up a
pipe. The signal handler writes to one end and the other end is listed in
the select socket map. It's a simple way to handle an occasional event
like a child process dying or a SIGHUP telling you to reload the
configuration file. Do you want to use signals for more intensive tasks
like asynchronous I/O?
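
The pipe arrangement described above might look roughly like this in
present-day Python 3 on Unix. It is a minimal sketch, not a proposal
for the interpreter: on_signal(), wait_for_signal() and the seen list
are names invented here, and SIGUSR1 sent to the process itself stands
in for a real external event.

```python
import os
import select
import signal

read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)       # never let the handler block on a full pipe
seen = []                               # the "flag": signal numbers observed so far

def on_signal(signum, frame):
    seen.append(signum)                 # set the flag ...
    try:
        os.write(write_fd, b"\0")       # ... and wake up whoever is in select()
    except BlockingIOError:
        pass                            # pipe already full: a wakeup is pending anyway

def wait_for_signal(timeout):
    """Block in select() until the handler writes to the pipe or timeout expires."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if ready:
        os.read(read_fd, 512)           # drain the wakeup bytes
    return bool(ready)

signal.signal(signal.SIGUSR1, on_signal)
os.kill(os.getpid(), signal.SIGUSR1)    # stand-in for an external event
woke = wait_for_signal(5.0)
```

In a real dispatcher read_fd would simply be one more entry in the
descriptor list already passed to select(). Current CPython also
offers signal.set_wakeup_fd(), which performs the pipe write at the C
level.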

	Oren



From zack@codesourcery.com  Fri Sep  6 18:28:03 2002
From: zack@codesourcery.com (Zack Weinberg)
Date: Fri, 6 Sep 2002 10:28:03 -0700
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <20020906195449.A23347@hishome.net>
References: <200209051501.g85F1EY13017@odiug.zope.com> <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com> <20020906195449.A23347@hishome.net>
Message-ID: <20020906172803.GP6886@codesourcery.com>

On Fri, Sep 06, 2002 at 07:54:49PM +0300, Oren Tirosh wrote:
> Signal handlers and locks don't mix well. A signal handler can't grab a
> lock. The signal handler can't wait for the lock to be released because
> it has interrupted the code holding it. The traditional way this has been
> handled is with a global "interrupt enable" flag.  Just like the good old
> days of 8 bit micros and DOS when any application could clear the
> interrupt flag :-)
> 
> If Queue.Queue sets up a signal critical section as well as getting the
> queue lock a signal could write to a Queue and wake up a thread waiting
> on the other end.

Would this be an appropriate place to complain about how
KeyboardInterrupt won't wake up a thread stuck waiting on a Queue?

zw


From guido@python.org  Fri Sep  6 18:53:22 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 06 Sep 2002 13:53:22 -0400
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: Your message of "Fri, 06 Sep 2002 10:28:03 PDT."
 <20020906172803.GP6886@codesourcery.com>
References: <200209051501.g85F1EY13017@odiug.zope.com> <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com> <20020906195449.A23347@hishome.net>
 <20020906172803.GP6886@codesourcery.com>
Message-ID: <200209061753.g86HrMx15903@pcp02138704pcs.reston01.va.comcast.net>

> Would this be an appropriate place to complain about how
> KeyboardInterrupt won't wake up a thread stuck waiting on a Queue?

No, unless you have a real proposal on how to fix it (not just a vague
idea -- we've all had those, and they don't work).  Working code or
shut up. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From zack@codesourcery.com  Fri Sep  6 21:52:31 2002
From: zack@codesourcery.com (Zack Weinberg)
Date: Fri, 6 Sep 2002 13:52:31 -0700
Subject: [Python-Dev] Re: Signal-resistant code (was: Two random and nearly unrelated ideas)
In-Reply-To: <200209061753.g86HrMx15903@pcp02138704pcs.reston01.va.comcast.net>
References: <200209051501.g85F1EY13017@odiug.zope.com> <9CA26C2C-C1AA-11D6-8D51-003065517236@oratrix.com> <20020906195449.A23347@hishome.net> <20020906172803.GP6886@codesourcery.com> <200209061753.g86HrMx15903@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020906205231.GQ6886@codesourcery.com>

On Fri, Sep 06, 2002 at 01:53:22PM -0400, Guido van Rossum wrote:
> > Would this be an appropriate place to complain about how
> > KeyboardInterrupt won't wake up a thread stuck waiting on a Queue?
> 
> No, unless you have a real proposal on how to fix it (not just a vague
> idea -- we've all had those, and they don't work).  Working code or
> shut up. :-)

Fair enough.

The underlying problem is that KeyboardInterrupt does not abort
acquire() called on a thread lock.  This is only noticeable when it
was the main thread that called acquire -- if it's some other thread,
the KeyboardInterrupt will still be delivered to the main thread.
Compare the behavior of these two test programs:

-- test1.py --
import time, thread

lock = thread.allocate_lock()
lock.acquire()

def child_thread():
        print "Acquiring lock"
        lock.acquire()
        print "Have lock (can't happen)"
        lock.release()

thread.start_new_thread(child_thread, ())

print "Hit ^C now"
time.sleep(3600)

-- test2.py --
import time, thread

lock = thread.allocate_lock()

def child_thread():
        print "Acquiring lock"
        lock.acquire()
        print "Have lock"
        time.sleep(3600)
        lock.release()

thread.start_new_thread(child_thread, ())

time.sleep(1) # give child a chance to acquire lock
print "Hit ^C now"
lock.acquire()

I'm going to look only at the pthread-based thread support; presumably
changes similar to the ones I will propose need to be made to the
others.

There are two cases of PyThread_acquire_lock in thread_pthread.h: using
semaphores, and using condition variables.  Let's look at the
condition variable one first:

                /* mut must be locked by me -- part of the condition
                 * protocol */
                status = pthread_mutex_lock( &thelock->mut );
                CHECK_STATUS("pthread_mutex_lock[2]");
                while ( thelock->locked ) {
                        status = pthread_cond_wait(&thelock->lock_released,
                                                   &thelock->mut);
                        CHECK_STATUS("pthread_cond_wait");
                }
                thelock->locked = 1;
                status = pthread_mutex_unlock( &thelock->mut );

Naively, we'd like to shove a check of PyOS_InterruptOccurred into that
loop so we can bail out if it's true.  However, it is part of the spec
for pthread_cond_wait that any signal which is handled (as SIGINT is)
will not interrupt its execution.  So in order to get a chance to check
for interrupts we need to change this to a repeated timed wait, like so:

                while ( thelock->locked && !interrupted ) {
                        timeout.tv_sec = time(0) + 1;
                        status = pthread_cond_timedwait(&thelock->lock_released,
                                                        &thelock->mut,
                                                        &timeout);
                        if (status != ETIMEDOUT)
                                CHECK_STATUS("pthread_cond_wait");
                        interrupted = PyOS_InterruptOccurred();
                }
                thelock->locked = 1;
                status = pthread_mutex_unlock( &thelock->mut );

Then we do a bit of fiddling in the return path to reset the interrupt
flag and make sure the caller sees a failure.
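
The same repeated-timed-wait idea can be sketched at the Python level.
This is an analogue in present-day Python 3, not the C change itself:
acquire_polling() is a hypothetical helper, a threading.Event stands
in for the interrupt flag, and the poll interval is an arbitrary
choice.

```python
import threading

def acquire_polling(lock, stop, poll=0.05):
    """Try to take lock, re-checking the stop event every poll seconds.

    Returns True if the lock was acquired, False if stop was set first.
    """
    while not stop.is_set():
        if lock.acquire(timeout=poll):   # timed wait instead of an unbounded one
            return True
    return False

lock = threading.Lock()
lock.acquire()                            # held elsewhere and never released
stop = threading.Event()
threading.Timer(0.2, stop.set).start()    # stand-in for the interrupt arriving
got_it = acquire_polling(lock, stop)
```

The cost of this design is a wakeup every poll interval while waiting,
in exchange for noticing the interrupt within one interval.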

In the semaphore case, life is theoretically simpler: there is no
mutex, and sem_wait is interrupted by a handled signal, assuming
SA_RESTART was not set for that signal (which it isn't, in Python).

        do {
                if (waitflag)
                        status = fix_status(sem_wait(thelock));
                else
                        status = fix_status(sem_trywait(thelock));
        } while (status == EINTR); /* Retry if interrupted by a signal */

becomes

        do {
                if (waitflag)
                        status = fix_status(sem_wait(thelock));
                else
                        status = fix_status(sem_trywait(thelock));
                if (status == EINTR && PyOS_InterruptOccurred())
                        goto interrupted;
        } while (status == EINTR); /* Retry if interrupted by a signal */

...

 interrupted:
        PyErr_SetInterrupt();
        dprintf(("PyThread_acquire_lock(%p, %d) interrupted by user\n",
                 lock, waitflag));
        return 0;

However, the Linux semaphore implementation is buggy and will not
actually return EINTR from sem_wait, ever.  I'll take this up with the
libc maintainers; at the Python level, the thing to do is assume it
works.  Hence, the appended patch.

(While I was at it I fixed CHECK_STATUS so that it actually prints the
relevant system error, instead of whatever junk happens to be in
errno.)

zw

===================================================================
Index: thread_pthread.h
--- thread_pthread.h	17 Mar 2002 17:19:00 -0000	2.40
+++ thread_pthread.h	6 Sep 2002 20:51:31 -0000
@@ -128,7 +128,12 @@ typedef struct {
 	pthread_mutex_t  mut;
 } pthread_lock;
 
-#define CHECK_STATUS(name)  if (status != 0) { perror(name); error = 1; }
+#define CHECK_STATUS(name)  do { 					\
+	if (status != 0) {						\
+		fprintf(stderr, "%s: %s\n", name, strerror(status));	\
+		error = 1;						\
+	}								\
+} while (0) 
 
 /*
  * Initialization.
@@ -387,6 +392,8 @@ PyThread_acquire_lock(PyThread_type_lock
 			status = fix_status(sem_wait(thelock));
 		else
 			status = fix_status(sem_trywait(thelock));
+		if (status == EINTR && PyOS_InterruptOccurred())
+			goto interrupted;
 	} while (status == EINTR); /* Retry if interrupted by a signal */
 
 	if (waitflag) {
@@ -399,6 +406,12 @@ PyThread_acquire_lock(PyThread_type_lock
 
 	dprintf(("PyThread_acquire_lock(%p, %d) -> %d\n", lock, waitflag, success));
 	return success;
+
+ interrupted:
+	PyErr_SetInterrupt();
+	dprintf(("PyThread_acquire_lock(%p, %d) interrupted by user\n",
+		 lock, waitflag));
+	return 0;
 }
 
 void 
@@ -472,8 +485,10 @@ int 
 PyThread_acquire_lock(PyThread_type_lock lock, int waitflag)
 {
 	int success;
+	int interrupted = 0;
 	pthread_lock *thelock = (pthread_lock *)lock;
 	int status, error = 0;
+	struct timespec timeout;
 
 	dprintf(("PyThread_acquire_lock(%p, %d) called\n", lock, waitflag));
 
@@ -491,10 +506,15 @@ PyThread_acquire_lock(PyThread_type_lock
 		 * protocol */
 		status = pthread_mutex_lock( &thelock->mut );
 		CHECK_STATUS("pthread_mutex_lock[2]");
-		while ( thelock->locked ) {
-			status = pthread_cond_wait(&thelock->lock_released,
-						   &thelock->mut);
-			CHECK_STATUS("pthread_cond_wait");
+		timeout.tv_nsec = 0;
+		while ( thelock->locked && !interrupted ) {
+			timeout.tv_sec = time(0) + 1;
+			status = pthread_cond_timedwait(&thelock->lock_released,
+							&thelock->mut,
+							&timeout);
+			if (status != ETIMEDOUT)
+				CHECK_STATUS("pthread_cond_wait");
+			interrupted = PyOS_InterruptOccurred();
 		}
 		thelock->locked = 1;
 		status = pthread_mutex_unlock( &thelock->mut );
@@ -502,6 +522,10 @@ PyThread_acquire_lock(PyThread_type_lock
 		success = 1;
 	}
 	if (error) success = 0;
+	if (interrupted) {
+		PyErr_SetInterrupt();
+		success = 0;
+	}
 	dprintf(("PyThread_acquire_lock(%p, %d) -> %d\n", lock, waitflag, success));
 	return success;
 }


From martin@v.loewis.de  Sat Sep  7 08:35:26 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 07 Sep 2002 09:35:26 +0200
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <200209061542.g86FgXt15105@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.LNX.4.44.0209061026500.6840-100000@familjen.svensson.org>
 <m3lm6funwc.fsf@mira.informatik.hu-berlin.de>
 <200209061542.g86FgXt15105@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3r8g62qwx.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Hm, so maybe new field names is still the way to go.  E.g. st_mtime
> gives an int, st_mtimef gives a float.  The tuple version only gives
> the int.  If the system doesn't support subsecond resolution, the
> st_mtimef field still exists but is an int (no point allocating a
> float and converting the int).

OTOH, I just found that the time values are already floats on the
Mac. Did the change in return value for time.time() cause any problems
at the time it was made?

Regards,
Martin



From aahz@pythoncraft.com  Sat Sep  7 22:44:09 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 7 Sep 2002 17:44:09 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <oqd6rwh179.fsf@titan.progiciels-bpi.ca>
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU> <15730.52469.604124.730029@localhost.localdomain> <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net> <oqd6rwh179.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020907214409.GA1939@panix.com>

On Mon, Sep 02, 2002, François Pinard wrote:
>
> To get the same effects with email addresses, I often prefer using
> `mailto:' as a prefix over writing `<' and `>' around a quoted address
> in a message body, even if not fully systematic about this.  In the
> message header itself, `<' and '>' are the proper way to go, of
> course.

Ewww.  I hate "mailto:" because it interferes with cut'n'paste.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From Jack.Jansen@oratrix.com  Sat Sep  7 23:11:36 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Sun, 8 Sep 2002 00:11:36 +0200
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <m3r8g62qwx.fsf@mira.informatik.hu-berlin.de>
Message-ID: <C6B7AF66-C2AE-11D6-B221-003065517236@oratrix.com>

On Saturday, September 7, 2002, at 09:35, Martin v. Loewis wrote:

> Guido van Rossum <guido@python.org> writes:
>
>> Hm, so maybe new field names is still the way to go.  E.g. st_mtime
>> gives an int, st_mtimef gives a float.  The tuple version only gives
>> the int.  If the system doesn't support subsecond resolution, the
>> st_mtimef field still exists but is an int (no point allocating a
>> float and converting the int).
>
> OTOH, I just found that the time values are already floats on the
> Mac. Did the change in return value for time.time() cause any problems
> at the time it was made?

It's been causing me headaches in the form of failing test
suites about once a year:-) But if I break down the time
problems I have on the Mac (100% of which are due to people
having a completely Unix-centric idea of what a timestamp is) I
would say 90% are due to the Mac epoch being in 1904 instead of
in 1970, 9% are due to Mac timestamps being localtime instead
of GMT and only 1% are due to the timestamps being floats. And
the latter are the easiest to fix, too. The localtime/GMT issues
are the hardest, especially because of DST.

My preference would be that st_mtime and all other such values 
are defined to be cookies (sort of similar to lseek values). You 
would then invoke one of the mythical Python datetime routines 
to convert the cookie into something guaranteed to be of your 
liking. (and this specific datetime routine would be platform 
dependent). If you use the cookie as-is you have a good chance 
of it working, but you're living dangerously (an analogy would 
be opening a binary file without "rb"). But this isn't very 
friendly for backwards compatibility...
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -



From pinard@iro.umontreal.ca  Sat Sep  7 23:50:20 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Sat, 07 Sep 2002 18:50:20 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <20020907214409.GA1939@panix.com> (Aahz's message of "Sat, 7
 Sep 2002 17:44:09 -0400")
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU>
 <15730.52469.604124.730029@localhost.localdomain>
 <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net>
 <oqd6rwh179.fsf@titan.progiciels-bpi.ca>
 <20020907214409.GA1939@panix.com>
Message-ID: <oqr8g5v2hf.fsf@titan.progiciels-bpi.ca>

[Aahz]

> Ewww.  I hate "mailto:" because it interferes with cut'n'paste.

Do I read correctly that you cannot cut and paste a string preceded by
`mailto:'?  Is that what you meant?  What is this interference you mention?

What I like in `mailto:' for text or message bodies, is that my editor and
mail user agent highlight it and make it clickable.  I would be tempted to
guess that other editors do this too, but the truth is that I do not know.
Maybe we should not let the strengths and drawbacks of the various editors we
use drive us into religious feelings for or against a specific markup.  Yet,
such comparisons let us have an overall feeling on the usefulness of a
particular approach.  As long as we resist editor wars, it may be useful.

If reStructuredText is going to gain popularity in the Python developers
community, maybe we should bet in that direction, and prefer the conventions
it proposes for Python-dev summaries and other simple documents.  The bet to
be taken, here, is that our editors and tools would eventually better support
reST, or be supplemented with a dependable set of programs to do so.

On the other hand, it seems that not everybody is comfortable with reST yet,
and this might be a problem if there is strong resistance.  For one, I rather
liked what I saw so far, and without knowing how much time or effort it would
take before I use reST fluently, I would probably be happy to share the bet!

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From guido@python.org  Sun Sep  8 00:24:54 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 07 Sep 2002 19:24:54 -0400
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: Your message of "Sun, 08 Sep 2002 00:11:36 +0200."
 <C6B7AF66-C2AE-11D6-B221-003065517236@oratrix.com>
References: <C6B7AF66-C2AE-11D6-B221-003065517236@oratrix.com>
Message-ID: <200209072324.g87NOsG15613@pcp02138704pcs.reston01.va.comcast.net>

> >> Hm, so maybe new field names is still the way to go.  E.g. st_mtime
> >> gives an int, st_mtimef gives a float.  The tuple version only gives
> >> the int.  If the system doesn't support subsecond resolution, the
> >> st_mtimef field still exists but is an int (no point allocating a
> >> float and converting the int).
> >
> > OTOH, I just found that the time values are already floats on the
> > Mac. Did the change in return value for time.time() cause any problems
> > at the time it was made?
> 
> It's been causing me headaches in the form of failing test 
> suites about once a year:-) But if I break down the time 
> problems I have on the Mac (100% of which are due to people 
> having a completely unix-centric idea of what a timestamp is) I 
> would say 90% are due to the Mac epoch being in 1904 instead of 
> in 1970, 9% are due to Mac timestamps being localtime instead 
> of GMT and only 1% are due to the timestamps being floats. And 
> the latter are the easiest to fix, too. The localtime/gmt issues 
> are the hardest, especially because of DST.

I'm not sure if this can be used as an argument for making st_mtime
and friends floats and be done with it.  I wish it could be, because
in the long run that's a much nicer API than adding new fields.

> My preference would be that st_mtime and all other such values 
> are defined to be cookies (sort of similar to lseek values). You 
> would then invoke one of the mythical Python datetime routines 
> to convert the cookie into something guaranteed to be of your 
> liking. (and this specific datetime routine would be platform 
> dependent). If you use the cookie as-is you have a good chance 
> of it working, but you're living dangerously (an analogy would 
> be opening a binary file without "rb"). But this isn't very 
> friendly for backwards compatibility...

There's at least one place I know of in Python that assumes the epoch
being 1970: calendar.timegm() -- note the line "EPOCH = 1970" right in
front of it. :-)

Would it make sense if the portable Python APIs translated everything
to an epoch of 1970 and UTC?  That's what the Windows C library does.
Very helpful.  (Or is this a problem that's going to disappear with
MacOS X?  I presume it uses UTC and I hope its epoch is 1970?)
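
A minimal sketch of such a translation, assuming classic Mac timestamps
count local-time seconds since 1904-01-01 as described in the quoted text.
The helper names and the utc_offset_seconds parameter are hypothetical, and
a real routine would still have to deal with DST:

```python
import calendar

# Seconds between the classic Mac epoch (1904-01-01) and the Unix epoch
# (1970-01-01), derived with calendar.timegm() rather than hard-coded.
MAC_EPOCH_OFFSET = -calendar.timegm((1904, 1, 1, 0, 0, 0, 0, 1, 0))


def mac_to_unix(mac_seconds, utc_offset_seconds=0):
    """Convert local seconds since 1904 to UTC seconds since 1970.

    utc_offset_seconds is local time minus UTC (hypothetical parameter).
    """
    return mac_seconds - MAC_EPOCH_OFFSET - utc_offset_seconds


def unix_to_mac(unix_seconds, utc_offset_seconds=0):
    """Reverse conversion, for values going from Python back to C."""
    return unix_seconds + MAC_EPOCH_OFFSET + utc_offset_seconds
```

The offset comes from the same routine that assumes the 1970 epoch, so the
two conventions cannot drift apart.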

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Sun Sep  8 05:02:24 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 8 Sep 2002 00:02:24 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <oqr8g5v2hf.fsf@titan.progiciels-bpi.ca>
References: <Pine.SOL.4.44.0209011545490.23213-100000@death.OCF.Berkeley.EDU> <15730.52469.604124.730029@localhost.localdomain> <200209021401.g82E1k030628@pcp02138704pcs.reston01.va.comcast.net> <oqd6rwh179.fsf@titan.progiciels-bpi.ca> <20020907214409.GA1939@panix.com> <oqr8g5v2hf.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020908040224.GA27302@panix.com>

On Sat, Sep 07, 2002, François Pinard wrote:
> [Aahz]
>> 
>> Ewww.  I hate "mailto:" because it interferes with cut'n'paste.
> 
> Do I read correctly that you cannot cut and paste a string preceded by
> `mailto:'?  Is that what you meant?  What is this interference you mention?

xterm does a nifty job usually of figuring out what to highlight when I
double-click on a word.  It fails with mailto: because normally when I
cut'n'paste an address, I *don't* want to include the "mailto:" portion.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From skip@manatee.mojam.com  Sun Sep  8 13:00:23 2002
From: skip@manatee.mojam.com (Skip Montanaro)
Date: Sun, 8 Sep 2002 07:00:23 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200209081200.g88C0N5Z008526@manatee.mojam.com>

Bug/Patch Summary
-----------------

278 open / 2830 total bugs (-6)
115 open / 1686 total patches (-4)

New Bugs
--------

setting file buffer size is unreliable (2002-09-02)
	http://python.org/sf/603724
spurious SyntaxWarning (2002-09-03)
	http://python.org/sf/604036
time.struct_time undocumented (2002-09-03)
	http://python.org/sf/604128
long list in Pythonwin -> weird text (2002-09-03)
	http://python.org/sf/604387
faster [None]*n or []*n (2002-09-04)
	http://python.org/sf/604716
pre bug (2002-09-04)
	http://python.org/sf/604803
python-mode.el replaces function on f1 (2002-09-06)
	http://python.org/sf/605818
python-mode kills arrow in gdb (gud.el) (2002-09-08)
	http://python.org/sf/606250
elisp: doesn't recognize comment-syntax (2002-09-08)
	http://python.org/sf/606251
py-electric-colon & delete-selection-mod (2002-09-08)
	http://python.org/sf/606254

New Patches
-----------

ccompiler argument checking too strict (2002-09-02)
	http://python.org/sf/603831
release GIL around getaddrinfo() (2002-09-03)
	http://python.org/sf/604210
For Bug [ 490168 ] shutil.copy(path, pat (2002-09-04)
	http://python.org/sf/604600
nntplib: group descriptions and RFC2980 (2002-09-05)
	http://python.org/sf/605370
Tweaks to calls to AH/Help (2002-09-07)
	http://python.org/sf/606067
fast dictionary lookup by name (2002-09-07)
	http://python.org/sf/606098
Mac OS X keydefs (2002-09-07)
	http://python.org/sf/606132
install_IDLE target in Mac/OSX/Makefile (2002-09-07)
	http://python.org/sf/606134

Closed Bugs
-----------

Unicode in sys.path not supported (2001-10-30)
	http://python.org/sf/476326
PDB single steps list comprehensions (2002-02-28)
	http://python.org/sf/523995
surprise overriding __radd__ in subclass of complex (2002-03-18)
	http://python.org/sf/531355
import user doesn't work with CGIs (2002-05-14)
	http://python.org/sf/555779
whatsnew explains noargs incorrectly (2002-06-11)
	http://python.org/sf/567607
Invalid mmap crashes Python interpreter (2002-07-24)
	http://python.org/sf/585792
spawn*() doesn't handle errors well (2002-08-20)
	http://python.org/sf/597795
The KeyError message doesn't use repr on the key value reported (2002-08-21)
	http://python.org/sf/598451
Method resolution order in Py 2.2 - 2.3 (2002-08-23)
	http://python.org/sf/599452
bug in new execvpe (2002-08-27)
	http://python.org/sf/601077
xmlrpclib ignores CDATA (2002-08-28)
	http://python.org/sf/601534
some int results that should be bool (2002-08-29)
	http://python.org/sf/601775
smtplib mishandles empty sender (2002-08-29)
	http://python.org/sf/602029
configure finds c++ w/o --with-cxx (2002-08-29)
	http://python.org/sf/602102

Closed Patches
--------------

unicode encoding error callbacks (2001-06-12)
	http://python.org/sf/432401
Pure Python strptime() (PEP 42) (2001-10-23)
	http://python.org/sf/474274
mimetypes: all extensions for a type (2002-05-09)
	http://python.org/sf/554192
socketmodule.[ch] downgrade (2002-08-09)
	http://python.org/sf/593069
email: RFC 2231 parameters encoding (2002-08-26)
	http://python.org/sf/600096
IDLE [Open module]: import submodules (2002-08-26)
	http://python.org/sf/600152
Robustness tweak to httplib.py (2002-08-26)
	http://python.org/sf/600488
obmalloc,structmodule: 64bit, big endian (2002-08-28)
	http://python.org/sf/601369
expose PYTHON_API_VERSION via sys (2002-08-28)
	http://python.org/sf/601456
replace_header method for Message class (2002-08-29)
	http://python.org/sf/601959
sys.path in user.py (2002-08-29)
	http://python.org/sf/602005
single shared ticker (2002-08-29)
	http://python.org/sf/602191


From Jack.Jansen@oratrix.com  Sun Sep  8 22:51:59 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Sun, 8 Sep 2002 23:51:59 +0200
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <200209072324.g87NOsG15613@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <33811F04-C375-11D6-9BF8-003065517236@oratrix.com>

On Sunday, September 8, 2002, at 01:24, Guido van Rossum wrote:
> Would it make sense if the portable Python APIs translated everything
> to an epoch of 1970 and UTC?  That's what the Windows C library does.
> Very helpful.  (Or is this a problem that's going to disappear with
> MacOS X?  I presume it uses UTC and I hope its epoch is 1970?)

On MacOSX (if you use unix-based Python, not if you use old 
MacPython) the problem is gone. At least, if you ignore the 
timestamps returned by mac-specific filesystem routines, but I 
think we can do that safely.

Changing the APIs to return unix-style timestamps is what the 
GUSI unix-compatible socket and I/O library used by MacPython 
did originally, but I had to rip it out. The problem was that 
GUSI did provide all the unix system calls, but not the other 
library routines that handled timestamps. So these were provided 
by the Metrowerks C library, which assumes localtime. So ctime() 
and gmtime() and all its friends did the wrong thing, and I 
didn't cherish the idea of finding replacements for them.

If your suggestion is that every timestamp goes through a 
conversion routine before being passed from C to Python and 
through a reverse conversion when it goes from Python to C: yes, 
that would definitely make sense.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -



From pinard@iro.umontreal.ca  Mon Sep  9 14:59:10 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Mon, 09 Sep 2002 09:59:10 -0400
Subject: [Python-Dev] Codecs lookup order
Message-ID: <oqadmrgt75.fsf@titan.progiciels-bpi.ca>

Hi, people.

Happily playing with codecs (using Python 2.2.1), I found out that one should
be careful about _not_ naming a module after the encoding name, when closely
following the documentation in the Library Reference manual.

Here is what I guess is happening.  `codecs.register()' appends the search
function from the new codec module at the end of the existing search functions.
`codecs.lookup()' tries the search functions in the same order in which they
were declared.  Consequently, `encodings.lookup()' is tried first.

If the encoding does not exist in the cache, `encodings.lookup()' tries to
import a module by the name of the encoding, slightly transformed, and will
indeed import the new user codec module, because that module has the name of
the encoding, and is on the module search path.

But now, `encodings.lookup()' expects a `getregentry' function in that module,
does not find it, and raises a CodecRegistryError, not leaving a chance to
subsequent codec search functions to be used.  On the user side, merely
renaming the user module holding the new codec solves the problem.
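
The lookup order described above can be sketched with a search function that
records which names it is asked about.  This is written against the current
`codecs' API (search functions returning `codecs.CodecInfo'), not the 2.2.1
tuple protocol, and the codec name `x_demo_codec' is made up for illustration:

```python
import codecs

asked = []  # names our search function was consulted for


def demo_search(name):
    """Hypothetical search function: knows one alias for UTF-8."""
    asked.append(name)
    if name != "x_demo_codec":
        return None  # returning None lets later search functions try
    utf8 = codecs.lookup("utf-8")
    return codecs.CodecInfo(
        name="x_demo_codec",
        encode=utf8.encode,
        decode=utf8.decode,
    )


codecs.register(demo_search)

# The bundled encodings search function runs first and resolves "latin-1"
# by itself, so demo_search is only consulted for names it cannot find.
codecs.lookup("latin-1")
encoded = "abc".encode("x_demo_codec")
```

Because the bundled search function answers first, a user function
registered this way cannot override an encoding that ships with Python.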

I'm not sure what should best be done.  The documentation might be modified to
explain the limitation, so other users do not trip up on it.
`encodings.lookup()' might merely return None in case `getregentry' is not
defined in the imported module, or else, it could make sure that it imports
modules exclusively from within the `encodings' package.

The best and simplest might be to look up the codec search functions in reverse
order of their registration.  `encodings.lookup()' would be called last instead
of first.  It would be easier for the user to override an encoding bundled
with the Python distribution, if there is a need to do so.  Because the Python
Library Reference does not specify yet in which order codec search functions
are tried, the order is not frozen yet and it might be easier to change it.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From rledwith@cas.org  Mon Sep  9 17:34:18 2002
From: rledwith@cas.org (ledwith@cas.org)
Date: Mon, 9 Sep 2002 12:34:18 -0400 (EDT)
Subject: [Python-Dev] 64-bit process optimization 1
Message-ID: <20020909123418.AAB25999@cas.org>

Hello,

    This is my first post to Python-Dev.  As requested by the list manager
    I am supplying a brief personal introduction before getting to the topic of
    this message:

	I am a Senior Research Scientist at CAS (http://www.cas.org),
	a branch of the American Chemical Society (http://www.acs.org).
	I have used Python as my programming language of choice for the last
	four years.  I typically work with large collections of text documents
	performing analyses of text, computer indexing of text, and information
	retrieval.  I use Python as (1) a general purpose programming
	language, and (2) a high-level programming language to invoke
	high-performance C and C++ modules (including Numeric).
	If I examine my programs by data structures, I would find that they
	contain mostly:

	    1.  Very large dictionaries using tuples and strings as keys.
		Guido's essay on
		Implementing Graphs (http://www.python.org/doc/essays/graphs.html)
		was the inspiration for my using dictionaries to create very large
		directed acyclic graphs.

	    2.  Specialized C++ objects to represent inverted lists.

	    3.  Numeric objects for representing vectors and tables of floating point values.

	My primary computing platforms are four dedicated Sun servers,
	containing 30 processors, 88GB of RAM and 2TB of DASD.  Most of the
	programs I write require between 1 hour and 27 days to complete.
	(Obviously, I am an atypical Python user!)
	During the last three months, I have been forced to migrate from
	32-bit python processes to 64-bit processes due to the large number
	of data points I am analyzing within a single program run.
	It is my experiences while migrating from 32-bit to 64-bit code
	that prompted this message.

    It is with some trepidation that as the subject of my first posting
    I am suggesting that Python 2.3 should use a different layout of all Python objects
    than is defined in Python 2.2.1.  Specifically, I have found that changing
    lines 63-74 of Include/object.h from:

#ifdef Py_TRACE_REFS
#define PyObject_HEAD \
	struct _object *_ob_next, *_ob_prev; \
	int ob_refcnt; \
	struct _typeobject *ob_type;
#define PyObject_HEAD_INIT(type) 0, 0, 1, type,
#else /* !Py_TRACE_REFS */
#define PyObject_HEAD \
	int ob_refcnt; \
	struct _typeobject *ob_type;
#define PyObject_HEAD_INIT(type) 1, type,
#endif /* !Py_TRACE_REFS */

    to:

#ifdef Py_TRACE_REFS
#define PyObject_HEAD \
	struct _object *_ob_next, *_ob_prev; \
	struct _typeobject *ob_type; \
	int ob_refcnt;
#define PyObject_HEAD_INIT(type) 0, 0, type, 1,
#else /* !Py_TRACE_REFS */
#define PyObject_HEAD \
	struct _typeobject *ob_type; \
	int ob_refcnt;
#define PyObject_HEAD_INIT(type) type, 1,
#endif /* !Py_TRACE_REFS */

    significantly improved the performance of my 64-bit processes.

    Basically, I have just changed the order of the items in PyObject and
    PyVarObject to avoid gaps due to an "int" being a 4-byte long and aligned
    type, while "long" and pointers are 8-byte long and aligned types (on
    64-bit platforms that conform to the LP64 guideline).  For the ILP32
    guideline, such as Intel x86 and AMD CPUs, this should have no effect.  On
    the Sun platform on which I live, the changes work for both ILP32 and LP64.
    For the very large programs I run, the modification saved me 40% execution
    time.  This was probably due to the increased number of Python objects that
    would fit into the L2 cache, so I don't believe that others would
    necessarily see as large a difference with this coding change.

    Please consider this change for inclusion in the upcoming Python release.

					    - Bob


From aahz@pythoncraft.com  Mon Sep  9 18:03:02 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 9 Sep 2002 13:03:02 -0400
Subject: [Python-Dev] 64-bit process optimization 1
In-Reply-To: <20020909123418.AAB25999@cas.org>
References: <20020909123418.AAB25999@cas.org>
Message-ID: <20020909170301.GA8457@panix.com>

Without commenting on the merits of your proposal, I can tell you that
it'll get lost unless you file a bug report on SourceForge.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From guido@python.org  Mon Sep  9 18:55:21 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 09 Sep 2002 13:55:21 -0400
Subject: [Python-Dev] 64-bit process optimization 1
In-Reply-To: Your message of "Mon, 09 Sep 2002 12:34:18 EDT."
 <20020909123418.AAB25999@cas.org>
References: <20020909123418.AAB25999@cas.org>
Message-ID: <200209091755.g89HtLV30441@pcp02138704pcs.reston01.va.comcast.net>

>     I am suggesting that Python 2.3 should use a different layout of
>     all Python objects than is defined in Python 2.2.1.
>     Specifically, I have found that changing lines 63-74 of
>     Include/object.h from:
> 
> #ifdef Py_TRACE_REFS
> #define PyObject_HEAD \
> 	struct _object *_ob_next, *_ob_prev; \
> 	int ob_refcnt; \
> 	struct _typeobject *ob_type;
> #define PyObject_HEAD_INIT(type) 0, 0, 1, type,
> #else /* !Py_TRACE_REFS */
> #define PyObject_HEAD \
> 	int ob_refcnt; \
> 	struct _typeobject *ob_type;
> #define PyObject_HEAD_INIT(type) 1, type,
> #endif /* !Py_TRACE_REFS */
> 
>     to:
> 
> #ifdef Py_TRACE_REFS
> #define PyObject_HEAD \
> 	struct _object *_ob_next, *_ob_prev; \
> 	struct _typeobject *ob_type; \
> 	int ob_refcnt;
> #define PyObject_HEAD_INIT(type) 0, 0, type, 1,
> #else /* !Py_TRACE_REFS */
> #define PyObject_HEAD \
> 	struct _typeobject *ob_type; \
> 	int ob_refcnt;
> #define PyObject_HEAD_INIT(type) type, 1,
> #endif /* !Py_TRACE_REFS */
> 
>     significantly improved the performance of my 64-bit processes.
> 
>     Basically, I have just changed the order of the items in
>     PyObject and PyVarObject to avoid gaps due to an "int" being a
>     4-byte long and aligned type, while "long" and pointers are
>     8-byte long and aligned types (on 64-bit platforms that conform
>     to the LP64 guideline).  For the ILP32 guideline, such as Intel
>     x86 and AMD CPUs, this should have no effect.  On the Sun
>     platform on which I live, the changes work for both ILP32 and
>     LP64.  For the very large programs I run, the modification saved
>     me 40% execution time.  This was probably due to the increased
>     number of Python objects that would fit into the L2 cache, so I
>     don't believe that others would necessarily see as large a
>     difference with this coding change.

Interesting!  I can see why this makes sense.  Strings, lists and
tuples all have an int (ob_size) directly following the standard HEAD,
and after that something that requires pointer alignment, so that
these object types would all save 8 bytes!  To wit:

string
	int refcnt, ptr type, int size, long hash, ...
                   ^gap                ^gap
list
	int refcnt, ptr type, int size, ptr item*
                   ^gap                ^gap
tuple
	int refcnt, ptr type, int size, ptr item[]
                   ^gap                ^gap

By swapping the first two fields, these gaps would all disappear.  The
dict object doesn't use ob_size, but starts with an odd number of
ints, so the same reasoning shows it would also save 8 bytes.
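
The gaps can be reproduced with ctypes, which applies the platform's C
alignment rules.  This is only a sketch, assuming a list-like object reduced
to four fields; the class names are illustrative, not CPython's actual
declarations:

```python
import ctypes


class OldLayout(ctypes.Structure):
    """int refcnt first: padding follows each int on LP64."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_int),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_int),
        ("ob_item", ctypes.c_void_p),
    ]


class NewLayout(ctypes.Structure):
    """Pointer first: the two ints pack into one 8-byte slot."""
    _fields_ = [
        ("ob_type", ctypes.c_void_p),
        ("ob_refcnt", ctypes.c_int),
        ("ob_size", ctypes.c_int),
        ("ob_item", ctypes.c_void_p),
    ]


old_size = ctypes.sizeof(OldLayout)
new_size = ctypes.sizeof(NewLayout)
saved = old_size - new_size
print(old_size, new_size, saved)
```

With 8-byte pointers and 4-byte ints this should report 8 bytes saved per
object; with 4-byte pointers the two layouts come out the same size.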

I don't have access to a 64-bit platform to experiment with this.

Unfortunately, one problem is binary compatibility.  We try to make it
possible to link newer Python versions with extension modules (like
Numeric, which you use) compiled for older versions.  This requires
that the binary lay-out of objects remains the same, and swapping
ob_refcnt and ob_type would cause immediate crashes in this case.

It may be that there are other reasons why binary incompatibilities
exist between 2.2 and 2.3 that make this impractical, so perhaps I'm
being too conservative here.

Another issue is that at least theoretically, on a 64-bit platform,
there could be more than 2 billion references to a particular object.
E.g. if you have enough memory, the following allocates 3 lists each
containing a billion references to None, causing the reference count
of None to go negative:

A = []
for i in range(3):
    A.append([None]*1000000000)

So perhaps the refcnt should have been a long in the first place.  A
similar argument may hold for the length of e.g. strings and lists:
one could wish to have a list of more than 2 billion elements, or a
string containing more than 2 gigabytes (that much RAM is easily found
on the larger 64-bit servers, I believe).
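
As a rough check of the arithmetic, assuming a 32-bit two's-complement int
that wraps on overflow (C leaves signed overflow undefined, so this models
typical behaviour only, and wrap_int32 is a made-up helper):

```python
def wrap_int32(value):
    """Reduce an integer to the range of a signed 32-bit int."""
    return (value + 2**31) % 2**32 - 2**31


references = 3 * 1000000000   # three lists of a billion references each
wrapped = wrap_int32(references)
print(wrapped)  # -1294967296
```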

Opinions?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Mon Sep  9 19:16:18 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 09 Sep 2002 14:16:18 -0400
Subject: [Python-Dev] 64-bit process optimization 1
In-Reply-To: <200209091755.g89HtLV30441@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDABDAB.tim.one@comcast.net>

[Guido]
> ...
> So perhaps the refcnt should have been a long in the first place.

We agreed to that years ago, but never bothered to change it.  In fact, you
used to tell people it *was* a long until I beat that out of you <wink>.

Do note that a long is still only 4 bytes on Win64.  The type we really want
here is what pyport.h calls Py_intptr_t (a Python spelling of the
appropriate C99 type; C99 introduced ways to say what you really mean in
these cases).

> A similar argument may hold for the length of e.g. strings and lists:
> one could wish to have a list of more than 2 billion elements, or a
> string containing more than 2 gigabytes (that much RAM is easily found
> on the larger 64-bit servers, I believe).
>
> Opinions?

Those are more naturally addressed by size_t, since strlen and malloc are
constrained to that type.  I generally declare string-slinging code as using
size_t vars now, and endure the pain of casting back and forth to int to
talk with Python's idea of a string size.
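
The widths in question are easy to inspect on a given box.  A small sketch
using ctypes, which reports the sizes the platform's C compiler uses; the
dictionary is just for display:

```python
import ctypes

widths = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "void *": ctypes.sizeof(ctypes.c_void_p),
    "size_t": ctypes.sizeof(ctypes.c_size_t),
    "ssize_t": ctypes.sizeof(ctypes.c_ssize_t),
}

for name, width in widths.items():
    print(f"{name:8s} {width} bytes")
```

On an LP64 Unix the long and pointer rows should match; on Win64 the long
row should stay at 4 bytes while the pointer and size_t rows show 8.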

Whether it's worth the pain to change this stuff depends on whether we think
64-bit boxes are just another passing fad like the Internet <wink>.



From python-dev@liveevil.com  Mon Sep  9 20:42:56 2002
From: python-dev@liveevil.com (john spurling)
Date: Mon, 9 Sep 2002 12:42:56 -0700
Subject: [Python-Dev] raw headers in rfc822.Message
Message-ID: <20020909194256.GA13424@c7c8.colobox.com>

greetings,

since the raw headers don't seem to be available in an rfc822.Message,
i added a quick two line hack to populate a rawheaders member. 
attached is a patch to rfc822.py from the python 2.2.1
distribution.

if you don't like my two line hack, consider this a request to provide
the raw headers in some way in an rfc822.Message.

thanks,
john spurling

-- 
"nothing brings people together like doom."
		--sarah vowell


From python-dev@liveevil.com  Mon Sep  9 20:52:34 2002
From: python-dev@liveevil.com (john spurling)
Date: Mon, 9 Sep 2002 12:52:34 -0700
Subject: [Python-Dev] Re: raw headers in rfc822.Message
Message-ID: <20020909195234.GA18807@c7c8.colobox.com>

--OXfL5xGRrasGEqWY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

maybe it would help if i actually attached the diff...



--OXfL5xGRrasGEqWY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="rfc822.diff"

139d138
<         self.rawheaders = ''
158,159d156
<             # Add line to the raw input
<             self.rawheaders += line

--OXfL5xGRrasGEqWY--


From aahz@pythoncraft.com  Mon Sep  9 20:52:08 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 9 Sep 2002 15:52:08 -0400
Subject: [Python-Dev] raw headers in rfc822.Message
In-Reply-To: <20020909194256.GA13424@c7c8.colobox.com>
References: <20020909194256.GA13424@c7c8.colobox.com>
Message-ID: <20020909195208.GA1662@panix.com>

On Mon, Sep 09, 2002, john spurling wrote:
> 
> since the raw headers don't seem to be available in an rfc822.Message,
> i added a quick two line hack to populate a rawheaders member. 
> attached is a patch to rfc822.py from the python 2.2.1
> distribution.

File a bug report on SourceForge.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From zack@codesourcery.com  Mon Sep  9 21:09:34 2002
From: zack@codesourcery.com (Zack Weinberg)
Date: Mon, 9 Sep 2002 13:09:34 -0700
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: <20020909195234.GA18807@c7c8.colobox.com>
References: <20020909195234.GA18807@c7c8.colobox.com>
Message-ID: <20020909200934.GA17001@codesourcery.com>

On Mon, Sep 09, 2002 at 12:52:34PM -0700, john spurling wrote:
> maybe it would help if i actually attached the diff...
>
> 139d138
> <         self.rawheaders = ''
> 158,159d156
> <             # Add line to the raw input
> <             self.rawheaders += line

You've generated this patch backward, and in a format which makes it
useless to us.  Please regenerate it with diff -c or diff -u (either
is acceptable) and put the newer file _second_ on the command line:
diff -u OLD_FILE NEW_FILE.
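
For illustration, the same old-then-new ordering with Python's difflib.  The
file contents and the file names here are stand-ins, not the real rfc822.py
source:

```python
import difflib

# Stand-in file contents, not the real rfc822.py source.
old_lines = [
    "self.headers = []\n",
    "for line in lines:\n",
    "    self.headers.append(line)\n",
]
new_lines = [
    "self.headers = []\n",
    "self.rawheaders = ''\n",
    "for line in lines:\n",
    "    self.headers.append(line)\n",
    "    self.rawheaders += line\n",
]

# Old file first, new file second, as with `diff -u OLD_FILE NEW_FILE`.
patch = list(difflib.unified_diff(
    old_lines, new_lines,
    fromfile="rfc822.py.orig", tofile="rfc822.py",
))
print("".join(patch), end="")
```

Generated this way round, the added lines show up with a leading "+" rather
than as deletions.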

zw



From barry@python.org  Mon Sep  9 21:11:46 2002
From: barry@python.org (Barry A. Warsaw)
Date: Mon, 9 Sep 2002 16:11:46 -0400
Subject: [Python-Dev] raw headers in rfc822.Message
References: <20020909194256.GA13424@c7c8.colobox.com>
Message-ID: <15741.130.736249.914221@anthem.wooz.org>

>>>>> "js" == john spurling <python-dev@liveevil.com> writes:

    js> since the raw headers don't seem to be available in an
    js> rfc822.Message, i added a quick two line hack to populate a
    js> rawheaders member. attached is a patch to rfc822.py from the
    js> python 2.2.1 distribution.

    js> if you don't like my two line hack, consider this a request to
    js> provide the raw headers in some way in an rfc822.Message.

Why not just use email.Message.Message?  You can get the original
headers from it, and the email package tries really hard to produce
output identical to the input.
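
A minimal sketch of that, using email.message_from_string() on a made-up
message (the addresses and header values are placeholders):

```python
import email

raw = (
    "From: sender@example.com\n"
    "Subject: raw headers\n"
    "X-Mailer: demo\n"
    "\n"
    "body text\n"
)

msg = email.message_from_string(raw)

# Headers come back in their original order and spelling.
headers = msg.items()

# Flattening the message again reproduces the input for this simple case.
round_trip = msg.as_string()
```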

-Barry


From barry@barrys-emacs.org  Mon Sep  9 23:21:45 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Mon, 9 Sep 2002 23:21:45 +0100
Subject: [Python-Dev] Re: Python-dev summary for 2002-08-15 - 2002-09-01
In-Reply-To: <20020908040224.GA27302@panix.com>
Message-ID: <002001c2584f$480a4b10$070210ac@LAPDANCE>

> xterm does a nifty job usually of figuring out what to highlight when I
> double-click on a word.  It fails with mailto: because normally when I
> cut'n'paste an address, I *don't* want to include the "mailto:" portion.

You can configure xterm to treat : as punctuation and not a word
char. See man xterm.

BArry




From bsder@mail.allcaps.org  Mon Sep  9 23:21:49 2002
From: bsder@mail.allcaps.org (Andrew P. Lentvorski)
Date: Mon, 9 Sep 2002 15:21:49 -0700 (PDT)
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <200209061530.g86FUeq15029@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020909150038.W79275-100000@mail.allcaps.org>

On Fri, 6 Sep 2002, Guido van Rossum wrote:

> > C. Make st_mtime a floating point number. This won't offer nanosecond
> >    resolution, as C doubles are not dense enough.
>
> This is the most Pythonic approach.

-1

This then locks Python into a specific bit-description notion of a double
in order to get the appropriate number of significant digits to describe
time sufficiently.  Embedded/portable processors may not support the
notion of an IEEE double.

In addition, timers get increasingly dense as computers get faster.  Thus,
doubles may work for nanoseconds, but will not be sufficient for
picoseconds.

If the goal is a field which never has to be changed to support any amount
of time, the value should be "infinite precision".  At that point, a
Python Long used in some tuple representation of fixed-point arithmetic
springs to mind, i.e. (<long>, <bit of fractional point>)
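
A sketch of what such a pair could look like, assuming the second element is
the number of fractional bits.  The function names and the default of 32
bits are made up for illustration; the comparison shows how far a C double
is from the same timestamp:

```python
from fractions import Fraction


def to_fixed(seconds, nanoseconds, frac_bits=32):
    """Return (mantissa, frac_bits) with value = mantissa / 2**frac_bits."""
    total_ns = seconds * 10**9 + nanoseconds
    mantissa = (total_ns << frac_bits) // 10**9
    return (mantissa, frac_bits)


def fixed_value(stamp):
    """Exact rational value of a (mantissa, frac_bits) pair."""
    mantissa, frac_bits = stamp
    return Fraction(mantissa, 1 << frac_bits)


secs, nanos = 1031609309, 123456789   # an arbitrary example timestamp
exact = Fraction(secs * 10**9 + nanos, 10**9)

fixed_error = abs(fixed_value(to_fixed(secs, nanos)) - exact)
float_error = abs(Fraction(secs + nanos / 1e9) - exact)
```

At this magnitude an IEEE double is spaced 2**-23 seconds apart (roughly 120
nanoseconds), whereas the pair can be made as fine as needed by raising
frac_bits.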

-a



From martin@v.loewis.de  Mon Sep  9 23:26:55 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Sep 2002 00:26:55 +0200
Subject: [Python-Dev] Codecs lookup order
In-Reply-To: <oqadmrgt75.fsf@titan.progiciels-bpi.ca>
References: <oqadmrgt75.fsf@titan.progiciels-bpi.ca>
Message-ID: <m37khupzo0.fsf@mira.informatik.hu-berlin.de>

pinard@iro.umontreal.ca (François Pinard) writes:

> I'm not sure what should best be done.  The documentation might be
> modified to explain the limitation, so other users do not trip up on
> it.  `encodings.lookup()' might merely return None in case
> `getregentry' is not defined in the imported module, or else, it
> could make sure that it imports modules exclusively from within the
> `encodings' package.

This is what Python 2.3, and Python 2.2.2 will do.

Regards,
Martin


From martin@v.loewis.de  Mon Sep  9 23:33:20 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Sep 2002 00:33:20 +0200
Subject: [Python-Dev] 64-bit process optimization 1
In-Reply-To: <200209091755.g89HtLV30441@pcp02138704pcs.reston01.va.comcast.net>
References: <20020909123418.AAB25999@cas.org>
 <200209091755.g89HtLV30441@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m33csipzdb.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> So perhaps the refcnt should have been a long in the first place.  A
> similar argument may hold for the length of e.g. strings and lists:
> one could wish to have a list of more than 2 billion elements, or a
> string containing more than 2 gigabytes (that much RAM is easily found
> on the larger 64-bit servers, I believe).
> 
> Opinions?

I agree with that position, and Tim's, that those fields should widen
to 64 bits on a 64-bit system. I disagree that size_t is suitable for
ob_size, since some types put negative values into ob_size. The signed
version of that, ssize_t, is not universally available, so we'd need
to add Py_ssize_t.
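
A small sketch of why an unsigned type does not fit a field that
sometimes holds a negative value.  It uses the ctypes module, which
postdates this thread, purely as a convenient way to look at C integer
types from Python; the variable names are mine.

```python
import ctypes

# Width of size_t on this platform, in bits (32 or 64 in practice).
bits = 8 * ctypes.sizeof(ctypes.c_size_t)

# A signed, pointer-sized integer keeps the negative value ...
as_signed = ctypes.c_ssize_t(-3).value

# ... while the unsigned type of the same width wraps around.
as_unsigned = ctypes.c_size_t(-3).value
wrapped = (as_unsigned == 2**bits - 3)
```

A size check such as "ob_size < 0" can never be true if the field is
stored in the unsigned type.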

Regards,
Martin



From aahz@pythoncraft.com  Tue Sep 10 00:07:13 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 9 Sep 2002 19:07:13 -0400
Subject: [Python-Dev] Cut'n'paste
In-Reply-To: <002001c2584f$480a4b10$070210ac@LAPDANCE>
References: <20020908040224.GA27302@panix.com> <002001c2584f$480a4b10$070210ac@LAPDANCE>
Message-ID: <20020909230713.GA5338@panix.com>

On Mon, Sep 09, 2002, Barry Scott wrote:
>Aahz:
>>
>> xterm does a nifty job usually of figuring out what to highlight when I
>> double-click on a word.  It fails with mailto: because normally when I
>> cut'n'paste an address, I *don't* want to include the "mailto:" portion.
> 
> You can configure xterm to treat : as punctuation and not a word
> char. See man xterm.

Then it would fail with regular URLs.  You can't win.  ;-)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From guido@python.org  Tue Sep 10 00:06:30 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 09 Sep 2002 19:06:30 -0400
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: Your message of "Mon, 09 Sep 2002 15:21:49 PDT."
 <20020909150038.W79275-100000@mail.allcaps.org>
References: <20020909150038.W79275-100000@mail.allcaps.org>
Message-ID: <200209092306.g89N6V806944@pcp02138704pcs.reston01.va.comcast.net>

> > > C. Make st_mtime a floating point number. This won't offer nanosecond
> > >    resolution, as C doubles are not dense enough.
> >
> > This is the most Pythonic approach.
> 
> -1
> 
> This then locks Python into a specific bit-description notion of a double
> in order to get the appropriate number of significant digits to describe
> time sufficiently.  Embedded/portable processors may not support the
> notion of an IEEE double.
> 
> In addition, timers get increasingly dense as computers get faster.  Thus,
> doubles may work for nanoseconds, but will not be sufficient for
> picoseconds.
> 
> If the goal is a field which never has to be changed to support any amount
> of time, the value should be "infinite precision".  At that point, a
> Python Long used in some tuple representation of fixed-point arithmetic
> springs to mind.  ie. (<long>, <bit of fractional point>)

I'm sorry, but I really don't see the point of wanting to record file
mtimes all the way up to nanosecond precision.  What would it mean?
Most clocks are off by a few seconds at least anyway.

Python has represented time as Python floats (implemented as C
doubles) all its life long and it has served us well.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Tue Sep 10 00:34:12 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Sep 2002 01:34:12 +0200
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <20020909150038.W79275-100000@mail.allcaps.org>
References: <20020909150038.W79275-100000@mail.allcaps.org>
Message-ID: <m3admqohzf.fsf@mira.informatik.hu-berlin.de>

"Andrew P. Lentvorski" <bsder@mail.allcaps.org> writes:

> This then locks Python into a specific bit-description notion of a double
> in order to get the appropriate number of significant digits to describe
> time sufficiently.  Embedded/portable processors may not support the
> notion of an IEEE double.

That's not true. Suppose you have two fields, tv_sec and tv_nsec. Then
the resulting float expression is 

   tv_sec + 1e-9 * tv_nsec;

This expression works on all systems that support floating point
numbers - be it IEEE or not.
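
The same expression as a Python sketch.  The function name
timespec_to_float is made up for illustration; tv_sec and tv_nsec are
plain integers here, standing in for the two C fields.

```python
def timespec_to_float(tv_sec, tv_nsec):
    """Combine whole seconds and nanoseconds into one float."""
    return tv_sec + 1e-9 * tv_nsec


# Half a second past an arbitrary timestamp from September 2002.
stamp = timespec_to_float(1031614452, 500000000)
```

Nothing in it depends on the bit layout of the float; it only needs
multiplication and addition.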

> In addition, timers get increasingly dense as computers get faster.
> Thus, doubles may work for nanoseconds, but will not be sufficient
> for picoseconds.

At the same time, floating point numbers get increasingly more
accurate as computer registers widen. In a 64-bit float, you can just
barely express 1e-7s (if you base the era at 1970); with a 128-bit
float, you can express 1e-20s easily.
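
The "just barely 1e-7s" figure can be checked with a short sketch.  It
assumes Python floats are IEEE 754 doubles (53-bit significand), which
is an assumption about the platform, and the helper name spacing is
mine.

```python
import math


def spacing(x):
    """Gap between x and the next larger double (normal, positive x)."""
    _, exponent = math.frexp(x)      # x == m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 53)    # 53 significand bits in a double


t = 1031614452.0                     # seconds since 1970, September 2002
gap = spacing(t)                     # about 1.2e-7 seconds

lost = (t + 1e-9 == t)               # a single nanosecond is below the gap
kept = (t + gap != t)                # one full gap is representable
```

So a double holding seconds since 1970 resolves a bit better than a
microsecond today, but not a nanosecond.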

> If the goal is a field which never has to be changed to support any amount
> of time, the value should be "infinite precision".  

No, just using floating point numbers is sufficient. Notice that
time.time() also returns a floating point number.

> At that point, a Python Long used in some tuple representation of
> fixed-point arithmetic springs to mind.  ie. (<long>, <bit of
> fractional point>)

Yes, when/if Python gets rational numbers, or decimal
fixed-or-floating point numbers, those data types might represent the
value that the system reports more accurately. At that time, there
will be a transition plan to introduce those numbers at all places
where it is reasonable, with as little impact on applications as
possible.

Regards,
Martin


From brian@sweetapp.com  Tue Sep 10 00:55:15 2002
From: brian@sweetapp.com (Brian Quinlan)
Date: Mon, 09 Sep 2002 16:55:15 -0700
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <m3admqohzf.fsf@mira.informatik.hu-berlin.de>
Message-ID: <01b501c2585c$584b4a80$df7e4e18@brianspiv1700>

MvL wrote:

> That's not true. Suppose you have two fields, tv_sec and tv_nsec. Then
> the resulting float expression is
> 
>    tv_sec + 1e-9 * tv_nsec;
> 
> This expression works on all systems that support floating point
> numbers - be it IEEE or not.

Don't you have to truncate tv_sec for that to work? i.e.

	Truncate(tv_sec, 9) + 1e-9 * tv_nsec

Cheers,
Brian



From drifty@bigfoot.com  Tue Sep 10 01:25:56 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Mon, 9 Sep 2002 17:25:56 -0700 (PDT)
Subject: [Python-Dev] Cut'n'paste
In-Reply-To: <20020909230713.GA5338@panix.com>
Message-ID: <Pine.SOL.4.44.0209091723060.19999-100000@death.OCF.Berkeley.EDU>

[Aahz]

> > You can configure xterm to treat : as punctuation and not a word
> > char. See man xterm.
>
> Then it would fail with regular URLs.  You can't win.  ;-)

I am now officially ignoring any more comments on how to format URLs and
email addresses in the summary.  Aahz is right, "You can't win" and thus I
am not going to bother to try to please everyone.  I will just do it the
way I feel like it and if someone doesn't like it they can just reformat
the code with a regex to make themselves happy.

Now I know how Guido must have felt with everyone and their mother
throwing in their opinion about booleans.  =)

-Brett



From tim.one@comcast.net  Tue Sep 10 02:21:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 09 Sep 2002 21:21:11 -0400
Subject: [Python-Dev] Cut'n'paste
In-Reply-To: <Pine.SOL.4.44.0209091723060.19999-100000@death.OCF.Berkeley.EDU>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEJBDAB.tim.one@comcast.net>

[Brett Cannon]
> I am now officially ignoring any more comments on how to format URLs
> and email addresses in the summary.

What?!  I didn't get around to insisting that you use XML for this, with one
UTF8-encoded character per element thingie.

> Aahz is right, "You can't win" and thus I am not going to bother to try
> to please everyone.  I will just do it the way I feel like it and if
> someone doesn't like it they can just reformat the code with a regex
> to make themselves happy.

Such a small-minded attitude, Brett.  OTOH, it may preserve a bit of your
life for something enjoyable.

> Now I know how Guido must have felt with everyone and their mother
> throwing in their opinion about booleans.  =)

Not until you're accused of destroying all that's good about Python, going
out of your way to make it impossible to teach programming, and most likely
breaking every important Python program ever written.  It will take several
years for you to earn that level of abuse <wink>.

no-good-deed-goes-unpunished-ly y'rs   - tim



From python@rcn.com  Tue Sep 10 04:07:11 2002
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 9 Sep 2002 23:07:11 -0400
Subject: [Python-Dev] Cut'n'paste
References: <Pine.SOL.4.44.0209091723060.19999-100000@death.OCF.Berkeley.EDU>
Message-ID: <001501c25877$29c19460$a661accf@othello>

From: "Brett Cannon" <bac@OCF.Berkeley.EDU>

> I am now officially ignoring any more comments on how to format URLs and
> email addresses in the summary.  Aahz is right, "You can't win" and thus I
> am not going to bother to try to please everyone.  I will just do it the
> way I feel like it and if someone doesn't like it they can just reformat
> the code with a regex to make themselves happy.
> 
> Now I know how Guido must have felt with everyone and their mother
> throwing in their opinion about booleans.  =)

BTW, my mother would have wanted spaces as delimiters.


Raymond Hettinger



From martin@v.loewis.de  Tue Sep 10 07:30:02 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Sep 2002 08:30:02 +0200
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <01b501c2585c$584b4a80$df7e4e18@brianspiv1700>
References: <01b501c2585c$584b4a80$df7e4e18@brianspiv1700>
Message-ID: <m3k7lumk5x.fsf@mira.informatik.hu-berlin.de>

Brian Quinlan <brian@sweetapp.com> writes:

> >    tv_sec + 1e-9 * tv_nsec;
> > 
[...]
> Don't you have to truncate tv_sec for that to work? i.e.
> 
> 	Truncate(tv_sec, 9) + 1e-9 * tv_nsec

What is Truncate, and why would I need it?

Regards,
Martin


From brian@sweetapp.com  Tue Sep 10 07:50:09 2002
From: brian@sweetapp.com (Brian Quinlan)
Date: Mon, 09 Sep 2002 23:50:09 -0700
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <m3k7lumk5x.fsf@mira.informatik.hu-berlin.de>
Message-ID: <01cd01c25896$4e301d70$df7e4e18@brianspiv1700>

> What is Truncate, and why would I need it?

You wouldn't need it because I misunderstood the problem. Sorry.

Cheers,
Brian



From fredrik@pythonware.com  Tue Sep 10 09:30:37 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 10 Sep 2002 10:30:37 +0200
Subject: [Python-Dev] 64-bit process optimization 1
References: <20020909123418.AAB25999@cas.org>  <200209091755.g89HtLV30441@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <001501c258a4$6f889ca0$ced241d5@hagrid>

guido wrote:

> Unfortunately, one problem is binary compatibility.  We try to make it
> possible to link newer Python versions with extension modules (like
> Numeric, which you use) compiled for older versions.  This requires
> that the binary lay-out of objects remains the same, and swapping
> ob_refcnt and ob_type would cause immediate crashes in this case.

a compromise could be to make the swap in 2.3, but only
on 64-bit platforms.

it's obvious that most people are stuck on 32-bit platforms
today, and I think it's safe to say that users on 64-bit
platforms might be a bit more willing to build everything
they need on their local platform.

another alternative would be to make it a configuration option,
with a platform-dependent default.

</F>



From Anthony Baxter <anthony@interlink.com.au>  Tue Sep 10 11:06:31 2002
From: Anthony Baxter <anthony@interlink.com.au> (Anthony Baxter)
Date: Tue, 10 Sep 2002 20:06:31 +1000
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <200209092306.g89N6V806944@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209101006.g8AA6Vb28742@localhost.localdomain>

>>> Guido van Rossum wrote
> I'm sorry, but I really don't see the point of wanting to record file
> mtimes all the way up to nanosecond precision.  What would it mean?
> Most clocks are off by a few seconds at least anyway.

Not only that, but if you're that precise, are you measuring the time
when the modification started, the time when it started hitting the
disks, when the write on the disk completed, when the O/S signalled
to the application that the modification was complete... questions 
questions.. .:)





From guido@python.org  Tue Sep 10 14:54:58 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 10 Sep 2002 09:54:58 -0400
Subject: [Python-Dev] 64-bit process optimization 1
In-Reply-To: Your message of "Tue, 10 Sep 2002 10:30:37 +0200."
 <001501c258a4$6f889ca0$ced241d5@hagrid>
References: <20020909123418.AAB25999@cas.org> <200209091755.g89HtLV30441@pcp02138704pcs.reston01.va.comcast.net>
 <001501c258a4$6f889ca0$ced241d5@hagrid>
Message-ID: <200209101354.g8ADswV23058@odiug.zope.com>

> > Unfortunately, one problem is binary compatibility.  We try to make it
> > possible to link newer Python versions with extension modules (like
> > Numeric, which you use) compiled for older versions.  This requires
> > that the binary lay-out of objects remains the same, and swapping
> > ob_refcnt and ob_type would cause immediate crashes in this case.
> 
> a compromise could be to make the swap in 2.3, but only
> on 64-bit platforms.
> 
> it's obvious that most people are stuck on 32-bit platforms
> today, and I think it's safe to say that users on 64-bit plat-
> forms might be a bit more willing to build everything they
> need on their local platform.
> 
> another alternative would be to make it a configuration option,
> with a platform-dependent default.

I like all of that.  Maybe it should also be a config option whether
refcount, sizes etc. should be 32 or 64 bit quantities on 64 bit
platforms.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mcherm@destiny.com  Tue Sep 10 15:18:38 2002
From: mcherm@destiny.com (Michael Chermside)
Date: Tue, 10 Sep 2002 10:18:38 -0400
Subject: [Python-Dev] Re: raw headers in rfc822.Message
Message-ID: <3D7DFF3E.3030200@destiny.com>

Zack Weinberg writes:
> You've generated this patch backward, and in a format which makes it
> useless to us.  Please regenerate it with diff -c or diff -u (either
> is acceptable) and put the newer file _second_ on the command line:
> diff -u OLD_FILE NEW_FILE.

It wasn't all that long ago that I submitted my first patch (of 
documentation, not code) to SourceForge. It took me > 20 minutes of 
careful web searching to figure out the desired way of submitting files 
and the correct way to generate that. And I still wasn't 100% sure I was 
generating the diff in the correct direction.
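
For what it's worth, the direction is easy to check on a toy example.
The sketch below uses difflib.unified_diff from the standard library
instead of the diff program; the two lists and the labels OLD_FILE and
NEW_FILE are placeholders, not real files.

```python
import difflib

old_lines = ["spam\n", "eggs\n"]
new_lines = ["spam\n", "ham\n"]

# Old version first, new version second, just like
# "diff -u OLD_FILE NEW_FILE".
patch = list(difflib.unified_diff(old_lines, new_lines,
                                  fromfile="OLD_FILE",
                                  tofile="NEW_FILE"))

removed = [line for line in patch if line.startswith("-eggs")]
added = [line for line in patch if line.startswith("+ham")]
```

In a forward diff, the lines being taken out carry a "-" and the lines
being put in carry a "+".  If that looks inverted, the arguments were
swapped.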

Couldn't Zack's comment be added to the directions found at 
https://sourceforge.net/tracker/?func=add&group_id=5470&atid=305470
so that anyone submitting a patch would see how to do it?

(But of course that wouldn't have helped THIS person, who didn't use 
sourceforge... :-(  )

-- Michael Chermside




From guido@python.org  Tue Sep 10 15:26:48 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 10 Sep 2002 10:26:48 -0400
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: Your message of "Tue, 10 Sep 2002 10:18:38 EDT."
 <3D7DFF3E.3030200@destiny.com>
References: <3D7DFF3E.3030200@destiny.com>
Message-ID: <200209101426.g8AEQm023271@odiug.zope.com>

> Zack Weinberg writes:
> > You've generated this patch backward, and in a format which makes it
> > useless to us.  Please regenerate it with diff -c or diff -u (either
> > is acceptable) and put the newer file _second_ on the command line:
> > diff -u OLD_FILE NEW_FILE.

[Michael Chermside]
> It wasn't all that long ago that I submitted my first patch (of 
> documentation, not code) to SourceForge. It took me > 20 minutes of 
> careful web searching to figure out the desired way of submitting files 
> and the correct way to generate that. And I still wasn't 100% sure I was 
> generating the diff in the correct direction.
> 
> Couldn't Zack's comment be added to the directions found at 
> https://sourceforge.net/tracker/?func=add&group_id=5470&atid=305470
> so that anyone submitting a patch would see how to do it.

I guess we're assuming that even people who aren't familiar with
SourceForge are familiar with diff.  Is that not a reasonable
assumption any more?

There's also the developer FAQ, which has careful instructions for
patch generation at

  http://www.python.org/dev/devfaq.html#patches

and in addition points to http://www.python.org/patches/ which has
everything you need (except the hint about forward diffs; I'll add
that).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Jack.Jansen@cwi.nl  Tue Sep 10 15:39:37 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Tue, 10 Sep 2002 16:39:37 +0200
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: <200209101426.g8AEQm023271@odiug.zope.com>
Message-ID: <2167CDEB-C4CB-11D6-911E-0030655234CE@cwi.nl>

On Tuesday, September 10, 2002, at 04:26 , Guido van Rossum wrote:
>> Couldn't Zack's comment be added to the directions found at
>> https://sourceforge.net/tracker/?func=add&group_id=5470&atid=305470
>> so that anyone submitting a patch would see how to do it.
>
> I guess we're assuming that even people who aren't familiar with
> SourceForge are familiar with diff.  Is that not a reasonable
> assumption any more?

Not cross-platform. I've had patches for MacPython in rather outlandish
diff-like formats, so a note that tells people to use the unix diff
program wouldn't hurt.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -



From guido@python.org  Tue Sep 10 15:41:40 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 10 Sep 2002 10:41:40 -0400
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: Your message of "Tue, 10 Sep 2002 16:39:37 +0200."
 <2167CDEB-C4CB-11D6-911E-0030655234CE@cwi.nl>
References: <2167CDEB-C4CB-11D6-911E-0030655234CE@cwi.nl>
Message-ID: <200209101441.g8AEfeW23387@odiug.zope.com>

> > I guess we're assuming that even people who aren't familiar with
> > SourceForge are familiar with diff.  Is that not a reasonable
> > assumption any more?
> 
> Not cross-platform. I've had patches for MacPython in rather
> outlandish diff-like formats, so a note that tells people to use the
> unix diff program wouldn't hurt.

But what good does a reference to "the unix diff program" do a Mac
developer?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Tue Sep 10 15:48:19 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 10 Sep 2002 10:48:19 -0400
Subject: [Python-Dev] Writing patches
In-Reply-To: <200209101426.g8AEQm023271@odiug.zope.com>
References: <3D7DFF3E.3030200@destiny.com> <200209101426.g8AEQm023271@odiug.zope.com>
Message-ID: <20020910144818.GA13037@panix.com>

On Tue, Sep 10, 2002, Guido van Rossum wrote:
>
> There's also the developer FAQ, which has careful instructions for
> patch generation at
> 
>   http://www.python.org/dev/devfaq.html#patches
> 
> and in addition points to http://www.python.org/patches/ which has
> everything you need (except the hint about forward diffs; I'll add
> that).

Perhaps the "patches" link at http://www.python.org/ should point at
either DevFAQ#patches or the patches page.  (That was my original
intention in not linking directly to SF -- you're the one who added the
direct links.)

The question IMO is whether those links are for the benefit of core
developers or newbies.  I'm +1 on the latter.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From skip@pobox.com  Tue Sep 10 15:52:53 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 10 Sep 2002 09:52:53 -0500
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: <3D7DFF3E.3030200@destiny.com>
References: <3D7DFF3E.3030200@destiny.com>
Message-ID: <15742.1861.180590.431080@12-248-11-90.client.attbi.com>

    Michael> Couldn't Zack's comment be added to the directions found at
    Michael> https://sourceforge.net/tracker/?func=add&group_id=5470&atid=305470
    Michael> so that anyone submitting a patch would see how to do it.

On that page there's a link entitled "See our hints on how to create a
patch."  This links to

    http://www.python.org/patches/

which has, I think, the required details.

-- 
Skip Montanaro
skip@pobox.com
consulting: http://manatee.mojam.com/~skip/resume.html


From guido@python.org  Tue Sep 10 15:59:13 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 10 Sep 2002 10:59:13 -0400
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: Your message of "Tue, 10 Sep 2002 09:52:53 CDT."
 <15742.1861.180590.431080@12-248-11-90.client.attbi.com>
References: <3D7DFF3E.3030200@destiny.com>
 <15742.1861.180590.431080@12-248-11-90.client.attbi.com>
Message-ID: <200209101459.g8AExD323473@odiug.zope.com>

>     Michael> Couldn't Zack's comment be added to the directions found at
>     Michael> https://sourceforge.net/tracker/?func=add&group_id=5470&atid=305470
>     Michael> so that anyone submitting a patch would see how to do it.
> 
> On that page there's a link entitled "See our hints on how to create a
> patch."  This links to
> 
>     http://www.python.org/patches/
> 
> which has, I think, the required details.

I added that link a few minutes ago. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mcherm@destiny.com  Tue Sep 10 16:02:23 2002
From: mcherm@destiny.com (Michael Chermside)
Date: Tue, 10 Sep 2002 11:02:23 -0400
Subject: [Python-Dev] Re: raw headers in rfc822.Message
Message-ID: <3D7E097F.7000003@destiny.com>

>> On that page there's a link entitled "See our hints on how to create a
>> patch."  This links to
>> 
>>     http://www.python.org/patches/
>> 
>> which has, I think, the required details.
> 
> I added that link a few minutes ago. :-)
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

I think that's a great fix. Thanks!

-- Michael Chermside




From thomas@xs4all.net  Tue Sep 10 16:12:53 2002
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 10 Sep 2002 17:12:53 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/Include getapplbycreator.h,1.3,1.4 macdefs.h,1.11,1.12 macglue.h,1.61,1.62 pythonresources.h,1.27,1.28
In-Reply-To: <E17okCH-0006Cq-00@usw-pr-cvs1.sourceforge.net>
References: <E17okCH-0006Cq-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20020910151252.GA830@xs4all.nl>

On Tue, Sep 10, 2002 at 05:32:49AM -0700, jackjansen@users.sourceforge.net wrote:

> Modified Files:
> 	getapplbycreator.h macdefs.h macglue.h pythonresources.h 
> Log Message:
> Added include guards and C++ extern "C" {} constructs. Partial fix for #607253.
> Bugfix candidate.

[..]

> *** getapplbycreator.h	19 May 2001 12:32:39 -0000	1.3
> --- getapplbycreator.h	10 Sep 2002 12:32:47 -0000	1.4

[..]

>   ******************************************************************/
> + #ifndef Py_GETAPPLBYCREATOR_H
> + #define Py_GETALLPBYCREATOR_H

This looks suspiciously like a bug. If you really do intend to #define
something different than you just checked against, you should add a comment
stating that this really isn't a typo of a very common idiom :)
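
This kind of slip is mechanical enough to check by script.  A rough
sketch, assuming the guard is written as an #ifndef line immediately
followed by a #define line; the function name guard_mismatch is made up
for illustration.

```python
import re

# "#ifndef NAME" on one line, "#define NAME" on the next.
GUARD = re.compile(r"^\s*#\s*ifndef\s+(\w+)\s*\n\s*#\s*define\s+(\w+)",
                   re.MULTILINE)


def guard_mismatch(header_text):
    """Return (checked, defined) if the guard names differ, else None."""
    match = GUARD.search(header_text)
    if match is None:
        return None
    checked, defined = match.groups()
    return None if checked == defined else (checked, defined)


suspect = "#ifndef Py_GETAPPLBYCREATOR_H\n#define Py_GETALLPBYCREATOR_H\n"
result = guard_mismatch(suspect)
```

It only looks at the first #ifndef/#define pair in the text, which is
enough for headers laid out like the one quoted above.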

I'm-not-dead--I-feel-fine-ly y'rs,
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From pinard@iro.umontreal.ca  Tue Sep 10 16:26:56 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Tue, 10 Sep 2002 11:26:56 -0400
Subject: [Python-Dev] Re: Codecs lookup order
In-Reply-To: <m37khupzo0.fsf@mira.informatik.hu-berlin.de> (martin@v.loewis.de's
 message of "10 Sep 2002 00:26:55 +0200")
References: <oqadmrgt75.fsf@titan.progiciels-bpi.ca>
 <m37khupzo0.fsf@mira.informatik.hu-berlin.de>
Message-ID: <oqbs75q30f.fsf@titan.progiciels-bpi.ca>

[Martin v. Loewis]

> pinard@iro.umontreal.ca (François Pinard) writes:

>> I'm not sure what should best be done.  1) The documentation might be
>> modified to explain the limitation, so other users do not trip up on it.
>> 2) `encoding.lookup()' might merely return None in case `getregentry' is
>> not defined in the imported module, or else, 3) it could make sure that it
>> imports modules exclusively from within the `encodings' package.

> This is what Python 2.3, and Python 2.2.2 will do.

Hi, Martin.

I added "1)", "2)" and "3)" in the original text for clarity.  Will Python
2.2.2 and 2.3 do "3)", or all of "1)", "2)" and "3)"?

If the codec search order is not changed, how does one proceed to override
a bundled codec with another one provided under the same encoding name?

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From xscottg@yahoo.com  Tue Sep 10 17:15:04 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 10 Sep 2002 09:15:04 -0700 (PDT)
Subject: [Python-Dev] 64-bit process optimization 1
In-Reply-To: <200209101354.g8ADswV23058@odiug.zope.com>
Message-ID: <20020910161504.10967.qmail@web40110.mail.yahoo.com>

--- Guido wrote:
> > 
> > a compromise could be to make the swap in 2.3, but only
> > on 64-bit platforms.
> > 
> > it's obvious that most people are stuck on 32-bit platforms
> > today, and I think it's safe to say that users on 64-bit plat-
> > forms might be a bit more willing to build everything they
> > need on their local platform.
> > 
> > another alternative would be to make it a configuration option,
> > with a platform-dependent default.
> 
> I like all of that.  Maybe it should also be a config option whether
> refcount, sizes etc. should be 32 or 64 bit quantities on 64 bit
> platforms.
> 

+1 from this 64 bit user.


From barry@python.org  Tue Sep 10 18:00:32 2002
From: barry@python.org (Barry A. Warsaw)
Date: Tue, 10 Sep 2002 13:00:32 -0400
Subject: [Python-Dev] The first trustworthy <wink> GBayes results
References: <15726.13053.111171.335483@12-248-11-90.client.attbi.com>
 <200208291631.g7TGVgd28718@localhost.localdomain>
Message-ID: <15742.9520.698662.836695@anthem.wooz.org>

>>>>> "AB" == Anthony Baxter <anthony@interlink.com.au> writes:

    >> Skip Montanaro wrote
    >> One thing worth noting before everybody starts using it to
    >> massage their mailboxes is that the email package contains a
    >> bug which causes it to occasionally delete whitespace when
    >> reformatting headers.

BTW, I fixed Greg's problem but not Skip's.  I'm still looking at this
one...

    AB> There's one other known problem - seriously misformatted MIME
    AB> (as seen in spam, and email from Microsoft Entourage) causes
    AB> the email package to barf out. I plan, at some point, to try
    AB> and make a "if it fails, just leave the body as one chunk of
    AB> text" mode, but it's a long long way down my list of
    AB> priorities.

I just checked this into cvs.
-Barry


From martin@v.loewis.de  Tue Sep 10 19:25:16 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Sep 2002 20:25:16 +0200
Subject: [Python-Dev] Subsecond time stamps
In-Reply-To: <200209101006.g8AA6Vb28742@localhost.localdomain>
References: <200209101006.g8AA6Vb28742@localhost.localdomain>
Message-ID: <m3fzwhk8hf.fsf@mira.informatik.hu-berlin.de>

Anthony Baxter <anthony@interlink.com.au> writes:

> Not only that, but if you're that precise, are you measuring the time
> when the modification started, the time when it started hitting the
> disks, when the write on the disk completed, when the O/S signalled
> to the application that the modification was complete... questions 
> questions.. .:)

For Python, these questions are easy to answer: We just report to the
application what the system reports to us. It is the file system
implementor's job to define the notion of modification time.
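
A sketch of what "report what the system reports" looks like from the
application side.  Note that st_mtime_ns is an attribute of much later
Python versions (3.3 and up), not something available or proposed in
this thread; the temporary file is only there to have something to stat.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile() as handle:
    info = os.stat(handle.name)

as_float = info.st_mtime            # seconds since the epoch, as a float
as_nanoseconds = info.st_mtime_ns   # the same instant, as an integer

# The two views agree to within the precision a double can hold.
difference = abs(as_float - as_nanoseconds / 1e9)
```

Whatever granularity the file system really has, Python adds no
interpretation of its own to either number.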

Regards,
Martin



From martin@v.loewis.de  Tue Sep 10 19:26:06 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Sep 2002 20:26:06 +0200
Subject: [Python-Dev] Re: Codecs lookup order
In-Reply-To: <oqbs75q30f.fsf@titan.progiciels-bpi.ca>
References: <oqadmrgt75.fsf@titan.progiciels-bpi.ca>
 <m37khupzo0.fsf@mira.informatik.hu-berlin.de>
 <oqbs75q30f.fsf@titan.progiciels-bpi.ca>
Message-ID: <m3bs75k8g1.fsf@mira.informatik.hu-berlin.de>

pinard@iro.umontreal.ca (François Pinard) writes:

> >> I'm not sure what should best be done.  1) The documentation might be
> >> modified to explain the limitation, so other users do not trip up on it.
> >> 2) `encoding.lookup()' might merely return None in case `getregentry' is
> >> not defined in the imported module, or else, 3) it could make sure that it
> >> imports modules exclusively from within the `encodings' package.
>
> > This is what Python 2.3, and Python 2.2.2 will do.
>
> Hi, Martin.
>
> I added "1)", "2)" and "3)" in the original text for clarity.  Will Python
> 2.2.2 and 2.3 do "3)", or all of "1)", "2)" and "3)"?

Oops, it's 2) that Python 2.3 will do.
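
To show the "return None" convention in isolation, here is a sketch of
a codec search function.  It uses the present-day codecs API
(codecs.CodecInfo is newer than the 2.2/2.3 code under discussion), and
the codec name "xdemoutf8" and the function demo_search are made up for
the example.

```python
import codecs


def demo_search(name):
    """Search function: know one made-up name, decline everything else."""
    if name == "xdemoutf8":
        utf8 = codecs.lookup("utf-8")
        return codecs.CodecInfo(name=name,
                                encode=utf8.encode,
                                decode=utf8.decode)
    # None means "not mine"; the lookup moves on to other search functions.
    return None


codecs.register(demo_search)

encoded = "Fran\u00e7ois".encode("xdemoutf8")

try:
    codecs.lookup("nosuchcodecxyz")
    unknown_found = True
except LookupError:
    unknown_found = False
```

When every search function declines, the caller gets a LookupError
rather than a failure from inside some unrelated module.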

Regards,
Martin


From barry@barrys-emacs.org  Tue Sep 10 20:17:26 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Tue, 10 Sep 2002 20:17:26 +0100
Subject: [Python-Dev] Cut'n'paste
In-Reply-To: <20020909230713.GA5338@panix.com>
Message-ID: <000001c258fe$b326d800$070210ac@LAPDANCE>

You double click and drag to highlight two words. I have to
be missing the problem here. This is all basic GUI usage
and nothing to do with whatever it is that's output URI.

	BArry


> -----Original Message-----
> From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
> Behalf Of Aahz
> Sent: 10 September 2002 00:07
> To: barry@barrys-emacs.org
> Cc: python-dev@python.org
> Subject: [Python-Dev] Cut'n'paste
>
>
> On Mon, Sep 09, 2002, Barry Scott wrote:
> >Aahz:
> >>
> >> xterm does a nifty job usually of figuring out what to highlight when I
> >> double-click on a word.  It fails with mailto: because normally when I
> >> cut'n'paste an address, I *don't* want to include the
> "mailto:" portion.
> >
> > You can configure xterm to treat : as punctuation and not a word
> > char. See man xterm.
>
> Then it would fail with regular URLs.  You can't win.  ;-)
> --
> Aahz (aahz@pythoncraft.com)           <*>
> http://www.pythoncraft.com/
>
> Project Vote Smart: http://www.vote-smart.org/
>




From Jack.Jansen@oratrix.com  Tue Sep 10 21:03:05 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Tue, 10 Sep 2002 22:03:05 +0200
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: <200209101441.g8AEfeW23387@odiug.zope.com>
Message-ID: <51D8ADCF-C4F8-11D6-88B2-003065517236@oratrix.com>

On dinsdag, september 10, 2002, at 04:41 , Guido van Rossum wrote:

>>> I guess we're assuming that even people who aren't familiar with
>>> SourceForge are familiar with diff.  Is that not a reasonable
>>> assumption any more?
>>
>> Not cross-platform. I've had patches for MacPython in rather
>> outlandish diff-like formats, so a note that tells people to use the
>> unix diff program wouldn't hurt.
>
> But what good does a reference to "the unix diff program" do a Mac
> developer?

At the very least they won't send me MPW diffs. At best they 
fire up OSX and use the One True Diff:-)
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -



From Jack.Jansen@oratrix.com  Tue Sep 10 22:05:42 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Tue, 10 Sep 2002 23:05:42 +0200
Subject: [Python-Dev] Weeding out obsolete modules and Demos
Message-ID: <10D997B8-C501-11D6-88B2-003065517236@oratrix.com>

Folks,
how about going over the various demos, and see which ones have 
really lost their usefulness?

I happened to come across Demo/sgi/audio (works only on SGI 
4D/35 machines, which went out of production about 12 years 
ago), sv and video (works on Indigo's with the Starter Video 
board, last seen about 8 years ago). And there's the svmodule.c 
(yup, same board).
There are probably Indigo's still alive (4D35's? I doubt it, I 
can still remember the noise it made:-), but I wonder whether 
anyone in their right mind is still using the SV board.

The forms/fl stuff still technically works on newer SGI's, but 
we might also wonder how useful they still are.

And this is for SGI only, there's probably a lot more dead wood 
out there,
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -



From guido@python.org  Tue Sep 10 22:19:01 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 10 Sep 2002 17:19:01 -0400
Subject: [Python-Dev] Weeding out obsolete modules and Demos
In-Reply-To: Your message of "Tue, 10 Sep 2002 23:05:42 +0200."
 <10D997B8-C501-11D6-88B2-003065517236@oratrix.com>
References: <10D997B8-C501-11D6-88B2-003065517236@oratrix.com>
Message-ID: <200209102119.g8ALJ1h29280@odiug.zope.com>

> how about going over the various demos, and see which ones have 
> really lost their usefulness?

Yeah!

> I happened to come across Demo/sgi/audio (works only on SGI 
> 4D/35 machines, which went out of production about 12 years 
> ago), sv and video (works on Indigo's with the Starter Video 
> board, last seen about 8 years ago). And there's the svmodule.c 
> (yup, same board).
> There are probably Indigo's still alive (4D35's? I doubt it, I 
> can still remember the noise it made:-), but I wonder whether 
> anyone in their right mind is still using the SV board.
> 
> The forms/fl stuff still technically works on newer SGI's, but 
> we might also wonder how useful they still are.
> 
> And this is for SGI only, there's probably a lot more dead wood 
> out there,

I haven't seen or heard an SGI machine for years.  If you think those
SGI demos have lost their usefulness, please use your CVS powers to
delete them!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From drifty@bigfoot.com  Wed Sep 11 01:07:58 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Tue, 10 Sep 2002 17:07:58 -0700 (PDT)
Subject: [Python-Dev] utf-8 issue thread question
Message-ID: <Pine.SOL.4.44.0209101704040.10758-100000@death.OCF.Berkeley.EDU>

So here is the summary question for this thread: what exactly is a
surrogate?  I think I get it (from reading an i18n email from MAL on the
i18n list), but I am not confident enough to stick it in the summary as of
yet.

The following is my current rough summary explanation for what a surrogate
is.  Can someone please correct it as needed?

"""
In Unicode, a surrogate is when you encode from a higher bit total
encoding (such as utf-16) into a smaller bit total encoding by
representing the character as several more bit chunks (such as two utf-8
chunks).  The following line is an example:

	>>> u'\ud800'.encode('utf-8') == '\xed\xa0\x80'

Notice how the initial Unicode character ends up being encoded as three
characters in utf-8.
"""

Also, anyone know of some good Unicode tutorials, explanations, etc. on
the web, in book form, whatever?  Most of the threads that I don't totally
comprehend are Unicode related and I would like to keep my brain-dead
questions to a minimum.  Don't want my reputation to go down the drain.
=)

-Brett



From fredrik@pythonware.com  Wed Sep 11 01:24:53 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 11 Sep 2002 02:24:53 +0200
Subject: [Python-Dev] utf-8 issue thread question
References: <Pine.SOL.4.44.0209101704040.10758-100000@death.OCF.Berkeley.EDU>
Message-ID: <004a01c25929$a6fb3f50$0900a8c0@spiff>

Brett Cannon wrote:

> The following is my current rough summary explanation for what a surrogate
> is.  Can someone please correct it as needed?

needed, indeed.

it's 2.30 am over here, so I'm not going to try to explain this myself,
but some random googling brought up this page:

http://216.239.37.100/search?q=cache:Dk12BZNt6skC:uk.geocities.com/BabelStone1357/Software/surrogates.html

    The code points U+D800 through U+DB7F are reserved as High Surrogates,
    and the code points U+DC00 through U+DFFF are reserved as Low Surrogates.
    Each code point in [the full 20-bit unicode character space] maps to a pair of
    16-bit code points comprising a High Surrogate followed by a Low Surrogate.
    Thus, for example, the Gothic letter AHSA has the UTF-32 value of U+10330,
    which maps to the surrogate pair U+D800 and U+DF30. That is to say, in the
    16-bit encoding of Unicode (UTF-16), the Gothic letter AHSA is represented
    by two consecutive 16-bit code points (U+D800 and U+DF30), whereas in the
    32-bit encoding of Unicode (UTF-32), the same letter is represented by a
    single 32-bit value (U+10330).
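
The AHSA example in the quoted text can be checked with the built-in
codecs of a current Python 3 interpreter (an assumption; this thread
is about Python 2.2 and 2.3).  A minimal sketch:

```python
import struct

ahsa = "\U00010330"  # GOTHIC LETTER AHSA

# UTF-16 (big-endian, no BOM) needs two 16-bit units: the surrogate pair.
utf16_units = struct.unpack(">2H", ahsa.encode("utf-16-be"))

# UTF-32 (big-endian, no BOM) needs a single 32-bit unit.
utf32_units = struct.unpack(">1I", ahsa.encode("utf-32-be"))

print([hex(u) for u in utf16_units])  # ['0xd800', '0xdf30']
print([hex(u) for u in utf32_units])  # ['0x10330']
```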

</F>




From whisper@oz.net  Wed Sep 11 01:40:56 2002
From: whisper@oz.net (David LeBlanc)
Date: Tue, 10 Sep 2002 17:40:56 -0700
Subject: [Python-Dev] Weeding out obsolete modules and Demos
In-Reply-To: <200209102119.g8ALJ1h29280@odiug.zope.com>
Message-ID: <GCEDKONBLEFPPADDJCOEAEIOEOAA.whisper@oz.net>

> I haven't seen or heard an SGI machine for years.  If you think those
> SGI demos have lost their usefulness, please use your CVS powers to
> delete them!
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

Um... maybe just move them to the not-shipped side of things at first in
case there are hold-outs out there still clinging to their stone axes? ;)

Dave LeBlanc
Seattle, WA USA



From drifty@bigfoot.com  Wed Sep 11 01:40:00 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Tue, 10 Sep 2002 17:40:00 -0700 (PDT)
Subject: [Python-Dev] utf-8 issue thread question
In-Reply-To: <004a01c25929$a6fb3f50$0900a8c0@spiff>
Message-ID: <Pine.SOL.4.44.0209101738030.10758-100000@death.OCF.Berkeley.EDU>

[Fredrik Lundh]

> Brett Cannon wrote:
>
> it's 2.30 am over here, so I'm not going to try to explain this myself,
> but some random googling brought up this page:
>
> http://216.239.37.100/search?q=cache:Dk12BZNt6skC:uk.geocities.com/BabelStone1357/Software/surrogates.html
>
>     The code points U+D800 through U+DB7F are reserved as High Surrogates,
>     and the code points U+DC00 through U+DFFF are reserved as Low Surrogates.
>     Each code point in [the full 20-bit unicode character space] maps to a pair of
>     16-bit code points comprising a High Surrogate followed by a Low Surrogate.
>     Thus, for example, the Gothic letter AHSA has the UTF-32 value of U+10330,
>     which maps to the surrogate pair U+D800 and U+DF30. That is to say, in the
>     16-bit encoding of Unicode (UTF-16), the Gothic letter AHSA is represented
>     by two consecutive 16-bit code points (U+D800 and U+DF30), whereas in the
>     32-bit encoding of Unicode (UTF-32), the same letter is represented by a
>     single 32-bit value (U+10330).
>
> </F>
>

So with that explanation, here is the current rewrite:

"""
In Unicode, a surrogate pair is when you create the representation of a
character by using two values. So, for instance, UTF-32 can cover the
entire Unicode space (Unicode is 20 bits), but UTF-16 can't.  To solve the
issue a character can be represented as a pair of UTF-16 values.

The problem in Python 2.2.1 is that when there is only a lone surrogate
(instead of there being a pair of values), the encoder for UTF-8 messes up
and leaves off a UTF-8 value.  The following line is an example:

	>>> u'\ud800'.encode('utf-8')
	'\xa0\x80'  #In Python 2.2.1
	'\xed\xa0\x80'  #In Python 2.3a0

Notice how in Python 2.3a0 the extra value is inserted so as to make the
representation a complete Unicode character instead of only encoding the
half of the surrogate pair that the encoder was given.
"""

How is that?

-Brett



From guido@python.org  Wed Sep 11 01:39:14 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 10 Sep 2002 20:39:14 -0400
Subject: [Python-Dev] utf-8 issue thread question
In-Reply-To: Your message of "Tue, 10 Sep 2002 17:07:58 PDT."
 <Pine.SOL.4.44.0209101704040.10758-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209101704040.10758-100000@death.OCF.Berkeley.EDU>
Message-ID: <200209110039.g8B0dEQ09916@pcp02138704pcs.reston01.va.comcast.net>

> So here is the summary question for this thread: what exactly is a
> surrogate?

Unicode surrogates are used specifically to encode Unicode characters
with values >= 2**16 as two 16-bit code points.  The Unicode standard
has conveniently reserved two ranges for these (see /F's post).  The
first (high) surrogate encodes the high 10 bits, the second (low)
surrogate encodes the low 10 bits.  For redundancy, the top bit
pattern is different for high and low surrogates.  One thing to watch
out for: I believe that the bit pattern that's encoded is not the bit
pattern of the full unicode character, but 2**16 less.  This allows
one to encode 2**16 more characters, at the cost of some extra
complexity.
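
The arithmetic described above can be sketched as follows.  The
function names are hypothetical; the only inputs are the two reserved
ranges and the "2**16 less" rule, so treat this as an illustration
rather than what any particular codec does internally.

```python
def to_surrogates(cp):
    """Split a code point >= 2**16 into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point outside the supplementary range")
    offset = cp - 0x10000             # the "2**16 less" step: 20 bits remain
    high = 0xD800 | (offset >> 10)    # top 10 bits go into the high surrogate
    low = 0xDC00 | (offset & 0x3FF)   # bottom 10 bits go into the low surrogate
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


print([hex(u) for u in to_surrogates(0x10330)])  # ['0xd800', '0xdf30']
print(hex(from_surrogates(0xD800, 0xDF30)))      # 0x10330
```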

> I think I get it (from reading an i18n email from MAL on the
> i18n list), but I am not confident enough to stick it in the summary as of
> yet.
> 
> The following is my current rough summary explanation for what a surrogate
> is.  Can someone please correct it as needed?
> 
> """
> In Unicode, a surrogate is when you encode from a higher bit total
> encoding (such as utf-16) into a smaller bit total encoding by
> representing the character as several more bit chunks (such as two utf-8
> chunks).  The following line is an example:
> 
> 	>>> u'\ud800'.encode('utf-8') == '\xed\xa0\x80'
> 
> Notice how the initial Unicode character ends up being encoded as three
> characters in utf-8.
> """

No, the UTF8 encoding is not called surrogate.  Only 16-bit values are
surrogates.  In this example, \ud800 is a high surrogate that's not
followed by a low surrogate.  The UTF-8 encoder could do two things
with this: encode the bit pattern, or throw an error.  Note that when
the UTF-8 encoder sees a *pair* of surrogates (a high surrogate
followed by a low surrogate), it is supposed to extract the single
unicode character from them, and encode that.  The UTF-8 decoder must
in turn create a surrogate pair when decoding to 16-bit Unicode (as
opposed to when decoding to 32-bit Unicode, when it should not
generate surrogates).
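
Both options for a lone surrogate (encode the bit pattern, or throw an
error) can be seen side by side in a current Python 3 interpreter.
This is only an illustration under that assumption; it says nothing
about what Python 2.2.1 or 2.3a0 do.

```python
lone_high = "\ud800"

# Option 1: throw an error.  The strict UTF-8 codec refuses a lone surrogate.
try:
    lone_high.encode("utf-8")
    strict_result = "encoded"
except UnicodeEncodeError:
    strict_result = "error"

# Option 2: encode the bit pattern.  The 'surrogatepass' error handler
# writes the surrogate out as an ordinary three-byte sequence.
passed_through = lone_high.encode("utf-8", "surrogatepass")

# A full supplementary character becomes one four-byte sequence.
ahsa_utf8 = "\U00010330".encode("utf-8")

print(strict_result)    # error
print(passed_through)   # b'\xed\xa0\x80'
print(ahsa_utf8)        # b'\xf0\x90\x8c\xb0'
```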

Note that there are various problems with this.  Surrogates are
illegal in 32-bit Unicode, but of course you cannot really prevent
them from occurring.  What should that mean?

> Also, anyone know of some good Unicode tutorials, explanations,
> etc. on the web, in book form, whatever?  Most of the threads that I
> don't totally comprehend are Unicode related and I would like to
> keep my brain-dead questions to a minimum.  Don't want my
> reputation to go down the drain.  =)

I think the Unicode consortium website, www.unicode.org, has lots of
good stuff, including the complete standard online.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From python@rcn.com  Wed Sep 11 04:39:52 2002
From: python@rcn.com (Raymond Hettinger)
Date: Tue, 10 Sep 2002 23:39:52 -0400
Subject: [Python-Dev] Re: raw headers in rfc822.Message
References: <3D7DFF3E.3030200@destiny.com>  <200209101426.g8AEQm023271@odiug.zope.com>
Message-ID: <003201c25944$e4225a60$69d8accf@othello>

From: "Guido van Rossum" <guido@python.org>

> I guess we're assuming that even people who aren't familiar with
> SourceForge are familiar with diff.  Is that not a reasonable
> assumption any more?
> 
> There's also the developer FAQ, which has careful instructions for
> patch generation at
> 
>   http://www.python.org/dev/devfaq.html#patches
> 
> and in addition points to http://www.python.org/patches/ which has
> everything you need (except the hint about forward diffs; I'll add
> that).

FAQs and pointers be darned; there is only one way to 
developerhood and that is through the school of hard knocks.

- Learn to use CVS (which, of course, entails SSH and such).
- Use Google (a lot), or risk proposing that which has already
  been decided, researched, or discussed ad nauseam.
- Submit a patch.  Find out that it was the wrong diff format
  or keyed-off of an older, non-current version of the file.
  Then find-out that your editor's tabbing and  spacing confounds 
  somebody's life, somewhere.  Oh, did you forget to run the 
  regression tests? Did your tests run fine, but you didn't run them 
  in debug mode?  Perhaps your Windows machine skips the test for
  the library you modified.  Break working code and suffer public 
  flogging.
- Read every PEP and make sure your patch style has no
  deviations (unless, of course, your C and Python coding
  style already matched the PEPs).
- BTW, did you submit unittests and docs with your patch?  
  Did you make appropriate adjustments to the makefiles,
  and every other reference to your work?  And appropriate
   announcements in Misc/NEWS?
- Surely, you've learned TeX and its many Python specific
  macros (forward slash or backslash, verbatim or code?)
  How many characters were on your longest line (72, 78,
  hopefully, not more).
- When learning a guitar, it helps to develop calluses on
  the fingers.  Writing a PEP is the fastest way to develop the
  calluses; contradicting Guido is the second fastest way; 
  submitting a great idea is third fastest (bad ideas either
  get ignored or are slammed so quickly that the scar tissue
  doesn't have time to develop).
- Experience the politics of bug resolution.  If a developer 
  proposed it, then it should not be dismissed lightly.  If
  someone had a grandiose scheme in mind when they
  submitted the report, be prepared for wrath when you
  apply a simple solution. Realize that, in some cases,
  someone, somewhere is relying on the undocumented
  buggy behavior and your fixing it is breaking their code.
- And, my all time favorite, do everything right (formatting,
   procedure, profiling, testing, etc) and watch the Timbot
   come along five minutes later and improve your code
   making it faster, clearer, more conformant, more elegant, 
   and also gelling neatly with the vagaries of memory allocation,
   cache performance, and compilers you've never heard of.


Raymond Hettinger

Oh, and did I mention that native speakers of ASCII will
never be able to master Unicode like a native?






From drifty@bigfoot.com  Wed Sep 11 05:36:30 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Tue, 10 Sep 2002 21:36:30 -0700 (PDT)
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: <003201c25944$e4225a60$69d8accf@othello>
Message-ID: <Pine.SOL.4.44.0209102134130.4973-100000@death.OCF.Berkeley.EDU>

[Raymond Hettinger]

<a whole lot of funny stuff>

Kudos to Raymond on this email.  Great stuff.  I know I have had my
growing pains with learning how to do everything correctly, so I really
appreciate his points.

-Brett



From martin@v.loewis.de  Wed Sep 11 07:23:17 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 11 Sep 2002 08:23:17 +0200
Subject: [Python-Dev] utf-8 issue thread question
In-Reply-To: <200209110039.g8B0dEQ09916@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.SOL.4.44.0209101704040.10758-100000@death.OCF.Berkeley.EDU>
 <200209110039.g8B0dEQ09916@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3u1kxf3je.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> One thing to watch out for: I believe that the bit pattern that's
> encoded is not the bit pattern of the full unicode character, but
> 2**16 less.  This allows one to encode 2**16 more characters, at the
> cost of some extra complexity.

Correct. That allows encoding a total of 17 planes in Unicode, a
plane being 2**16 characters. Therefore, saying that Unicode is 20
bits is somewhat imprecise - it's better to say that it is 21 bits.
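
The 17-plane and 21-bit figures can be checked with a few lines of
arithmetic.  The constant names below are made up for the
illustration.

```python
PLANE_SIZE = 2 ** 16                  # one plane
BMP = PLANE_SIZE                      # plane 0, reachable without surrogates
VIA_SURROGATES = 2 ** 10 * 2 ** 10    # 10 bits from each surrogate

total_code_points = BMP + VIA_SURROGATES
planes = total_code_points // PLANE_SIZE
highest_code_point = total_code_points - 1

print(planes)                             # 17
print(hex(highest_code_point))            # 0x10ffff
print(highest_code_point.bit_length())    # 21
```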

Regards,
Martin


From sjoerd@acm.org  Wed Sep 11 09:44:03 2002
From: sjoerd@acm.org (Sjoerd Mullender)
Date: Wed, 11 Sep 2002 10:44:03 +0200
Subject: [Python-Dev] Weeding out obsolete modules and Demos
In-Reply-To: <200209102119.g8ALJ1h29280@odiug.zope.com>
References: <10D997B8-C501-11D6-88B2-003065517236@oratrix.com>
 <200209102119.g8ALJ1h29280@odiug.zope.com>
Message-ID: <200209110844.g8B8i3D14336@indus.ins.cwi.nl>

I would not be opposed to deleting the whole Demo/sgi tree.

I don't have an SGI workstation anymore, but I did have an SGI O2
until recently.  I think the audio (and al) stuff and the gl stuff
probably still work (I used mclock until recently).  I think we can
definitely get rid of the video directory (CMIF video format, remember
that?).  I'm not sure whether sv still compiles on modern SGI's.  The
cd module also still works.

Having said this, I'm not sure there is still much value in keeping
the demos.  The modules in the Modules directory are another matter.
Until recently I have used cd and al.  I think cl might still work,
but I'm not sure.  I don't think sv works on anything other than
Indigo's with a Starter Video board.
gl (and I think also fm) still works.  sgi also still works, but I'm
not sure how useful it still is.  It just defines functions nap and
_getpty.
rgbimg is for reading SGI RGB images, but is portable.  Although one
must ask whether it has a place in the standard library, since there
is no similar level of support for more popular image formats.

On Tue, Sep 10 2002 Guido van Rossum wrote:

> > how about going over the various demos, and see which ones have 
> > really lost their usefulness?
> 
> Yeah!
> 
> > I happened to come across Demo/sgi/audio (works only on SGI 
> > 4D/35 machines, which went out of production about 12 years 
> > ago), sv and video (works on Indigo's with the Starter Video 
> > board, last seen about 8 years ago). And there's the svmodule.c 
> > (yup, same board).
> > There are probably Indigo's still alive (4D35's? I doubt it, I 
> > can still remember the noise it made:-), but I wonder whether 
> > anyone in their right mind is still using the SV board.
> > 
> > The forms/fl stuff still technically works on newer SGI's, but 
> > we might also wonder how useful they still are.
> > 
> > And this is for SGI only, there's probably a lot more dead wood 
> > out there,
> 
> I haven't seen or heard an SGI machine for years.  If you think those
> SGI demos have lost their usefulness, please use your CVS powers to
> delete them!
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

-- Sjoerd Mullender <sjoerd@acm.org>


From mcherm@destiny.com  Wed Sep 11 14:02:28 2002
From: mcherm@destiny.com (Michael Chermside)
Date: Wed, 11 Sep 2002 09:02:28 -0400
Subject: [Python-Dev] Re: raw headers in rfc822.Message
Message-ID: <3D7F3EE4.1050009@destiny.com>

Raymond Hettinger writes:
> FAQs and pointers be darned; there is only one way to 
> developerhood and that is through the school of hard knocks.
> 
 >    [Terrific course catalog for said school elided]

Raymond, I had to laugh at your "course catalog"... very funny but true, 
all of it very true. It made it into my long-term list of bookmarks not 
to lose.

But it's worth noting that, although the school of hard knocks may be 
required by life itself (and the nature of programming), anytime that we 
can just add a link to a web page and perhaps allow someone to skip a 
course, it's a win all around.

Anyway, thanks for bringing some humor into my morning.

-- Michael Chermside




From guido@python.org  Wed Sep 11 15:52:30 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 11 Sep 2002 10:52:30 -0400
Subject: [Python-Dev] Re: raw headers in rfc822.Message
In-Reply-To: Your message of "Wed, 11 Sep 2002 09:02:28 EDT."
 <3D7F3EE4.1050009@destiny.com>
References: <3D7F3EE4.1050009@destiny.com>
Message-ID: <200209111452.g8BEqUe07618@odiug.zope.com>

> Raymond Hettinger writes:
> > FAQs and pointers be darned; there is only one way to 
> > developerhood and that is through the school of hard knocks.
> > 
>  >    [Terrific course catalog for said school elided]
> 
> Raymond, I had to laugh at your "course catalog"... very funny but true, 
> all of it very true. It made it into my long-term list of bookmarks not 
> to lose.

Better yet, it made the Developer FAQ. :-)

> But it's worth noting that, although the school of hard knocks may be 
> required by life itself (and the nature of programming), anytime that we 
> can just add a link to a web page and perhaps allow someone to skip a 
> course, it's a win all around.

Got specific text you'd like us to add to a specific page?  Send it to
webmaster!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Wed Sep 11 18:15:01 2002
From: mwh@python.net (Michael Hudson)
Date: 11 Sep 2002 18:15:01 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules _hotshot.c,1.26,1.27
In-Reply-To: "Raymond Hettinger"'s message of "Wed, 11 Sep 2002 12:56:36 -0400"
References: <E17p9Xe-0007q3-00@usw-pr-cvs1.sourceforge.net> <000901c259b4$318ceda0$d4e97ad1@othello>
Message-ID: <2md6rk30tm.fsf@starship.python.net>

"Raymond Hettinger" <python@rcn.com> writes:

> Houston, we have a problem:
> 
> C:\py23\Modules\_hotshot.c(891) : error C2198: 'pack_lineno_tdelta' : too few actual parameters
> C:\py23\Modules\_hotshot.c(892) : error C2059: syntax error : ')'

Yikes!  Fixed.

I was sure I checked everything before checking in...

Cheers,
M.

-- 
  Imagine if every Thursday your shoes exploded if you tied them
  the usual way.  This happens to us all the time with computers,
  and nobody thinks of complaining.                     -- Jef Raskin


From hbl@st-andrews.ac.uk  Wed Sep 11 19:18:02 2002
From: hbl@st-andrews.ac.uk (Hamish Lawson)
Date: Wed, 11 Sep 2002 19:18:02 +0100
Subject: [Python-Dev] Patch to make cgi.FieldStorage iterate over its keys
Message-ID: <5.1.1.6.0.20020911190503.035e7590@spey.st-andrews.ac.uk>

Below is a patch to make cgi.FieldStorage iterate over its keys, allowing 
it to behave like any other dictionary in this kind of construct:

     form = cgi.FieldStorage()
     for key in form:
         do something ...


Hamish Lawson


---
Compare: (<)E:\Python22\Lib\cgi.py (34894 bytes)
    with: (>)E:\temp\cgi.py (34955 bytes)

524a524,526
 >     def __iter__(self):
 >         return iter(self.keys())
 >
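
The effect of the patch can be sketched on a small stand-in class.
`FormLike` is hypothetical and is not cgi.FieldStorage; it only has a
`keys()` method, a `__getitem__`, and the same `__iter__` body as the
patch above.

```python
class FormLike:
    """Hypothetical stand-in for a mapping-like form object."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def keys(self):
        return list(self._fields.keys())

    def __getitem__(self, key):
        return self._fields[key]

    def __iter__(self):
        # Same body as the patch: iterate over the keys.
        return iter(self.keys())


form = FormLike({"name": "Hamish", "lang": "Python"})
seen = [key for key in form]
print(seen)  # ['name', 'lang']
```

Without `__iter__`, a `for` loop over such an object would fall back
to calling `__getitem__` with 0, 1, 2, ..., which for a mapping
raises KeyError on the first step.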



From guido@python.org  Wed Sep 11 19:23:56 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 11 Sep 2002 14:23:56 -0400
Subject: [Python-Dev] Patch to make cgi.FieldStorage iterate over its keys
In-Reply-To: Your message of "Wed, 11 Sep 2002 19:18:02 BST."
 <5.1.1.6.0.20020911190503.035e7590@spey.st-andrews.ac.uk>
References: <5.1.1.6.0.20020911190503.035e7590@spey.st-andrews.ac.uk>
Message-ID: <200209111823.g8BINuP22410@odiug.zope.com>

> Below is a patch to make cgi.FieldStorage iterate over its keys, allowing 
> it to behave like any other dictionary in this kind of construct:
> 
>      form = cgi.FieldStorage()
>      for key in form:
>          do something ...

Thanks.  I've applied this.

Points subtracted though for (1) sending a patch to python-dev instead
of using SourceForge and (2) sending a plain diff instead of a context
diff.  For that, your name won't be added to the list of
contributors. :-)
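
For comparison, a context diff of the same kind of change can be
produced with the standard difflib module.  This is a sketch only:
the file names and the surrounding lines are invented for the
example and are not taken from cgi.py.

```python
import difflib

# Invented "before" and "after" contents, one list entry per line.
old = ["    def keys(self):\n", "        ...\n", "\n"]
new = old + ["    def __iter__(self):\n",
             "        return iter(self.keys())\n",
             "\n"]

context = list(difflib.context_diff(old, new,
                                    fromfile="cgi.py.orig",
                                    tofile="cgi.py"))
print("".join(context), end="")
```

Unlike the plain diff, the output names both files in its header and
shows unchanged lines around each change.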

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mats@laplaza.org  Thu Sep 12 18:10:36 2002
From: mats@laplaza.org (Mats Wichmann)
Date: Thu, 12 Sep 2002 11:10:36 -0600
Subject: [Python-Dev] Re: 64-bit process optimization 1
In-Reply-To: <20020909233501.11696.35777.Mailman@mail.python.org>
Message-ID: <5.1.0.14.1.20020912105340.00aae7f0@204.151.72.2>

 >So perhaps the refcnt should have been a long in the first place.  A
 >similar argument may hold for the length of e.g. strings and lists:
 >one could wish to have a list of more than 2 billion elements, or a
 >string containing more than 2 gigabytes (that much RAM is easily found
 >on the larger 64-bit servers, I believe).
 >
 >Opinions?

If you change to longs it seems the reported
performance increase goes away, which would
seem to eliminate one of the motivations for
accepting the pain of a binary incompatibility.

Leaving just "getting it right".

Mats



From guido@python.org  Thu Sep 12 18:19:54 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 12 Sep 2002 13:19:54 -0400
Subject: [Python-Dev] Re: 64-bit process optimization 1
In-Reply-To: Your message of "Thu, 12 Sep 2002 11:10:36 MDT."
 <5.1.0.14.1.20020912105340.00aae7f0@204.151.72.2>
References: <5.1.0.14.1.20020912105340.00aae7f0@204.151.72.2>
Message-ID: <200209121719.g8CHJs729947@odiug.zope.com>

>  >So perhaps the refcnt should have been a long in the first place.  A
>  >similar argument may hold for the length of e.g. strings and lists:
>  >one could wish to have a list of more than 2 billion elements, or a
>  >string containing more than 2 gigabytes (that much RAM is easily found
>  >on the larger 64-bit servers, I believe).
>  >
>  >Opinions?
> 
> If you change to longs it seems the reported
> performance increase goes away, which would
> seem to eliminate one of the motivations for
> accepting the pain of a binary incompatibility.
> 
> Leaving just "getting it right".

Yup.  That's why I think it might have to be a 3-valued config option,
relevant for 64-bit machines only: "compat", "optimal", or "right".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tismer@tismer.com  Thu Sep 12 19:02:47 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 12 Sep 2002 20:02:47 +0200
Subject: [Python-Dev] flextype.c  -- extended type system
Message-ID: <3D80D6C7.9040201@tismer.com>

Hi Guido, py-dev,

preface:
--------
a week ago or so, I sent a patch to Guido that removes
the "etype" struct. This is a hidden structure that
extends types when they are allocated on the heap.
One restriction with this type was that types could
not be extended by metatypes for some internal reason.
I fixed this. Now meta-types can define extra slots
for types.

the point:
----------
I wasn't really after slots in types, but I wanted to
have a type that can be extended as the user likes to.
Using the re-worked etype (now named PyHeapType_Type),
I created a new meta-type with some cool new features
which give you C++ - like virtual methods and inheritance.
The new dynamic type (PyFlexType_Type) allows you to clone
any existing type, and thereby to pass a virtual method
table which will be bound into the type. It is a bit like
slots and slot definitions, but the VMT definition is
written like a PyMethodDef list (in fact, I have PyCMethodDef),
and the created virtual function entries are spelled explicitly
in the type structure.

Structure of a VMT definition:

typedef struct _pycmethoddef {
     char        *name;      /* name to lookup in __dict__ */
     PyCFunction match;      /* to be found if non-overridden */
     void        *fast;      /* native C call */
     void        *wrap;      /* wrapped call into Python */
     int         offset;     /* slot offset in heap type */
} PyCMethodDef;

At creation time of a new flextype, all VMT entries in the
accumulated bases (accessed via the MRO) are scanned from
oldest to newest, and the new type's methods are retrieved
by the "name" entry. Then it is checked whether the
method descriptor still points to the original PyCFunction
entry (the "match" field). If it is still original, the
native C call (field "fast") is inserted into the VMT,
otherwise the wrapped Python callback (field "wrap") is
inserted.
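
The selection rule just described can be modelled in a few lines of
Python.  This is only a sketch of the logic, not code from
flextype.c: `CMethodDef`, `build_vmt` and the three `*_channel_send`
functions are hypothetical stand-ins, and the scan over the bases is
reduced to a single list of definitions.

```python
from collections import namedtuple

# Hypothetical model of one VMT definition (name, match, fast, wrap).
CMethodDef = namedtuple("CMethodDef", "name match fast wrap")


def channel_send(self, arg):        # stands in for the PyCFunction ("match")
    return "built-in send"


def impl_channel_send(self, arg):   # stands in for the native C call ("fast")
    return "fast path"


def wrap_channel_send(self, arg):   # stands in for the Python callback ("wrap")
    return type(self).send(self, arg)


class Channel:
    send = channel_send


class LoggingChannel(Channel):
    def send(self, arg):            # a Python-level override
        return "overridden in Python"


CHANNEL_CMETHODS = [
    CMethodDef("send", channel_send, impl_channel_send, wrap_channel_send),
]


def build_vmt(cls, defs):
    """Pick the fast entry unless the method was overridden."""
    vmt = {}
    for d in defs:
        current = getattr(cls, d.name)          # look the method up by name
        vmt[d.name] = d.fast if current is d.match else d.wrap
    return vmt


print(build_vmt(Channel, CHANNEL_CMETHODS)["send"].__name__)
# impl_channel_send
print(build_vmt(LoggingChannel, CHANNEL_CMETHODS)["send"].__name__)
# wrap_channel_send
```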

As a result, it is now very cheap to use overridable small
methods in your C implementations, since it nearly comes to
no cost if the method isn't overridden.
It is also possible to have private methods, in the sense
that you can use inheritance between your flextypes without
publishing every virtual method to Python at all.

Here is an example from my Stackless type system, where I made
my channel interface overridable:

(channelobject.h)
"""
#define CHANNEL_SEND_HEAD(func) \
     int func (PyChannelObject *self, PyObject *arg)

#define CHANNEL_SEND_EXCEPTION_HEAD(func) \
     int func (PyChannelObject *self, PyObject *klass, PyObject *value)

#define CHANNEL_RECEIVE_HEAD(func) \
     PyObject * func (PyChannelObject *self)


typedef struct _pychannel_heaptype {
     PyFlexTypeObject type;
     /* the fast callbacks */
     CHANNEL_SEND_HEAD(           (*send)             );
     CHANNEL_SEND_EXCEPTION_HEAD( (*send_exception)   );
     CHANNEL_RECEIVE_HEAD(        (*receive)          );
} PyChannel_HeapType;

int init_channeltype(void);
"""

Here is the VMT definition from channelobject.c:
"""
static PyCMethodDef
channel_cmethods[] = {
     CMETHOD_PUBLIC_ENTRY(PyChannel_HeapType, channel, send),
     CMETHOD_PUBLIC_ENTRY(PyChannel_HeapType, channel, send_exception),
     CMETHOD_PUBLIC_ENTRY(PyChannel_HeapType, channel, receive),
     {NULL}                       /* sentinel */
};

"""

where the CMETHOD_PUBLIC_ENTRY macro looks like this:
/*
  * a public entry defines
  * - the function name      "name"
  * - the PyCFunction        class_name       seen from Python,
  * - the fast function      impl_class_name  implements the method for C
  * - the wrapper function   wrap_class_name  that calls back into a
  *                                           Python override.
  */
#define CMETHOD_PUBLIC_ENTRY(type, prefix, name) \
     {#name, (PyCFunction)prefix##_##name, &impl_##prefix##_##name, \
     &wrap_##prefix##_##name, \
     offsetof(type, name)}

So basically three functions are involved in a virtual method:
the PyCFunction, the C implementation and a wrapper.
Normally, the PyCFunction and the implementation can be
identical, but usually my C interface looks slightly
different from the Python interface, for convenience.

Here is an excerpt from channel_send:
"""
int
PyChannel_Send(PyChannelObject *self, PyObject *arg)
{
	PyChannel_HeapType *t = (PyChannel_HeapType *) self->ob_type;
	return t->send(self, arg);
}

static CHANNEL_SEND_HEAD(impl_channel_send)
{
     PyThreadState *ts = PyThreadState_GET();
     PyTaskletObject *sender, *receiver;
.... implementation skipped ....
}

static CHANNEL_SEND_HEAD(wrap_channel_send)
{
     PyObject * ret = PyObject_CallMethod((PyObject *) self, "send", 
"(O)", arg);
     return slp_return_wrapper(ret);
}

static PyObject *
channel_send(PyObject *myself, PyObject *arg)
{
     if (impl_channel_send((PyChannelObject*)myself, arg))
         return NULL;
     Py_INCREF(Py_None);
     return Py_None;
}
"""

end of story.

Summary:
--------
Overridable methods have always been present in Python,
via the built-in method slots. My extension methods
give the same functionality to the user, at maximum possible
speed (only templates can be faster).
The benefit is that users can use much more flexibility
in C modules than before, without fear of speed loss.
I believe that virtual methods will be used more often,
since it is cheap, flexible and compatible with Python.

Please let me know if there is interest in using this technique
in the Python core. I'm also not sure how to show the complete
thing, since it is partially a patch to the existing type
implementation (concerning the etype), partially a new C
module flextype.c, and the rest is part of Stackless.
Does it make sense (would somebody look at it) if I create
a little demo application or something?

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From barry@barrys-emacs.org  Thu Sep 12 23:12:37 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Thu, 12 Sep 2002 23:12:37 +0100
Subject: [Python-Dev] Feedback request on popen2.py Unix fix
Message-ID: <000101c25aa9$8094e650$070210ac@LAPDANCE>

I have logged a bug against python 2.2.1 with a fix.

[ 608635 ] Unix popen does not return exit status

Attached to the bug report is a proposed fix for
popen2.py. I'd appreciate feedback on the validity
of the changes.

		Barry




From aahz@pythoncraft.com  Fri Sep 13 02:23:06 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 12 Sep 2002 21:23:06 -0400
Subject: [Python-Dev] type categories
In-Reply-To: <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net>
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020913012306.GA20221@panix.com>

On Sat, Aug 24, 2002, Guido van Rossum wrote:
>
> Why do keep arguing for inheritance?  (a) the need to deny inheritance
> from an interface, while essential, is relatively rare IMO, and in
> *most* cases the inheritance rules work just fine; (b) having two
> separate but similar mechanisms makes the language larger.
> 
> For example, if we ever are going to add argument type declarations to
> Python, it will probably look like this:
> 
>     def foo(a: classA, b: classB):
>         ...body...

I'm curious, and I don't recall having seen anything about this: why
wouldn't we simply use attributes to hold this information, like
__slots__?  After all, attributes get inherited, too, and there's no
need to pretzel the syntax.  Using attributes IMO would make it easier
to handle the case where derived classes need to mangle type and
interface declarations.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From dave@boost-consulting.com  Fri Sep 13 02:26:38 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Thu, 12 Sep 2002 21:26:38 -0400
Subject: [Python-Dev] type categories
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net> <20020913012306.GA20221@panix.com>
Message-ID: <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com>

From: "Aahz" <aahz@pythoncraft.com>


> On Sat, Aug 24, 2002, Guido van Rossum wrote:
> >
> > Why do keep arguing for inheritance?  (a) the need to deny inheritance
> > from an interface, while essential, is relatively rare IMO, and in
> > *most* cases the inheritance rules work just fine; (b) having two
> > separate but similar mechanisms makes the language larger.
> >
> > For example, if we ever are going to add argument type declarations to
> > Python, it will probably look like this:
> >
> >     def foo(a: classA, b: classB):
> >         ...body...
>
> I'm curious, and I don't recall having seen anything about this: why
> wouldn't we simply use attributes to hold this information, like
> __slots__?  After all, attributes get inherited, too, and there's no
> need to pretzel the syntax.  Using attributes IMO would make it easier
> to handle the case where derived classes need to mangle type and
> interface declarations.

A few weeks ago I realized there was reason in principle that declaring a
class satisfies an interface shouldn't just amount to adding the interface
to the class' __bases__ (as Guido has been suggesting all along).

Why not? Am I missing something?

-Dave

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com




From guido@python.org  Fri Sep 13 05:37:57 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 13 Sep 2002 00:37:57 -0400
Subject: [Python-Dev] type categories
In-Reply-To: Your message of "Thu, 12 Sep 2002 21:23:06 EDT."
 <20020913012306.GA20221@panix.com>
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net>
 <20020913012306.GA20221@panix.com>
Message-ID: <200209130437.g8D4bvi13109@pcp02138704pcs.reston01.va.comcast.net>

> > Why do keep arguing for inheritance?  (a) the need to deny inheritance
> > from an interface, while essential, is relatively rare IMO, and in
> > *most* cases the inheritance rules work just fine; (b) having two
> > separate but similar mechanisms makes the language larger.
> > 
> > For example, if we ever are going to add argument type declarations to
> > Python, it will probably look like this:
> > 
> >     def foo(a: classA, b: classB):
> >         ...body...
> 
> I'm curious, and I don't recall having seen anything about this: why
> wouldn't we simply use attributes to hold this information, like
> __slots__?  After all, attributes get inherited, too, and there's no
> need to pretzel the syntax.  Using attributes IMO would make it easier
> to handle the case where derived classes need to mangle type and
> interface declarations.

That's exactly what Zope does with the __implements__ attribute.

But it's got limitations: there's only one __implements__ attribute, so
it isn't automatically merged properly on multiple inheritance, and
adding one new interface to it means you have to copy or reference the
base class __implements__ attribute.
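
A minimal sketch of that limitation, in modern Python 3 syntax. The
attribute name `__declared__` and all class names are hypothetical
stand-ins chosen for illustration; this is not Zope's actual API.

```python
# Hypothetical interface markers and attribute name, for illustration only.
class IReadable: pass
class IWritable: pass
class ISeekable: pass

class Reader:
    __declared__ = (IReadable,)

class Writer:
    __declared__ = (IWritable,)

class ReadWriter(Reader, Writer):
    pass

# Ordinary attribute lookup stops at the first class in the MRO that
# defines the name, so Writer's declaration is not merged in.
lost = IWritable not in ReadWriter.__declared__

class SeekableReader(Reader):
    # Adding one interface means restating or referencing the base's tuple.
    __declared__ = Reader.__declared__ + (ISeekable,)
```

Here `ReadWriter.__declared__` is just `Reader`'s tuple, which is the
"not merged on multiple inheritance" problem in miniature.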

Also, __slots__ is provisional.  The plan is for this to eventually
get nicer syntax (when I get over my fear of adding new keywords :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Sep 13 05:42:31 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 13 Sep 2002 00:42:31 -0400
Subject: [Python-Dev] type categories
In-Reply-To: Your message of "Thu, 12 Sep 2002 21:26:38 EDT."
 <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com>
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net> <20020913012306.GA20221@panix.com>
 <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com>
Message-ID: <200209130442.g8D4gVH13129@pcp02138704pcs.reston01.va.comcast.net>

> A few weeks ago I realized there was reason in principle that
                                   ^^^^^^^^^^
Did you mean "was no reason"???

> declaring a class satisfies an interface shouldn't just amount to
> adding the interface to the class' __bases__ (as Guido has been
> suggesting all along).
> 
> Why not? Am I missing something?

We'd need a trick to deny an interface that would be inherited by
default.  Something like private inheritance.

There's also the ambiguity of inheriting from a single interface: does
that create a sub-interface or an implementation of the interface?
Of course with your C++ hat on you probably don't care.  On Mondays,
Wednesdays, Fridays and alternating Sundays I don't care either.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From dave@boost-consulting.com  Fri Sep 13 05:48:13 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Fri, 13 Sep 2002 00:48:13 -0400
Subject: [Python-Dev] type categories
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net> <20020913012306.GA20221@panix.com>              <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com>  <200209130442.g8D4gVH13129@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0c0001c25ae0$c6298860$6401a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > A few weeks ago I realized there was reason in principle that
>                                    ^^^^^^^^^^
> Did you mean "was no reason"???
>
> > declaring a class satisfies an interface shouldn't just amount to
> > adding the interface to the class' __bases__ (as Guido has been
> > suggesting all along).
> >
> > Why not? Am I missing something?
>
> We'd need a trick to deny an interface that would be inherited by
> default.  Something like private inheritance.

I think it's more than that. You might need to "uninherit": Say Interface A
begets class B which begets class C. What if C doesn't fulfill A?

> There's also the ambiguity of inheriting from a single interface: does
> that create a sub-interface or an implementation of the interface?
> Of course with your C++ hat on you probably don't care.  On Mondays,
> Wednesdays, Fridays and alternating Sundays I don't care either.

With my C++ hat on I can't even imagine this. In C++ we don't express
interfaces in code: they're written down as "concepts" in some
documentation somewhere (no, I don't think an abstract class in C++ is a
good analogy for these Python interfaces).

-Dave


-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com




From guido@python.org  Fri Sep 13 06:08:22 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 13 Sep 2002 01:08:22 -0400
Subject: [Python-Dev] type categories
In-Reply-To: Your message of "Fri, 13 Sep 2002 00:48:13 EDT."
 <0c0001c25ae0$c6298860$6401a8c0@boostconsulting.com>
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net> <20020913012306.GA20221@panix.com> <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com> <200209130442.g8D4gVH13129@pcp02138704pcs.reston01.va.comcast.net>
 <0c0001c25ae0$c6298860$6401a8c0@boostconsulting.com>
Message-ID: <200209130508.g8D58NH13288@pcp02138704pcs.reston01.va.comcast.net>

> > > A few weeks ago I realized there was reason in principle that
> >                                    ^^^^^^^^^^
> > Did you mean "was no reason"???

So did you?

> > > declaring a class satisfies an interface shouldn't just amount to
> > > adding the interface to the class' __bases__ (as Guido has been
> > > suggesting all along).
> > >
> > > Why not? Am I missing something?
> >
> > We'd need a trick to deny an interface that would be inherited by
> > default.  Something like private inheritance.
> 
> I think it's more than that. You might need to "uninherit": Say
> Interface A begets class B which begets class C. What if C doesn't
> fulfill A?

Sorry, I meant to include that case.  How do you do that in C++?
Inherit privately from B and publicly from A, and make A a virtual
base everywhere?

> > There's also the ambiguity of inheriting from a single interface: does
> > that create a sub-interface or an implementation of the interface?
> > Of course with your C++ hat on you probably don't care.  On Mondays,
> > Wednesdays, Fridays and alternating Sundays I don't care either.
> 
> With my C++ hat on I can't even imagine this. In C++ we don't
> express interfaces in code: they're written down as "concepts" in
> the some documentation somewhere (no, I don't think an abstract
> class in C++ is a good analogy for these Python interfaces).

What's the difference between an abstract class and an interface in
C++?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Fri Sep 13 08:12:55 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 13 Sep 2002 03:12:55 -0400
Subject: [Python-Dev] type categories
In-Reply-To: <200209130437.g8D4bvi13109@pcp02138704pcs.reston01.va.comcast.net>
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net> <20020913012306.GA20221@panix.com> <200209130437.g8D4bvi13109@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020913071255.GA13052@panix.com>

On Fri, Sep 13, 2002, Guido van Rossum wrote:
>Aahz:
>> 
>> I'm curious, and I don't recall having seen anything about this: why
>> wouldn't we simply use attributes to hold this information, like
>> __slots__?  After all, attributes get inherited, too, and there's no
>> need to pretzel the syntax.  Using attributes IMO would make it easier
>> to handle the case where derived classes need to mangle type and
>> interface declarations.
> 
> That's exactly what Zope does with the __implements__ attribute.
> 
> But it's got limitations: there's only one __implements__ attribute, so
> it isn't automatically merged properly on multiple inheritance, and
> adding one new interface to it means you have to copy or reference the
> base class __implements__ attribute.

Isn't that what metaclasses are for?
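
A rough sketch of what such a metaclass might look like, written in
modern Python 3 syntax. `DeclMeta`, `__declared__` and the interface
classes are hypothetical names invented for this illustration, not
anything from Zope or the standard library.

```python
class DeclMeta(type):
    """Hypothetical metaclass: merge __declared__ from all bases."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        merged = []
        # The class's own declarations come first, then each base's
        # tuple, which was already merged when that base was created.
        sources = (namespace.get("__declared__", ()),) + tuple(
            getattr(base, "__declared__", ()) for base in bases)
        for source in sources:
            for iface in source:
                if iface not in merged:
                    merged.append(iface)
        cls.__declared__ = tuple(merged)

class IReadable: pass
class IWritable: pass
class ISeekable: pass

class Reader(metaclass=DeclMeta):
    __declared__ = (IReadable,)

class Writer(metaclass=DeclMeta):
    __declared__ = (IWritable,)

class ReadWriter(Reader, Writer):
    __declared__ = (ISeekable,)   # no need to restate the bases' tuples
```

With this in place `ReadWriter.__declared__` contains all three
interfaces. The cost is that every participating class has to share the
metaclass.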
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From dave@boost-consulting.com  Fri Sep 13 12:57:21 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Fri, 13 Sep 2002 07:57:21 -0400
Subject: [Python-Dev] type categories
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net> <20020913012306.GA20221@panix.com>              <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com>  <200209130442.g8D4gVH13129@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0c2901c25b1d$4660bf80$6401a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > A few weeks ago I realized there was reason in principle that
>                                    ^^^^^^^^^^
> Did you mean "was no reason"???

Oh. Yup.

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com




From dave@boost-consulting.com  Fri Sep 13 13:48:39 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Fri, 13 Sep 2002 08:48:39 -0400
Subject: [Python-Dev] type categories
References: <200208131802.g7DI2Ro27807@europa.research.att.com> <15718.25545.999300.938049@jin.int.geerbox.com> <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net> <15718.62725.643469.789554@slothrop.zope.com> <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net> <20020913012306.GA20221@panix.com> <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com> <200209130442.g8D4gVH13129@pcp02138704pcs.reston01.va.comcast.net>              <0c0001c25ae0$c6298860$6401a8c0@boostconsulting.com>  <200209130508.g8D58NH13288@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0c4801c25b24$4c5822f0$6401a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>

> > > We'd need a trick to deny an interface that would be inherited by
> > > default.  Something like private inheritance.
> >
> > I think it's more than that. You might need to "uninherit": Say
> > Interface A begets class B which begets class C. What if C doesn't
> > fulfill A?
>
> Sorry, I meant to include that case.  How do you do that in C++?

We don't use inheritance for this kind of interface. When we're making a
Java-style interface, sure, inheritance works fine in C++. However, because
of Python's dynamic, generic nature what we've been calling an interface
for Python is much more like a "concept", which has no direct expression in
code:

http://www.boost.org/more/generic_programming.html#concept

Actually if you read a little further on down the page (the Traits and Tag
Dispatching sections), you'll see that it's possible to create an
expression in code of a concept in C++. Usually you want to do that when
concepts form a refinement hierarchy (e.g. bidirectional_iterator refines
forward_iterator) which may or may not correspond to inheritance
relationships.

> Inherit privately from B and publicly from A, and making A virtual
> base everywhere?


I guess you /could/ do that. I don't think anyone does, though ;-)

I was going to say that it seems to me if you can dynamically inject base
classes in Python there's no problem using inheritance to do this sort of
labelling. However, on third thought, maybe there is a problem. Suppose you
have an inheritance chain A->B->C...->Z and I come along later to say that
A fulfills interface II and add II to A's bases. Which of A's subclasses
also fulfill II? I might not know. I might not even know about them. For
this, maybe you'd need a way to express inheritance that goes just "one
level deep" (i.e. A inherits II publicly, but nothing else does). And that
might just screw with the notion of inheritance enough that you want a
separate parallel mechanism.

So I guess I'm back to where I was before. Inheritance probably doesn't
work out too well for expressing "satisfies interface".
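
The scenario can be sketched in modern Python 3 as follows. The class
names follow the A, B, Z and II of the paragraph above; `Root`,
`declared` and `satisfies` are hypothetical helpers added only for the
illustration, and the dictionary is just one possible shape for a
"separate parallel mechanism".

```python
class Root: pass      # plain base so that A.__bases__ can be reassigned
class A(Root): pass
class B(A): pass
class Z(B): pass

class II: pass        # hypothetical interface marker

# Inject II into A's bases after the fact.
A.__bases__ = (II,) + A.__bases__

# Every existing subclass now claims II as well, whether or not it
# really fulfills the interface.
inherited = [cls.__name__ for cls in (A, B, Z) if issubclass(cls, II)]

# A parallel registry records the claim for exactly one class.
declared = {A: {II}}

def satisfies(cls, iface):
    """Return True only if cls itself was declared to satisfy iface."""
    return iface in declared.get(cls, set())

one_level = [cls.__name__ for cls in (A, B, Z) if satisfies(cls, II)]
```

`inherited` names all three classes, while `one_level` names only A,
which is the "one level deep" behaviour that plain inheritance cannot
express.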


-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com






From jeremy@alum.mit.edu  Fri Sep 13 15:14:49 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 13 Sep 2002 10:14:49 -0400
Subject: [Python-Dev] type categories
In-Reply-To: <0c4801c25b24$4c5822f0$6401a8c0@boostconsulting.com>
References: <200208131802.g7DI2Ro27807@europa.research.att.com>
 <15718.25545.999300.938049@jin.int.geerbox.com>
 <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net>
 <15718.62725.643469.789554@slothrop.zope.com>
 <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net>
 <20020913012306.GA20221@panix.com>
 <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com>
 <200209130442.g8D4gVH13129@pcp02138704pcs.reston01.va.comcast.net>
 <0c0001c25ae0$c6298860$6401a8c0@boostconsulting.com>
 <200209130508.g8D58NH13288@pcp02138704pcs.reston01.va.comcast.net>
 <0c4801c25b24$4c5822f0$6401a8c0@boostconsulting.com>
Message-ID: <15745.62169.148898.620458@slothrop.zope.com>

>>>>> "DA" == David Abrahams <dave@boost-consulting.com> writes:

  DA> I was going to say that is seems to me if you can dynamically
  DA> inject base classes in Python there's no problem using
  DA> inheritance to do this sort of labelling. However, on third
  DA> though, maybe there is a problem. Suppose you have an
  DA> inheritance chain A->B->C...->Z and I come a long later to say
  DA> that A fulfills interface II and add II to A's bases. Which of
  DA> A's subclasses also fulfill II. I might not know. I might not
  DA> even know about them. For this, maybe you'd need a way to
  DA> express inheritance that goes just "one level deep" (i.e. A
  DA> inherits II publicly, but nothing else does). And that might
  DA> just screw with the notion of inheritance enough that you want a
  DA> separate parallel mechanism.

  DA> So I guess I'm back to where I was before. Inheritance probably
  DA> doesn't work out too well for expressing "satisfies interface".

I had similar third thoughts a couple of weeks ago :-).  So I guess I
agree with you.

Jeremy



From barry@python.org  Fri Sep 13 15:38:03 2002
From: barry@python.org (Barry A. Warsaw)
Date: Fri, 13 Sep 2002 10:38:03 -0400
Subject: [Python-Dev] type categories
References: <200208131802.g7DI2Ro27807@europa.research.att.com>
 <15718.25545.999300.938049@jin.int.geerbox.com>
 <200208231715.g7NHFRl12405@pcp02138704pcs.reston01.va.comcast.net>
 <15718.62725.643469.789554@slothrop.zope.com>
 <200208240644.g7O6iRC25237@pcp02138704pcs.reston01.va.comcast.net>
 <20020913012306.GA20221@panix.com>
 <0b0501c25ac4$a5ffcd90$6401a8c0@boostconsulting.com>
 <200209130442.g8D4gVH13129@pcp02138704pcs.reston01.va.comcast.net>
 <0c0001c25ae0$c6298860$6401a8c0@boostconsulting.com>
 <200209130508.g8D58NH13288@pcp02138704pcs.reston01.va.comcast.net>
 <0c4801c25b24$4c5822f0$6401a8c0@boostconsulting.com>
 <15745.62169.148898.620458@slothrop.zope.com>
Message-ID: <15745.63563.522111.553274@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy@alum.mit.edu> writes:

>>>>> "DA" == David Abrahams <dave@boost-consulting.com> writes:

    DA> I was going to say that is seems to me if you can dynamically
    DA> inject base classes in Python there's no problem using
    DA> inheritance to do this sort of labelling. However, on third
    DA> though, maybe there is a problem. Suppose you have an
    DA> inheritance chain A->B->C...->Z and I come a long later to say
    DA> that A fulfills interface II and add II to A's bases. Which of
    DA> A's subclasses also fulfill II. I might not know. I might not
    DA> even know about them. For this, maybe you'd need a way to
    DA> express inheritance that goes just "one level deep" (i.e. A
    DA> inherits II publicly, but nothing else does). And that might
    DA> just screw with the notion of inheritance enough that you want
    DA> a separate parallel mechanism.

    DA> So I guess I'm back to where I was before. Inheritance
    DA> probably doesn't work out too well for expressing "satisfies
    DA> interface".

    JH> I had similar third thoughts a couple of weeks ago :-).  So I
    JH> guess I agree with you.

I tend to agree as well.  But to play devil's advocate for a moment: I
think Guido said that inheritance won't be the only way to spell
conforms-to, but it'll be the predominately common way.  So you'd
definitely need a way to spell that outside of inheritance as your
example clearly shows.  Which means that any conformsto() function
will have to be more complicated because it'll need to check both
mechanisms.  Is that a worthwhile price to pay to allow
conforms-to-by-inheritance?
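
To make the cost concrete, here is a minimal sketch of a conformsto()
that has to consult both mechanisms. Only the name conformsto() comes
from the paragraph above; its signature, the `declare` helper and the
`_declared` registry are assumptions made for this illustration.

```python
_declared = {}   # class -> set of interfaces asserted outside inheritance

def declare(cls, iface):
    """Record that cls conforms to iface without touching its bases."""
    _declared.setdefault(cls, set()).add(iface)

def conformsto(obj, iface):
    """Check inheritance first, then the explicit registry."""
    cls = type(obj)
    if issubclass(cls, iface):
        return True
    return iface in _declared.get(cls, ())

class IStack: pass                 # hypothetical interface

class ListStack(IStack):           # conforms by inheritance
    pass

class ThirdPartyStack:             # conforms by explicit declaration
    pass

class Unrelated:
    pass

declare(ThirdPartyStack, IStack)
```

In this sketch the registry lookup is by exact class, so subclasses of
`ThirdPartyStack` would not conform automatically. Whether they should
is exactly the kind of policy question the two mechanisms force.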

What I don't like about the inheritance mechanism is that the syntax
isn't explicit.  I look at a class definition and I don't really know
what's a base class for implementation purposes and what's an
interface assertion.  It might even be difficult if I had the source
code for all the classes in the base class list if there's little
except convention to syntactically distinguish between a class
definition and an interface definition (no keyword, but just a
stylized bunch of defs).  I think it's going to be important to know
what's an interface and what's a base class.  Naming conventions
(IThingie) can help but aren't enforced.

-Barry


From drifty@bigfoot.com  Sun Sep 15 06:49:00 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Sat, 14 Sep 2002 22:49:00 -0700 (PDT)
Subject: [Python-Dev] flextype.c  -- extended type system
In-Reply-To: <3D80D6C7.9040201@tismer.com>
Message-ID: <Pine.SOL.4.44.0209142240310.4590-100000@death.OCF.Berkeley.EDU>

[Christian Tismer]

> Hi Guido, py-dev,
>
> preface:
> --------
> a week ago or so, I sent a patch to Guido that removes
> the "etype" struct. This is a hidden structure that
> extends types when they are allocated on the heap.
> One restriction with this type was that types could
> not be extended by metatypes for some internal reason.
> I fixed this. Now meta-types can define extra slots
> for types.
>

I have never written a type or object in C, so bear with my newbie
questions.  Are you saying, Chris, that before you could not inherit a
type written in C and override a method?  Is this only in regards to the
magic method slots or just any method?

From what I gather in your email, it seems like you came up with proper
overriding inheritance in C for methods defined in a type.  So does this
mean you can now override the __contains__ magic slot in C code through
some inherited type and this was not doable before?  Perhaps an example of
something from the Python core that was not possible before would solidify
this for me.

-Brett




From skip@manatee.mojam.com  Sun Sep 15 13:00:16 2002
From: skip@manatee.mojam.com (Skip Montanaro)
Date: Sun, 15 Sep 2002 07:00:16 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200209151200.g8FC0Gux000552@manatee.mojam.com>

Bug/Patch Summary
-----------------

281 open / 2853 total bugs (+1)
108 open / 1690 total patches (-7)

New Bugs
--------

PyString_AsString underdocumented (2002-09-08)
	http://python.org/sf/606463
defining away __attribute__ is not good (2002-09-08)
	http://python.org/sf/606493
xml.sax second time file loading problem (2002-09-09)
	http://python.org/sf/606692
header file problems (2002-09-10)
	http://python.org/sf/607253
IDE should have "open recent" menu (2002-09-11)
	http://python.org/sf/607810
IDE look and feel (2002-09-11)
	http://python.org/sf/607814
IDE Preferences (2002-09-11)
	http://python.org/sf/607816
IDE output window (2002-09-11)
	http://python.org/sf/607821
Implied __init__.py not copied (2002-09-11)
	http://python.org/sf/608033
IDE - Breakpoints don't stick to lines (2002-09-11)
	http://python.org/sf/608085
gethostbyname("LOCALHOST") fails (2002-09-12)
	http://python.org/sf/608584
Problems in IDLE Browsers & Viewers (2002-09-12)
	http://python.org/sf/608595
Unix popen does not return exit status (2002-09-12)
	http://python.org/sf/608635
test_b1.py, disabling of list test (2002-09-13)
	http://python.org/sf/609041
cPickle.BadPickleGet is a string (2002-09-13)
	http://python.org/sf/609164

New Patches
-----------

Enhanced file constructor (2002-09-11)
	http://python.org/sf/608182
configure on Irix (sockets, posix) (2002-09-13)
	http://python.org/sf/608999

Closed Bugs
-----------

64-bit zip problems (2001-08-19)
	http://python.org/sf/453208
mmap bus error on linux (2001-09-19)
	http://python.org/sf/462783
shutil.copy(path, path) deletes contents (2001-12-07)
	http://python.org/sf/490168
IDLE doesn't save 8bit files (2002-04-18)
	http://python.org/sf/545600
del __builtins__ breaks out of rexec (2002-07-04)
	http://python.org/sf/577530
Docs unclear about cleanup. (2002-07-05)
	http://python.org/sf/577793
Get rid of FutureWarnings in Carbon (2002-08-15)
	http://python.org/sf/595763
import cycle in distutils (2002-08-19)
	http://python.org/sf/597604
Python not handling cText (2002-08-22)
	http://python.org/sf/598981
weird header wrapping in email.Generator (2002-08-28)
	http://python.org/sf/601392
spurious SyntaxWarning (2002-09-03)
	http://python.org/sf/604036
pre bug (2002-09-04)
	http://python.org/sf/604803

Closed Patches
--------------

patch for bug 462783 mmap bus error (2002-03-28)
	http://python.org/sf/536578
OpenBSD updates for build process (2002-05-10)
	http://python.org/sf/554718
THREAD_STACK_SIZE for 2.1 (2002-05-11)
	http://python.org/sf/554841
Remove import string in Tools/ directory (2002-06-21)
	http://python.org/sf/572113
types.BoolType (2002-08-02)
	http://python.org/sf/590119
improper use of strncpy in getpath (2002-08-29)
	http://python.org/sf/602108
For Bug [ 490168 ] shutil.copy(path, pat (2002-09-04)
	http://python.org/sf/604600
Tweaks to calls to AH/Help (2002-09-07)
	http://python.org/sf/606067
install_IDLE target in Mac/OSX/Makefile (2002-09-07)
	http://python.org/sf/606134


From tismer@tismer.com  Sun Sep 15 14:04:50 2002
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 15 Sep 2002 15:04:50 +0200
Subject: [Python-Dev] flextype.c  -- extended type system
References: <Pine.SOL.4.44.0209142240310.4590-100000@death.OCF.Berkeley.EDU>
Message-ID: <3D848572.3020408@tismer.com>

Brett Cannon wrote:
> [Christian Tismer]
> 
> 
>>Hi Guido, py-dev,
>>
>>preface:
>>--------
>>a week ago or so, I sent a patch to Guido that removes
>>the "etype" struct. This is a hidden structure that
>>extends types when they are allocated on the heap.
>>One restriction with this type was that types could
>>not be extended by metatypes for some internal reason.
>>I fixed this. Now meta-types can define extra slots
>>for types.
> 
> I have never written a type or object in C, so bear with my newbie
> questions.  Are you saying, Chris, that before you could not inherit a
> type written in C and override a method?  Is this only in regards to the
> magic method slots or just any method?

Sure you could. There just was no general interface to it.
The magic method slots are already easy to override,
assuming that you always call these via the type slots
and don't call them directly.

For your own, non-magic methods, there was no support yet.
Sure, you could override your methods, but you needed
extra machinery to keep track of the methods, to find out
which to call when, and so on.
The proper way to store extra info about methods is to
put this info into the type object itself. This was not
possible before my patch. You could help yourself by
extending some of the existing method tables, but this
is hackish.

With my flextype stuff, you explicitly extend your type
object with extra function pointers. Then you provide a
table with your implementation and wrapper functions,
and inheritance works on its own. That's what I was after.

> From what I gather in your email, it seems like you came up with proper
> overriding inheritence in C for methods defined in a type.  So does this
> means you can now override the __contains__ magic slot in C code through
> some inherited type and this was not doable before?  Perhaps an example of
> something from the Python core that was not possible before would solidify
> this for me.

I didn't care about the magic slots at all. I think they don't need
to be changed, but I will have a look at it.
The difference with my dynamic methods is that the method tables
are filled once, at the time when your type/class is created.
After that, there is no longer any lookup necessary. Method calls
which are not overridden are called with maximum possible speed.

In order to support changes to the underlying classes *after*
type creation, I will provide an extra type method that
allows you to "re-bind" explicitly.
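
As a loose Python analogy only (flextype.c itself is C and is not shown
in this thread), the "fill the table once, re-bind on request" idea
might be sketched like this. `FlexMeta`, `_slots_`, `_table` and
`rebind` are hypothetical names, and the dictionary merely stands in
for the extra function pointers stored in the type object.

```python
class FlexMeta(type):
    """Hypothetical analogue: resolve 'virtual' methods at class creation."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls.rebind()

    def rebind(cls):
        # Resolve each named method through the MRO once and cache it
        # on the class; later calls go through the cached table.
        cls._table = {slot: getattr(cls, slot) for slot in cls._slots_}

class Shape(metaclass=FlexMeta):
    _slots_ = ("area",)

    def area(self):
        return 0.0

    def describe(self):
        # Dispatch through the table filled at class creation time.
        return type(self)._table["area"](self)

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

before = Shape().describe()
Shape.area = lambda self: -1.0     # change the class after creation
stale = Shape().describe()         # table still holds the old function
Shape.rebind()                     # explicit re-bind picks up the change
after = Shape().describe()
```

The override in `Square` is picked up automatically when the subclass
is created, while a change made after creation is only seen once
`rebind()` is called.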

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From drifty@bigfoot.com  Sun Sep 15 20:33:24 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Sun, 15 Sep 2002 12:33:24 -0700 (PDT)
Subject: [Python-Dev] flextype.c  -- extended type system
In-Reply-To: <3D848572.3020408@tismer.com>
Message-ID: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU>

[Christian Tismer]

<snip>
> For your own, non-magic methods, there was not support, yet.
> Sure, you could override your methods, but you needed
> extra machinery to keep track of the methods, to find out
> which to call when, and so on.
> The proper way to store extra info about methods is to
> put this info into the type object itself. This was not
> possible before my patch. You could help yourself my
> extending some of the existing method tables, but this
> is hackish.
>

That sounds great.  Anything to make coding C extensions easier.

> I didn't care of the magic slots at all. I think they don't need
> to be changed, but I will have a look at it.
<snip>

Part of the reason I asked about the magic slots is that I personally
think it would be great if you didn't have to use the specific struct
slots for magic slots but instead were called based on their name in
Python.  That way you would not have to view Include/object.h every time
you wanted to use one of the magic methods; you could just add it just
like any other method and just give it a Python name that matched its
magic method name.  The obvious drawback is you would lose compiler
checking that the arguments were correct for the method.  But wouldn't
this simplify keeping binary-compatibility if it was used since the struct
would be pruned down significantly?

I don't know how much of a stumbling block this all is for newbies, but I
know when I looked at extending sre's pattern objects to add a
__contains__ method it took me a little while to find where the slot was
and what all the macros were for.  But that might also be because I didn't
read the C extension docs and just dove in.  =)

-Brett



From guido@python.org  Sun Sep 15 20:43:38 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 15 Sep 2002 15:43:38 -0400
Subject: [Python-Dev] flextype.c -- extended type system
In-Reply-To: Your message of "Sun, 15 Sep 2002 12:33:24 PDT."
 <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU>
Message-ID: <200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net>

> [Christian Tismer]
> 
> <snip>
> > For your own, non-magic methods, there was not support, yet.
> > Sure, you could override your methods, but you needed
> > extra machinery to keep track of the methods, to find out
> > which to call when, and so on.
> > The proper way to store extra info about methods is to
> > put this info into the type object itself. This was not
> > possible before my patch. You could help yourself my
> > extending some of the existing method tables, but this
> > is hackish.

[Brett Cannon]
> That sounds great.  Anything to make coding C extensions easier.

Brett, may I politely suggest that you try writing C extensions first
before claiming it needs to be made easier?

Christian's additions (as far as I understand them :-) are mostly
intended for very esoteric situations.

> > I didn't care of the magic slots at all. I think they don't need
> > to be changed, but I will have a look at it.
> <snip>
> 
> Part of the reason I asked about the magic slots is that I
> personally think it would be great if you didn't have to use the
> specific struct slots for magic slots but instead were called based
> on their name in Python.  That way you would not have to view
> Include/object.h every time you wanted to use one of the magic
> methods; you could just add it just like any other method and just
> give it a Python name that matched its magic method name.  The
> obvious drawback is you would lose compiler checking that the
> arguments were correct for the method.  But wouldn't this simplify
> keeping binary-compatibility if it was used since the struct would
> be pruned down significantly?

Alas, it would cause a major slowdown if this was the only way to
provide heavily-used operations like __add__ and __getitem__.  Most of
the machinery to allow this probably already exists, but I wouldn't
recommend using it.  Also, you'd have to provide two implementations
for binary operators, e.g. __add__ and __radd__.

> I don't know how much of a stumbling block this all is for newbies,
> but I know when I looked at extending sre's pattern objects to add a
> __contains__ method it took me a little while to find where the slot
> was and what all the macros were for.  But that might also be
> because I didn't read the C extension docs and just dove in.  =)

You could've picked a simpler extension to try to modify. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From drifty@bigfoot.com  Mon Sep 16 05:31:53 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Sun, 15 Sep 2002 21:31:53 -0700 (PDT)
Subject: [Python-Dev] Python-dev summary for 2002-09-01 to 2002-09-15
Message-ID: <Pine.SOL.4.44.0209152126370.4459-100000@death.OCF.Berkeley.EDU>

Since posting here first before posting to c.l.py worked out rather nicely
last time, I am going to do it again.  Basically everyone gets 24 hours to
reply with any corrections for this summary.

Now please note that the links that point to where I am going to keep the
summaries are not up yet (but they will be by the time I post to c.l.py).

Enjoy.

==============================

This is a summary of traffic on the `python-dev mailing list`_ between
September 01, 2002 and September 15, 2002 (exclusive).  It is intended to
inform the wider Python community of ongoing developments on the list;
everything from new features of the language to how to handle discovered
bugs that might affect the general Python programmer.  To comment on
anything mentioned here, just post to python-list@python.org or
comp.lang.python in the usual way. Give your posting a meaningful subject
line, and if it's about a PEP, include the PEP number (e.g. Subject: PEP
201 - Lockstep iteration) All python-dev members are interested in seeing
ideas discussed by the community, so don't hesitate to take a stance on a
PEP (or anything else for that matter) if you have an opinion.

::

 This is the second summary written by Brett Cannon (hopefully my
sophomoric performance will be better than most sophomore music albums).
 Summaries by me (2002-09-15 to ... when I burn out) are archived at
http://www.ocf.berkeley.edu/~bac/python-dev/summaries/index.php .
 You can find summaries by Michael Hudson (2002-02-01 to 2001-07-05) at
http://starship.python.net/crew/mwh/summaries/index.html .
 Summaries by A.M. Kuchling (2000-12-01 to 2001-01-31) are at
http://www.amk.ca/python/dev/ .

Please note that this summary is written using reST_ which can be found at
http://docutils.sourceforge.net/rst.html .  If there is some markup in the
summary that seems odd, chances are it is because of reST.

Also, I am considering keeping a list of the nicknames that people are
often referred to by in emails.  This would serve a dual purpose: it would
give people who read emails from the list a reference for figuring out who
is who, and it would make the summaries easier for me because I reference
these people in my head by their nicknames.  =)  Any comments on that idea
are appreciated.

.. _python-dev mailing list:
http://mail.python.org/mailman/listinfo/python-dev

============================
`To commit or not commit`_
============================

Walter Dorwald asked if there were "any objections against committing the
patch" for implementing `PEP 293`_ (Codec Error Handling Callbacks).
Guido asked what Martin V. Lowis and M.A. Lemburg had to say about it.
MAL responded that he was +1 on the patch.  Martin was "concerned about
the massive amounts of C code, most of which could be expressed way more
compact in Python code", but "Walter convinced [MvL] that this does have a
real performance impact for real data" so he would live with it.  In the
end he gave it his vote.

Walter said he would check it in (and he has).  The PEP has now been moved
to the finished PEP list.

.. _To commit or not commit:
http://mail.python.org/pipermail/python-dev/2002-September/028502.html
.. _PEP 293: http://www.python.org/peps/pep-0293.html

=======================================
`Proposed Mixins for Wide Interfaces`_
=======================================

Raymond Hettinger suggested adding mixin classes that automatically
implement magic methods when certain basic magic methods were already
implemented (e.g., "given an __eq__ method in a subclass, adds a __ne__
method").  David Abrahams said that he thought "these are a great idea,
*in the context of* an understanding of what we want interfaces to be,
say, and do."  Guido brought up some points about the initial suggestions
Raymond made.  He then said that he thought that there wasn't "enough here
to warrant putting this into the standard library"; the issue will be
revisited when a standard type or interface hierarchy is added to Python
(not in 2.3).
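
As a rough illustration of the kind of mixin Raymond described, here is a
minimal sketch.  The class names ``ComparisonMixin`` and ``Version`` are
hypothetical and are not taken from Raymond's proposal; the sketch also
assumes both operands are of the same type and does not handle
``NotImplemented``.

```python
class ComparisonMixin:
    """Hypothetical mixin: derives the other comparisons from __eq__ and __lt__."""

    def __ne__(self, other):
        return not self == other

    def __le__(self, other):
        return self < other or self == other

    def __gt__(self, other):
        return not (self < other or self == other)

    def __ge__(self, other):
        return not self < other


class Version(ComparisonMixin):
    """Example subclass that only supplies the two basic magic methods."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        return self.number == other.number

    def __lt__(self, other):
        return self.number < other.number


old, new = Version(1), Version(2)
print(old != new, old <= new, new > old, new >= new)
```

The subclass writes two methods and gets the remaining four from the mixin.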

.. _Proposed Mixins for Wide Interfaces:
http://mail.python.org/pipermail/python-dev/2002-September/028543.html

===================================
`mysterious hangs in socket code`_
===================================

Jeremy Hylton wrote some threaded code to fetch some web pages that hung
when performing a slow DNS operation.  Apparently, in Python 2.1 "it
produces a steady stream of output -- urls and the time it took to load
them".  In Python 2.2 and 2.3, though, "it produces little bursts of
output, then pauses for a long time, then repeats".  Jeremy guessed that
it *might* have something to do with Linux's getaddrinfo() being
thread-safe by allowing only a single lookup at a time.  Aahz said that
"gethostbyname() IIRC has frequently been non-reentrant".  Skip ran the
code in question under strace and said that "it seems mostly to be sitting
in select() calls and rt_sigsuspend() which [Skip] guess[es] is a wrapper
around sigsuspend()."

.. _mysterious hangs in socket code:
http://mail.python.org/pipermail/python-dev/2002-September/028555.html

========================================
`Two random and nearly unrelated ideas`_
========================================

Skip Montanaro had two ideas; one was to make the info in `Misc/NEWS`_
(which is a summary of what has been changed in Python for each release)
and "to get rid of the ticker altogether in systems with proper signal
support" (see the `2002-08-16 - 2002-09-01 summary`_ for an explanation of
what the ticker is).  That would get rid of the polling of the ticker and
thus reduce the overhead on threads.

For the first idea, Guido asked Skip to try seeing what it would look like
with reST_ markup and what the resulting page would look like.

In response to the second idea, Oren Tirosh said it couldn't be done until
"all Python I/O calls are converted to be EINTR-safe" (EINTR-safe means
being able to handle the EINTR error, which is what is raised "When an I/O
operation is interrupted by an unmasked signal").  That "requires a lot of
work in some of the hairiest places in the Python codebase."  Fredrik
Lundh said that this "sounds like a good topic for a 'here's what I
learned when trying to fix this problem' PEP".  This is most likely in
reference to Skip writing the patch to make the ticker global instead of a
per-thread issue.  Guido said, in terms of signals, to "just say no"; "it
is impossible to write correct code in the presence of signals".  Guido,
in a later email, gave this whole idea a vote of -1,000,000; so it ain't
ever going to happen.  Some discussion on signals ensued, but Guido never
budged from his position.

Oren pointed out that if some C code used signals and people didn't handle
it in their Python code by checking whether an IOError was caused by EINTR
(as shown below in Oren's code), the interrupted call would not restart
properly even though there was no reason for it to have stopped::

    while 1:
        try:
            <code>
        except IOError, exc:
            if exc.errno == errno.EINTR:
                continue
            else:
                raise

Oren said that Python could add the loop in the C code of
the core where EINTR might be raised ("Only low-level functions like
os.read_ and os.write_ that map directly to stdio functions should ever
return EINTR").  The proposed idea was to wrap the functions that might
raise this error and that can be re-entered safely.
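
The wrapping idea can be sketched in current Python syntax as a small
helper.  This is only an illustration, not Oren's patch: the names
``retry_on_eintr`` and ``flaky_read`` are made up here, and
``flaky_read`` merely simulates an interrupted call.  (Since Python 3.5
the standard library retries most such calls on its own, per PEP 475, so
a helper like this is mainly of historical interest.)

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func, restarting it whenever it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as exc:  # IOError is an alias of OSError in Python 3
            if exc.errno != errno.EINTR:
                raise
            # Interrupted by a signal: loop around and try again.


calls = []


def flaky_read():
    """Hypothetical stand-in that is 'interrupted' twice before succeeding."""
    calls.append(1)
    if len(calls) < 3:
        raise OSError(errno.EINTR, "Interrupted system call")
    return b"data"


result = retry_on_eintr(flaky_read)
print(result, len(calls))
```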

.. _Two random and nearly unrelated ideas:
http://mail.python.org/pipermail/python-dev/2002-September/028555.html
.. _Misc/NEWS:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Misc/NEWS
.. _2002-08-16 - 2002-09-01 summary:
http://www.ocf.berkeley.edu/~bac/python-dev/summaries/2002-08-16--2002-09-01.html
.. _reST: http://docutils.sourceforge.net/rst.html
.. _os.read:
.. _os.write: http://www.python.org/dev/doc/devel/lib/os-fd-ops.html

================================================
`Should KeyError use repr() on its arguments?`_
================================================

Originally, when an exception was raised and you passed in an optional
object to act as a description of why the exception was raised (such as
``KeyError("there is no spoon")`` where ``"there is no spoon"`` is the
optional argument stored in ``<exception>.args``), str() of the exception
just returned str() of that argument; ``str(<exception>) ==
str(<exception>.args[0])``.  Now, for KeyError, it calls repr() on the
argument; ``str(<exception>) == repr(<exception>.args[0])``.  Much better. =)
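
A short check of the resulting behaviour, written for a current Python 3
interpreter (an assumption; the thread was about Python 2.3).  The
empty-string key shows why repr() helps: without the quotes the message
would be blank.

```python
spoon = KeyError("there is no spoon")
empty = KeyError("")
other = ValueError("there is no spoon")

# KeyError shows repr() of its single argument, quotes included.
print(str(spoon))
# An empty key is still visible as a pair of quotes.
print(str(empty))
# Other exceptions keep using str() of the argument.
print(str(other))
```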

.. _Should KeyError use repr() on its arguments?:
http://mail.python.org/pipermail/python-dev/2002-September/028545.html

==========================================
`New 'spambayes' project on SourceForge`_
==========================================

Thanks to great work done by Tim Peters and several other contributors,
Barry Warsaw started an SF project to host the spambayes code.  It can be
found at http://sf.net/projects/spambayes .  There are two mailing lists:
http://mail.python.org/mailman-21/listinfo/spambayes and
http://mail.python.org/mailman-21/listinfo/spambaye-checkins (yes, that is
Mailman 2.1, and yes, you will "help be a guinea pig for Mailman 2.1").

.. _New 'spambayes' project on SourceForge:
http://mail.python.org/pipermail/python-dev/2002-September/028626.html

=========================
`Subsecond time stamps`_
=========================

Martin V. Lowis wanted to introduce subsecond timestamps on platforms that
supported them.  He suggested adding another field to stat, creating a new
type, or making st_mtime a floating-point value.  The first option is easy,
the second has the usual problems of defining a new type, and the third
does not guarantee enough accuracy.

Paul Svensson and Guido said that the last option (turning st_mtime into a
float) was the most Pythonic.  MvL agreed, but worried about breaking code
that expected an int.  Guido then suggested that maybe the new field is
the way to go; define something like st_mtimef that will contain the float
if available or contain an int otherwise.  Tim Peters also weighed in with
his `IEEE 754`_ voodoo about how a float can hold enough info to be
accurate up to 100 nanoseconds if you only span a single year; issues
start to come up once you try to go past a year's worth of seconds.
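
The arithmetic behind this kind of argument can be sketched as follows.
This is not Tim's calculation, just the standard fact that an IEEE 754
double carries 53 significant bits, so the gap between neighbouring
values grows with the magnitude of the number.  The helper name
``double_spacing`` and the sample timestamp are assumptions made for
illustration.

```python
import math


def double_spacing(x):
    """Gap between x and the next larger IEEE 754 double (for x > 0)."""
    _, exponent = math.frexp(x)      # x == m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 53)    # a double has 53 significant bits


one_year = float(365 * 24 * 60 * 60)   # 31,536,000 seconds
sept_2002 = 1032000000.0               # roughly seconds since 1970 in Sept. 2002

year_gap = double_spacing(one_year)
epoch_gap = double_spacing(sept_2002)
print("resolution one year after the epoch: %.1f ns" % (year_gap * 1e9))
print("resolution in September 2002:        %.1f ns" % (epoch_gap * 1e9))
```

The further a timestamp is from the epoch, the coarser the float becomes,
which is why the span of time being covered matters.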

But then MvL discovered that st_mtime was already a float on the Mac; had
that caused issues?  Jack Jansen of course chimed in on this by saying
that it caused him a headache about once a year in the form of a failing
test (other issues caused by timestamps are Classic Macs having the
epoch at 1904 and not using UTC time).  He said he would prefer to see the
timestamp as a cookie that was passed into a function that spit out
"something guaranteed to be of your liking".

To address the other issues that Jack mentioned, Guido suggested that all
timestamps be converted to UTC time with the epoch at 1970.
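
The epoch half of that conversion is a fixed shift, sketched below.  The
names ``EPOCH_OFFSET`` and ``mac_to_unix`` are hypothetical, and the
sketch deliberately ignores the second problem Jack mentioned (Classic
Mac timestamps being in local time rather than UTC).

```python
from datetime import datetime

MAC_EPOCH = datetime(1904, 1, 1)
UNIX_EPOCH = datetime(1970, 1, 1)

# Seconds between the two epochs (66 years, 17 of them leap years).
EPOCH_OFFSET = int((UNIX_EPOCH - MAC_EPOCH).total_seconds())


def mac_to_unix(mac_seconds):
    """Shift a seconds-since-1904 value to seconds-since-1970."""
    return mac_seconds - EPOCH_OFFSET


print(EPOCH_OFFSET)
print(mac_to_unix(EPOCH_OFFSET + 0.5))
```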

MvL put up `SF patch 606592`_ on SF, which has already been closed; it
makes all the relevant changes to have timestamps returned as floats.

.. _Subsecond time stamps:
http://mail.python.org/pipermail/python-dev/2002-September/028648.html
.. _IEEE 754: http://grouper.ieee.org/groups/754/
.. _SF patch 606592: http://www.python.org/sf/606592

=================================
`64-bit process optimization 1`_
=================================

Bob Ledwith posted a simple patch for `Include/object.h`_ that changed the
order of certain parts of the PyObject_HEAD macros, affecting PyObject and
PyVarObject.  This was for a 64-bit platform performance boost (40% for
large data sets according to Bob).  The reordering eliminated some padding
in the struct and allows more Python objects to fit in the L2 cache, or at
least that is what Bob thinks is going on.

Guido pointed out that this would save 8 bytes per object; he thought all
of this was "Interesting!".  But alas, using this patch would break binary
compatibility.  Guido was not sure, though, whether it had been broken yet
between Python 2.2 and 2.3 and thus he might be "being too conservative
here" in terms of saying that it should be held back for now.
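
To see where the 8 bytes could come from, here is a small layout
calculator.  It is a sketch under stated assumptions: a 4-byte ``int``,
an 8-byte pointer, each field aligned to its own size, and a
variable-size object header consisting of ``ob_refcnt``, ``ob_type`` and
``ob_size`` in that order before the patch.  The exact orderings used in
Bob's patch are not given in this summary, so both field lists below are
illustrative.

```python
def struct_size(fields):
    """Size of a C struct when every field is aligned to its own size."""
    offset = 0
    max_align = 1
    for _name, size in fields:
        offset = (offset + size - 1) // size * size   # padding before the field
        offset += size
        max_align = max(max_align, size)
    # Tail padding so that arrays of the struct stay aligned.
    return (offset + max_align - 1) // max_align * max_align


INT, PTR = 4, 8   # assumed sizes on a typical 64-bit platform

before = [("ob_refcnt", INT), ("ob_type", PTR), ("ob_size", INT)]
after = [("ob_type", PTR), ("ob_refcnt", INT), ("ob_size", INT)]

print(struct_size(before), struct_size(after))
```

Under these assumptions the two ``int`` fields share one 8-byte slot once
the pointer comes first, instead of each dragging 4 bytes of padding.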

A problem Guido pointed out for 64-bit systems is that theoretically the
reference count for an object could go negative with enough references as
things stand now.  Guido then suggested that perhaps refcnt (the struct
item that holds the reference count) should be a ``long``.  And while
dealing with that, Guido suggested that anything that stores a length
should store that number in a ``long``.

Chime in Tim Peters.  He pointed out that it was agreed upon years ago to
move refcnt to ``long`` but no one had bothered to do it.  Heck, even
Guido thought for a long time that it was a long when it wasn't; it
required Tim to "beat that out of [Guido] <wink>" to stop him from saying
that it was a ``long``.  He then pointed out that Win64 was still only 4
bytes for a ``long``; what was really desired was for it to be
``Py_intptr_t`` which is the Python way for spelling the C99 type that we
wanted.  Apparently C99 has a way to specify that things be a specific
byte length (now if everyone just had a C99 compiler we wouldn't need
these macros; oh, to dream...).

Tim also pointed out that what we wanted was for the type that holds a
length argument to be size_t, since that is what strlen() and malloc() are
restricted by.  He said that he writes all of his "string-slinging code as
using size_t vars now".

Tim pointed out that the issue then became "Whether it's worth the pain to
change this stuff" which "depends on whether we think 64-bit boxes are
just another passing fad like the Internet <wink>".  =)

Martin V. Lowis agreed with the changing of refcnt to a long but had
reservations about using size_t for the length field (ob_size).  He
pointed out that some objects put negative values into that field.

Fredrik suggested that the proposed changes be the default on 64-bit
systems since people on those systems are more likely to be willing to
recompile than people on 32-bit systems.  He also suggested making it a
compiler option.  Guido thought it was a good idea.  But then Mats Wichmann
discovered that the switch to long killed the performance boost.  So Guido
reiterated that he thinks it should be a compiler option only on 64-bit
systems; have "compat", "optimal", and "right" compiler options.

.. _64-bit process optimization 1:
http://mail.python.org/pipermail/python-dev/2002-September/028677.html
.. _Include/object.h:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Include/object.h

==========================================
`Weeding out obsolete modules and Demos`_
==========================================

Jack Jansen noticed that there were demos for some of the SGI-specific modules
that use severely outdated systems and hardware (stuff discontinued 8 to
12 years ago).  Guido gave the go-ahead to yank them from CVS.  So the
demos are now history.

.. _Weeding out obsolete modules and Demos:
http://mail.python.org/pipermail/python-dev/2002-September/028718.html

==============
`utf8 issue`_
==============

(This thread actually started in August.)  There was a bug in Python 2.2
that raised a UnicodeError when trying to decode a lone surrogate
(explanation of surrogates to follow this summary).  This caused issues in
importing .pyc files that contained a lone surrogate because marshal_
(which is what is used to create .pyc files) encodes Unicode_ literals in
UTF-8.  This has all been fixed in Python 2.3, but Guido was wondering how
to backport this for Python 2.2.2.

The option of bumping the magic number for .pyc files was raised and
instantly thrown out by Guido; "Bumping MAGIC is a no-no between dot
releases".  So M.A. Lemburg suggested to either fix the Unicode encoder or
change the Unicode decoder to handle the malformed Unicode.  MAL wasn't
sure, though, if some security issue would be raised by the latter option.

Guido said go for the latter and didn't see any possible security issue
since "If someone you don't trust can write your .pyc files, they can
cause your interpreter to crash by inserting bogus bytecode".

Explanation of lone surrogates:

    In Unicode, a surrogate pair is when you create the representation of
a character by using two values. So, for instance, UTF-32 can cover the
entire Unicode space (since Unicode is 20 bits, although MvL says it is
really more like 21 bits), but UTF-16 can't.  To solve the issue for an
encoding that cannot cover all possible characters in a single value, a
character can be represented as a pair of UTF-16 values.  The high
surrogate covers the high 10 bits while the low surrogate covers the lower
10 bits.  High and low surrogates can never be the same since they are
defined by a range of possible values and those ranges do not overlap.  So
with the proper high and low surrogate paired together you can make any
possible Unicode character.

    The problem in Python 2.2.1 is that when there is only a lone
surrogate (instead of there being a pair of surrogates), the encoder for
UTF-8 messes up and leaves off the first byte of the UTF-8 sequence.  The
following lines are an example:

	>>> u'\ud800'.encode('utf-8')
	'\xa0\x80'  #In Python 2.2.1
	'\xed\xa0\x80'  #In Python 2.3a0

    Notice how in Python 2.3a0 the extra byte is inserted so as to make
the representation a complete Unicode character instead of only encoding
the half of the surrogate pair that the encoder was given.

    You can read
http://216.239.37.100/search?q=cache:Dk12BZNt6skC:uk.geocities.com/BabelStone1357/Software/surrogates.html
for more info.  Thanks go to Fredrik for the link and Guido for some
clarification.
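
The pairing arithmetic described above can be sketched in a few lines.
The helper names ``to_surrogates`` and ``from_surrogates`` are made up
for this illustration, and the code is written for a current Python 3
interpreter, where a lone surrogate is rejected by the UTF-8 encoder
unless the ``surrogatepass`` error handler is requested.

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    offset = code_point - 0x10000            # a 20-bit value
    high = 0xD800 + (offset >> 10)           # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)          # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Recombine a surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1D11E)
print(hex(high), hex(low))

# A lone high surrogate, encoded the way Python 2.3a0 did it.
lone = "\ud800".encode("utf-8", "surrogatepass")
print(lone)
```

The ranges starting at 0xD800 and 0xDC00 do not overlap, which is why a
high surrogate can never be mistaken for a low one.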

.. _utf8 issue:
http://mail.python.org/pipermail/python-dev/2002-August/028254.html
.. _marshal: http://www.python.org/dev/doc/devel/lib/module-marshal.html
.. _Unicode: http://www.unicode.org/

=====================================
`Documentation inconsistency in re`_
=====================================

Christopher Craig noticed that the docs for the re_ module for the \b
metacharacter were incorrect; they said that "the end of a word is indicated
by whitespace or a non-alphanumeric character".  That would indicate that
an underscore would be the end of a word, which turns out to be false.
Fredrik said that "\b is defined in terms of \w and \W" and thus treats
underscore as an alphanumeric character.  The documentation has been
fixed.
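
A quick way to see the behaviour for yourself (the sample strings are
just illustrations):

```python
import re

# The underscore matches \w, so it counts as part of a word.
underscore_is_word_char = re.match(r"\w", "_") is not None

# No word boundary between "foo" and "_", so only the standalone "foo" matches.
standalone = re.findall(r"\bfoo\b", "foo_bar foo bar")

# "foo" followed by an underscore does not end at a word boundary.
ends_before_underscore = re.search(r"foo\b", "foo_bar") is not None

print(underscore_is_word_char, standalone, ends_before_underscore)
```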

.. _Documentation inconsistency in re:
http://mail.python.org/pipermail/python-dev/2002-September/028644.html
.. _re: http://www.python.org/dev/doc/devel/lib/module-re.html

=======================
`Codecs lookup order`_
=======================

Francois Pinard discovered that for the codecs_ module "one should be
careful about **not** [altered emphasis] naming a module after the
encoding name, when closely following the documentation in the Library
Reference manual".  This is because the codecs module first searches the
registry of codecs, then searches for a module with the same name and uses
that module.  The issue comes up when the module does not contain a
function named getregentry(); "\`encodings.lookup()\` expects a
\`getregentry\` function in that module, does not find it, and raises a
CodecRegistryError, not leaving a chance to subsequent codec search
functions to be used".

M.A. Lemburg said that this has been fixed in Python 2.3 and will be in
2.2.2 by having encodings.lookup() return None if getregentry() is not
found and thus allowing the search to continue.
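
The "return None and let the search continue" protocol can be seen with
``codecs.register()``.  In this sketch the codec name ``spoonutf8`` and
the function ``search_spoon`` are invented for illustration; the search
function simply hands back the existing UTF-8 codec under the made-up
name and declines everything else.

```python
import codecs


def search_spoon(name):
    """Hypothetical search function: knows one made-up alias only."""
    if name == "spoonutf8":
        return codecs.lookup("utf-8")
    return None   # not ours; let the remaining search functions try


codecs.register(search_spoon)

# The standard search function finds no "spoonutf8", returns None, and the
# lookup moves on to search_spoon, which supplies the codec.
encoded = "spoon".encode("spoonutf8")
print(encoded)
```

If every search function returns None, ``codecs.lookup()`` raises
LookupError, so an unknown name still fails cleanly.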

.. _Codecs lookup order:
http://mail.python.org/pipermail/python-dev/2002-September/028676.html
.. _codecs: http://www.python.org/dev/doc/devel/lib/module-codecs.html

=================================
`raw headers in rfc822.Message`_
=================================

John Spurling provided a two-line hack to keep the raw headers in an
rfc822.Message_ .  Barry responded that email.Message.Message_ keeps the
raw headers around.

But the reason I am summarizing this is that the thread quickly turned
into a discussion of how to properly generate a patch.  Patches should be
generated using UNIX diff, with either the -c or -u option with preference
for -c (using cvs diff -c is even better; it puts the version of the file
you are diffing against in the output); Mac folk can send MPW diffs, but
UNIX diff is definitely the preference.  Always order the files as `diff -c
OLD_FILE NEW_FILE` .  And always post the patches_ to SourceForge_!
Getting random patches, no matter how small, on the list is annoying (at
least to me) because the point of the list is to discuss the design and
implementation of Python, not to patch Python.  SF is used so that
Python-dev does not need to be bothered with mundane problems like
applying patches (and to annoy Aahz with SF's UI sucking in Lynx_ =).  So
please, for my sake and everyone else's on Python-dev, use SF!

For a funny email from Raymond Hettinger about developing for Python read
http://mail.python.org/pipermail/python-dev/2002-September/028725.html .

.. _raw headers in rfc822.Message:
http://mail.python.org/pipermail/python-dev/2002-September/028682.html
.. _rfc822.Message:
http://www.python.org/dev/doc/devel/lib/message-objects.html
.. _email.Message.Message:
http://www.python.org/dev/doc/devel/lib/module-email.Message.html
.. _patches: http://sourceforge.net/patch/?group_id=5470
.. _SourceForge: http://www.sourceforge.net/
.. _Lynx: http://lynx.browser.org/

===================
`type categories`_
===================

Yes, the `same thread`_ from the `last summary`_ is back.  This thread has
become the bane of my summarizing existence.  =)

Aahz asked "why wouldn't we simply use attributes to hold" interfaces that
a class implemented (think of __slots__).  David Abrahams then brought up
the idea of just adding interfaces to the __class__ attribute.

Guido then chimed in on the attributes idea.  He pointed out that this is
how Zope does it, using the __inherits__ attribute.  The limitation is
that "it isn't automatically merged properly on multiple inheritance, and
adding one new interface to it means you have to copy or reference the
base class __inherits__ attribute".  And as for David's idea of just
adding to __class__, that doesn't work because there is no way to limit
the interface; you need "Something like private inheritance" for when an
interface is broken by some inherited class.  David subsequently added the
issue of being able to disinherit when an interface is not valid but is
inherited by default as another problem for using inheritance for
interfaces.

David then brought up the issue of having Python being so dynamic that you
could inject an interface if you used __class__ like he suggested through
black magic code.  If the injected interface didn't work because of the
inheritance chain, then you have a problem.

Barry Warsaw brought in his objections.  He tried playing Devil's Advocate
by saying that Guido had said that inheritance would not be the only way
to handle interfaces, but that it would be the predominant way.  But this
duality would complicate any conformsto()-like function since it would
have to handle two different ways for a class to get an interface.  Barry
then brought up the objection that he didn't like the idea of using
straight inheritance because he wanted a syntactic way to separate out
interfaces.

As a side note, Guido pointed out that __slots__ is provisional; nicer
syntax will eventually surface when Guido gets over his "fear of adding
new keywords".

.. _type categories:
http://mail.python.org/pipermail/python-dev/2002-September/028738.html
.. _same thread:
http://www.ocf.berkeley.edu/~bac/python-dev/summaries/2002-08-16--2002-09-01.html#type-categories
.. _last summary:
http://www.ocf.berkeley.edu/~bac/python-dev/summaries/2002-08-16--2002-09-01.html

=======================================
`flextype.c  -- extended type system`_
=======================================

Christian Tismer has come up with a replacement for the etype which is "a
hidden structure that extends types when they are allocated on the heap"
(you can find it in `Objects/typeobject.c`_ in the CVS_).  There is a
limitation with the etype where it could not be extended by metatypes.
Well, Chris worked his magic and came up with a new flextype that allows
overriding of methods.  So with Christian's code you would be able to
override methods in a type without having to hack something together to
handle the overriding correctly; it would be handled automatically.

Through some clarification from Christian and Guido, it was pointed out to
me (as of this moment I am the only one to make any noise on this thread,
and it was for this summary) that this simplifies an esoteric issue; note
the use of the word "metatype" above.  This is type/metatype black magic
hacking.  Spiffy, but something most of us "normal" folk will not have to
worry about.

.. _flextype.c  -- extended type system:
http://mail.python.org/pipermail/python-dev/2002-September/028736.html
.. _Objects/typeobject.c:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/typeobject.c
.. _CVS: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/#dirlist





From goodger@users.sourceforge.net  Mon Sep 16 06:45:27 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Mon, 16 Sep 2002 01:45:27 -0400
Subject: [Python-Dev] Python-dev summary for 2002-09-01 to 2002-09-15
In-Reply-To: <Pine.SOL.4.44.0209152126370.4459-100000@death.OCF.Berkeley.EDU>
Message-ID: <B9AAE836.28E49%goodger@users.sourceforge.net>

Brett Cannon wrote:
> Please note that this summary is written using reST_ which can be found at
> http://docutils.sourceforge.net/rst.html .  If there is some markup in the
> summary that seems odd, chances are it is because of reST.

Please don't blame the markup!  By the time people see it, it's been
mutilated by mailers to the point where it's unrecognizable.  Like Python,
leading whitespace is significant in reStructuredText.  As the author,
please take steps to prevent your document's mutilation.

There are some serious problems in the text I received, probably due to
emailer handling of the text.  Specifically, line wrapping gets screwed up
if lines are longer than 76 or 78 characters, and indentation goes out the
window.  I always limit files to 70 characters per line to prevent this.

I haven't had a chance to look through it thoroughly (gotta get some sleep),
but I noticed you used a literal block for your author's intro, beginning
"This is the second summary".  I think a block quote would be better; just
drop the "::" and fix the indentation (which was totally wacky).

If you send me the original as an attachment (gzipped would be best), I'll
be happy to take a look and give a detailed critique.

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/



From mal@lemburg.com  Mon Sep 16 10:10:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Sep 2002 11:10:05 +0200
Subject: [Python-Dev] Re: Automatic flex interface for Python?
References: <LNBBLJKPBEHFEDALKOLCKEOJAOAB.tim.one@comcast.net>
Message-ID: <3D859FED.8020609@lemburg.com>

Tim Peters wrote:
> [Gordon McMillan]
> 
>>mxTextTools lets (encourages?) you to break all
>>the rules about lex -> parse. If you can (& want to)
>>put a good deal of the "parse" stuff into the scanning
>>rules, you can get a speed advantage. You're also
>>not constrained by the rules of BNF, if you choose
>>to see that as an advantage :-).
>>
>>My one successful use of mxTextTools came after
>>using SPARK to figure out what I actually needed
>>in my AST, and realizing that the ambiguities in the
>>grammar didn't matter in practice, so I could produce
>>an almost-AST directly.
> 
> 
> I don't expect anyone will have much luck writing a fast lexer using
> mxTextTools *or* Python's regexp package unless they know quite a bit about
> how each works under the covers, and about how fast lexing is accomplished
> by DFAs.  If you know both, you can build a DFA by hand and painfully
> instruct mxTextTools in the details of its construction, and get a very fast
> tokenizer (compared to what's possible with re), regardless of the number of
> token classes or the complexity of their definitions.  Writing to
> mxTextTools directly is a lot like writing in an assembly language for a
> character-matching machine, with all the pains and potential joys that
> implies.  If I were Eric, I'd use Flex <wink>.

FYI, there are a few meta languages to make life easier for
mxTextTools like e.g. Mike Fletcher's SimpleParse.

The upcoming version 2.1 will also support Unicode and allows
text jump targets which boosts readability of the tag tables a
lot and makes hand-writing the tables much easier.

The beta of 2.1 is available to the subscribers of the egenix-users
mailing list.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From thomas@xs4all.net  Mon Sep 16 11:59:05 2002
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 16 Sep 2002 12:59:05 +0200
Subject: [Python-Dev] Python-dev summary for 2002-09-01 to 2002-09-15
In-Reply-To: <Pine.SOL.4.44.0209152126370.4459-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209152126370.4459-100000@death.OCF.Berkeley.EDU>
Message-ID: <20020916105905.GE797@xs4all.nl>

On Sun, Sep 15, 2002 at 09:31:53PM -0700, Brett Cannon wrote:

> Skip Montanaro had two ideas; one was to make the info in `Misc/NEWS`_

I suspect there's a "using reST" or "in reST" missing here.

> (which is a summary of what has been changed in Python for each release)
> and "to get rid of the ticker altogether in systems with proper signal
> support" (see the `2002-08-16 - 2002-09-01 summary`_ for an explanation of
> what the ticker is).  That would get rid of the polling of the ticker and
> thus reduce the overhead on threads.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tismer@tismer.com  Mon Sep 16 13:05:00 2002
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 16 Sep 2002 14:05:00 +0200
Subject: [Python-Dev] flextype.c -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU> <200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D85C8EC.50007@tismer.com>

Guido van Rossum wrote:
...

> Christian's additions (as far as I understand them :-) are mostly
> intended for very esoteric situations.

My additions support a subset of C++ virtual methods.
How is that esoteric?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From tismer@tismer.com  Mon Sep 16 13:12:16 2002
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 16 Sep 2002 14:12:16 +0200
Subject: [Python-Dev] flextype.c  -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU>
Message-ID: <3D85CAA0.6070603@tismer.com>

Brett Cannon wrote:
...

> Part of the reason I asked about the magic slots is that I personally
> think it would be great if you didn't have to use the specific struct
> slots for magic slots but instead were called based on their name in
> Python.  That way you would not have to view Include/object.h every time
> you wanted to use one of the magic methods; you could just add it just
> like any other method and just give it a Python name that matched its
> magic method name.  The obvious drawback is you would lose compiler
> checking that the arguments were correct for the method.

No, vice versa. I *could* support any magic slot and put
it into the extended type object with a Python name.
And even better, this version could have full type
checking, as my other methods have as well! This could
go far bejond what we have now. My system is explicit
as types: You repeat the whole function argument list
in the new gown slot. This is as type safe as can be.

Esoterically y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From tismer@tismer.com  Mon Sep 16 13:26:36 2002
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 16 Sep 2002 14:26:36 +0200
Subject: [Python-Dev] flextype.c  -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU>
Message-ID: <3D85CDFC.7000004@tismer.com>

<sorry, I re-sent this one due to bad typos>
Brett Cannon wrote:
...

 > Part of the reason I asked about the magic slots is that I personally
 > think it would be great if you didn't have to use the specific struct
 > slots for magic slots but instead were called based on their name in
 > Python.  That way you would not have to view Include/object.h every time
 > you wanted to use one of the magic methods; you could just add it just
 > like any other method and just give it a Python name that matched its
 > magic method name.  The obvious drawback is you would lose compiler
 > checking that the arguments were correct for the method.

No, vice versa. I *could* support any magic slot and put
it into the extended type object with a Python name.
And even better, this version could have full type
checking, as my other methods have as well! This could
go far beyond what we have now. My system is explicit
at types: You repeat the whole function argument list
in the newly grown slot. This is as type safe as can be.

Esoterically y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
       whom do you want to sponsor today?   http://www.stackless.com/





From pinard@iro.umontreal.ca  Mon Sep 16 15:13:57 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: Mon, 16 Sep 2002 10:13:57 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-09-01 to 2002-09-15
In-Reply-To: <B9AAE836.28E49%goodger@users.sourceforge.net> (David Goodger's
 message of "Mon, 16 Sep 2002 01:45:27 -0400")
References: <B9AAE836.28E49%goodger@users.sourceforge.net>
Message-ID: <oq3csakonu.fsf@titan.progiciels-bpi.ca>

[David Goodger]

> Please don't blame the markup!  By the time people see it, it's been
> mutilated by mailers to the point where it's unrecognizable.  [...]  As the
> author, please take steps to prevent your document's mutilation.

The message seems adequately formatted, as delivered here.

This is a recurring problem, deciding how far maintainers or writers should
keep in mind various broken software of the recipients.  There is an
equilibrium to reach, but the pressure is often undue on the authors, as
recipients want them to take care of everything bad they see.

My guess is that everybody has his/her share in that adventure.  As long as
the author does well, and he did fairly well here, most recipients' problems
have to be addressed by recipients.

> Specifically, line wrapping gets screwed up if lines are longer than 76 or
> 78 characters, and indentation goes out the window.  I always limit files to
> 70 characters per line to prevent this.

The 79 or 80 character limit is still a reasonable convention and a good goal.
Some long URLs just do not fit within that space, they are not easily broken.
Some people use lower limits, as an aid for recipients later quoting the
original text, yet the proper refilling of quotes (and proper quotation) is
really the job of those who reply.  It goes a bit far that people limit
themselves to 70 characters per line, if only because of randomly broken
software.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From praveen.patil@silver-software.com  Mon Sep 16 16:01:45 2002
From: praveen.patil@silver-software.com (Praveen Patil)
Date: Mon, 16 Sep 2002 16:01:45 +0100
Subject: [Python-Dev] Please help solving the problem
In-Reply-To: <NFBBLJFBNMKMLGNLJMFCCENBCCAA.praveen.patil@silver-software.com>
Message-ID: <NFBBLJFBNMKMLGNLJMFCEENDCCAA.praveen.patil@silver-software.com>

Hi ,


Please help me in solving the problem below.


step 1: I have written three dlls :  a.dll , b.dll , c.dll.
                 a.dll contains  funct_A();
                 b.dll contains  funct_B();
                 c.dll contains  funct_C();

step 2: I am copying a.dll to directory C:\Program Files\Python\DLLs  and
renaming as a.pyd
        similarly
           I am copying b.dll to directory C:\Program Files\Python\DLLs. I
am not renaming as b.pyd
           I am copying c.dll to directory C:\Program Files\Python\DLLs and
renaming as c.pyd
        So my  C:\Program Files\Python\DLLs  directory contain
                         a.pyd , b.dll , c.pyd

step 3: a)Python function func_pyA() calls funct_A()
        b)funct_A() calls funct_B()
        c)funct_B() calls funct_C()
        d)funct_C() calls Python function func_pyC()

step 4: I am importing a.pyd and c.pyd in python program.
             import a
             import c

step 5: I am having problem in importing 'a' because  'a' need to load b.dll
and c.dll. But I copied c.dll as c.pyd.
        Please suggest me some solution.


here is my code :

 1)a.c (a.dll)
   ----------
         void func_A();


 2)b.c (b.dll)
   -----------
          void func_B();

 3)c.c( c.dll)
   -----------
          void func_C();

 4) example.py
    ---------
   import a
   import c

   G_Logfile  = None
   def TestFunction():
     G_Logfile = open('Pytestfile.txt', 'w')
     G_Logfile.write("%s \n"%'I am writing python created text file')
     G_Logfile.close()
     G_Logfile = None

   if __name__ == "__main__":
     a.func_A()
     .....
     .....


Please help me in solving the problem.


Cheers,


Praveen.




From aahz@pythoncraft.com  Mon Sep 16 16:14:19 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 16 Sep 2002 11:14:19 -0400
Subject: [Python-Dev] Please help solving the problem
In-Reply-To: <NFBBLJFBNMKMLGNLJMFCEENDCCAA.praveen.patil@silver-software.com>
References: <NFBBLJFBNMKMLGNLJMFCCENBCCAA.praveen.patil@silver-software.com> <NFBBLJFBNMKMLGNLJMFCEENDCCAA.praveen.patil@silver-software.com>
Message-ID: <20020916151418.GA9134@panix.com>

Please post this question to comp.lang.python; python-dev is only for
discussion for development of the Python project itself.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From martin@v.loewis.de  Mon Sep 16 16:50:46 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 16 Sep 2002 17:50:46 +0200
Subject: [Python-Dev] flextype.c -- extended type system
In-Reply-To: <3D85C8EC.50007@tismer.com>
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU>
 <200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net>
 <3D85C8EC.50007@tismer.com>
Message-ID: <m3n0qidjc9.fsf@mira.informatik.hu-berlin.de>

Christian Tismer <tismer@tismer.com> writes:

> > Christian's additions (as far as I understand them :-) are mostly
> > intended for very esoteric situations.
> 
> My additions support a subset of C++ virtual methods.
> How is that esoteric?

Why would an extension writer ever want to do this? "Normal" extension
types either wrap some C type, so you don't have inheritance at all,
or some C++ type, in which case a single type method can wrap
arbitrary virtual methods (since the VMT is done in C++).

A real-world example would help.

Regards,
Martin


From thomas.heller@ion-tof.com  Mon Sep 16 19:46:06 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 16 Sep 2002 20:46:06 +0200
Subject: [Python-Dev] flextype.c -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU><200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net><3D85C8EC.50007@tismer.com> <m3n0qidjc9.fsf@mira.informatik.hu-berlin.de>
Message-ID: <08a701c25db1$50aa0730$e000a8c0@thomasnotebook>

> > My additions support a subset of C++ virtual methods.
> > How is that esoteric?
> 
> Why would an extension writer ever want to do this? "Normal" extension
> types either wrap some C type, so you don't have inheritance at all,
> or some C++ type, in which case a single type method can wrap
> arbitrary virtual methods (since the VMT is done in C++).

I'm still in favor of a 'clean' method to add additional
C accessible structure fields to types. Currently I'm
attaching them to the type's dict, as I reported before.

As I understand it, Christian's first patch allows this.

Thomas


From david.abrahams@rcn.com  Mon Sep 16 19:57:01 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Mon, 16 Sep 2002 14:57:01 -0400
Subject: [Python-Dev] flextype.c -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU><200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net><3D85C8EC.50007@tismer.com> <m3n0qidjc9.fsf@mira.informatik.hu-berlin.de>
Message-ID: <056601c25db3$6aa6a330$6701a8c0@boostconsulting.com>

From: "Martin v. Loewis" <martin@v.loewis.de>


> Christian Tismer <tismer@tismer.com> writes:
>
> > > Christian's additions (as far as I understand them :-) are mostly
> > > intended for very esoteric situations.
> >
> > My additions support a subset of C++ virtual methods.
> > How is that esoteric?
>
> Why would an extension writer ever want to do this? "Normal" extension
> types either wrap some C type, so you don't have inheritance at all,
> or some C++ type, in which case a single type method can wrap
> arbitrary virtual methods (since the VMT is done in C++).
>
> A real-world example would help.

Well, I want to do something like this, and I think it's for a fairly
simple reason.

All of my (dynamically-generated) extension classes need a piece of data
which tells them how much extra data to allocate in the variable-sized area
of their instances. This is an implementation detail which I don't want to
expose to users. Right now I have to stick it in the class' __dict__, which
not only means that it's exposed, but that users can change it at will. It
also costs me an extra lookup every time an instance of the extension class
is allocated. It would be much nicer if I could get a little data area in
the type object where I could stick this value, but right now there's no
place to put it.

Chris' patch allows me to handle the issue much more naturally. It doesn't
seem esoteric to add information to a type which doesn't live in its
__dict__. Not being able to do so makes types very different from other
objects.
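
The exposure problem described above can be sketched at the Python level.
This is only an analogy, not the C-level mechanism under discussion: the
class name `Extension`, the attribute `_extra_bytes` and the helper
`allocate` are all hypothetical, chosen to show that a value kept in a
class __dict__ is both visible and changeable by users, and is looked up
again on every allocation.

```python
class Extension:
    # Hypothetical bookkeeping value: how much extra space each instance
    # needs.  Kept as a class attribute, it lives in the class __dict__.
    _extra_bytes = 32


def allocate(cls):
    """Stand-in for an allocator; it looks the size up on the class each time."""
    return bytearray(cls._extra_bytes)


def is_exposed(cls, name):
    """True if `name` is stored in the class __dict__, where users can see it."""
    return name in vars(cls)


if __name__ == "__main__":
    print(is_exposed(Extension, "_extra_bytes"))   # the detail is visible
    print(len(allocate(Extension)))                # size as intended
    Extension._extra_bytes = 0                     # ...but anyone can change it
    print(len(allocate(Extension)))                # allocator now misbehaves
```

A private data area on the type object itself, as proposed in this thread,
would avoid both the exposure and the extra lookup.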


-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com

Of course, that makes it esoteric by its very definition ;-)



From drifty@bigfoot.com  Mon Sep 16 20:12:22 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Mon, 16 Sep 2002 12:12:22 -0700 (PDT)
Subject: [Python-Dev] Python-dev summary for 2002-09-01 to 2002-09-15
In-Reply-To: <B9AAE836.28E49%goodger@users.sourceforge.net>
Message-ID: <Pine.SOL.4.44.0209161205020.16185-100000@death.OCF.Berkeley.EDU>

[David Goodger]

> Brett Cannon wrote:
> > Please note that this summary is written using reST_ which can be found at
> > http://docutils.sourceforge.net/rst.html .  If there is some markup in the
> > summary that seems odd, chances are it is because of reST.
>
> Please don't blame the markup!  By the time people see it, it's been
> mutilated by mailers to the point where it's unrecognizable.  Like Python,
> leading whitespace is significant in reStructuredText.  As the author,
> please take steps to prevent your document's mutilation.
>

OK, I will mention that it might be reformatted in a strange way by their
reader as well, but I am going to leave in the mention of reST.  People
are not necessarily going to understand why I have :: after a paragraph.

> There are some serious problems in the text I received, probably due to
> emailer handling of the text.  Specifically, line wrapping gets screwed up
> if lines are longer than 76 or 78 characters, and indentation goes out the
> window.  I always limit files to 70 characters per line to prevent this.
>

I am willing to guarantee that is the problem.  I ran the summary through
tools/html.py and everything turned out well.

Problem is that wrapping at 70 characters will be a pain for me.  I can
try to use textwrap from Python 2.3 to do it for me, but unless I discover
a setting in my editor (BBEdit Lite; Vim was driving me nuts for straight
text editing), I don't know if my sanity is going to allow for this
request.
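
A minimal sketch of the textwrap idea, assuming the summary is plain text
whose paragraphs are separated by blank lines.  The helper name
`wrap_paragraphs` is hypothetical.  Paragraphs containing any indented line
are passed through unchanged, on the assumption that they are reST literal
blocks or block quotes where leading whitespace matters; long URLs are kept
whole rather than broken.

```python
import textwrap


def wrap_paragraphs(text, width=70):
    """Refill each unindented paragraph of `text` to at most `width` columns.

    Paragraphs are separated by blank lines.  A paragraph with any indented
    line is left exactly as written (assumed to be significant in reST).
    """
    wrapped = []
    for para in text.split("\n\n"):
        lines = para.split("\n")
        if any(line[:1] in (" ", "\t") for line in lines):
            wrapped.append(para)  # indented block: do not touch
        else:
            wrapped.append(
                textwrap.fill(
                    para,
                    width=width,
                    break_long_words=False,   # keep long URLs in one piece
                    break_on_hyphens=False,
                )
            )
    return "\n\n".join(wrapped)
```

Running the whole summary through such a function before mailing would keep
ordinary prose under the limit without an editor setting; a URL longer than
the limit still ends up on a line of its own.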

I am willing, though, to put in a line saying that various email and
newsgroup readers might reformat the code and that if you want the
original to run through reST code yourself, get it from my summary
repository.

> I haven't had a chance to look through it thoroughly (gotta get some sleep),
> but I noticed you used a literal block for your author's intro, beginning
> "This is the second summary".  I think a block quote would be better; just
> drop the "::" and fix the indentation (which was totally wacky).
>

I was playing with that just before sending it out.  You answered my
personal email about it already, so that will be fixed before the summary
goes out.

> If you send me the original as an attachment (gzipped would be best), I'll
> be happy to take a look and give a detailed critique.
>

OK.

-Brett



From drifty@bigfoot.com  Mon Sep 16 20:14:38 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Mon, 16 Sep 2002 12:14:38 -0700 (PDT)
Subject: [Python-Dev] Python-dev summary for 2002-09-01 to 2002-09-15
In-Reply-To: <20020916105905.GE797@xs4all.nl>
Message-ID: <Pine.SOL.4.44.0209161214240.16185-100000@death.OCF.Berkeley.EDU>

[Thomas Wouters]

> On Sun, Sep 15, 2002 at 09:31:53PM -0700, Brett Cannon wrote:
>
> > Skip Montanaro had two ideas; one was to make the info in `Misc/NEWS`_
>
> I suspect there's a "using reST" or "in reST" missing here.
>

Yep.  Thanks.

-Brett



From thomas.heller@ion-tof.com  Mon Sep 16 20:24:54 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 16 Sep 2002 21:24:54 +0200
Subject: [Python-Dev] flextype.c -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU><200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net><3D85C8EC.50007@tismer.com> <m3n0qidjc9.fsf@mira.informatik.hu-berlin.de> <056601c25db3$6aa6a330$6701a8c0@boostconsulting.com>
Message-ID: <092701c25db6$bc9508f0$e000a8c0@thomasnotebook>

From: "David Abrahams" <dave@boost-consulting.com>
> All of my (dynamically-generated) extension classes need a piece of data
> which tells them how much extra data to allocate in the variable-sized area
> of their instances. This is an implementation detail which I don't want to

Not so different from what I need...

> expose to users. Right now I have to stick it in the class' __dict__, which
> not only means that it's exposed, but that users can change it at will. It
> also costs me an extra lookup every time an instance of the extension class
> is allocated. It would be much nicer if I could get a little data area in
> the type object where I could stick this value, but right now there's no
> place to put it.

You can (but you probably know this already) replace the type's tp_dict
by a custom subclass of PyDict_Object, which adds additional fields.

> Chris' patch allows me to handle the issue much more naturally. It doesn't
> seem esoteric to add information to a type which doesn't live in its
> __dict__. Not being able to do so makes types very different from other
> objects.

Actually this is not specific to types - it is for all variable size
objects.

Thomas


From david.abrahams@rcn.com  Mon Sep 16 20:16:23 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Mon, 16 Sep 2002 15:16:23 -0400
Subject: [Python-Dev] flextype.c -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU><200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net><3D85C8EC.50007@tismer.com> <m3n0qidjc9.fsf@mira.informatik.hu-berlin.de> <056601c25db3$6aa6a330$6701a8c0@boostconsulting.com> <092701c25db6$bc9508f0$e000a8c0@thomasnotebook>
Message-ID: <05a301c25db5$b3161db0$6701a8c0@boostconsulting.com>

From: "Thomas Heller" <thomas.heller@ion-tof.com>
> 
> You can (but you probably know this already) replace the type's tp_dict
> by a custom subclass of PyDict_Object, which adds additional fields.

I probably knew that once. Thanks for reminding me. When I have time...


-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com





From mcolli@SyscomCipher.com.ar  Mon Sep 16 20:24:45 2002
From: mcolli@SyscomCipher.com.ar (mcolli@SyscomCipher.com.ar)
Date: Mon, 16 Sep 2002 16:24:45 -0300
Subject: [Python-Dev] looking for Python programmers
Message-ID: <OFFE4948A6.1FAC72DF-ON03256C36.006A0DFE@ZiryAsoc.com.ar>

Hi,

I would like to know how can I do to get information about Python
programmers that could be interested to work in a Zope/Python project in
Buenos Aires, Argentina.

Is this the right address to contact?

Many thanks and regards
Mariela



From drifty@bigfoot.com  Mon Sep 16 20:34:09 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Mon, 16 Sep 2002 12:34:09 -0700 (PDT)
Subject: [Python-Dev] looking for Python programmers
In-Reply-To: <OFFE4948A6.1FAC72DF-ON03256C36.006A0DFE@ZiryAsoc.com.ar>
Message-ID: <Pine.SOL.4.44.0209161233070.16185-100000@death.OCF.Berkeley.EDU>

[mcolli@SyscomCipher.com.ar]

> Hi,
>
> I would like to know how can I do to get information about Python
> programmers that could be interested to work in a Zope/Python project in
> Buenos Aires, Argentina.
>
> Is this the right address to contact?
>

No.  This address is used to discuss the development of Python.  Your
search would be better performed either on comp.lang.python or
comp.lang.python.announce .

-Brett C.



From just@letterror.com  Mon Sep 16 22:31:05 2002
From: just@letterror.com (Just van Rossum)
Date: Mon, 16 Sep 2002 23:31:05 +0200
Subject: [Python-Dev] SystemError: unknown opcode
Message-ID: <r01050300-1015-9C3A7066C9BB11D6B7FD003065D5E7E4@[10.0.0.23]>

After building Python from CVS (it's been a while) I get this error:

  Python 2.3a0 (#43, Sep 16 2002, 22:47:33) 
  [GCC 2.95.2 19991024 (release)] on darwin
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import fontTools
  XXX lineno: 1, opcode: 127
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "/Users/just/code/fonttools/Lib/fontTools/__init__.py", line 1, in ?
      version = "2.0b2"
  SystemError: unknown opcode
  >>> 

Does this mean the .pyc magic number needs to be changed? Or is it simply the
risk of using CVS Python? ;-)

Just


From martin@v.loewis.de  Mon Sep 16 22:41:08 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 16 Sep 2002 23:41:08 +0200
Subject: [Python-Dev] looking for Python programmers
In-Reply-To: <Pine.SOL.4.44.0209161233070.16185-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209161233070.16185-100000@death.OCF.Berkeley.EDU>
Message-ID: <m3bs6xfw97.fsf@mira.informatik.hu-berlin.de>

Brett Cannon <bac@OCF.Berkeley.EDU> writes:

> > I would like to know how can I do to get information about Python
> > programmers that could be interested to work in a Zope/Python project in
> > Buenos Aires, Argentina.
[...]
> No.  This address is used to discuss the development of Python.  Your
> search would be better performed either on comp.lang.python or
> comp.lang.python.announce .

Actually, I think the Python Job Board (http://python.org/Jobs.html)
is the right forum for posting intents to hire.

Regards,
Martin


From guido@python.org  Mon Sep 16 22:43:29 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 16 Sep 2002 17:43:29 -0400
Subject: [Python-Dev] SystemError: unknown opcode
In-Reply-To: Your message of "Mon, 16 Sep 2002 23:31:05 +0200."
 <r01050300-1015-9C3A7066C9BB11D6B7FD003065D5E7E4@[10.0.0.23]>
References: <r01050300-1015-9C3A7066C9BB11D6B7FD003065D5E7E4@[10.0.0.23]>
Message-ID: <200209162143.g8GLhTa28570@pcp02138704pcs.reston01.va.comcast.net>

> After building Python from CVS (it's been a while) I get this error:
                                  ^^^^^^^^^^^^^^^^^

This is key.

>   Python 2.3a0 (#43, Sep 16 2002, 22:47:33) 
>   [GCC 2.95.2 19991024 (release)] on darwin
>   Type "help", "copyright", "credits" or "license" for more information.
>   >>> import fontTools
>   XXX lineno: 1, opcode: 127
>   Traceback (most recent call last):
>     File "<stdin>", line 1, in ?
>     File "/Users/just/code/fonttools/Lib/fontTools/__init__.py", line 1, in ?
>       version = "2.0b2"
>   SystemError: unknown opcode
>   >>> 
> 
> Does this mean the .pyc magic number needs to be changed? Or is it
> simply the risk of using CVS Python? ;-)

The magic number was changed several times due to the SET_LINENO
changes.  At the end I changed it *back* to what we changed it to
after another change earlier during the 2.3 cycle.  You may be the
only unlucky guy who missed both rounds of SET_LINENO changes.

Remove all your .pyc/.pyo files and be done with it.
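
One way to do that cleanup, sketched in Python.  The function name
`remove_compiled` is hypothetical, and the sketch assumes you want every
.pyc and .pyo file under a given directory tree deleted; the interpreter
will regenerate them on the next import.

```python
import os


def remove_compiled(root):
    """Delete every .pyc/.pyo file under `root` and return the removed paths."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith((".pyc", ".pyo")):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

For the traceback above, the tree to clean would be the fontTools source
directory as well as the freshly built Python's own Lib directory.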

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Sep 16 23:21:31 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 16 Sep 2002 18:21:31 -0400
Subject: [Python-Dev] Moratorium on changes to IDLE
Message-ID: <200209162221.g8GMLVY30469@pcp02138704pcs.reston01.va.comcast.net>

I'd like to put a stop to all changes to the version of IDLE in the
Python source tree (Tools/idle/* -- let's call it Python-idle).  The
current crop of changes are being merged into Idlefork, the separate
SF project where a new IDLE version is being cooked.  I hope that
Idlefork will be ready to be merged back into Python before we release
Python 2.3, and that will be easiest if we can simply abandon the
existing Python-idle code and copy the latest Idlefork in its place.
Any changes made to the Python-idle code will be lost at that point.
If you have a bug, fix or feature for IDLE, please suggest it on the
idle-dev mailing list or on Idlefork's SF bug/patch managers!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Sep 17 00:23:15 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 16 Sep 2002 19:23:15 -0400
Subject: [Python-Dev] Python-dev summary for 2002-09-01 to 2002-09-15
In-Reply-To: Your message of "Sun, 15 Sep 2002 21:31:53 PDT."
 <Pine.SOL.4.44.0209152126370.4459-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0209152126370.4459-100000@death.OCF.Berkeley.EDU>
Message-ID: <200209162323.g8GNNFa30696@pcp02138704pcs.reston01.va.comcast.net>

> Jack Jansen noticed that there are demos for some of the SGI-specific modules
> that use severely outdated systems and hardware (stuff discontinued 8 to
> 12 years ago).  Guido gave the go-ahead to yank them from CVS.  So the
> demos are now history.

Wish they were!  Nobody ripped them out.  Sjoerd Mullender gave some
more feedback (some of the code still works) but in the end nobody did
anything.  I still hope it'll happen though.

> Guido said go for the latter and didn't see any possible security issue
> since "If someone you don't trust can write your .pyc files, they can
> cause your interpreter to crash by inserting bogus bytecode".

Another issue where I fear the action item is still in somebody's corner.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From goodger@users.sourceforge.net  Tue Sep 17 01:53:49 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Mon, 16 Sep 2002 20:53:49 -0400
Subject: [Python-Dev] Re: Python-dev summary for 2002-09-01 to 2002-09-15
In-Reply-To: <oq3csakonu.fsf@titan.progiciels-bpi.ca>
Message-ID: <B9ABF55C.28F06%goodger@users.sourceforge.net>

[David Goodger]
>> Please don't blame the markup!  By the time people see it, it's
>> been mutilated by mailers to the point where it's unrecognizable.
>> [...]  As the author, please take steps to prevent your document's
>> mutilation.

[François Pinard]
> The message seems adequately formatted, as delivered here.

I'm seeing subtle things that you probably won't notice if you're not
used to writing reStructuredText.  (Which is as it should be -- easy
to read, even though some care must be taken in the writing.)

Specifically, the paragraph beginning "This is the second summary" has
very strange indentation.  Every other line is indented by one space
(tab?).  The simplest explanation for this is that the whole thing was
supposed to be indented, but the lines were very long.
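
That explanation can be illustrated with a small sketch.  It assumes a
mailer that hard-wraps each over-long line at the last space within 76
columns and does not repeat the indentation on the continuation; the
function name `naive_mailer_wrap` and the 76-column default are
assumptions for illustration, not the behaviour of any particular mail
program.

```python
def naive_mailer_wrap(line, width=76):
    """Break one long line at the last space at or before column `width`.

    Continuation pieces start at column 0: the original indentation is
    not repeated, which is what makes every other line look unindented.
    """
    pieces = []
    while len(line) > width:
        indent = len(line) - len(line.lstrip(" "))
        cut = line.rfind(" ", indent, width + 1)
        if cut == -1:
            break  # no space to break at; leave the rest as one long line
        pieces.append(line[:cut])
        line = line[cut + 1:]
    pieces.append(line)
    return pieces


if __name__ == "__main__":
    # An indented line of about 100 characters, as in an indented block
    # written without hard line breaks.
    source = "    " + ("word " * 20).strip()
    for piece in naive_mailer_wrap(source):
        print(repr(piece))
```

Each source line between 77 and roughly 150 characters comes out as one
indented line followed by one unindented line, which matches the pattern
described above.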

> This is a recurring problem, deciding how far maintainers or writers
> should keep in mind various broken software of the recipients.

I think the problem is earlier than my mail client.  The text I
received in my mailbox is identical (indentation is equally wonky) to
that on the web in the Python-dev archive:
http://mail.python.org/pipermail/python-dev/2002-September/028754.html.

> There is an equilibrium to reach, but the pressure is often undue on
> the authors, as recipients want them to take care of everything bad
> they see.

In this case, it's a document posted to mailing lists.  I don't think
it's too much to ask that mailer line wrapping be allowed for.  I
agree that long URLs must remain, but there shouldn't be a problem for
ordinary text.

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/



From fg@nuxeo.com  Tue Sep 17 02:00:09 2002
From: fg@nuxeo.com (Florent Guillaume)
Date: 17 Sep 2002 03:00:09 +0200
Subject: [Python-Dev] Unicode regexp problem
Message-ID: <1032224410.13730.9.camel@twin.in.efge.org>

I've got the following problem, in python 2.1, 2.2 and 2.3a0 (Debian):

>>> import re
>>> re.compile(r'\w+',   re.U).sub('X', u'hello caf\xe9')
u'X X'
>>> re.compile(r'\w{1}', re.U).sub('X', u'hello caf\xe9')
u'XXXXX XXXX'
>>> re.compile(r'\w',    re.U).sub('X', u'hello caf\xe9')
u'XXXXX XXX\xe9'

The first two results are ok, but the third is not.
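
For comparison, the same three substitutions written as a small check
against a current Python 3 interpreter.  This is a sketch under that
assumption only: in Python 3, str patterns use Unicode matching by
default, so re.U is implied, and the function name `substitute_all` is
hypothetical.  It says nothing about where the 2.x behaviour comes from.

```python
import re


def substitute_all(text):
    """Apply the three patterns from the report and return the results."""
    return (
        re.sub(r"\w+", "X", text),    # whole words
        re.sub(r"\w{1}", "X", text),  # one word character, counted form
        re.sub(r"\w", "X", text),     # one word character, plain form
    )


if __name__ == "__main__":
    for result in substitute_all("hello caf\xe9"):
        print(result)
    # X X
    # XXXXX XXXX
    # XXXXX XXXX
```

The last two patterns are equivalent, so they should always give the same
result; any difference between them points at the matcher rather than at
the pattern.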

Thanks,

Florent


PS: I'd appreciate a Cc on answers.

-- 
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87  http://nuxeo.com  mailto:fg@nuxeo.com



From aahz@pythoncraft.com  Tue Sep 17 02:05:19 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 16 Sep 2002 21:05:19 -0400
Subject: [Python-Dev] Unicode regexp problem
In-Reply-To: <1032224410.13730.9.camel@twin.in.efge.org>
References: <1032224410.13730.9.camel@twin.in.efge.org>
Message-ID: <20020917010519.GA9969@panix.com>

On Tue, Sep 17, 2002, Florent Guillaume wrote:
>
> I've got the following problem, in python 2.1, 2.2 and 2.3a0 (Debian):
> 
> >>> import re
> >>> re.compile(r'\w+',   re.U).sub('X', u'hello caf\xe9')
> u'X X'
> >>> re.compile(r'\w{1}', re.U).sub('X', u'hello caf\xe9')
> u'XXXXX XXXX'
> >>> re.compile(r'\w',    re.U).sub('X', u'hello caf\xe9')
> u'XXXXX XXX\xe9'
> 
> The first two results are ok, but the third is not.

python-dev is the wrong forum for bug reports, unless

a) it's *only* in the CVS tree

and

b) you know you need advice for fixing it (and are planning to help fix)

In any case, you should write a bug report on SourceForge first (unless
you're posting to c.l.python to check whether it is a bug).
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From martin@v.loewis.de  Tue Sep 17 06:25:46 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 17 Sep 2002 07:25:46 +0200
Subject: [Python-Dev] Moratorium on changes to IDLE
In-Reply-To: <200209162221.g8GMLVY30469@pcp02138704pcs.reston01.va.comcast.net>
References: <200209162221.g8GMLVY30469@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3vg55chlx.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> I'd like to put a stop to all changes to the version of IDLE in the
> Python source tree (Tools/idle/* -- let's call it Python-idle).  The
> current crop of changes are being merged into Idlefork, the separate
> SF project where a new IDLE version is being cooked. 

Does Idlefork also require CVS Python?

Regards,
Martin


From guido@python.org  Tue Sep 17 06:28:19 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 17 Sep 2002 01:28:19 -0400
Subject: [Python-Dev] Moratorium on changes to IDLE
In-Reply-To: Your message of "Tue, 17 Sep 2002 07:25:46 +0200."
 <m3vg55chlx.fsf@mira.informatik.hu-berlin.de>
References: <200209162221.g8GMLVY30469@pcp02138704pcs.reston01.va.comcast.net>
 <m3vg55chlx.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200209170528.g8H5SJA04609@pcp02138704pcs.reston01.va.comcast.net>

> > I'd like to put a stop to all changes to the version of IDLE in the
> > Python source tree (Tools/idle/* -- let's call it Python-idle).  The
> > current crop of changes are being merged into Idlefork, the separate
> > SF project where a new IDLE version is being cooked. 
> 
> Does Idlefork also require CVS Python?

Not yet, AFAIK, it requires 2.2.  A very small number of changes to
Python-idle could not be merged for that reason (e.g. mkstemp).  I'd
like to keep Idlefork working with 2.2 so there's a reasonable
potential user base for an Idlefork release.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tismer@tismer.com  Tue Sep 17 10:44:38 2002
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 17 Sep 2002 11:44:38 +0200
Subject: [Python-Dev] flextype.c -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU><200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net><3D85C8EC.50007@tismer.com> <m3n0qidjc9.fsf@mira.informatik.hu-berlin.de> <056601c25db3$6aa6a330$6701a8c0@boostconsulting.com>
Message-ID: <3D86F986.7000305@tismer.com>

David Abrahams wrote:
<snip>

> Chris' patch allows me to handle the issue much more naturally. It doesn't
> seem esoteric to add information to a type which doesn't live in its
> __dict__. Not being able to do so makes types very different from other
> objects.
> 
> 
> -----------------------------------------------------------
>            David Abrahams * Boost Consulting
> dave@boost-consulting.com * http://www.boost-consulting.com
> 
> Of course, that makes it esoteric by its very definition ;-)

Hee hee :-))

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From tismer@tismer.com  Tue Sep 17 10:56:34 2002
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 17 Sep 2002 11:56:34 +0200
Subject: [Python-Dev] flextype.c -- extended type system
References: <Pine.SOL.4.44.0209151220130.5286-100000@death.OCF.Berkeley.EDU><200209151943.g8FJhc809943@pcp02138704pcs.reston01.va.comcast.net><3D85C8EC.50007@tismer.com> <m3n0qidjc9.fsf@mira.informatik.hu-berlin.de> <08a701c25db1$50aa0730$e000a8c0@thomasnotebook>
Message-ID: <3D86FC52.9010505@tismer.com>

Thomas Heller wrote:
>>>My additions support a subset of C++ virtual methods.
>>>How is that esoteric?
>>
>>Why would an extension writer ever want to do this? "Normal" extension
>>types either wrap some C type, so you don't have inheritance at all,
>>or some C++ type, in which case a single type method can wrap
>>arbitrary virtual methods (since the VMT is done in C++).
> 
> 
> I'm still in favor of a 'clean' method to add additional
> C accessible structure fields to types. Currently I'm
> attaching them to the type's dict, as I reported before.
> 
> As I understand it, Christian's first patch allows this.

Please let me know when you're actually going to use
it. I know there is a bug in the 2.3 patch. For Stackless,
I'm still hacking against 2.2.1, and the patch has been
extended in several ways as well: I removed the assumption
that objects generated from heap types always need to
be GC objects. This was probably decided with classes
too much in mind, but now this feature also makes sense
for simple types where you might want to avoid GC for
space or other reasons.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From guido@python.org  Tue Sep 17 21:24:18 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 17 Sep 2002 16:24:18 -0400
Subject: [Python-Dev] Weeding out obsolete modules and Demos
In-Reply-To: Your message of "Wed, 11 Sep 2002 10:44:03 +0200."
 <200209110844.g8B8i3D14336@indus.ins.cwi.nl>
References: <10D997B8-C501-11D6-88B2-003065517236@oratrix.com> <200209102119.g8ALJ1h29280@odiug.zope.com>
 <200209110844.g8B8i3D14336@indus.ins.cwi.nl>
Message-ID: <200209172024.g8HKOIo08287@odiug.zope.com>

> I would not be opposed to deleting the whole Demo/sgi tree.

OK, I've done this.  I've saved a private copy of the mclock code for
nostalgic purposes; I may yet rewrite using Tkinter. :-)

> I don't have an SGI workstation anymore, but I did have an SGI O2
> until recently.  I think the audio (and al) stuff and the gl stuff
> probably still work (I used mclock until recently).  I think we can
> definitely get rid of the video directory (CMIF video format, remember
> that?).  I'm not sure whether sv still compiles on modern SGI's.  The
> cd module also still works.
> 
> Having said this, I'm not sure there is still much value in keeping
> the demos.  The modules in the Modules directory are another matter.
> Until recently I have used cd and al.  I think cl might still work,
> but I'm not sure.  I don't think sv works on anything other than
> Indigo's with a Starter Video board.
> gl (and I think also fm) still works.  sgi also still works, but I'm
> not sure how useful it still is.  It just defines functions nap and
> _getpty.
> rgbimg is for reading SGI RGB images, but is portable.  Although one
> must ask whether it has a place in the standard library, since there
> is no similar level of support for more popular image formats.

OK, I won't touch the SGI specific code in Lib and Modules.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From kbk@shore.net  Wed Sep 18 00:59:57 2002
From: kbk@shore.net (Kurt B. Kaiser)
Date: 17 Sep 2002 19:59:57 -0400
Subject: [Python-Dev] Idle Development
Message-ID: <m37khkxj42.fsf@float.attbi.com>

Guido posted a moratorium on further Python Idle development, with the
intention that work be shifted to Idlefork.

I'd like to extend an invitation to any interested developers with
Python CVS access to join the Idlefork project and continue in their
idle ways.

Send me an email and I'll set you up with Idlefork access.

KBK





From kbk@shore.net  Wed Sep 18 18:55:52 2002
From: kbk@shore.net (Kurt B. Kaiser)
Date: 18 Sep 2002 13:55:52 -0400
Subject: [Python-Dev] ANNOUNCE -- Python-idle to Idlefork Merge Completed
Message-ID: <m3lm5zuqqf.fsf@float.attbi.com>

Python-idle has been merged into Idlefork as of 13 Sep 2002.  

I believe there are no Python-idle check-ins after that date, and
since there will not be any future check-ins this should be the final
merge from Python-idle.

The Idlefork CVS is again open for business!

Please submit bugs and patches to the Idlefork Tracker and any
comments to the Idle-dev list.

KBK



From praveen.patil@silver-software.com  Thu Sep 19 14:39:43 2002
From: praveen.patil@silver-software.com (Praveen Patil)
Date: Thu, 19 Sep 2002 14:39:43 +0100
Subject: [Python-Dev] How pass array from  C  to python function
Message-ID: <NFBBLJFBNMKMLGNLJMFCKENPCCAA.praveen.patil@silver-software.com>

Hi ,

I have a problem passing an array to a Python function.
Please help me pass an array to a Python function.

Here is my  'C' program
-----------------------

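void RECEIVE_IL_STATE_S(int Instance, int vital_data[5])
{
    PyObject *arglist;
    PyObject *ret;
    PyObject *mylist;
    PyObject *myint;    /* was used without a declaration */
    int count;

    /* PyList_New(5) already has five (empty) slots, so fill them with
       PyList_SetItem; PyList_Append would add items 6 through 10. */
    mylist = PyList_New(5);
    if (mylist == NULL)
        return;
    for (count = 0; count < 5; count++) {
        myint = PyInt_FromLong(vital_data[count]);
        if (myint == NULL) {
            Py_DECREF(mylist);
            return;
        }
        PyList_SetItem(mylist, count, myint);   /* steals myint */
    }
    /* The argument list must be a tuple; "(O)" builds a 1-tuple. */
    arglist = Py_BuildValue("(O)", mylist);
    Py_DECREF(mylist);      /* the tuple holds its own reference */
    if (arglist == NULL)
        return;
    ret = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
    Py_XDECREF(ret);        /* ret is NULL if the call raised */
}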

Here is my Python program
-------------------------

G_Logfile          = None

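def TestFunction(a):
    global G_Logfile    # rebind the module-level name, not a local one
    G_Logfile = open('Pytestfile.txt', 'w')
    G_Logfile.write("%d \n"% a[0])
    G_Logfile.write("%d \n"% a[1])
    G_Logfile.close()   # without the parentheses nothing is called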


Cheers,

Praveen.



From thomas.heller@ion-tof.com  Thu Sep 19 15:28:33 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 19 Sep 2002 16:28:33 +0200
Subject: [Python-Dev] CVS hosed?
Message-ID: <3D89DF11.4070106@ion-tof.com>

I cannot reach python's CVS repository.
Should I look on my side for a problem, or is it the same for other 
developers?

Thanks,

Thomas



From guido@python.org  Thu Sep 19 15:31:08 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 19 Sep 2002 10:31:08 -0400
Subject: [Python-Dev] CVS hosed?
In-Reply-To: Your message of "Thu, 19 Sep 2002 16:28:33 +0200."
 <3D89DF11.4070106@ion-tof.com>
References: <3D89DF11.4070106@ion-tof.com>
Message-ID: <200209191431.g8JEV8b03080@pcp02138704pcs.reston01.va.comcast.net>

> I cannot reach python's CVS repository.
> Should I look on my side for a problem, or is it the same for other 
> developers?

Same here.  I'll submit a support request; the SF status page says
everything's online.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Thu Sep 19 15:33:20 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 19 Sep 2002 10:33:20 -0400
Subject: [Python-Dev] How pass array from  C  to python function
In-Reply-To: <NFBBLJFBNMKMLGNLJMFCKENPCCAA.praveen.patil@silver-software.com>
References: <NFBBLJFBNMKMLGNLJMFCKENPCCAA.praveen.patil@silver-software.com>
Message-ID: <20020919143320.GA11594@panix.com>

On Thu, Sep 19, 2002, Praveen Patil wrote:
>
> I have problem in passing array to python function.
> Please help me passing array to python function.

python-dev is not for general questions about Python programming.
Please post your question to comp.lang.python.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From sjoerd@acm.org  Thu Sep 19 15:42:05 2002
From: sjoerd@acm.org (Sjoerd Mullender)
Date: Thu, 19 Sep 2002 16:42:05 +0200
Subject: [Python-Dev] CVS hosed?
In-Reply-To: <200209191431.g8JEV8b03080@pcp02138704pcs.reston01.va.comcast.net>
References: <3D89DF11.4070106@ion-tof.com>
 <200209191431.g8JEV8b03080@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209191442.g8JEg5Z25259@indus.ins.cwi.nl>

There are already half a dozen such support requests.  Lots of people
are bothered by this in lots of projects.

On Thu, Sep 19 2002 Guido van Rossum wrote:

> > I cannot reach python's CVS repository.
> > Should I look on my side for a problem, or is it the same for other 
> > developers?
> 
> Same here.  I'll submit a support request; the SF status page says
> everything's online.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 

-- Sjoerd Mullender <sjoerd@acm.org>


From barry@python.org  Thu Sep 19 15:40:55 2002
From: barry@python.org (Barry A. Warsaw)
Date: Thu, 19 Sep 2002 10:40:55 -0400
Subject: [Python-Dev] CVS hosed?
References: <3D89DF11.4070106@ion-tof.com>
Message-ID: <15753.57847.13384.639206@anthem.wooz.org>

>>>>> "TH" == Thomas Heller <thomas.heller@ion-tof.com> writes:

    TH> I cannot reach python's CVS repository.  Should I look on my
    TH> side for a problem, or is it the same for other developers?

All of SF's CVS has been hosed for a while this morning.  I suspect
it'll eventually come back <wink>.

-Barry


From mats@laplaza.org  Thu Sep 19 18:26:38 2002
From: mats@laplaza.org (Mats Wichmann)
Date: Thu, 19 Sep 2002 11:26:38 -0600
Subject: [Python-Dev] Re: mysterious hangs in socket code
In-Reply-To: <20020904160006.20673.57240.Mailman@mail.python.org>
Message-ID: <5.1.0.14.1.20020919112055.01ef1828@204.151.72.2>

 >> One possibility is that the Linux getaddrinfo() is thread-safe, but
 >> only by way of a lock that only allows one request to be outstanding
 >> at a time.
 >
 >The next step should be to get the getaddrinfo() source code from
 >glibc and see what it does.  It's open source, hey. :-)

I can dig around a bit, but I have to figure out what
I'm looking for.

On the failure platform, are we sure Python is using
the native getaddrinfo, not the Python-supplied one?
I've had some fun (not) with the latter; for working
on an LSB-conforming version of Python, I can't let it
use the glibc version of getaddrinfo because it's not
in the spec (will be in the next version); but the Python
addrinfo.h header has some fields in different order
than the Linux one, and it managed to call the Linux
one anyway.  The result of that was not subtle, however :-)
so I don't think that's the problem that started this
thread.

I do know the Linux (or rather, glibc) getaddrinfo doesn't
get reentrancy through magic, it calls gethostbyname_r
and gethostbyaddr_r. (Note the Python emulation getaddrinfo
just calls the straight gethostbyname and gethostbyaddr
routines and so is likely not to be reentrant).



From martin@v.loewis.de  Thu Sep 19 19:33:23 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Sep 2002 20:33:23 +0200
Subject: [Python-Dev] Re: mysterious hangs in socket code
In-Reply-To: <5.1.0.14.1.20020919112055.01ef1828@204.151.72.2>
References: <5.1.0.14.1.20020919112055.01ef1828@204.151.72.2>
Message-ID: <m3ofat96do.fsf@mira.informatik.hu-berlin.de>

Mats Wichmann <mats@laplaza.org> writes:

> >> One possibility is that the Linux getaddrinfo() is thread-safe, but
> >> only by way of a lock that only allows one request to be outstanding
> >> at a time.
> >
> >The next step should be to get the getaddrinfo() source code from
> >glibc and see what it does.  It's open source, hey. :-)
> 
> I can dig around a bit, but I have to figure out what
> I'm looking for.

I think that part is already settled: getaddrinfo, on Linux, is
thread-safe.

> On the failure platform, are we sure Python is using
> the native getaddrinfo, not the Python-supplied one?

Correct.

I think the remaining question is: Even if the GIL is released around
getaddrinfo - why is the performance of Jeremy's test script still
that bad?

Regards,
Martin


From guido@python.org  Thu Sep 19 19:40:38 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 19 Sep 2002 14:40:38 -0400
Subject: [Python-Dev] Re: mysterious hangs in socket code
In-Reply-To: Your message of "Thu, 19 Sep 2002 20:33:23 +0200."
 <m3ofat96do.fsf@mira.informatik.hu-berlin.de>
References: <5.1.0.14.1.20020919112055.01ef1828@204.151.72.2>
 <m3ofat96do.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200209191840.g8JIecF02462@odiug.zope.com>

> Mats Wichmann <mats@laplaza.org> writes:
> 
> > >> One possibility is that the Linux getaddrinfo() is thread-safe, but
> > >> only by way of a lock that only allows one request to be outstanding
> > >> at a time.
> > >
> > >The next step should be to get the getaddrinfo() source code from
> > >glibc and see what it does.  It's open source, hey. :-)
> > 
> > I can dig around a bit, but I have to figure out what
> > I'm looking for.

[MvL]
> I think that part is already settled: getaddrinfo, on Linux, is
> thread-safe.
> 
> > On the failure platform, are we sure Python is using
> > the native getaddrinfo, not the Python-supplied one?
> 
> Correct.
> 
> I think the remaining question is: Even if the GIL is released around
> getaddrinfo - why is the performance of Jeremy's test script still
> that bad?

I tried to read the glibc getaddrinfo() source, but it looks like it
would be a term project...  It could be that it's just doing a lot
more interaction with a DNS server.

I believe that Jeremy suspects that the test program isn't just slow,
but that one slow thread actually blocks all other threads from making
progress.  If that's the case (we don't know for sure), we're looking
for a bottleneck in the getaddrinfo() code that somehow holds a
resource needed by all threads calling getaddrinfo().
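
The suspected failure mode can be sketched in pure Python.  This is only
an illustration under stated assumptions, not a description of what glibc
does: slow_lookup is a hypothetical stand-in for a resolver call, and a
single threading.Lock plays the role of the shared resource.  With the
lock the calls are thread-safe but run one at a time; without it they
overlap.

```python
import threading
import time

DELAY = 0.05                       # pretend each lookup takes 50 ms
_resolver_lock = threading.Lock()  # hypothetical shared resource

def slow_lookup(use_lock):
    """Stand-in for a blocking resolver call such as getaddrinfo()."""
    if use_lock:
        with _resolver_lock:       # only one thread may be inside at a time
            time.sleep(DELAY)
    else:
        time.sleep(DELAY)

def run(use_lock, nthreads=4):
    """Start nthreads lookups at once and return the elapsed wall time."""
    threads = [threading.Thread(target=slow_lookup, args=(use_lock,))
               for _ in range(nthreads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

serialized = run(use_lock=True)
concurrent = run(use_lock=False)
print('with lock:    %.2fs' % serialized)
print('without lock: %.2fs' % concurrent)
```

If the real getaddrinfo() behaved like the locked variant, one slow
lookup would delay every other thread queued behind it, which is the
symptom described above.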

--Guido van Rossum (home page: http://www.python.org/~guido/)


From srinivas.rao.kollipara@ai.ag  Fri Sep 20 09:28:08 2002
From: srinivas.rao.kollipara@ai.ag (Srinivas Rao Kollipara)
Date: Fri, 20 Sep 2002 10:28:08 +0200
Subject: [Python-Dev] information needed
Message-ID: <6747D50F1AE5D511A02E009027CA36D137B977@exchange.mucs.ai.ag>

Hi,
I have a small python code, is there any tool which can convert the python
code to oracle plsql code.

Thanks

kolli



From drifty@bigfoot.com  Fri Sep 20 10:10:55 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Fri, 20 Sep 2002 02:10:55 -0700 (PDT)
Subject: [Python-Dev] information needed
In-Reply-To: <6747D50F1AE5D511A02E009027CA36D137B977@exchange.mucs.ai.ag>
Message-ID: <Pine.SOL.4.44.0209200209400.10975-100000@death.OCF.Berkeley.EDU>

[Srinivas Rao Kollipara]

> Hi,
> I have a small python code, is there any tool which can convert the python
> code to oracle plsql code.
>
> Thanks
>
> kolli
>

The Python-dev list is used  to discuss the development of Python.
General questions about Python such as this are best served by being
posted to comp.lang.python .

-Brett C.



From skip@pobox.com  Fri Sep 20 15:47:05 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 20 Sep 2002 09:47:05 -0500
Subject: [Python-Dev] ReST-ing Misc/NEWS
Message-ID: <15755.13545.743552.357262@12-248-11-90.client.attbi.com>

I just checked in a ReST-ified version of Misc/NEWS.  While the total number
of changes was fairly large, the number of different types of changes made
was quite small.  The overwhelming majority of changes involved properly
highlighting section and subsection headers.  

I'll review the changes here so that people who modify this file in the
future will be able to easily adapt to the new format.

First, to process Misc/NEWS using ReST, you'll need the latest docutils
snapshot:

    http://docutils.sf.net/docutils-snapshot.tgz

David Goodger made a change to the allowable structure of internal
references which simplified my job significantly.

The changes required fell into the following categories:

* The top-level "What's New" section headers changed to:

    What's New in Python 2.3 alpha 1?
    =================================

    *XXX Release date: DD-MMM-2002 XXX*

* Subsections are underlined with a single row of hyphens:

    Type/class unification and new-style classes
    --------------------------------------------

* Places where "balanced" single quotes were used were switched to use
  just apostrophes (`string' -> 'string').

* In a few places asterisks needed to be escaped which would otherwise have
  been interpreted as beginning blocks of italic or bold text, e.g.:

    - The type of tp_free has been changed from "void (*)(PyObject \*)"
      to "void (*)(void \*)".

  Note that only the asterisks preceded by whitespace needed to be escaped.

* One instance of a word ending with an underscore needed to be quoted
  ("PyCmp_" became "``PyCmp_``").

* One table was converted to ReST form (search Misc/NEWS for "New codecs"
  for this example).

* A few places where chunks of code or indented text were displayed needed
  to be properly introduced (preceding paragraph terminated by "::" and the
  chunk of code or text indented w.r.t. the paragraph).  For example:

    - Note that PyLong_AsDouble can fail!  This has always been true,
      but no callers checked for it.  It's more likely to fail now,
      because overflow errors are properly detected now.  The proper way
      to check: ::

          double x = PyLong_AsDouble(some_long_object);
          if (x == -1.0 && PyErr_Occurred()) {
              /* The conversion failed. */
              }

Not yet addressed is whether to automatically convert Misc/NEWS to other
formats (such as HTML).  I assume an automatic conversion to HTML is in the
cards, with the output made available on the python.org website.
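
Two of the conventions listed above (headers underlined to the length of
the title, and escaping only asterisks preceded by whitespace) can be
sketched as small helpers.  The function names are made up for
illustration, and the escaping rule is taken only from the note above,
not from the full docutils inline-markup rules:

```python
import re

def section(title, underline='='):
    """Return a reST section header: the title plus a same-length underline."""
    return title + '\n' + underline * len(title)

def escape_asterisks(text):
    """Backslash-escape each asterisk that directly follows whitespace."""
    return re.sub(r'(?<=\s)\*', r'\\*', text)

print(section("What's New in Python 2.3 alpha 1?"))
print(section('Type/class unification and new-style classes', '-'))
print(escape_asterisks('"void (*)(PyObject *)" to "void (*)(void *)"'))
```

The asterisk in "(*)" follows a parenthesis rather than whitespace, so
the helper leaves it alone, matching the tp_free example.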

Skip


From mats@laplaza.org  Fri Sep 20 17:12:33 2002
From: mats@laplaza.org (Mats Wichmann)
Date: Fri, 20 Sep 2002 10:12:33 -0600
Subject: [Python-Dev] Re: Re: mysterious hangs in socket code
In-Reply-To: <20020920160008.21719.12166.Mailman@mail.python.org>
Message-ID: <5.1.0.14.1.20020920100536.027f1158@204.151.72.2>

Martin:

 >I think that part is already settled: getaddrinfo, on Linux, is
 >thread-safe.

The latest Posix/Single UNIX spec in fact requires getaddrinfo
(and getnameinfo) to be thread-safe.

and Guido:

 >I tried to read the glibc getaddrinfo() source, but it looks like it
 >would be a term project...  It could be that it's just doing a lot
 >more interaction with a DNS server.
 >
 >I believe that Jeremy suspects that the test program isn't just slow,
 >but that one slow thread actually blocks all other threads from making
 >progress.  If that's the case (we don't know for sure), we're looking
 >for a bottleneck in the getaddrinfo() code that somehow holds a
 >resource needed by all threads calling getaddrinfo().

Gives me a headache, too, especially once it vectors off into
the glibc nss code.  These routines (__gethostbyname2_r is the
likely suspect, __gethostbyaddr_r and __getservbyname_r might
also get called) do have paths that could twiddle a glibc internal
lock so it's not impossible there's an issue here, although at
something less than a term-paper-depth look it doesn't SEEM like
it should ever be able to block for very long.



From guido@python.org  Fri Sep 20 17:16:38 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 20 Sep 2002 12:16:38 -0400
Subject: [Python-Dev] Re: Re: mysterious hangs in socket code
In-Reply-To: Your message of "Fri, 20 Sep 2002 10:12:33 MDT."
 <5.1.0.14.1.20020920100536.027f1158@204.151.72.2>
References: <5.1.0.14.1.20020920100536.027f1158@204.151.72.2>
Message-ID: <200209201616.g8KGGdj09850@pcp02138704pcs.reston01.va.comcast.net>

> Gives me a headache, too, especially once it vectors off into
> the glibc nss code.  These routines (__gethostbyname2_r is the
> likely suspect, __gethostbyaddr_r and __getservbyname_r might
> also get called) do have paths that could twiddle a glibc internal
> lock so it's not impossible there's an issue here, although at
> something less than a term-paper-depth look it doesn't SEEM like
> it should ever be able to block for very long.

Are there any tools for observing the locking calls made?  Maybe just
an strace would help.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From cgw@alum.mit.edu  Fri Sep 20 17:37:15 2002
From: cgw@alum.mit.edu (Charles G Waldman)
Date: Fri, 20 Sep 2002 09:37:15 -0700
Subject: [Python-Dev] Puzzling behavior when subclassing from float
Message-ID: <15755.20155.992326.621736@dragonfly.sportsdatabase.com>

Apologies in advance if this is the wrong forum for this question.

I'm trying to understand some puzzling behavior related to subclassing
from built-in types.

I'm running Python 2.2.1

If I subclass from "dict", it seems that the base class constructor is
not being called, which is just as I would expect:

class D(dict):
    def __init__(self, spam, eggs):
        print "spam=",spam, "eggs=", eggs

>>> d = D(1,2)
spam= 1 eggs= 2
>>> print d
{}

But if I subclass from "float", some magic is happening, which I don't
quite understand - it seems that the base class constructor *is* called:

class F(float):
    def __init__(self, spam, eggs):
        print "spam=",spam, "eggs=", eggs
      
>>> f = F(1,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: float() takes at most 1 argument (2 given)
        

If I modify the constructor to only take a single argument, I get the
following:

class F(float): 
    def __init__(self, spam):
        print "spam=",spam
        
>>> f = F(3.14)
spam= 3.14
>>> print f
3.14
>>> print f*2
6.28

How is the value "3.14" getting associated with f?  Apparently the
base class constructor is called.  How come this is happening?



From guido@python.org  Fri Sep 20 17:42:14 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 20 Sep 2002 12:42:14 -0400
Subject: [Python-Dev] Puzzling behavior when subclassing from float
In-Reply-To: Your message of "Fri, 20 Sep 2002 09:37:15 PDT."
 <15755.20155.992326.621736@dragonfly.sportsdatabase.com>
References: <15755.20155.992326.621736@dragonfly.sportsdatabase.com>
Message-ID: <200209201642.g8KGgEM10013@pcp02138704pcs.reston01.va.comcast.net>

> Apologies in advance if this is the wrong forum for this question.

It is, but because you're you, I don't mind.

You're missing that besides __init__, new-style classes also have a
lower-level constructor, __new__.  This is called before __init__.
For immutable objects, __new__ is where the action is.  Read about it
in http://www.python.org/2.2.1/descrintro.html
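
A minimal sketch of the point, written for a current Python 3
interpreter.  The class F and the arguments spam and eggs follow the
question; storing eggs as an attribute is just an illustrative choice:

```python
class F(float):
    def __new__(cls, spam, eggs):
        # The float value is fixed here, before __init__ ever runs.
        return float.__new__(cls, spam)

    def __init__(self, spam, eggs):
        # __init__ receives the same arguments; the value already exists.
        self.eggs = eggs

f = F(3.14, 2)
print(f, f.eggs)
print(f * 2)
```

Overriding __new__ is what lets the subclass accept two arguments;
overriding only __init__ leaves float's own __new__ to reject them.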

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Fri Sep 20 18:02:55 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 20 Sep 2002 13:02:55 -0400
Subject: [Python-Dev] Re: Re: mysterious hangs in socket code
In-Reply-To: <5.1.0.14.1.20020920100536.027f1158@204.151.72.2>
References: <20020920160008.21719.12166.Mailman@mail.python.org> <5.1.0.14.1.20020920100536.027f1158@204.151.72.2>
Message-ID: <20020920170255.GA13783@panix.com>

On Fri, Sep 20, 2002, Mats Wichmann wrote:
> Martin:
>> 
>>I think that part is already settled: getaddrinfo, on Linux, is
>>thread-safe.
> 
> The latest Posix/Single UNIX spec in fact requires getaddrinfo
> (and getnameinfo) to be thread-safe.

Thread-safe or thread-hot?  E.g., Python in the absence of releasing the
GIL is thread-safe but not thread-hot.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From ark@research.att.com  Fri Sep 20 19:54:24 2002
From: ark@research.att.com (Andrew Koenig)
Date: Fri, 20 Sep 2002 14:54:24 -0400 (EDT)
Subject: [Python-Dev] Installation fails under Solaris 2.7 or 2.8 with binutils 2.13
Message-ID: <200209201854.g8KIsO208760@europa.research.att.com>

Nick Clifton at Red Hat has been kind enough to figure out for me why
I had been unable to install Python on Solaris 2.7 or 2.8 when using
binutils 2.13.  The problem turns out to be a change in default
options with 2.13 that affects dynamic linking.

I'm mentioning the problem here in the hope that someone will pick
up the patch I've put in with the bug report on Sourceforge
(http://sf.net/tracker/?func=detail&aid=596422&group_id=5470&atid=105470)
and include it as part of the next Python release.


From guido@python.org  Fri Sep 20 22:26:30 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 20 Sep 2002 17:26:30 -0400
Subject: [Python-Dev] ATTENTION!  Releasing Python 2.2.2 in a few weeks
Message-ID: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>

I'd like to release something called Python 2.2.2 in a few weeks (say,
around Oct 8; I like Tuesday release dates).

PythonLabs has no time to do a thorough search for all backport/bugfix
candidates in the trunk; if you want to help, the best thing you can
do is take your favorite set of modules or core files and
systematically backport anything that's clearly a bugfix and backports
easily.  Or you could simply make sure that your favorite bugfix is
backported.

I know Laura Creighton volunteered to help on behalf of the PBF, but I
don't know how long she'll take, and she can surely use help.  OTOH,
if nobody has time, I think it's fine to release what we have in CVS
on the 2.2 maintenance branch (the branch is named release22-maint).

Why release now?  It's been a while!  It'll be almost 6 months since
2.2.1 was released.  There have been a few important bugfixes (e.g. a
crash with ExtensionClasses on Solaris) that have bugged real-world
users.

Why release what we've got?  Frankly, I expect that nobody has the
time to backport everything that could reasonably be backported, so if
we wait for that to happen, we'll never have another release.  What
we've got is definitely a lot better than 2.2.1.

What about Python-in-a-tie?  Maybe Laura can shed light on the PBF's
schedule for that; I expect it'll be much longer in the making than
the planned 2.2.2 release.

What about Python 2.3?  Alpha by the end of 2002 is the best I can
promise.

What can you do?  Here's a brief treatise on backporting bugs that I
sent to Laura earlier:

    Basically, someone does the tedious part of triage, which means
    going over *every* 2.3 checkin message (with quick access to the
    corresponding diffs) and sorting them into:

    - already applied

    - trivial reject (e.g. new feature or fix for a bug introduced in
      2.3)

    - trivial accept (pure bugfix that applies cleanly to 2.2)

    - messy (e.g. unclear whether it's a bugfix or a feature even
      after staring at the source, bugfixes that affect binary
      compatibility, bugfixes that can only be applied with much code
      wrangling due to other changes in the code at the same place,
      etc.)

    Feel free to compile a list of "messy" ones and send it to
    python-dev.  It doesn't have to be all at once -- for big messy
    ones a separate python-dev discussion may be appropriate.
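
The four buckets above can be written down as a small decision function.
This is only a sketch under stated assumptions: the `Checkin` record and
all of its field names are hypothetical, and in practice every one of
these judgments is made by a person reading the diff, not by code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Triage(Enum):
    ALREADY_APPLIED = "already applied"
    TRIVIAL_REJECT = "trivial reject"
    TRIVIAL_ACCEPT = "trivial accept"
    MESSY = "messy"


@dataclass
class Checkin:
    # Hypothetical record of one trunk checkin, filled in by a reviewer.
    message: str
    already_on_branch: bool
    is_bugfix: Optional[bool]           # None: unclear even after reading the source
    introduced_in_trunk: bool = False   # fixes a bug that only exists in 2.3
    applies_cleanly: bool = True
    affects_binary_compat: bool = False


def triage(c: Checkin) -> Triage:
    """Sort one checkin into the buckets described in the message."""
    if c.already_on_branch:
        return Triage.ALREADY_APPLIED
    if c.is_bugfix is None:
        return Triage.MESSY
    if not c.is_bugfix or c.introduced_in_trunk:
        return Triage.TRIVIAL_REJECT
    if c.affects_binary_compat or not c.applies_cleanly:
        return Triage.MESSY
    return Triage.TRIVIAL_ACCEPT
```

Only the checkins that come out as `Triage.MESSY` would need to go to
python-dev for discussion.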

I think it's best not to use the SF trackers to suggest bugs to be
backported -- this would merely be confusing, and it's a pretty heavy
communication mechanism.  If you want to help but don't have checkin
permission, find someone who does and work with them -- or we can give
you checkin permission (depending on your reputation).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mclay@nist.gov  Fri Sep 20 23:20:47 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 20 Sep 2002 18:20:47 -0400
Subject: [Python-Dev] ATTENTION!  Releasing Python 2.2.2 in a few weeks
In-Reply-To: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209201820.47302.mclay@nist.gov>

On Friday 20 September 2002 05:26 pm, Guido van Rossum wrote:
>
> I think it's best not to use the SF trackers to suggest bugs to be
> backported -- this would merely be confusing, and it's a pretty heavy
> communication mechanism.  If you want to help but don't have checkin
> permission, find someone who does and work with them -- or we can give
> you checkin permission (depending on your reputation).

Perhaps Wiki pages would be a good mechanism for collaboration on the 
classification of patches. Create one page for each classification type and 
then use the patch names and title as section titles within the page.  



From guido@python.org  Fri Sep 20 23:36:25 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 20 Sep 2002 18:36:25 -0400
Subject: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Your message of "Fri, 20 Sep 2002 18:20:47 EDT."
 <200209201820.47302.mclay@nist.gov>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <200209201820.47302.mclay@nist.gov>
Message-ID: <200209202236.g8KMaP124837@pcp02138704pcs.reston01.va.comcast.net>

> > I think it's best not to use the SF trackers to suggest bugs to be
> > backported -- this would merely be confusing, and it's a pretty heavy
> > communication mechanism.  If you want to help but don't have checkin
> > permission, find someone who does and work with them -- or we can give
> > you checkin permission (depending on your reputation).
> 
> Perhaps Wiki pages would be a good mechanism for collaboration on the 
> classification of patches. Create one page for each classification type and 
> then use the patch names and title as section titles within the page.  

Excellent!  Let's start a 2.2.2 wiki as soon as there's anything to
discuss.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Fri Sep 20 23:38:18 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 21 Sep 2002 00:38:18 +0200
Subject: [Python-Dev] Re: Re: mysterious hangs in socket code
In-Reply-To: <5.1.0.14.1.20020920100536.027f1158@204.151.72.2>
References: <5.1.0.14.1.20020920100536.027f1158@204.151.72.2>
Message-ID: <m3hegkgucl.fsf@mira.informatik.hu-berlin.de>

Mats Wichmann <mats@laplaza.org> writes:

>  >I think that part is already settled: getaddrinfo, on Linux, is
>  >thread-safe.
> 
> The latest Posix/Single UNIX spec in fact require getaddrinfo
> (and getnameinfo) to be thread-safe.

Yes, but this doesn't stop *BSD from providing a getaddrinfo
implementation that is not thread-safe - they merely document that
limitation on the man page.

> Gives me a headache, too, especially once it vectors off into
> the glibc nss code.  These routines (__gethostbyname2_r is the
> likely suspect, __gethostbyaddr_r and __getservbyname_r might
> also get called) do have paths that could twiddle a glibc internal
> lock so it's not impossible there's an issue here, although at
> something less than a term-paper-depth look it doesn't SEEM like
> it should ever be able to block for very long.

I'd recommend using strace for further analysis, perhaps with its
-r option.

Regards,
Martin



From skip@pobox.com  Fri Sep 20 23:52:13 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 20 Sep 2002 17:52:13 -0500
Subject: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <200209202236.g8KMaP124837@pcp02138704pcs.reston01.va.comcast.net>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <200209201820.47302.mclay@nist.gov>
 <200209202236.g8KMaP124837@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15755.42653.874623.68250@12-248-11-90.client.attbi.com>

    >> Perhaps Wiki pages would be a good mechanism for collaboration on the
    >> classification of patches. Create one page for each classification
    >> type and then use the patch names and title as section titles within
    >> the page.

    Guido> Excellent!  Let's start a 2.2.2 wiki as soon as there's anything to
    Guido> discuss.

It will only take me a couple minutes to create a wiki on the mojam.com
server.  Michael can be the first editor. ;-)

Skip



From exarkun@meson.dyndns.org  Sat Sep 21 06:50:11 2002
From: exarkun@meson.dyndns.org (Jp Calderone)
Date: Sat, 21 Sep 2002 01:50:11 -0400
Subject: [Python-Dev] Built-in functions, kw args
Message-ID: <20020921055011.GA1555@meson.dyndns.org>

>>> ''.split(maxsplit=10)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: split() takes no keyword arguments

/usr/src/Python-2.2.1/Objects$ egrep "PyArg_ParseTuple[^A]" *.c | wc -l
     73
/usr/src/Python-2.2.1/Objects$ egrep PyArg_ParseTupleAndKeywords *.c | wc -l
     16

  It looks like the usage of ParseTuple vs ParseTupleAndKeywords is just
whatever the author felt like using at the time (to me, anyway).  For the
sake of consistency at least (and convenience to boot), might it be nice to
use PyArg_ParseTupleAndKeywords in more places -- I hesitate to say
everywhere, but at least in the places it makes sense?  Is there a reason
not to do this?  Would a patch be accepted that made it so?

  Jp

--
 1:00am up 123 days, 1:54, 3 users, load average: 0.47, 0.54, 0.53


From martin@v.loewis.de  Sat Sep 21 11:37:04 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 21 Sep 2002 12:37:04 +0200
Subject: [Python-Dev] Built-in functions, kw args
In-Reply-To: <20020921055011.GA1555@meson.dyndns.org>
References: <20020921055011.GA1555@meson.dyndns.org>
Message-ID: <m365wzljcf.fsf@mira.informatik.hu-berlin.de>

Jp Calderone <exarkun@meson.dyndns.org> writes:

>   It looks like the usage of ParseTuple vs ParseTupleAndKeywords is just
> whatever the author felt like using at the time (to me, anyway).  For the
> sake of consistency at least (and convenience to boot), might it be nice to
> use PyArg_ParseTupleAndKeywords in more places -- I hesitate to say
> everywhere, but at least in the places it makes sense?  Is there a reason
> not to do this?  Would a patch be accepted that made it so?

I would require more precise criteria than "in the places it makes
sense".

For example, if the documentation suggests that some operation has a
keyword argument, this could be used as an indication that the
implementation should follow. Notice that the parameter names get cast
into stone that way.
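
The "cast into stone" point can be illustrated in pure Python.  This is
a sketch, not the C-level change under discussion: the three functions
are hypothetical stand-ins, and the `/` marker for positional-only
parameters assumes Python 3.8 or later.

```python
def split_v1(text, sep=None, maxsplit=-1):
    # Keyword-capable: callers may write split_v1(s, sep=" ").
    return text.split(sep, maxsplit)


def split_v2(text, separator=None, maxsplit=-1):
    # Same behaviour, but 'sep' was renamed to 'separator'.
    return text.split(separator, maxsplit)


def split_positional(text, sep=None, maxsplit=-1, /):
    # Positional-only: the parameter names are not part of the interface.
    return text.split(sep, maxsplit)


def accepts(func, *args, **kwargs):
    """Return True if the call succeeds, False if it raises TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return False
    return True


print(accepts(split_v1, "a b c", sep=" "))          # call by keyword
print(accepts(split_v2, "a b c", sep=" "))          # same call after the rename
print(accepts(split_positional, "a b c", " "))      # positional call
print(accepts(split_positional, "a b c", sep=" "))  # keyword call
```

Once `sep` is accepted as a keyword, renaming it breaks every caller
that spelled it out, which is why the names need to be chosen with care
before they are exposed.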

Regards,
Martin



From guido@python.org  Sat Sep 21 12:37:47 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 21 Sep 2002 07:37:47 -0400
Subject: [Python-Dev] Built-in functions, kw args
In-Reply-To: Your message of "Sat, 21 Sep 2002 01:50:11 EDT."
 <20020921055011.GA1555@meson.dyndns.org>
References: <20020921055011.GA1555@meson.dyndns.org>
Message-ID: <200209211137.g8LBbm225981@pcp02138704pcs.reston01.va.comcast.net>

> It looks like the usage of ParseTuple vs ParseTupleAndKeywords is
> just whatever the author felt like using at the time (to me,
> anyway).

Almost.  Historically, ParseTupleAndKeywords didn't exist for many
years.  Also, it's much more painful to use.  And it's slower.  It's
likely that the few occurrences you found were almost all the new
class constructors, which pretty much require it.

> For the sake of consistency at least (and convenience to boot),
> might it be nice to use PyArg_ParseTupleAndKeywords in more places
> -- I hesitate to say everywhere, but at least in the places it makes
> sense?  Is there a reason not to do this?  Would a patch be accepted
> that made it so?

It would be a *humungous* patch, and it would take forever to verify
that it was 100% correct.  And you haven't even looked in the Modules
directory.

Another issue is to decide on the argument names.

IOW I'm hoping you'll forget it. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Sat Sep 21 15:50:22 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 21 Sep 2002 09:50:22 -0500
Subject: [Python-Dev] Python for OpenVMS
Message-ID: <15756.34606.540052.330650@12-248-11-90.client.attbi.com>

FYI, Jean-François Piéronne contacted me a few days ago with news that he has
2.1.3 running under OpenVMS (I updated question 7.4 of the FAQ to refer to
his work).  He now has the CVS tree and is working on a port of 2.3.

Skip


From skip@manatee.mojam.com  Sun Sep 22 13:00:20 2002
From: skip@manatee.mojam.com (Skip Montanaro)
Date: Sun, 22 Sep 2002 07:00:20 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200209221200.g8MC0KYQ021465@manatee.mojam.com>

Bug/Patch Summary
-----------------

283 open / 2867 total bugs (no change)
106 open / 1698 total patches (-2)

New Bugs
--------

Mac IDE Browser / ListManager issue (2002-09-16)
	http://python.org/sf/610149
unicode alphanumeric regexp bug (2002-09-16)
	http://python.org/sf/610299
SO name is too short!  Python 2.2.1 (2002-09-16)
	http://python.org/sf/610332
linuxaudiodev not documented (2002-09-17)
	http://python.org/sf/610401
mhlib does not obey MHCONTEXT env var (2002-09-17)
	http://python.org/sf/610556
Numpy doesn't build for Python.framework (2002-09-17)
	http://python.org/sf/610730
Lone surrogates cause bad .pyc files (2002-09-17)
	http://python.org/sf/610783
SMTP.login() uses invalid base64 enc. (2002-09-18)
	http://python.org/sf/611052
Cannot compile escaped unicode character (2002-09-20)
	http://python.org/sf/612074
2 bugs in turtle.py (2002-09-21)
	http://python.org/sf/612595

New Patches
-----------

Updated .spec file for 2.2 series. (2002-09-18)
	http://python.org/sf/611191
select problems on Windows (2002-09-19)
	http://python.org/sf/611464
zipfile.py reads archives with comments (2002-09-19)
	http://python.org/sf/611760
quietly select between 'less' and 'more' (2002-09-20)
	http://python.org/sf/612111
"Bare" text tag_configure in Tkinter (2002-09-21)
	http://python.org/sf/612602
Allow more Unicode on sys.stdout (2002-09-21)
	http://python.org/sf/612627

Closed Bugs
-----------

bugs in Tix.py ListNoteBook  PanedWindow (2001-11-23)
	http://python.org/sf/484994
multifile different in 2.2 from 2.1.1 (2002-02-07)
	http://python.org/sf/514676
Sig11 in cPickle (stack overflow) (2002-07-01)
	http://python.org/sf/576084
Empty genindex.html pages (2002-07-26)
	http://python.org/sf/586926
defining away __attribute__ is not good (2002-09-08)
	http://python.org/sf/606493
Problems in IDLE Browsers & Viewers (2002-09-12)
	http://python.org/sf/608595
test_b1.py, disabling of list test (2002-09-13)
	http://python.org/sf/609041
cPickle.BadPickleGet is a string (2002-09-13)
	http://python.org/sf/609164

Closed Patches
--------------

Reimplementation of multifile.py (2001-04-04)
	http://python.org/sf/413766
MSVC Preprocessor (2001-07-15)
	http://python.org/sf/441528
Setup and distutils changes. (2001-08-21)
	http://python.org/sf/454041
Extension to Calltips / Show attributes (2002-03-03)
	http://python.org/sf/525109
PEP 4 update: deprecations (2002-03-18)
	http://python.org/sf/531491
Support PyChecker in IDLE (2002-04-03)
	http://python.org/sf/539043
Mac OS X keydefs (2002-09-07)
	http://python.org/sf/606132
configure on Irix (sockets, posix) (2002-09-13)
	http://python.org/sf/608999


From lac@strakt.com  Mon Sep 23 11:01:20 2002
From: lac@strakt.com (Laura Creighton)
Date: Mon, 23 Sep 2002 12:01:20 +0200
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Fri, 20 Sep 2002 17:26:30 EDT." <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>

> I'd like to release something called Python 2.2.2 in a few weeks (say,
> around Oct 8; I like Tuesday release dates).
> 
> PythonLabs has no time to do a thorough search for all backport/bugfix
> candidates in the trunk; if you want to help, the best thing you can
> do is take your favorite set of modules or core files and
> systematically backport anything that's clearly a bugfix and backports
> easily.  Or you could simply make sure that your favorite bugfix is
> backported.

Ah, before you start systematically backporting, make sure you
_announce_ that you are about to do this backporting.  Otherwise you
will find that your favourite modules are also somebody else's
favourite modules and you have both wasted time and made the eventual
merge job harder.  This is particularly true of people who need to
make an idiom change throughout the entire code base -- they will need
to modify files which aren't of any particular interest to you.  If
somebody is in the middle of working on those files in particular when
you come through with your idiom change it is very easy for them to
overwrite your change, either because it happened in a different part of
the file, one that does not have their attention, or because
they did not notice that, in addition to the massive refactoring
which they got right as part of a backport, they also have to change
an idiom.  This sort of thing drives release managers nuts.

Note: 'I mentioned it on the wiki someplace' is not good enough.  Busy
people want to get their bugfixes out and in quickly, not participate
in a community or create content.  Thus the loose structure of a wiki,
which is its strength in community building and in building
broad-based participation becomes its downfall when you actually want
to quickly know what you have to do, get in, and then get out.  A page
where one lists such things seems a fine compromise, as long as everybody
is aware that some people won't update, or won't update properly, anyway.

I've been reading the 2.2 cvs over the last week, trying to get my
brain around which changes go with which bugs and which features.  I
have found some patches where what I think happened is that in
addition to adding some feature, while somebody was there they
decided to fix a little unrelated ugliness at the same time.  Now
the task is deciding if that ugliness is also a bug.

> I know Laura Creighton volunteered to help on behalf of the PBF, but I
> don't know how long she'll take, and she can surely use help.  OTOH,
> if nobody has time, I think it's fine to release what we have in CVS
> on the 2.2 maintenance branch (the branch is named release22-maint).

I certainly can use help.  

But given your current plan to release in a few weeks, I wonder if my
task might be better changed from tracing all the changes from the
last release to 2.2 maintenance, to starting from 2.2 maint and
seeing if there is something we _don't_ want in, which needs removing.
I still don't have a good enough perspective to judge this.

> 
> Why release now?  It's been a while!  It'll be almost 6 months since
> 2.2.1 was released.  There have been a few important bugfixes (e.g. a
> crash with ExtensionClasses on Solaris) that have bugged real-world
> users.

Ah, that doesn't exactly explain why now -- that is in a few weeks --
rather than in one month's time, or even now, that is this morning.
The only problem I see on my end is if I decide to proceed starting
with 2.2.2 as the base release for Python-in-a-Tie and then swarms
of people show up saying, well, actually I was half way through something
when 2.2.2 came out, so for PIT you have to either remove <all of this>
stuff or add <all of that>.  Is now a quiet time, or do people expect
a lot of that to occur?  There is nothing like announcing an impending
release to get a lot of code out from the woodwork -- so I guess we
will find out.

> 
> Why release what we've got?  Frankly, I expect that nobody has the
> time to backport everything that could reasonably be backported, so if
> we wait for that to happen, we'll never have another release.  What
> we've got is definitely a lot better than 2.2.1.
> 
> What about Python-in-a-tie?  Maybe Laura can shed light on the PBF's
> schedule for that; I expect it'll be much longer in the making than
> the planned 2.2.2 release.

It has to be, by definition.  We need the Python for people to test their
extension modules against before we can package up the extension modules.

> 
> What about Python 2.3?  Alpha by the end of 2002 is the best I can
> promise.
> 
> What can you do?  Here's a brief treatise on backporting bugs that I
> sent to Laura earlier:
> 
>     Basically, someone does the tedious part of triage, which means
>     going over *every* 2.3 checkin message (with quick access to the
>     corresponding diffs) and sorting them into:
> 
>     - already applied
> 
>     - trivial reject (e.g. new feature or fix for a bug introduced in
>       2.3)
> 
>     - trivial accept (pure bugfix that applies cleanly to 2.2)
> 
>     - messy (e.g. unclear whether it's a bugfix or a feature even
>       after staring at the source, bugfixes that affect binary
>       compatibility, bugfixes that can only be applied with much code
>       wrangling due to other changes in the code at the same place,
>       etc.)
> 
>     Feel free to compile a list of "messy" ones and send it to
>     python-dev.  It doesn't have to be all at once -- for big messy
>     ones a separate python-dev discussion may be appropriate.
> 
> I think it's best not to use the SF trackers to suggest bugs to be
> backported -- this would merely be confusing, and it's a pretty heavy
> communication mechanism.  If you want to help but don't have checkin
> permission, find someone who does and work with them -- or we can give
> you checkin permission (depending on your reputation).
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)


You also pointed me at Tools/scripts/logmerge.py , which I thought
I would mention in case anybody reading here isn't familiar with it.
It sorts the messages produced by cvs log by date and time, rather
than by file.  Really useful.  Thank you.
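
For anyone who hasn't seen it, the idea is roughly the following.  This
is a sketch of the reordering only, not the code in
Tools/scripts/logmerge.py: the input structure, the `merge_logs` name,
the "YYYY/MM/DD HH:MM:SS" date format and the newest-first order are all
assumptions made for the illustration, and the sample entries are
invented.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# (timestamp, file name, log message)
Entry = Tuple[datetime, str, str]


def merge_logs(per_file: Dict[str, List[Tuple[str, str]]]) -> List[Entry]:
    """Flatten per-file revision logs into one list ordered by date."""
    merged: List[Entry] = []
    for filename, revisions in per_file.items():
        for stamp, message in revisions:
            when = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
            merged.append((when, filename, message))
    # Newest first, so related checkins to different files end up adjacent.
    merged.sort(key=lambda entry: entry[0], reverse=True)
    return merged


# Invented sample input: file name -> [(date string, log message), ...]
sample = {
    "Objects/a.c": [("2002/09/20 10:00:00", "fix crash"),
                    ("2002/09/01 09:00:00", "add feature")],
    "Lib/b.py": [("2002/09/20 10:00:05", "fix crash (test)")],
}
for when, filename, message in merge_logs(sample):
    print(when.isoformat(" "), filename, message)
```

Read this way, a checkin that touched several files shows up as one run
of adjacent entries instead of being scattered across per-file logs.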

Laura Creighton


From guido@python.org  Mon Sep 23 13:07:08 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 08:07:08 -0400
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Your message of "Mon, 23 Sep 2002 12:01:20 +0200."
 <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>
Message-ID: <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>

[Laura]
> Ah, before you start systematically backporting, make sure you
> _announce_ that you are about to do this backporting.  Otherwise you
> will find that your favourite modules are also somebody else's
> favourite modules and you have both wasted time and made the eventual
> merge job harder.  This is particularly true of people who need to
> make an idiom change throughout the entire code base -- they will need
> to modify files which aren't of any particular interest to you.  If
> somebody is in the middle of working on those files in particular when
> you come through with your idiom change it is very easy for them to
> overwrite your change, either because it happened in a different part of
> the file, one that does not have their attention, or because
> they did not notice that, in addition to the massive refactoring
> which they got right as part of a backport, they also have to change
> an idiom.  This sort of thing drives release managers nuts.

Yes, but for 2.2.2 (and in general for maintenance branches) I would
never try to do any global idiom changes.  Those are only for the
trunk (and even then I usually frown upon them -- in our experience
here, these usually introduce a few bugs in rarely-used code that
don't get found out until 6-12 months later).

> Note: 'I mentioned it on the wiki someplace' is not good enough.  Busy
> people want to get their bugfixes out and in quickly, not participate
> in a community or create content.  Thus the loose structure of a wiki,
> which is its strength in community building and in building
> broad-based participation becomes its downfall when you actually want
> to quickly know what you have to do, get in, and then get out.  A page
> where one lists such things seems a fine compromise, as long as everybody
> is aware that some people won't update, or won't update properly, anyway.

OTOH I don't want this to generate tons of messages to python-dev with
little more content than announcements that C is going to look at
module Y.

> I've been reading the 2.2 cvs over the last week, trying to get my
> brain around which changes go with which bugs and which features.  I
> have found some patches where what I think happened is that in
> addition to adding some feature, while somebody was there they
> decided to fix a little unrelated ugliness at the same time.  Now
> the task is deciding if that ugliness is also a bug.

And then you still need to decide whether you want that bug fixed in
2.2.  How thoroughly fixed does 2.2 need to be?

> > I know Laura Creighton volunteered to help on behalf of the PBF, but I
> > don't know how long she'll take, and she can surely use help.  OTOH,
> > if nobody has time, I think it's fine to release what we have in CVS
> > on the 2.2 maintenance branch (the branch is named release22-maint).
> 
> I certainly can use help.  
> 
> But given your current plan to release in a few weeks, I wonder if my
> task might be better changed from tracing all the changes from the
> last release to 2.2 maintenance, to starting from 2.2 maint and
> seeing if there is something we _don't_ want in, which needs removing.
> I still don't have a good enough perspective to judge this.

I very much doubt that anything would have crept into 2.2 cvs that we
don't want.  Or are you talking from the PBF POV and could they be
more conservative for Py-tie than we've been with 2.2.2?

> > Why release now?  It's been a while!  It'll be almost 6 months since
> > 2.2.1 was released.  There have been a few important bugfixes (e.g. a
> > crash with ExtensionClasses on Solaris) that have bugged real-world
> > users.
> 
> Ah, that doesn't exactly explain why now -- that is in a few weeks --
> rather than in one month's time, or even now, that is this morning.

I meant "why stop procrastinating". :-)

The ~two-week period was chosen to give people enough notice but not
be so far in the future that procrastinators will say to themselves
"I'll think about it closer to the release."  Two weeks seems just
about right based on my experience in this group.

> The only problem I see on my end is if I decide to proceed starting
> with 2.2.2 as the base release for Python-in-a-Tie and then swarms
> of people show up saying, well, actually I was half way through something
> when 2.2.2 came out, so for PIT you have to either remove <all of this>
> stuff or add <all of that>.  Is now a quiet time, or do people expect
> a lot of that to occur?  There is nothing like announcing an impending
> release to get a lot of code out from the woodwork -- so I guess we
> will find out.

Halfway through with what?  I would expect that checkins would be
complete sets.  And if someone just *has* to check in stuff that
requires some more work, they should let us know so we can hold up the
release for them or make it a priority to fix it (by backing out or
finishing those changes).  Given the nature of most changes that need
to be backported this is unlikely -- almost all of them are small
fixes to one file.

> > Why release what we've got?  Frankly, I expect that nobody has the
> > time to backport everything that could reasonably be backported, so if
> > we wait for that to happen, we'll never have another release.  What
> > we've got is definitely a lot better than 2.2.1.
> > 
> > What about Python-in-a-tie?  Maybe Laura can shed light on the PBF's
> > schedule for that; I expect it'll be much longer in the making than
> > the planned 2.2.2 release.
> 
> It has to be, by definition.  We need the Python for people to test
> their extension modules against before we can package up the
> extension modules.

Can you tell us more here about the Py-tie plans?  I know nothing
about it except that it'll be based on Python 2.2; I think it would be
helpful for the developer community to know what the long-term Py-tie
plans are.

> > What about Python 2.3?  Alpha by the end of 2002 is the best I can
> > promise.
> > 
> > What can you do?  Here's a brief treatise on backporting bugs that I
> > sent to Laura earlier:
> > 
> >     Basically, someone does the tedious part of triage, which means
> >     going over *every* 2.3 checkin message (with quick access to the
> >     corresponding diffs) and sorting them into:
> > 
> >     - already applied
> > 
> >     - trivial reject (e.g. new feature or fix for a bug introduced in
> >       2.3)
> > 
> >     - trivial accept (pure bugfix that applies cleanly to 2.2)
> > 
> >     - messy (e.g. unclear whether it's a bugfix or a feature even
> >       after staring at the source, bugfixes that affect binary
> >       compatibility, bugfixes that can only be applied with much code
> >       wrangling due to other changes in the code at the same place,
> >       etc.)
> > 
> >     Feel free to compile a list of "messy" ones and send it to
> >     python-dev.  It doesn't have to be all at once -- for big messy
> >     ones a separate python-dev discussion may be appropriate.
> > 
> > I think it's best not to use the SF trackers to suggest bugs to be
> > backported -- this would merely be confusing, and it's a pretty heavy
> > communication mechanism.  If you want to help but don't have checkin
> > permission, find someone who does and work with them -- or we can give
> > you checkin permission (depending on your reputation).
> 
> You also pointed me at Tools/scripts/logmerge.py , which I thought
> I would mention in case anybody reading here isn't familiar with it.
> It sorts the messages produced by cvs log by date and time, rather
> than by file.  Really useful.  Thank you.

You're welcome.  (At times I've wanted an addition to logmerge that
would restrict it to a certain branch; but I've not wanted it enough
to implement it.  I think you'd have to mine the "tags" output from
cvs log for each file to know the branch point and then act
accordingly for the revisions of that file.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Mon Sep 23 14:00:19 2002
From: mwh@python.net (Michael Hudson)
Date: 23 Sep 2002 14:00:19 +0100
Subject: [Python-Dev] ATTENTION!  Releasing Python 2.2.2 in a few weeks
In-Reply-To: Guido van Rossum's message of "Fri, 20 Sep 2002 17:26:30 -0400"
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2m1y7kyi70.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> I'd like to release something called Python 2.2.2 in a few weeks (say,
> around Oct 8; I like Tuesday release dates).

Cool.

I have a mailbox containing about 50 checkins that I thought deserved
backporting at some point.  I'll try to grind through that by the end
of the week (i.e. by 28/9).

There's no way I'm doing the "pore over logs" duty this time.

Cheers,
M.

-- 
[1] If you're lost in the woods, just bury some fibre in the ground
    carrying data. Fairly soon a JCB will be along to cut it for you
    - follow the JCB back to civilisation/hitch a lift.
                                               -- Simon Burr, cam.misc


From david.abrahams@rcn.com  Mon Sep 23 13:54:05 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Mon, 23 Sep 2002 08:54:05 -0400
Subject: [Python-Dev] ATTENTION!  Releasing Python 2.2.2 in a few weeks
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net> <2m1y7kyi70.fsf@starship.python.net>
Message-ID: <0a0201c26300$924fd350$6701a8c0@boostconsulting.com>


> Guido van Rossum <guido@python.org> writes:
>
> > I'd like to release something called Python 2.2.2 in a few weeks (say,
> > around Oct 8; I like Tuesday release dates).


I've been planning to release Boost.Python v2 around the same time. Is
there any chance we can coordinate this so that we Boost.Python people can
test against all of the backported changes before either of these products
"goes final"?

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com



From guido@python.org  Mon Sep 23 14:19:29 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 09:19:29 -0400
Subject: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Your message of "Mon, 23 Sep 2002 14:00:19 BST."
 <2m1y7kyi70.fsf@starship.python.net>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <2m1y7kyi70.fsf@starship.python.net>
Message-ID: <200209231319.g8NDJTs06610@pcp02138704pcs.reston01.va.comcast.net>

> I have a mailbox containing about 50 checkins that I thought deserved
> backporting at some point.  I'll try to grind through that by the end
> of the week (i.e. by 28/9).

Great!  If you run out of time, you can mail me that mailbox.

> There's no way I'm doing the "pore over logs" duty this time.

And nobody expects you to. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Sep 23 14:22:14 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 09:22:14 -0400
Subject: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Your message of "Mon, 23 Sep 2002 08:54:05 EDT."
 <0a0201c26300$924fd350$6701a8c0@boostconsulting.com>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net> <2m1y7kyi70.fsf@starship.python.net>
 <0a0201c26300$924fd350$6701a8c0@boostconsulting.com>
Message-ID: <200209231322.g8NDMGX06641@pcp02138704pcs.reston01.va.comcast.net>

> > > I'd like to release something called Python 2.2.2 in a few weeks (say,
> > > around Oct 8; I like Tuesday release dates).
> 
> I've been planning to release Boost.Python v2 around the same
> time. Is there any chance we can coordinate this so that we
> Boost.Python people can test against all of the backported changes
> before either of these products "goes final"?

If you check out the release22-maint branch of Python from CVS and
subscribe to the python-checkins list
(http://mail.python.org/mailman/listinfo/python-checkins) you should
be able to track the work leading up to 2.2.2 pretty closely.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ark@research.att.com  Mon Sep 23 14:50:49 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 23 Sep 2002 09:50:49 -0400 (EDT)
Subject: [Python-Dev] -zcombreloc
Message-ID: <200209231350.g8NDonx28099@europa.research.att.com>

I now believe, after discussion with the binutils developers,
that the problem is that -zcombreloc just plain doesn't
work in any release of binutils.  One implication of this fact
is that turning on -znocombreloc when building python is insufficient
to make it work if you built gcc with -zcombreloc, because that build
will write dynamic libraries that the Sun loader cannot handle no
matter what.

So I think you are right -- the correct fix is to warn people that
they should not use binutils 2.13 on Solaris, period.  I believe that
they will fix the problem in 2.13.1 and will let you know.

Meanwhile, might I suggest including the following test somewhere
in the build procedure?  If it fails, I believe it will not be
possible to build Python successfully, so one might as well find
out about it early....

If you run this and the output includes "core dumped", it failed :-)

-------------------------cut here----------------------
#! /bin/sh

mkdir /tmp/t.$$  || exit 3
cd /tmp/t.$$     || exit 3

cat >main.c <<'EOF'
#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    void *handle, *sym;
    char *error;

    puts("calling dlopen");
    handle = dlopen("./dyn.so", RTLD_NOW);
    if (!handle) {
        printf("%s\n", dlerror());
        return 1;
    }

    puts("calling dlsym");
    sym = dlsym(handle, "sym");
    if ((error = dlerror()) != 0) {
        printf("%s\n", error);
        return 1;
    }
    puts("calling sym");
    ((void (*)(void))sym)();
    puts("done");
    return 0;
}
EOF

cat >dyn.c <<'EOF'
#include <stdio.h>
void sym(void)
{
    puts("in sym");
}
EOF

[ -n "$SHFLAGS" ] || SHFLAGS="-fPIC -shared"
[ -n "$CC" ]  || CC=gcc

set -x

$CC $CFLAGS $SHFLAGS dyn.c -o dyn.so
$CC $CFLAGS main.c -o main -ldl

./main || exit $?

cd /tmp
rm -rf t.$$
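
A build step could also detect the crash directly instead of scanning the
output for "core dumped". What follows is only a sketch, assuming a POSIX
system and a modern Python with the subprocess module (which postdates this
thread); the helper name crashed() is hypothetical and not part of any
build procedure mentioned here.

```python
import subprocess


def crashed(argv):
    """Return True if the command was killed by a signal (e.g. SIGSEGV)."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # On POSIX, a negative return code -N means the child died from signal N.
    return proc.returncode < 0


# Hypothetical usage, where ./main is the binary built by the script above:
#     if crashed(["./main"]): print("dynamic loading is broken")
```

Checking the exit status avoids depending on the exact wording the shell
uses when it reports the crash.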


From guido@python.org  Mon Sep 23 15:03:42 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 10:03:42 -0400
Subject: [Python-Dev] -zcombreloc
In-Reply-To: Your message of "Mon, 23 Sep 2002 09:50:49 EDT."
 <200209231350.g8NDonx28099@europa.research.att.com>
References: <200209231350.g8NDonx28099@europa.research.att.com>
Message-ID: <200209231403.g8NE3hh06996@pcp02138704pcs.reston01.va.comcast.net>

> Meanwhile, might I suggest including the following test somewhere
> in the build procedure?  If it fails, I believe it will not be
> possible to build Python successfully, so one might as well find
> out about it early....
> 
> If you run this and the output includes "core dumped", it failed :-)

I hope someone does this.  In the mean time, I've added a warning
about binutils 2.13 to the README file (and also to the 2.2.2 branch
README file).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From lac@strakt.com  Mon Sep 23 15:42:34 2002
From: lac@strakt.com (Laura Creighton)
Date: Mon, 23 Sep 2002 16:42:34 +0200
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Mon, 23 Sep 2002 08:07:08 EDT." <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net> <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>  <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>

> > 
> > But given your current plan to release in a few weeks, I wonder if my
> > task might be better changed from tracing all the changes from the
> > last release to 2.2 maintenance, to starting from 2.2 maint and
> > seeing if there is something we _don't_ want in, which needs removing.
> > I still don't have a good enough perspective to judge this.
> 
> I very much doubt that anything would have crept into 2.2 cvs that we
> don't want.  Or are you talking from the PBF POV and could they be
> more conservative for Py-tie than we've been with 2.2.2?

I sure don't think so.   But it looks now to me as if the new plan of
attack for making a PyTie release is to start with 2.2.2 and then see
what absolutely needs to be added to that, as opposed to my old approach,
which was to trace from 2.2.1 until 2.3 labelling things that _shouldn't_
be included.  Does this make sense, or was my old plan of attack better?

> Can you tell us more here about the Py-tie plans?  I know nothing
> about it except that it'll be based on Python 2.2; I think it would be
> helpful for the developer community to know what the long-term Py-tie
> plans are.

If there is consensus on making Py-Tie out of 2.2.2 plus a list
of things that have to be added/fixed, then the thing to do is to
start the Snake Farm testing 2.2maints.  Then we need to get a list
of what software that isn't part of the standard library should be
included in PyTie.  Then we have to test that against the PyTie candidate.
We're working on a way to add that to the snakefarm builds now.  The
long term plans are to fix serious bugs in the release if they should be
discovered, not only in Python but in any third party modules.  Also
we are working on how to license the whole thing, given that every
extra bit has its own particular license.  

Laura


From guido@python.org  Mon Sep 23 16:07:35 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 11:07:35 -0400
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Your message of "Mon, 23 Sep 2002 16:42:34 +0200."
 <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net> <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com> <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>
 <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>
Message-ID: <200209231507.g8NF7ZU09833@pcp02138704pcs.reston01.va.comcast.net>

> > > But given your current plan to release in a few weeks, I wonder if my
> > > task might be better changed from tracing all the changes from the
> > > last release to 2.2 maintenance, to starting from 2.2 maint and
> > > seeing if there is something we _don't_ want in, which needs removing.
> > > I still don't have a good enough perspective to judge this.
> > 
> > I very much doubt that anything would have crept into 2.2 cvs that we
> > don't want.  Or are you talking from the PBF POV and could they be
> > more conservative for Py-tie than we've been with 2.2.2?
> 
> I sure don't think so.   But it looks now to me as if the new plan of
> attack for making a PyTie release is to start with 2.2.2 and then see
> what absolutely needs to be added to that, as opposed to my old approach,
> which was to trace from 2.2.1 until 2.3 labelling things that _shouldn't_
> be included.  Does this make sense, or was my old plan of attack better?

I would never have suggested labeling things that *shouldn't* be
included; it's better to label things that *should* be included.
Whether your criterion for inclusion is "absolutely must have" or
"would be nice" depends on how much time you have and what the PBF's
real goal is.

I would suggest that if your primary goal is stability, being
conservative is probably right; everything that's not very clearly a
pure bugfix should be frowned upon.

> > Can you tell us more here about the Py-tie plans?  I know nothing
> > about it except that it'll be based on Python 2.2; I think it would be
> > helpful for the developer community to know what the long-term Py-tie
> > plans are.
> 
> If there is consensus on making Py-Tie out of 2.2.2 plus a list
> of things that have to be added/fixed, then the thing to do is to
> start the Snake Farm testing 2.2maints.  Then we need to get a list
> of what software that isn't part of the standard library should be
> included in PyTie.  Then we have to test that against the PyTie candidate.
> We're working on a way to add that to the snakefarm builds now.  The
> long term plans are to fix serious bugs in the release if they should be
> discovered, not only in Python but in any third party modules.  Also
> we are working on how to license the whole thing, given that every
> extra bit has its own particular license.  

Thanks.  In addition, I was hoping to hear about your timeline (when
do you expect to release PyTie?) and a hint on the 3rd party packages
you're thinking of adding.  Also a list of target platforms for which
PyTie must absolutely work.  (Note e.g. that we just discovered a
problem with Solaris and the latest version of binutils (2.13), which
seems to be used by the latest GCC version (3.2 IIRC) but is also
separately downloadable.  The bug is in binutils 2.13.  Is this
*combination* (Solaris + binutils 2.13) a target platform?  If so, you
might want to use a different approach than we plan to do for Python
2.3 and 2.2.2 (which is merely to bail out if a certain test dumps
core during configuration).)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ark@research.att.com  Mon Sep 23 16:30:42 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 23 Sep 2002 11:30:42 -0400 (EDT)
Subject: [Python-Dev] binutils/solaris -- one more thing
Message-ID: <200209231530.g8NFUgu28520@europa.research.att.com>

Assuming that the binutils developers do conclude that the -zcombreloc
problem is to be fixed, not worked around (as I think they will),
there is still one more binutils-related build problem that I
encountered with Solaris:  At binutils 2.12, the output from "ld -V"
changed in a way that invalidated the previous way of testing for
the presence of dynamic linking.

Someone--I forget who--gave me a patch that solved the problem;
I believe that this patch is necessary to build Python on Solaris
with binutils 2.12 or later.  Can I ask someone to check whether
it is already part of 2.2.2?


--------
*** configure.in	2002-09-23 10:07:42.559545843 -0400
--- configure.in.new	2002-09-23 10:08:32.944415830 -0400
***************
*** 889,895 ****
  		fi;;
  	SunOS/5*) case $CC in
  		  *gcc*)
! 		    if $CC -Xlinker -V 2>&1 | grep BFD >/dev/null
  		    then
  			LINKFORSHARED="-Xlinker --export-dynamic"
  		    fi;;
--- 889,895 ----
  		fi;;
  	SunOS/5*) case $CC in
  		  *gcc*)
! 		    if $CC -Xlinker --help 2>&1 | grep export-dynamic >/dev/null
  		    then
  			LINKFORSHARED="-Xlinker --export-dynamic"
  		    fi;;



From guido@python.org  Mon Sep 23 16:33:04 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 11:33:04 -0400
Subject: [Python-Dev] binutils/solaris -- one more thing
In-Reply-To: Your message of "Mon, 23 Sep 2002 11:30:42 EDT."
 <200209231530.g8NFUgu28520@europa.research.att.com>
References: <200209231530.g8NFUgu28520@europa.research.att.com>
Message-ID: <200209231533.g8NFX4I10033@pcp02138704pcs.reston01.va.comcast.net>

> Assuming that the binutils developers do conclude that the -zcombreloc
> problem is to be fixed, not worked around (as I think they will),
> there is still one more binutils-related build problem that I
> encountered with Solaris:  At binutils 2.12, the output from "ld -V"
> changed in a way that invalidated the previous way of testing for
> the presence of dynamic linking.
> 
> Someone--I forget who--gave me a patch that solved the problem;
> I believe that this patch is necessary to build Python on Solaris
> with binutils 2.12 or later.  Can I ask someone to check whether
> it is already part of 2.2.2?
> 
> 
> --------
> *** configure.in	2002-09-23 10:07:42.559545843 -0400
> --- configure.in.new	2002-09-23 10:08:32.944415830 -0400
> ***************
> *** 889,895 ****
>   		fi;;
>   	SunOS/5*) case $CC in
>   		  *gcc*)
> ! 		    if $CC -Xlinker -V 2>&1 | grep BFD >/dev/null
>   		    then
>   			LINKFORSHARED="-Xlinker --export-dynamic"
>   		    fi;;
> --- 889,895 ----
>   		fi;;
>   	SunOS/5*) case $CC in
>   		  *gcc*)
> ! 		    if $CC -Xlinker --help 2>&1 | grep export-dynamic >/dev/null
>   		    then
>   			LINKFORSHARED="-Xlinker --export-dynamic"
>   		    fi;;

But what if this code is used with a version of binutils prior to
2.12?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From hu.peress@mail.mcgill.ca  Mon Sep 23 16:38:13 2002
From: hu.peress@mail.mcgill.ca (Hunter Peress)
Date: 23 Sep 2002 10:38:13 -0500
Subject: [Python-Dev] os.wait unweirding
In-Reply-To: <004701c25528$7c8b4530$ced241d5@hagrid>
References: <1031437860.636.29.camel@HillCountryPeress>
 <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>
 <1031442464.644.68.camel@HillCountryPeress>
 <003d01c25471$d83fe960$2fd8accf@othello>
 <1031451760.644.97.camel@HillCountryPeress>
 <004701c25528$7c8b4530$ced241d5@hagrid>
Message-ID: <1032795494.16226.478.camel@HillCountryPeress>

for i in a:
  print os.spawnv(os.P_NOWAIT,scr,["",str(i)])

for i in a:
  os.wait()

You have to do the second loop in order to wait for all children that you
spawned off. I think that os.wait() without any arguments should wait
for all children, not wait for the earliest executed child.
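
The "wait for everything" behaviour can be layered on top of os.wait()
without changing it. A minimal sketch, assuming a POSIX system and a modern
Python where a wait with no children left raises ChildProcessError (at the
time of this thread it would have been an OSError with errno ECHILD); the
name wait_all() is hypothetical.

```python
import os


def wait_all():
    """Reap every child of this process; return a list of (pid, status)."""
    results = []
    while True:
        try:
            # os.wait() blocks until any one child exits.
            results.append(os.wait())
        except ChildProcessError:
            # ECHILD: no children left to wait for.
            return results
```

This replaces the second loop without needing to know how many children
were spawned.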



On Thu, 2002-09-05 at 17:06, Fredrik Lundh wrote:
> hunter wrote:
> 
> > I need not search far.
> > example 1) pydoc os.fork
> > Python Library Documentation: built-in function fork in os
> > fork(...)
> >     fork() -> pid
> >     Fork a child process.
> >     
> >     Return 0 to child process and PID of child to parent process.
> 
> why do you care about the type of a PID object?  in most
> cases, all you need to know is that a PID isn't 0, which is
> exactly what the documentation says.
> 
> and if you know what a PID is, you already know what type
> it is...
> 
> > example2) pydoc string.index
> > Python Library Documentation: function index in string
> > index(s, *args)
> >     index(s, sub [,start [,end]]) -> int
> >     
> >     Like find but raises ValueError when the substring is not found.
> > 
> > From these two, I have no idea what BOTH the input and return
> > types are.
> 
> the index documentation refers to the documentation
> for "find", which tells you that:
> 
> >>> help(string.find)
> Help on function find in module string:
> 
> find(s, *args)
>     find(s, sub [,start [,end]]) -> in
> 
>     Return the lowest index in s where substring sub is found,
>     such that sub is contained within s[start,end].  Optional
>     arguments start and end are interpreted as in slice notation.
> 
>     Return -1 on failure.
> 
> which, given that you know how indexes and slices work in
> python, is all you need to know.
> 
> > I found those examples in 10 seconds (literally). The state of the
> > python documentation is caca.
> 
> how long have you been using Python?
> 
> </F>
> 
> 
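
The difference between find and index that the quoted docstrings describe
can be seen in a few lines. This sketch uses the string methods of a modern
Python rather than the string module functions shown above; the sample
string is made up.

```python
s = "spam and eggs"

print(s.find("eggs"))   # lowest index where the substring starts
print(s.find("bacon"))  # -1 signals "not found"

try:
    s.index("bacon")    # same search, but failure is an exception
except ValueError as exc:
    print("index raised ValueError:", exc)
```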




From ark@research.att.com  Mon Sep 23 16:40:11 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 23 Sep 2002 11:40:11 -0400 (EDT)
Subject: [Python-Dev] binutils/solaris -- one more thing
In-Reply-To: <200209231533.g8NFX4I10033@pcp02138704pcs.reston01.va.comcast.net>
 (message from Guido van Rossum on Mon, 23 Sep 2002 11:33:04 -0400)
References: <200209231530.g8NFUgu28520@europa.research.att.com> <200209231533.g8NFX4I10033@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209231540.g8NFeBU28773@europa.research.att.com>

Guido> But what if this code is used with a version of binutils prior to
Guido> 2.12?

It should still work -- it asks ld for a list of options that it
supports and looks for "export-dynamic" in the list.  If the list of
supported options doesn't contain "export-dynamic", then the build
procedure had better not supply "export-dynamic" as an option, had it?
:-)

In other words, I believe that the patch replaces a test that works
only for 2.11 and earlier with a slightly more elaborate test that
works for all versions.



From neal@metaslash.com  Mon Sep 23 16:41:13 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 23 Sep 2002 11:41:13 -0400
Subject: [Python-Dev] binutils/solaris -- one more thing
References: <200209231530.g8NFUgu28520@europa.research.att.com> <200209231533.g8NFX4I10033@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D8F3619.19FB7B80@metaslash.com>

Guido van Rossum wrote:
> 
> > Assuming that the binutils developers do conclude that the -zcombreloc
> > problem is to be fixed, not worked around (as I think they will),
> > there is still one more binutils-related build problem that I
> > encountered with Solaris:  At binutils 2.12, the output from "ld -V"
> > changed in a way that invalidated the previous way of testing for
> > the presence of dynamic linking.
> >
> > Someone--I forget who--gave me a patch that solved the problem;
> > I believe that this patch is necessary to build Python on Solaris
> > with binutils 2.12 or later.  Can I ask someone to check whether
> > it is already part of 2.2.2?

I believe it was Martin that provided the patch.  
And this patch is in 2.2.2.

[Guido]
> But what if this code is used with a version of binutils prior to
> 2.12?

On Linux (but I think it's the same on Solaris):

[neal@epoch src]$ ld -V
GNU ld version 2.11.90.0.8 (with BFD 2.11.90.0.8)

[neal@epoch src]$ gcc -Xlinker --help 2>&1 | grep export-dynamic
  -E, --export-dynamic        Export all dynamic symbols

Neal


From ark@research.att.com  Mon Sep 23 16:44:21 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 23 Sep 2002 11:44:21 -0400 (EDT)
Subject: [Python-Dev] binutils/solaris -- one more thing
In-Reply-To: <3D8F3619.19FB7B80@metaslash.com> (message from Neal Norwitz on
 Mon, 23 Sep 2002 11:41:13 -0400)
References: <200209231530.g8NFUgu28520@europa.research.att.com> <200209231533.g8NFX4I10033@pcp02138704pcs.reston01.va.comcast.net> <3D8F3619.19FB7B80@metaslash.com>
Message-ID: <200209231544.g8NFiLI28812@europa.research.att.com>

Neal> I believe it was Martin that provided the patch.  
Neal> And this patch is in 2.2.2.

Thank you!

Neal> On Linux (but I think it's the same on Solaris):

Neal> [neal@epoch src]$ ld -V
Neal> GNU ld version 2.11.90.0.8 (with BFD 2.11.90.0.8)

Neal> [neal@epoch src]$ gcc -Xlinker --help 2>&1 | grep export-dynamic
Neal>   -E, --export-dynamic        Export all dynamic symbols

And that's the reason for the patch:

[europa] ld -V
GNU ld version 2.12.1
  Supported emulations:
   elf32_sparc
   elf64_sparc

[europa] gcc -Xlinker --help 2>&1 | grep export-dynamic
  -E, --export-dynamic        Export all dynamic symbols
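
The same comparison, written out as a small sketch: the old test keys on
the version banner, the new one on the option actually being listed. The
sample strings are copied from the outputs in this thread; the function
name linker_supports() is hypothetical and this is not the configure code
itself.

```python
def linker_supports(help_text, option):
    """Return True if `option` appears in the linker's --help output."""
    return any(option in line for line in help_text.splitlines())


# Sample outputs copied from this thread.
LD_V_2_11 = "GNU ld version 2.11.90.0.8 (with BFD 2.11.90.0.8)"
LD_V_2_12 = ("GNU ld version 2.12.1\n"
             "  Supported emulations:\n"
             "   elf32_sparc\n"
             "   elf64_sparc")
HELP_LINE = "  -E, --export-dynamic        Export all dynamic symbols"

# Old test: grep for "BFD" in the "ld -V" banner; only 2.11 matches.
print("BFD" in LD_V_2_11, "BFD" in LD_V_2_12)
# New test: look for the option itself in the --help output.
print(linker_supports(HELP_LINE, "export-dynamic"))
```

Testing for the feature rather than for a version string is what lets one
check cover both old and new binutils.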


From guido@python.org  Mon Sep 23 16:47:13 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 11:47:13 -0400
Subject: [Python-Dev] binutils/solaris -- one more thing
In-Reply-To: Your message of "Mon, 23 Sep 2002 11:30:42 EDT."
 <200209231530.g8NFUgu28520@europa.research.att.com>
References: <200209231530.g8NFUgu28520@europa.research.att.com>
Message-ID: <200209231547.g8NFlDv10207@pcp02138704pcs.reston01.va.comcast.net>

> Someone--I forget who--gave me a patch that solved the problem;
> I believe that this patch is necessary to build Python on Solaris
> with binutils 2.12 or later.  Can I ask someone to check whether
> it is already part of 2.2.2?

Duh.  It's already in CVS for 2.2.2 and 2.3.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ark@research.att.com  Mon Sep 23 16:51:57 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 23 Sep 2002 11:51:57 -0400 (EDT)
Subject: [Python-Dev] binutils/solaris -- one more thing
In-Reply-To: <200209231547.g8NFlDv10207@pcp02138704pcs.reston01.va.comcast.net>
 (message from Guido van Rossum on Mon, 23 Sep 2002 11:47:13 -0400)
References: <200209231530.g8NFUgu28520@europa.research.att.com> <200209231547.g8NFlDv10207@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209231551.g8NFpvm28837@europa.research.att.com>

Guido> Duh.  It's already in CVS for 2.2.2 and 2.3.

Thanks!




From guido@python.org  Mon Sep 23 16:42:09 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 11:42:09 -0400
Subject: [Python-Dev] os.wait unweirding
In-Reply-To: Your message of "Mon, 23 Sep 2002 10:38:13 CDT."
 <1032795494.16226.478.camel@HillCountryPeress>
References: <1031437860.636.29.camel@HillCountryPeress> <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de> <1031442464.644.68.camel@HillCountryPeress> <003d01c25471$d83fe960$2fd8accf@othello> <1031451760.644.97.camel@HillCountryPeress> <004701c25528$7c8b4530$ced241d5@hagrid>
 <1032795494.16226.478.camel@HillCountryPeress>
Message-ID: <200209231542.g8NFg9710148@pcp02138704pcs.reston01.va.comcast.net>

> for i in a:
>   print os.spawnv(os.P_NOWAIT,scr,["",str(i)])
> 
> for i in a:
>   os.wait()
> 
> You have to do the second loop in order to wait for all children that you
> spawned off. I think that os.wait() without any arguments should wait
> for all children, not wait for the earliest executed child.

Go talk to the designers of Unix and the POSIX standard committee.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From hu.peress@mail.mcgill.ca  Mon Sep 23 17:00:12 2002
From: hu.peress@mail.mcgill.ca (Hunter Peress)
Date: 23 Sep 2002 11:00:12 -0500
Subject: [Python-Dev] os.wait unweirding. impetus
In-Reply-To: <200209231542.g8NFg9710148@pcp02138704pcs.reston01.va.comcast.net>
References: <1031437860.636.29.camel@HillCountryPeress>
 <m3it1l8iu9.fsf@mira.informatik.hu-berlin.de>
 <1031442464.644.68.camel@HillCountryPeress>
 <003d01c25471$d83fe960$2fd8accf@othello>
 <1031451760.644.97.camel@HillCountryPeress>
 <004701c25528$7c8b4530$ced241d5@hagrid>
 <1032795494.16226.478.camel@HillCountryPeress>
 <200209231542.g8NFg9710148@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1032796812.16226.506.camel@HillCountryPeress>

The idea came from bash,
where the wait command will wait for all child processes.

It's clearly doable.

On Mon, 2002-09-23 at 10:42, Guido van Rossum wrote:
> > for i in a:
> >   print os.spawnv(os.P_NOWAIT,scr,["",str(i)])
> > 
> > for i in a:
> >   os.wait()
> > 
> > You have to do the second loop in order to wait for all children that you
> > spawned off. I think that os.wait() without any arguments should wait
> > for all children, not wait for the earliest executed child.
> 
> Go talk to the designers of Unix and the POSIX standard committee.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 




From lac@strakt.com  Mon Sep 23 17:05:13 2002
From: lac@strakt.com (Laura Creighton)
Date: Mon, 23 Sep 2002 18:05:13 +0200
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Mon, 23 Sep 2002 11:07:35 EDT." <200209231507.g8NF7ZU09833@pcp02138704pcs.reston01.va.comcast.net>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net> <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com> <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net> <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>  <200209231507.g8NF7ZU09833@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200209231605.g8NG5Do9012985@ratthing-b246.strakt.com>

> 
> Thanks.  In addition, I was hoping to hear about your timeline (when
> do you expect to release PyTie?) and a hint on the 3rd party packages
> you're thinking of adding.  Also a list of target platforms for which
> PyTie must absolutely work.  (Note e.g. that we just discovered a
> problem with Solaris and the latest version of binutils (2.13), which
> seems to be used by the latest GCC version (3.2 IIRC) but is also
> separately downloadable.  The bug is in binutils 2.13.  Is this
> *combination* (Solaris + binutils 2.13) a target platform?  If so, you
> might want to use a different approach than we plan to do for Python
> 2.3 and 2.2.2 (which is merely to bail out if a certain test dumps
> core during configuration).
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

Do you know when a fixed binutils is due?  This may explain one
bug report I have right here at Strakt.  Solaris is the preferred
platform of our beta-testing customer, Chalmers, so I would like
PyTie to run on as many Solaris-including hardware and software
platforms as possible.  I'll bring this up in a meeting.  Right now
are you advising people not to use GCC 3.2 or binutils 2.13? or
do you have other advice for them which you can steer me towards?

Laura


From ark@research.att.com  Mon Sep 23 17:24:45 2002
From: ark@research.att.com (Andrew Koenig)
Date: 23 Sep 2002 12:24:45 -0400
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <200209231605.g8NG5Do9012985@ratthing-b246.strakt.com>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>
 <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>
 <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>
 <200209231507.g8NF7ZU09833@pcp02138704pcs.reston01.va.comcast.net>
 <200209231605.g8NG5Do9012985@ratthing-b246.strakt.com>
Message-ID: <yu99ptv43c8i.fsf@europa.research.att.com>

Laura> Do you know when a fixed binutils is due?  This may explain one
Laura> bug report I have right here at Strakt.  Solaris is the
Laura> preferred platform of our beta-testing customer, Chalmers, so I
Laura> would like PyTie to run on as many Solaris-including hardware
Laura> and software platforms as possible.  I'll bring this up in a
Laura> meeting.  Right now are you advising people not to use GCC 3.2
Laura> or binutils 2.13? or do you have other advice for them which
Laura> you can steer me towards?

On my machine, gcc 3.2 works just fine -- it's binutils 2.13 that
is the culprit.  Use 2.12.1 instead (but be sure to install
the configure.in patch I posted earlier today).

I hope to get a patch from the binutils developers today for
testing; if it works, I expect that the patch will be in
binutils 2.13.1, which I understand is to be released shortly.

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From martin@v.loewis.de  Mon Sep 23 17:43:08 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 23 Sep 2002 18:43:08 +0200
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <200209231605.g8NG5Do9012985@ratthing-b246.strakt.com>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>
 <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>
 <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>
 <200209231507.g8NF7ZU09833@pcp02138704pcs.reston01.va.comcast.net>
 <200209231605.g8NG5Do9012985@ratthing-b246.strakt.com>
Message-ID: <m38z1smzc3.fsf@mira.informatik.hu-berlin.de>

Laura Creighton <lac@strakt.com> writes:

> Do you know when a fixed binutils is due? 

The bug hasn't been acknowledged by binutils maintainers yet; gcc
maintainers report many problems with binutils, but have not
identified any specific problem.

So, unless somebody looks down into the details and studies the
resulting binaries, it may be a matter of months for a fix to
appear. Until then, binutils 2.13 should be avoided on Solaris.

> Right now are you advising people not to use GCC 3.2 or binutils
> 2.13? or do you have other advice for them which you can steer me
> towards?

gcc 3.2 is fine, binutils 2.13 is not - use 2.12 or the system tools
instead.

Regards,
Martin


From martin@v.loewis.de  Mon Sep 23 17:46:39 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 23 Sep 2002 18:46:39 +0200
Subject: [Python-Dev] binutils/solaris -- one more thing
In-Reply-To: <3D8F3619.19FB7B80@metaslash.com>
References: <200209231530.g8NFUgu28520@europa.research.att.com>
 <200209231533.g8NFX4I10033@pcp02138704pcs.reston01.va.comcast.net>
 <3D8F3619.19FB7B80@metaslash.com>
Message-ID: <m34rcgmz68.fsf@mira.informatik.hu-berlin.de>

Neal Norwitz <neal@metaslash.com> writes:

> I believe it was Martin that provided the patch.  
> And this patch is in 2.2.2.

All correct.

Regards,
Martin


From mats@laplaza.org  Mon Sep 23 17:43:41 2002
From: mats@laplaza.org (Mats Wichmann)
Date: Mon, 23 Sep 2002 10:43:41 -0600
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <20020923151002.31927.90702.Mailman@mail.python.org>
Message-ID: <5.1.0.14.1.20020923103904.027d4298@204.151.72.2>

 >Thanks.  In addition, I was hoping to hear about your timeline (when
 >do you expect to release PyTie?) and a hint on the 3rd party packages
 >you're thinking of adding.  Also a list of target platforms for which
 >PyTie must absolutely work.  (Note e.g. that we just discovered a
 >problem with Solaris and the latest version of binutils (2.13), which
 >seems to be used by the latest GCC version (3.2 IIRC) but is also
 >separately downloadable.  The bug is in binutils 2.13.  Is this
 >*combination* (Solaris + binutils 2.13) a target platform?  If so, you
 >might want to use a different approach than we plan to do for Python
 >2.3 and 2.2.2 (which is merely to bail out if a certain test dumps
 >core during configuration).

gcc 3.2 requires binutils 2.12 or better; it doesn't
have a specific requirement on 2.13.  However the
sunfreeware bundle may bump binutils to 2.13 (I haven't
checked what's going on there for a long time).
Sadly, the 2.13 release message indicates the purpose
is (only) to support a new platform and says nothing about
clever little modifications behind the scenes, like changing
default linker options so that relocation tables
are built differently (sigh).



From ark@research.att.com  Mon Sep 23 18:01:12 2002
From: ark@research.att.com (Andrew Koenig)
Date: 23 Sep 2002 13:01:12 -0400
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <m38z1smzc3.fsf@mira.informatik.hu-berlin.de>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>
 <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>
 <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>
 <200209231507.g8NF7ZU09833@pcp02138704pcs.reston01.va.comcast.net>
 <200209231605.g8NG5Do9012985@ratthing-b246.strakt.com>
 <m38z1smzc3.fsf@mira.informatik.hu-berlin.de>
Message-ID: <yu99lm5s3ajr.fsf@europa.research.att.com>

Martin> So, unless somebody looks down into the details and studies
Martin> the resulting binaries, it may be a matter of months for a fix
Martin> to appear. Until then, binutils 2.13 should be avoided on
Martin> Solaris.

Also, note that if you already have binutils 2.13, it is not
enough just to reinstall 2.12; you have to rebuild gcc also.


-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From bkc@murkworks.com  Mon Sep 23 18:29:20 2002
From: bkc@murkworks.com (Brad Clements)
Date: Mon, 23 Sep 2002 13:29:20 -0400
Subject: [Python-Dev] Need advice: cloning python cvs for CE project
Message-ID: <3D8F173B.11372.4250401E@localhost>

There are now 4 different people working on the "python ce" project. We've been 
passing around build tree .zips and it's getting out of hand.

I think we should set up our own CVS somewhere so we can work out all the CE kinks 
before submitting patches to the core Python CVS.

Is it possible to maintain a single working directory that can be checked into two 
different CVS systems?

I really have no idea what the proper way to do this is. I'm looking for recommendations, 
including "don't do that, do this instead".

Thanks


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From guido@python.org  Mon Sep 23 18:41:14 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 13:41:14 -0400
Subject: [Python-Dev] PEP 282 Implementation
In-Reply-To: Your message of "Wed, 28 Aug 2002 11:45:11 BST."
 <006601c24e7f$fcff5440$652b6992@alpha>
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net>
 <006601c24e7f$fcff5440$652b6992@alpha>
Message-ID: <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net>

I'm sorry that this seems to be a thread with one message per month!
I'll try to be more responsive from now on, the big Zope projects that
were keeping me busy have given me some slack time.

> > In general the code looks good.  Only one style nits: I prefer
> > docstrings that have a one-line summary, then a blank line, and then a
> > longer description.
> 
> I will update the docstrings as per your feedback.

Great!  (When can we see a new release on
http://www.red-dove.com/python_logging.html
?)

> > There's a lot of code there!  Should it perhaps be broken up into
> > different modules?  Perhaps it should become a logging *package* with
> > submodules that define the various filters and handlers.
> 
> How strongly do you feel about this? I did think about doing this
> and in fact the first implementation of the module was as a
> package. I found this a little more cumbersome than the single-file
> solution, and reimplemented as logging.py. The module is a little on
> the large side but the single-file organization makes it a little
> easier to use.

I would feel much less strongly about this if several of the
additional things could be moved to separate files without making it a
package.

> > - Why does the FileHandler open the file with mode "a+" (and later
> >   with "w+")?  The "+" makes the file readable, but I see no
> >   reason to read it.  Am I missing?
> 
> No, you're right - using "a" and "w" should work. I'll change the
> code to lose the "+".

OK.

> > - setRollover(): the explanation isn't 100% clear.  I *think* that
> >   you always write to "app.log", and when that's full, you rename
> >   it to app.log.1, and app.log.1 gets renamed to app.log.2, and so
> >   on, and then you start writing to a new app.log, right?
> 
> Yes. The original implementation was different - it just closed the
> current file and opened a new file app.log.n. The current
> implementation is slightly slower due to the need to rename several
> files, but the user can tell more easily which the latest log file
> is. I will update the setRollover() docstring to indicate more
> clearly how it works; I'm assuming that the current algorithm is
> deemed good enough.

Yes, this seems how log rotation is generally done.  (Please remove
the commented-out old code.)

> > - class SocketHandler: why set yourself up for buffer overflow by
> >   using only 2 bytes for the packet size?  You can use the struct
> >   module to encode/decode this, BTW.  I also wonder what the
> >   application for this is, BTW.
> 
> I agree about the 2-byte limit. I can change it to use struct and an
> integer length. The application for encoding the length is simply to
> allow a socket-based server to handle multiple events sent by
> SocketHandler, in the event that the connection is kept open as long
> as possible and not shut down after every event.

OK, please change it to a 4-byte length header.

I understand why you need the length header; I'm just curious about
the need for a socket server.

> >   - method send(): in Python 2.2 and later, you can use the
> >     sendall() socket method which takes care of this loop for you.
> 
> OK. I can update the code to use this in the case of 2.2 and later.

Especially since this is slated to go into 2.3 only. :-)

> > - class DatagramHandler, method send(): I don't think UDP handles
> >   fragmented packets very well -- if you have to break the packet up,
> >   there's no guarantee that the receiver will see the parts in order
> >   (or even all of them).
> 
> You're absolutely right - I wasn't thinking clearly enough about how
> UDP actually works. I will replace the loop with a single sendto()
> call.

The length header might still be useful just to be format-compatible
with the TCP variant though.

> > - fileConfig(): Is there documentation for the configuration file?
> 
> There is some documentation in the python_logging.html file which is
> part of the distribution and also on the Web at
> http://www.red-dove.com/python_logging.html - it's in the form of
> comments in an annotated logconf.ini. I have not polished the
> documentation in this area as I'm not sure how much of the
> configuration stuff should be in the logging module itself. Feedback
> I've had indicates that at least some people object moderately
> strongly to having a particular configuration design forced on
> them. I'd appreciate views on this.

This is an example of something that I'd like to see relegated to a
separate file.  It really looks like fileConfig(), listen() and
stopListening() are a separate feature bundle that looks like it is
a specific example application rather than a core feature of the
logging module.  It certainly doesn't appear in PEP 282.  Maybe the
socket handler classes belong in the same category.

Of course, the same can be said about all Handler subclasses except
StreamHandler.  Only StreamHandler is referenced by basicConfig().
Perhaps these should all (except StreamHandler) be moved to separate
files?  This sounds like a reason to make it a package.  The main
logging code could be in the __init__.py file -- there's no rule that
says __init__.py should be empty or short!

PS. In your comments you seem fond of the word "needful".  I've rarely
heard that word -- perhaps it is archaic or common only in India?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Sep 23 18:44:20 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 13:44:20 -0400
Subject: [Python-Dev] Need advice: cloning python cvs for CE project
In-Reply-To: Your message of "Mon, 23 Sep 2002 13:29:20 EDT."
 <3D8F173B.11372.4250401E@localhost>
References: <3D8F173B.11372.4250401E@localhost>
Message-ID: <200209231744.g8NHiKB11123@pcp02138704pcs.reston01.va.comcast.net>

> There are now 4 different people working on the "python ce"
> project. We've been passing around build tree .zips and it's getting
> out of hand.
> 
> I think we should set up our own CVS somewhere so we can work out all
> the CE kinks before submitting patches to the core Python CVS.
> 
> Is it possible to maintain a single working directory that can be
> checked into two different CVS systems?
> 
> I really have no idea what the proper way is to do this.. Looking
> for recommendations, including "don't do that, do this instead".

Perhaps you can work on a branch of the standard Python CVS tree?  If
you're all Python developers (or can be sworn in easily) that would
work.

Otherwise you could set up your own SF project "pythonce" and do a
vendor branch checkin of Python.  I've never used vendor branches
myself, but Kurt Kaiser uses them in the idlefork CVS, which deals
with a similar issue.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Mon Sep 23 18:48:24 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 23 Sep 2002 12:48:24 -0500
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <m38z1smzc3.fsf@mira.informatik.hu-berlin.de>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>
 <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>
 <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>
 <200209231507.g8NF7ZU09833@pcp02138704pcs.reston01.va.comcast.net>
 <200209231605.g8NG5Do9012985@ratthing-b246.strakt.com>
 <m38z1smzc3.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15759.21480.428390.593910@12-248-11-90.client.attbi.com>

    Martin> Laura Creighton <lac@strakt.com> writes:
    >> Do you know when a fixed binutils is due? 

    Martin> The bug hasn't been acknowledged by binutils maintainers yet;
    Martin> gcc maintainers report many problems with binutils, but have not
    Martin> identified any specific problem.

    Martin> So, unless somebody looks down into the details and studies the
    Martin> resulting binaries, it may be a matter of months for a fix to
    Martin> appear. Until then, binutils 2.13 should be avoided on Solaris.

Perhaps on Solaris the Python configure script should detect the presence of
binutils 2.13 and barf if it's found?  Something like

    if [ "`uname`" = 'SunOS' ] ; then
        # Sun as doesn't grok --version, so v stays empty unless this is GNU as
        v=`as --version 2>/dev/null \
           | head -1 \
           | sed -e 's/.* \([^.]*\.[^.]*\.[^.]*\).*/\1/'`
        if [ -n "$v" ] ; then
            # got the gnu version of as
            if [ "$v" = '2.13.0' ] ; then
                barf
            fi
        fi
    fi

seems like it should come close to working.

Skip


From ark@research.att.com  Mon Sep 23 18:58:28 2002
From: ark@research.att.com (Andrew Koenig)
Date: 23 Sep 2002 13:58:28 -0400
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <15759.21480.428390.593910@12-248-11-90.client.attbi.com>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <200209231001.g8NA1Ko9011212@ratthing-b246.strakt.com>
 <200209231207.g8NC78C06322@pcp02138704pcs.reston01.va.comcast.net>
 <200209231442.g8NEgYo9012614@ratthing-b246.strakt.com>
 <200209231507.g8NF7ZU09833@pcp02138704pcs.reston01.va.comcast.net>
 <200209231605.g8NG5Do9012985@ratthing-b246.strakt.com>
 <m38z1smzc3.fsf@mira.informatik.hu-berlin.de>
 <15759.21480.428390.593910@12-248-11-90.client.attbi.com>
Message-ID: <yu99d6r437wb.fsf@europa.research.att.com>

Skip> Perhaps on Solaris the Python configure script should detect the
Skip> presence of binutils 2.13 and barf if it's found?

I've already suggested a slightly different test, that has the advantage
of allowing a patched 2.13 (and of detecting a broken 2.13.1 should it
still be broken).

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From guido@python.org  Mon Sep 23 22:30:42 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 17:30:42 -0400
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Your message of "Mon, 23 Sep 2002 13:00:47 EDT."
Message-ID: <200209232130.g8NLUgi19325@pcp02138704pcs.reston01.va.comcast.net>

Skip, where's your 2.2.2 Wiki?  (Or should we just pick a page name in
the moinmoin on python.org?)

I've backported the following items to 2.2.2, most of which were my
responsibility and/or 64-bit issues needed for the snake farm:

----------------------------------------------------------------------
Modified Files:
      Tag: release22-maint
	regrtest.py 
Log Message:
Backport 1.96 from trunk (because I want Xenofarm to test 2.2.2):

Add a bunch of sys.stdout.flush() calls that will hopefully improve
the usability of the output of the Xenofarm builds.
----------------------------------------------------------------------
Modified Files:
      Tag: release22-maint
	unicodeobject.c 
Log Message:
Backport 2.166 from trunk:

Fix SF bug 599128, submitted by Inyeol Lee: .replace() would do the
wrong thing for a unicode subclass when there were zero string
replacements.  The example given in the SF bug report was only one way
to trigger this; replacing a string of length >= 2 that's not found is
another.  The code would actually write outside allocated memory if
replacement string was longer than the search string.
----------------------------------------------------------------------
Modified Files:
      Tag: release22-maint
	test_unicode.py 
Log Message:
Backport 1.56 and 1.68 from trunk:

1.56:
Apply diff3.txt from SF patch http://www.python.org/sf/536241

If a str or unicode method returns the original object,
make sure that for str and unicode subclasses the original
will not be returned.

This should prevent SF bug http://www.python.org/sf/460020
from reappearing.

1.68:
Fix SF bug 599128, submitted by Inyeol Lee: .replace() would do the
wrong thing for a unicode subclass when there were zero string
replacements.  The example given in the SF bug report was only one way
to trigger this; replacing a string of length >= 2 that's not found is
another.  The code would actually write outside allocated memory if
replacement string was longer than the search string.
----------------------------------------------------------------------
Modified Files:
      Tag: release22-maint
	structmodule.c 
Log Message:
Backport 2.57 from trunk:

(Most of) SF patch 601369 (Christos Georgiou): obmalloc,structmodule:
64bit, big endian (issue 2 only).

This adds a bunch of memcpy calls via a temporary variable to avoid
alignment errors.  That's needed for some platforms.
----------------------------------------------------------------------
Modified Files:
      Tag: release22-maint
	test_b1.py 
Log Message:
Backport 1.51 and 1.54 from trunk.

1.51:
Bug #556025: list(xrange(1e9)) --> seg fault

Close the bug report again -- this time for Cygwin due to a newlib bug.
See the following for the details:

	http://sources.redhat.com/ml/newlib/2002/msg00369.html

Note that this commit is only a documentation (i.e., comment) change.

1.54:
The list(xrange(sys.maxint / 4)) test blew up on 64-bit platforms.
Because ob_size is a 32-bit int but sys.maxint is LONG_MAX which is a
64-bit value, there's no way to make this test succeed on a 64-bit
platform.  So just skip it when sys.maxint isn't 0x7fffffff.
----------------------------------------------------------------------
Modified Files:
      Tag: release22-maint
	intobject.c 
Log Message:
Backport 2.93 from trunk:

Insert an overflow check when the sequence repetition count is outside
the range of ints.  The old code would pass random truncated bits to
sq_repeat() on a 64-bit machine.
----------------------------------------------------------------------
Modified Files:
      Tag: release22-maint
	unicodeobject.c stringobject.c 
Log Message:
Backport from trunk:

unicodeobject.c 2.169
stringobject.c 2.189

Fix warnings on 64-bit platforms about casts from pointers to ints.
Two of these were real bugs.
----------------------------------------------------------------------
Modified Files:
      Tag: release22-maint
	exceptions.c 
Log Message:
Backported 1.39 and 1.40 from trunk:

1.39:
Fix SF bug 610610 (reported by Martijn Pieters, diagnosed by Neal Norwitz).

The switch in Exception__str__ didn't clear the error if
PySequence_Size() raised an exception.  Added a case -1 which clears
the error and falls through to the default case.

1.40:
Two more cases of switch(PySequence_Size()) without checking for case -1.
(Same problem as last checkin for SF bug 610610)
Need to clear the error and proceed.
----------------------------------------------------------------------

Note that I've been careful to vary the formatting of my log messages
a bit. :-)
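
For readers who want to see the behaviour that the unicodeobject.c and
test_unicode.py backports above are about, here is a minimal sketch.  It
is written for Python 3, where str plays the role that unicode played in
2.2; the class name Tagged is made up for illustration and does not come
from the test suite.

```python
class Tagged(str):
    """A trivial str subclass used only for this demonstration."""


original = Tagged("spam and eggs")

# "ham" does not occur, so there are zero replacements to make.
result = original.replace("ham", "toast")

# The text is unchanged, but the method must not hand back the
# subclass instance itself: the result is a plain str object.
print(type(original).__name__, type(result).__name__)
```

The point is that a method call that changes nothing still returns a
plain str rather than the subclass instance it was called on.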

Michael Hudson backported a bunch of things too.  I notice a test
suite failure with rfc822 as a result of these.  Michael, did you run
the test suite?

FAILED (errors=1)
Traceback (most recent call last):
  File "../Lib/test/test_rfc822.py", line 211, in ?
    test_main()
  File "../Lib/test/test_rfc822.py", line 207, in test_main
    test_support.run_unittest(MessageTestCase)
  File "../Lib/test/test_support.py", line 180, in run_unittest
    run_suite(unittest.makeSuite(testclass), testclass)
  File "../Lib/test/test_support.py", line 175, in run_suite
    raise TestFailed(err)
test_support.TestFailed: Traceback (most recent call last):
  File "../Lib/test/test_rfc822.py", line 199, in test_parseaddr
    eq(rfc822.parseaddr('<>'), ('', ''))
  File "/home/guido/branch-2.2/Lib/rfc822.py", line 491, in parseaddr
    list = a.addresslist
AttributeError: AddrlistClass instance has no attribute 'addresslist'

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Mon Sep 23 22:49:46 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 23 Sep 2002 16:49:46 -0500
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <200209232130.g8NLUgi19325@pcp02138704pcs.reston01.va.comcast.net>
References: <200209232130.g8NLUgi19325@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15759.35962.479042.152695@12-248-11-90.client.attbi.com>

    Guido> Skip, where's your 2.2.2 Wiki?  (Or should we just pick a page
    Guido> name in the moinmoin on python.org?)

Ain't been created yet.  This is the first response I got to my offer to
create one.  Just a sec...  Okay, it's at

    http://manatee.mojam.com/py222wiki/

and is completely untarnished by (virtual) human hands.

    Guido> I've backported the following items to 2.2.2, most of which were my
    Guido> responsibility and/or 64-bit issues needed for the snake farm:
    
    ...

I have to get off to a soccer game.  If nobody beats me to it I'll try to
update the wiki later tonight or first thing tomorrow.

Skip


From martin@v.loewis.de  Mon Sep 23 23:02:40 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 24 Sep 2002 00:02:40 +0200
Subject: [Python-Dev] Need advice: cloning python cvs for CE project
In-Reply-To: <200209231744.g8NHiKB11123@pcp02138704pcs.reston01.va.comcast.net>
References: <3D8F173B.11372.4250401E@localhost>
 <200209231744.g8NHiKB11123@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3heggjren.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Otherwise you could set up your own SF project "pythonce" and do a
> vendor branch checkin of Python.  I've never used vendor branches
> myself, but Kurt Kaiser uses them in the idlefork CVS, which deals
> with a similar issue.

I would recommend this strategy. Supposedly, you will rarely need to
perform imports from PythonLabs Python: the CE changes should be
largely independent of how PythonLabs Python develops.

Imports might be needed only when a chunk of your changes is accepted
into CVS Python.

You don't need a separate SF project, perhaps: A CVS module on the
python project might be sufficient. We can ask SF to remove the tree
when/if CE incorporation is complete.

Regards,
Martin



From bkc@murkworks.com  Mon Sep 23 23:10:59 2002
From: bkc@murkworks.com (Brad Clements)
Date: Mon, 23 Sep 2002 18:10:59 -0400
Subject: [Python-Dev] Need advice: cloning python cvs for CE project
In-Reply-To: <m3heggjren.fsf@mira.informatik.hu-berlin.de>
References: <200209231744.g8NHiKB11123@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D8F593D.21953.43521B2B@localhost>

On 24 Sep 2002 at 0:02, Martin v. Loewis wrote:

> Guido van Rossum <guido@python.org> writes:
> 
> > Otherwise you could set up your own SF project "pythonce" and do a
> > vendor branch checkin of Python.  I've never used vendor branches
> > myself, but Kurt Kaiser uses them in the idlefork CVS, which deals
> > with a similar issue.
> 
> I would recommend this strategy. Supposedly, you will rarely need to
> perform imports from PythonLabs Python: the CE changes should be
> largely independent of how PythonLabs Python develops.

Agreed. We'd snapshot from the core to the CE working CVS periodically. 

We cannot keep up with the rate of core changes until we get our act together. In the 
end, we shouldn't really have anything outside the core if we "do it right".

> You don't need a separate SF project, perhaps: A CVS module on the
> python project might be sufficient. We can ask SF to remove the tree
> when/if CE incorporation is complete.

Can someone who understands the mechanics of how this works explain it? I'm not
skilled enough in CVS to visualize the process of importing from the core, while still
being able to track commits and updates from the CE tree.

Also, none of the developers have core CVS access now, so I do not think a branch 
would be appropriate.


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From greg@cosc.canterbury.ac.nz  Mon Sep 23 23:20:08 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 24 Sep 2002 10:20:08 +1200 (NZST)
Subject: [Python-Dev] os.wait unweirding
In-Reply-To: <1032795494.16226.478.camel@HillCountryPeress>
Message-ID: <200209232220.g8NMK8d16141@oma.cosc.canterbury.ac.nz>

Hunter Peress <hu.peress@mail.mcgill.ca>:

> I think that os.wait() without any arguments should wait
> for all children, not wait for the earliest executed child.

Actually, it waits for *any* one child to exit, not
necessarily the first one spawned.

In any case, the functions in the os module are supposed
to be direct wrappers of the platform's system calls.
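
A small sketch of the behaviour described above, assuming a POSIX
platform where os.fork() is available.  The helper name wait_for_all and
the delays are invented for the example.  Each call to os.wait() returns
whichever child exits next, so waiting for all children is just a loop.

```python
import os
import time


def wait_for_all(delays):
    """Fork one child per delay, then reap every one of them.

    Returns (spawned, reaped): the pids in the order they were forked
    and in the order os.wait() reported them.
    """
    spawned = []
    for delay in delays:
        pid = os.fork()
        if pid == 0:
            # Child: sleep for a while, then exit without cleanup.
            time.sleep(delay)
            os._exit(0)
        spawned.append(pid)

    reaped = []
    remaining = set(spawned)
    while remaining:
        # os.wait() blocks until *any* one child has exited.
        pid, status = os.wait()
        remaining.discard(pid)
        reaped.append(pid)
    return spawned, reaped


if __name__ == "__main__":
    print(wait_for_all([0.2, 0.0]))
```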

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From martin@v.loewis.de  Tue Sep 24 00:02:40 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 24 Sep 2002 01:02:40 +0200
Subject: [Python-Dev] Need advice: cloning python cvs for CE project
In-Reply-To: <3D8F593D.21953.43521B2B@localhost>
References: <200209231744.g8NHiKB11123@pcp02138704pcs.reston01.va.comcast.net>
 <3D8F593D.21953.43521B2B@localhost>
Message-ID: <m3wupcgvhr.fsf@mira.informatik.hu-berlin.de>

"Brad Clements" <bkc@murkworks.com> writes:

> Can someone who understands the mechanics of how this works explain it? 

1. Export a "blank tree"

cvs -d :pserver:anonymous@cvs.python.sourceforge.net:/cvsroot/python
export python


2. Import it into a fresh repository

cvs -d :ext:developername@cvs.python.sourceforge.net:/cvsroot/python
import pythonce Pythonlabs cvs_from_020924

3. Make a sandbox for your module

cvs -d :pserver:anonymous@cvs.python.sourceforge.net:/cvsroot/python
export -d MyPythonCE pythonce

Then, whenever you incorporate another Pythonlabs snapshot, import it
again. cvs will tell you the command to join your tree with the
imported tree when the import is complete.

HTH,
Martin


From vinay_sajip@red-dove.com  Tue Sep 24 00:04:15 2002
From: vinay_sajip@red-dove.com (Vinay Sajip)
Date: Tue, 24 Sep 2002 00:04:15 +0100
Subject: [Python-Dev] PEP 282 Implementation
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net>              <006601c24e7f$fcff5440$652b6992@alpha>  <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <000e01c26355$8b2dcdc0$652b6992@alpha>

Guido van Rossum wrote:
> I'm sorry that this seems to be a thread with one message per month!
> I'll try to be more responsive from now on; the big Zope projects that
> were keeping me busy have given me some slack time.

Great.

>> I will update the docstrings as per your feedback.
>
> Great!  (When can we see a new release on
> http://www.red-dove.com/python_logging.html
> ?)

I was waiting for your feedback about the packaging - the docstrings have
been changed but I wanted to roll everything into the next release. Speaking
of which...

> I would feel much less strongly about this if several of the
> additional things could be moved to separate files without making it a
> package.
>
[stuff snipped]

> This is an example of something that I'd like to see relegated to a
> separate file.  It really looks like fileConfig(), listen() and
> stopListening() are a separate feature bundle that looks like it is
> a specific example application rather than a core feature of the
> logging module.  It certainly doesn't appear in PEP 282.  Maybe the
> socket handler classes belong in the same category.
>
> Of course, the same can be said about all Handler subclasses except
> StreamHandler.  Only StreamHandler is referenced by basicConfig().
> Perhaps these should all (except StreamHandler) be moved to separate
> files?  This sounds like a reason to make it a package.  The main
> logging code could be in the __init__.py file -- there's no rule that
> says __init__.py should be empty or short!

How about this suggestion? We could leave the core code in the existing
module, "logging". This would include a minimal set of handlers, and all the
Filters, and I think StreamHandler and FileHandler should be in here. All
other handlers would live in "logging.handlers". As for configuration -
basicConfig() could live in "logging" and any other configuration code in
"logging.config".

If the above seems a good idea, please let me know and I'll refactor
accordingly - then the next release will (hopefully) be in the next 2-3
weeks.
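
To make the proposed split concrete, here is a sketch of how client code
would import things under that layout.  It runs against the logging
package as it ships in current Python releases, which follows this
arrangement; at the time of writing it was only a proposal.  The list
names core, extra and config are made up for the example.

```python
import logging
import logging.config
import logging.handlers

# Core module: filters, basicConfig() and the two basic handlers.
core = [logging.Filter, logging.StreamHandler, logging.FileHandler,
        logging.basicConfig]

# All other handlers live in logging.handlers.
extra = [logging.handlers.SocketHandler, logging.handlers.DatagramHandler]

# Configuration beyond basicConfig() lives in logging.config.
config = [logging.config.fileConfig, logging.config.listen,
          logging.config.stopListening]

for group in (core, extra, config):
    print([obj.__name__ for obj in group])
```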

> PS. In your comments you seem fond of the word "needful".  I've rarely
> heard that word -- perhaps it is archaic or common only in India?

I only found 2 uses of "needful" - in BufferingHandler and
ConfigStreamHandler. It's the whole phrase "do the needful", which I think
is peculiar to England but has its share of users on the subcontinent :-)

Regards


Vinay Sajip



From rwgk@yahoo.com  Tue Sep 24 00:52:24 2002
From: rwgk@yahoo.com (Ralf W. Grosse-Kunstleve)
Date: Mon, 23 Sep 2002 16:52:24 -0700 (PDT)
Subject: [C++-sig] Re: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <200209231322.g8NDMGX06641@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020923235224.76315.qmail@web20208.mail.yahoo.com>

--- Guido van Rossum <guido@python.org> wrote:
> If you check out the release22-maint branch of Python from CVS and
> subscribe to the python-checkins list
> (http://mail.python.org/mailman/listinfo/python-checkins) you should
> be able to track the work leading up to 2.2.2 pretty closely.

Apparently the bug report

https://sourceforge.net/tracker/?func=detail&atid=105470&aid=607253&group_id=5470

has not yet led to any changes in the release22-maint branch. The
worst problem is the missing extern "C" in descrobject.h and iterobject.h.
This is compounded by missing include guards. We struggled quite a bit
to find a workaround for Boost.Python.

It will also be helpful if include guards are added to pymactoolbox.h.

Ralf




From tim.one@comcast.net  Tue Sep 24 01:13:25 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 23 Sep 2002 20:13:25 -0400
Subject: [C++-sig] Re: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a
 few weeks
In-Reply-To: <20020923235224.76315.qmail@web20208.mail.yahoo.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEJBGAB.tim.one@comcast.net>

[Ralf W. Grosse-Kunstleve]
> Apparently the bug report
>
> https://sourceforge.net/tracker/?func=detail&atid=105470&aid=607253&group_id=5470
>
> has not yet led to any changes in the release22-maint branch. The
> worst problem is the missing extern "C" in descrobject.h and iterobject.h.
> This is compounded by missing include guards. We struggled quite a bit
> to find a workaround for Boost.Python.
>
> It will also be helpful if include guards are added to pymactoolbox.h.

The odds of something like this getting fixed to your satisfaction (not to
mention at all <wink>) greatly increase if you submit a patch.  Looks to me
like what you want to do is both correct and harmless, but I'm (speaking as
a generic Python developer) not going to be able to make time to test it in
the context you're concerned about.  OTOH, if there were a patch that you
knew worked for *you*, cool, I could apply it and just make sure it didn't
break anything for me (speaking as a etc).



From guido@python.org  Tue Sep 24 01:53:44 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 20:53:44 -0400
Subject: [C++-sig] Re: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Your message of "Mon, 23 Sep 2002 16:52:24 PDT."
 <20020923235224.76315.qmail@web20208.mail.yahoo.com>
References: <20020923235224.76315.qmail@web20208.mail.yahoo.com>
Message-ID: <200209240053.g8O0riW20237@pcp02138704pcs.reston01.va.comcast.net>

> > If you check out the release22-maint branch of Python from CVS and
> > subscribe to the python-checkins list
> > (http://mail.python.org/mailman/listinfo/python-checkins) you should
> > be able to track the work leading up to 2.2.2 pretty closely.
> 
> Apparently the bug report
> 
> https://sourceforge.net/tracker/?func=detail&atid=105470&aid=607253&group_id=5470
> 
> has not yet led to any changes in the release22-maint branch. The
> worst problem is the missing extern "C" in descrobject.h and iterobject.h.
> This is compounded by missing include guards. We struggled quite a bit
> to find a workaround for Boost.Python.

Please submit patches.  Not being a C++ user myself I find it hard to
guess exactly what needs to be done based upon your terse description.

> It will also be helpful if include guards are added to pymactoolbox.h.

I suppose you mean in the 2.2 branch?  Jack added them two weeks ago,
according to the bug report.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Sep 24 02:12:34 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 23 Sep 2002 21:12:34 -0400
Subject: [Python-Dev] PEP 282 Implementation
In-Reply-To: Your message of "Tue, 24 Sep 2002 00:04:15 BST."
 <000e01c26355$8b2dcdc0$652b6992@alpha>
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net> <006601c24e7f$fcff5440$652b6992@alpha> <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net>
 <000e01c26355$8b2dcdc0$652b6992@alpha>
Message-ID: <200209240112.g8O1CYE20440@pcp02138704pcs.reston01.va.comcast.net>

> > I would feel much less strongly about this if several of the
> > additional things could be moved to separate files without making it a
> > package.
> >
> [stuff snipped]
> 
> > This is an example of something that I'd like to see relegated to a
> > separate file.  It really looks like fileConfig(), listen() and
> > stopListening() are a separate feature bundle that looks like it is
> > a specific example application rather than a core feature of the
> > logging module.  It certainly doesn't appear in PEP 282.  Maybe the
> > socket handler classes belong in the same category.
> >
> > Of course, the same can be said about all Handler subclasses except
> > StreamHandler.  Only StreamHandler is referenced by basicConfig().
> > Perhaps these should all (except StreamHandler) be moved to separate
> > files?  This sounds like a reason to make it a package.  The main
> > logging code could be in the __init__.py file -- there's no rule that
> > says __init__.py should be empty or short!
> 
> How about this suggestion? We could leave the core code in the
> existing module, "logging". This would include a minimal set of
> handlers, and all the Filters, and I think StreamHandler and
> FileHandler should be in here. All other handlers would live in
> "logging.handlers". As for configuration - basicConfig() could live
> in "logging" and any other configuration code in "logging.config".

Sounds good to me.  I hope that whoever felt strongly about this
(Martin von Loewis?) agrees.

> If the above seems a good idea, please let me know and I'll refactor
> accordingly - then the next release will (hopefully) be in the next
> 2-3 weeks.
> 
> > PS. In your comments you seem fond of the word "needful".  I've rarely
> > heard that word -- perhaps it is archaic or common only in India?
> 
> I only found 2 uses of "needful" - in BufferingHandler and
> ConfigStreamHandler. It's the whole phrase "do the needful", which I
> think is peculiar to England but has its share of users on the
> subcontinent :-)

Oh well.  Shows how Americanized I am, despite my thoroughly European
upbringing, after 7 years here. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From chrism@zope.com  Tue Sep 24 04:22:01 2002
From: chrism@zope.com (Chris McDonough)
Date: Mon, 23 Sep 2002 23:22:01 -0400
Subject: [Python-Dev] PEP 282 Implementation
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net>              <006601c24e7f$fcff5440$652b6992@alpha>  <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <01aa01c26379$8d6def60$4901000a@dorothy>

> > > - setRollover(): the explanation isn't 100% clear.  I *think* that
> > >   you always write to "app.log", and when that's full, you rename
> > >   it to app.log.1, and app.log.1 gets renamed to app.log.2, and so
> > >   on, and then you start writing to a new app.log, right?
> >
> > Yes. The original implementation was different - it just closed the
> > current file and opened a new file app.log.n. The current
> > implementation is slightly slower due to the need to rename several
> > files, but the user can tell more easily which the latest log file
> > is. I will update the setRollover() docstring to indicate more
> > clearly how it works; I'm assuming that the current algorithm is
> > deemed good enough.
>
> Yes, this seems how log rotation is generally done.  (Please remove
> the commented-out old code.)
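
The rename chain described in the quoted text can be sketched as follows. This is only an illustration of the scheme, not the code under review: the function name rotate and its backup_count parameter are assumptions, and the caller is expected to have closed app.log before calling it.

```python
import os


def rotate(base, backup_count):
    """Shift base.N -> base.N+1 (oldest dropped), then base -> base.1."""
    # Walk from the oldest backup down so nothing is overwritten too early.
    for i in range(backup_count - 1, 0, -1):
        src = "%s.%d" % (base, i)
        dst = "%s.%d" % (base, i + 1)
        if os.path.exists(src):
            os.replace(src, dst)  # overwrites dst, dropping the oldest file
    if os.path.exists(base):
        os.replace(base, base + ".1")
    # The caller now opens a fresh file at `base` and keeps writing there.
```

The cost is one rename per existing backup, which is the "slightly slower" part mentioned above; the benefit is that the newest log always has the same name.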

It would be helpful for the FileHandler class to define a method
which just closes and reopens the current logfile (instead of
actually rotating a set of like-named logfiles).  This would allow
logfile rotation to be performed by a separate process (e.g.
RedHat's logrotate).  Sometimes it's better (and even necessary) to
be able to use system-provided log rotation facilities instead of
relying on the native rotation facilities.

- C




From mhammond@skippinet.com.au  Tue Sep 24 05:16:07 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 24 Sep 2002 14:16:07 +1000
Subject: [Python-Dev] Need advice: cloning python cvs for CE project
In-Reply-To: <3D8F593D.21953.43521B2B@localhost>
Message-ID: <LCEPIIGDJPKCOIHOBJEPCENOGJAA.mhammond@skippinet.com.au>

FWIW, a break-in at my house means my insurance company is funding a
sparkling new CE machine for me - which means PythonCE should be able to
work again for me soon :)  My Linux box (laptop) was taken at the same time,
and it seems the replacement will be significantly faster than my desktop.
Gotta love insurance ;)

> Also, none of the developers have core CVS access now, so I do
> not think a branch
> would be appropriate.

I haven't been up to date with the latest PythonCE work, but I believe there
are two key issues:

1) Fairly simple patches to random files that allow CE to compile.  These
are generally fairly transparent, and generally just #ifdef out certain
features.

2) Larger patches or new source files that involve significant code - often
a re-implementation of something missing from CE that Python really likes to
have, or converting everything to and from Unicode for the CE API.

I believe (1) could be handled using the SourceForge patch manager, as
patches to the core.  Depending on how much this reduces the size of the
patch, the best way to handle (2) could be determined later.

I'm happy to help steer some of this through, and as I said above, I should
actually be in a position to build and test PythonCE again soon.  I've got a
few busy weeks ahead of me, but after that will have some Python time back.

Mark.



From martin@v.loewis.de  Tue Sep 24 05:49:24 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 24 Sep 2002 06:49:24 +0200
Subject: [Python-Dev] PEP 282 Implementation
In-Reply-To: <200209240112.g8O1CYE20440@pcp02138704pcs.reston01.va.comcast.net>
References: <00e001c2261d$19bfc320$652b6992@alpha>
 <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net>
 <006601c24e7f$fcff5440$652b6992@alpha>
 <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net>
 <000e01c26355$8b2dcdc0$652b6992@alpha>
 <200209240112.g8O1CYE20440@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3sn006lgr.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Sounds good to me.  I hope that whoever felt strongly about this
> (Martin von Loewis?) agrees.

I don't think I ever voiced an opinion on logging.

Regards,
Martin


From vinay_sajip@red-dove.com  Tue Sep 24 08:56:33 2002
From: vinay_sajip@red-dove.com (Vinay Sajip)
Date: Tue, 24 Sep 2002 08:56:33 +0100
Subject: [Python-Dev] PEP 282 Implementation
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net>              <006601c24e7f$fcff5440$652b6992@alpha>  <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net> <01aa01c26379$8d6def60$4901000a@dorothy>
Message-ID: <003901c2639f$e7e21ae0$652b6992@alpha>

Chris McDonough wrote:
> It would be helpful for the FileHandler class to define a method
> which just closes and reopens the current logfile (instead of
> actually rotating a set of like-named logfiles).  This would allow
> logfile rotation to be performed by a separate process (e.g.
> RedHat's logrotate).  Sometimes it's better (and even necessary) to
> be able to use system-provided log rotation facilities instead of
> relying on the native rotation facilities.

I'm not sure whether this should be in the core functionality. I presume you
don't mean an atomic "close and reopen" operation - rather, are you
suggesting close the file, maybe rename it at the application level, then
reopen? If so, then it's best handled entirely in the application level,
through a subclass of FileHandler. This allows each application to consider
issues such as what to do with events that occur between close and reopen
(e.g. if multiple threads are running).
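
A minimal sketch of the application-level subclass suggested here, assuming today's logging.FileHandler (which keeps the open file in self.stream and offers _open() to reopen it). The class name ReopeningFileHandler, its reopen() method and the demo() helper are hypothetical and not part of the module under discussion.

```python
import logging
import os
import tempfile


class ReopeningFileHandler(logging.FileHandler):
    """FileHandler with a hypothetical reopen() for external log rotation."""

    def reopen(self):
        # Hold the handler lock so that records emitted by other threads
        # wait, rather than hitting a closed stream between close and open.
        self.acquire()
        try:
            if self.stream is not None:
                self.stream.close()
            self.stream = self._open()
        finally:
            self.release()


def demo():
    """Log, let an 'external' rename happen, reopen, and log again."""
    path = os.path.join(tempfile.mkdtemp(), "app.log")
    handler = ReopeningFileHandler(path)
    logger = logging.getLogger("reopen-demo")
    logger.propagate = False
    logger.addHandler(handler)
    logger.warning("first")
    os.rename(path, path + ".1")   # stands in for an external rotation tool
    handler.reopen()
    logger.warning("second")
    logger.removeHandler(handler)
    handler.close()
    with open(path + ".1") as old, open(path) as new:
        return old.read().strip(), new.read().strip()
```

Taking the handler's own lock is one answer to the question of what happens to events between close and reopen: they block briefly instead of being lost. An application could choose a different policy in its own subclass.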

Regards

Vinay



From mwh@python.net  Tue Sep 24 10:14:47 2002
From: mwh@python.net (Michael Hudson)
Date: Tue, 24 Sep 2002 10:14:47 +0100 (BST)
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few
 weeks
In-Reply-To: <200209232130.g8NLUgi19325@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.44.0209241013280.4517-100000@starship.python.net>

On Mon, 23 Sep 2002, Guido van Rossum wrote:

> Michael Hudson backported a bunch of things too.  I notice a test
> suite failure with rfc822 as a result of these.  Michael, did you run
> the test suite?

No, I'd just got started when the network started acting up.  I'll get to 
it.

Cheers,
M.



From mwh@python.net  Tue Sep 24 10:24:04 2002
From: mwh@python.net (Michael Hudson)
Date: 24 Sep 2002 10:24:04 +0100
Subject: [Python-Dev] PEP 282 Implementation
In-Reply-To: "Vinay Sajip"'s message of "Tue, 24 Sep 2002 00:04:15 +0100"
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net> <006601c24e7f$fcff5440$652b6992@alpha> <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net> <000e01c26355$8b2dcdc0$652b6992@alpha>
Message-ID: <2md6r3oi4r.fsf@starship.python.net>

"Vinay Sajip" <vinay_sajip@red-dove.com> writes:

> I only found 2 uses of "needful" - in BufferingHandler and
> ConfigStreamHandler. It's the whole phrase "do the needful", which I
> think is peculiar to England but has its share of users on the
> subcontinent :-)

I've never heard it before, having lived my whole life in England...

The meaning's pretty obvious, though.

Cheers,
M.

-- 
  I'll write on my monitor fifty times 'I must not post self-indulgent
  wibble nobody is interested in to ucam.chat just because I'm bored
  and I can't find the bug I'm supposed to fix'.
                                            -- Steve Kitson, ucam.chat


From rengelin@strw.leidenuniv.nl  Tue Sep 24 12:10:33 2002
From: rengelin@strw.leidenuniv.nl (Roeland Rengelink)
Date: Tue, 24 Sep 2002 13:10:33 +0200
Subject: [Python-Dev] bug 576990
Message-ID: <3D904828.766779BE@strw.leidenuniv.nl>

Hi,

In early July I submitted a bug report ( http://www.python.org/sf/576990 ).
Although Raymond Hettinger briefly looked at it (closed it, and then
re-opened it), there's presently no assignee for the bug. I am certainly
willing to do the work myself, but before doing so, I'd like to be sure
that I understand the non-response correctly. I see several
possibilities:

1. This is not a bug but somebody forgot to tell me.

2. This is completely trivial to solve, but everybody overlooked it.

3. This is a small bug, only seen in a marginal corner case that is of
no particular interest to anyone, so there is no reason ( and certainly
no time) for anybody to respond and/or solve this

4. This is a mildly interesting, but relatively obscure bug, that might
be straightforward to solve if somebody had the spare time. (what spare
time?)

5. This is clearly a profound and interesting bug, but solving this
seems to involve cans of worms, ten-foot poles, and a re-write of the
core.

I suspect that in this case the answer lies somewhere between 3 and 4. I
just want to make sure that this is not a type 1, 2 or 5 bug. 

...

Ok. So this is actually a blatant attempt to get someone to look at this
again before 2.2.2 goes out the door. On the other hand, I really am
willing to do the work (clarify the report, give more use-cases, explain
the reasoning behind the patch, implement alternative solutions).

Thanks,

Roeland Rengelink


From guido@python.org  Tue Sep 24 13:10:21 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 08:10:21 -0400
Subject: [Python-Dev] PEP 282 Implementation
In-Reply-To: Your message of "Mon, 23 Sep 2002 23:22:01 EDT."
 <01aa01c26379$8d6def60$4901000a@dorothy>
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net> <006601c24e7f$fcff5440$652b6992@alpha> <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net>
 <01aa01c26379$8d6def60$4901000a@dorothy>
Message-ID: <200209241210.g8OCALQ22479@pcp02138704pcs.reston01.va.comcast.net>

> It would be helpful for the FileHandler class to define a method
> which just closes and reopens the current logfile (instead of
> actually rotating a set of like-named logfiles).  This would allow
> logfile rotation to be performed by a separate process (e.g.
> RedHat's logrotate).  Sometimes it's better (and even necessary) to
> be able to use system-provided log rotation facilities instead of
> relying on the native rotation facilities.

Maybe this could be a different Handler subclass?

I have to admit that I find log rotation borderline functionality for
the logging module.  Perhaps Chris' suggestion is sufficient.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Sep 24 13:30:04 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 08:30:04 -0400
Subject: [Python-Dev] "do the needful"
In-Reply-To: Your message of "Tue, 24 Sep 2002 10:24:04 BST."
 <2md6r3oi4r.fsf@starship.python.net>
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net> <006601c24e7f$fcff5440$652b6992@alpha> <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net> <000e01c26355$8b2dcdc0$652b6992@alpha>
 <2md6r3oi4r.fsf@starship.python.net>
Message-ID: <200209241230.g8OCU4G22577@pcp02138704pcs.reston01.va.comcast.net>

A Google search on "do the needful" suggests that the phrase is indeed
popular on the subcontinent: associated with top hits are names like
Ibrahim Hunkunti, Lord Sri Krishna, India's National Newspaper, an
astrology site run by Mr. Harsh Nigram, Pakistan, Nepal, ...  You get
the picture.

I'm fine if Vinay leaves it in!  It definitely sounded funny to me,
but it's not broken English -- it's cultural diversity.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From niemeyer@conectiva.com  Tue Sep 24 13:52:33 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 24 Sep 2002 09:52:33 -0300
Subject: [Python-Dev] Re: ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: <200209232130.g8NLUgi19325@pcp02138704pcs.reston01.va.comcast.net>
References: <200209232130.g8NLUgi19325@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020924095233.A22181@ibook.distro.conectiva>

> Skip, where's your 2.2.2 Wiki?  (Or should we just pick a page name in
> the moinmoin on python.org?)

(unashamed plug ahead)

Btw, if you're using MoinMoin extensively (as we have been), you may want to
check a small script I've written (editmoin.py) to allow editing of moin
pages with your preferred editor, and also a syntax highlighting file
for vim:

http://moin.conectiva.com.br/EditMoin

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From thomas.heller@ion-tof.com  Tue Sep 24 14:24:41 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 24 Sep 2002 15:24:41 +0200
Subject: [Python-Dev] Assign to errno allowed?
Message-ID: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>

I'm trying to fix selectmodule.c on Windows (it raises
bogus exceptions, because select() on Windows does not
set errno).
The first patch I had was this:
  
   if (n < 0) {
+ #ifdef MS_WINDOWS
+   PyErr_SetExcFromWindowsErr(SelectError, WSAGetLastError());
+ #else
    PyErr_SetFromErrno(SelectError);
+ #endif
   }
   else if (n == 0) {
                  /* optimization */

but PyErr_SetExcFromWindowsErr is not present in the 2.2
maintenance branch. An easier fix would be this one, but
I wonder if it is allowed/good style to set 'errno':

*** 274,279 ****
--- 274,282 ----
   Py_END_ALLOW_THREADS
  
   if (n < 0) {
+ #ifdef MS_WINDOWS
+   errno = WSAGetLastError();
+ #endif
    PyErr_SetFromErrno(SelectError);
   }
   else if (n == 0) {

Thomas


From skip@pobox.com  Tue Sep 24 14:41:28 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 24 Sep 2002 08:41:28 -0500
Subject: [Python-Dev] Python 2.2.2 Wiki
Message-ID: <15760.27528.301965.340841@12-248-11-90.client.attbi.com>

For folks interested in 2.2.2, I created an empty Wiki available at

    http://manatee.mojam.com/py222wiki/

I did *nothing* other than set it up.  I don't know how people want to use
the Wiki.  Feel free to organize it any way you want, or give me some clues
and I'll take a crack at a top-level structure.

Michael McLay had suggested:

    >> Perhaps Wiki pages would be a good mechanism for collaboration on the
    >> classification of patches. Create one page for each classification
    >> type and then use the patch names and title as section titles within
    >> the page.

What are the classification types he referred to?

Skip


From mhammond@skippinet.com.au  Tue Sep 24 14:49:05 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 24 Sep 2002 23:49:05 +1000
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
Message-ID: <LCEPIIGDJPKCOIHOBJEPGEOPGJAA.mhammond@skippinet.com.au>

> but PyErr_SetExcFromWindowsErr is not present in the 2.2
> maintenance branch. An easier fix would be this one, but
> I wonder if it is allowed/good style to set 'errno':
>
> *** 274,279 ****
> --- 274,282 ----
>    Py_END_ALLOW_THREADS
>
>    if (n < 0) {
> + #ifdef MS_WINDOWS
> +   errno = WSAGetLastError();
> + #endif
>     PyErr_SetFromErrno(SelectError);
>    }
>    else if (n == 0) {

Well, I'd agree it's not good style - therefore it deserves a comment
<wink>.  I'd say with a reasonable comment you should just go for it.

BDFL-pronouncement-not-withstanding ly,

Mark.



From mal@lemburg.com  Tue Sep 24 15:00:38 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 24 Sep 2002 16:00:38 +0200
Subject: [Python-Dev] Assign to errno allowed?
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
Message-ID: <3D907006.504@lemburg.com>

Thomas Heller wrote:
> I'm trying to fix selectmodule.c on Windows (it raises
> bogus exceptions, because select() on Windows does not
> set errno).
> The first patch I had was this:
>   
>    if (n < 0) {
> + #ifdef MS_WINDOWS
> +   PyErr_SetExcFromWindowsErr(SelectError, WSAGetLastError());
> + #else
>     PyErr_SetFromErrno(SelectError);
> + #endif
>    }
>    else if (n == 0) {
>                   /* optimization */
> 
> but PyErr_SetExcFromWindowsErr is not present in the 2.2
> maintenance branch. An easier fix would be this one, but
> I wonder if it is allowed/good style to set 'errno':
> 
> *** 274,279 ****
> --- 274,282 ----
>    Py_END_ALLOW_THREADS
>   
>    if (n < 0) {
> + #ifdef MS_WINDOWS
> +   errno = WSAGetLastError();
> + #endif
>     PyErr_SetFromErrno(SelectError);
>    }
>    else if (n == 0) {

Here's what the man page has to say:

NAME
        errno - number of last error

SYNOPSIS
        #include <errno.h>

        extern int errno;

DESCRIPTION
        The  integer errno is set by system calls (and some library functions)
        to indicate what went wrong.  Its value is significant only  when  the
        call  returned an error (usually -1), and a library function that does
        succeed is allowed to change errno.

        Sometimes, when -1 is also a legal return value one has to zero  errno
        before the call in order to detect possible errors.

        errno  is  defined  by the ISO C standard to be a modifiable lvalue of
        type int, and must not be explicitly declared; errno may be  a  macro.
        errno  is  thread-local;  setting it in one thread does not affect its
        value in any other thread.

        Valid error numbers are all non-zero; errno is never set  to  zero  by
        any  library  function.  All the error names specified by POSIX.1 must
        have distinct values.

	...

Setting errno is allowed; in fact, it is required to set it to 0
sometimes in order to narrow down the location of an error (in a
sequence of C library calls).
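
The "zero errno before the call" idiom can be tried from Python through ctypes. This is a sketch under stated assumptions: a POSIX system where ctypes.CDLL(None) exposes the C library, and strtol() as the example call because its overflow result (LONG_MAX) is also a legal return value. The helper name parse_long is made up.

```python
import ctypes
import errno

# Assumption: POSIX platform; CDLL(None) gives access to libc symbols.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.strtol.restype = ctypes.c_long
_libc.strtol.argtypes = [ctypes.c_char_p, ctypes.c_void_p, ctypes.c_int]


def parse_long(text):
    """Return (value, overflowed) using strtol() with errno cleared first."""
    ctypes.set_errno(0)  # clear any stale value left by an earlier call
    value = _libc.strtol(text.encode("ascii"), None, 10)
    # Only errno tells a genuine LONG_MAX apart from an overflow.
    return value, ctypes.get_errno() == errno.ERANGE
```

Without the explicit reset, a leftover ERANGE from some earlier library call could be mistaken for a failure of this one.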

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From guido@python.org  Tue Sep 24 15:16:39 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 10:16:39 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 15:24:41 +0200."
 <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
Message-ID: <200209241416.g8OEGd902355@odiug.zope.com>

> I'm trying to fix selectmodule.c on Windows (it raises
> bogus exceptions, because select() on Windows does not
> set errno).

Are you *sure* about that?

> The first patch I had was this:
[...]
> but PyErr_SetExcFromWindowsErr is not present in the 2.2
> maintenance branch. An easier fix would be this one, but
> I wonder if it is allowed/good style to set 'errno':

Yes, assignment to errno is fine.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ark@research.att.com  Tue Sep 24 15:14:05 2002
From: ark@research.att.com (Andrew Koenig)
Date: 24 Sep 2002 10:14:05 -0400
Subject: [Python-Dev] -zcombreloc
In-Reply-To: <200209231350.g8NDonx28099@europa.research.att.com>
References: <200209231350.g8NDonx28099@europa.research.att.com>
Message-ID: <yu99adm7h3v6.fsf@europa.research.att.com>

ark> So I think you are right -- the correct fix is to warn people that
ark> they should not use binutils 2.13 on Solaris, period.  I believe that
ark> they will fix the problem in 2.13.1 and will let you know.

I received a fix this morning from one of the binutils developers
and am testing it now.  The good news is that the Python build has
gotten further than it did last time, so I'm going to try rebuilding
gcc with the fixed binutils 2.13 and then rebuilding Python.

That process takes about 16 hours of cpu time, so don't expect
to hear from me until tomorrow at the earliest.

The bad news is that the fix is specific to Solaris, which means that
installing it breaks the linker for Sparc/Linux.  They are now trying
to figure out how to fix it in a way that does not break Linux;
obviously, they're not going to put it in a release until they have
one that works on both operating systems.

More news as I get it.

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From guido@python.org  Tue Sep 24 15:18:19 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 10:18:19 -0400
Subject: [Python-Dev] Python 2.2.2 Wiki
In-Reply-To: Your message of "Tue, 24 Sep 2002 08:41:28 CDT."
 <15760.27528.301965.340841@12-248-11-90.client.attbi.com>
References: <15760.27528.301965.340841@12-248-11-90.client.attbi.com>
Message-ID: <200209241418.g8OEIJ502369@odiug.zope.com>

>     http://manatee.mojam.com/py222wiki/
> 
> I did *nothing* other than set it up.  I don't know how people want to use
> the Wiki.  Feel free to organize it any way you want, or give me some clues
> and I'll take a crack at a top-level structure.
> 
> Michael McLay had suggested:
> 
>     >> Perhaps Wiki pages would be a good mechanism for collaboration on the
>     >> classification of patches. Create one page for each classification
>     >> type and then use the patch names and title as section titles within
>     >> the page.
> 
> What are the classification types he referred to?

I suppose he was referring to this (from my initial 2.2.2 post last
Friday):

    Basically, someone does the tedious part of triage, which means
    going over *every* 2.3 checkin message (with quick access to the
    corresponding diffs) and sorting them into:

    - already applied

    - trivial reject (e.g. new feature or fix for a bug introduced in
      2.3)

    - trivial accept (pure bugfix that applies cleanly to 2.2)

    - messy (e.g. unclear whether it's a bugfix or a feature even
      after staring at the source, bugfixes that affect binary
      compatibility, bugfixes that can only be applied with much code
      wrangling due to other changes in the code at the same place,
      etc.)

    Feel free to compile a list of "messy" ones and send it to
    python-dev.  It doesn't have to be all at once -- for big messy
    ones a separate python-dev discussion may be appropriate.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Tue Sep 24 15:19:38 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 24 Sep 2002 16:19:38 +0200
Subject: [Python-Dev] Assign to errno allowed?
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>  <200209241416.g8OEGd902355@odiug.zope.com>
Message-ID: <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook>

> > I'm trying to fix selectmodule.c on Windows (it raises
> > bogus exceptions, because select() on Windows does not
> > set errno).
>
> Are you *sure* about that?
>

Yes.

MSDN:

The select function returns the total number of socket handles that
are ready and contained in the fd_set structures, zero if the time
limit expired, or SOCKET_ERROR if an error occurred. If the return
value is SOCKET_ERROR, WSAGetLastError can be used to retrieve a
specific error code.

Thomas



From martin@v.loewis.de  Tue Sep 24 15:32:45 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 24 Sep 2002 16:32:45 +0200
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
 <200209241416.g8OEGd902355@odiug.zope.com>
 <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook>
Message-ID: <m3ptv3pieq.fsf@mira.informatik.hu-berlin.de>

"Thomas Heller" <thomas.heller@ion-tof.com> writes:

> > Are you *sure* about that?
[...]

> The select function returns the total number of socket handles that
> are ready and contained in the fd_set structures, zero if the time
> limit expired, or SOCKET_ERROR if an error occurred. If the return
> value is SOCKET_ERROR, WSAGetLastError can be used to retrieve a
> specific error code.

This is a strong indication, but not enough for certainty. It does not
mention errno at all.

Regards,
Martin



From thomas.heller@ion-tof.com  Tue Sep 24 15:47:17 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 24 Sep 2002 16:47:17 +0200
Subject: [Python-Dev] Assign to errno allowed?
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook><200209241416.g8OEGd902355@odiug.zope.com><000b01c263d5$6a8721e0$e000a8c0@thomasnotebook> <m3ptv3pieq.fsf@mira.informatik.hu-berlin.de>
Message-ID: <000b01c263d9$4780ed80$e000a8c0@thomasnotebook>

From: "Martin v. Loewis" <martin@v.loewis.de>
> "Thomas Heller" <thomas.heller@ion-tof.com> writes:
> 
> > > Are you *sure* about that?
> [...]
> 
> > The select function returns the total number of socket handles that
> > are ready and contained in the fd_set structures, zero if the time
> > limit expired, or SOCKET_ERROR if an error occurred. If the return
> > value is SOCKET_ERROR, WSAGetLastError can be used to retrieve a
> > specific error code.
> 
> This is a strong indication, but not enough for certainty. It does not
> mention errno at all.
> 
Yes.
Here's an experiment (unpatched python):

Python 2.2.1 (#34, Apr  9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import select
>>> select.select([], [], [], 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
select.error: (0, 'Error')
>>>

Patched python:

Python 2.3a0 (#29, Sep 19 2002, 12:38:34) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import select
>>> select.select([], [], [], 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
select.error: (10093, 'Either the application has not called WSAStartup, or WSAStartup failed')
>>> import socket
>>> select.select([], [], [], 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
select.error: (10022, 'An invalid argument was supplied')
>>>

Thomas


From thomas.heller@ion-tof.com  Tue Sep 24 15:51:48 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 24 Sep 2002 16:51:48 +0200
Subject: [Python-Dev] Assign to errno allowed?
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook><200209241416.g8OEGd902355@odiug.zope.com><000b01c263d5$6a8721e0$e000a8c0@thomasnotebook> <m3ptv3pieq.fsf@mira.informatik.hu-berlin.de>
Message-ID: <003901c263d9$e8ab03d0$e000a8c0@thomasnotebook>

From: "Martin v. Loewis" <martin@v.loewis.de>
> "Thomas Heller" <thomas.heller@ion-tof.com> writes:
> 
> > > Are you *sure* about that?
> [...]
> 
> > The select function returns the total number of socket handles that
> > are ready and contained in the fd_set structures, zero if the time
> > limit expired, or SOCKET_ERROR if an error occurred. If the return
> > value is SOCKET_ERROR, WSAGetLastError can be used to retrieve a
> > specific error code.
> 
> This is a strong indication, but not enough for certainty. It does not
> mention errno at all.

Before we dive into philosophical discussions about what this
sentence says, my interpretation would be:
If select() returns SOCKET_ERROR, you *should* call WSAGetLastError()
to get "details about the problem".

Thomas


From skip@pobox.com  Tue Sep 24 15:52:37 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 24 Sep 2002 09:52:37 -0500
Subject: [Python-Dev] Python 2.2.2 Wiki
In-Reply-To: <200209241418.g8OEIJ502369@odiug.zope.com>
References: <15760.27528.301965.340841@12-248-11-90.client.attbi.com>
 <200209241418.g8OEIJ502369@odiug.zope.com>
Message-ID: <15760.31797.295764.509527@12-248-11-90.client.attbi.com>

    >> What are the classification types he referred to?

    Guido> I suppose he was referring to this (from my initial 2.2.2 post
    Guido> last Friday):
    ...

Okay, I created blank WikiNames for those categories.  Kevin Jacobs added a
bunch of bugs to the front page.  Now would probably be a good time for
people to jump in and review those bugs, then move them to the appropriate
classification page.

Skip


From guido@python.org  Tue Sep 24 15:56:26 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 10:56:26 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 16:19:38 +0200."
 <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook> <200209241416.g8OEGd902355@odiug.zope.com>
 <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook>
Message-ID: <200209241456.g8OEuQe11641@odiug.zope.com>

> > > I'm trying to fix selectmodule.c on Windows (it raises
> > > bogus exceptions, because select() on Windows does not
> > > set errno).
> >
> > Are you *sure* about that?
> 
> Yes.
> 
> MSDN:
> 
> The select function returns the total number of socket handles that
> are ready and contained in the fd_set structures, zero if the time
> limit expired, or SOCKET_ERROR if an error occurred. If the return
> value is SOCKET_ERROR, WSAGetLastError can be used to retrieve a
> specific error code.

Argh!  So select() has never returned proper return values on
Windows. :-(

Thanks for fixing this.  Are you gonna fix it in 2.2.2 as well as 2.3?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Sep 24 16:01:35 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 11:01:35 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 16:47:17 +0200."
 <000b01c263d9$4780ed80$e000a8c0@thomasnotebook>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook> <200209241416.g8OEGd902355@odiug.zope.com> <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook> <m3ptv3pieq.fsf@mira.informatik.hu-berlin.de>
 <000b01c263d9$4780ed80$e000a8c0@thomasnotebook>
Message-ID: <200209241501.g8OF1ZN12148@odiug.zope.com>

> Patched python:
> 
> Python 2.3a0 (#29, Sep 19 2002, 12:38:34) [MSC 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import select
> >>> select.select([], [], [], 10)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> select.error: (10093, 'Either the application has not called WSAStartup, or WSAStartup failed')
> >>> import socket
> >>> select.select([], [], [], 10)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> select.error: (10022, 'An invalid argument was supplied')
> >>>

Hm...  I can confirm this on my Win98SE box.  But questions pop up:

Why is the error different the first time?  And why is this an
error at all?  On Linux, this is not an error.  (In fact, time.sleep()
uses this to sleep using subsecond precision.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Tue Sep 24 16:15:04 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 24 Sep 2002 17:15:04 +0200
Subject: [Python-Dev] Assign to errno allowed?
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook> <200209241416.g8OEGd902355@odiug.zope.com> <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook> <m3ptv3pieq.fsf@mira.informatik.hu-berlin.de>              <000b01c263d9$4780ed80$e000a8c0@thomasnotebook>  <200209241501.g8OF1ZN12148@odiug.zope.com>
Message-ID: <005101c263dd$2908a5b0$e000a8c0@thomasnotebook>

> > Patched python:
> > 
> > Python 2.3a0 (#29, Sep 19 2002, 12:38:34) [MSC 32 bit (Intel)] on win32
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> import select
> > >>> select.select([], [], [], 10)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> > select.error: (10093, 'Either the application has not called WSAStartup, or WSAStartup failed')
> > >>> import socket
> > >>> select.select([], [], [], 10)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> > select.error: (10022, 'An invalid argument was supplied')
> > >>>
> 
> Hm...  I can confirm this on my Win98SE box.  But questions pop up:
> 
> Why is the error different the first time?  And why is this an
> error at all?

The winsock library is not initialized the first time - it seems
that socketmodule calls WSAStartup(), but I haven't looked at
this in detail.
Also I think it's not worth fixing: there's no use for select()
on Windows if you don't use sockets - you have to supply at least
one socket descriptor (that's the cause of the second error above).

 Although it could be argued whether it makes sense to simulate
a Linux-compatible select for Windows.

>  On Linux, this is not an error.  (In fact, time.sleep()
> uses this to sleep using subsecond precision.)

From my early Unix (actually Minix) experiments I remember
that select(3) was the only possibility to do subsecond delays
in Unix. Is this still the same today?

> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 
And yes, I will fix it in 2.3 and backport it to 2.2.2.

Thomas


From mwh@python.net  Tue Sep 24 16:21:05 2002
From: mwh@python.net (Michael Hudson)
Date: 24 Sep 2002 16:21:05 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Tools/freeze extensions_win32.ini,1.5,1.6
In-Reply-To: mhammond@users.sourceforge.net's message of "Thu, 27 Jun 2002 18:13:04 -0700"
References: <E17NkJs-0006qr-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <2mwupbmn1a.fsf@starship.python.net>

mhammond@users.sourceforge.net writes:

> Update of /cvsroot/python/python/dist/src/Tools/freeze
> In directory usw-pr-cvs1:/tmp/cvs-serv26120
> 
> Modified Files:
> 	extensions_win32.ini 
> Log Message:
> Patch 574531/Bug 574570 - allow freeze on windows to use the _winreg 
> extension.

Is this a bugfix?

Cheers,
M.

-- 
  This is an off-the-top-of-the-head-and-not-quite-sober suggestion,
  so is probably technically laughable.  I'll see how embarrassed I
  feel tomorrow morning.            -- Patrick Gosling, ucam.comp.misc


From bkc@murkworks.com  Tue Sep 24 16:22:18 2002
From: bkc@murkworks.com (Brad Clements)
Date: Tue, 24 Sep 2002 11:22:18 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <200209241416.g8OEGd902355@odiug.zope.com>
References: Your message of "Tue, 24 Sep 2002 15:24:41 +0200." <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
Message-ID: <3D904AF0.23984.4702510D@localhost>

On 24 Sep 2002 at 10:16, Guido van Rossum wrote:

> Yes, assignment to errno is fine.
> 

Please see patch 505846.

I haven't supplied this patch in proper form yet, but this discussion relates to the patch.

I would like to remind folks that on some platforms, one cannot just use "errno = 0". On 
those platforms calling a function is required to set errno.

The point of patch 505846 is to "standardize" the "errno = " function, and secondarily 
provide a way to "get" the errno. This is done in pyport.h and "all modules" that use or 
set errno (not as many as you might think).

It's an ugly patch that requires a lot of changes to the core. I'm willing to make all the 
changes to the core as needed, once we figure out what the best way to handle this issue is.

In fact, it's this patch that is the principal cause of the "fork python ce" thread also 
recently discussed in this forum.  See "Need advice: cloning python cvs for CE project"  

Windows CE doesn't allow setting errno. Neither does NetWare (CLIB). 

Is it worthwhile to discuss patch 505846 some more in this thread? Perhaps those who 
haven't read the comments on the patch have a clever solution? 

Or should I just clean up my patch, resubmit it and move on?

I agree with Mark's post about keeping CE changes in the core. I'd rather do that. I 
submitted patch 505846 incorrectly and need to fix it. But after it's submitted, and if 
accepted, core developers would need to use Py_SetErrno instead of "errno = ".

As for extension developers: using the macro would be nice, but it's less of an issue 
since CE and NetWare ports have to be done "by hand" anyway for these modules; we 
can make those changes as they're encountered.

So .. discuss this, look for better insight, or resubmit the patch and move on?

Thanks

Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From guido@python.org  Tue Sep 24 16:27:46 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 11:27:46 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 17:15:04 +0200."
 <005101c263dd$2908a5b0$e000a8c0@thomasnotebook>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook> <200209241416.g8OEGd902355@odiug.zope.com> <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook> <m3ptv3pieq.fsf@mira.informatik.hu-berlin.de> <000b01c263d9$4780ed80$e000a8c0@thomasnotebook> <200209241501.g8OF1ZN12148@odiug.zope.com>
 <005101c263dd$2908a5b0$e000a8c0@thomasnotebook>
Message-ID: <200209241527.g8OFRkZ12485@odiug.zope.com>

> > Why is the error different the first time?  And why is this an
> > error at all?
> 
> The winsock library is not initialized the first time - it seems
> that socketmodule calls WSAStartup(), but I haven't looked at
> this in detail.

Oh well, that makes some sense.

> Also I think it's not worth to fix it, there's no use for select()
> on windows if you don't use sockets - you have to supply at least
> one socket descriptor (that's the cause for the second error above).

OK.

>  Although it could be argued whether it makes sense to simulate
> a Linux-compatible select for Windows.

Nah, it's been like this for a decade.

> >  On Linux, this is not an error.  (In fact, time.sleep()
> > uses this to sleep using subsecond precision.)
> 
> From my early Unix (actually Minix) experiments I remember
> that select(3) was the only possibility to do subsecond delays
> in Unix. Is this still the same today?

Probably.  HAVE_SELECT is the first thing tested in floatsleep().
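
For readers who want to try the trick, here is a minimal Python sketch
of a select()-based subsecond sleep.  The name subsecond_sleep is made
up for illustration, and the fallback to time.sleep() on Windows is an
assumption based on the errors shown earlier in this thread; it is not
how floatsleep() itself is written.

```python
import select
import sys
import time


def subsecond_sleep(seconds):
    """Sleep for a possibly fractional number of seconds."""
    if sys.platform == "win32":
        # Winsock's select() rejects three empty lists (error 10022
        # in the session quoted earlier), so do not call it here.
        time.sleep(seconds)
    else:
        # On Unix, select() with no descriptors just waits for the
        # timeout, which may be a float.
        select.select([], [], [], seconds)
```

Calling subsecond_sleep(0.25) should pause for roughly a quarter of a
second on either kind of platform.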

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Sep 24 16:37:49 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 11:37:49 -0400
Subject: [Python-Dev] PEP 282 Implementation
In-Reply-To: Your message of "Tue, 24 Sep 2002 08:56:33 BST."
 <003901c2639f$e7e21ae0$652b6992@alpha>
References: <00e001c2261d$19bfc320$652b6992@alpha> <200208092040.g79Ke3S31416@pcp02138704pcs.reston01.va.comcast.net> <006601c24e7f$fcff5440$652b6992@alpha> <200209231741.g8NHfEl11092@pcp02138704pcs.reston01.va.comcast.net> <01aa01c26379$8d6def60$4901000a@dorothy>
 <003901c2639f$e7e21ae0$652b6992@alpha>
Message-ID: <200209241537.g8OFbnc13228@odiug.zope.com>

> Chris McDonough wrote:
> > It would be helpful for the FileHandler class to define a method
> > which just closes and reopens the current logfile (instead of
> > actually rotating a set like-named logfiles).  This would allow
> > logfile rotation to be performed by a separate process (e.g.
> > RedHat's logrotate).  Sometimes it's better (and even necessary) to
> > be able to use system-provided log rotation facilities instead of
> > relying on the native rotation facilities.
> 
> I'm not sure whether this should be in the core functionality. I
> presume you don't mean an atomic "close and reopen" operation -
> rather, are you suggesting close the file, maybe rename it at the
> application level, then reopen? If so, then it's best handled
> entirely in the application level, through a subclass of
> FileHandler. This allows each application to consider issues such as
> what to do with events that occur between close and reopen (e.g. if
> multiple threads are running).

No, this is using Unix functionality where once you have opened a
file, if the file is renamed, you can continue to write to it and you
will be writing to the renamed file.  IOW the open file is connected
to the inode, not the filename.

Typically an application catches SIGHUP (though that has its share of
problems!) and in response simply closes and reopens the file, using
the original filename.  The sysadmin uses this as follows:

  mv foo.log foo.log.1
  kill -HUP `cat foo.pid`

Having looked at it again, I think that this is definitely better than
doing log rotation in the FileHandler.  The rotation code in the log
handler currently calls tell() after each record is emitted.  This is
expensive, and not needed if you use an external process to watch over
the log files and rotate them.
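
A rough sketch of the close-and-reopen idea, written against the
logging package as it ships today rather than the PEP 282 code under
review.  ReopeningFileHandler and reopen() are hypothetical names, and
using the _open() hook of FileHandler is an assumption about the
modern implementation.

```python
import logging
import os


class ReopeningFileHandler(logging.FileHandler):
    """FileHandler that can close and reopen its file on request."""

    def reopen(self):
        # Hold the handler lock so no record is emitted between the
        # close and the reopen.
        self.acquire()
        try:
            if self.stream is not None:
                self.stream.flush()
                self.stream.close()
            # _open() opens self.baseFilename again, i.e. the original
            # name, which now refers to a fresh file after an external
            # "mv foo.log foo.log.1".
            self.stream = self._open()
        finally:
            self.release()
```

An application could call handler.reopen() from whatever mechanism it
uses to notice a rotation request, for instance a SIGHUP handler
installed with signal.signal() on Unix, keeping in mind the caveats
about SIGHUP mentioned above.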

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Tue Sep 24 16:35:00 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 24 Sep 2002 17:35:00 +0200
Subject: [Python-Dev] Assign to errno allowed?
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook> <200209241416.g8OEGd902355@odiug.zope.com> <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook> <m3ptv3pieq.fsf@mira.informatik.hu-berlin.de> <000b01c263d9$4780ed80$e000a8c0@thomasnotebook> <200209241501.g8OF1ZN12148@odiug.zope.com>              <005101c263dd$2908a5b0$e000a8c0@thomasnotebook>  <200209241527.g8OFRkZ12485@odiug.zope.com>
Message-ID: <008c01c263df$f2296090$e000a8c0@thomasnotebook>

> > Also I think it's not worth to fix it, there's no use for select()
> > on windows if you don't use sockets - you have to supply at least
> > one socket descriptor (that's the cause for the second error above).
> 
> OK.
> 
> >  Although it could be argued whether it makes sense to simulate
> > a Linux-compatible select for Windows.
> 
> Nah, it's been like this for a decade.

In the current form, it breaks asyncore - this is what
I wanted to fix in the first place.
asyncore contains this code snippet in the poll() function:

        try:
            r,w,e = select.select (r,w,e, timeout)
        except select.error, err:
            if err[0] != EINTR:
                raise
            r = []; w = []; e = []

This will fail on Windows if all of r,w,e are empty.
Even if there are active sockets, it may be that this
code is executed with all three lists empty.

How can this be fixed?

I have an SF item at http://www.python.org/sf/611464 discussing this.

Thomas


From guido@python.org  Tue Sep 24 17:02:36 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 12:02:36 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 17:35:00 +0200."
 <008c01c263df$f2296090$e000a8c0@thomasnotebook>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook> <200209241416.g8OEGd902355@odiug.zope.com> <000b01c263d5$6a8721e0$e000a8c0@thomasnotebook> <m3ptv3pieq.fsf@mira.informatik.hu-berlin.de> <000b01c263d9$4780ed80$e000a8c0@thomasnotebook> <200209241501.g8OF1ZN12148@odiug.zope.com> <005101c263dd$2908a5b0$e000a8c0@thomasnotebook> <200209241527.g8OFRkZ12485@odiug.zope.com>
 <008c01c263df$f2296090$e000a8c0@thomasnotebook>
Message-ID: <200209241602.g8OG2am13367@odiug.zope.com>

> > >  Although it could be argued whether it makes sense to simulate
> > > a Linux-compatible select for Windows.
> > 
> > Nah, it's been like this for a decade.
> 
> In the current form, it breaks asyncore - this is what
> I wanted to fix in the first place.
> asyncore contains this code snippet in the poll() function:
> 
>         try:
>             r,w,e = select.select (r,w,e, timeout)
>         except select.error, err:
>             if err[0] != EINTR:
>                 raise
>             r = []; w = []; e = []
> 
> This will fail on Windows if all of r,w,e are empty.

Aargh!!!

Apparently asyncore has never worked properly on Windows.  Note that
it also doesn't check for the Windows error codes on connect().

> Even if there are active sockets, it may be that this
> code is executed with all three lists empty.

Yes.

> How can this be fixed?

Change poll() in asyncore.py to use this:

    if [] == r == w == e:
       time.sleep(timeout)
    else:
       try:
          r, w, e = select.select(r, w, e, timeout)
       except select.error, err:
          ...etc...

> I have an SF item at http://www.python.org/sf/611464 discussing this.

The conclusion there seems that select() should be fixed, but then
goes on to say that there's no easy way to make it interruptible.

Since we don't try to hide the differences between select on Windows
and on Unix in other areas (on Windows you can only select on sockets)
I'm not sure it's worth trying to fix select if you lose
interruptability; fixing asyncore instead is easy enough, and I don't
think this is going to bite too many other applications.
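
Spelled out as a standalone function, the suggested change might look
like the sketch below.  It is written for current Python, where
select.error is OSError and an interrupted call surfaces as
InterruptedError (if at all); poll_once is a made-up name and this is
not the actual asyncore code.

```python
import select
import socket
import time


def poll_once(r, w, e, timeout=0.0):
    """Wait on three descriptor lists, tolerating all of them empty."""
    if not (r or w or e):
        # Nothing to select on: Winsock would raise error 10022, so
        # just sleep for the timeout instead.
        time.sleep(timeout)
        return [], [], []
    try:
        return select.select(r, w, e, timeout)
    except InterruptedError:
        # Same policy as the original snippet: treat EINTR as
        # "nothing is ready".
        return [], [], []
```

With three empty lists this returns three empty lists after the
timeout on any platform; with a readable socket in r, that socket
comes back in the first list.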

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@barrys-emacs.org  Tue Sep 24 17:09:26 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Tue, 24 Sep 2002 17:09:26 +0100
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <008c01c263df$f2296090$e000a8c0@thomasnotebook>
Message-ID: <000001c263e4$c1181730$060210ac@private>

select on Windows is very limited. It may only be called
with socket handles. You cannot use a C RTL fd or any other sort
of handle with it.

Because it's part of winsock and not part of the C RTL, it cannot mess
with errno itself.

> HAVE_SELECT is the first thing tested in floatsleep().

HAVE_SELECT should probably be undefined on Windows, with the exception
that the socket module for Windows can use it.

	BArry




From tim.one@comcast.net  Tue Sep 24 17:31:54 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 24 Sep 2002 12:31:54 -0400
Subject: [Python-Dev] Assign to errno allowed?
Message-ID: <f9a82f93cf.f93cff9a82@icomcast.net>

[Brad Clements]
> ...
> I would like to remind folks that on some platforms, one cannot 
> just use "errno = 0". On those platforms calling a function is
> required to set errno.

Except that errno=0 works fine on any platform with a standard-
conforming C implementation, and this isn't even a fuzzy POSIX issue -- 
it's a requirement of the C standard.  We can define piles of macros 
instead, but I expect that, as the years drag on, people will "forget" 
to use them.



From gmcm@hypernet.com  Tue Sep 24 17:27:42 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 24 Sep 2002 12:27:42 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <200209241501.g8OF1ZN12148@odiug.zope.com>
References: Your message of "Tue, 24 Sep 2002 16:47:17 +0200." <000b01c263d9$4780ed80$e000a8c0@thomasnotebook>
Message-ID: <3D905A3E.29317.245048AE@localhost>

While we're discussing the non-conformance of Windows' select,
these 2 errors:

> > select.error: (10093, 'Either the application has not called
> > WSAStartup, or WSAStartup failed')

> > select.error: (10022, 'An invalid argument was supplied')

are about the only errors you'll get from select on Windows.
Where select would return a socket in the errors list on
*nix, on Windows it will come out as readable / writeable, and
it's the socket send / recv that will find out what the problem
is. Each version of winsock gets a bit better, but (for example)
selecting for write in Win9x-era winsock is essentially a busy-wait.
You'll get the socket back immediately, go to write and get
the Windows EWOULDBLOCK error.

but-heck-it-multitasks-ly-y'rs
-- Gordon
http://www.mcmillan-inc.com/



From guido@python.org  Tue Sep 24 17:23:16 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 12:23:16 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 17:09:26 BST."
 <000001c263e4$c1181730$060210ac@private>
References: <000001c263e4$c1181730$060210ac@private>
Message-ID: <200209241623.g8OGNGG13479@odiug.zope.com>

> select on windows is very limited. It is only allowed to be called
> with socket handles. You cannot use C RTL fd with it or another sort
> of handle.
> 
> Because its part of winsock and not part of the C RTL so it cannot mess
> with errno itself.

Yes I know.

> > HAVE_SELECT is the first thing tested in floatsleep().
> 
> HAVE_SELECT should probably be undefined on windows. With the expection
> that the sockets module for windows can use it.

Sorry, floatsleep() on Windows doesn't ever get to testing
HAVE_SELECT.  So no worry, that part at least works.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Sep 24 17:21:13 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 12:21:13 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 11:22:18 EDT."
 <3D904AF0.23984.4702510D@localhost>
References: "Your message of Tue, 24 Sep 2002 15:24:41 +0200." <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
 <3D904AF0.23984.4702510D@localhost>
Message-ID: <200209241621.g8OGLD113452@odiug.zope.com>

> > Yes, assignment to errno is fine.
> 
> Please see patch 505846.
> 
> I haven't supplied this patch in proper form yet, but this
> discussion relates to the patch.
> 
> I would like to remind folks that on some platforms, one cannot just
> use "errno = 0". On those platforms calling a function is required
> to set errno.

Shucks.  That's in violation of the ISO C standard.

> The point of patch 505846 is to "standardized" the "errno = "
> function, and secondarily provide a way to "get" the errno. This is
> done in pyport.h and "all modules" that use or set errno. (not as
> many as you might think)

Why also provide an alternative way to get it?  Surely you can *get* it
even on Win/CE?

> It's an ugly patch, requires a lot of changes to the core. I'm
> willing to make all the changes to the core as needed, once we
> figure out the best way to handle this issue is.

I have a strong urge to tell you to start porting Linux to your CE
hardware rather than bothering with Win/CE.  Or buy an iPAQ for which
Linux is already available.

> In fact, it's this patch that is the principal cause of the "fork
> python ce" thread also recently discussed in this forum.  See "Need
> advice: cloning python cvs for CE project"

I've given all the advice I have time for.

> Windows CE doesn't allow setting errno. Neither does NetWare (CLIB).

Sigh.

> Is it worthwhile to discuss patch 505846 some more in this thread?
> Perhaps those who haven't read the comments on the patch have a
> clever solution?
> 
> Or should I just clean up my patch, resubmit it and move on?
> 
> I agree with Mark's post about keeping CE changes in the core. I'd
> rather do that. I submitted patch 505846 incorrectly and need to fix
> it.. But after it's submitted and if accepted, core developers would
> need to use Py_SetErrno instead of "errno = "

Except in extensions that don't have a snowball in hell's chance of
working on Win/CE, of course.

> And for extension developers. Using the macro would be nice, but
> it's less of an issue since CE and NetWare ports have to be done "by
> hand" anyway for these modules, we can make those changes as they're
> encountered.
> 
> So .. discuss this, look for better insight, or resubmit the patch
> and move on?

As I said, I have a very strong urge to tell you to go away.  But I
won't.  But I really don't like the idea of coding around this
particular platform's quirks.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bkc@murkworks.com  Tue Sep 24 17:43:13 2002
From: bkc@murkworks.com (Brad Clements)
Date: Tue, 24 Sep 2002 12:43:13 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <f9a82f93cf.f93cff9a82@icomcast.net>
Message-ID: <3D905DE7.6823.474C66D7@localhost>

On 24 Sep 2002 at 12:31, Tim Peters wrote:

> > I would like to remind folks that on some platforms, one cannot 
> > just use "errno = 0". On those platforms calling a function is
> > required to set errno.
> 
> Except that errno=0 works fine on any platform with a standard-
> conforming C implementation, and this isn't even a fuzzy POSIX issue -- 
> it's a requirement of the C standard.  We can define piles of macros 
> instead, but I expect that, as the years drag on, people will "forget" 
> to use them.

I don't argue the point that "this stinks". 

What are our choices:

1. CE port will always be a distinct branch/other cvs, and require gobs of work by "CE 
porters" for every new core release, changing all the "errno" statements 

2. CE changes will always be a pain in the butt for core developers forced to remember 
Py_SetErrno()

3. No CE port.

(oh, also put "NetWare" in there wherever you see the word CE .. and I suspect some 
other embedded operating systems running on MMU-less processors that can't 
virtualize errno and don't have TLS)

Surprisingly, there aren't that many modules that reference errno directly.

In fact, the math stuff is the worst offender .. ;-)



Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From martin@v.loewis.de  Tue Sep 24 15:30:24 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 24 Sep 2002 16:30:24 +0200
Subject: [Python-Dev] bug 576990
In-Reply-To: <3D904828.766779BE@strw.leidenuniv.nl>
References: <3D904828.766779BE@strw.leidenuniv.nl>
Message-ID: <m3u1kfpiin.fsf@mira.informatik.hu-berlin.de>

Roeland Rengelink <rengelin@strw.leidenuniv.nl> writes:

> 5. This is clearly a profound and interesting bug, but solving this
> seems to involve cans of worms, ten-foot poles, and a re-write of the
> core.

To me, it sounds like this. This has been changed back and forth, and
in every state, somebody is unhappy.

Regards,
Martin


From guido@python.org  Tue Sep 24 18:44:14 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 13:44:14 -0400
Subject: [Python-Dev] bug 576990
In-Reply-To: Your message of "Tue, 24 Sep 2002 16:30:24 +0200."
 <m3u1kfpiin.fsf@mira.informatik.hu-berlin.de>
References: <3D904828.766779BE@strw.leidenuniv.nl>
 <m3u1kfpiin.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200209241744.g8OHiEk28139@odiug.zope.com>

> Roeland Rengelink <rengelin@strw.leidenuniv.nl> writes:
> 
> > 5. This is clearly a profound and interesting bug, but solving this
> > seems to involve cans of worms, ten-foot poles, and a re-write of the
> > core.

[Martin]
> To me, it sounds like this. This has been changed forth and back, and
> in every state, somebody is unhappy.

Yes, it's very messy, see my comments to the SF bug entry.  I see no
fix that doesn't break something else.

Note that this "worked" in the initial 2.2 release only when the
subclass didn't have a docstring of its own:

>>> class P(property):
...   "This is class P"
... 
>>> p = P(None, None, None, "this is property p")
>>> p.__doc__
'This is class P'
>>> 

The best workaround I can see that works everywhere is:

class P(property):
    "class P's docstring"
    __doc__ = property.__dict__['__doc__']

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Tue Sep 24 19:46:23 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 24 Sep 2002 20:46:23 +0200
Subject: [Python-Dev] Assign to errno allowed?
References: <3D905DE7.6823.474C66D7@localhost>
Message-ID: <004101c263fa$b1c88ec0$ced241d5@hagrid>

brad wrote:

> (oh, also put "NetWare" in there wherever you see the word CE .. and I
> suspect some other embedded operating systems running on MMU-less
> processors that can't virtualize errno and don't have TLS)

that's why the specification says that "errno" might be a macro,
and why many platforms define that macro to be something like:

    #define errno (*_errno())

</F>



From guido@python.org  Tue Sep 24 19:51:49 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 14:51:49 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 12:43:13 EDT."
 <3D905DE7.6823.474C66D7@localhost>
References: <3D905DE7.6823.474C66D7@localhost>
Message-ID: <200209241851.g8OIpn328512@odiug.zope.com>

> What are our choices:
> 
> 1. CE port will always be a distinct branch/other cvs, and require
>    gobs of work by "CE porters" for every new core release, changing
>    all the "errno" statements
> 
> 2. CE changes will always be a pain in the butt for core developers
>    forced to remember Py_SetErrno()
> 
> 3. No CE port.
> 
> (oh, also put "NetWare" in there wherever you see the word CE .. and
> I suspect some other embedded operating systems running on MMU-less
> processors that can't virtualize errno and don't have TLS)
> 
> Surprisingly, there aren't that many modules that reference errno directly.
> 
> In fact, the math stuff is the worst offender .. ;-)

I'm strongly against 2.  5 years from now, CE and NetWare and their
limitations will only be a vague memory, but this convention will
still cripple the Python source code.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bkc@murkworks.com  Tue Sep 24 20:04:40 2002
From: bkc@murkworks.com (Brad Clements)
Date: Tue, 24 Sep 2002 15:04:40 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <004101c263fa$b1c88ec0$ced241d5@hagrid>
Message-ID: <3D907F0D.19031.47CDE6D8@localhost>

On 24 Sep 2002 at 20:46, Fredrik Lundh wrote:

> brad wrote:
> 
> > (oh, also put "NetWare" in there wherever you see the word CE .. and I
> > suspect some other embedded operating systems running on MMU-less
> > processors that can't virtualize errno and don't have TLS)
> 
> that's why the specification says that "errno" might be a macro,
> and why many platforms define that macro to be something like:
> 
>     #define errno (*_errno())

Yes, but on CE you still cannot get a pointer to the thread-specific errno. You cannot 
take its address.

So .. errno is

#define errno	GetLastError()

I wish this were not true.


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From barry@barrys-emacs.org  Tue Sep 24 20:10:31 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Tue, 24 Sep 2002 20:10:31 +0100
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
Message-ID: <000a01c263fe$0d796de0$060210ac@private>

Windows CE prevents assignment to errno...

There would be a solution if you compiled all the code as C++.
(Assuming that C++ reserved words are not used in the python code.)

Inject the following definitions:


	class ErrnoHack
		{
	public:
		operator int();	// return errno value
		ErrnoHack &operator =( int ); // assign to errno
		};

	ErrnoHack ErrnoObject;

	#define errno ErrnoObject

and you can then write

	errno = 0;



BArry




From bkc@murkworks.com  Tue Sep 24 20:31:23 2002
From: bkc@murkworks.com (Brad Clements)
Date: Tue, 24 Sep 2002 15:31:23 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <200209241851.g8OIpn328512@odiug.zope.com>
References: Your message of "Tue, 24 Sep 2002 12:43:13 EDT." <3D905DE7.6823.474C66D7@localhost>
Message-ID: <3D908550.9509.47E65C8B@localhost>

On 24 Sep 2002 at 14:51, Guido van Rossum wrote:

> I'm strongly against 2.  5 years from now, CE and NetWare and their
> limitations will only be a vague memory, but this convention will
> still cripple the Python source code.

No arguments about CE, anyway ..

(noting that NetWare has reached its ten-year anniversary ;-)

I guess then the best solution is a distinct CVS for "the port to oddball platforms"

or, the other option is lots of #ifdefs in the code.

The original reason I proposed the macro idea was to eliminate multiple nested 
#ifdefs.. 

For example, I had trouble figuring out nested #ifdefs in posixmodule, as generated by 
Mark in his CE port.. It's awful.

By switching to macros for "errno =" I was able to clean up a lot of the #ifdefs

If changes for CE (and other errno-less OS's) are to be kept in the core, then we'll 
either have (from Modules/cPickle.c)

#ifndef	WINDOWSCE
	errno = 0;
#else
	SetLastError(0);
#endif
	l = strtol(s, &endptr, 0);

#ifndef	WINDOWSCE
	if (errno || (*endptr != '\n') || (endptr[1] != '\0')) {
#else
	if (GetLastError() || (*endptr != '\n') || (endptr[1] != '\0')) {
#endif
		/* Hm, maybe we've got something long.  Let's try reading
		   it as a Python long object. */
#ifndef	WINDOWSCE
		errno = 0;
#else
		SetLastError(0);
#endif

---  Or ---

	Py_SetErrno(0);
	l = strtol(s, &endptr, 0);

	if (Py_GetErrno() || (*endptr != '\n') || (endptr[1] != '\0')) {
		/* Hm, maybe we've got something long.  Let's try reading
		   it as a Python long object. */
		Py_SetErrno(0);
  

Keep in mind that when adding NetWare (or any other embedded OS or BIOS that wants to 
play), the #ifdef version gets much worse.  The macro version doesn't change.
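
The clear / call / check sequence in the excerpt can be tried from
Python through ctypes, which itself exposes errno via a pair of
accessor functions (ctypes.set_errno and ctypes.get_errno) rather than
an assignable variable.  This is only an illustration of the pattern,
not the patch: parse_c_long is a made-up name, and loading the C
library with CDLL(None) assumes a POSIX system.

```python
import ctypes
import errno

# Assumption: on POSIX, CDLL(None) gives access to the C library.
# use_errno=True makes ctypes keep a private copy of errno around
# each foreign call.
libc = ctypes.CDLL(None, use_errno=True)
libc.strtol.restype = ctypes.c_long
libc.strtol.argtypes = [
    ctypes.c_char_p,
    ctypes.POINTER(ctypes.c_char_p),
    ctypes.c_int,
]


def parse_c_long(text):
    """Return text parsed by strtol(), or None if it overflows a C long."""
    ctypes.set_errno(0)                  # like "errno = 0"
    value = libc.strtol(text.encode("ascii"), None, 0)
    if ctypes.get_errno() == errno.ERANGE:
        return None                      # too big for a C long
    return value
```

A caller that gets None back would then fall through to the "maybe
we've got something long" branch and try an arbitrary-precision
conversion instead.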

There are approximately 140 references to errno in the Modules directory alone. For 
the Alpha port of Python 2.2 to CE I changed every one of them (at least for any 
module that could run on CE, which is just about everything that runs on Win32)

I've already said that I've made these changes and am willing to make them all again. 
Once they're in there, how difficult will it be to keep them?

New code that uses errno will fail on subsequent builds on these errno-less platforms, 
but there will only be a handful of changes needed, rather than hundreds on every 
release of the core.

So .. developers who just write "errno = " in the future won't be penalized, rather the 
porters to errno-less platforms will just have to convert the expression to macro mode.

And that conversion process isn't a hardship if it only has to be done once for any given 
line of code on any given core release.. But if we (porters) have to change 140 
references every single time there's a release.. I could see enthusiasm fading away 
faster than those errno-less platforms ;-)

Clearly you guys know what's best better than I do. My line of reasoning for errno 
macros was:

1. on most platforms the macros compile away to what you'd write in C anyway

2. I'd generate and submit all the initial patches to the core to switch errno references 
to macros, leaving the burden of review and checkin to the core team (sorry) but not 
the burden of finding and changing all errno references

3. But once the patches are in place, future releases wouldn't require nearly as much 
effort for re-ports to errno-less platforms because only a few lines would need to be 
"fixed up" to use macros, and only if the changed code didn't use the macros in the first 
place.

4. Not using the macros in core or extension source isn't an issue for any platform, 
except errno-less OS's.. at which time that code gets macro'ized at the time of the port.

5. What this does is reduce effort for future ports to crippled systems, at the expense 
of many initial changes, whose subsequent maintenance shouldn't (hopefully) be a 
burden, since ports to crippled systems would maintain the changes.

Though I do agree, a future mishmash of macros and direct errno references in the core 
will be ugly and confusing if that occurs.

(sorry this is so long, just want to clearly state my case if I have not already done so)

Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From guido@python.org  Wed Sep 25 02:29:19 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 24 Sep 2002 21:29:19 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Tue, 24 Sep 2002 15:31:23 EDT."
 <3D908550.9509.47E65C8B@localhost>
References: "Your message of Tue, 24 Sep 2002 12:43:13 EDT." <3D905DE7.6823.474C66D7@localhost>
 <3D908550.9509.47E65C8B@localhost>
Message-ID: <200209250129.g8P1TJp26117@pcp02138704pcs.reston01.va.comcast.net>

[Brad, would you mind limiting your messages to lines of about 72
characters at most?]

> I guess then the best solution is a distinct CVS for "the port to
> oddball platforms"
> 
> or, the other option is lots of #ifdefs in the code.
> 
> The original reason I proposed the macro idea was to eliminate
> multiple nested #ifdefs.. 
> 
> For example, I had trouble figuring out nested #ifdefs in
> posixmodule, as generated by Mark in his CE port.. It's awful.
> 
> By switching to macros for "errno =" I was able to clean up a lot of
> the #ifdefs

Absolutely.  *If* you want to tackle this, macros for setting errno
are the way to go.

> If changes for CE (and other errno-less OS's) are to be kept in the
> core, then we'll either have (from Modules/cPickle.c)
> 
> #ifndef	WINDOWSCE
> 	errno = 0;
> #else
> 	SetLastError(0);
> #endif
> 	l = strtol(s, &endptr, 0);
> 
> #ifndef	WINDOWSCE
> 	if (errno || (*endptr != '\n') || (endptr[1] != '\0')) {
> #else
> 	if (GetLastError() || (*endptr != '\n') || (endptr[1] != '\0')) {
> #endif
> 		/* Hm, maybe we've got something long.  Let's try reading
> 		   it as a Python long object. */
> #ifndef	WINDOWSCE
> 		errno = 0;
> #else
> 		SetLastError(0);
> #endif
> 
> ---  Or ---
> 
> 	Py_SetErrno(0);
> 	l = strtol(s, &endptr, 0);
> 
> 	if (Py_GetErrno() || (*endptr != '\n') || (endptr[1] != '\0')) {
> 		/* Hm, maybe we've got something long.  Let's try reading
> 		   it as a Python long object. */
> 		Py_SetErrno(0);

Question.  You showed that errno was #defined as a call to the right
function.  Why don't you leave *getting* errno alone?

You talk of 100s of places using errno.  But how many places *set*
errno?

> Keeping in mind that adding NetWare or (other embedded OS or BIOS
> that wants to play) then the #ifdef version gets much worse.  The
> macro version doesn't change.

Brad, nobody said they preferred the #ifdef version over the macro!
The question is simply whether to use the macro or keep using errno,
in tune with Standard C.

> There are approximately 140 references to errno in the Modules
> directory alone. For the Alpha port of Python 2.2 to CE I changed
> every one of them (at least for any module that could run on CE,
> which is just about everything that runs on Win32)
> 
> I've already said that I've made these changes and am willing to
> make them all again.  Once they're in there, how difficult will it
> be to keep them?

Experience shows that each new release will be broken for your
platform anyway unless you actively maintain it as we gear up for a
release.  Even between alpha or beta releases the code base is likely
to change in some subtle way that breaks your release.  So you'll have
to chase down new uses of errno assignment leading up to each release.

> New code that uses errno will fail on subsequent builds on these
> errno-less platforms, but there will only be a handful of changes
> needed, rather than hundreds on every release of the core.

If you maintain a branch that uses the errno macro, you could merge
the trunk into that branch each time you feel like synching up with
the trunk.  That's a mostly mechanical process, certainly less than
fixing 100s of errno uses manually each time.

Or you could simply maintain a patch in the form of a context diff
that patches the 100s of places using errno -- assuming this is mostly
in stable code, you'd only have to fix up a handful of new occurrences
and places where the patch context has gotten out of sync.

> So .. developers who just write "errno = " in the future won't be
> penalized, rather the porters to errno-less platforms will just have
> to convert the expression to macro mode.

They *would* be penalized, because you have to fix their code, and
then they have to test it again, etc.  It's yet one more thing to
worry about.

> And that conversion process isn't a hardship if it only has to be
> done once for any given line of code on any given core release.. But
> if we (porters) have to change 140 references every single time
> there's a release.. I could see enthusiasm fading away faster than
> those errno-less platforms ;-)
> 
> Clearly you guys know what's best better than I do. My line of
> reasoning for errno macros was:
> 
> 1. on most platforms the macros compile away to what you'd write in
>    C anyway
> 
> 2. I'd generate and submit all the initial patches to the core to
>    switch errno references to macros, leaving the burden of review
>    and checkin to the core team (sorry) but not the burden of
>    finding and changing all errno references

That's another problem.  Whenever there's a massive "peephole" change
like this, there are always a few places that are broken but that no
reviewer notices and that don't happen to be tested by the test suite.
(After all, errno is only consulted when an error occurs, and some
errors are darn hard to provoke.)

> 3. But once the patches are in-place, future releases wouldn't
>    require nearly as much effort for re-port's to errno-less
>    platforms because only a few lines would need to be "fixed up" to
>    use macros, and only if the changed code didn't use the macros in
>    the first place.
> 
> 4. Not using the macros in core or extension source isn't an issue
>    for any platform, except errno-less OS's.. at which time that
>    code gets macro'ized at the time of the port.
> 
> 5. What this does is reduce effort for future ports to crippled
>    systems, at the expense of many initial changes, whose subsequent
>    maintenance shouldn't (hopefully) be a burden, since ports to
>    crippled systems would maintain the changes.
> 
> Though I do agree, a future mishmash of macros and direct errno
> references in the core will be ugly and confusing if that occurs.
> 
> (sorry this is so long, just want to clearly state my case if I have
> not already done so)

I totally understand your case.  I just don't like having to avoid
something that's legal according to the C Standard because of backward
platforms.  Sure, there were platform-specific changes for many other
minority platforms.  But none of them AFAIK required us to change
something that's Standard C -- these changes were usually about
system calls or filename conventions.  And we bend over for Win32
because it's the dominant platform.  For handhelds, I expect that
WinCE will be replaced by something less broken soon.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Wed Sep 25 11:21:46 2002
From: mwh@python.net (Michael Hudson)
Date: 25 Sep 2002 11:21:46 +0100
Subject: [Python-Dev] httplib.py vs 222
Message-ID: <2mvg4ul685.fsf@starship.python.net>

I'm tempted to just dump the trunk's version of httplib.py onto the
release branch.  Poring over logs reveals that nearly every checkin is
marked as a bugfix candidate (occasionally with a question mark).

Certainly it seems it would be easier to work the checkins that aren't
bugfixes out of the trunk than work those that are into the branch, if
you see what I mean.

But I've never used httplib, so I thought I should ask here first (and
Cc: the people responsible for most of the changes).

Comments?

Cheers,
M.

-- 
  I've even been known to get Marmite *near* my mouth -- but never
  actually in it yet.  Vegamite is right out.
 UnicodeError: ASCII unpalatable error: vegamite found, ham expected
                                       -- Tim Peters, comp.lang.python


From rengelin@strw.leidenuniv.nl  Wed Sep 25 11:46:15 2002
From: rengelin@strw.leidenuniv.nl (Roeland Rengelink)
Date: Wed, 25 Sep 2002 12:46:15 +0200
Subject: [Python-Dev] bug 576990
References: <3D904828.766779BE@strw.leidenuniv.nl>
 <m3u1kfpiin.fsf@mira.informatik.hu-berlin.de> <200209241744.g8OHiEk28139@odiug.zope.com>
Message-ID: <3D9193F7.4976E479@strw.leidenuniv.nl>

Guido van Rossum wrote:
> 
> > Roeland Rengelink <rengelin@strw.leidenuniv.nl> writes:
> >
> > > 5. This is clearly a profound and interesting bug, but solving this
> > > seems to involve cans of worms, ten-foot poles, and a re-write of the
> > > core.
> 
> [Martin]
> > To me, it sounds like this. This has been changed forth and back, and
> > in every state, somebody is unhappy.
> 
> Yes, it's very messy, see my comments to the SF bug entry.  I see no
> fix that doesn't break something else.
> 
> Note that this "worked" in the initial 2.2 release only when the
> subclass didn't have a docstring of its own:
> 
> >>> class P(property):
> ...   "This is class P"
> ...
> >>> p = P(None, None, None, "this is property p")
> >>> p.__doc__
> 'This is class P'
> >>>
> 
> The best workaround I can see that works everywhere is:
> 
> class P(property):
>     "class P's docstring"
>     __doc__ = property.__dict__['__doc__']
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

Thanks for the response and thanks for the workaround. It does solve my
immediate problem, and I can live with losing "class P's docstring" in
pydoc.

I wish I could do more to help though,

Roeland


From bkc@murkworks.com  Wed Sep 25 15:20:21 2002
From: bkc@murkworks.com (Brad Clements)
Date: Wed, 25 Sep 2002 10:20:21 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
In-Reply-To: <000a01c263fe$0d796de0$060210ac@private>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
Message-ID: <3D918DE6.582.3CFA8D6@localhost>

On 24 Sep 2002 at 20:10, Barry Scott wrote:

> Windows CE prevents assignment to errno...
> 
> There would be a solution if you compiled all the code as C++.
> (Assuming that C++ reserved words are not used in the python code.)

Oh that's the other problem I ran into.

The core .c has a lot  of

  goto finally;


finally:

In C mode, this shouldn't matter. But MS's EVT compiler considers "finally" to be a 
reserved word.  I had to change all of these too. (I used local_finally or some such 
thing)




Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From jeremy@alum.mit.edu  Wed Sep 25 15:44:10 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Wed, 25 Sep 2002 10:44:10 -0400
Subject: [Python-Dev] Re: httplib.py vs 222
In-Reply-To: <2mvg4ul685.fsf@starship.python.net>
References: <2mvg4ul685.fsf@starship.python.net>
Message-ID: <15761.52154.3349.23866@slothrop.zope.com>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

  MH> I'm tempted to just dump the trunk's version of httplib.py onto
  MH> the release branch.  Poring over logs reveals that nearly every
  MH> checkin is marked as a bugfix candidate (occasionally with a
  MH> question mark).

  MH> Certainly it seems it would be easier to work the checkins that
  MH> aren't bugfixes out of the trunk than work those that are into
  MH> the branch, if you see what I mean.

  MH> But I've never used httplib, so I thought I should ask here
  MH> first (and Cc: the people responsible for most of the changes).

  MH> Comments?

I think it makes sense to make httplib identical.  The changes to
httplib have all been intended to make it more robust.  

My one worry is that a set of changes I made may have broken pipelined
https requests in order to fix a different set of bugs.  I had
intended to check whether pipelined https requests actually worked in
2.2.1.

Jeremy



From mwh@python.net  Wed Sep 25 16:14:25 2002
From: mwh@python.net (Michael Hudson)
Date: 25 Sep 2002 16:14:25 +0100
Subject: [Python-Dev] Re: httplib.py vs 222
In-Reply-To: Jeremy Hylton's message of "Wed, 25 Sep 2002 10:44:10 -0400"
References: <2mvg4ul685.fsf@starship.python.net> <15761.52154.3349.23866@slothrop.zope.com>
Message-ID: <2madm6qey6.fsf@starship.python.net>

Jeremy Hylton <jeremy@alum.mit.edu> writes:

> >>>>> "MH" == Michael Hudson <mwh@python.net> writes:
> 
>   MH> I'm tempted to just dump the trunk's version of httplib.py onto
>   MH> the release branch.  Poring over logs reveals that nearly every
>   MH> checkin is marked as a bugfix candidate (occasionally with a
>   MH> question mark).
> 
>   MH> Certainly it seems it would be easier to work the checkins that
>   MH> aren't bugfixes out of the trunk than work those that are into
>   MH> the branch, if you see what I mean.
> 
>   MH> But I've never used httplib, so I thought I should ask here
>   MH> first (and Cc: the people responsible for most of the changes).
> 
>   MH> Comments?
> 
> I think it makes sense to make httplib identical.  The changes to
> httplib have all been intended to make it more robust.  
> 
> My one worry is that a set of changes I made may have broken pipelined
> https requests in order to fix a different set of bugs.  I had
> intended to check whether pipelined https requests actually worked in
> 2.2.1.

Ooh!  Delegation!  This is now your problem :)

If you (or someone else) don't get to it before 2.2.2's release date,
I'll just dump the trunk's version into the branch.

Cheers,
M.

-- 
  Premature optimization is the root of all evil.
       -- Donald E. Knuth, Structured Programming with goto Statements


From guido@python.org  Wed Sep 25 16:32:00 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 25 Sep 2002 11:32:00 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Wed, 25 Sep 2002 10:20:21 EDT."
 <3D918DE6.582.3CFA8D6@localhost>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
 <3D918DE6.582.3CFA8D6@localhost>
Message-ID: <200209251532.g8PFW0O02531@odiug.zope.com>

> Oh that's the other problem I ran into.
> 
> The core .c has a lot  of
> 
>   goto finally;
> 
> 
> finally:
> 
> In C mode, this shouldn't matter. But MS's EVT compiler considers
> "finally" to be a reserved word.  I had to change all of these
> too. (I used local_finally or some such thing)

This compiler seems to fly in the face of the C std whenever it can.
What are they trying to accomplish?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Wed Sep 25 16:31:29 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 25 Sep 2002 10:31:29 -0500
Subject: [Python-Dev] Re: httplib.py vs 222
In-Reply-To: <2mvg4ul685.fsf@starship.python.net>
References: <2mvg4ul685.fsf@starship.python.net>
Message-ID: <15761.54993.969097.285258@12-248-11-90.client.attbi.com>

    mwh> I'm tempted to just dump the trunk's version of httplib.py onto the
    mwh> release branch.  Poring over logs reveals that nearly every checkin
    mwh> is marked as a bugfix candidate (occasionally with a question
    mwh> mark).

    ...

    mwh> But I've never used httplib, so I thought I should ask here first
    mwh> (and Cc: the people responsible for most of the changes).

The one change I applied to httplib since the 2.2 release was to fix a
problem with invalid urls.  If a colon follows the server name but is
followed by a non-numeric string or no string at all before the start of the
path, an InvalidURL exception is raised.  This is definitely a bugfix
candidate.  Jeremy's hand has been on that module much more heavily.  I
think it should probably be his call.
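
For concreteness, the check described above can be sketched roughly as
follows.  This is only an illustration, not the code in httplib: the
function name split_host_port, the default port of 80 and the local
InvalidURL class are assumptions made for the example.

```python
class InvalidURL(Exception):
    """Hypothetical stand-in for the exception described above."""


def split_host_port(netloc, default_port=80):
    """Split 'host[:port]'; reject an empty or non-numeric port."""
    host, colon, port = netloc.partition(":")
    if not colon:
        # No colon at all: fall back to the default port.
        return host, default_port
    # A colon was given, so a purely numeric port must follow it.
    if not port or any(ch not in "0123456789" for ch in port):
        raise InvalidURL("nonnumeric port: %r" % port)
    return host, int(port)


print(split_host_port("www.python.org"))
print(split_host_port("www.python.org:8080"))
for bad in ("www.python.org:", "www.python.org:abc"):
    try:
        split_host_port(bad)
    except InvalidURL as exc:
        print("rejected:", exc)
```

The sketch ignores IPv6 literals and anything else a real URL parser
has to cope with; it only shows the two failure cases mentioned.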

Skip


From bkc@murkworks.com  Wed Sep 25 16:35:56 2002
From: bkc@murkworks.com (Brad Clements)
Date: Wed, 25 Sep 2002 11:35:56 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
In-Reply-To: <200209251532.g8PFW0O02531@odiug.zope.com>
References: Your message of "Wed, 25 Sep 2002 10:20:21 EDT." <3D918DE6.582.3CFA8D6@localhost>
Message-ID: <3D919F9D.20505.414DCD1@localhost>

On 25 Sep 2002 at 11:32, Guido van Rossum wrote:

> > In C mode, this shouldn't matter. But MS's EVT compiler considers
> > "finally" to be a reserved word.  I had to change all of these
> > too. (I used local_finally or some such thing)
> 
> This compiler seems to fly in the face of the C std whenever it can.
> What are they trying to accomplish?

World domination, what else?

--

I'm not sure, but I think my latest version of Metrowerks has the same issue.

In any case, using "finally" eliminates the possibility of compiling the core as C++,
regardless of the compiler being used.

I thought I remembered seeing a thread on the subject of C++ compilation somewhere.

So, the suggested Errno class hack would work, except I still have to change
all the finally's. I suspect there are other issues with C++ compilation that I am
not aware of.

I'm not proposing anything specific here, just rambling. I need to answer your
other post.

Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From ark@research.att.com  Wed Sep 25 16:38:43 2002
From: ark@research.att.com (Andrew Koenig)
Date: 25 Sep 2002 11:38:43 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
In-Reply-To: <200209251532.g8PFW0O02531@odiug.zope.com>
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook>
 <3D918DE6.582.3CFA8D6@localhost>
 <200209251532.g8PFW0O02531@odiug.zope.com>
Message-ID: <yu99lm5qjczg.fsf@europa.research.att.com>

Guido> This compiler seems to fly in the face of the C std whenever it can.
Guido> What are they trying to accomplish?

To extend the language in ways that lock customers into their platform.

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From mwh@python.net  Wed Sep 25 16:41:44 2002
From: mwh@python.net (Michael Hudson)
Date: Wed, 25 Sep 2002 16:41:44 +0100 (BST)
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to
 errno allowed?
In-Reply-To: <3D919F9D.20505.414DCD1@localhost>
Message-ID: <Pine.LNX.4.44.0209251639320.4070-100000@starship.python.net>

On Wed, 25 Sep 2002, Brad Clements wrote:

> In any case, using "finally" eliminates the possibility of compiling the
> core as C++, regardless of the compiler being used.

Not in this universe.  I think there are already bigger barriers in the 
way of compiling the Python source as C# or Java...

Cheers,
M.



From bkc@murkworks.com  Wed Sep 25 16:44:11 2002
From: bkc@murkworks.com (Brad Clements)
Date: Wed, 25 Sep 2002 11:44:11 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <200209250129.g8P1TJp26117@pcp02138704pcs.reston01.va.comcast.net>
References: Your message of "Tue, 24 Sep 2002 15:31:23 EDT." <3D908550.9509.47E65C8B@localhost>
Message-ID: <3D91A18C.28183.41C688A@localhost>

On 24 Sep 2002 at 21:29, Guido van Rossum wrote:


> Question.  You showed that errno was #defined as a call to the right
> function.  Why don't you leave *getting* errno alone?

Sorry I forgot to clarify that part.

Windows CE 1 and 2 have "errno", but in CE 3.0 they improved 
the OS by eliminating errno and replacing it with
GetLastError()

I suspect this was done to allow CE to be embedded on new
processor types that would not otherwise be supported.


> You talk of 100s of places using errno.  But how many places *set*
> errno?

In the modules dir, grep shows:

File cmathmodule.c:
        Py_SetErrno(0);
File cPickle.c:
        Py_SetErrno(0);
                Py_SetErrno(0);
        Py_SetErrno(0);
File mathmodule.c:
        Py_SetErrno(0);
        Py_SetErrno(0);
        Py_SetErrno(0);
        Py_SetErrno(0);
        Py_SetErrno(0);

Were it not for the GetLastError() issue, this wouldn't be so bad.


> If you maintain a branch that uses the errno macro, you could merge
> the trunk into that branch each time you feel like synching up with
> the trunk.  That's a mostly mechanical process, certainly less than
> fixing 100s of errno uses manually each time.

I agree. I think this will be the best way to go.

> That's another problem.  Whenever there's a massive "peephole" change
> like this, there are always a few places that are broken but that no
> reviewer notices and that don't happen to be tested by the test suite.
> (After all, errno is only consulted when an error occurs, and some errors
> are darn hard to provoke.)

I hadn't considered that aspect of the issue.

--

Seems then that creating a new SF project to hold a "derivative work?" of
the core is the best way to go, but the only difference in this work is

a) using macros for errno

b) changing "finally" labels to something else.

Anyone have a good suggestion for the name of this proposed project? 

(or, would it be a branch of the core? I'm sorry, I'm still a cvs virgin)

I don't think making it CE specific is correct, since it would also be used for 
NetWare.


(oh, did I say 10 years for NetWare? I meant 19)


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From ark@research.att.com  Wed Sep 25 16:44:15 2002
From: ark@research.att.com (Andrew Koenig)
Date: 25 Sep 2002 11:44:15 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
In-Reply-To: <3D919F9D.20505.414DCD1@localhost>
References: <3D918DE6.582.3CFA8D6@localhost>
 <3D919F9D.20505.414DCD1@localhost>
Message-ID: <yu99d6r2jcq8.fsf@europa.research.att.com>

Brad> In any case, using "finally" eliminates the possibility of
Brad> compiling the core as C++, regardless of the compiler being
Brad> used.

"finally" is not a keyword in standard C++.

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From bkc@murkworks.com  Wed Sep 25 16:45:39 2002
From: bkc@murkworks.com (Brad Clements)
Date: Wed, 25 Sep 2002 11:45:39 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
In-Reply-To: <Pine.LNX.4.44.0209251639320.4070-100000@starship.python.net>
References: <3D919F9D.20505.414DCD1@localhost>
Message-ID: <3D91A1E4.7608.41DC28B@localhost>

On 25 Sep 2002 at 16:41, Michael Hudson wrote:

> On Wed, 25 Sep 2002, Brad Clements wrote:
> 
> > In any case, using "finally" eliminates the possibility of compiling the
> > core as C++, regardless of the compiler being used.
> 
> Not in this universe.  I think there are already bigger barriers in the way
> of compiling the Python source as C# or Java...

Uh, how'd we go from C++ to C#/Java ?


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From pedronis@bluewin.ch  Wed Sep 25 16:36:47 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Wed, 25 Sep 2002 17:36:47 +0200
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
References: <3D919F9D.20505.414DCD1@localhost> <3D91A1E4.7608.41DC28B@localhost>
Message-ID: <003501c264a9$5c46f540$6d94fea9@newmexico>

From: Brad Clements <bkc@murkworks.com>
> On 25 Sep 2002 at 16:41, Michael Hudson wrote:
>
> > On Wed, 25 Sep 2002, Brad Clements wrote:
> >
> > > In any case, using "finally" eliminates the possibility of compiling the
> > > core as C++, regardless of the compiler being used.
> >
> > Not in this universe.  I think there are already bigger barriers in the way
> > of compiling the Python source as C# or Java...
>
> Uh, how'd we go from C++ to C#/Java ?

C++ to C#, I think first passing through managed C++, one of the latest MS
inventions <wink>.

regards.



From bkc@murkworks.com  Wed Sep 25 16:50:30 2002
From: bkc@murkworks.com (Brad Clements)
Date: Wed, 25 Sep 2002 11:50:30 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
In-Reply-To: <200209251537.LAA00118@anvil.murkworks.com>
References: <3D91A1E4.7608.41DC28B@localhost>
Message-ID: <3D91A307.29702.42231ED@localhost>

On 25 Sep 2002 at 17:48, Alex Martelli wrote:

> On Wednesday 25 September 2002 05:45 pm, you wrote:
> > On 25 Sep 2002 at 16:41, Michael Hudson wrote:
> > > On Wed, 25 Sep 2002, Brad Clements wrote:
> > > > In any case, using "finally" eliminates the possibility of compiling
> > > > the core as C++, regardless of the compiler being used.
> > >
> > > Not in this universe.  I think there are already bigger barriers in the
> > > way of compiling the Python source as C# or Java...
> >
> > Uh, how'd we go from C++ to C#/Java ?
> 
> "finally" is a reserved word in Java and C# (and Python:-), but not in C++.
> 
> 
> Alex

Thanks for the clarification. This is what I get for using tools from the evil empire -- I 
lose my perspective of standards!

;-)


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
AOL-IM: BKClements



From guido@python.org  Wed Sep 25 17:16:42 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 25 Sep 2002 12:16:42 -0400
Subject: [Python-Dev] Re: httplib.py vs 222
In-Reply-To: Your message of "Wed, 25 Sep 2002 16:14:25 BST."
 <2madm6qey6.fsf@starship.python.net>
References: <2mvg4ul685.fsf@starship.python.net> <15761.52154.3349.23866@slothrop.zope.com>
 <2madm6qey6.fsf@starship.python.net>
Message-ID: <200209251616.g8PGGgM11638@odiug.zope.com>

> If you (or someone else) don't get to it before 2.2.2's release date,
> I'll just dump the trunk's version into the branch.

Why don't you do that now, so we don't forget that part.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Sep 25 17:32:29 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 25 Sep 2002 12:32:29 -0400
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: Your message of "Wed, 25 Sep 2002 11:44:11 EDT."
 <3D91A18C.28183.41C688A@localhost>
References: "Your message of Tue, 24 Sep 2002 15:31:23 EDT." <3D908550.9509.47E65C8B@localhost>
 <3D91A18C.28183.41C688A@localhost>
Message-ID: <200209251632.g8PGWT811833@odiug.zope.com>

> > Question.  You showed that errno was #defined as a call to the right
> > function.  Why don't you leave *getting* errno alone?
> 
> Sorry I forgot to clarify that part.
> 
> Windows CE 1 and 2 have "errno", but in CE 3.0 they improved 
> the OS by eliminating errno and replacing it with
> GetLastError()
> 
> I suspect this was done to allow CE to be embedded on new
> processor types that would not otherwise be supported.
> 
> 
> > You talk of 100s of places using errno.  But how many places *set*
> > errno?
> 
> In the modules dir, grep shows:
> 
> File cmathmodule.c:
>         Py_SetErrno(0);
> File cPickle.c:
>         Py_SetErrno(0);
>                 Py_SetErrno(0);
>         Py_SetErrno(0);
> File mathmodule.c:
>         Py_SetErrno(0);
>         Py_SetErrno(0);
>         Py_SetErrno(0);
>         Py_SetErrno(0);
>         Py_SetErrno(0);
> 
> Were it not for the GetLastError() issue, this wouldn't be so bad.

Well, *that* is easily solved in pyport.h:

#ifdef ...WINCE...
#ifndef errno
#define errno GetLastError()
#endif
#endif

Much better than changing every use of errno, isn't it? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Wed Sep 25 18:01:14 2002
From: mwh@python.net (Michael Hudson)
Date: 25 Sep 2002 18:01:14 +0100
Subject: [Python-Dev] Re: httplib.py vs 222
References: <2mvg4ul685.fsf@starship.python.net> <15761.52154.3349.23866@slothrop.zope.com> <2madm6qey6.fsf@starship.python.net> <200209251616.g8PGGgM11638@odiug.zope.com>
Message-ID: <2mznu62ecl.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> > If you (or someone else) don't get to it before 2.2.2's release date,
> > I'll just dump the trunk's version into the branch.
> 
> Why don't you do that now, so we don't forget that part.

Sure.  It seems Jeremy had already backported quite a pile of these
fixes back in July.

Cheers,
M.

-- 
  There are two ways of constructing a software design: one way is to
  make it so simple that there  are obviously no deficiencies and the
  other way  is to make it so complicated  that there are  no obvious
  deficiencies.                                      -- C. A. R. Hoare


From tim@multitalents.net  Wed Sep 25 21:09:07 2002
From: tim@multitalents.net (Tim Rice)
Date: Wed, 25 Sep 2002 13:09:07 -0700 (PDT)
Subject: [Python-Dev] building 2.2.2 on SCO Open Server
Message-ID: <Pine.UW2.4.44.0209251255110.17925-100000@ou8.int.multitalents.net>

I'm trying to get the release22-maint branch to build on
SCO Open Server 5. When setup.py fails to import an extension but
the .c file compiles, how do you track down why it failed?

I.e.  (lines formatted for readability)

case $MAKEFLAGS in \
*-s*) CC='cc' LDSHARED='cc -G -Kpic -Ki486 -belf -Wl,-Bexport' \
	OPT='-DNDEBUG -O -Ki486 -DSCO5' ./python \
	-E /opt/src/utils/python/python-2.2.2/src/setup.py -q build;; \
*) CC='cc' LDSHARED='cc -G -Kpic -Ki486 -belf -Wl,-Bexport' \
	OPT='-DNDEBUG -O -Ki486 -DSCO5' ./python \
	-E /opt/src/utils/python/python-2.2.2/src/setup.py build;; \
esac
running build
running build_ext
building 'struct' extension
[snip]
building 'pwd' extension
cc -DNDEBUG -O -Ki486 -DSCO5 -Kpic -dy -Bdynamic -I. \
    -I/opt/src/utils/python/python-2.2.2/src/./Include \
    -I/usr/local/include -IInclude/ \
    -c /opt/src/utils/python/python-2.2.2/src/Modules/pwdmodule.c \
    -o build/temp.sco_sv-3.2-i386-2.2/pwdmodule.o
cc -G -Kpic -Ki486 -belf -Wl,-Bexport \
    build/temp.sco_sv-3.2-i386-2.2/pwdmodule.o -L/usr/local/lib \
    -o build/lib.sco_sv-3.2-i386-2.2/pwd.so
WARNING: removing "pwd" since importing it failed


-- 
Tim Rice				Multitalents	(707) 887-1469
tim@multitalents.net




From martin@v.loewis.de  Wed Sep 25 21:19:56 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Sep 2002 22:19:56 +0200
Subject: [Python-Dev] building 2.2.2 on SCO Open Server
In-Reply-To: <Pine.UW2.4.44.0209251255110.17925-100000@ou8.int.multitalents.net>
References: <Pine.UW2.4.44.0209251255110.17925-100000@ou8.int.multitalents.net>
Message-ID: <m3znu5vn2r.fsf@mira.informatik.hu-berlin.de>

Tim Rice <tim@multitalents.net> writes:

> I'm trying to get the release22-maint branch to build on
> SCO Open Server 5. When setup.py fails to import an extension but
> the .c file compiles, how do you track down why it failed?

You invoke the compilation commands manually (as printed), then start
an interactive session and try to import the module.
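
The interactive step can also be scripted.  What follows is a minimal
sketch, assuming a current Python with importlib and assuming the
directory holding the freshly built module is already on sys.path; the
helper name diagnose_import is made up for the example.

```python
import importlib
import traceback


def diagnose_import(name):
    """Try to import `name`; return None on success, else the error text."""
    try:
        importlib.import_module(name)
    except ImportError:
        # For a freshly built extension this is typically where a missing
        # symbol or an unloadable shared library gets reported.
        return traceback.format_exc()
    return None


report = diagnose_import("pwd")
print(report or "pwd imported fine")
```

The point is only that the ImportError text, which the build's
one-line warning does not show, names what went wrong.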

Regards,
Martin



From sholden@holdenweb.com  Wed Sep 25 21:22:57 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Wed, 25 Sep 2002 16:22:57 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
References: <000d01c263cd$bd76cc00$e000a8c0@thomasnotebook><3D918DE6.582.3CFA8D6@localhost><200209251532.g8PFW0O02531@odiug.zope.com> <yu99lm5qjczg.fsf@europa.research.att.com>
Message-ID: <01b801c264d1$586260e0$6300000a@holdenweb.com>

----- Original Message -----
From: "Andrew Koenig" <ark@research.att.com>
To: "Guido van Rossum" <guido@python.org>
Cc: <bkc@murkworks.com>; <python-dev@python.org>
Sent: Wednesday, September 25, 2002 11:38 AM
Subject: Re: Reserved keywords in source: was RE: [Python-Dev] Assign to
errno allowed?


> Guido> This compiler seems to fly in the face of the C std whenever it can.
> Guido> What are they trying to accomplish?
>
> To extend the language in ways that lock customers into their platform.
>

The infamous "hijack an open standard by adding proprietary extensions"
philosophy, first really publicised by the Halloween documents.

regards
-----------------------------------------------------------------------
Steve Holden                                  http://www.holdenweb.com/
Python Web Programming                 http://pydish.holdenweb.com/pwp/
Previous .sig file retired to                    www.homeforoldsigs.com
-----------------------------------------------------------------------




From tim@multitalents.net  Wed Sep 25 21:46:28 2002
From: tim@multitalents.net (Tim Rice)
Date: Wed, 25 Sep 2002 13:46:28 -0700 (PDT)
Subject: [Python-Dev] building 2.2.2 on SCO Open Server
In-Reply-To: <m3znu5vn2r.fsf@mira.informatik.hu-berlin.de>
Message-ID: <Pine.UW2.4.44.0209251343350.17925-100000@ou8.int.multitalents.net>

On 25 Sep 2002, Martin v. Loewis wrote:

> Tim Rice <tim@multitalents.net> writes:
>
> > I'm trying to get the release22-maint branch to build on
> > SCO Open Server 5. When setup.py fails to import an extension but
> > the .c file compiles, how do you track down why it failed?
>
> You invoke the compilation commands manually (as printed), then start
> an interactive session and try to import the module.
>
> Regards,
> Martin
>
Thanks.
I was sure I had tried that before. Oh well.
It does tell me what the problem is.
Now I have to track down why it can't find setpwent(). The man page says
it's in libc.
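
Martin's suggestion can also be scripted instead of typed into an
interactive session.  The following is only a minimal sketch: the helper
name try_import is made up for illustration, and it assumes a Python
recent enough to ship importlib.  It imports a module by name and hands
back the traceback text, which is where an unresolved symbol would be
reported.

```python
import importlib
import traceback


def try_import(module_name):
    """Return None if module_name imports cleanly, else the traceback text.

    try_import is a hypothetical helper, not part of setup.py or distutils.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        # A failed dynamic load (missing symbol, missing library) and a
        # missing module both surface as ImportError.
        return traceback.format_exc()
    return None


# Example: a module that cannot be found produces a readable report.
report = try_import("no_such_module_for_this_sketch")
```

Printing report for the extension that setup.py rejected should show the
loader's own error message.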

-- 
Tim Rice				Multitalents	(707) 887-1469
tim@multitalents.net




From pinard@iro.umontreal.ca  Wed Sep 25 22:07:13 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: Wed, 25 Sep 2002 17:07:13 -0400
Subject: [Python-Dev] sorted()
Message-ID: <oqsmzxaida.fsf@carouge.sram.qc.ca>

Hi, Guido, and people.

It recurrently happens that newcomers on the Python mailing list are
surprised that list.sort() does not return the sorted list as its value.
I quite understand and agree that this is a good thing, because sorting is
done in place, and Python programmers should stay aware of this fact.
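
A minimal illustration of the surprise described above (the variable
names are made up for the example):

```python
# list.sort() reorders the list it is called on and returns None.
names = ["pear", "apple", "fig"]
result = names.sort()

# 'result' holds None, while 'names' itself is now in sorted order.
in_place_value = result
sorted_names = names
```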

Yet I often find myself writing things like:

    keys = messages.keys()
    keys.sort()
    for key in keys:
        DO_SOMETHING

This is not difficult to write, only slightly annoying.  Writing:

    def sorted(list):
        list = list[:]
        list.sort()
        return list

with the goal of simplifying the first excerpt into:

    for key in sorted(messages.keys()):
        DO_SOMETHING

is not really worth it for small programs.  But in larger programs, where one
often loops over the sorted elements of a list, it might become reasonable to
write this extra definition.  My feeling is that the idiom is common enough to
be worth a list method, so the above could be written instead:

    for key in messages.keys().sorted():
        DO_SOMETHING

I immediately see an advantage and a drawback.  The drawback is that
users might confuse `.sort()' with `.sorted()', however we decide to spell
`sorted', so the existence of both may be some kind of trap.  The advantage is
that the `.sorted()' method fits well within how Python has evolved recently,
offering more concise and legible writings for frequent idioms.

Tim invested a lot of courageous effort in making Python's `sort' speedier.  A
`.sorted()' method requires separate space to hold the result, using the same
size as the original, and that guaranteed extra-space may eventually be put to
good use for speeding up the sorting even more.  The constraint of a sort
being in-place has indeed a cost, and deep down, we agree that this constraint
is artificial in contexts where `.sorted()' is really what the user needs.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From nas@python.ca  Wed Sep 25 22:51:25 2002
From: nas@python.ca (Neil Schemenauer)
Date: Wed, 25 Sep 2002 14:51:25 -0700
Subject: [Python-Dev] No __mod__ on str?
Message-ID: <20020925215125.GA14922@glacier.arctrix.com>

Here's the code for PyNumber_Remainder:

    PyObject *
    PyNumber_Remainder(PyObject *v, PyObject *w)
    {
        if (PyString_Check(v))
                return PyString_Format(v, w);
    #ifdef Py_USING_UNICODE
        else if (PyUnicode_Check(v))
                return PyUnicode_Format(v, w);
    #endif
        return binary_op(v, w, NB_SLOT(nb_remainder), "%");
    }

Is there any good reason why str.__mod__ != PyString_Format?  I want to
make a subclass of str that overrides the format operator.  I guess one
side effect would be that PyNumber_Check(astring) would start returning
true.

Should I file a bug saying "can't override __mod__ on str and unicode
subclasses"?  I guess the fix would be to check for nb_remainder first
and then fall back to PyString_Format or PyUnicode_Format.
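
For illustration, this is the behaviour being asked for, written as a
sketch.  It assumes an interpreter in which the % operator on a str
subclass is dispatched through __mod__ (current Python 3 releases do
this); the class name ShoutingStr is invented for the example.

```python
class ShoutingStr(str):
    """Hypothetical str subclass that overrides the format operator."""

    def __mod__(self, args):
        # Delegate to the normal string formatting, then post-process.
        return str.__mod__(self, args).upper()


formatted = ShoutingStr("hello %s") % "world"
```

With the code path quoted above, the override would never be reached,
because PyNumber_Remainder calls PyString_Format before looking at
nb_remainder.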

    Neil


From guido@python.org  Thu Sep 26 00:46:05 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 25 Sep 2002 19:46:05 -0400
Subject: [Python-Dev] No __mod__ on str?
In-Reply-To: Your message of "Wed, 25 Sep 2002 14:51:25 PDT."
 <20020925215125.GA14922@glacier.arctrix.com>
References: <20020925215125.GA14922@glacier.arctrix.com>
Message-ID: <200209252346.g8PNk5q29231@pcp02138704pcs.reston01.va.comcast.net>

> Here's the code for PyNumber_Remainder:
> 
>     PyObject *
>     PyNumber_Remainder(PyObject *v, PyObject *w)
>     {
>         if (PyString_Check(v))
>                 return PyString_Format(v, w);
>     #ifdef Py_USING_UNICODE
>         else if (PyUnicode_Check(v))
>                 return PyUnicode_Format(v, w);
>     #endif
>         return binary_op(v, w, NB_SLOT(nb_remainder), "%");
>     }
> 
> Is there any good reason why str.__mod__ != PyString_Format?  I want to
> make a subclass of str that overrides the format operator.  I guess one
> side effect would be that PyNumber_Check(astring) would start returning
> true.

Good catch.  I think this is a relic from before str and unicode were
subclassable.

> Should I file a bug saying "can't override __mod__ on str and unicode
> subclasses"?  I guess the fix would be to check for nb_remainder first
> and then fall back to PyString_Format or PyUnicode_Format.

Yes please.  If you can provide a fix, make it a patch.  Anyway assign
it to me.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Thu Sep 26 01:44:15 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Sep 2002 12:44:15 +1200 (NZST)
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
In-Reply-To: <3D919F9D.20505.414DCD1@localhost>
Message-ID: <200209260044.g8Q0iF306159@oma.cosc.canterbury.ac.nz>

Brad Clements <bkc@murkworks.com>:

> I'm not sure, but I think my latest version of Metrowerks has the
> same issue.

Doesn't Metrowerks have a "strict ANSI" switch that turns
off all the language extensions?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Thu Sep 26 01:56:47 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Sep 2002 12:56:47 +1200 (NZST)
Subject: [Python-Dev] sorted()
In-Reply-To: <oqsmzxaida.fsf@carouge.sram.qc.ca>
Message-ID: <200209260056.g8Q0ulp06182@oma.cosc.canterbury.ac.nz>

pinard@iro.umontreal.ca (François Pinard):

> The advantage is that the `.sorted()' method fits well within how
> Python has evolved recently, offering more concise and legible
> writings for frequent idioms.

I prefer the idea of making sorted() a separate function,
because it can then be made to work on any sequence that
can be copied and has a sort() method.

To support specialised non-in-place sorting algorithms,
it could check whether its argument has a sorted()
method, and if not, fall back on the general implementation.

This seems more Pythonic to me.
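
A rough sketch of what such a function might look like.  The name
sorted_copy and the example class are assumptions made for illustration,
and copy.copy is just one way of spelling "can be copied".

```python
import copy


def sorted_copy(seq):
    """Return a sorted copy of seq, preferring seq's own sorted() method."""
    specialised = getattr(seq, "sorted", None)
    if callable(specialised):
        # The object supplies its own non-in-place sort.
        return specialised()
    # General implementation: copy, then sort the copy in place.
    duplicate = copy.copy(seq)
    duplicate.sort()
    return duplicate


class Descending:
    """Hypothetical container with a specialised non-in-place sort."""

    def __init__(self, items):
        self.items = list(items)

    def sorted(self):
        ordered = list(self.items)
        ordered.sort(reverse=True)
        return Descending(ordered)


plain = [3, 1, 2]
plain_sorted = sorted_copy(plain)
special_sorted = sorted_copy(Descending([1, 3, 2]))
```

Plain lists take the fallback path, while Descending is asked to sort
itself.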

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Thu Sep 26 01:59:48 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Sep 2002 12:59:48 +1200 (NZST)
Subject: [Python-Dev] Assign to errno allowed?
In-Reply-To: <200209251632.g8PGWT811833@odiug.zope.com>
Message-ID: <200209260059.g8Q0xmL06185@oma.cosc.canterbury.ac.nz>

> #ifdef ...WINCE...
            ^^^^^

I wonder if Microsoft foresaw that abbreviation when
they chose the name CE...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tdelaney@avaya.com  Thu Sep 26 02:13:27 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Thu, 26 Sep 2002 11:13:27 +1000
Subject: [Python-Dev] sorted()
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A5DC@natasha.auslabs.avaya.com>

> From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz]
> 
> pinard@iro.umontreal.ca (François Pinard):
> 
> > The advantage is that the `.sorted()' method fits well within how
> > Python has evolved recently, offering more concise and legible
> > writings for frequent idioms.
> 
> To support specialised non-in-place sorting algorithms,
> it could check whether its argument has a sorted()
> method, and if not, fall back on the general implementation.

Hmm - this actually suggests a couple more magic methods:

__sort__
__isort__

corresponding to "sort a copy" and "sort in-place".

Defining the rules for how these would be called requires a bit more thought
however. Do you want a sort() function to prefer __sort__ or __isort__?

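def sort (seq, in_place=1):

    if in_place:
        # Raises AttributeError if seq cannot sort itself in place.
        return seq.__isort__()

    try:
        return seq.__sort__()
    except AttributeError:
        # No copy-sort hook; fall through to the generic list-based sort.
        pass

    seq = list(seq)
    seq.sort()
    return seq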

So - if an in-place sort is specified, try to do one, throwing an exception
if it's not possible. Otherwise sort a copy.

This would allow a generic mechanism for objects to sort copies of
themselves, rather than blindly changing them to a list.

Would two methods be better for in-place and copy sort?

Tim Delaney


From david.abrahams@rcn.com  Thu Sep 26 03:43:48 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Wed, 25 Sep 2002 22:43:48 -0400
Subject: Reserved keywords in source: was RE: [Python-Dev] Assign to errno allowed?
References: <200209260044.g8Q0iF306159@oma.cosc.canterbury.ac.nz>
Message-ID: <124601c26506$91150c00$6701a8c0@boostconsulting.com>

From: "Greg Ewing" <greg@cosc.canterbury.ac.nz>


> Brad Clements <bkc@murkworks.com>:
> 
> > I'm not sure, but I think my latest version of Metrowerks has the
> > same issue.
> 
> Doesn't Metrowerks have a "strict ANSI" switch that turns
> off all the language extensions?

Yes.

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com



From tim@multitalents.net  Thu Sep 26 16:52:01 2002
From: tim@multitalents.net (Tim Rice)
Date: Thu, 26 Sep 2002 08:52:01 -0700 (PDT)
Subject: [Python-Dev] Lib/plat-xxxx directories
Message-ID: <Pine.UW2.4.44.0209260830180.8859-100000@ou8.int.multitalents.net>

Can someone enlighten me as to the purpose of the Lib/plat-xxxx
directories?  What goes in there?  Why?

I'm trying to figure out if I should be creating one for SCO Open Server.

BTW. Someone with CVS write access should probably
	cd Lib
	ln -s plat-unixware7 plat-openunix8
	cvs add plat-openunix8

OpenUNIX 8.0.0 is really UnixWare 7.1.2 underneath.


-- 
Tim Rice				Multitalents	(707) 887-1469
tim@multitalents.net




From guido@python.org  Thu Sep 26 17:05:30 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 26 Sep 2002 12:05:30 -0400
Subject: [Python-Dev] Lib/plat-xxxx directories
In-Reply-To: Your message of "Thu, 26 Sep 2002 08:52:01 PDT."
 <Pine.UW2.4.44.0209260830180.8859-100000@ou8.int.multitalents.net>
References: <Pine.UW2.4.44.0209260830180.8859-100000@ou8.int.multitalents.net>
Message-ID: <200209261605.g8QG5UP29963@odiug.zope.com>

> Can someone enlighten me as to the purpose of the Lib/plat-xxxx
> directories. What goes in there? Why?

They're for platform-specific modules.  In most cases, the only
platform-specific modules are collections of system constants
generated by Tools/scripts/h2py.py.  For an example, see the regen
script in plat-linux2.  (It assumes you've set up an alias "h2py" for
the script.)  I wouldn't bother unless you have an actual use in mind.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim@multitalents.net  Thu Sep 26 17:09:07 2002
From: tim@multitalents.net (Tim Rice)
Date: Thu, 26 Sep 2002 09:09:07 -0700 (PDT)
Subject: [Python-Dev] Lib/plat-xxxx directories
In-Reply-To: <200209261605.g8QG5UP29963@odiug.zope.com>
Message-ID: <Pine.UW2.4.44.0209260904540.8964-100000@ou8.int.multitalents.net>

On Thu, 26 Sep 2002, Guido van Rossum wrote:

> > Can someone enlighten me as to the purpose of the Lib/plat-xxxx
> > directories. What goes in there? Why?
>
> They're for platform-specific modules.  In most cases, the only
> platform-specific modules are collections of system constants
> generated by Tools/scripts/h2py.py.  For an example, see the regen
> script in plat-linux2.  (It assumes you've set up an alias "h2py" for
> the script.)  I wouldn't bother unless you have an actual use in mind.

OK, Thanks.

I'll bundle up my patches and post them to the patch manager.

>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

-- 
Tim Rice				Multitalents	(707) 887-1469
tim@multitalents.net




From guido@python.org  Thu Sep 26 20:13:15 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 26 Sep 2002 15:13:15 -0400
Subject: [Python-Dev] How to add an encoding alias?
Message-ID: <200209261913.g8QJDFe01575@odiug.zope.com>

In the spambayes project we encountered some mail samples that use an
encoding name ('ansi-x3-4-1968') that's not in encodings/aliases.py.
(At least not until I added it to CVS yesterday.)

I'd like the spambayes code base to be compatible with Python 2.2.1,
so I'd like to add this one to the list of aliases.

Is there an official API to add an alias, or do I just have to write

  import encodings.aliases
  encodings.aliases.aliases['ansi-x3-4-1968'] = 'ascii'

???

(BTW, there's an alias 'ansi_x3.4_1986' for ASCII.  Was the ASCII
standard renewed in 1986, or is that simply because there are encoding
designators out there in real life that contain a typo?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Thu Sep 26 20:41:59 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 26 Sep 2002 21:41:59 +0200
Subject: [Python-Dev] How to add an encoding alias?
References: <200209261913.g8QJDFe01575@odiug.zope.com>
Message-ID: <3D936307.1040709@lemburg.com>

Guido van Rossum wrote:
> In the spambayes project we encountered some mail samples that use an
> encoding name ('ansi-x3-4-1968') that's not in encodings/aliases.py.
> (At least not until I added it to CVS yesterday.)
> 
> I'd like the spambayes code base to be compatible with Python 2.2.1,
> so I'd like to add this one to the list of aliases.
> 
> Is there an official API to add an alias, or do I just have to write
> 
>   import encodings.aliases
>   encodings.aliases.aliases['ansi-x3-4-1968'] = 'ascii'
> 
> ???

There's no other API to do this and since new features are
not allowed in 2.2.x that's the only way to go unless you register
your own lookup function which knows about the extra alias.

> (BTW, there's an alias 'ansi_x3.4_1986' for ASCII.  Was the ASCII
> standard renewed in 1986, or is that simply because there are encoding
> designators out there in real life that contain a typo?)

That was one of the official names for ASCII:

http://www.archivists.org/catalog/stds99/chapter7.html#x3_4

More details on the history of ASCII can be found at the
top of that page. The original version X3.4 was approved
in 1968, so it's not a typo.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From mal@lemburg.com  Thu Sep 26 20:43:37 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 26 Sep 2002 21:43:37 +0200
Subject: [Python-Dev] How to add an encoding alias?
References: <200209261913.g8QJDFe01575@odiug.zope.com>
Message-ID: <3D936369.3000908@lemburg.com>

Guido van Rossum wrote:
>   import encodings.aliases
>   encodings.aliases.aliases['ansi-x3-4-1968'] = 'ascii'

In order for the lookup to work, you have to replace hyphens
with underscores; see the top of aliases.py.
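
Spelled out, the corrected registration would look something like the
sketch below.  It relies only on the encodings.aliases.aliases dictionary
already mentioned in this thread and on codecs.lookup; recent Python
versions may well ship this alias already, in which case the assignment
changes nothing.

```python
import codecs
import encodings.aliases

# Keys in the alias table are written with underscores, never hyphens,
# because the lookup normalises the requested name before consulting it.
encodings.aliases.aliases['ansi_x3_4_1968'] = 'ascii'

# The hyphenated spelling found in the mail samples now resolves.
codec_name = codecs.lookup('ansi-x3-4-1968').name
```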

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From guido@python.org  Thu Sep 26 21:00:15 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 26 Sep 2002 16:00:15 -0400
Subject: [Python-Dev] How to add an encoding alias?
In-Reply-To: Your message of "Thu, 26 Sep 2002 21:41:59 +0200."
 <3D936307.1040709@lemburg.com>
References: <200209261913.g8QJDFe01575@odiug.zope.com>
 <3D936307.1040709@lemburg.com>
Message-ID: <200209262000.g8QK0Fl01925@odiug.zope.com>

> > I'd like the spambayes code base to be compatible with Python 2.2.1,
> > so I'd like to add this one to the list of aliases.
> > 
> > Is there an official API to add an alias, or do I just have to write
> > 
> >   import encodings.aliases
> >   encodings.aliases.aliases['ansi-x3-4-1968'] = 'ascii'
> > 
> > ???
> 
> There's no other API to do this and since new features are
> not allowed in 2.2.x that's the only way to go unless you register
> your own lookup function which knows about the extra alias.

Thanks, I'll do that.

> > (BTW, there's an alias 'ansi_x3.4_1986' for ASCII.  Was the ASCII
> > standard renewed in 1986, or is that simply because there are encoding
> > designators out there in real life that contain a typo?)
> 
> That was one of the official names for ASCII:
> 
> http://www.archivists.org/catalog/stds99/chapter7.html#x3_4
> 
> More details on the history of ASCII can be found at the
> top of that page. The original version X3.4 was approved
> in 1968, so it's not a typo.

Wow.  Cute.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Sep 26 21:03:05 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 26 Sep 2002 16:03:05 -0400
Subject: [Python-Dev] How to add an encoding alias?
In-Reply-To: Your message of "Thu, 26 Sep 2002 21:43:37 +0200."
 <3D936369.3000908@lemburg.com>
References: <200209261913.g8QJDFe01575@odiug.zope.com>
 <3D936369.3000908@lemburg.com>
Message-ID: <200209262003.g8QK36P01952@odiug.zope.com>

> Guido van Rossum wrote:
> >   import encodings.aliases
> >   encodings.aliases.aliases['ansi-x3-4-1968'] = 'ascii'
> 
> In order for the lookup to work, you have to replace hyphens
> with underscores; see the top of aliases.py.

Good catch!  Then my "fix" to aliases.py was also wrong.

Would it make sense to change the lookup function to convert *all*
punctuation to underscores before doing the lookup?  (Then this one
would actually have worked...)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Thu Sep 26 21:14:28 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 26 Sep 2002 22:14:28 +0200
Subject: [Python-Dev] How to add an encoding alias?
References: <200209261913.g8QJDFe01575@odiug.zope.com>              <3D936369.3000908@lemburg.com> <200209262003.g8QK36P01952@odiug.zope.com>
Message-ID: <3D936AA4.1080302@lemburg.com>

Guido van Rossum wrote:
>>Guido van Rossum wrote:
>>
>>>  import encodings.aliases
>>>  encodings.aliases.aliases['ansi-x3-4-1968'] = 'ascii'
>>
>>In order for the lookup to work, you have to replace hyphens
>>with underscores; see the top of aliases.py.
> 
> 
> Good catch!  Then my "fix" to aliases.py was also wrong.
> 
> Would it make sense to change the lookup function to convert *all*
> punctuation to underscores before doing the lookup?  (Then this one
> would actually have worked...)

Codecs must currently use names as defined by the search function in the
encodings package:

     Codec modules must have names corresponding to standard lower-case
     encoding names with hyphens mapped to underscores, e.g. 'utf-8' is
     implemented by the module 'utf_8.py'.

We could extend this to:

     Codec modules must have names corresponding to standard lower-case
     encoding names with all non-alphanumeric characters mapped to
     underscores, e.g. 'utf-8' is implemented by the module 'utf_8.py'
     and 'ISO 639:1988' would be implemented as module 'iso_639_1988'.

Note that the aliasing dictionary is consulted *after*
having applied this mapping.
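
As a sketch of the extended rule (not the actual code in the encodings
package): the function names and the one-entry ALIASES table below are
invented for illustration.

```python
# Hypothetical alias table; keys are already in normalised form.
ALIASES = {"ansi_x3_4_1968": "ascii"}


def normalize_encoding_name(name):
    """Lower-case name and map every non-alphanumeric character to '_'."""
    return "".join([c if c.isalnum() else "_" for c in name.lower()])


def resolve(name):
    """Normalise first, then consult the alias table."""
    normalized = normalize_encoding_name(name)
    return ALIASES.get(normalized, normalized)


utf8_module = normalize_encoding_name("utf-8")
iso_module = normalize_encoding_name("ISO 639:1988")
ascii_module = resolve("ANSI-X3-4-1968")
```

Because the table is consulted after the mapping, its keys must never
contain hyphens, dots or spaces.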

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From guido@python.org  Thu Sep 26 21:27:47 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 26 Sep 2002 16:27:47 -0400
Subject: [Python-Dev] How to add an encoding alias?
In-Reply-To: Your message of "Thu, 26 Sep 2002 22:14:28 +0200."
 <3D936AA4.1080302@lemburg.com>
References: <200209261913.g8QJDFe01575@odiug.zope.com> <3D936369.3000908@lemburg.com> <200209262003.g8QK36P01952@odiug.zope.com>
 <3D936AA4.1080302@lemburg.com>
Message-ID: <200209262027.g8QKRlO02176@odiug.zope.com>

> > Would it make sense to change the lookup function to convert *all*
> > punctuation to underscores before doing the lookup?  (Then this one
> > would actually have worked...)
> 
> Codecs must currently use names as defined by the search function in the
> encodings package:
> 
>      Codec modules must have names corresponding to standard lower-case
>      encoding names with hyphens mapped to underscores, e.g. 'utf-8' is
>      implemented by the module 'utf_8.py'.
> 
> We could extend this to:
> 
>      Codec modules must have names corresponding to standard lower-case
>      encoding names with all non-alphanumeric characters mapped to
>      underscores, e.g. 'utf-8' is implemented by the module 'utf_8.py'
>      and 'ISO 639:1988' would be implemented as module 'iso_639_1988'.
> 
> Note that the aliasing dictionary is consulted *after*
> having applied this mapping.

+1; +1 on backport to 2.2.2 also.

Note that this requires some changes to the dict in aliases.py.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jack_diederich@email.com  Fri Sep 27 02:41:45 2002
From: jack_diederich@email.com (Jack Diederich)
Date: Thu, 26 Sep 2002 20:41:45 -0500
Subject: [Python-Dev] Sets/Combinitorics, pointer to an implementation
Message-ID: <20020927014145.28754.qmail@email.com>

I was reading the summary of python-dev for August and saw the exchange
about Combinatorics.  I emailed a few people in the discussion (Guido, et
al.) who suggested I post to python-dev directly instead.

http://probstat.sourceforge.net  "Probability and Statistics Utils for Python"

This is running code I released in early August that does Combinations,
Permutations, and Cartesian Products over Python lists.  It has Python
object wrappers for fast C algorithms.  It supports iterating over objects,
slices, len(), and random access.  It was 10+ times faster than the same
algos done in Python, but I wrote those without generators and I haven't
done a benchmark since.  It uses standard algos, largely from the GNU
Scientific Library.  It lazily produces the cycles in lexicographic order,
so the memory it consumes is about twice the size of a shallow copy of the
list.  A slice of a Permutation object is another Permutation object with
smaller internal start/end bounds.  I could write more about the
implementation, but I'll save my keystrokes unless people actually want to
know.

A Power class that lazily evaluates the output would be easy to add, here is one:
from __future__ import generators
from probstat import Combination

class Power:
    def __init__(self, set):
        self.__set = set
    def __iter__(self):
        # The iterator protocol requires __iter__ to return an iterator.
        self.__iter = self.setup_iter()
        return self
    def setup_iter(self):
        for size in range(len(self.__set)+1):
            if size == 0: # Combination() doesn't allow N choose zero
                yield []
            else:
                for combo in Combination(self.__set, size):
                    yield combo
    def next(self):
        return self.__iter.next()
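
For readers without probstat installed, the same lazy power set can be
sketched with the standard library.  This assumes a Python that provides
itertools.combinations, which stands in here for probstat's Combination;
the function name power_set is made up.

```python
from itertools import combinations


def power_set(items):
    """Lazily yield every subset of items, smallest subsets first."""
    items = list(items)
    for size in range(len(items) + 1):
        # combinations() accepts size 0 and yields one empty tuple,
        # so no special case is needed for the empty subset.
        for combo in combinations(items, size):
            yield list(combo)


subsets = list(power_set([1, 2, 3]))
```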
 
Enjoy,
 
-jack

Errata:
I tried using the Cartesian class to mimic nested for() loops.  It is 3
times slower than doing depth 3 nested for loops (i,j,k) in Python.  That's
probably the overhead of the Cartesian class new'ing a tuple and unpacking
it for each iteration of the loop.




From mal@lemburg.com  Fri Sep 27 10:25:15 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 27 Sep 2002 11:25:15 +0200
Subject: [Python-Dev] User extendable literal modifiers ?!
Message-ID: <3D9423FB.9070303@lemburg.com>

As you might have noticed, I have wrapped several parts of the
GNU Multi-Precision (GMP) library in the form of Python types
in mxNumber.

Since these are numbers, it would be convenient if there were
some way to create them in the form of literals, much like 123L
creates longs instead of integers or u"abc" gives you Unicode
instead of an 8-bit string.

I was wondering whether it would be worth adding something
like a registry of literal modifiers to Python, so that
extensions can register new modifiers with the compiler,
e.g.

sitecustomize.py:
def create_I_literal(literal_string):
     return 'mx.Number.Integer(%s)' % literal_string
sys.register_numberlitmod('I', create_I_literal)

test.py:
x = 123I * 456I
print x, 234I
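
The idea can be emulated at run time with a small source rewriter.  This
is only a sketch: sys.register_numberlitmod does not exist, so a
module-level register_numberlitmod stands in for it, expand_literals is
an invented helper, and the built-in int plays the role of
mx.Number.Integer.

```python
import re

# suffix character -> function taking the digits and returning source text
_literal_modifiers = {}


def register_numberlitmod(suffix, rewrite):
    """Hypothetical stand-in for the proposed sys.register_numberlitmod."""
    _literal_modifiers[suffix] = rewrite


def expand_literals(source):
    """Rewrite tokens such as 123I using the registered modifiers.

    Deliberately naive: it would also rewrite matches inside string
    literals, which a real tokenizer hook would not do.
    """
    def replace(match):
        rewrite = _literal_modifiers.get(match.group(2))
        if rewrite is None:
            return match.group(0)  # unknown suffix: leave the text alone
        return rewrite(match.group(1))
    return re.sub(r"\b(\d+)([A-Za-z])\b", replace, source)


def create_I_literal(literal_string):
    return 'Integer(%s)' % literal_string


register_numberlitmod('I', create_I_literal)

expanded = expand_literals("123I * 456I")
# int stands in for mx.Number.Integer when evaluating the rewritten text.
value = eval(expanded, {"Integer": int})
```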

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From gerhard.haering@opus-gmbh.net  Fri Sep 27 11:33:11 2002
From: gerhard.haering@opus-gmbh.net (Gerhard =?iso-8859-1?Q?H=E4ring?=)
Date: Fri, 27 Sep 2002 10:33:11 +0000 (UTC)
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
References: <3D9423FB.9070303@lemburg.com>
Message-ID: <slrnap8cuj.20k.gerhard.haering@haering.opus-gmbh.net>

In article <3D9423FB.9070303@lemburg.com>, M.-A. Lemburg wrote:
> [mxNumber]
> I was wondering whether it would be worth adding something
> like a registry of literal modifiers to Python,

Especially for this purpose, that would be great. And have potential for
misuse, too. Just like, say, operator overloading. But in the context of
Python, I didn't see any misuse of operator overloading, yet.

> [...] so that
> extensions can register new modifiers with the compiler,
> e.g.
> 
> sitecustomize.py:
> def create_I_literal(literal_string):
>      return 'mx.Number.Integer(%s)' % literal_string
> sys.register_numberlitmod('I', create_I_literal)

A single literal, however, doesn't (easily) allow you to give precision and
scale arguments to your decimal literal. That's of course easy if you can
declare your variable, which you can't in Python. So we're back to
constructors/factory functions here, right?

-- Gerhard




From gmccaughan@synaptics-uk.com  Fri Sep 27 12:03:17 2002
From: gmccaughan@synaptics-uk.com (Gareth McCaughan)
Date: Fri, 27 Sep 2002 12:03:17 +0100 (BST)
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
Message-ID: <200209271104.MAA27895@synaptics-uk.com>

Marc-Andre Lemburg wrote:

> Since these are numbers, it would be convenient if there were
> some way to create them in form of literals, much like 123L
> creates longs instead of integers or u"abc" gives you Unicode
> instead of an 8-bit string.
> 
> I was wondering whether it would be worth adding something
> like a registry of literal modifiers to Python, so that
> extensions can register new modifiers with the compiler,
> e.g.
> 
> sitecustomize.py:
> def create_I_literal(literal_string):
>      return 'mx.Number.Integer(%s)' % literal_string
> sys.register_numberlitmod('I', create_I_literal)
> 
> test.py:
> x = 123I * 456I
> print x, 234I

Too limiting. You'd only be able to do this for numbers,
and it doesn't seem worth the pain just for numbers.
Better would be user-definable *prefixes*.

Common Lisp, for instance, makes it easy to customize
the reader to recognize tokens of the form <hash> <character> <anything>.
So you can arrange that #Q123,234,456:a(b)c turns into, erm,
something terribly useful :-). Some of these characters are
already taken for things like arrays [#(1 2 3), #2((1 2) (3 4))],
"logical pathnames" (lightly abstracted filenames) [#"foo/bar/baz"],
bit vectors [#*0001101011001], and so on. As perceptive readers
will have noticed, you can splice a number between "#" and
the magic character for special effects.

Python could do something similar, though obviously "#"
isn't a suitable character :-). Letting the user hijack
the reader as completely as can be done in CL would probably
be un-Pythonic, too. Here's a strawman suggestion.

    For any character "x" in some set I can't be bothered to
    specify, the Python tokenizer/parser will subject input
    of the form $x<string-literal> to special processing.
    The string-literal can be formed using any of {',",''',"""}.

    When I say "tokenizer/parser", I mean: the tokenizer will
    produce a special token encoding the character "x" and the
    contents of the string-literal. The parser will perform
    "special processing" in an attempt to turn it into a more
    normal token.

    The default "special processing" is to raise a SyntaxError.
    The user can define the special processing appropriate for
    a particular character "x" by making a function that
    interprets the string and feeding it to sys.register_dollar_handler.
    (In fact, anything callable will do.) The function will
    be passed two arguments: the character "x" and the string.
    Its return value will replace the $x"..." combination in
    the token stream, as a literal token.

    If an exception other than a SyntaxError is raised and
    not caught in the handler function then it will be silently
    replaced by a SyntaxError whose parameter has the form
    "ill-formed <xxx> literal". The value of "xxx" is defined
    when registering the handler.

    Handler functions are permitted to call "eval".

    Example:

        >>> def handle_rational(char, s):
        ...     assert char == 'r'
        ...     components = s.split('/')
        ...     numerator, denominator = map(int, components)
        ...     return Rational(numerator, denominator)
        ... 
        >>> sys.register_dollar_handler('r', handle_rational, 'rational')
        >>> print $r"1/2" + $r"3/4"
        $r"5/4"
        >>> print $r"12345"
          File "<string>", line 1
            print $r"12345"
                  ^
        SyntaxError: ill-formed rational literal
        >>>

    Alternatively:

        >>> class Rational:
        ...     def __init__(self, x, y):
        ...         if isinstance(x, str):
        ...             x,y = map(int, y.split("/"))
        ...         self._numerator, self._denominator = x,y
        ...     [etc]
        ...
        >>> sys.register_dollar_handler('r', Rational, 'rational')

    Some dollar-syntax characters may be handled by Python itself
    or the standard library, or may be reserved for their use.
    It is possible for users to override them, but this should
    be considered bad practice.

    Registering a handler when one is already in place will produce
    a warning. To un-register a handler, pass None instead of the
    handler function. 
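
The registration and exception rules above can be tried out without
touching the tokenizer.  In this sketch the registry lives in ordinary
module-level functions rather than in sys, process_dollar_literal is an
invented name for the parser's "special processing" step, and
fractions.Fraction stands in for the Rational class.

```python
import warnings
from fractions import Fraction

# character -> (handler, description used in error messages)
_dollar_handlers = {}


def register_dollar_handler(char, handler, description=None):
    """Hypothetical stand-in for sys.register_dollar_handler."""
    if handler is None:
        _dollar_handlers.pop(char, None)  # un-register
        return
    if char in _dollar_handlers:
        warnings.warn("replacing existing handler for $%s" % char)
    _dollar_handlers[char] = (handler, description)


def process_dollar_literal(char, text):
    """Turn $x"text" into a value, following the strawman's rules."""
    if char not in _dollar_handlers:
        # Default special processing: a SyntaxError.
        raise SyntaxError("no handler registered for $%s" % char)
    handler, description = _dollar_handlers[char]
    try:
        return handler(char, text)
    except SyntaxError:
        raise
    except Exception:
        # Any other exception is replaced by a SyntaxError.
        raise SyntaxError("ill-formed %s literal" % description)


def handle_rational(char, s):
    assert char == 'r'
    numerator, denominator = map(int, s.split('/'))
    return Fraction(numerator, denominator)


register_dollar_handler('r', handle_rational, 'rational')
total = process_dollar_literal('r', "1/2") + process_dollar_literal('r', "3/4")
```

Passing "12345" to the 'r' handler fails to unpack into two components,
and the resulting ValueError is reported as "ill-formed rational
literal".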

Possible applications:

  - Rational numbers.    $r"123/234"
  - Regular expressions. $/"foo.*bar"
  - Dates and times.     $t"2002-09-27 11:38"
  - Hostnames and ports. $h"www.google.com:80"

Questions:

  - Is this insane?
  - Is "$" the best character?
  - Should there be a way to return tokens other than literal ones?
    For instance, identifiers or keywords?
  - Is the behaviour with exceptions correct?

-- 
g




From mal@lemburg.com  Fri Sep 27 12:17:55 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 27 Sep 2002 13:17:55 +0200
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
References: <3D9423FB.9070303@lemburg.com> <slrnap8cuj.20k.gerhard.haering@haering.opus-gmbh.net>
Message-ID: <3D943E63.30104@lemburg.com>

Gerhard Häring wrote:
> In article <3D9423FB.9070303@lemburg.com>, M.-A. Lemburg wrote:
>
>>[mxNumber]
>>I was wondering whether it would be worth adding something
>>like a registry of literal modifiers to Python,
>
>
> Especially for this purpose, that would be great. And have potential for
> misuse, too. Just like, say, operator overloading. But in the context of
> Python, I didn't see any misuse of operator overloading, yet.

I was thinking of giving the current concept of literal
modifiers a more general scope. Of course, this can be
misused, but then we could e.g. put certain constraints
on the possible modifiers, say only allow a predefined
number of modifiers and then have the compiler at
compile time or the interpreter at run-time apply the
necessary logic to the literal to turn it into an
object.

We currently have 'u', 'r', 'U', 'R' as modifiers for strings
(prefixes) and 'l', 'L', 'j', 'J' for numbers (postfix).

>>[...] so that
>>extensions can register new modifiers with the compiler,
>>e.g.
>>
>>sitecustomize.py:
>>def create_I_literal(literal_string):
>>     return 'mx.Number.Integer(%s)' % literal_string
>>sys.register_numberlitmod('I', create_I_literal)
>
>
> A single literal, however, doesn't (easily) allow you to give precision and
> scale arguments to your decimal literal. That's of course easy if you can
> declare your variable, which you can't in Python. So we're back to
> constructors/factory functions here, right?

Not really, since mxNumber Integers have arbitrary
precision, so scale and precision are not needed.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From aleax@aleax.it  Fri Sep 27 12:53:28 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 27 Sep 2002 13:53:28 +0200
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
In-Reply-To: <200209271104.MAA27895@synaptics-uk.com>
References: <200209271104.MAA27895@synaptics-uk.com>
Message-ID: <E17utgt-00071r-00@mail.python.org>

On Friday 27 September 2002 01:03 pm, Gareth McCaughan wrote:
	...
> Better would be user-definable *prefixes*.

Yes -- nice idea.

>     Its return value will replace the $x"..." combination in
>     the token stream, as a literal token.

Why just one token, and why just a literal?  Returning an
arbitrary sequence of tokens seems more natural.  This
would allow e.g. Tim Berners-Lee to have basically what
he wants (and asked for in his talk at IPC10) in terms of
extended syntax for graphs, just with some $x in front.

I had a similar idea right after Tim's talk, but could not
articulate it clearly enough in a chat with Guido right
afterwards, and later I didn't follow through with it.  It
seems to me that your proposal is detailed and precise
enough (while my idea was rather vague) and that, by
returning an arbitrary sequence of tokens, it will let
Tim embed whatever funky syntax he requires.

This power is also the downside of the whole idea of
course -- no guarantee that somebody can't use this
mechanism to produce highly obfuscated programs.
But I think that such a somebody could already
obfuscate quite effectively in other ways, and the
risk of abuse shouldn't stop this interesting proposal.

>         ...     return Rational(numerator, denominator)

Hmmm, how would this "return a literal token"?  It returns
an instance of Rational -- how does the parser treat this
instance as a literal token?

I thought this use would have to return the sequence of
tokens for identifier 'Rational', open parenthesis, literal
(value of) numerator, comma, literal (value of) denominator,
closed parenthesis -- which in turn is why I thought of an
arbitrary sequence of tokens.  If a single instance of any
arbitrary class may be returned and get treated as a
literal token by the parser, then that's much better (maybe
I don't know Python's parser well enough, but I don't
clearly see how that would be done).

>   - Is this insane?

Hope not, since I like it.

>   - Is "$" the best character?

Among the few available ones, I think I slightly prefer "@"
for this use, but there's little to choose IMHO.


Alex


From pedronis@bluewin.ch  Fri Sep 27 13:29:49 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Fri, 27 Sep 2002 14:29:49 +0200
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
References: <200209271104.MAA27895@synaptics-uk.com> <E17utgt-00071r-00@mail.python.org>
Message-ID: <007f01c26621$92d3c060$6d94fea9@newmexico>

From: Alex Martelli <aleax@aleax.it>
> I thought this use would have to return the sequence of
> tokens for identifier 'Rational', open parenthesis, literal
> (value of) numerator, comma, literal (value of) denominator,
> closed parenthesis -- which in turn is why I thought of an
> arbitrary sequence of tokens.  If a single instance of any
> arbitrary class may be returned and get treated as a
> literal token by the parser, then that's much better

indeed, because then otherwise

$r"123/234" = literal transformation => Rational(123,234)

would require Rational to be installed in the builtins, or some kind of
implicit import (ugly) or people would have to remember to put an explicit from
... import Rational in all modules that use $r, one import per program just to
register $r would not be enough.
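
A small sketch of the problem described above, in present-day Python 3.
It assumes the purely textual expansion yields the source text
"Rational(123, 234)", and uses two plain dictionaries (module_a and
module_b, both hypothetical) to stand in for the namespaces of two
modules; the Rational class is a bare placeholder.

```python
# What a purely lexical transformation of $r"123/234" would leave in the source.
expanded = "Rational(123, 234)"

class Rational:
    """Placeholder class for illustration only."""
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

# Module A imported Rational, so the expanded text evaluates there.
module_a = {"Rational": Rational}
value = eval(expanded, module_a)

# Module B uses $r but never imported Rational: the same text fails,
# even though $r was registered once for the whole program.
module_b = {}
try:
    eval(expanded, module_b)
    lookup_failed = False
except NameError:
    lookup_failed = True
```

Returning an already-constructed object from the handler avoids the name
lookup in the using module altogether.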

regards



From gmccaughan@synaptics-uk.com  Fri Sep 27 13:58:42 2002
From: gmccaughan@synaptics-uk.com (Gareth McCaughan)
Date: Fri, 27 Sep 2002 13:58:42 +0100 (BST)
Subject: [Python-Dev] Re[3]: User extendable literal modifiers ?!
In-Reply-To: <200209271152.MAA27946@synaptics-uk.com>
References: <200209271104.MAA27895@synaptics-uk.com>
 <200209271152.MAA27946@synaptics-uk.com>
Message-ID: <200209271259.NAA28049@synaptics-uk.com>

> >     Its return value will replace the $x"..." combination in
> >     the token stream, as a literal token.
> 
> Why just one token, and why just literal.  Returning an
> arbitrary sequence of tokens seems more natural.  This
> would allow e.g. Tim Berners-Lee to have basically what
> he wants (and asked for in his talk at IPC10) in terms of
> extended syntax for graphs, just with some $x in front.

1. I wasn't sure how easy it would be to return an
arbitrary sequence of tokens.

2. I wasn't sure how appropriate it was to make users
understand the internals of the parser in that way.
Transforming a magic token into a literal Python object
is easy to understand. Transforming it into an arbitrary
sequence of tokens is more powerful but harder to
understand. (And harder to claim as analogous with
u"...", 123L, etc., though I'm not sure that matters.)

> I had a similar idea right after Tim's talk, but could not
> articulate it clearly enough in a chat with Guido right
> afterwards, and later I didn't follow through with it.  It
> seems to me that your proposal is detailed and precise
> enough (while my idea was rather vague) and that, by
> returning an arbitrary sequence of tokens, it will let
> Tim embed whatever funky syntax it requires.

If we want to be able to generate arbitrary sequences
of tokens, I think I'd prefer a more flexible input
syntax.

> This power is also the downside of the whole idea of
> course -- no guarantee that somebody can't use this
> mechanism to produce highly obfuscated programs.
> But I think that such a somebody could already
> obfuscate quite effectively in other ways, and the
> risk of abuse shouldn't stop this interesting proposal.

I am inclined to agree.

> >         ...     return Rational(numerator, denominator)
> 
> Hmmm, how would this "return a literal token"?  It returns
> an instance of Rational -- how does the parser treat this
> instance as a literal token?
> 
> I thought this use would have to return the sequence of
> tokens for identifier 'Rational', open parenthesis, literal
> (value of) numerator, comma, literal (value of) denominator,
> closed parenthesis -- which in turn is why I thought of an
> arbitrary sequence of tokens.  If a single instance of any
> arbitrary class may be returned and get treated as a
> literal token by the parser, then that's much better (maybe
> I don't know Python's parser well enough, but I don't
> clearly see how that would be done).

I don't know Python's parser well enough either :-).
However: it can accept NUMBER and STRING tokens.
As far as the grammar is concerned, they are exactly
the same (except that multiple STRING tokens are
implicitly concatenated). As far as everything else
is concerned, they are very nearly exactly the same.
We could have a LITERAL token, treated in the same
sort of way as NUMBER and STRING. That was what I was
intending; certainly not returning the token-sequence
<Rational>, <(>, <numerator>, <,>, <denominator>, <)> !

> >   - Is this insane?
> 
> Hope not, since I like it.

Hmm. The other proposal I know you and I both like is
the adaptation protocol. This is not necessarily a good
omen. :-)

> >   - Is "$" the best character?
> 
> Among the few available ones, I think I slightly prefer "@"
> for this use, but there's little to choose IMHO.

Curiously, "@" was the first option I thought of for this.
I didn't have any very concrete reason for switching to "$".

-- 
g




From fredrik@pythonware.com  Fri Sep 27 14:21:45 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 27 Sep 2002 15:21:45 +0200
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
References: <200209271104.MAA27895@synaptics-uk.com> <E17utgt-00071r-00@mail.python.org>
Message-ID: <03fd01c26628$d4d6ff20$0900a8c0@spiff>

alex wrote:

> If a single instance of any arbitrary class may be returned and
> get treated as a literal token by the parser, then that's much
> better

how do you marshal the resulting byte code?

</F>



From jepler@unpythonic.net  Fri Sep 27 14:43:38 2002
From: jepler@unpythonic.net (Jeff Epler)
Date: Fri, 27 Sep 2002 08:43:38 -0500
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
In-Reply-To: <200209271104.MAA27895@synaptics-uk.com>
References: <200209271104.MAA27895@synaptics-uk.com>
Message-ID: <20020927134328.GA5941@unpythonic.net>

On Fri, Sep 27, 2002 at 12:03:17PM +0100, Gareth McCaughan wrote:
> Possible applications:
> 
>   - Rational numbers.    $r"123/234"
>   - Regular expressions. $/"foo.*bar"
>   - Dates and times.     $t"2002-09-27 11:38"
>   - Hostnames and ports. $h"www.google.com:80"

Of course, if you have no shame, each of these but $/ can be written
with today's syntax in no more characters, placing the type identifier
first and then an arbitrary, existing operator second:
    r+"123/234"
This, in turn, saves only one character over
    r("123/234")

Here's an example I wrote for work:
    class Dimension: ...

    class DimensionMaker:
        def __call__(self, v):
            return Dimension(v)

        def __add__(self, v):
            return Dimension(v)

    D = DimensionMaker()
I don't know if we'll ultimately judge the D+"..." syntax justified,
given that it feels yucky and saves only one character.

Note that we're also treading very close to allowing function calls
without parens, if we allow an arbitrary identifier before string
literals.  What actually happens if you write
    trailer: test | '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
in the grammar and change the compiler accordingly?  I guess the problem
becomes that '(' could be the beginning of a testlist from inside atom,
but if you could arrange for '(' here to always start an arglist
instead, or invent a new production "altpower"
    trailer: altpower | '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
    altpower: altatom trailer*
    altatom: NAME | NUMBER | STRING+
Now,
    a x.y.z()[:]
becomes legal syntax (and would be a call to 'a' with one arg,
x.y.z()[:])

Likewise, D"123/234" becomes legal, and is equivalent to D("123/234").
You have a problem with anything now recognized as a prefix of a string,
so r"123/234" can't work as $r"123/234" is proposed to work.  Of course,
you could make R 123/234 work, since that'd be (R 123)/234 which would
be R(123)/234.

Personally, I think all of this is pretty ugly.

Jeff


From mal@lemburg.com  Fri Sep 27 14:47:58 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 27 Sep 2002 15:47:58 +0200
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
References: <200209271104.MAA27895@synaptics-uk.com> <E17utgt-00071r-00@mail.python.org> <007f01c26621$92d3c060$6d94fea9@newmexico>
Message-ID: <3D94618E.5070509@lemburg.com>

Samuele Pedroni wrote:
> From: Alex Martelli <aleax@aleax.it>
> 
>>I thought this use would have to return the sequence of
>>tokens for identifier 'Rational', open parenthesis, literal
>>(value of) numerator, comma, literal (value of) denominator,
>>closed parenthesis -- which in turn is why I thought of an
>>arbitrary sequence of tokens.  If a single instance of any
>>arbitrary class may be returned and get treated as a
>>literal token by the parser, then that's much better
> 
> 
> indeed, because then otherwise
> 
> $r"123/234" = literal transformation => Rational(123,234)
> 
> would require Rational to be installed in the builtins, or some kind of
> implicit import (ugly) or people would have to remember to put an explicit from
> ... import Rational in all modules that use $r, one import per program just to
> register $r would not be enough.

These are implementation details, e.g. if Python would
provide a way to register new modifiers, these would only
start working after having been registered.

Let's say that a user wants 123I to map to mx.Number.Integer(123),
then he'd have to make sure that mx.Number is imported in
sitecustomize.py to have Python load modules containing the
123I literal using the registered object constructor for that
literal modifier. Otherwise, the compiler or module loader
would fail. There should not be any magic imports going on
behind the scenes.

Note that the whole point of the idea is to simplify
using really basic types. Anything more complicated
than a single character modifier would fail to meet
this requirement.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From martin@v.loewis.de  Fri Sep 27 14:55:48 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 27 Sep 2002 15:55:48 +0200
Subject: [Python-Dev] User extendable literal modifiers ?!
In-Reply-To: <3D9423FB.9070303@lemburg.com>
References: <3D9423FB.9070303@lemburg.com>
Message-ID: <m3ptuzeduj.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> Since these are numbers, it would be convenient if there were
> some way to create them in form of literals, much like 123L
> creates longs instead of integers or u"abc" gives you Unicode
> instead of an 8-bit string.

How would you marshal them?

Curious,

Martin


From pedronis@bluewin.ch  Fri Sep 27 14:45:57 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Fri, 27 Sep 2002 15:45:57 +0200
Subject: R: [Python-Dev] Re: User extendable literal modifiers ?!
References: <200209271104.MAA27895@synaptics-uk.com> <E17utgt-00071r-00@mail.python.org> <007f01c26621$92d3c060$6d94fea9@newmexico> <3D94618E.5070509@lemburg.com>
Message-ID: <022e01c2662c$354c9240$6d94fea9@newmexico>

From: M.-A. Lemburg <mal@lemburg.com>
> Samuele Pedroni wrote:
> > From: Alex Martelli <aleax@aleax.it>
> >
> >>I thought this use would have to return the sequence of
> >>tokens for identifier 'Rational', open parenthesis, literal
> >>(value of) numerator, comma, literal (value of) denominator,
> >>closed parenthesis -- which in turn is why I thought of an
> >>arbitrary sequence of tokens.  If a single instance of any
> >>arbitrary class may be returned and get treated as a
> >>literal token by the parser, then that's much better
> >
> >
> > indeed, because then otherwise
> >
> > $r"123/234" = literal transformation => Rational(123,234)
> >
> > would require Rational to be installed in the builtins, or some kind of
> > implicit import (ugly) or people would have to remember to put an explicit
> > from ... import Rational in all modules that use $r, one import per program
> > just to register $r would not be enough.
>
> These are implementation details, e.g. if Python would
> provide a way to register new modifiers, these would only
> start working after having been registered.
>
> Let's say that a user wants 123I to map to mx.Number.Integer(123),
> then he'd have to make sure that mx.Number is imported in
> sitecustomize.py to have Python load modules containing the
> 123I literal using the registered object constructor for that
> literal modifier. Otherwise, the compiler or module loader
> would fail. There should not be any magic imports going on
> behind the scenes.

Yes, but that's my point: I simply pointed out that the strategy that simply
re-interprets $r through a lexical transformation does not work.

regards.




From guido@python.org  Fri Sep 27 15:08:02 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 10:08:02 -0400
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
In-Reply-To: Your message of "Fri, 27 Sep 2002 15:47:58 +0200."
 <3D94618E.5070509@lemburg.com>
References: <200209271104.MAA27895@synaptics-uk.com> <E17utgt-00071r-00@mail.python.org> <007f01c26621$92d3c060$6d94fea9@newmexico>
 <3D94618E.5070509@lemburg.com>
Message-ID: <200209271408.g8RE82m05291@pcp02138704pcs.reston01.va.comcast.net>

Given all the discussion, this will need a PEP first.

I'd suggest Marc-Andre and Alex as co-authors, but that's up to you.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Fri Sep 27 15:30:35 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 27 Sep 2002 16:30:35 +0200
Subject: [Python-Dev] User extendable literal modifiers ?!
References: <3D9423FB.9070303@lemburg.com> <m3ptuzeduj.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D946B8B.3080303@lemburg.com>

Martin v. Loewis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
>>Since these are numbers, it would be convenient if there were
>>some way to create them in form of literals, much like 123L
>>creates longs instead of integers or u"abc" gives you Unicode
>>instead of an 8-bit string.
> 
> 
> How would you marshal them?

Using a new marshal token which only stores the modifier together
with the literal as string. marshal.load() would then restore
the object by looking up the constructor in the modifier registry
and calling it with the string argument.
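
A rough sketch of that scheme, in present-day Python 3. It does not add a
new marshal token; instead it assumes the (modifier, literal string) pair
is stored as an ordinary marshalled tuple. The names modifier_registry,
dump_literal and load_literal are hypothetical, and int stands in for
mx.Number.Integer.

```python
import marshal

# Hypothetical modifier registry: modifier character -> constructor
# that takes the literal's source text.
modifier_registry = {}

def dump_literal(modifier, literal_text):
    """Store only the modifier and the literal as a string."""
    return marshal.dumps((modifier, literal_text))

def load_literal(data):
    """Restore the object by looking up the constructor at load time."""
    modifier, literal_text = marshal.loads(data)
    constructor = modifier_registry[modifier]   # KeyError if never registered
    return constructor(literal_text)

modifier_registry["I"] = int          # stand-in for mx.Number.Integer
stored = dump_literal("I", "123")
restored = load_literal(stored)
```

Loading data whose modifier was never registered fails with a KeyError
in this sketch.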

But that's just an implementation detail. What's more important
is whether this idea raises interest or not. I'm not sure
myself whether it's a good idea and that's why I posted the
idea here.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From guido@python.org  Fri Sep 27 16:04:36 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 11:04:36 -0400
Subject: [Python-Dev] User extendable literal modifiers ?!
In-Reply-To: Your message of "Fri, 27 Sep 2002 16:30:35 +0200."
 <3D946B8B.3080303@lemburg.com>
References: <3D9423FB.9070303@lemburg.com> <m3ptuzeduj.fsf@mira.informatik.hu-berlin.de>
 <3D946B8B.3080303@lemburg.com>
Message-ID: <200209271504.g8RF4at05601@pcp02138704pcs.reston01.va.comcast.net>

> What's more important is whether this idea raises interest or
> not. I'm not sure myself whether it's a good idea and that's why I
> posted the idea here.

There are lots of possibilities for overgeneralization here.
E.g. most of the examples of the $x"..." syntax are just as easily
done using a function call, either passing a string or a few numbers.

One danger of new notations is that it could be much harder to find
out what it means if you're not familiar with a program.  If you see a
call to Frobozz(1, 2), it usually isn't hard to find the definition of
Frobozz -- at the worst, it's hidden in an "import *", and that's one
reason to avoid those.  But if you see $f"1 2" in a file, you may have
to grep all code that is imported by the program containing that file
for calls to sys.register.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Sep 27 16:19:26 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 11:19:26 -0400
Subject: [Python-Dev] sorted()
In-Reply-To: Your message of "Wed, 25 Sep 2002 17:07:13 EDT."
 <oqsmzxaida.fsf@carouge.sram.qc.ca>
References: <oqsmzxaida.fsf@carouge.sram.qc.ca>
Message-ID: <200209271519.g8RFJQK05777@pcp02138704pcs.reston01.va.comcast.net>

Since François is probably waiting for a pronouncement from me, let me
say that I think this is a problem that should not be addressed by
changes to the language, builtins or library.

A sorted() method for lists would require a copy.  François argues
that the extra space could be used by the sorting algorithm.  But if
the requirement is that the original array must not be shuffled at
all, I expect that there's no way you can make use of the extra space:
you have to make a copy of the whole list first, which then gets
shuffled in various ways.  I suppose it would be possible to write a
sorting algorithm that made some use of the availability of an output
array, but rewriting the sort code once again so that you can avoid
writing a three line function doesn't seem a good trade-off.
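
For reference, the three-line function could look like the sketch below
(Python 3 syntax; the name sorted_copy is an assumption chosen for
illustration):

```python
def sorted_copy(seq):
    """Return a new sorted list; the original sequence is left untouched."""
    result = list(seq)   # the copy that cannot be avoided
    result.sort()        # in-place sort of the copy
    return result

data = [3, 1, 2]
ordered = sorted_copy(data)
```

After this runs, ordered is [1, 2, 3] and data is still [3, 1, 2].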

More generalized solutions seem overkill: I've not seen demand for
sorting other container types (except for list subclasses).

The argument against making sort() return self (while sorting
in-place) still holds, and this argument also means that having a
sorted() that sorts in-place is a bad idea.

You could consider adding a "sort" option to keys(), values() and
items(), but that doesn't solve other similar cases.

I think you'll just have to live with it.  Or you can create a dict
subclass that sorts its keys.
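
One possible shape for such a subclass, sketched in Python 3. The class
name SortedKeysDict is hypothetical, and returning lists from keys(),
values() and items() is a simplifying assumption.

```python
class SortedKeysDict(dict):
    """Hypothetical dict subclass that hands out its keys in sorted order."""

    def keys(self):
        result = list(dict.keys(self))
        result.sort()
        return result

    def values(self):
        return [self[key] for key in self.keys()]

    def items(self):
        return [(key, self[key]) for key in self.keys()]

    def __iter__(self):
        return iter(self.keys())

ages = SortedKeysDict(carol=41, alice=30, bob=25)
names = ages.keys()
pairs = ages.items()
```

Here names is ['alice', 'bob', 'carol'] and pairs follows the same
order.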

--Guido van Rossum (home page: http://www.python.org/~guido/)


From dave@boost-consulting.com  Fri Sep 27 15:45:22 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Fri, 27 Sep 2002 10:45:22 -0400
Subject: [Python-Dev] Keyword for first argument of methods?
Message-ID: <049f01c26634$afbda890$6501a8c0@boostconsulting.com>

Hi,

When implementing keyword argument support for Boost.Python, I noticed the
following. I'm sure it's not worth a lot of effort to change this behavior,
but I thought someone might like to know:

>>> class X:
...     def foo(self, y): print y
...
>>> X.foo(y = 1, self = X())
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unbound method foo() must be called with X instance as first
argument (got nothing instead)

-Dave

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com





From skip@pobox.com  Fri Sep 27 16:15:44 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 27 Sep 2002 10:15:44 -0500
Subject: [Python-Dev] User extendable literal modifiers ?!
In-Reply-To: <200209271504.g8RF4at05601@pcp02138704pcs.reston01.va.comcast.net>
References: <3D9423FB.9070303@lemburg.com>
 <m3ptuzeduj.fsf@mira.informatik.hu-berlin.de>
 <3D946B8B.3080303@lemburg.com>
 <200209271504.g8RF4at05601@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15764.30240.265628.247675@12-248-11-90.client.attbi.com>

    Guido> But if you see $f"1 2" in a file, you may have to grep all code
    Guido> that is imported by the program containing that file for calls to
    Guido> sys.register.

Even worse, if you happen to see it in isolation (a module disconnected from
the program it was written for), you might have no way to find out what the
$f prefix means.

Skip



From guido@python.org  Fri Sep 27 16:32:57 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 11:32:57 -0400
Subject: [Python-Dev] Keyword for first argument of methods?
In-Reply-To: Your message of "Fri, 27 Sep 2002 10:45:22 EDT."
 <049f01c26634$afbda890$6501a8c0@boostconsulting.com>
References: <049f01c26634$afbda890$6501a8c0@boostconsulting.com>
Message-ID: <200209271532.g8RFWv405872@pcp02138704pcs.reston01.va.comcast.net>

> When implementing keyword argument support for Boost.Python, I
> noticed the following. I'm sure it's not worth a lot of effort to
> change this behavior, but I thought someone might like to know:
> 
> >>> class X:
> ...     def foo(self, y): print y
> ...
> >>> X.foo(y = 1, self = X())
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: unbound method foo() must be called with X instance as first
> argument (got nothing instead)

You can't pass in self to an unbound method as a keyword argument --
it has to be the first positional argument.  The unbound method
__call__ implementation contains a check that ensures that the 'self'
argument is an instance of the class, but when this check is made, it
cannot assume that the 'self' argument is actually called 'self' --
that's only a naming convention.  It also doesn't know (in general)
the name of the first argument to the underlying function, since the
function can be an arbitrary callable -- there's no standard
introspection interface for callables to find out the argument names.

As you said, I see no reason to try to work harder in the case that
the underlying callable supports introspection using
obj.func_code.co_{argcount,varnames}.
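
To make the trade-off concrete, here is a sketch of what "working
harder" might involve, written in present-day Python 3 (where func_code
is spelled __code__ and X.foo is a plain function). The helper
call_unbound is hypothetical; it emulates the instance check and falls
back to introspection only when the callable exposes a code object.

```python
def call_unbound(cls, func, *args, **kwargs):
    """Emulate the unbound-method check, also accepting 'self' by keyword."""
    if args:
        instance = args[0]
    else:
        # Arbitrary callables may have no code object to inspect.
        code = getattr(func, "__code__", None)
        if code is not None and code.co_argcount:
            first_name = code.co_varnames[0]   # usually, but not always, 'self'
            instance = kwargs.get(first_name)
        else:
            instance = None
    if not isinstance(instance, cls):
        raise TypeError("%s() must be called with %s instance as first argument"
                        % (func.__name__, cls.__name__))
    return func(*args, **kwargs)

class X:
    def foo(self, y):
        return y

by_keyword = call_unbound(X, X.foo, y=1, self=X())
by_position = call_unbound(X, X.foo, X(), 1)
```

Both calls return 1, while call_unbound(X, X.foo, y=1) raises TypeError.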

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ark@research.att.com  Fri Sep 27 16:26:42 2002
From: ark@research.att.com (Andrew Koenig)
Date: 27 Sep 2002 11:26:42 -0400
Subject: [Python-Dev] Keyword for first argument of methods?
In-Reply-To: <049f01c26634$afbda890$6501a8c0@boostconsulting.com>
References: <049f01c26634$afbda890$6501a8c0@boostconsulting.com>
Message-ID: <yu991y7f1mj1.fsf@europa.research.att.com>

Dave> When implementing keyword argument support for Boost.Python, I noticed the
Dave> following. I'm sure it's not worth a lot of effort to change this behavior,
Dave> but I thought someone might like to know:

Dave> class X:
Dave> ...     def foo(self, y): print y
Dave> ...
Dave> X.foo(y = 1, self = X())
Dave> Traceback (most recent call last):
Dave>   File "<stdin>", line 1, in ?
Dave> TypeError: unbound method foo() must be called with X instance as first
Dave> argument (got nothing instead)

Perhaps more interesting:

        >>> X.foo(X(), 1)
        1
        >>> X.foo(self = X(), y = 1)
        TypeError: unbound method foo() must be called with X instance as first argument (got nothing instead)

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From dave@boost-consulting.com  Fri Sep 27 16:01:54 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Fri, 27 Sep 2002 11:01:54 -0400
Subject: [Python-Dev] Keyword for first argument of methods?
References: <049f01c26634$afbda890$6501a8c0@boostconsulting.com> <yu991y7f1mj1.fsf@europa.research.att.com>
Message-ID: <04c801c26636$d24eec50$6501a8c0@boostconsulting.com>

From: "Andrew Koenig" <ark@research.att.com>


> Dave> When implementing keyword argument support for Boost.Python, I noticed the
> Dave> following. I'm sure it's not worth a lot of effort to change this behavior,
> Dave> but I thought someone might like to know:
>
> Dave> class X:
> Dave> ...     def foo(self, y): print y
> Dave> ...
> Dave> X.foo(y = 1, self = X())
> Dave> Traceback (most recent call last):
> Dave>   File "<stdin>", line 1, in ?
> Dave> TypeError: unbound method foo() must be called with X instance as first
> Dave> argument (got nothing instead)
>
> Perhaps more interesting:
>
>         >>> X.foo(X(), 1)
>         1
>         >>> X.foo(self = X(), y = 1)
>         TypeError: unbound method foo() must be called with X instance as first argument (got nothing instead)

Given my post, that behavior falls out of the (nicely documented) rules for
how functions are called, so it's unsurprising if you read the docs. I
wonder if that makes any difference in the real world ;-)


-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com



From ark@research.att.com  Fri Sep 27 16:31:33 2002
From: ark@research.att.com (Andrew Koenig)
Date: Fri, 27 Sep 2002 11:31:33 -0400 (EDT)
Subject: [Python-Dev] Keyword for first argument of methods?
In-Reply-To: <04c801c26636$d24eec50$6501a8c0@boostconsulting.com>
 (dave@boost-consulting.com)
References: <049f01c26634$afbda890$6501a8c0@boostconsulting.com> <yu991y7f1mj1.fsf@europa.research.att.com> <04c801c26636$d24eec50$6501a8c0@boostconsulting.com>
Message-ID: <200209271531.g8RFVXY07106@europa.research.att.com>

Dave> Given my post, that behavior falls out of the (nicely
Dave> documented) rules for how functions are called, so it's
Dave> unsurprising if you read the docs. I wonder if that makes any
Dave> difference in the real world ;-)

Probably not.



From haering_python@gmx.de  Fri Sep 27 01:42:54 2002
From: haering_python@gmx.de (Gerhard =?iso-8859-1?Q?H=E4ring?=)
Date: Fri, 27 Sep 2002 02:42:54 +0200
Subject: [Python-Dev] Strange bug only happens with Python 2.2
Message-ID: <20020927004254.GA2069@lilith.ghaering.test>

This is somewhat off-topic, but I'm hoping maybe someone can give a hint
why this only happens on Python 2.2.1.

Ok, here's the story:

I've had a bug report against our pyPgSQL database interface package that
retrieving Large Objects doesn't work with Python 2.2.1. The reproducible
traceback we get is:

Traceback (most recent call last):
  File "p.py", line 20, in ?
    res = cs.fetchone()
  File "pyPgSQL/PgSQL.py", line 2672, in fetchone
    return self.__fetchOneRow()
  File "pyPgSQL/PgSQL.py", line 2281, in __fetchOneRow
    for _i in range(self.res.nfields):
AttributeError: 'str' object has no attribute '__bases__'

This traceback is quite obviously bogus, as self.res.nfields is a Python
int and no strings are involved here whatsoever. After some debugging, I
found that something very strange happens in a function call that
happens in this for loop. Inside the for loop, a function typecast is
called, which has this code within:

if isinstance(value, PgBytea) or type(value) is PgLargeObjectType:

This code is causing the problems which result in the bogus traceback
later on.

Now in my case, 'value' is of type PgLargeObjectType, which is a custom
type from our extension module. PgBytea is a Python class.

Now comes the first very strange observation: Swapping the checks, so
that the 'type(value) is PgLargeObjectType' check comes first makes the
problem go away. So my conclusion is that there's some problem with
isinstance and my custom extension type.

The second strange thing is that this only happens on Python 2.2.1
(Linux, FreeBSD, Windows), but _not_ on Python 2.1.3 or Python 2.3-CVS.

Oh, the problem isn't tied to isinstance(value, PgBytea). Any isinstance
check causes it later on.

Of course I'm suspecting that there's some problem with the extension
type. Looks like some internal interpreter data gets corrupted. No idea
how to debug that, too.

Does anybody have any tips where to look or how to debug this further?

-- Gerhard


From mwh@python.net  Fri Sep 27 16:39:42 2002
From: mwh@python.net (Michael Hudson)
Date: Fri, 27 Sep 2002 16:39:42 +0100 (BST)
Subject: [Python-Dev] Strange bug only happens with Python 2.2
In-Reply-To: <20020927004254.GA2069@lilith.ghaering.test>
Message-ID: <Pine.LNX.4.44.0209271637140.31133-100000@starship.python.net>

On Fri, 27 Sep 2002, Gerhard Häring wrote:

> This is somewhat off-topic, but I'm hoping maybe someone can give a hint
> why this only happens on Python 2.2.1.

Guessing, but the (Jeremy's?) changes I recently backported to
classobject.c on the release22-maint branch might relate to this.

Can you try with a 222 build?

> Ok, here's the story:
>
> I've had a bug report against our pyPgSQL database interface package that
> retrieving Large Objects doesn't work with Python 2.2.1. The reproducible
> traceback we get is:
>
> Traceback (most recent call last):
>   File "p.py", line 20, in ?
>     res = cs.fetchone()
>   File "pyPgSQL/PgSQL.py", line 2672, in fetchone
>     return self.__fetchOneRow()
>   File "pyPgSQL/PgSQL.py", line 2281, in __fetchOneRow
>     for _i in range(self.res.nfields):
> AttributeError: 'str' object has no attribute '__bases__'
>
> This traceback is quite obviously bogus, as self.res.nfields is a Python
> int and no strings are involved here whatsoever. After some debugging, I
> found that something very strange happens in a function call that
> happens in this for loop. Inside the for loop, a function typecast is
> called, which has this code within:
>
> if isinstance(value, PgBytea) or type(value) is PgLargeObjectType:
>
> This code is causing the problems which result in the bogus traceback
> later on.

So something's setting an exception and not letting the interpreter know.

> Now in my case, 'value' is of type PgLargeObjectType, which is a custom
> type from our extension module. PgBytea is a Python class.
>
> Now comes the first very strange observation: Swapping the checks, so
> that the 'type(value) is PgLargeObjectType' check comes first makes the
> problem go away. So my conclusion is that there's some problem with
> isinstance and my custom extension type.
>
> The second strange thing is that this only happens on Python 2.2.1
> (Linux, FreeBSD, Windows), but _not_ on Python 2.1.3 or Python 2.3-CVS.

This is no surprise.

> Oh, the problem isn't tied to isinstance(value, PgBytea). Any isinstance
> check causes it later on.

Huh?

> Of course I'm suspecting that there's some problem with the extension
> type. Looks like some internal interpreter data gets corrupted. No idea
> how to debug that, too.
>
> Does anybody have any tips where to look or how to debug this further?

Try a release22-maint build?

Cheers,
M.



From guido@python.org  Fri Sep 27 17:04:01 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 12:04:01 -0400
Subject: [Python-Dev] Strange bug only happens with Python 2.2
In-Reply-To: Your message of "Fri, 27 Sep 2002 02:42:54 +0200."
 <20020927004254.GA2069@lilith.ghaering.test>
References: <20020927004254.GA2069@lilith.ghaering.test>
Message-ID: <200209271604.g8RG41M06049@pcp02138704pcs.reston01.va.comcast.net>

> This is somewhat off-topic, but I'm hoping maybe someone can give a hint
> why this only happens on Python 2.2.1.
> 
> Ok, here's the story:
> 
> I've had a bug report against our pyPgSQL database interface package that
> retrieving Large Objects doesn't work with Python 2.2.1. The reproducible
> traceback we get is:
> 
> Traceback (most recent call last):
>   File "p.py", line 20, in ?
>     res = cs.fetchone()
>   File "pyPgSQL/PgSQL.py", line 2672, in fetchone
>     return self.__fetchOneRow()
>   File "pyPgSQL/PgSQL.py", line 2281, in __fetchOneRow
>     for _i in range(self.res.nfields):
> AttributeError: 'str' object has no attribute '__bases__'
> 
> This traceback is quite obviously bogus, as self.res.nfields is a Python
> int and no strings are involved here whatsoever. After some debugging, I
> found that something very strange happens in a function call that
> happens in this for loop. Inside the for loop, a function typecast is
> called, which has this code within:
> 
> if isinstance(value, PgBytea) or type(value) is PgLargeObjectType:
> 
> This code is causing the problems which result in the bogus traceback
> later on.
> 
> Now in my case, 'value' is of type PgLargeObjectType, which is a custom
> type from our extension module. PgBytea is a Python class.
> 
> Now comes the first very strange observation: Swapping the checks, so
> that the 'type(value) is PgLargeObjectType' check comes first makes the
> problem go away. So my conclusion is that there's some problem with
> isinstance and my custom extension type.
> 
> The second strange thing is that this only happens on Python 2.2.1
> (Linux, FreeBSD, Windows), but _not_ on Python 2.1.3 or Python 2.3-CVS.
> 
> Oh, the problem isn't tied to isinstance(value, PgBytea). Any isinstance
> check causes it later on.
> 
> Of course I'm suspecting that there's some problem with the extension
> type. Looks like some internal interpreter data gets corrupted. No idea
> how to debug that, too.
> 
> Does anybody have any tips where to look or how to debug this further?

Probably some C code receives an exception and decides to go a
different path (rather than propagating the exception), but forgets to
call PyErr_Clear().

If you call some other code that raises an exception or calls
PyErr_Clear(), the spurious exception is gone; but if you call some
other code that *tests* for an exception (usually with
PyErr_Occurred() or PyErr_ExceptionMatches()), that code may raise the
bogus exception at an unexpected place.

So I'd look in your extension for places where it tests for an
exception and decides to ignore it but forgets to clear it.
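
To make the failure mode concrete, here is a toy model of it in Python 3.
It is only a sketch: the ErrorIndicator class and all helper names are
invented for illustration and merely stand in for the interpreter's
pending-exception slot and the PyErr_* calls.  It is not how the C API is
implemented.

```python
class ErrorIndicator:
    """Toy stand-in (hypothetical) for the pending-exception slot."""

    def __init__(self):
        self.pending = None

    def set(self, exc):          # plays the role of setting an error in C
        self.pending = exc

    def occurred(self):          # plays the role of PyErr_Occurred()
        return self.pending is not None

    def clear(self):             # plays the role of PyErr_Clear()
        self.pending = None


def find_method(err, methods, name):
    """Return the method, or None with an error set (the C 'NULL' convention)."""
    if name not in methods:
        err.set(AttributeError(name))
        return None
    return methods[name]


def getattr_buggy(err, methods, name, default):
    value = find_method(err, methods, name)
    if value is None:
        return default           # BUG: takes another path, error left pending
    return value


def getattr_fixed(err, methods, name, default):
    value = find_method(err, methods, name)
    if value is None:
        err.clear()              # ignore the error explicitly before moving on
        return default
    return value


def later_unrelated_code(err):
    """Some later code that merely *tests* for an exception."""
    if err.occurred():
        exc, err.pending = err.pending, None
        raise exc                # surfaces far away from where it was set
    return "ok"


err = ErrorIndicator()
getattr_fixed(err, {}, "closed", 0)
print(later_unrelated_code(err))         # ok

getattr_buggy(err, {}, "closed", 0)
try:
    later_unrelated_code(err)
except AttributeError as exc:
    print("bogus error surfaced later:", exc)
```

Both getattr variants return the same default to their caller; the only
difference is whether a stale error is left behind for unrelated code to
trip over.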

It's also possible that this occurs in the Python code (have you tried
the 2.2.2 CVS?  Use "cvs update -r release22-maint") but if I had to
bet, I'd bet on your SQL extension. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From haering_python@gmx.de  Fri Sep 27 17:23:13 2002
From: haering_python@gmx.de (Gerhard =?iso-8859-1?Q?H=E4ring?=)
Date: Fri, 27 Sep 2002 18:23:13 +0200
Subject: [Python-Dev] Strange bug only happens with Python 2.2
In-Reply-To: <Pine.LNX.4.44.0209271637140.31133-100000@starship.python.net>
References: <20020927004254.GA2069@lilith.ghaering.test> <Pine.LNX.4.44.0209271637140.31133-100000@starship.python.net>
Message-ID: <20020927162313.GA6854@lilith.ghaering.test>

* Michael Hudson <mwh@python.net> [2002-09-27 16:39 +0100]:
> On Fri, 27 Sep 2002, Gerhard Häring wrote:
> 
> > This is somewhat off-topic, but I'm hoping maybe someone can give a hint
> > why this only happens on Python 2.2.1.
> 
> Guessing, but the (Jeremy's?) changes I recently backported to 
> classobject.c on the release22-maint branch might relate to this.
> 
> Can you try with a 222 build?

Yep. The problem goes away with release22-maint :-)

> > Ok, here's the story:
> > [bogus traceback, caused by:]
> > if isinstance(value, PgBytea) or type(value) is PgLargeObjectType:
> 
> So something's setting an exception and not letting the interpreter know.

> > Oh, the problem isn't tied to isinstance(value, PgBytea). Any isinstance
> > check causes it later on.
> 
> Huh?

To clarify, any isinstance(value, x), where x is a Python class, causes
the problem.

> > [Any tips?]
> Try a release22-maint build?

That fixes the problem, so I'm now pretty confident it is a Python 2.2.1
problem (haven't tried with 2.2.0 yet, would that be of any use?).

-- Gerhard


From haering_python@gmx.de  Fri Sep 27 17:24:43 2002
From: haering_python@gmx.de (Gerhard =?iso-8859-1?Q?H=E4ring?=)
Date: Fri, 27 Sep 2002 18:24:43 +0200
Subject: [Python-Dev] Strange bug only happens with Python 2.2
In-Reply-To: <200209271604.g8RG41M06049@pcp02138704pcs.reston01.va.comcast.net>
References: <20020927004254.GA2069@lilith.ghaering.test> <200209271604.g8RG41M06049@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020927162443.GB6854@lilith.ghaering.test>

* Guido van Rossum <guido@python.org> [2002-09-27 12:04 -0400]:
> It's also possible that this occurs in the Python code (have you tried
> the 2.2.2 CVS?

Yep, the problem goes away, then.

> Use "cvs update -r release22-maint") but if I had to bet, I'd bet on
> your SQL extension. :-)

How much? ;-)

-- Gerhard


From guido@python.org  Fri Sep 27 18:03:08 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 13:03:08 -0400
Subject: [Python-Dev] Strange bug only happens with Python 2.2
In-Reply-To: Your message of "Fri, 27 Sep 2002 18:23:13 +0200."
 <20020927162313.GA6854@lilith.ghaering.test>
References: <20020927004254.GA2069@lilith.ghaering.test> <Pine.LNX.4.44.0209271637140.31133-100000@starship.python.net>
 <20020927162313.GA6854@lilith.ghaering.test>
Message-ID: <200209271703.g8RH38q07337@pcp02138704pcs.reston01.va.comcast.net>

> That fixes the problem, so I'm now pretty confident it is a Python 2.2.1
> problem (haven't tried with 2.2.0 yet, would that be of any use?).

No.  If it's really fixed in 2.2.2, there's nothing else we can do.

But I'm curious what caused this.  Can you show self-contained example
code (not using SQL) that shows this behavior in 2.2.1?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Fri Sep 27 19:09:32 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 27 Sep 2002 20:09:32 +0200
Subject: [Python-Dev] User extendable literal modifiers ?!
References: <3D9423FB.9070303@lemburg.com> <m3ptuzeduj.fsf@mira.informatik.hu-berlin.de>              <3D946B8B.3080303@lemburg.com> <200209271504.g8RF4at05601@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D949EDC.5040202@lemburg.com>

Guido van Rossum wrote:
>>What's more important is whether this ideas raises interest or
>>not. I'm not sure myself whether it's a good idea and that's why I
>>posted the idea here.
> 
> 
> There are lots of possibilities for overgeneralization here.
> E.g. most of the examples of the $x"..." syntax are just as easily
> done using a function call, either passing a string or a few numbers.
> 
> One danger of new notations is that it could be much harder to find
> out what it means if you're not familiar with a program.  If you see a
> call to Frobozz(1, 2), it usually isn't hard to find the definition of
> Frobozz -- at the worst, it's hidden in an "import *", and that's one
> reason to avoid those.  But if you see $f"1 2" in a file, you may have
> to grep all code that is imported by the program containing that file
> for calls to sys.register.

You're probably right.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From tim.one@comcast.net  Fri Sep 27 19:10:08 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 27 Sep 2002 14:10:08 -0400
Subject: [Python-Dev] sorted()
In-Reply-To: <200209271519.g8RFJQK05777@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOECDBHAB.tim.one@comcast.net>

[Guido]
> ...
> A sorted() method for lists would require a copy.  François argues
> that the extra space could be used by the sorting algorithm.  But if
> the requirement is that the original array must not be shuffled at
> all, I expect that there's no way you can make use of the extra space:
> you have to make a copy of the whole list first, which then gets
> shuffled in various ways.
>
> I suppose it would be possible to write a sorting algorithm that made
> some use of the availability of an output array, but rewriting the
> sort code once again so that you can avoid writing a three line
> function doesn't seem a good trade-off.

There's no efficiency argument to be made here unless someone can write a
sort function this way and demonstrate an improvement.

I expect that would be hard.  Back when I wrote the samplesort hybrid, I
tried several ways of coding mergesorts too, and they all lost on random
data.  They all used a temp array of the same size as the original array.
The current mergesort does not:  it uses a temp array at most half the size.
This effectively doubled the amount of code needed, but cut the size of the
working set.  I first tried the current mergesort again with a temp array
the same size as the original, but it again lost (a little on random data, a
lot on many kinds of partially ordered data -- for example, take a sorted
array, and move its last element to the front; no matter how large the
array, the current mergesort only needs a few dozen temp words to get it
sorted again, and caches are much happier with that).
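
For reference, the "three line function" Guido alludes to might look like
the sketch below.  The name sorted_copy is made up here, and the test data
is just the partially ordered case described above (a sorted array with its
last element moved to the front); nothing in it measures memory use.

```python
def sorted_copy(seq):
    """Return a new sorted list; the input is left untouched."""
    result = list(seq)   # the unavoidable copy
    result.sort()        # in-place sort of the copy
    return result


# A sorted array with its last element moved to the front.
data = list(range(10))
data.insert(0, data.pop())
print(data)               # [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]
print(sorted_copy(data))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The copy is made once up front, and the in-place sort then works on the
copy only, so the original list keeps its order.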




From jrw@pobox.com  Fri Sep 27 19:51:57 2002
From: jrw@pobox.com (John Williams)
Date: Fri, 27 Sep 2002 13:51:57 -0500
Subject: [Python-Dev] proposal for interfaces
Message-ID: <005001c26657$05000330$0100a8c0@shura>

I have an idea for an interface mechanism for Python, and I'd like to see if
anyone likes it before writing an actual PEP.  The key features are:

- It's implementable in pure Python (I've already started working on it).
- The syntax to use it is fairly concise.
- Interfaces are inherited by default, but can be turned off.
- Classes are made to implement interfaces without altering the class
definition in any way.
- A class can support any number of interfaces, even multiple interfaces
that define methods with the same names.
- It's easily extensible to add new features in a backward-compatible way.
- It has support for design-by-contract idioms (this part is not essential
to the proposal, so I won't discuss it further here, but interfaces without
DBC seem kind of incomplete to me).

Basic Usage
===========

In actual practice it would look something like this:  Suppose you have a
class like this:

  class SomeClass:
    def foo(...): ...
    def bar(...): ...
    def foo2(...): ...

In the simplest case, suppose you have an interface Foo that defines a
single method, foo.  To declare that SomeClass implements Foo, you'd say:

  Foo.bind(SomeClass)

Now, suppose you have a function that requires an argument implementing
interface Foo.  You would probably code it like this:

  def foo_proc(foo_arg):
    foo_proxy = Foo(foo_arg)
    ...
    x = foo_proxy.foo(...)
    ...

Two things are happening here.  First, foo_arg is being checked to make sure
it implements Foo; if not, an InterfaceError will be raised.  Then,
foo_proxy becomes a proxy for foo_arg, but it *only* supports calling the
method foo, since that's all the interface defines.  (If foo_arg is already
a proxy object, the call to Foo will just return foo_arg.)

Defining an Interface
=====================

How is "Foo" defined?  It could look something like this:

  class Foo(interface):
    def doc_foo(self,...): "Docstring for foo method."

The "doc_" prefix on foo is not part of the method name; it is needed to
control how the interface treats the method.  If the method has default
behavior, you could say this instead:

  class Foo(interface):
    def default_foo(self,...):
      "Docstring for foo method."
      print "Defaults can be handy."

In this version, unlike the first, classes implementing Foo need not define
their own "foo" method if the default will suffice.  Requiring some sort of
prefix attached to every name defined by the interface is a little ugly, but
it opens up a lot of possibilities for creating different behaviors with a
minimum of fuss--I have a lot of uses in mind for different prefixes that I
won't go into here.

Advanced Examples
=================

Let's say we have a new interface, FooBar, defined like this:

  class FooBar(interface):
    def doc_foo(...): "Method foo."
    def doc_bar(...): "Method bar."

And suppose we'd like to make SomeClass above implement FooBar, but we want
FooBar.foo to call SomeClass.foo2 instead of SomeClass.foo.  It's easy!

  FooBar.bind(SomeClass)
  FooBar[SomeClass].foo = "foo2" # Override default binding.

Now we can do really confusing stuff:

  x = SomeClass()
  Foo(x).foo()     # Calls x.foo()
  FooBar(x).foo()  # Calls x.foo2()

Of course you probably wouldn't do something so confusing on purpose, but it
could be useful when an object must support two different interfaces
(written by different people) that happen to have method names in common, or
to connect a class to an interface where the class defines all the needed
functionality but the methods have the wrong names.

For the last trick, let's imagine you want to derive a subclass of
SomeClass.  If you want the new class to inherit all the interfaces, do
nothing.  To remove an interface, just do something like this:

  class AnotherClass(SomeClass): ...
  Foo.unbind(AnotherClass)

And it's done!
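
A minimal sketch of the mechanics described above, written for Python 3.
It is not the implementation mentioned in this message, only one way the
behaviour could be approximated.  Assumptions: the metaclass and proxy
classes are invented for illustration, only the "doc_" prefix is handled,
and the renames keyword of bind() is a hypothetical stand-in for the
FooBar[SomeClass].foo = "foo2" syntax.

```python
class InterfaceError(TypeError):
    """Raised when an object's class is not bound to the interface."""


class _Proxy:
    """Forwards only the interface's methods to the wrapped object."""

    def __init__(self, iface, target, mapping):
        self._iface = iface
        self._target = target
        self._mapping = mapping      # interface name -> attribute on target

    def __getattr__(self, name):
        # Reached only for names that were not set in __init__.
        if name.startswith("_") or name not in self._mapping:
            raise AttributeError(name)
        return getattr(self._target, self._mapping[name])


class _InterfaceMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # "doc_foo" declares an interface method named "foo".
        cls._methods = [k[4:] for k in namespace if k.startswith("doc_")]
        cls._bindings = {}           # class -> mapping, or None if unbound

    def bind(cls, klass, **renames):
        # renames: hypothetical replacement for FooBar[SomeClass].foo = "foo2"
        cls._bindings[klass] = {m: renames.get(m, m) for m in cls._methods}

    def unbind(cls, klass):
        cls._bindings[klass] = None  # blocks inheritance from base classes

    def __call__(cls, obj):
        if isinstance(obj, _Proxy) and obj._iface is cls:
            return obj               # already a proxy for this interface
        for base in type(obj).__mro__:
            if base in cls._bindings:
                mapping = cls._bindings[base]
                break
        else:
            mapping = None
        if mapping is None:
            raise InterfaceError(
                "%s does not implement %s" % (type(obj).__name__, cls.__name__))
        return _Proxy(cls, obj, mapping)


class interface(metaclass=_InterfaceMeta):
    pass


class SomeClass:
    def foo(self): return "foo"
    def bar(self): return "bar"
    def foo2(self): return "foo2"

class Foo(interface):
    def doc_foo(self): "Docstring for foo method."

class FooBar(interface):
    def doc_foo(self): "Method foo."
    def doc_bar(self): "Method bar."

class AnotherClass(SomeClass):
    pass

Foo.bind(SomeClass)
FooBar.bind(SomeClass, foo="foo2")   # FooBar.foo calls SomeClass.foo2
Foo.unbind(AnotherClass)

x = SomeClass()
print(Foo(x).foo())       # foo
print(FooBar(x).foo())    # foo2
```

In this sketch bindings are looked up along the class's MRO, which is what
makes them inherited by default; unbind() records an explicit "no" for the
subclass so the lookup stops there.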

Please let me know if you like this idea (or hate it).  If I get a good
response I'll try to write a PEP this weekend and make the implementation
available to try out.

--jw




From mark@freelance-developer.com  Fri Sep 27 20:44:37 2002
From: mark@freelance-developer.com (Mark Nenadov)
Date: Fri, 27 Sep 2002 15:44:37 -0400
Subject: [Python-Dev] proposal for interfaces
In-Reply-To: <005001c26657$05000330$0100a8c0@shura>
References: <005001c26657$05000330$0100a8c0@shura>
Message-ID: <200209271544.37296.mark@freelance-developer.com>

John,

I like your idea! I would look forward to seeing the implementation.

I personally prefer to have the "interface binding" be part of a class
definition. However, I can see some advantages to having the binding process
separate from the interface.

Good day,
~Mark


On September 27, 2002 02:51 pm, John Williams wrote:
> And it's done!
>
> Please let me know if you like this idea (or hate it).  If I get a good
> response I'll try to write a PEP this weekend and make the implementation
> available to try out.


From jriehl@spaceship.com  Fri Sep 27 20:44:19 2002
From: jriehl@spaceship.com (Jonathan Riehl)
Date: Fri, 27 Sep 2002 14:44:19 -0500 (CDT)
Subject: [Python-Dev] Extension module difficulty w/pgen.
Message-ID: <Pine.BSF.4.33.0209271428160.75360-100000@localhost>

Hi all,
	I know this is what I get for trying to integrate pgen into an
extension module: I can't get it to link properly.  I first saw the
following problem on a FreeBSD box.

Now, I have the following two external dependencies in my extension
module (.../src/Modules/pgenmodule.c):

extern grammar * _Py_pgen (node * n);
extern grammar * _Py_meta_grammar (void);

I added an entry to setup.py for it:

        exts.append( Extension('pgen', ['pgenmodule.c']))

Now when I run make, the extension module is not built, with the system
complaining about being unable to resolve "_Py_meta_grammar", but not
"_Py_pgen".  When I run nm, I can see both symbols in libpython2.3.a
(these symbols are in pgen.c and metagrammar.c, both of which have been
added to the libpython build target):

~/cvs/python/dist/src> nm ./libpython2.3.a | grep _Py_pgen
[26]    |      2676|      48|FUNC |GLOB |0    |2      |_Py_pgen
~/cvs/python/dist/src> nm ./libpython2.3.a | grep _Py_meta
[36]    |         0|      12|FUNC |GLOB |0    |2      |_Py_meta_grammar

On the FreeBSD box, I was able to add "-L. -lpython2.3" to the command
line, and this builds.  However, when I use this hack on a Solaris
platform, it complains about being unable to reserve a text offset for
most if not all of the symbols in libpython.

It seems to me that I should not have to use this workaround, which only
works on one of the systems I use.  Does anyone have an idea as to what I
should do now?  I am a bit confused by this, since Fred Drake's parser
extension does not require any of this wackiness.

As an aside, the code for the modules I am working on and the diffs are on
Sourceforge (PEP 269 implementation), so you can play too, if so inclined.

Thanks!
-Jon



From guido@python.org  Fri Sep 27 21:24:58 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 16:24:58 -0400
Subject: [Python-Dev] Extension module difficulty w/pgen.
In-Reply-To: Your message of "Fri, 27 Sep 2002 14:44:19 CDT."
 <Pine.BSF.4.33.0209271428160.75360-100000@localhost>
References: <Pine.BSF.4.33.0209271428160.75360-100000@localhost>
Message-ID: <200209272024.g8RKOwq22992@pcp02138704pcs.reston01.va.comcast.net>

> 	I know this is what I get for trying to integrate pgen into an
> extension module: I can't get it to link properly.  I first saw the
> following problem on a FreeBSD box.
> 
> Now, I have the following two external dependencies in my extension
> module (.../src/Modules/pgenmodule.c):
> 
> extern grammar * _Py_pgen (node * n);
> extern grammar * _Py_meta_grammar (void);
> 
> I added an entry to setup.py for it:
> 
>         exts.append( Extension('pgen', ['pgenmodule.c']))
>
> Now when I run make, the extension module is not built, with the system
> complaining about being unable to resolve "_Py_meta_grammar", but not
> "_Py_pgen".

Maybe you only get an error for the first unresolved symbol?

> When I run nm, I can see both symbols in libpython2.3.a
> (these symbols are in pgen.c and metagrammar.c, both of which have been
> added to the libpython build target):
> 
> ~/cvs/python/dist/src> nm ./libpython2.3.a | grep _Py_pgen
> [26]    |      2676|      48|FUNC |GLOB |0    |2      |_Py_pgen
> ~/cvs/python/dist/src> nm ./libpython2.3.a | grep _Py_meta
> [36]    |         0|      12|FUNC |GLOB |0    |2      |_Py_meta_grammar

Linux nm output looks very different, so I don't know what this
means.  Are you *sure* it doesn't mean that there are global
references but no definitions for these symbols?  And what does the 0
in the second column for _Py_meta_grammar mean?

> On the FreeBSD box, I was able to add "-L. -lpython2.3" to the command
> line, and this builds.  However, when I use this hack on a Solaris
> platform, it complains about being unable to reserve a text offset for
> most if not all of the symbols in libpython.
> 
> It seems to me that I should not have to use this workaround, which only
> works on one of the systems I use.  Does anyone have an idea as to what I
> should do now?  I am a bit confused by this, since Fred Drake's parser
> extension does not require any of this wackiness.
> 
> As an aside, the code for the modules I am working on and the diffs are on
> Sourceforge (PEP 269 implementation), so you can play too, if so inclined.

Maybe the problem is that nothing else uses these symbols?  Try
sticking dummy references (e.g. an unreachable call) to them in
main.c, to see if that makes a difference.  I recall we had to do this
for something else that wasn't used by Python itself.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pedronis@bluewin.ch  Fri Sep 27 21:40:30 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Fri, 27 Sep 2002 22:40:30 +0200
Subject: [Python-Dev] buitlins instance have modifiable __class__?
Message-ID: <0a0301c26666$1f0f7800$6d94fea9@newmexico>

A question on builtin types (under 2.2):

>>> d={}
>>> class ndict(dict):
...   __slots__ = ()
...   def __getitem__(self,k):
...    print "__getitem__"
...    return dict.__getitem__(self,k)
...
>>> d.items()
[]
>>> d['a']=3
>>> d.__class__=ndict

Is this intended to work?

it seems it does, but is that the intention?

>>> d['a']
__getitem__
3

[

>>> exec "print a" in d
3

Ok, that is the non cooperative behavior I already know about. ]

Thanks.



From pedronis@bluewin.ch  Fri Sep 27 22:07:21 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Fri, 27 Sep 2002 23:07:21 +0200
Subject: [Python-Dev] buitlins instance have modifiable __class__?
References: <0a0301c26666$1f0f7800$6d94fea9@newmexico>
Message-ID: <0ac201c26669$dee14660$6d94fea9@newmexico>

typos apart, there was also another question, sorry I was typing and reflecting
on the consequences of all of this on Jython ...

[me]
>
> >>> exec "print a" in d
> 3
>
> Ok, that is the non cooperative behavior I already know about. ]
>

I recall this was already discussed here, what is the idea, to leave it  as it
is or make this work?

Thanks.



From jriehl@spaceship.com  Fri Sep 27 22:20:20 2002
From: jriehl@spaceship.com (Jonathan Riehl)
Date: Fri, 27 Sep 2002 16:20:20 -0500 (CDT)
Subject: [Python-Dev] Extension module difficulty w/pgen.
In-Reply-To: <200209272024.g8RKOwq22992@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.BSF.4.33.0209271606060.75360-100000@localhost>

On Fri, 27 Sep 2002, Guido van Rossum wrote:
>
> Maybe you only get an error for the first unresolved symbol?
>

Yup.  When I comment out the references to _Py_meta_grammar(), it still
complains about not being able to see _Py_pgen().

> > When I run nm, I can see both symbols in libpython2.3.a
> > (these symbols are in pgen.c and metagrammar.c, both of which have been
> > added to the libpython build target):
> >
> > ~/cvs/python/dist/src> nm ./libpython2.3.a | grep _Py_pgen
> > [26]    |      2676|      48|FUNC |GLOB |0    |2      |_Py_pgen
> > ~/cvs/python/dist/src> nm ./libpython2.3.a | grep _Py_meta
> > [36]    |         0|      12|FUNC |GLOB |0    |2      |_Py_meta_grammar
>
> Linux nm output looks very different, so I don't know what this
> means.  Are you *sure* it doesn't mean that there are global
> references but no definitions for these symbols?  And what does the 0
> in the second column for _Py_meta_grammar mean?

Come on Guido, I thought you were at one point a Solaris hack ;-). FUNC
and GLOB means it is a global function defined in the module.  The second
column is some sort of memory offset value, and incidentally the third
column is the byte size reserved for the object (i.e. the objects should
be in the library, and are not just place holders).  Here is the output
from Linux (where I have just duplicated the problem):

$ nm libpython2.3.a | egrep "_Py_(meta|pgen)"
00000000 T _Py_meta_grammar
00000e38 T _Py_pgen

If I understood the GNU info file for binutils, this means that the
symbols are defined in the text segment, and should be available for
external linkage.

> Maybe the problem is that nothing else uses these symbols?  Try
> sticking dummy references (e.g. an unreachable call) to them in
> main.c, to see if that makes a difference.  I recall we had to do this
> for something else that wasn't used by Python itself.

I tried this just now, but to no avail.  Maybe I am not being thorough
enough.  If the linker is excluding these symbols because they are not
used, why would nm seem to say they are there, and why would statically
linking libpython work (on FreeBSD, anyway)?  Conversely, I seem to
remember this working on an earlier, but abandoned attempt I made on a
Linux box.

Maybe I just need more vacation. :P

Thanks!
-Jon



From gerhard.haering@gmx.de  Fri Sep 27 23:07:40 2002
From: gerhard.haering@gmx.de (Gerhard =?iso-8859-1?Q?H=E4ring?=)
Date: Sat, 28 Sep 2002 00:07:40 +0200
Subject: [Python-Dev] Strange bug only happens with Python 2.2
In-Reply-To: <Pine.LNX.4.44.0209271637140.31133-100000@starship.python.net>
References: <20020927004254.GA2069@lilith.ghaering.test> <Pine.LNX.4.44.0209271637140.31133-100000@starship.python.net>
Message-ID: <20020927220740.GA7751@lilith.ghaering.test>

* Michael Hudson <mwh@python.net> [2002-09-27 16:39 +0100]:
> On Fri, 27 Sep 2002, Gerhard Häring wrote:
> 
> > This is somewhat off-topic, but I'm hoping maybe someone can give a hint
> > why this only happens on Python 2.2.1.
> 
> Guessing, but the (Jeremy's?) changes I recently backported to 
> classobject.c on the release22-maint branch might relate to this.

Maybe. I've not viewed the control flow in a debugger, but my tries to come up
with a minimalistic test case and my gut feeling says that this piece of code
has something to do with it:

static PyObject *PgLargeObject_getattr(PgLargeObject *self, char* attr)
{
    PyObject *res;

    res = Py_FindMethod(PgLargeObject_methods, (PyObject *)self, attr);
    if (res != NULL)
        return res;
    PyErr_Clear();

    if (strcmp(attr, "closed") == 0)
        return Py_BuildValue("l", (long)(self->lo_fd == -1));

    if (!strcmp(attr, "__module__"))
        return Py_BuildValue("s", MODULE_NAME);

    if (!strcmp(attr, "__class__")) {
        printf("__class__ accessed!\n");
        return Py_BuildValue("s", self->ob_type->tp_name);
    }

    return PyMember_Get((char *)self, PgLargeObject_members, attr);
}

from which I can see that isinstance tries to access the __class__ attribute.
Am I supposed to /not/ provide a __class__ attribute for classic types?
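
For comparison, the observation that isinstance consults __class__ can be
reproduced in pure Python.  This is a sketch run against current CPython 3,
not against 2.2.1, so it shows only the lookup itself and not the stale
error; LargeObjectLike is a hypothetical stand-in for the extension type,
and the property mimics the getattr hook above by returning a string.

```python
accessed = []


class LargeObjectLike:
    """Hypothetical stand-in for the extension type discussed above."""

    @property
    def __class__(self):
        accessed.append("__class__")
        return "PgLargeObject"      # a string, like the getattr hook returns


class PgBytea:
    """Placeholder for the Python class used in the isinstance() check."""


value = LargeObjectLike()
print(isinstance(value, PgBytea))   # False
print(accessed)                     # shows that __class__ was consulted
print(type(value).__name__)         # LargeObjectLike
```

Because the real type does not match, isinstance falls back to asking the
object for its __class__; a non-type answer such as a string is then simply
treated as "not an instance" here.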

I haven't looked into the python22-maint changelogs yet, but I couldn't find
any related registered SF bug.

-- Gerhard


From haering_python@gmx.de  Fri Sep 27 23:53:50 2002
From: haering_python@gmx.de (Gerhard =?iso-8859-1?Q?H=E4ring?=)
Date: Sat, 28 Sep 2002 00:53:50 +0200
Subject: [Python-Dev] Strange bug only happens with Python 2.2
In-Reply-To: <20020927220740.GA7751@lilith.ghaering.test>
References: <20020927004254.GA2069@lilith.ghaering.test> <Pine.LNX.4.44.0209271637140.31133-100000@starship.python.net> <20020927220740.GA7751@lilith.ghaering.test>
Message-ID: <20020927225349.GA8862@lilith.ghaering.test>

* Gerhard Häring <gerhard.haering@gmx.de> [2002-09-28 00:07 +0200]:
> * Michael Hudson <mwh@python.net> [2002-09-27 16:39 +0100]:
> > On Fri, 27 Sep 2002, Gerhard Häring wrote:
> > 
> > > This is somewhat off-topic, but I'm hoping maybe someone can give a hint
> > > why this only happens on Python 2.2.1.
> > 
> > Guessing, but the (Jeremy's?) changes I recently backported to 
> > classobject.c on the release22-maint branch might relate to this.
> 
> Maybe. I've not viewed the control flow in a debugger, but my tries to come up
> with a minimalistic test case and my gut feeling says that this piece of code
> has something to do with it:
> 
> static PyObject *PgLargeObject_getattr(PgLargeObject *self, char* attr)
> {
>     PyObject *res;
> 
>     res = Py_FindMethod(PgLargeObject_methods, (PyObject *)self, attr);
>     if (res != NULL)
>         return res;
>     PyErr_Clear();
> 
>     if (strcmp(attr, "closed") == 0)
>         return Py_BuildValue("l", (long)(self->lo_fd == -1));
> 
>     if (!strcmp(attr, "__module__"))
>         return Py_BuildValue("s", MODULE_NAME);
> 
>     if (!strcmp(attr, "__class__")) {
>         printf("__class__ accessed!\n");
>         return Py_BuildValue("s", self->ob_type->tp_name);
>     }
> 
>     return PyMember_Get((char *)self, PgLargeObject_members, attr);
> }
> 
> from which I can see that isinstance tries to access the __class__ attribute.
> Am I supposed to /not/ provide a __class__ attribute for classic types?
> 
> I haven't looked into the python22-maint changelogs yet, but I couldn't find
> any related registered SF bug.

Ok, I've now further narrowed down this isinstance issue:

python22-maint                                   ==> bug does not appear
python22-maint with abstract.c from Python 2.2.1 ==> bug appears

So for what it's worth (i. e. not much), I'd say please /do/ include the
abstract.c changes into the upcoming Python 2.2.2 :-)

-- Gerhard


From guido@python.org  Sat Sep 28 01:11:54 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 20:11:54 -0400
Subject: [Python-Dev] Strange bug only happens with Python 2.2
In-Reply-To: Your message of "Sat, 28 Sep 2002 00:53:50 +0200."
 <20020927225349.GA8862@lilith.ghaering.test>
References: <20020927004254.GA2069@lilith.ghaering.test> <Pine.LNX.4.44.0209271637140.31133-100000@starship.python.net> <20020927220740.GA7751@lilith.ghaering.test>
 <20020927225349.GA8862@lilith.ghaering.test>
Message-ID: <200209280011.g8S0BsF17313@pcp02138704pcs.reston01.va.comcast.net>

> Ok, I've now further narrowed down this isinstance issue:
> 
> python22-maint                                   ==> bug does not appear
> python22-maint with abstract.c from Python 2.2.1 ==> bug appears
> 
> So for what it's worth (i. e. not much), I'd say please /do/ include the
> abstract.c changes into the upcoming Python 2.2.2 :-)

I'm sure it's this change, which was backported to the 2.2 maintenance
branch (and hence will be in 2.2.2).  It fixes several occurrences
where an error is not cleared.

----------------------------
revision 2.101
date: 2002/04/23 22:45:44;  author: bwarsaw;  state: Exp;  lines: +46 -9
abstract_get_bases(): Clarify exactly what the return values and
states can be for this function, and ensure that only AttributeErrors
are masked.  Any other exception raised via the equivalent of
getattr(cls, '__bases__') should be propagated up.

abstract_issubclass(): If abstract_get_bases() returns NULL, we must
call PyErr_Occurred() to see if an exception is being propagated, and
return -1 or 0 as appropriate.  This is the specific fix for a problem
whereby if getattr(derived, '__bases__') raised an exception, an
"undetected error" would occur (under a debug build).  This nasty
situation was uncovered when writing a security proxy extension type
for the Zope3 project, where the security proxy raised a Forbidden
exception on getattr of __bases__.

PyObject_IsInstance(), PyObject_IsSubclass(): After both calls to
abstract_get_bases(), where we're setting the TypeError if the return
value is NULL, we must first check to see if an exception occurred,
and /not/ mask an existing exception.

Neil Schemenauer should double check that these changes don't break
his ExtensionClass examples (there aren't any test cases for those
examples and abstract_get_bases() was added by him in response to
problems with ExtensionClass).  Neil, please add test cases if
possible!

I believe this is a bug fix candidate for Python 2.2.2.
----------------------------
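
A rough Python paraphrase of the rule described in the log message above:
mask only AttributeError when fetching __bases__ and let anything else
propagate.  The names get_bases, Forbidden and SecurityProxy are made up
for illustration; the real code is C in abstract.c and differs in detail.

```python
class Forbidden(Exception):
    """Hypothetical stand-in for the security proxy's exception."""


class SecurityProxy:
    """Hypothetical proxy that refuses access to __bases__."""

    def __getattr__(self, name):
        if name == "__bases__":
            raise Forbidden(name)
        raise AttributeError(name)


def get_bases(cls):
    """Return cls.__bases__ if it is a tuple, else None.

    Only AttributeError is masked; any other exception propagates
    to the caller instead of being silently swallowed.
    """
    try:
        bases = cls.__bases__
    except AttributeError:
        return None
    if not isinstance(bases, tuple):
        return None
    return bases
```

With this sketch, get_bases(42) quietly returns None, while
get_bases(SecurityProxy()) lets Forbidden reach the caller.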

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sat Sep 28 01:17:50 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 27 Sep 2002 20:17:50 -0400
Subject: [Python-Dev] buitlins instance have modifiable __class__?
In-Reply-To: Your message of "Fri, 27 Sep 2002 22:40:30 +0200."
 <0a0301c26666$1f0f7800$6d94fea9@newmexico>
References: <0a0301c26666$1f0f7800$6d94fea9@newmexico>
Message-ID: <200209280017.g8S0Hol17357@pcp02138704pcs.reston01.va.comcast.net>

> question on bultin types (under 2.2):
> 
> >>> d={}
> >>> class ndict(dict):
> ...   __slots__ = ()
> ...   def __getitem__(self,k):
> ...    print "__getitem__"
> ...    return dict.__getitem__(self,k)
> ...
> >>> d.items()
> []
> >>> d['a']=3
> >>> d.__class__=ndict
> 
> is intended to work?
> 
> it seems it does, but is that the intention?

It was a mistake.  In 2.3, it's disallowed.  In 2.2.2, it'll still be
allowed, but you shouldn't do this -- all sorts of bizarre stuff can
happen because you can do this.

So if you're asking about this for Jython, please don't allow this in
Jython!
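
A quick way to see the "disallowed" behaviour, assuming a current
CPython 3 interpreter rather than the 2.2/2.3 releases discussed here
(the exact error message varies between versions, so the sketch only
records that a TypeError was raised).  ndict is the class from the
question, minus the Python 2 print statement.

```python
class ndict(dict):
    __slots__ = ()

    def __getitem__(self, k):
        return dict.__getitem__(self, k)


d = {"a": 3}
try:
    # Re-classing an instance of the builtin dict type.
    d.__class__ = ndict
except TypeError:
    rejected = True
else:
    rejected = False
```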

> typos apart, there was also another question, sorry I was typing and
> reflecting on the consequences of all of this on Jython ...
> 
> [me]
> >
> > >>> exec "print a" in d
> > 3
> >
> > Ok, that is the non cooperative behavior I already know about. ]
> >
> 
> I recall this was already discussed here, what is the idea, to leave
> it as it is or make this work?

That's not going to change in CPython, because I believe it would slow
down lookup for builtins and globals too much if we had to check for a
custom __getitem__.  But if you can fix it for Jython, go ahead.  I
don't mind if there are places where Jython is "purer" than CPython.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pedronis@bluewin.ch  Sat Sep 28 01:15:11 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Sat, 28 Sep 2002 02:15:11 +0200
Subject: [Python-Dev] buitlins instance have modifiable __class__?
References: <0a0301c26666$1f0f7800$6d94fea9@newmexico>  <200209280017.g8S0Hol17357@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0ec901c26684$1cae5ae0$6d94fea9@newmexico>

From: Guido van Rossum <guido@python.org>
> It was a mistake.  In 2.3, it's disallowed.  In 2.2.2, it'll still be
> allowed, but you shouldn't do this -- all sorts of bizarre stuff can
> happen because you can do this.
>
> So if you're asking about this for Jython, please don't allow this in
> Jython!

Honestly, I was hoping that it was unintended; implementing this would make
already complicated things even more so. So we are happy campers.

Maybe others will be less so: I have seen a module mentioned on c.l.p that
uses this "feature" escaped from the lab to make builtins observable <wink>.

> > > >>> exec "print a" in d
> > > 3
> > >
> > > Ok, that is the non cooperative behavior I already know about. ]
> > >
> >
> > I recall this was already discussed here, what is the idea, to leave
> > it as it is or make this work?
>
> That's not going to change in CPython, because I believe it would slow
> down lookup for builtins and globals too much if we had to check for a
> custom __getitem__.  But if you can fix it for Jython, go ahead.  I
> don't mind if there are places where Jython is "purer" than CPython.

Very likely the other way is what we will get in Jython by doing nothing

regards.



From esteban@ccpgames.com  Sat Sep 28 09:26:38 2002
From: esteban@ccpgames.com (Esteban U.C.. Castro)
Date: Sat, 28 Sep 2002 08:26:38 -0000
Subject: [Python-Dev] proposal for interfaces
Message-ID: <25A39AFEB31B06408C675CEF28199E5B073CA3@postur.ccp.cc>

Hi, I have just joined python-dev and I saw your very interesting proposal
for implementing interfaces.

> I have an idea for an interface mechanism for Python, and I'd like to see if
> anyone likes it before writing an actual PEP.  [...]

I like it a lot! Anyway, if it can be implemented in Python as is, what is
the point of the PEP? Making the 'interface' root class and/or InterfaceError
builtins, maybe?

I have some comments which I thought I would bounce. I'll organize these
according to the activities they relate to. Don't hesitate to tell me if I'm
saying something stupid. :)


Define an interface
===================

In your example, it seems that

  class Foo(interface):
    def default_foo(self, a, b):
      "Docstring for foo method."
      print "Defaults can be handy."

[I added the a, b arguments to illustrate one point below]

does the following:

 - Defines the requirement that all objects that implement Foo have a foo
   function
 - Defines that foo should accept arguments of the form (a, b) maybe?
 - Sets a doc string for the foo method.
 - Sets a default implementation for the method.

Some questions on this:

 - Can an interface only define method names (and maybe argument formats)?
   I think it would be handy to let it expose attributes.

 - Are method names (and maybe format of arguments) the only thing you can
   'promise' in the interface? In other words, is that the only type of
   guarantee that code that works against the interface can get? I think
   a __check__(self, obj) special method in interfaces would be a simple
   way to boost their flexibility.

 - For the uses you have given to the prefix_methodname notation so far,
   I don't think it's really needed. Isn't the following sufficient?

 class Foo(interface):

    def foo(self, a, b):
        "foo docstring"
        # nothing here; no default definition

    def bar(self, a, b):
        pass # no docstring, _empty definition_


This has the side effect that a method with no default definition and no
doc string is a SyntaxError. Is this too bad?

- It would maybe be hard to figure out what such a method is supposed to
  do, so you _should_ provide a docstring.

- If you're in a hurry, an empty docstring will do the trick. While in
  'quick and dirty mode' you probably won't be using interfaces a lot,
  anyway.
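
The side effect mentioned above can be checked with compile(): a body
that consists only of a docstring compiles, while a body that consists
only of a comment does not.  A small sketch; the helper name compiles is
just for illustration.

```python
# A method whose body is only a docstring is a valid definition.
with_docstring = 'def foo(self, a, b):\n    "foo docstring"\n'

# A method whose body is only a comment has no statement at all.
comment_only = 'def foo(self, a, b):\n    # nothing here\n'


def compiles(source):
    """Return True if the source text compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


docstring_ok = compiles(with_docstring)
comment_ok = compiles(comment_only)
```

Here docstring_ok is true and comment_ok is false, so a docstring (even
an empty one) is the cheapest way to declare a method without a default.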

Defaults do look useful, but the really crucial aspect to lay down
in an interface definition is what it guarantees about the objects that
implement it. If amalgamating this with default defs would otherwise
obscure it (there's another issue I'm addressing below), I think defaults
belong more properly to a class that implements the interface, not to the
interface definition itself.

I guess knowing what other uses you have in mind for the prefix_methodname
notation could be useful to decide whether it's warranted.



Check whether an object implements an interface
===============================================

From your examples, I get it that an object implements an interface iff it
is an instance of a class that implements that interface.

So I guess any checking of the requirements expressed by the interface is
done at the time when you bind() a class to an interface.

This paradigm perfectly fits static and strongly typed languages, but it
falls a bit short of the flexibility of Python IMO. You can do very funny
things to classes and objects at runtime, which can break any assumptions
based on object class.

In your example:

  def foo_proc(foo_arg):
    foo_proxy = Foo(foo_arg)
    ...
    x = foo_proxy.foo(a, b)

[added a, b again]

imagine foo_proc only really cares that foo_arg is an object that has
a foo() method that takes (a, b) arguments (this is all Foo guarantees).

 * Will the Foo() call check this, or will it just check that some class
   in foo_arg's bases is bound to the Foo interface?

   In the second case, if someone has been fiddling with foo_arg or some
   of its base classes, foo_arg.foo() may no longer exist or it may have
   a different signature.

 * Why should Foo() _always_ fail for objects that _do_ meet the requirements
   expressed by the interface Foo but have _not_ declared that they implement
   the interface? If the point is to avoid false positives, interfaces
   with this concern may still make the class check:

  class Foo(interface):
      def __call__(self, obj):
          error = self.__check__(obj)
          if error:
              raise InterfaceError, error
          else:
              return self.proxy(obj)

      def __check__(self, obj):
          if not hasattr(obj, "foo"):
              return "No method foo found"
          ...
          return interface.check_class(self, obj)


Making such a check optional allows implicit (not declared) interface
satisfaction for those who want it. This should extend the applicability of
interfaces.
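
A minimal sketch of what such a structural check might look like,
written for a current Python 3 and using inspect.signature to test the
argument format.  The function name check, the shape of the required
mapping and the error strings are assumptions for illustration, not part
of the proposal.

```python
import inspect


class InterfaceError(Exception):
    """Hypothetical error a caller could raise when the check fails."""


def check(obj, required):
    """Return an error string, or None if obj satisfies `required`.

    `required` maps method names to the positional arguments the
    method must accept, e.g. {"foo": ("a", "b")}.
    """
    for name, args in required.items():
        method = getattr(obj, name, None)
        if not callable(method):
            return "No method %s found" % name
        try:
            # bind() raises TypeError if the call shape does not fit.
            inspect.signature(method).bind(*args)
        except TypeError:
            return "%s does not accept %d arguments" % (name, len(args))
    return None


class Undeclared:
    """Never declares anything, but has the right shape."""

    def foo(self, a, b):
        return a + b


class WrongShape:
    def foo(self):
        return None


FOO = {"foo": ("a", "b")}
```

Under these assumptions check(Undeclared(), FOO) returns None even
though the class never declared the interface, while WrongShape and a
plain object() both get an error string back.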

And this brings up another problem with defaults: they would increase false
positives. What if an interface wants to provide defaults for all its methods?
Will then any object match it? This would force additional checking.

Even though this doesn't look like a big issue to me, I think it's cleanest
to leave validation for interfaces and implementation for classes.



Declare that an object implements an interface or part of it
============================================================

  Foo.bind(SomeClass)


Problems:

* I agree the declaration had better be included in the class definition, at
  least as an option.
* Declarative is better than procedural for this purpose.
* Only classes (not instances) can declare that they implement an interface.
* The indexing notation to 'resolve' methods is a bit counterintuitive.


What about:

  # interfaces

  class Foo(interface):
      def foo(self, a, b):
          ...

      def clash1(self, x):
          ...

      def clash2(self):
          ...

  class Bar(interface):
      def bar(self, x, y):
          ...

      def clash1(self, a, s, d, f):
          ...

      def clash2(self):
          # actually equivalent to Foo.clash2
          # we should really factor this out in one interface
          # but imagine we can't do so for some reason...
          ...


  # implementations

  class SomeClass:

      # promise that all instances of SomeClass will implement Foo and Bar
      __implements__ = (Foo, Bar)

      # automatically assumed to implement Foo.foo()
      def foo(self, a, b):
          ...

      # There is no automatic name clash resolution. InterfaceError unless
      # we resolve these explicitly

      def fclash1(self, x):
          ...
      fclash1.__implements__ = Foo.clash1   # maybe we should require this
                                            # to be a tuple too?

      def bclash1(self, x):
          ...
      bclash1.__implements__ = Bar.clash1

      def clash2(self):
          ...
      clash2.__implements__ = (Foo.clash2, Bar.clash2) # or maybe this is
                                                       # not really needed?
                                                       # seems to reflect bad
                                                       # design anyway


  # 'Remove' an interface from a subclass without actually removing it from
  # the base (or just cut the search with negative result):

  class Child(SomeClass):
      __implements__ = (-Foo,)  # will be found before the 'Foo' in the base
                                # class


  # an object that is *not* an instance of a class that implements Foo wants
  # to play the Foo

  obj = Child()
  obj.foo = lambda a, b: ...
  obj.clash1 = lambda x: ...
  obj.for_the_fun_of_it = lambda: ...
  obj.__implements__ = (Foo,)   # will be found before the '-Foo' in the
                                # class

  obj.for_the_fun_of_it.__implements__ = Foo.clash2 # object dict will be
                                                    # searched first
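
The lookup order described in the comments above (instance dict first,
then the class and its bases, with a negative entry cutting the search)
can be sketched as follows for a current Python 3.  Everything here is
hypothetical: the Not marker stands in for the '-Foo' notation, and
implements is an illustrative helper, not part of the proposal.

```python
class Not:
    """Hypothetical marker playing the role of '-Foo'."""

    def __init__(self, iface):
        self.iface = iface


def implements(obj, iface):
    """Search the instance dict, then the class MRO, for a verdict."""
    scopes = [vars(obj)] if hasattr(obj, "__dict__") else []
    scopes += [vars(cls) for cls in type(obj).__mro__]
    for scope in scopes:
        for entry in scope.get("__implements__", ()):
            if entry is iface:
                return True       # positive declaration found first
            if isinstance(entry, Not) and entry.iface is iface:
                return False      # negative entry cuts the search
    return False


class Foo:
    """Stand-in for an interface."""


class Bar:
    """Stand-in for another interface."""


class SomeClass:
    __implements__ = (Foo, Bar)


class Child(SomeClass):
    __implements__ = (Not(Foo),)   # found before the Foo in the base class


plain_child = Child()
obj = Child()
obj.__implements__ = (Foo,)        # found before the Not(Foo) in the class
```

In this sketch plain_child no longer counts as a Foo but still counts as
a Bar through SomeClass, while obj counts as a Foo again because its own
dict is searched first.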



Restrict
========

That is, to make sure that an object is only accessed in the ways
defined in the interface (via the proxy).

This should be optional too, but your syntax does this nicely; you
can call Foo() as an assertion of sorts and ignore the result.

Note that the __implements__ method resolution magic would require
that you get a proxy, though.



What do you think?


Esteban.


From esteban@ccpgames.com  Sat Sep 28 10:32:28 2002
From: esteban@ccpgames.com (Esteban U.C.. Castro)
Date: Sat, 28 Sep 2002 09:32:28 -0000
Subject: [Python-Dev] sorry
Message-ID: <25A39AFEB31B06408C675CEF28199E5B06EB0B@postur.ccp.cc>

phew! sorry about the formatting of the last one! :(

Esteban.


From aleax@aleax.it  Sat Sep 28 11:30:08 2002
From: aleax@aleax.it (Alex Martelli)
Date: Sat, 28 Sep 2002 12:30:08 +0200
Subject: [Python-Dev] Why are useful tools omitted from the Win bin distro?
In-Reply-To: <25A39AFEB31B06408C675CEF28199E5B06EB0B@postur.ccp.cc>
References: <25A39AFEB31B06408C675CEF28199E5B06EB0B@postur.ccp.cc>
Message-ID: <02092812300806.05324@arthur>

Just helped some people on the Italian Python mailing list find some 
indispensable tool (Tools/i18n/pygettext.py, in this specific case) and 
they expressed astonishment that the tool isn't in the standard Windows 
binary distribution, which was all they had downloaded.  This set me to 
wondering -- is there any reason why this and other tools &c should NOT be 
included in that binary?  Could we add them in 2.2.2 and later releases?


Alex


From mwh@python.net  Sat Sep 28 12:37:56 2002
From: mwh@python.net (Michael Hudson)
Date: 28 Sep 2002 12:37:56 +0100
Subject: [Python-Dev] ATTENTION!  Releasing Python 2.2.2 in a few weeks
In-Reply-To: Guido van Rossum's message of "Fri, 20 Sep 2002 17:26:30 -0400"
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2mn0q2fip7.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> I'd like to release something called Python 2.2.2 in a few weeks (say,
> around Oct 8; I like Tuesday release dates).

One minor point of concern: I think Jack Jansen's on holiday.  Perhaps
we should wait for him to get back...

Cheers,
M.

-- 
  There's an aura of unholy black magic about CLISP.  It works, but
  I have no idea how it does it.  I suspect there's a goat involved
  somewhere.                     -- Johann Hibschman, comp.lang.scheme


From guido@python.org  Sat Sep 28 15:22:58 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 28 Sep 2002 10:22:58 -0400
Subject: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Your message of "Sat, 28 Sep 2002 12:37:56 BST."
 <2mn0q2fip7.fsf@starship.python.net>
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net>
 <2mn0q2fip7.fsf@starship.python.net>
Message-ID: <200209281422.g8SEMw720102@pcp02138704pcs.reston01.va.comcast.net>

> > I'd like to release something called Python 2.2.2 in a few weeks (say,
> > around Oct 8; I like Tuesday release dates).
> 
> One minor point of concern: I think Jack Jansen's on holiday.  Perhaps
> we should wait for him to get back...

He should be back by Oct 5 or 6 if what he told me of his schedule is
true.  I'm not sure that it would matter much if MacPython 2.2.2 was
released a week after the main release.  Maybe we should do one
release candidate anyway and give him space that way.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sat Sep 28 15:35:40 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 28 Sep 2002 10:35:40 -0400
Subject: [Python-Dev] Why are useful tools omitted from the Win bin distro?
In-Reply-To: Your message of "Sat, 28 Sep 2002 12:30:08 +0200."
 <02092812300806.05324@arthur>
References: <25A39AFEB31B06408C675CEF28199E5B06EB0B@postur.ccp.cc>
 <02092812300806.05324@arthur>
Message-ID: <200209281435.g8SEZe620175@pcp02138704pcs.reston01.va.comcast.net>

> Just helped some people on the Italian Python mailing list find some 
> indispensable tool (Tools/i18n/pygettext.py, in this specific case) and 
> they expressed astonishment that the tool isn't in the standard Windows 
> binary distribution, which was all they had downloaded.  This set me to 
> wondering -- is there any reason why this and other tools &c should NOT be 
> included in that binary?  Could we add them in 2.2.2 and later releases?

Good idea.  I think it was a simple oversight -- adding anything from
the Tools directory is not automatic, Tim has to add lines to the
Windows installer script.

I believe that the following Tools subdirectories are currently being
distributed:

idle
scripts
webchecker
versioncheck
pynche

That means these are not:

audiopy (Solaris only)
bgen (only used by Mac developers AFAIK)
compiler (experimental AFAIK)
faqwiz (only useful for people running web servers)
framer (new in 2.3)
freeze (only useful for developers?)
i18n
modulator (only useful for developers)
unicode (only useful for developers?)
world

Of these, I think i18n and world are candidates for inclusion on the
Windows installer.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From whisper@oz.net  Sat Sep 28 20:03:05 2002
From: whisper@oz.net (David LeBlanc)
Date: Sat, 28 Sep 2002 12:03:05 -0700
Subject: [Python-Dev] Why are useful tools omitted from the Win bin distro?
In-Reply-To: <02092812300806.05324@arthur>
Message-ID: <GCEDKONBLEFPPADDJCOECEPHFDAA.whisper@oz.net>

I agree - downloading the source distro yields some goodies that would be
nice in the binary distro - many windows folks won't d/l the source since
many of them don't have a C compiler, especially the "VB" types.

I nominate Alex as "Mr. WinBin" ;)

Regards,

David LeBlanc
Seattle, WA USA

> -----Original Message-----
> From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
> Behalf Of Alex Martelli
> Sent: Saturday, September 28, 2002 3:30
> To: python-dev@python.org
> Subject: [Python-Dev] Why are useful tools omitted from the Win bin
> distro?
>
>
> Just helped some people on the Italian Python mailing list find some
> indispensable tool (Tools/i18n/pygettext.py, in this specific case) and
> they expressed astonishment that the tool isn't in the standard Windows
> binary distribution, which was all they had downloaded.  This set me to
> wondering -- is there any reason why this and other tools &c
> should NOT be
> included in that binary?  Could we add them in 2.2.2 and later releases?
>
>
> Alex
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev



From jriehl@spaceship.com  Sun Sep 29 00:18:55 2002
From: jriehl@spaceship.com (Jonathan Riehl)
Date: Sat, 28 Sep 2002 18:18:55 -0500 (CDT)
Subject: [Python-Dev] Extension module difficulty w/pgen.
In-Reply-To: <Pine.BSF.4.33.0209271606060.75360-100000@localhost>
Message-ID: <Pine.BSF.4.33.0209281800040.75360-100000@localhost>

> On Fri, 27 Sep 2002, Guido van Rossum wrote:
>
> > Maybe the problem is that nothing else uses these symbols?  Try
> > sticking dummy references (e.g. an unreachable call) to them in
> > main.c, to see if that makes a difference.  I recall we had to do this
> > for something else that wasn't used by Python itself.
>
> I tried this just now, but to no avail.  Maybe I am not being thorough
> enough.  If the linker is excluding these symbols because they are not
> used, why would nm seem to say they are there, and why would statically
> linking libpython work (on FreeBSD, anyway)?  Conversely, I seem to
> remember this working on an earlier, but abandoned attempt I made on a
> Linux box.
>
It turns out I wasn't being thorough enough.  I used nm on my python
build, and didn't see the symbols there.  I moved the dummy calls to a
global function in python.c and only then were the symbols linked into the
interpreter.

Keeping a dummy function in one of the core modules doesn't seem like a
terribly elegant solution, even if it allows me to keep developing the
pgen module.  What would you suggest be done to ensure that statically
linked builds link those symbols?  I would assume that since the required
symbols *are* in libpython (per my modifications to Makefile.pre.in),
building a python using dynamic libraries would allow the extension module
to "see" those symbols.

You had mentioned doing something like this before.  Is there some linkage
graveyard where I can bury calls to these symbols in order to ensure they
are linked?  Or are some of those fancy API macros used to ensure linkage
(perhaps by API functions that are utilities for extension writers and not
needed by the python core)?

Thanks!
-Jon



From guido@python.org  Sun Sep 29 00:55:26 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 28 Sep 2002 19:55:26 -0400
Subject: [Python-Dev] Extension module difficulty w/pgen.
In-Reply-To: Your message of "Sat, 28 Sep 2002 18:18:55 CDT."
 <Pine.BSF.4.33.0209281800040.75360-100000@localhost>
References: <Pine.BSF.4.33.0209281800040.75360-100000@localhost>
Message-ID: <200209282355.g8SNtQR21241@pcp02138704pcs.reston01.va.comcast.net>

> > > Maybe the problem is that nothing else uses these symbols?  Try
> > > sticking dummy references (e.g. an unreachable call) to them in
> > > main.c, to see if that makes a difference.  I recall we had to
> > > do this for something else that wasn't used by Python itself.
> >
> > I tried this just now, but to no avail.  Maybe I am not being
> > thorough enough.  If the linker is excluding these symbols because
> > they are not used, why would nm seem to say they are there, and
> > why would statically linking libpython work (on FreeBSD, anyway)?
> > Conversely, I seem to remember this working on an earlier, but
> > abandoned attempt I made on a Linux box.
> >
> It turns out I wasn't being thorough enough.  I used nm on my python
> build, and didn't see the symbols there.  I moved the dummy calls to
> a global function in python.c and only then were the symbols linked
> into the interpreter.
> 
> Keeping a dummy function in one of the core modules doesn't seem
> like a terribly elegant solution, even if it allows me to keep
> developing the pgen module.  What would you suggest be done to
> ensure that statically linked builds link those symbols?  I would
> assume that since the required symbols *are* in libpython (per my
> modifications to Makefile.pre.in), building a python using dynamic
> libraries would allow the extension module to "see" those symbols.

Yes, I think building a shared lib would work.  But we don't build
shared libs for all platforms.

> You had mentioned doing something like this before.  Is there some
> linkage graveyard where I can bury calls to these symbols in order
> to ensure they are linked?  Or are some of those fancy API macros
> used to ensure linkage (perhaps by API functions that are utilities
> for extension writers and not needed by the python core)?

No, AFAIK you have to create a dummy reference somewhere.  I'd suggest
adding it to the end of Python/pythonrun.c.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From dave@boost-consulting.com  Sun Sep 29 01:21:54 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Sat, 28 Sep 2002 20:21:54 -0400
Subject: [Python-Dev] Doc location question
Message-ID: <096801c2674e$b3dab4c0$6501a8c0@boostconsulting.com>

Hi there,

Is there a good reason that http://www.python.org/2.2.1/descrintro.html
isn't also available as http://www.python.org/current/descrintro.html ?

Maybe I don't understand the system.

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com





From guido@python.org  Sun Sep 29 02:27:51 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 28 Sep 2002 21:27:51 -0400
Subject: [Python-Dev] Doc location question
In-Reply-To: Your message of "Sat, 28 Sep 2002 20:21:54 EDT."
 <096801c2674e$b3dab4c0$6501a8c0@boostconsulting.com>
References: <096801c2674e$b3dab4c0$6501a8c0@boostconsulting.com>
Message-ID: <200209290127.g8T1RpD23189@pcp02138704pcs.reston01.va.comcast.net>

> Is there a good reason that http://www.python.org/2.2.1/descrintro.html
> isn't also available as http://www.python.org/current/descrintro.html ?

descrintro.html is "rogue" documentation, i.e. it's not part of the
official documentation set.  Its contents should eventually be
incorporated into the official reference manual.

Possibly it's good to keep it as a separate tutorial for new-style
classes, in which case it should be converted to Latex and
incorporated in the official documentation.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Sun Sep 29 04:48:51 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 28 Sep 2002 23:48:51 -0400
Subject: [Python-Dev] Why are useful tools omitted from the Win bin distro?
In-Reply-To: <GCEDKONBLEFPPADDJCOECEPHFDAA.whisper@oz.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHGBHAB.tim.one@comcast.net>

[Alex Martelli]
> Just helped some people on the Italian Python mailing list find some
> indispensable tool (Tools/i18n/pygettext.py, in this specific case) and
> they expressed astonishment that the tool isn't in the standard Windows
> binary distribution, which was all they had downloaded.  This set me to
> wondering -- is there any reason why this and other tools &c
> should NOT be included in that binary?  Could we add them in 2.2.2 and
> later releases?

Last time this came up, Guido was opposed to it.  The problem is that the
majority of Python Windows users aren't particularly clueful, and the Demo
and Tools directories are loaded with stuff that's not maintained, not
documented, platform-dependent, and may not even work anymore.  This all
conspires to give it a "developers only" status.  Repeated calls for
volunteers to clean this up (i.e., document it, fix it, clean out the crap)
went unanswered.  I've been known to respond to requests to include
specific pieces in the Windows distro, though (for example, IDLE and
pynche).  This is a PITA because it requires custom WISE scripting for each
one, so someone has to convince me they really, really want a piece first.



From guido@python.org  Sun Sep 29 05:48:25 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 29 Sep 2002 00:48:25 -0400
Subject: [Python-Dev] New logmerge feature
Message-ID: <200209290448.g8T4mQ607379@pcp02138704pcs.reston01.va.comcast.net>

For those interested in poring over CVS logs, I've added a new feature
to logmerge.py: a -b tag option that restricts the output to a
specific branch tag.  Use -b HEAD to show only the CVS HEAD
(a.k.a. trunk).  (The default is to show all revisions regardless of
the branch on which they occur, which isn't always convenient if you're
interested in a specific branch.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@manatee.mojam.com  Sun Sep 29 13:00:24 2002
From: skip@manatee.mojam.com (Skip Montanaro)
Date: Sun, 29 Sep 2002 07:00:24 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200209291200.g8TC0Okm026518@manatee.mojam.com>

Bug/Patch Summary
-----------------

290 open / 2889 total bugs (+5)
109 open / 1709 total patches (+3)

New Bugs
--------

property missing doc'd __name__ attr (2002-09-22)
	http://python.org/sf/612969
memory leaks when importing posix module (2002-09-23)
	http://python.org/sf/613222
win32 build_ext problem (2002-09-24)
	http://python.org/sf/614051
fpectl module broken on Linux (2002-09-24)
	http://python.org/sf/614060
Rewrite _reduce and _reconstructor in C (2002-09-25)
	http://python.org/sf/614555
LookupError etc. need API to get the key (2002-09-25)
	http://python.org/sf/614557
gethostbyname() blocks when threaded (2002-09-25)
	http://python.org/sf/614791
broken link in documentation (2002-09-25)
	http://python.org/sf/614821
socket.getfqdn() doesn't on Windows (2002-09-27)
	http://python.org/sf/615472
No __mod__ on str subclass (2002-09-27)
	http://python.org/sf/615506
Tkinter.Misc has no __contains__ method (2002-09-27)
	http://python.org/sf/615772
getdefaultlocale failure on OS X (2002-09-28)
	http://python.org/sf/616002
cPickle documentation incomplete (2002-09-28)
	http://python.org/sf/616013
list(xrange(sys.maxint / 4)) -> swapping (2002-09-28)
	http://python.org/sf/616019

New Patches
-----------

koi8_u codec (2002-09-23)
	http://python.org/sf/613173
add unescape method to xml.sax.saxutils (2002-09-23)
	http://python.org/sf/613256
rm email package dependency on rfc822.py (2002-09-23)
	http://python.org/sf/613434
Bugfix: content-type header parsing (2002-09-23)
	http://python.org/sf/613605
OpenVMS patches (2002-09-24)
	http://python.org/sf/614055
fix for urllib2.AbstractBasicAuthHandler (2002-09-25)
	http://python.org/sf/614596
MSVC 7.0 compiler support (2002-09-25)
	http://python.org/sf/614770
build fixes for SCO (2002-09-26)
	http://python.org/sf/615069
acconfig.h out of date (2002-09-26)
	http://python.org/sf/615343

Closed Bugs
-----------

New features need syntax (2001-08-17)
	http://python.org/sf/452222
ConfigParser has_option case sensitive (2002-05-29)
	http://python.org/sf/561822
resize readonly memory mapped file (2002-07-05)
	http://python.org/sf/577782
ConfigParser spaces in keys not read (2002-07-18)
	http://python.org/sf/583248
exec*() doesn't handle errors well (2002-08-20)
	http://python.org/sf/597797
Lone surrogates cause bad .pyc files (2002-09-17)
	http://python.org/sf/610783
2 bugs in turtle.py (2002-09-21)
	http://python.org/sf/612595

Closed Patches
--------------

pyport.h, Wince and errno getter/setter (2002-01-19)
	http://python.org/sf/505846
Fix "file:" URL to have right no. of /'s (2002-08-06)
	http://python.org/sf/591713
bugfixes and cleanup for _strptime.py (2002-08-10)
	http://python.org/sf/593560
turtle tracer bugfixes and new functions (2002-08-14)
	http://python.org/sf/595111
select problems on Windows (2002-09-19)
	http://python.org/sf/611464
quietly select between 'less' and 'more' (2002-09-20)
	http://python.org/sf/612111


From pedronis@bluewin.ch  Sun Sep 29 14:03:01 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Sun, 29 Sep 2002 15:03:01 +0200
Subject: [Python-Dev] "incriminated" module (was: buitlins instance have modifiable __class__?)
References: <0a0301c26666$1f0f7800$6d94fea9@newmexico>  <200209280017.g8S0Hol17357@pcp02138704pcs.reston01.va.comcast.net> <0ec901c26684$1cae5ae0$6d94fea9@newmexico>
Message-ID: <014c01c267b8$8adedd20$6d94fea9@newmexico>

This is a multi-part message in MIME format.

------=_NextPart_000_0149_01C267C9.4E13C0C0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

[me]
> Maybe others will be less so, I have seen a module referred on c.l.p using
> the "feature" escaped from the lab to make builtins observable <wink>.
>

the thread on c.l.p is  google: watching mutables? group:comp.lang.python

the package is at

http://oomadness.tuxfamily.org/en/editobj/

author: Jean-Baptiste LAMY -- jiba@tuxfamily

I have directly attached the "incriminated" module for the curious.

Should I give them the bad news?

regards.

------=_NextPart_000_0149_01C267C9.4E13C0C0
Content-Type: text/plain;
	name="eventobj.py"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="eventobj.py"

# EventObj=0A=
# Copyright (C) 2001-2002 Jean-Baptiste LAMY -- jiba@tuxfamily=0A=
#=0A=
# This program is free software; you can redistribute it and/or modify=0A=
# it under the terms of the GNU General Public License as published by=0A=
# the Free Software Foundation; either version 2 of the License, or=0A=
# (at your option) any later version.=0A=
#=0A=
# This program is distributed in the hope that it will be useful,=0A=
# but WITHOUT ANY WARRANTY; without even the implied warranty of=0A=
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the=0A=
# GNU General Public License for more details.=0A=
#=0A=
# You should have received a copy of the GNU General Public License=0A=
# along with this program; if not, write to the Free Software=0A=
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  =
USA=0A=
=0A=
# This module is certainly my best hack ;-)=0A=
# It is nothing else than a sequence of hacks, from the beginning to the =
end !!!!!=0A=
=0A=
"""EventObj -- Allow to add attribute-change, content-change or =
hierarchy-change event to any Python instance.=0A=
=0A=
Provides the following functions (view their doc for more info):=0A=
  dumper_event(obj, attr, oldvalue, newvalue)=0A=
  addevent(obj, event)=0A=
  hasevent(obj[, event])=0A=
  removeevent(obj, event)=0A=
  addevent_rec(obj, event)=0A=
  removeevent_rec(obj, event)=0A=
And the following constant for addition/removal events:=0A=
  ADDITION=0A=
  REMOVAL=0A=
=0A=
Events : An event is any callable object that take 4 parameters (=3D the =
listened instance, the changed attribute name, the new value, the old =
value).=0A=
After registration with addevent, it will be called just after any =
attribute is change on the instance.=0A=
You can add many events on the same instance, and use the same event =
many times.=0A=
An instance can implement __addevent__(event, copiable =3D 0), =
__hasevent__(event =3D None) and __removeevent__(event) to allow a =
custom event system.=0A=
=0A=
Notice that the event is weakref'ed, so if it is no longer accessible, =
the event is silently removed.=0A=
=0A=
Caution : As event management is performed by changing the class of the =
instance, you use eventobj with critical objects at your own risk... !=0A=
=0A=
Quick example :=0A=
>>> from editobj.eventobj import *=0A=
>>> class C: pass=0A=
>>> c =3D C()=0A=
>>> def event(obj, attr, value, oldvalue):=0A=
...   print "c." + attr, "was", `oldvalue` + ", is now", `value`=0A=
>>> addevent(c, event)=0A=
>>> c.x =3D 1=0A=
c.x was None, is now 1=0A=
=0A=
Addition/Removal events : if you add them to a list / UserList or a dict =
/ UserDict, or a subclass, events are also called for addition or =
removal in the list/dict.=0A=
Addition or removal can be performed by any of the methods of =
UserList/UserDict (e.g. append, extend, remove, __setitem__,...)=0A=
In this case, name will be the ADDITION or REMOVAL constant, value the =
added object for a list and the key-value pair for a dict, and oldvalue =
None.=0A=
=0A=
Quick example :=0A=
>>> from editobj.eventobj import *=0A=
>>> c =3D []=0A=
>>> def event(obj, attr, value, oldvalue):=0A=
...   if attr is ADDITION: # Only for list / dict=0A=
...     print `value`, "added to c"=0A=
...   elif attr is REMOVAL: # Only for list / dict=0A=
...     print `value`, "removed from c"=0A=
...   else:=0A=
...     print "c." + attr, "was", `oldvalue` + ", is now", `value`=0A=
>>> addevent(c, event)=0A=
>>> c.append(0)=0A=
0 added to c=0A=
=0A=
Hierachy events : such events are used with UserList or UserDict, and =
are usefull to listen an entire hierarchy (e.g. a list that contains =
other lists that can contain other lists...).=0A=
The event apply to the registred instance, and any other item it =
contains, and so on if the list/dict is a deeper hierarchy.=0A=
If you want to automatically add or remove the event when the hierarchy =
change (addition or removal at any level), use a HierarchyEvent (see =
example below).=0A=
=0A=
Quick example :=0A=
>>> from editobj.eventobj import *=0A=
>>> c =3D []=0A=
>>> def event(obj, attr, value, oldvalue):=0A=
...   if   attr is ADDITION:=0A=
...     print `value`, "added to", `obj`=0A=
...   elif attr is REMOVAL:=0A=
...     print `value`, "removed from", `obj`=0A=
>>> addevent_rec(c, event)=0A=
>>> c.append([])             # This sub-list was not in c when we add =
the event...=0A=
[] added to [[]] # [[]] is c=0A=
>>> c[0].append(12)          # ...but the hierarchical event has been =
automatically added !=0A=
12 added to [12] # [12] is c[0]=0A=
"""=0A=
=0A=
__all__ =3D [=0A=
  "addevent",=0A=
  "hasevent",=0A=
  "removeevent",=0A=
  "addevent_rec",=0A=
  "removeevent_rec",=0A=
  "dumper_event",=0A=
  "ADDITION",=0A=
  "REMOVAL",=0A=
  ]=0A=
=0A=
import new, weakref, types #, copy=0A=
from UserList import UserList=0A=
from UserDict import UserDict=0A=
=0A=
=0A=
def to_list(obj):=0A=
  if isinstance(obj, list) or isinstance(obj, UserList): return obj=0A=
  if hasattr(obj, "children"):=0A=
    items =3D obj.children=0A=
    if callable(items): return items()=0A=
    return items=0A=
  if hasattr(obj, "items"):=0A=
    items =3D obj.items=0A=
    if callable(items): return items()=0A=
    return items=0A=
  if hasattr(obj, "__getitem__"): return obj=0A=
  return None=0A=
=0A=
def to_dict(obj):=0A=
  if isinstance(obj, dict) or isinstance(obj, UserDict): return obj=0A=
  return None=0A=
=0A=
def to_dict_or_list(obj):=0A=
  content =3D to_dict(obj) # Try dict...=0A=
  if content is None:=0A=
    content =3D to_list(obj) # Try list...=0A=
    if content is None: return None, None=0A=
    return list, content=0A=
  else:=0A=
    return dict, content=0A=
  =0A=
def to_content(obj):=0A=
  content =3D to_dict(obj) # Try dict...=0A=
  if content is None: return to_list(obj) or () # Try list...=0A=
  return content.values()=0A=
=0A=
=0A=
def dumper_event(obj, attr, value, old):=0A=
  """dumper_event -- an event that dump a stacktrace when called. =
Usefull for debugging !=0A=
=0A=
dumper_event is the default value for add_event.=0A=
"""=0A=
  import traceback=0A=
  traceback.print_stack()=0A=
  =0A=
  if   attr is ADDITION: print "%s now contains %s." % (obj, value)=0A=
  elif attr is REMOVAL : print "%s no longer contains %s." % (obj, value)=0A=
  else:                  print "%s.%s was %s, is now %s." % (obj, attr, =
old, value)=0A=
  =0A=
=0A=
def addevent(obj, event =3D dumper_event):=0A=
  """addevent(obj, event =3D dumper_event)=0A=
Add the given attribute-change event to OBJ. OBJ must be a class =
instance (old or new style) or a list / dict, and EVENT a function that =
takes 4 args: (obj, attr, value, old).=0A=
EVENT will be called when any attribute of obj will be changed; OBJ is =
the object, ATTR the name of the attribute, VALUE the new value of the =
attribute and OLD the old one.=0A=
=0A=
EVENT defaults to the "dumper" event, which print each object =
modification.=0A=
=0A=
Raise eventobj.NonEventableError if OBJ cannot support event.=0A=
"""=0A=
=0A=
  event =3D _wrap_event(obj, event)=0A=
  =0A=
  try:=0A=
    obj.__addevent__(event)=0A=
  except:=0A=
    if hasattr(obj, "__dict__"):=0A=
      # Store the event and the old class of obj in an instance of =
_EventObj_stuff (a class we use to contain that).=0A=
      # Do that BEFORE we link the event (else we'll get a change event =
for the "_EventObj_stuff" attrib) !=0A=
      obj._EventObj_stuff =3D _EventObj_stuff(obj.__class__)=0A=
      =0A=
      # Change the class of the object.=0A=
      obj.__class__ =3D _create_class(obj, obj.__class__)=0A=
    else:=0A=
      # Change the class of the object.=0A=
      old_class     =3D obj.__class__=0A=
      obj.__class__ =3D _create_class(obj, obj.__class__)=0A=
      =0A=
      stuff_for_non_dyn_objs[id(obj)] =3D _EventObj_stuff(old_class)=0A=
    =0A=
    obj.__addevent__(event)=0A=
    =0A=
def hasevent(obj, event =3D None):=0A=
  """hasevent(obj[, event]) -> Boolean=0A=
Return wether the obj instance has the given event (or has any event, if =
event is None)."""=0A=
  return hasattr(obj, "__hasevent__") and obj.__hasevent__(event)=0A=
  =0A=
def removeevent(obj, event =3D dumper_event):=0A=
  """removeevent(obj, event =3D dumper_event)=0A=
Remove the given event from obj."""=0A=
  hasattr(obj, "__removeevent__") and obj.__removeevent__(event)=0A=
  =0A=
ADDITION   =3D "__added__"=0A=
REMOVAL =3D "__removed__"=0A=
=0A=
NonEventableError =3D "NonEventableError"=0A=
=0A=
# Private stuff :=0A=
=0A=
# A dict to store the created classes.=0A=
#_classes =3D weakref.WeakKeyDictionary() # old-style class cannot be =
weakref'ed !=0A=
_classes =3D {}=0A=
=0A=
# Create a with-event class for "clazz". the returned class is a "mixin" =
that will extends clazz and _EventObj (see below).=0A=
def _create_class(obj, clazz):=0A=
  try: return _classes[clazz]=0A=
  except:=0A=
    if hasattr(obj, "__dict__"):=0A=
      # The name of the new class is the same name of the original class =
(mimetism !)=0A=
      if   issubclass(clazz, list) or issubclass(clazz, UserList): cl =
=3D new.classobj(clazz.__name__, (_EventObj_List, _EventObj, clazz), {})=0A=
      elif issubclass(clazz, dict) or issubclass(clazz, UserDict): cl =
=3D new.classobj(clazz.__name__, (_EventObj_Dict, _EventObj, clazz), {})=0A=
      else:=0A=
        if issubclass(clazz, object):                              cl =
=3D new.classobj(clazz.__name__, (_EventObj, clazz), {})=0A=
        else:                                                      cl =
=3D new.classobj(clazz.__name__, (_EventObj_OldStyle, clazz), {})=0A=
        =0A=
    else:=0A=
      # list and dict were added in _classes at module initialization=0A=
      # Other types are not supported yet !=0A=
      raise NonEventableError, obj=0A=
    =0A=
    # Change also the module name.=0A=
    cl.__module__ =3D clazz.__module__=0A=
    _classes[clazz] =3D cl=0A=
    return cl=0A=
  =0A=
=0A=
# A container for _EventObj attribs.=0A=
class _EventObj_stuff:=0A=
  def __init__(self, clazz):=0A=
    self.clazz  =3D clazz=0A=
    self.events =3D []=0A=
    =0A=
  def __call__(self, obj, attr, value, oldvalue):=0A=
    # Clone the list, since executing an event function may add or =
remove some events.=0A=
    for event in self.events[:]: event(obj, attr, value, oldvalue)=0A=
    =0A=
  def remove_event(self, event): self.events.remove(event)=0A=
  =0A=
  def has_event(self, event): return event in self.events=0A=
  =0A=
def _wrap_event(obj, event, hi =3D 0):=0A=
  if not isinstance(event, WrappedEvent):=0A=
    #dump =3D repr(event)=0A=
    =0A=
    try: obj =3D weakref.proxy(obj) # Avoid cyclic ref=0A=
    except TypeError: pass=0A=
    =0A=
    def callback(o):=0A=
      #print "attention !", dump, "est mourant !"=0A=
      # This seems buggy since it is called when some objects are being =
destructed=0A=
      try:=0A=
        ob =3D obj=0A=
        if ob:=0A=
          if removeevent and hasevent(ob, event): removeevent(ob, event)=0A=
      except: pass=0A=
    if hi:=0A=
      if type(event) is types.MethodType: event =3D WeakHiMethod(event, =
callback)=0A=
      else:                               event =3D WeakHiFunc  (event, =
callback)=0A=
    else:=0A=
      if type(event) is types.MethodType: event =3D WeakMethod(event, =
callback)=0A=
      else:                               event =3D WeakFunc  (event, =
callback)=0A=
      =0A=
  return event=0A=
=0A=
=0A=
class WrappedEvent: pass=0A=
=0A=
class WeakFunc(WrappedEvent):=0A=
  def __init__(self, func, callback =3D None):=0A=
    if callback: self.func =3D weakref.ref(func, callback)=0A=
    else:        self.func =3D weakref.ref(func)=0A=
    =0A=
  def original(self): return self.func()=0A=
  =0A=
  def __call__(self, *args): self.func()(*args)=0A=
  =0A=
  def __eq__(self, other):=0A=
    return (self.func() =3D=3D other) or (isinstance(other, WeakFunc) =
and (self.func() =3D=3D other.func()))=0A=
  =0A=
  def __repr__(self): return "<WeakFunc for %s>" % self.func()=0A=
  =0A=
class WeakMethod(WrappedEvent):=0A=
  def __init__(self, method, callback =3D None):=0A=
    if callback: self.obj =3D weakref.ref(method.im_self, callback)=0A=
    else:        self.obj =3D weakref.ref(method.im_self)=0A=
    self.func =3D method.im_func=0A=
    =0A=
  def original(self):=0A=
    obj =3D self.obj()=0A=
    return new.instancemethod(self.func, obj, obj.__class__)=0A=
  =0A=
  def __call__(self, *args): self.func(self.obj(), *args)=0A=
  =0A=
  def __eq__(self, other):=0A=
    return ((type(other) is types.MethodType) and (self.obj() is =
other.im_self) and (self.func is other.im_func)) or (isinstance(other, =
WeakMethod) and (self.obj() is other.obj()) and (self.func is =
other.func))=0A=
  =0A=
  def __repr__(self): return "<WeakMethod for %s>" % self.original()=0A=
  =0A=
class HierarchyEvent:=0A=
  def __call__(self, obj, attr, value, oldvalue):=0A=
    if attr is ADDITION:=0A=
      try: =0A=
        if isinstance(obj, _EventObj_List): addevent_rec(value, =
self.original())=0A=
        else:                               addevent_rec(value[1], =
self.original())=0A=
      except NonEventableError: pass=0A=
    elif attr is REMOVAL:=0A=
      try: =0A=
        if isinstance(obj, _EventObj_List): removeevent_rec(value, =
self.original())=0A=
        else:                               removeevent_rec(value[1], =
self.original())=0A=
      except NonEventableError: pass=0A=
      =0A=
class WeakHiFunc(HierarchyEvent, WeakFunc):=0A=
  def __call__(self, obj, attr, value, oldvalue):=0A=
    HierarchyEvent.__call__(self, obj, attr, value, oldvalue)=0A=
    WeakFunc.__call__(self, obj, attr, value, oldvalue)=0A=
    =0A=
class WeakHiMethod(HierarchyEvent, WeakMethod):=0A=
  def __call__(self, obj, attr, value, oldvalue):=0A=
    HierarchyEvent.__call__(self, obj, attr, value, oldvalue)=0A=
    WeakMethod.__call__(self, obj, attr, value, oldvalue)=0A=
    =0A=
=0A=
# Mixin class used as base class for any with-event class.=0A=
class _EventObj:=0A=
  stocks =3D []=0A=
  def __setattr__(self, attr, value):=0A=
    # Get the old value of the changing attrib.=0A=
    oldvalue =3D getattr(self, attr, None)=0A=
    if attr =3D=3D "__class__":=0A=
      newclass =3D _create_class(self, value)=0A=
      self._EventObj_stuff.clazz.__setattr__(self, "__class__", newclass)=0A=
      self._EventObj_stuff.clazz =3D value=0A=
    else:=0A=
      # If a __setattr__ is defined for obj's old class, call it. Else, =
just set the attrib in obj's __dict__=0A=
      if hasattr(self._EventObj_stuff.clazz, "__setattr__"): =
self._EventObj_stuff.clazz.__setattr__(self, attr, value)=0A=
      else:                                                  =
self.__dict__[attr] =3D value=0A=
      =0A=
    # Comparison may fail=0A=
    try:=0A=
      if value =3D=3D oldvalue: return=0A=
    except: pass=0A=
    =0A=
    # Call registered events, if needed=0A=
    for event in self._EventObj_stuff.events:=0A=
      event(self, attr, value, oldvalue)=0A=
      =0A=
  def __addevent__(self, event):=0A=
    self._EventObj_stuff.events.append(event)=0A=
    l =3D to_list(self)=0A=
    if (not l is None) and (not l is self): addevent(l, event)=0A=
  def __hasevent__(self, event =3D None):=0A=
    return (event is None) or (self._EventObj_stuff.has_event(event))=0A=
  def __removeevent__(self, event):=0A=
    self._EventObj_stuff.remove_event(event)=0A=
    l =3D to_list(self)=0A=
    if (not l is None) and (not l is self): removeevent(l, event)=0A=
    if len(self._EventObj_stuff.events) =3D=3D 0: self.__restore__()=0A=
    =0A=
  def __restore__(self):=0A=
    # If no event left, reset obj to its original class.=0A=
    if hasattr(self._EventObj_stuff.clazz, "__setattr__"):=0A=
      self._EventObj_stuff.clazz.__setattr__(self, "__class__", =
self._EventObj_stuff.clazz)=0A=
    else:=0A=
      self.__class__ =3D self._EventObj_stuff.clazz=0A=
    # And delete the _EventObj_stuff.=0A=
    del self._EventObj_stuff=0A=
    =0A=
  # Called at pickling time=0A=
  def __getstate__(self):=0A=
    try:=0A=
      dict =3D self._EventObj_stuff.clazz.__getstate__(self)=0A=
      =0A=
      if dict is self.__dict__: dict =3D dict.copy()=0A=
    except: dict =3D self.__dict__.copy()=0A=
    =0A=
    try:=0A=
      del  dict["_EventObj_stuff"] # Remove what we have added.=0A=
      if   dict.has_key("children"): dict["children"] =3D =
list(dict["children"])=0A=
      elif dict.has_key("items"   ): dict["items"   ] =3D =
list(dict["items"   ])=0A=
    except: pass # Not a dictionary ??=0A=
    =0A=
    return dict=0A=
  =0A=
  def __reduce__(self):=0A=
    def rec_check(t):=0A=
      if t is self.__class__: return self._EventObj_stuff.clazz=0A=
      if type(t) is tuple: return tuple(map(rec_check, t))=0A=
      return t=0A=
    =0A=
    red =3D self._EventObj_stuff.clazz.__reduce__(self)=0A=
    =0A=
    if len(red) =3D=3D 2: return red[0], tuple(map(rec_check, red[1]))=0A=
    else:             return red[0], tuple(map(rec_check, red[1])), =
red[2]=0A=
=0A=
class _EventObj_OldStyle(_EventObj):=0A=
  def __deepcopy__(self, memo):=0A=
    if hasattr(self._EventObj_stuff.clazz, "__deepcopy__"):=0A=
      clone =3D self._EventObj_stuff.clazz.__deepcopy__(self, memo)=0A=
      if clone.__class__ is self.__class__:=0A=
        clone.__class__ =3D self._EventObj_stuff.clazz=0A=
      if hasattr(clone, "_EventObj_stuff"): del clone._EventObj_stuff=0A=
      return clone=0A=
    else:=0A=
      import copy=0A=
      =0A=
      if hasattr(self, '__getinitargs__'):=0A=
        args =3D self.__getinitargs__()=0A=
        copy._keep_alive(args, memo)=0A=
        args =3D copy.deepcopy(args, memo)=0A=
        y =3D apply(self._EventObj_stuff.clazz, args)=0A=
      else:=0A=
        y =3D copy._EmptyClass()=0A=
        y.__class__ =3D self._EventObj_stuff.clazz=0A=
        memo[id(self)] =3D y=0A=
      if hasattr(self, '__getstate__'):=0A=
        state =3D self.__getstate__()=0A=
        copy._keep_alive(state, memo)=0A=
      else:=0A=
        state =3D self.__dict__=0A=
      state =3D copy.deepcopy(state, memo)=0A=
      if hasattr(y, '__setstate__'): y.__setstate__(state)=0A=
      else:                          y.__dict__.update(state)=0A=
      return y=0A=
    =0A=
    =0A=
class _EventObj_List(_EventObj):=0A=
  def __added__  (self, value): self._EventObj_stuff(self, ADDITION, =
value, None)=0A=
  def __removed__(self, value): self._EventObj_stuff(self, REMOVAL , =
value, None)=0A=
  =0A=
  def append(self, value):=0A=
    self._EventObj_stuff.clazz.append(self, value)=0A=
    self.__added__(value)=0A=
  def insert(self, before, value):=0A=
    self._EventObj_stuff.clazz.insert(self, before, value)=0A=
    self.__added__(value)=0A=
  def extend(self, list):=0A=
    self._EventObj_stuff.clazz.extend(self, list)=0A=
    for value in list: self.__added__(value)=0A=
    =0A=
  def remove(self, value):=0A=
    self._EventObj_stuff.clazz.remove(self, value)=0A=
    self.__removed__(value)=0A=
  def pop(self, index =3D -1):=0A=
    value =3D self._EventObj_stuff.clazz.pop(self, index)=0A=
    self.__removed__(value)=0A=
    return value=0A=
  =0A=
  def __setitem__(self, index, new):=0A=
    old =3D self[index]=0A=
    self._EventObj_stuff.clazz.__setitem__(self, index, new)=0A=
    self.__removed__(old)=0A=
    self.__added__  (new)=0A=
  def __delitem__(self, index):=0A=
    value =3D self[index]=0A=
    self._EventObj_stuff.clazz.__delitem__(self, index)=0A=
    self.__removed__(value)=0A=
  def __setslice__(self, i, j, slice):=0A=
    olds =3D self[i:j]=0A=
    self._EventObj_stuff.clazz.__setslice__(self, i, j, slice)=0A=
    for value in olds : self.__removed__(value)=0A=
    for value in slice: self.__added__  (value)=0A=
  def __delslice__(self, i, j):=0A=
    olds =3D self[i:j]=0A=
    self._EventObj_stuff.clazz.__delslice__(self, i, j)=0A=
    for value in olds : self.__removed__(value)=0A=
  def __iadd__(self, list):=0A=
    self._EventObj_stuff.clazz.__iadd__(self, list)=0A=
    for value in list: self.__added__(value)=0A=
    return self=0A=
  def __imul__(self, n):=0A=
    olds =3D self[:]=0A=
    self._EventObj_stuff.clazz.__imul__(self, n)=0A=
    if n =3D=3D 0:=0A=
      for value in olds: self.__removed__(value)=0A=
    else:=0A=
      for value in olds * (n - 1): self.__added__(value)=0A=
    return self=0A=
=0A=
=0A=
class _EventObj_Dict(_EventObj):=0A=
  def __added__  (self, key, value): self._EventObj_stuff(self, =
ADDITION, (key, value), None)=0A=
  def __removed__(self, key, value): self._EventObj_stuff(self, REMOVAL =
, (key, value), None)=0A=
=0A=
  def update(self, dict):=0A=
    old =3D {}=0A=
    for key, value in dict.items():=0A=
      if self.has_key(key): old[key] =3D value=0A=
    self._EventObj_stuff.clazz.update(self, dict)=0A=
    for key, value in old .items(): self.__removed__(key, value)=0A=
    for key, value in dict.items(): self.__added__  (key, value)=0A=
  def popitem(self):=0A=
    old =3D self._EventObj_stuff.clazz.popitem(self)=0A=
    self.__removed__(old[0], old[1])=0A=
    return old=0A=
  def clear(self):=0A=
    old =3D self.items()=0A=
    self._EventObj_stuff.clazz.clear(self)=0A=
    for key, value in old: self.__removed__(key, value)=0A=
=0A=
  def __setitem__(self, key, new):=0A=
    if self.has_key(key):=0A=
      old =3D self[key]=0A=
      self._EventObj_stuff.clazz.__setitem__(self, key, new)=0A=
      self.__removed__(key, old)=0A=
    else:=0A=
      self._EventObj_stuff.clazz.__setitem__(self, key, new)=0A=
    self.__added__(key, new)=0A=
  def __delitem__(self, key):=0A=
    value =3D self[key]=0A=
    self._EventObj_stuff.clazz.__delitem__(self, key)=0A=
    self.__removed__(key, value)=0A=
    =0A=
# EventObj class for plain list (e.g. []) and plain dict :=0A=
=0A=
# EventObj stuff is not stored in the object's dict (because no such =
dict...)=0A=
# but in this dictionary :=0A=
#stuff_for_non_dyn_objs =3D weakref.WeakKeyDictionary()=0A=
stuff_for_non_dyn_objs =3D {}=0A=
=0A=
class _EventObj_PlainList(_EventObj_List, list):=0A=
  __slots__ =3D []=0A=
  =0A=
  #__hash__ =3D object.__hash__ # Allows to hash it ! (needed to use =
"self" as a dict key)=0A=
  =0A=
  def _get_EventObj_stuff(self): return stuff_for_non_dyn_objs[id(self)]=0A=
  def _set_EventObj_stuff(self, stuff): stuff_for_non_dyn_objs[id(self)] =
=3D stuff=0A=
  _EventObj_stuff =3D property(_get_EventObj_stuff, _set_EventObj_stuff)=0A=
  =0A=
  def __restore__(self):=0A=
    # If no event left, delete the _EventObj_stuff and reset obj to its =
original class.=0A=
    # Bypass the _EventObj.__setattr__ (it would crash since =
_EventObj_stuff is no longer available after the class change)=0A=
    self._EventObj_stuff.clazz.__setattr__(self, "__class__", =
self._EventObj_stuff.clazz)=0A=
    del stuff_for_non_dyn_objs[id(self)]=0A=
    =0A=
  def __getstate__(self): return None=0A=
    =0A=
_classes[list] =3D _EventObj_PlainList=0A=
=0A=
class _EventObj_PlainDict(_EventObj_Dict, dict):=0A=
  __slots__ =3D []=0A=
  =0A=
  def _get_EventObj_stuff(self): return stuff_for_non_dyn_objs[id(self)]=0A=
  def _set_EventObj_stuff(self, stuff): stuff_for_non_dyn_objs[id(self)] =
=3D stuff=0A=
  _EventObj_stuff =3D property(_get_EventObj_stuff, _set_EventObj_stuff)=0A=
  =0A=
  def __restore__(self):=0A=
    # If no event left, delete the _EventObj_stuff and reset obj to its =
original class.=0A=
    # Bypass the _EventObj.__setattr__ (it would crash since =
_EventObj_stuff is no longer available after the class change)=0A=
    self._EventObj_stuff.clazz.__setattr__(self, "__class__", =
self._EventObj_stuff.clazz)=0A=
    del stuff_for_non_dyn_objs[id(self)]=0A=
    =0A=
  def __getstate__(self): return None=0A=
  =0A=
_classes[dict] =3D _EventObj_PlainDict=0A=
=0A=
=0A=
# Hierarchy stuff :=0A=
=0A=
def addevent_rec(obj, event =3D dumper_event):=0A=
  """addevent_rec(obj, event =3D dumper_event)=0A=
Add event for obj, like addevent, but proceed recursively in all the =
hierarchy : if obj is a UserList/UserDict, event will be added to each =
instance obj contains, recursively.=0A=
If the hierarchy is changed, the newly added items will DO have the =
event, and the removed ones will no longuer have it."""=0A=
  if not hasevent(obj, event): # Avoid problem with cyclic list/dict=0A=
    # Wrap event in a hierarchy event=0A=
    if not isinstance(event, HierarchyEvent): wevent =3D =
_wrap_event(obj, event, 1)=0A=
    =0A=
    addevent(obj, wevent)=0A=
    =0A=
    for o in to_content(obj):=0A=
      try: addevent_rec(o, event)=0A=
      except NonEventableError: pass=0A=
      =0A=
def removeevent_rec(obj, event =3D dumper_event):=0A=
  """removeevent_rec(obj, event =3D dumper_event)=0A=
Remove event for obj, like removeevent, but proceed recursively."""=0A=
  if hasevent(obj, event): # Avoid problem with cyclic list/dict=0A=
    removeevent(obj, event)=0A=
    =0A=
    for o in to_content(obj):=0A=
      if isinstance(o, _EventObj): removeevent_rec(o, event)=0A=
      =0A=
def change_class(obj, newclass):=0A=
  """Change the class of OBJ to NEWCLASS, but keep the events it may =
have."""=0A=
  events =3D obj._EventObj_stuff.events[:]=0A=
  for event in events: removeevent(obj, event)=0A=
  obj.__class__ =3D newclass=0A=
  for event in events: addevent(obj, event)=0A=
  =0A=

------=_NextPart_000_0149_01C267C9.4E13C0C0--



From pedronis@bluewin.ch  Sun Sep 29 16:23:35 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Sun, 29 Sep 2002 17:23:35 +0200
Subject: [Python-Dev] Re: FYI (was: watching mutables?)
Message-ID: <001701c267cc$3115d960$6d94fea9@newmexico>

"Jiba" <jiba@tuxfamily.org> wrote in message
news:3d96f25a$0$498$7a628cd7@news.club-internet.fr...
>
> Try EditObj (http://oomadness.tuxfamily.org/en/editobj) and look at the
> module editobj.eventobj.
>
> It does exactly what you need, but it is quite a hack...

FYI in Python 2.3: __class__ will be immutable for builtin types, and only
mutable for (user) subtypes.

See:

http://mail.python.org/pipermail/python-checkins/2002-August/028681.html
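
A minimal sketch of that rule, written in current Python 3 syntax rather
than 2.3. The class names Watched and Plain and the helper try_retarget
are made up for illustration; only the __class__ assignment itself is
the behaviour being discussed.

```python
class Watched(list):
    """Hypothetical user subtype of list."""


class Plain(list):
    """Another hypothetical user subtype with the same layout."""


def try_retarget(obj, new_class):
    """Return True if obj.__class__ could be reassigned, False on TypeError."""
    try:
        obj.__class__ = new_class
    except TypeError:
        return False
    return True


# An instance of the builtin type itself refuses the assignment...
builtin_ok = try_retarget([], Watched)
# ...while an instance of a user subtype accepts a compatible class.
subtype_ok = try_retarget(Plain(), Watched)
print(builtin_ok, subtype_ok)
```

This is why a module that swaps the class of a plain list or dict, as
eventobj does, stops working for those objects.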

regards.





From jrw@pobox.com  Sun Sep 29 21:26:54 2002
From: jrw@pobox.com (John Williams)
Date: Sun, 29 Sep 2002 15:26:54 -0500
Subject: [Python-Dev] proposal for interfaces
References: <25A39AFEB31B06408C675CEF28199E5B073CA3@postur.ccp.cc>
Message-ID: <3D97620E.2030405@pobox.com>

Esteban U.C. Castro wrote:
 > I like it a lot! Anyway, if it can be implemented in python as is,
 > what is the point of the PEP? Making the 'interface' root class and/or
 > InterfaceError builtins, maybe?

Well, aside from the ego boost of having my very own PEP, I think the
nature of interfaces is such that they're vastly more useful if lots of
people use them.  Putting something in the standard distribution almost
guarantees that.

 > I have some comments which I thought I would bounce. I'll organize
 > these attending to the activities they relate to. Don't hesitate to
 > tell me if I'm saying something stupid. :)

You have some very good points; I'll address them individually below,
but let me get it out of the way that most of my programming background
is with compiled languages with strong type checking.  As much as I love
Python, sometimes I really miss the rigor of more strongly typed
languages, so rather than trying to make something very Pythonic, I've
tried to go the opposite direction in order to complement what's there
already.

 > Define an interface
 > ===================
 >
 > In your example, it seems that
 >
 >   class Foo(interface):
 >     def default_foo(self, a, b):
 >       "Docstring for foo method."
 >       print "Defaults can be handy."
 >
 > [I added the a, b arguments to illustrate one point below]
 >
 > Does the following:
 >  - Defines the requirement that all objects that implement Foo have a
 >    foo function
 >  - Defines that foo should accept arguments of the form (a, b) maybe?
 >  - Sets a doc string for the foo method.
 >  - Sets a default implementation for the method.

Yes, exactly.

 > Some questions on this:
 >
 >  - Can an interface only define method names (and maybe argument
 >    formats)?  I think it would be handy to let it expose attributes.

Actually I wanted to have attributes, too (and operators, since they're
just methods).  This brings up one of the uses I had in mind for the
prefixes in front of the method names.  To add a property, you could do
something like this:

  doc_myProperty = "docstring for myproperty"
  readonly_myOtherProperty = "docstring for a read-only property"
  writeonly_myLastProperty = "docstring for write-only property"

This would require implementations to either have properties named
"myProperty", "myOtherProperty" and "myLastProperty", or define methods
named __get__myProperty, __set__myProperty, __get__myOtherProperty, and
__set__myLastProperty.
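
A rough sketch of how such a check might look, in current Python 3
syntax.  Nothing here is part of the proposal: the ACCESSORS table, the
helper names property_requirements and missing_properties, and the
classes FooProps and Impl are all hypothetical, and the rule that a
"doc_" property needs both accessors is my reading of the paragraph
above.

```python
# Prefix -> accessor prefixes an implementation may supply instead of an
# attribute.  This table mirrors the prose above; it is an assumption.
ACCESSORS = {
    "doc_": ("__get__", "__set__"),
    "readonly_": ("__get__",),
    "writeonly_": ("__set__",),
}


def property_requirements(interface_cls):
    """Return {property name: required accessor prefixes} for an interface."""
    reqs = {}
    for name, value in vars(interface_cls).items():
        for prefix, accessors in ACCESSORS.items():
            if name.startswith(prefix) and isinstance(value, str):
                reqs[name[len(prefix):]] = accessors
    return reqs


def missing_properties(interface_cls, impl_cls):
    """List declared properties that impl_cls provides in neither form."""
    missing = []
    for prop, accessors in sorted(property_requirements(interface_cls).items()):
        has_attr = hasattr(impl_cls, prop)
        has_accessors = all(hasattr(impl_cls, acc + prop) for acc in accessors)
        if not (has_attr or has_accessors):
            missing.append(prop)
    return missing


class FooProps:  # stands in for an interface declaration
    doc_myProperty = "docstring for myproperty"
    readonly_myOtherProperty = "docstring for a read-only property"
    writeonly_myLastProperty = "docstring for write-only property"


class Impl:
    myProperty = property(lambda self: 42)  # satisfied by a real property


# Attached with setattr because a name like __get__x written inside a
# class body would be name-mangled by Python.
setattr(Impl, "__get__myOtherProperty", lambda self: "read only")

result = missing_properties(FooProps, Impl)
print(result)
```

One practical wrinkle the sketch exposes: accessor names of the form
__get__name start with two underscores and do not end with two, so
Python mangles them when they are written inside a class body.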

 >  - Are method names (and maybe format of arguments) the only thing you
 >    can 'promise' in the interface? In other words, is that the only
 >    type of guarantee that code that works against the interface can
 >    get? I think a __check__(self, obj) special method in interfaces
 >    would be a simple way to boost their flexibility.

Something I didn't include in the original message was the
design-by-contract feature, which would allow pre- and postconditions to
be specified for any method, like this:

  def check_foo(self, a, b):
    "docstring for foo"
    if not (some precondition for foo):
      raise ExceptionOfYourChoice
    if not (some other precondition for foo):
      raise DifferentExceptionOfYourChoice
    return lambda result: (postconditions of foo) # optional

I'd even like to allow multiple declarations per method, so you could
have a check_foo (which is always called, at least when __debug__ is
true), and default_foo, which is called only in classes that don't give
their own implementation of foo.
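
To make the calling convention concrete, here is a small sketch in
current Python 3 syntax.  The wrapper name checked, the class Impl and
the particular conditions are invented for illustration, and for
simplicity the check always runs instead of depending on __debug__.

```python
def checked(check, impl):
    """Wrap impl so that check runs first and its postcondition runs after.

    check may raise to reject the call, and may return a callable that is
    applied to the result; returning None means "no postcondition".
    """
    def wrapper(self, *args, **kwargs):
        post = check(self, *args, **kwargs)
        result = impl(self, *args, **kwargs)
        if post is not None and not post(result):
            raise AssertionError("postcondition failed")
        return result
    return wrapper


def check_foo(self, a, b):
    "docstring for foo"
    if a < 0:  # an example precondition
        raise ValueError("a must be non-negative")
    return lambda result: result >= a  # an example postcondition


class Impl:
    def foo(self, a, b):
        return a + b


# In the real thing the interface machinery would do this binding.
Impl.foo = checked(check_foo, Impl.foo)

print(Impl().foo(1, 2))
```

With these conditions, foo(-1, 2) is rejected by the precondition and
foo(1, -5) passes the precondition but fails the postcondition.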

 >  - For the uses you have given to the prefix_methodname notation so far,
 >    I don't think it's really needed. Isn't the following sufficient?
 >  
 >  class Foo(interface):
 >     def foo(self, a, b):
 >         "foo docstring"
 >         # nothing here; no default definition
 >     def bar(self, a, b):
 >         pass # no docstring, _empty definition_

Not IMHO, since I'd want methods with no implementation to raise
NotImplementedError instead of silently returning None.  One thing that
*could* be done to make the simplest case (just a docstring) easier
(and make the syntax look more Pythonic) would be this:

  def foo(): "Docstring for foo"
    # Never called, so arguments aren't needed, but they would be nice
    # for documentation purposes.
  def __default__bar(self): (default implementation of bar)
  def __check__baz(self): (error checking for baz)

This has the side effect of making the __ before the prefixes necessary,
so you can still define methods that start with "default", "check",
etc.  I suppose the aesthetics of the __ are a matter of taste, though it
does at least make the "magic" nature of the prefixes stand out better.

 > This has the side effect that a method with no default definition and no
 > doc string is a SyntaxError. Is this too bad?
 > - It would maybe be hard to figure out what such a method is supposed
 >   to do, so you _should_ provide a docstring.
 > - If you're in a hurry, an empty docstring will do the trick. While in
 >   'quick and dirty mode' you probably won't be using interfaces a lot,
 >   anyway.

Exactly.

 > Defaults look indeed useful, but the really crucial aspect to lay down
 > in an interface definition is what does it guarantee on the objects
 > that implement it. If amalgamating this with default defs would
 > otherwise obscure it (there's another issue I'm addressing below), I
 > think defaults belong more properly to a class that implements the
 > interface, not to the interface definition itself.

I got the idea of defaults from Haskell, where it's common to see an
interface define methods with mutually recursive default definitions,
kind of like (to use a somewhat silly Python example) defining __eq__,
__ne__, __cmp__, etc. all in terms of one another and expecting
implementations to define enough of the methods that everything works.
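
A small sketch of that idea with just __eq__ and __ne__. The class names
(EqDefaults, Point, Tag) are invented for illustration; each default is
written in terms of the other, so an implementation has to override at
least one of them or any comparison recurses without end:

```python
class EqDefaults:
    """Mutually recursive defaults: override at least one of the two."""

    def __eq__(self, other):
        return not self.__ne__(other)

    def __ne__(self, other):
        return not self.__eq__(other)


class Point(EqDefaults):
    # Supplies __eq__ only; __ne__ comes from the default.
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class Tag(EqDefaults):
    # Supplies __ne__ only; __eq__ comes from the default.
    def __init__(self, name):
        self.name = name

    def __ne__(self, other):
        return not isinstance(other, Tag) or self.name != other.name
```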

 > Check whether an object implements an interface
 > ===============================================
 >
 > From your examples, I get it that an object implements an interface
 > iff it is an instance of a class that implements that interface.  So I
 > guess any checking of the requirements expressed by the interface is
 > done at the time when you bind() a class to an interface.  This
 > paradigm perfectly fits static and strongly typed languages, but it
 > falls a bit short of the flexibility of python IMO. You can do very
 > funny things to classes and objects at runtime, which can break any
 > assumptions based on object class.

Very good point.  The dynamic nature of Python is what makes it possible
to implement interfaces this way.  It would be a shame (and a little
ironic) if interfaces didn't play nice with dynamic code.

 > In your example:
 >
 >   def foo_proc(foo_arg):
 >     foo_proxy = Foo(foo_arg)
 >     ...
 >     x = foo_proxy.foo(a, b)
 >
 > [added a, b again]
 >
 > imagine foo_proc may only really cares that foo_arg is an object that
 > has a foo() method that takes (a, b) arguments (this is all Foo
 > guarantees).

Here's where my compiled language bias comes in.  If you only care that
foo_arg has a certain method, you don't want to use interfaces at all.
Using the interface doesn't just imply that foo_arg has a method named
foo, but also that the method satisfies the requirements laid out in the
interface definition.

 >  * Will the Foo() call check this, or will it just check that some class
 >    in foo_arg's bases is bound to the Foo interface?

Both, in a way.  The call to Foo() checks that foo_arg's class has been
declared to implement Foo, no more and no less.  Checking that the class
implements Foo *correctly* is a more multifaceted problem.  Some parts
(like making sure the right method names exist) could happen when the
interface is bound to the class, but other parts, like checking method
pre- and postconditions would have to happen at every method call.

 >    In the second case, if someone has been fiddling with foo_arg or some
 >    of its base classes, foo_arg.foo() may no longer exist or it may have
 >    a different signature.

You can't really stop people from shooting themselves in the foot.
Making methods disappear is black magic in my book.

 >  * Why should Foo() _always_ fail for objects that _do_ meet the
 >   requirements expressed by the interface Foo but have _not_ declared
 >   that they implement the interface?
 >
 > Making such check optional allows implicit (not declared) interface
 > satisfaction for those who want it. This should extend the
 > applicability of interfaces.

One of the major points of my design is to make the interface mechanism
very formal and explicit.  Keeping things implicit and informal is what
Python is already good at so I don't want to go there.  Also, having the
separate "bind" call means that it's really not necessary for a class to
declare that it implements the interface, only that some declaration of
that fact exists somewhere.

 > Declare that an object implements an interface or part of it
 > ============================================================
 >
 >   Foo.bind(SomeClass)
 >
 >
 > Problems:
 >
 > * I agree the declaration had better be included in the class
 >   definition, at least as an option.

I agree, but as an option.

The motivation for having interface bindings separate from the class
(and interface) definitions is mainly so that it's not necessary to
modify the source code for your classes to make them implement an
interface, so you can do things like add a new interface to a builtin or
legacy class without having access to the source code.  This is not
nearly as big a deal as with a language like C++ where you often can't
get to the source code in any useful way, but there are other reasons to
avoid touching the source, like avoiding the need to keep track of
patches to 3rd-party code.

OTOH, I agree that the level of control the "bind" call gives you is
usually not useful.  I left out alternatives for the sake of brevity, but
it would be easy enough to add one in the interest of keeping the most
common case as simple as possible.

 > * Declarative better than procedural, for this purpose.

Whether this is declarative or procedural is mostly a matter of
perspective, IMHO.  In my imagination, calls to bind occur only at the
module level, and almost always immediately after the class or interface
definition, so the "flavor" is declarative.

 > * Only classes (not instances) can declare that they implement an
 >   interface.

Maybe there should be something like a "bindinstance" method as well.
I'm sure there's a way to do it, but I don't consider it a very high
priority.

 > * Indexing notation to 'resolve' methods a bit counterintuitive.
 > What about:
 [snip]

Good ideas.  Your method has a lot of advantages, but it would be hard
(and messy) to make it do everything you can do with separate method
calls, and one thing I'm *very* reluctant to do is have two ways of
doing everything with only subtle or stylistic differences between them.
The best thing to do here may be to allow your style for simple cases
(the 90% that would have required only a single "bind" call), but
use the method-based syntax for anything more elaborate, like method
renaming and unbinding subclasses.
 
 > Restrict
 > ========
 >
 > This is, to make sure that an object is only accessed in the ways
 > defined in the interface (via the proxy).
 >
 > This should be optional too, but your syntax does this nicely; you
 > can call Foo() as an assertion of sorts and ignore the result.

I think it would be very misleading to require that an object support a
formal interface but then expect it to also support methods not
specified in the interface.  If this is what you want, the right thing
to do is derive a new interface from the old one that has the extra
functionality you need.

 > Note that the __implements__ method resolution magic would require
 > that you get a proxy, though.

Unfortunately, yes.  Of course there'd be nothing stopping you from
calling the class's methods without going through the interface at all
(provided you know the right names for them), but in that case you
really just want to check that the object is an instance of a class, not
that it implements an interface.

OTOH, if you think the main problem with the proxy approach is just that
it's verbose, perhaps I can interest you in this idea (or some variation
thereof).  Here's the Python iterator protocol implemented with interfaces
instead of magic method names:

  class Iterable(interface):
    def iter():
      "Return an object implementing the Iterator interface."

  class Iterator(Iterable):
    def next():
      "Return the next item or raise StopIteration."

  def iter(object):
    return Iterable(object).iter()

If you want to be even lazier, let Foo.foo(x) be a synonym for
Foo(x).foo(), so you can define the "iter" function like this:

  iter = Iterable.iter

Here's how you might add the iterator protocol to builtin lists, strings,
and tuples (if it wasn't already there, of course):

  # Define iteration semantics.
  class SequenceIterator(object):
    def __init__(self, seq):
      self.seq = seq
      self.index = 0
    def next(self):
      try:
        self.index += 1
        return self.seq[self.index - 1]
      except IndexError:
        raise StopIteration

  # Bind the interface to this implementation.
  Iterator.bind(SequenceIterator)
 
  # Add the Iterable interface to existing classes that don't have an
  # "iter" method.
  for t in list, tuple, str:
    Iterable.bind(t)
    Iterable[t].iter = SequenceIterator

Here are a few variations on the loop body, since you don't like the
indexing notation:
    
    Iterable.bind(t, {"iter": SequenceIterator})
    
    binding = Iterable.bind(t)
    binding.iter = SequenceIterator
    
    Iterable.bind(t)
    Iterable.bindmethod("iter", SequenceIterator)


Whew!  Ok, I guess I'm done now.  Thanks for your comments!

jw





From martin@v.loewis.de  Sun Sep 29 22:06:55 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 29 Sep 2002 23:06:55 +0200
Subject: [Python-Dev] Extension module difficulty w/pgen.
In-Reply-To: <Pine.BSF.4.33.0209271428160.75360-100000@localhost>
References: <Pine.BSF.4.33.0209271428160.75360-100000@localhost>
Message-ID: <m3n0q0o68g.fsf@mira.informatik.hu-berlin.de>

Jonathan Riehl <jriehl@spaceship.com> writes:

> It seems to me that I should not have to use this workaround, which only
> works on one of the systems I use.  Does anyone have an idea as to what I
> should do now?

It appears that metagrammar.o is not needed in the Python executable,
that's why the linker does not fetch it from the library when building
Python.

As for pgen, the first question would be why you need metagrammar.o in
your extension module. Assuming there is a good reason to expose it,
you should arrange to

1. exclude metagrammar.o from libpython.a,
2. include it explicitly as a source for building the pgen module.

HTH,
Martin


From greg@cosc.canterbury.ac.nz  Mon Sep 30 00:39:44 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 30 Sep 2002 11:39:44 +1200 (NZST)
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
In-Reply-To: <200209271104.MAA27895@synaptics-uk.com>
Message-ID: <200209292339.g8TNdiv28330@oma.cosc.canterbury.ac.nz>

Gareth McCaughan <gmccaughan@synaptics-uk.com>:

>   - Rational numbers.    $r"123/234"
>   - Regular expressions. $/"foo.*bar"
>   - Dates and times.     $t"2002-09-27 11:38"
>   - Hostnames and ports. $h"www.google.com:80"

This strikes me as ugly. There doesn't seem to be much, if any,
syntactical advantage over using a constructor:

   Rat("123/234")
   Regex("foo.*bar")
   Date("2002-09-27 11:38")
   Port("www.google.com:80")

These look cleaner and easier to read to me.
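
For illustration only, here is the constructor style written with
standard-library types from later Python releases standing in for the
hypothetical Rat, Regex, Date and Port above. The mapping is an
assumption made for this sketch, not something proposed in the thread:

```python
import re
from datetime import datetime
from fractions import Fraction
from urllib.parse import urlsplit

# Each value is built from a plain string by an ordinary callable.
rat = Fraction("123/234")                      # rational number
regex = re.compile("foo.*bar")                 # regular expression
when = datetime.strptime("2002-09-27 11:38", "%Y-%m-%d %H:%M")  # date and time
where = urlsplit("//www.google.com:80")        # host name and port
```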

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From mhammond@skippinet.com.au  Mon Sep 30 02:16:41 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Mon, 30 Sep 2002 11:16:41 +1000
Subject: [Python-Dev] Snapshot win32all builds of interest?
Message-ID: <LCEPIIGDJPKCOIHOBJEPCENPGKAA.mhammond@skippinet.com.au>

I'm wondering if there is any interest in me making regular, basically
untested win32all builds against the current Python CVS tree?

It would be fairly simple for me to do - I run against CVS Python, so it is
really just bundling up my latest built files into an installer .EXE.  I
would only do it for the current CVS trunk - ie, no Python 2.2 or earlier
builds in this form.
However, I would only bother if there were people willing to use it.  I
figure that if there aren't people in this forum who would use it, I won't
find them anywhere ;)

OTOH, people in this forum using CVS Python on Windows may prefer to use CVS
and build their own win32all - I really have no clue ;)

Thoughts?

Mark.



From dave@boost-consulting.com  Mon Sep 30 01:36:58 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Sun, 29 Sep 2002 20:36:58 -0400
Subject: [Python-Dev] Documentation: type-vs.-function
Message-ID: <0cf201c2681a$2912c6d0$6501a8c0@boostconsulting.com>

I note that
http://www.python.org/dev/doc/devel/lib/built-in-funcs.html#l2h-14
describes dict as a built-in function, whereas we all know that Guido's
cool 2.2 changes made it into a type

  >>> dict
  <type 'dict'>

Does this distinction matter? A little, I think. Calling it a function
makes it sound like we're living in the past. Same goes for str, type,
list, tuple, et al. I realize that the type (especially <type 'type'>)
acts like a function under many circumstances...

un-important-ly y'rs,
dave

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave@boost-consulting.com * http://www.boost-consulting.com





From martin@v.loewis.de  Mon Sep 30 06:41:55 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Sep 2002 07:41:55 +0200
Subject: [Python-Dev] Documentation: type-vs.-function
In-Reply-To: <0cf201c2681a$2912c6d0$6501a8c0@boostconsulting.com>
References: <0cf201c2681a$2912c6d0$6501a8c0@boostconsulting.com>
Message-ID: <m3lm5kqbj0.fsf@mira.informatik.hu-berlin.de>

"David Abrahams" <dave@boost-consulting.com> writes:

> Does this distinction matter? 

Yes. However, I think a few patches changing this have been rejected,
on the grounds of being confusing to users. So careful wording is
necessary, which probably requires mastery of the English language.

Regards,
Martin


From esteban@ccpgames.com  Mon Sep 30 07:27:00 2002
From: esteban@ccpgames.com (Esteban U.C.. Castro)
Date: Mon, 30 Sep 2002 06:27:00 -0000
Subject: [Python-Dev] proposal for interfaces
Message-ID: <25A39AFEB31B06408C675CEF28199E5B073CA7@postur.ccp.cc>

Thanks for your reply, and thanks for taking the trouble to fix the
formatting in the quotations. I thought it would maybe be considered
spam to send the message again just to fix the mess. (I think I know
what caused the problem last time; I hope this one reads fine.)
Is there some sort of policy on this?


> Well, aside from the ego boost of having my very own PEP, I think the
> nature of interfaces is such that they're vastly more useful if lots
> of people use them. Putting something in the standard distribution
> almost guarantees that.

Agreed. I was just curious about what exactly that _something_ is that
you want to put in the standard distribution. Just a recommendation in
the docs? Having some of the existing modules refactored to use this
method?

[P.S.: I'm afraid your ego will have to give up on having your own
interfaces PEP, since there is already one. See below.]


I don't know if there are precedents of coding practices being 'officially'
endorsed by the python development team when this implies no changes to
either the language or the standard library. To start with a non-intrusive
addition, I can think of a module defining precisely the interfaces of
builtin types, and other commonplace de facto interfaces that already
exist in the standard library.

Imagine you want to use a function that takes a builtin type, but you
want to pass your own fake instance. Making it just inherit from the
builtin type is an option, but maybe that's not practical, or yours is,
too, a type not implemented in python, or you don't want to inherit the
behavior of the builtin type.

Having the interface formalized somewhere will help you know what will
be expected from your custom object.

Most 'interfaces' in standard python libraries (e.g. iterable, iterator,
stream...) are really simple, and it has worked quite well to have
them as just de facto. I think code checking for an interface would be
more expressive than code checking for method names or just trying to
use the object and catch exceptions (you can still do this for other
reasons if you want). The latter also relies on everyone knowing and
agreeing on what makes an int an int, and a list a list.

Now, many functions take lists, but only use a subset of the list
interface. It would be handy if common subsets of (e.g.) the list
interface could be identified and formalized in a standard module. We
could define a SimpleReadList that would only guarantee getting
items by index, a SliceRWDList() that would guarantee index and slice
getting, setting, and deletion, and the complete List interface with
all the append, insert, sort methods.

[Note: of course it wouldn't be done like this, but I don't have a
good idea of what the most commonly used subsets of the list
interface are. Names ugly on purpose, so you won't take them seriously
:).]

This is also an example of why I see the usefulness of implicit
interface satisfaction. This way interface definitions can have
'retroactive' effect, so you don't have to mess with the builtin
types at all in order to define how they interact with other objects
(that is, to define and use their interfaces).
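
A rough sketch of implicit satisfaction, borrowing typing.Protocol from
much later Python versions (3.8+) as a stand-in; nothing like it exists
in the proposal under discussion. SimpleReadList and SliceRWDList reuse
the deliberately ugly names from above, and the first() function is made
up for the example:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SimpleReadList(Protocol):
    """Only guarantees getting items by index."""
    def __getitem__(self, index): ...


@runtime_checkable
class SliceRWDList(SimpleReadList, Protocol):
    """Adds setting and deleting by index or slice."""
    def __setitem__(self, index, value): ...
    def __delitem__(self, index): ...


def first(seq):
    # The check looks at the object itself, not at any declaration,
    # so builtin types qualify without being modified.
    if not isinstance(seq, SimpleReadList):
        raise TypeError("seq must support indexing")
    return seq[0]
```

Note that such a check only confirms that the methods are present; it
says nothing about signatures or behavior.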


> As much as I love Python, sometimes I really miss the rigor of more
> strongly-typed languages, so rather than trying to make something
> very Pythonic, I've tried to go the opposite direction in order to
> complement what's there already.

It looks like you are not alone in this, or at least others have
been where you are. You may want to take a look at the homepage for
this (retired) Special Interest Group:

http://www.python.org/sigs/types-sig/



While looking for this, I found a PEP, proposed by this group, which
addresses interfaces too:

http://www.python.org/peps/pep-0245.html


I think there's a point in continuing this discussion though, as

> Dissenting Opinion
>    This PEP has not yet been discussed on python-dev.


If this is not a good time/place to talk about this, I guess we'll be
warned :).

I got a strong deja-vu reading this PEP. It matches so closely some
of the changes I proposed to your model, that I must have read it
long ago and then forgotten about it. Still, it suffers from some
statically-typed background and may have the problems I pointed out in
your proposal; it seems to limit itself to the functionality and needs
of Java interfaces (the only example I am more or less familiar with).

Since there is a group effort behind it, I guess they have taken this
into account and still agreed on this solution for some reason; I
haven't looked at the archives yet.

Also, this PEP is less shy about proposing changes to the language
itself. This is maybe a good idea; if one of the important features
for the usefulness of interfaces is that they will be widely used, it
would help if there was only one universally accepted way to use
them. Endorsing one such way in the language itself should help.


> Actually I wanted to have attributes, too (and operators, since they're
> just methods).  This brings up one of the uses I had in mind for the
> prefixes in front of the method names.  To add a property, you could do
> something like this:
>
>  doc_myProperty = "docstring for myproperty"
>  readonly_myOtherProperty = "docstring for a read-only property"
>  writeonly_myLastProperty = "docstring for write-only property"

Again, why not just:

 #   name    access         docstring
 myProperty = "rw", "docstring for myProperty"
 myOtherProperty = "r", "docstring for a read-only property"
 myLastProperty = "w", "docstring for write-only property"


> This would require implementations to either have properties named
> "myProperty", "myOtherProperty" and "myLastProperty", or define methods
> named __get__myProperty, __set__myProperty, __get__myOtherProperty, and
> __set__myLastProperty.

It could just require that getattr and/or setattr can be called on
implementations (depending on the access declared). There is the problem
of checking setattr non-destructively when the object is read-only. Maybe
this is an issue somewhere else too? I hope this can be solved so as to
keep the syntax as simple as possible.



> Something I didn't include in the original message was the
> design-by-contract feature, which would allow pre- and postconditions to
> be specified for any method, like this:
>
>  def check_foo(self, a, b):
>    "docstring for foo"
>    if not (some precondition for foo):
>      raise ExceptionOfYourChoice
>    if not (some other precondition for foo):
>      raise DifferentExceptionOfYourChoice
>    return lambda result: (postconditions of foo) # optional


I think the ability to set attributes on python functions, or builtin
properties (I'd have to refresh my memory on these :) could be used for
this. Either way, the syntax for the client user could be something
like:

 class SomeInterface(interface):
   def foo(self, a, b): "foo doc"
   def bar(self, a, b): "bar doc"

   # takes a, b; raises something if they're wrong
   foo.__pre__ = some_checking_func

   # takes the return value; makes sure it's not broken
   foo.__post__ = other_checking_func

   # called _instead_ of the function, and _should_ call the
   # function in turn
   bar.__around__ = yet_another_checking_func


This makes it more explicit that __pre__, __post__ and/or __around__ are
something that relates to foo in some way. Your second approach (bar,
__default__bar) comes closer to this, and it may be more convenient than
first defining the function, then assigning it. Having part of the name
of an object carry a special meaning is convenient but a bit hacky. I
do it often myself, but I don't think I'd propose a standard based on
that. A matter of personal taste, I guess.
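
As a rough sketch of the function-attribute idea, assuming some piece of
machinery reads the attributes at call time: the checked() wrapper and
the halve example are hypothetical, and only __pre__ and __post__ are
handled.

```python
def checked(func):
    """Wrap func so that its __pre__/__post__ attributes, if set, are run."""
    def wrapper(*args, **kwargs):
        pre = getattr(func, "__pre__", None)
        if pre is not None:
            pre(*args, **kwargs)      # takes the arguments, raises if wrong
        result = func(*args, **kwargs)
        post = getattr(func, "__post__", None)
        if post is not None:
            post(result)              # takes the return value, raises if broken
        return result
    return wrapper


def halve(n):
    return n // 2


def require_even(n):
    if n % 2:
        raise ValueError("n must be even")


def require_int(result):
    if not isinstance(result, int):
        raise TypeError("result must be an int")


# The checks are attached to the function object itself.
halve.__pre__ = require_even
halve.__post__ = require_int
safe_halve = checked(halve)
```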



>>  - For the uses you have given to the prefix_methodname notation so far,
>>    I don't think it's really needed. Isn't the following sufficient?
>>
>>  class Foo(interface):
>>     def foo(self, a, b):
>>         "foo docstring"
>>         # nothing here; no default definition
>>     def bar(self, a, b):
>>         pass # no docstring, _empty definition_
>
> Not IMHO, since I'd want methods with no implementation to raise
> NotImplementedError instead of silently returning None.  One thing that
> *could* be done to make the simplest case (just a docstring?)
> easier (and make the syntax look more Pythonic) would be this:

The absence of 'pass' in an interface method (in foo) would be considered
absence of any default implementation, and therefore you'd get an
InterfaceError when trying to validate/get a proxy on the object if it
doesn't implement that method.

On second thought, I admit this is dirty, and I don't know if this magic
is even possible. A more explicit approach could be:

 class Foo(interface):
    def foo(self, a, b):
        "foo docstring"
        raise NotImplementedError

    def bar(self, a, b):
        pass # no docstring, _empty definition_


Or, for consistency with the proposal above (and with the existing PEP):

 class Foo(interface):
    def foo(self, a, b):
        "foo docstring"
        # no definition allowed here, sorry

    def bar(self, a, b):
        "" # empty docstring required at the very least
    bar.__default__ = lambda a, b: None


Now that I think about it, the __underscores__ could maybe be taken out
for interface method special attributes. They remind us that we are
looking at magic stuff that the 'system' will be using in special ways,
but if you are not expected to assign arbitrary attributes to interface
methods, then there is no ambiguity. Aesthetic choice, again.




> I got the idea of defaults from Haskell, where it's common to see an
> interface define methods with mutually recursive default definitions,
> kind of like (to use a somewhat silly Python example) defining __eq__,
> __ne__, __cmp__, etc. all in terms of one another and expecting
> implementations to define enough of the methods that everything works.

I like it! :) I may be too much of a purist about this, but I still think
that doesn't belong in the interface definition. I admit putting defaults
there is convenient, but I wish we could find a solution that is both
convenient and keeps implementation details out of the interface
definition.

Although the existing interface PEP stresses the separation between
interface and class, it provides one possible solution for this: it
talks about a deferred() method (in interfaces) that will return a
class that implements the interface. In the PEP, it seems this is only
meant to provide error reporting, but I guess it could be put to good
use in other ways.

I admit I don't understand the 'deferred' name :), I'm not sure how
that default class would be defined, and whether it is intended to be
customizable in the PEP. Having a convenient, standard way to define
a default class and associate it with an interface without looking
like something intrinsic to it, _that_ would be, IMHO, the ideal
solution.



>> In your example:
>>
>>   def foo_proc(foo_arg):
>>     foo_proxy = Foo(foo_arg)
>>     ...
>>     x = foo_proxy.foo(a, b)
>>
>> [added a, b again]
>>
>> imagine foo_proc may only really care that foo_arg is an object that
>> has a foo() method that takes (a, b) arguments (this is all Foo
>> guarantees).
>
> Here's where my compiled language bias comes in.  If you only care that
> foo_arg has a certain method, you don't want to use interfaces at all.
> Using the interface doesn't just imply that foo_arg has a method named
> foo, but also that the method satisfies the requirements laid out in the
> interface definition.

I agree; this case was only simplified for the sake of exposition. The
point here is that the requirements laid out in the interface definition
may possibly be checked on objects (as opposed to classes), at runtime
(as opposed to, um, import-time :).

If you really want to check the class you can do so in one specific
interface.

If you don't want the overhead of checking every time, and you know you
won't be messing with class instances or objects, you may either check
by class, or maybe even cache the results of checking on classes/objects.



>>    In the second case, if someone has been fiddling with foo_arg or some
>>    of its base classes, foo_arg.foo() may no longer exist or it may have
>>    a different signature.
>
> You can't really stop people from shooting themselves in the foot.
> Making methods disappear is black magic in my book.

Very true. This is of course not a crucial point. Still, since method calls
are always late-bound I think it only makes sense that restrictions on
them _can_, at least, be late-checked.


>> * I agree the declaration had better be included in the class
>>   definition, at least as an option.
>
> I agree, but as an option.

The __implements__ alternative lets you fiddle with interfaces outside
the class too:

 # SomeClass itself defined elsewhere
 SomeClass.__implements__ += (Foo,)

All in all, the syntax I like the most so far is the one described in
the existing PEP.



>> * Declarative better than procedural, for this purpose.
>
> Whether this is declarative or procedural is mostly a matter of
> perspective, IMHO.  In my imagination, calls to bind occur only at the
> module level, and almost always immediately after the class or interface
> definition, so the "flavor" is declarative.

You're right; an assignment is just as procedural as a function call.
It was a very personal aesthetic appreciation again; a function call looks
more like it's "doing" something, to me. But that is very arguable.


>> * Only classes (not instances) can declare that they implement an
>>   interface.
>
> Maybe there should be something like a "bindinstance" method as well.
> I'm sure there's a way to do it, but I don't consider it a very high
> priority.

Me neither. I think declaration would typically be done on a per-class,
not per-instance, basis. Still, due to the nature of python it is the
instance (the object, really) that implements (or fails to) the interface.
So I would add it, for completeness and to reflect this, *if* it would
not require any specific syntax or additional complication. I think the
way __implements__ would be searched would be natural for python users,
since it is consistent with what is being done for __dict__, for example.



> Good ideas.  Your method has a lot of advantages, but it would be hard
> (and messy) to make it do everything you can do with separate method
> calls, and one thing I'm *very* reluctant to do is have two ways of
> doing everything with only subtle or stylistic differences between them.

I agree there should not be two ways. :>


> The best thing to do here may be to allow your style for simple cases
> (the 90% that would have required only a single "bind" call), but
> use the method-based syntax for anything more elaborate, like method
> renaming and unbinding subclasses.

Method renaming and unbinding subclasses are supported in the alternative
and rather easy; I guess the weird formatting obscured this :).

Method renaming:
================

 class I1(interface):
   def f(self): ""

 class I2(interface):
   def f(self, a, r, g, s): ""

 class SomeClass:

   def i_f(self): pass
   i_f.__implements__ = I1.f

   def g_f(self, a, r, g, s): pass
   g_f.__implements__ = I2.f

[Note: if 'implements' is introduced as a keyword, as in the PEP, we
could just as well declare

 def g_f(self, a, r, g, s) implements I2.f:
   ...
]


Unbinding subclasses:
=====================

 class I1(interface): ...
 class I2(interface): ...

 class Base:
   __implements__ = (I1, I2)

 class Sub(Base):
   # it would be prettier to say (not I2,), but it would involve
   # some language hacking I guess, while (-I2,) only requires
   # overriding __neg__ in the interface
   __implements__ = (-I2,)
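
One way the (-I2,) notation might be made to work, as a sketch in
present-day Python 3 syntax. InterfaceMeta, Excluded and implemented()
are hypothetical names invented here; neither the PEP nor the original
proposal defines them:

```python
class Excluded:
    """Marker produced by -SomeInterface."""
    def __init__(self, interface):
        self.interface = interface


class InterfaceMeta(type):
    def __neg__(cls):
        # -I2 is evaluated on the interface class itself, so the
        # operator has to live on its metaclass.
        return Excluded(cls)


class interface(metaclass=InterfaceMeta):
    pass


class I1(interface): pass
class I2(interface): pass


def implemented(cls):
    """Collect interfaces along the MRO, base classes first, applying exclusions."""
    result = []
    for klass in reversed(cls.__mro__):
        for entry in vars(klass).get("__implements__", ()):
            if isinstance(entry, Excluded):
                if entry.interface in result:
                    result.remove(entry.interface)
            elif entry not in result:
                result.append(entry)
    return result


class Base:
    __implements__ = (I1, I2)


class Sub(Base):
    __implements__ = (-I2,)
```

Here implemented(Base) yields both interfaces, while implemented(Sub)
drops I2 and keeps the inherited I1.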




>> [using proxies] should be optional too, but your syntax does this
>> nicely; you can call Foo() as an assertion of sorts and ignore the
>> result.
>
> I think it would be very misleading to require that an object support a
> formal interface but then expect it to also support methods not
> specified in the interface.  If this is what you want, the right thing
> to do is derive a new interface from the old one that has the extra
> functionality you need.

As you pointed out, in the real world you won't be defining interfaces
for very simple things like "has a write() method". This means you may
still want to use other forms of validation, or do without validation
at all (for aspects of the code which are still in an experimental phase,
for example). You may even use the interface more informally as
a mere sanity check.

Anyway, method redirection almost enforces that the interface-specific
functionality of an object be accessed through a proxy. Doing so looks
like the right thing to do anyway, so I don't consider this unfortunate
at all.

I like the proxy idea and the syntax you propose for it. The iterator
example was a nice read, anyway :).

Was this a typo?

  class Iterator(Iterable):

Meant class Iterator(interface): ?



In conclusion, I am glad about the emphasis, in python, on generally
making things easier for you, rather than paternalistically trying to keep
you from doing evil. Since I came from Java, it took a bit of getting used
to, but I have seen python scaling nicely to rather big projects
without this becoming a serious issue. I think there is a point in
standardizing and automating type checking in python, but I believe it can
and should be done without betraying this philosophy.

Enough ranting for today! :)


Esteban.


From esteban@ccpgames.com  Mon Sep 30 07:31:23 2002
From: esteban@ccpgames.com (Esteban U.C.. Castro)
Date: Mon, 30 Sep 2002 06:31:23 -0000
Subject: [Python-Dev] bad formatting
Message-ID: <25A39AFEB31B06408C675CEF28199E5B06EB0E@postur.ccp.cc>

Sorry again. Does anyone know of a 'sandbox' mailing list where I could
experiment to fix the formatting problems?


From aleax@aleax.it  Mon Sep 30 07:38:54 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 30 Sep 2002 08:38:54 +0200
Subject: [Python-Dev] Documentation: type-vs.-function
In-Reply-To: <0cf201c2681a$2912c6d0$6501a8c0@boostconsulting.com>
References: <0cf201c2681a$2912c6d0$6501a8c0@boostconsulting.com>
Message-ID: <E17vuCp-0006ly-00@mail.python.org>

On Monday 30 September 2002 02:36 am, David Abrahams wrote:
> I note that
> http://www.python.org/dev/doc/devel/lib/built-in-funcs.html#l2h-14
> describes dict as a built-in function, whereas we all know that Guido's
> cool 2.2 changes made it into a type
>
>   >>> dict
>
>   <type 'dict'>
>
> Does this distinction matter? A little, I think. Calling it a function
> makes it sound like we're living in the past. Same goes for str, type,
> list, tuple, et. al. I realize that the type (especially <type 'type'>)
> acts like a function under many circumstances...

Trying to cover both 2.1 and 2.2 in the coming Nutshell, I've resorted to 
periphrases such as "the built-in dict" or "the dict built-in" (the latter 
uses "built-in" as a noun, I'm not yet sure the editor will let that go by).

I've also tried to use 'callable' systematically instead of 'function' 
wherever other callables (types, bound-methods, etc) can be substituted
in lieu of functions.  In documenting 2.2 or 2.3 only, I think such hedging
is not warranted.  It's important, when feasible, to clarify what built-ins
are types -- a type has MORE functionality than a function, after all (in 
particular, one can subclass it, while one can't subclass a function).
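
A short sketch of that difference, assuming a present-day Python 3
interpreter (where the repr reads <class 'dict'> rather than
<type 'dict'>). The class names Counter and G are made up for the example.

```python
# dict is a type, so it can serve as a base class.
class Counter(dict):
    def bump(self, key):
        self[key] = self.get(key, 0) + 1


c = Counter()
c.bump("a")
c.bump("a")


def f():
    pass


# The type of a plain function cannot be used as a base class.
try:
    class G(type(f)):
        pass
    function_subclassable = True
except TypeError:
    function_subclassable = False
```

Both dict and f are callable, which is why 'callable' works as the
common term, but only dict passes isinstance(dict, type).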


Alex


From esteban@ccpgames.com  Mon Sep 30 07:47:34 2002
From: esteban@ccpgames.com (Esteban U.C.. Castro)
Date: Mon, 30 Sep 2002 06:47:34 -0000
Subject: [Python-Dev] proposal for interfaces (errata)
Message-ID: <25A39AFEB31B06408C675CEF28199E5B06EB0F@postur.ccp.cc>

g_f should be i2f.

SPAM END :)
___________



Method renaming:
----------------

 class I1(interface):
   def f(self): ""

 class I2(interface):
   def f(self, a, r, g, s): ""

 class SomeClass:

   def i_f(self): pass
   i1f.__implements__ = I1.f

   def g_f(self, a, r, g, s): pass
   i2f.__implements__ = I2.f

[Note: if 'implements' is introduced as a keyword, as in the PEP, we
could just as well declare

 def g_f(self, a, r, g, s) implements I2.f:
   ...
]
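
A minimal sketch of how such renamed methods could be reached through a
proxy in present-day Python 3, with the method names written as i1f and
i2f so that they match the attribute assignments. The Proxy class and its
lookup strategy are assumptions made for illustration, not something taken
from the proposal; the lookup only inspects the object's own class, not
its bases.

```python
class interface:
    pass


class I1(interface):
    def f(self): ""


class I2(interface):
    def f(self, a, r, g, s): ""


class SomeClass:
    def i1f(self):
        return "I1.f"
    i1f.__implements__ = I1.f

    def i2f(self, a, r, g, s):
        return "I2.f"
    i2f.__implements__ = I2.f


class Proxy:
    """Expose obj through iface, redirecting each interface method to the
    method that declares it via __implements__ (hypothetical helper)."""

    def __init__(self, obj, iface):
        self._obj = obj
        self._iface = iface

    def __getattr__(self, name):
        # Raises AttributeError if the interface has no such method.
        wanted = getattr(self._iface, name)
        for attr in vars(type(self._obj)).values():
            if getattr(attr, "__implements__", None) is wanted:
                return attr.__get__(self._obj)  # bind to the instance
        raise AttributeError(name)
```

Proxy(SomeClass(), I1).f() then calls i1f, and
Proxy(SomeClass(), I2).f(1, 2, 3, 4) calls i2f.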


From herald@ns1.nabitel.com  Mon Sep 30 05:42:36 2002
From: herald@ns1.nabitel.com (herald@ns1.nabitel.com)
Date: Mon, 30 Sep 2002 13:42:36 +0900
Subject: [Python-Dev] (ad)Strong WebRobot/eMailId Collector: Free Download !
Message-ID: <NS1PqbAszvnkB2BLuUr00002e26@ns1.ns1.nabitel.com>

This is a multi-part message in MIME format.

------=_NextPart_000_939A3_01C26887.3C7F6210
Content-Type: text/plain;
	charset="ks_c_5601-1987"
Content-Transfer-Encoding: quoted-printable


Sorry for interrupting you - click refuse
<mailto:mailer@ns1.nabitel.com>  for no more mail...

- Welcome to NabiTel's <http://www.nabitel.com/English.asp>  software
products and portal services -
Software Products
 <http://www.Nabitel.com/English.asp>

Web Robot: also called web spider or web crawler, collects useful web
page informations by navigating world wide web sites.

Download free trial version now ! <http://www.nabitel.com/English.asp>

 <http://www.Nabitel.com/English.asp>  eMail ID Collector: Collects
email ids publicly opened on various web pages, with good intention.

Download free trial version now ! <http://www.nabitel.com/English.asp>

Portal Services
 <http://www.nabitel.com/English.asp>  Web Portal: Do you have your own
home page and want to broadcast it all over the world ? Register your
home page to NabiTel Portal Now !! (nabi=a butterfly)

Register your home page now, it's free !
<http://www.nabitel.com/English.asp>

 <http://www.AllThatCars.com/English.asp>  Automobiles: Do you want
to sell or buy automobiles ? Cars, trucks, limos, airplanes, ships,....
All That Cars are here !

Register your vehicles now, it's free !
<http://www.AllThatCars.com/English.asp>

 <http://www.AllThatComputers.Com>  Computers: Do you want to sell
or buy computers ? PCs, printers, scanners, servers, mainframes, ....
All That Computers are here !

Register your computers now, it's free !
<http://www.AllThatComputers.com/English.asp>

 <http://www.AllThatFoods.Com/English.asp>
Food & Restaurants: Are you seeking for a nice place to eat ? Or do you
run a restaurant ? Foods of the world, restaurants of the world, ....
All That Foods are here !

Register your restaurant now, it's free !
<http://www.AllThatFoods.com/English.asp>

Have a nice day.  Thank you.

------=_NextPart_000_939A3_01C26887.3C7F6210--


From aleax@aleax.it  Mon Sep 30 08:35:34 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 30 Sep 2002 09:35:34 +0200
Subject: [Python-Dev] proposal for interfaces
In-Reply-To: <3D97620E.2030405@pobox.com>
References: <25A39AFEB31B06408C675CEF28199E5B073CA3@postur.ccp.cc> <3D97620E.2030405@pobox.com>
Message-ID: <E17vv67-0006it-00@mail.python.org>

On Sunday 29 September 2002 10:26 pm, John Williams wrote:
> Esteban U.C.. Castro wrote:
>  > I like it a lot! Anyway, if it can be implemented in python as is,
>  > what is the point of the PEP? Making the 'interface' root class and/or
>  > InterfaceError builtins, maybe?
>
> Well, aside from the ego boost of having my very own PEP, I think the
> nature of interfaces is such that they're vastly more useful if lots of
> people use them.  Putting something in the standard distribution almost
> guarantees that.

I'm not following all of the details (and I'm prejudiced by the mentioned
bias for static typing), but I'd like to second this specific point: some
features, and interfaces are definitely one, become more useful the more
people use them.  This is what economists call "a network
effect" (if you're the only one in the world to own a phone, its usefulness
to you is nil; if there are just two phones in the world, each has at least
one bit of usefulness; the more people use phones, the more useful each
phone becomes to its user).  If most Python modules used interfaces (of
almost any given kind), those interfaces could be very useful; if almost no
module did, the usefulness would be limited to inter-module communication
for very complex systems one develops oneself -- not nil, but much less.


Alex


From mwh@python.net  Mon Sep 30 09:14:20 2002
From: mwh@python.net (Michael Hudson)
Date: 30 Sep 2002 09:14:20 +0100
Subject: [Python-Dev] ATTENTION! Releasing Python 2.2.2 in a few weeks
In-Reply-To: Guido van Rossum's message of "Sat, 28 Sep 2002 10:22:58 -0400"
References: <200209202126.g8KLQVI24554@pcp02138704pcs.reston01.va.comcast.net> <2mn0q2fip7.fsf@starship.python.net> <200209281422.g8SEMw720102@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2mlm5jlwrn.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> > > I'd like to release something called Python 2.2.2 in a few weeks (say,
> > > around Oct 8; I like Tuesday release dates).
> > 
> > One minor point of concern: I think Jack Jansen's on holiday.  Perhaps
> > we should wait for him to get back...
> 
> He should be back by Oct 5 or 6 if what he told me of his schedule is
> true.  I'm not sure that it would matter much if MacPython 2.2.2 was
> released a week after the main release.  Maybe we should do one
> release candidate anyway and give him space that way.

OK, if you knew about this, then I'll assume you have it in hand...

Cheers,
M.

-- 
  Java is a WORA language! (Write Once, Run Away)
                	-- James Vandenberg (on progstone@egroups.com)
                           & quoted by David Rush on comp.lang.scheme


From fredrik@pythonware.com  Mon Sep 30 10:01:24 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 30 Sep 2002 11:01:24 +0200
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
References: <200209292339.g8TNdiv28330@oma.cosc.canterbury.ac.nz>
Message-ID: <00f701c2685f$f67611f0$0900a8c0@spiff>

greg wrote:

> >   - Rational numbers.    $r"123/234"
> >   - Regular expressions. $/"foo.*bar"
> >   - Dates and times.     $t"2002-09-27 11:38"
> >   - Hostnames and ports. $h"www.google.com:80"
>
> This strikes me as ugly. There doesn't seem to be much, if any,
> syntactical advantage over using a constructor:
>
>    Rat("123/234")
>    Regex("foo.*bar")
>    Date("2002-09-27 11:38")
>    Port("www.google.com:80")
>
> These look cleaner and easier to read to me.

isn't the whole idea that with a special syntax, you can do some of the
processing when compiling the script?  it's pretty pointless to invent more
ways to call functions with string literals as arguments...
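
For comparison, the constructor spelling is already available for several
of these types in today's standard library. A small sketch, assuming
Python 3: Fraction, re.compile and datetime.strptime stand in for the
hypothetical Rat, Regex and Date, and each call is evaluated at run time
every time the line executes, which is the work a compile-time literal
would aim to do once.

```python
import re
from datetime import datetime
from fractions import Fraction

# Each "literal" is an ordinary call on a string, evaluated at run time.
rat = Fraction("123/234")
regex = re.compile("foo.*bar")
date = datetime.strptime("2002-09-27 11:38", "%Y-%m-%d %H:%M")
```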

btw, the following note is slightly related to this topic, and has been
generating some buzz lately (at least in my mailbox):

    http://effbot.org/zone/idea-xml-literal.htm

</F>



From martin@v.loewis.de  Mon Sep 30 13:59:03 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Sep 2002 14:59:03 +0200
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
In-Reply-To: <00f701c2685f$f67611f0$0900a8c0@spiff>
References: <200209292339.g8TNdiv28330@oma.cosc.canterbury.ac.nz>
 <00f701c2685f$f67611f0$0900a8c0@spiff>
Message-ID: <m3it0nprag.fsf@mira.informatik.hu-berlin.de>

"Fredrik Lundh" <fredrik@pythonware.com> writes:

> isn't the whole idea that with a special syntax, you can do some of the
> processing when compiling the script?  it's pretty pointless to invent more
> ways to call functions with string literals as arguments...

That can't be the idea: Marshalling would store the string form, so
any compilation done until marshalling must be undone.

Perhaps the idea is that these things are interpreted once before byte
code interpretation starts (i.e. after loading a .pyc file). In that
case, a number of interesting questions arise:

- in what order, precisely, are those things evaluated? Probably in
  textual order, but this is not that easy, since the marshalling
  procedure might make such a requirement unimplementable.

- are duplicate occurrences eliminated? If so, how does one determine
  duplicates?

In any case, I think users will be surprised if $h"www.google.com:80"
causes a dial-up connection to be set up as soon as a module is
imported.

Regards,
Martin


From dave@boost-consulting.com  Mon Sep 30 14:27:43 2002
From: dave@boost-consulting.com (David Abrahams)
Date: Mon, 30 Sep 2002 09:27:43 -0400
Subject: [Python-Dev] Documentation: type-vs.-function
References: <0cf201c2681a$2912c6d0$6501a8c0@boostconsulting.com> <auto-000000093727@stlport.com>
Message-ID: <0e9f01c26885$7551a820$6501a8c0@boostconsulting.com>

From: "Alex Martelli" <aleax@aleax.it>


> On Monday 30 September 2002 02:36 am, David Abrahams wrote:
> > I note that
> > http://www.python.org/dev/doc/devel/lib/built-in-funcs.html#l2h-14
> > describes dict as a built-in function, whereas we all know that Guido's
> > cool 2.2 changes made it into a type
> >
> >   >>> dict
> >
> >   <type 'dict'>
> >
> > Does this distinction matter? A little, I think. Calling it a function
> > makes it sound like we're living in the past. Same goes for str, type,
> > list, tuple, et. al. I realize that the type (especially <type 'type'>)
> > acts like a function under many circumstances...
>
> Trying to cover both 2.1 and 2.2 in the coming Nutshell, I've resorted to
> periphrases such as "the built-in dict" or "the dict built-in" (the latter
> uses "built-in" as a noun, I'm not yet sure the editor will let that go by).
>
> I've also tried to use 'callable' systematically instead of 'function'
> wherever other callables (types, bound-methods, etc) can be substituted
> in lieu of functions.  In documenting 2.2 or 2.3 only, I think such hedging
> is not warranted.  It's important, when feasible, to clarify what built-ins
> are types -- a type has MORE functionality than a function, after all (in
> particular, one can subclass it, while one can't subclass a function).

It's probably also worth noting that the dict type is not documented
anywhere, except as a function.

-Dave



From guido@python.org  Mon Sep 30 15:33:41 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 30 Sep 2002 10:33:41 -0400
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
In-Reply-To: Your message of "Mon, 30 Sep 2002 11:01:24 +0200."
 <00f701c2685f$f67611f0$0900a8c0@spiff>
References: <200209292339.g8TNdiv28330@oma.cosc.canterbury.ac.nz>
 <00f701c2685f$f67611f0$0900a8c0@spiff>
Message-ID: <200209301433.g8UEXfP14842@pcp02138704pcs.reston01.va.comcast.net>

[effbot]
> isn't the whole idea that with a special syntax, you can do some of
> the processing when compiling the script?  it's pretty pointless to
> invent more ways to call functions with string literals as
> arguments...

Not necessarily.  Domain-specific notations are useful with or without
compile-time processing, and sometimes the added noise of the function
call syntax + string literals can get in the way of readability.

(Hey, binary operators are [mostly] just another syntax for calling
functions, and around here we all agree that they're a good thing. :-)

That said, I'm not very enamored of the $x"foo" notation -- too much
line noise.  MAL's original minimalistic proposal (123x, or perhaps
also 123.456x, and maybe even 1.23e-456x) seems cleaner in cases where
it's applicable.  I don't expect Python will ever grow date/time or
(heaven forbid) IP address literals, and we already have r"regex"
literals.

> btw, the following note is slightly related to this topic, and has been
> generating some buzz lately (at least in my mailbox):
> 
>     http://effbot.org/zone/idea-xml-literal.htm

That looks interesting in a futuristic kind of way.  I'm curious why
you decided not to return fixed-type tuples of the form (tag, attrs,
content) -- that seems easier to deal with than having to deal with
both (tag, content) and (tag, attrs, content).  Tuples used as records
ought to have a fixed lay-out.

Parsing this would be tricky -- the tokenizer would have to know in
what state the parser is in order to tell when to switch to XML if it
sees a '<'.  And if you want to use a standard XML parser you'd have
to be careful to stop reading after the final '>'.

And what can this do that you can't do by putting it in a string
literal and feeding it to a convenience function?
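
A minimal sketch of such a convenience function, assuming Python 3 and
the standard xml.etree.ElementTree parser. The name xml_literal is
hypothetical, and the fixed (tag, attrs, content) layout is the one
suggested above, not the format described in the linked note.

```python
import xml.etree.ElementTree as ET


def xml_literal(source):
    """Parse an XML string into nested (tag, attrs, content) tuples."""
    def convert(elem):
        content = []
        if elem.text:
            content.append(elem.text)
        for child in elem:
            content.append(convert(child))
            if child.tail:
                # Text that follows the child inside the parent element.
                content.append(child.tail)
        # Always three fields, even when attrs or content are empty.
        return (elem.tag, dict(elem.attrib), content)

    return convert(ET.fromstring(source))


tree = xml_literal('<p class="x">hi <b>there</b>!</p>')
```

Every node has the same three fields, so callers can always unpack with
tag, attrs, content = node.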

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Mon Sep 30 15:58:03 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Sep 2002 10:58:03 -0400
Subject: [Python-Dev] Re: Documentation: type-vs.-function
In-Reply-To: <0cf201c2681a$2912c6d0$6501a8c0@boostconsulting.com>
References: <0cf201c2681a$2912c6d0$6501a8c0@boostconsulting.com>
 <0H3800KZ9NSXEZ@mtain01.icomcast.net>
Message-ID: <15768.26235.434551.130411@grendel.zope.com>

David Abrahams writes:
 > Does this distinction matter? A little, I think. Calling it a function
 > makes it sound like we're living in the past. Same goes for str, type,
 > list, tuple, et. al. I realize that the type (especially <type 'type'>)
 > acts like a function under many circumstances...

It definitely matters.

Alex Martelli writes:
 > It's important, when feasible, to clarify what built-ins are types
 > -- a type has MORE functionality than a function, after all (in
 > particular, one can subclass it, while one can't subclass a
 > function).

I agree.

The current somewhat-vague plan is to add a new section parallel to
the section on built-in functions that lists the built-in types
exposed in the __builtin__ module.  This would make it easier to
describe these types and their ability to be subclassed in a more
rational manner than in their current location.  Placeholder entries
will be maintained for the function entries so people accustomed to
looking in the current location won't be completely lost.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From guido@python.org  Mon Sep 30 16:00:48 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 30 Sep 2002 11:00:48 -0400
Subject: [Python-Dev] Snapshot win32all builds of interest?
In-Reply-To: Your message of "Mon, 30 Sep 2002 11:16:41 +1000."
 <LCEPIIGDJPKCOIHOBJEPCENPGKAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPCENPGKAA.mhammond@skippinet.com.au>
Message-ID: <200209301500.g8UF0mL18039@pcp02138704pcs.reston01.va.comcast.net>

> I'm wondering if there is any interest in me making regular,
> basically untested win32all builds against the current Python CVS
> tree?
> 
> It would be fairly simple for me to do - I run against CVS Python,
> so it is really just bundling up my latest built files into an
> installer .EXE.  I would only do it for the current CVS trunk - ie,
> no Python 2.2 or earlier builds in this form.  However, I would only
> bother if there were people willing to use it.  I figure that if
> there aren't people in this forum who would use it, I won't find
> them anywhere ;)

I think that would be useful, especially if the current 2.2-compatible
win32all does not work with 2.3, or if you have added features since
that was last released.

> OTOH, people in this forum using CVS Python on Windows may prefer to
> use CVS and build their own win32all - I really have no clue ;)

Not me.  I know how to build and install Python from source on
Windows, but setting up another project is a major pain for a Unix
weenie like me, so I'd much prefer a binary distribution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jepler@unpythonic.net  Mon Sep 30 16:16:12 2002
From: jepler@unpythonic.net (Jeff Epler)
Date: Mon, 30 Sep 2002 10:16:12 -0500
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
In-Reply-To: <00f701c2685f$f67611f0$0900a8c0@spiff>
References: <200209292339.g8TNdiv28330@oma.cosc.canterbury.ac.nz> <00f701c2685f$f67611f0$0900a8c0@spiff>
Message-ID: <20020930151601.GA20279@unpythonic.net>

On Mon, Sep 30, 2002 at 11:01:24AM +0200, Fredrik Lundh wrote:
> btw, the following note is slightly related to this topic, and has been
> generating some buzz lately (at least in my mailbox):
> 
>     http://effbot.org/zone/idea-xml-literal.htm

This is a little like what I implemented for 'pyhtml'.  It was intended
to be an extension to the Quixote templating system, so it used the idea
that a HTML tag embedded in the code should write itself directly to the
output, like the result of expression statements already does in
templates.

An excerpt from the README:
    The following code:

	<UL>
	    for i in range(10):
		<LI> i

    would output something like
	<UL><LI>0</LI><LI>1</LI>....<LI>9</LI></UL>

As you can see, I let <TAG> start a block, and let blocks end according
to Python's normal indentation rules.  The productions added to the
grammar were:

    compound_stmt: ... | tag_stmt
    tag_stmt: '<' NAME [tag_args] '>' suite
    tag_args: NAME '=' expr (',' NAME '=' expr)* [',']

so that
    <DIV CLASS="blue"> "this might be blue"
would also work.
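
The same output can be approximated in plain Python 3 without touching
the grammar. This is only a sketch under that assumption: the tag()
context manager and the out buffer are hypothetical and are not how
pyhtml itself was implemented.

```python
from contextlib import contextmanager

out = []  # collected output fragments


@contextmanager
def tag(name, **attrs):
    """Write an opening tag, run the block, then write the closing tag."""
    rendered = "".join(' %s="%s"' % (k, v) for k, v in attrs.items())
    out.append("<%s%s>" % (name, rendered))
    yield
    out.append("</%s>" % name)


with tag("UL"):
    for i in range(10):
        with tag("LI"):
            out.append(str(i))

html = "".join(out)
```

Here the with statement plays the role of the tag_stmt production: the
indented block is the tag's content, and leaving the block closes the tag.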

I thought it was rather cute to reverse the normal practice of finding a
way to shoehorn Python syntax into the midst of an HTML document, but
never wrote anything serious using pyhtml.  The remains of the project
can be seen at
    http://unpythonic.net/~jepler/falcon/pyhtml/

Jeff


From greg@cosc.canterbury.ac.nz  Mon Sep 30 23:22:39 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 Oct 2002 10:22:39 +1200 (NZST)
Subject: [Python-Dev] Re: User extendable literal modifiers ?!
In-Reply-To: <00f701c2685f$f67611f0$0900a8c0@spiff>
Message-ID: <200209302222.g8UMMda01522@oma.cosc.canterbury.ac.nz>

Fredrik Lundh <fredrik@pythonware.com>:

> isn't the whole idea that with a special syntax, you can do some of the
> processing when compiling the script?

I suppose the literal object could be precomputed when
compiling -- but how would you marshal it when saving
the bytecode?
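
A small illustration of the problem, assuming CPython 3 and its standard
marshal module. Fraction is used here only as a stand-in for a
user-defined literal type.

```python
import marshal
from fractions import Fraction

# A code object whose only constants are built-in types marshals fine;
# the string form of the would-be literal survives the round trip.
code = compile('x = "123/234"', "<demo>", "exec")
restored = marshal.loads(marshal.dumps(code))

# An arbitrary precomputed object is rejected by marshal.
try:
    marshal.dumps(Fraction(123, 234))
    marshalled = True
except ValueError:
    marshalled = False
```

So a precomputed object could not simply be stored among the constants of
a saved code object; something would have to rebuild it from its string
form at load time.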

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+