From collinw at gmail.com  Fri Sep  1 04:52:02 2006
From: collinw at gmail.com (Collin Winter)
Date: Thu, 31 Aug 2006 21:52:02 -0500
Subject: [Python-Dev] A test suite for unittest
Message-ID: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>

I've just uploaded a trio of unittest-related patches:

#1550272 (http://python.org/sf/1550272) is a test suite for the
mission-critical parts of unittest.

#1550273 (http://python.org/sf/1550273) fixes 6 issues uncovered while
writing the test suite. Several other items that I raised earlier
(http://mail.python.org/pipermail/python-dev/2006-August/068378.html)
were judged to be either non-issues or behaviours that, while
suboptimal, people have come to rely on.

#1550263 (http://python.org/sf/1550263) follows up on an earlier patch
I submitted for unittest's docs. This new patch corrects and clarifies
numerous sections of the module's documentation.

I'd appreciate it if these changes could make it into 2.5-final or at
least 2.5.1.

What follows is a list of the issues fixed in patch #1550273 (a short
illustrative sketch of issue 1 appears after the list):

1) TestLoader.loadTestsFromName() failed to return a
suite when resolving a name to a callable that returns
a TestCase instance.

2) Fix a bug in both TestSuite.addTest() and
TestSuite.addTests() concerning a lack of input
checking on the input test case(s)/suite(s).

3) Fix a bug in both TestLoader.loadTestsFromName() and
TestLoader.loadTestsFromNames() that had ValueError
being raised instead of TypeError. The problem occurred
when the given name resolved to a callable and the
callable returned something of the wrong type.

4) When a name resolves to a method on a TestCase
subclass, TestLoader.loadTestsFromName() did not return
a suite as promised.

5) TestLoader.loadTestsFromName() would raise a
ValueError (rather than a TypeError) if a name resolved
to an invalid object. This has been fixed so that a
TypeError is raised.

6) TestResult.shouldStop was being initialised to 0 in
TestResult.__init__. Since this attribute is always
used in a boolean context, it's better to use the False
spelling.
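For illustration, here is a minimal sketch of the behaviour described in
issue 1; the class and function names below are made up for the example:

    import sys
    import unittest

    class Sample(unittest.TestCase):
        def test_nothing(self):
            pass

    def make_case():
        # a callable that returns a TestCase instance (issue 1)
        return Sample('test_nothing')

    loader = unittest.TestLoader()
    result = loader.loadTestsFromName('make_case', sys.modules[__name__])
    # The docs promise a suite here; before the fix the bare TestCase
    # instance came back instead of being wrapped in a TestSuite.
    print isinstance(result, unittest.TestSuite)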

Thanks,
Collin Winter

From fdrake at acm.org  Fri Sep  1 06:02:59 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 1 Sep 2006 00:02:59 -0400
Subject: [Python-Dev] A test suite for unittest
In-Reply-To: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>
References: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>
Message-ID: <200609010003.00418.fdrake@acm.org>

On Thursday 31 August 2006 22:52, Collin Winter wrote:
 > I've just uploaded a trio of unittest-related patches:

Thanks, Collin!

 > #1550272 (http://python.org/sf/1550272) is a test suite for the
 > mission-critical parts of unittest.
 >
 > #1550273 (http://python.org/sf/1550273) fixes 6 issues uncovered while
 > writing the test suite. Several other items that I raised earlier
 > (http://mail.python.org/pipermail/python-dev/2006-August/068378.html)
 > were judged to be either non-issues or behaviours that, while
 > suboptimal, people have come to rely on.

I'm hesitant to commit even tests at this point (the release candidate has 
already been released, and there's no plan for a second).  I've not reviewed 
the patches.

 > #1550263 (http://python.org/sf/1550263) follows up on an earlier patch
 > I submitted for unittest's docs. This new patch corrects and clarifies
 > numerous sections of the module's documentation.

Anthony did approve documentation changes for 2.5, so I've committed this for 
2.5 and on the trunk (2.6).  These should be considered for 2.4.4 as well.  
(The other two may be appropriate as well.)


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From anthony at interlink.com.au  Fri Sep  1 06:35:19 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 1 Sep 2006 14:35:19 +1000
Subject: [Python-Dev] A test suite for unittest
In-Reply-To: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>
References: <43aa6ff70608311952jd4cbb8ena17594458d480e8e@mail.gmail.com>
Message-ID: <200609011435.21060.anthony@interlink.com.au>

At this point, I'd say the documentation patches should go in - the other 
patches are probably appropriate for 2.5.1. I only want to accept critical 
patches between now and 2.5 final. 

Thanks for the patches (and particularly for the unittest test suite! woooooo!)

Anthony

From fredrik at pythonware.com  Fri Sep  1 10:08:18 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 1 Sep 2006 10:08:18 +0200
Subject: [Python-Dev] That library reference, yet again
References: <8233478f0608311255o7058a1feo55c710e7eb8e6b6c@mail.gmail.com>
Message-ID: <ed8ppi$1kf$1@sea.gmane.org>

"Johann C. Rocholl" wrote:

> What is the status of http://effbot.org/lib/ ?
>
> I think it's a step in the right direction. Is it still in progress?

the pushback from the powers-that-be was massive, so we're currently working "under
the radar", using alternative deployment approaches (see pytut.infogami.com and friends).

</F> 




From jimjjewett at gmail.com  Fri Sep  1 15:31:36 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 1 Sep 2006 09:31:36 -0400
Subject: [Python-Dev] Fwd: [Python-checkins] r51674 -
	python/trunk/Misc/Vim/vimrc
In-Reply-To: <20060831224237.B872C1E4002@bag.python.org>
References: <20060831224237.B872C1E4002@bag.python.org>
Message-ID: <fb6fbf560609010631l3e83717eh5ff4e4e79b64dd7e@mail.gmail.com>

This 8 vs 4 is getting cruftier and cruftier.  (And does it deal
properly with existing code that already has four spaces because it
was written recently?)

"Tim" regularly fixes whitespace already, with little damage.

Would it make sense to do a one-time cutover on the 2.6 trunk?
How about the bugfix branches?

If it is ever going to happen, then immediately after a release,
before unfreezing, is probably the best time.

-jJ

---------- Forwarded message ----------
From: brett.cannon <python-checkins at python.org>
Date: Aug 31, 2006 6:42 PM
Subject: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc
To: python-checkins at python.org


Author: brett.cannon
Date: Fri Sep  1 00:42:37 2006
New Revision: 51674

Modified:
   python/trunk/Misc/Vim/vimrc
Log:
Have pre-existing C files use 8 spaces indents (to match old PEP 7 style), but
have all new files use 4 spaces (to match current PEP 7 style).


Modified: python/trunk/Misc/Vim/vimrc
==============================================================================
--- python/trunk/Misc/Vim/vimrc (original)
+++ python/trunk/Misc/Vim/vimrc Fri Sep  1 00:42:37 2006
@@ -19,9 +19,10 @@
 " Number of spaces to use for an indent.
 " This will affect Ctrl-T and 'autoindent'.
 " Python: 4 spaces
-" C: 4 spaces
+" C: 8 spaces (pre-existing files) or 4 spaces (new files)
 au BufRead,BufNewFile *.py,*pyw set shiftwidth=4
-au BufRead,BufNewFile *.c,*.h set shiftwidth=4
+au BufRead *.c,*.h set shiftwidth=8
+au BufNewFile *.c,*.h set shiftwidth=4

 " Number of spaces that a pre-existing tab is equal to.
 " For the amount of space used for a new tab use shiftwidth.
_______________________________________________
Python-checkins mailing list
Python-checkins at python.org
http://mail.python.org/mailman/listinfo/python-checkins

From guido at python.org  Fri Sep  1 17:02:37 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 1 Sep 2006 08:02:37 -0700
Subject: [Python-Dev] Fwd: [Python-checkins] r51674 -
	python/trunk/Misc/Vim/vimrc
In-Reply-To: <fb6fbf560609010631l3e83717eh5ff4e4e79b64dd7e@mail.gmail.com>
References: <20060831224237.B872C1E4002@bag.python.org>
	<fb6fbf560609010631l3e83717eh5ff4e4e79b64dd7e@mail.gmail.com>
Message-ID: <ca471dc20609010802v5e4b51cdn770cc248264c6700@mail.gmail.com>

For 2.x we really don't want to reformat all code. I even think it's
questionable to use 4 spaces for new files since it will mean problems
for editors switching between files.

For 3.0 we really do. But as long as 2.x and 3.0 aren't too far apart
I'd rather not reformat everything because it would break all merge
capabilities.

--Guido

On 9/1/06, Jim Jewett <jimjjewett at gmail.com> wrote:
> This 8 vs 4 is getting cruftier and cruftier.  (And does it deal
> properly with existing code that already has four spaces because it
> was written recently?)
>
> "Tim" regularly fixes whitespace already, with little damage.
>
> Would it make sense to do a one-time cutover on the 2.6 trunk?
> How about the bugfix branches?
>
> If it is ever going to happen, then immediately after a release,
> before unfreezing, is probably the best time.
>
> -jJ
>
> ---------- Forwarded message ----------
> From: brett.cannon <python-checkins at python.org>
> Date: Aug 31, 2006 6:42 PM
> Subject: [Python-checkins] r51674 - python/trunk/Misc/Vim/vimrc
> To: python-checkins at python.org
>
>
> Author: brett.cannon
> Date: Fri Sep  1 00:42:37 2006
> New Revision: 51674
>
> Modified:
>    python/trunk/Misc/Vim/vimrc
> Log:
> Have pre-existing C files use 8 spaces indents (to match old PEP 7 style), but
> have all new files use 4 spaces (to match current PEP 7 style).
>
>
> Modified: python/trunk/Misc/Vim/vimrc
> ==============================================================================
> --- python/trunk/Misc/Vim/vimrc (original)
> +++ python/trunk/Misc/Vim/vimrc Fri Sep  1 00:42:37 2006
> @@ -19,9 +19,10 @@
>  " Number of spaces to use for an indent.
>  " This will affect Ctrl-T and 'autoindent'.
>  " Python: 4 spaces
> -" C: 4 spaces
> +" C: 8 spaces (pre-existing files) or 4 spaces (new files)
>  au BufRead,BufNewFile *.py,*pyw set shiftwidth=4
> -au BufRead,BufNewFile *.c,*.h set shiftwidth=4
> +au BufRead *.c,*.h set shiftwidth=8
> +au BufNewFile *.c,*.h set shiftwidth=4
>
>  " Number of spaces that a pre-existing tab is equal to.
>  " For the amount of space used for a new tab use shiftwidth.
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rhettinger at ewtllc.com  Fri Sep  1 19:56:17 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 01 Sep 2006 10:56:17 -0700
Subject: [Python-Dev] Py2.5 issue: decimal context manager
 misimplemented, misdesigned, and misdocumented
In-Reply-To: <44F6D12C.4040808@gmail.com>
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
	<44F6D12C.4040808@gmail.com>
Message-ID: <44F87441.7060203@ewtllc.com>


>>> The right way to do it was presented in PEP343.  The implementation 
>>> was correct and the API was simple.
>>
>>
>>
>> Raymond's persuaded me that he's right on the API part at the very 
>> least. The current API was a mechanical replacement of the initial 
>> __context__ based API with a normal method, whereas I should have 
>> reverted back to the module-level localcontext() function from PEP343 
>> and thrown the method on Context objects away entirely.
>>
>> I can fix it on the trunk (and add those missing tests!), but I'll 
>> need Anthony and/or Neal's permission to backport it and remove the 
>> get_manager() method from Python 2.5 before we get stuck with it 
>> forever.
>
>
>
> I committed this fix as 51664 on the trunk (although the docstrings 
> are still example free because doctest doesn't understand __future__ 
> statements).
>

Thanks for getting this done.

Please make the following changes:
* rename ContextManager to _ContextManager and remove it from the __all__ 
listing
* move the copy() step from localcontext() to _ContextManager()
* make the trivial updates to the whatsnew25 example

Once those nits are fixed, I recommend this patch be backported to the 
Py2.5 release.
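For reference, a short usage sketch of the module-level localcontext()
API from PEP 343 (under 2.5 the with-statement still needs the
__future__ import):

    from __future__ import with_statement
    import decimal

    # Work on a copy of the current context inside the block; the
    # original context is restored automatically on exit.
    with decimal.localcontext() as ctx:
        ctx.prec = 42
        s = decimal.Decimal(1) / decimal.Decimal(7)

    print s      # computed with 42 digits of precision
    print +s     # unary plus re-rounds to the restored default precision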


Raymond

From rhettinger at ewtllc.com  Sat Sep  2 01:47:21 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 01 Sep 2006 16:47:21 -0700
Subject: [Python-Dev] Problem with the API for str.rpartition()
Message-ID: <44F8C689.6050804@ewtllc.com>

Currently, both the partition() and rpartition() methods return a (head, 
sep, tail) tuple and the only difference between the two is whether the 
partition element search starts from the beginning or end of the 
string.  When no separator is found, both methods return the string S 
and two empty strings so that 'a'.partition('x') == 'a'.rpartition('x') 
== ('a', '', ''). 

For rpartition() the notion of head and tail is backwards -- you 
repeatedly search the tail, not the head.  The distinction is vital 
because the use cases for rpartition() are a mirror image of those for 
partition().  Accordingly, rpartition()'s result should be interpreted 
as (tail, sep, head), and the partition-not-found end case needs to 
change so that 'a'.rpartition('x') == ('', '', 'a').

The test invariant should be:
    For every s and p:    s.partition(p) == s[::-1].rpartition(p)[::-1]

The following code demonstrates why the current choice is problematic:

    line = 'a.b.c.d'
    while line:
        field, sep, line = line.partition('.')
        print field

    line = 'a.b.c.d'
    while line:
        line, sep, field = line.rpartition('.')
        print field

The second fragment never terminates.
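A pure-Python sketch of the proposed end case (when the separator is not
found, the remainder lands in the last slot) shows the second loop
terminating:

    def rpartition_proposed(s, sep):
        # proposed: 'a'.rpartition('x') == ('', '', 'a')
        i = s.rfind(sep)
        if i < 0:
            return ('', '', s)
        return (s[:i], sep, s[i + len(sep):])

    line = 'a.b.c.d'
    while line:
        line, sep, field = rpartition_proposed(line, '.')
        print field        # prints d, c, b, a and then the loop stops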

Since this is a critical API flaw rather than an implementation bug, I 
think it should get fixed right away rather than waiting for Py2.5.1.



Raymond

From guido at python.org  Sat Sep  2 02:04:12 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 1 Sep 2006 17:04:12 -0700
Subject: [Python-Dev] Problem with the API for str.rpartition()
In-Reply-To: <44F8C689.6050804@ewtllc.com>
References: <44F8C689.6050804@ewtllc.com>
Message-ID: <ca471dc20609011704u87aba2epfb5a2f7482132b1f@mail.gmail.com>

+1

On 9/1/06, Raymond Hettinger <rhettinger at ewtllc.com> wrote:
> Currently, both the partition() and rpartition() methods return a (head,
> sep, tail) tuple and the only difference between the two is whether the
> partition element search starts from the beginning or end of the
> string.  When no separator is found, both methods return the string S
> and two empty strings so that 'a'.partition('x') == 'a'.rpartition('x')
> == ('a', '', '').
>
> For rpartition() the notion of head and tail are backwards -- you
> repeatedly search the tail, not the head.  The distinction is vital
> because the use cases for rpartition() are a mirror image of those for
> partition().  Accordingly, rpartition()'s result should be interpreted
> as (tail, sep, head) and the partition-not-found endcase needs change so
> that 'a'.rpartition('x') == ('', '', 'a') .
>
> The test invariant should be:
>     For every s and p:    s.partition(p) == s[::-1].rpartition(p)[::-1]
>
> The following code demonstrates why the current choice is problematic:
>
>     line = 'a.b.c.d'
>     while line:
>         field, sep, line = line.partition('.')
>         print field
>
>     line = 'a.b.c.d'
>     while line:
>         line, sep, field = line.rpartition('.')
>         print field
>
> The second fragment never terminates.
>
> Since this is a critical API flaw rather than a implementation bug, I
> think it should get fixed right away rather than waiting for Py2.5.1.
>
>
>
> Raymond
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From kbk at shore.net  Sat Sep  2 03:28:44 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri, 1 Sep 2006 21:28:44 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200609020128.k821SicT001270@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  412 open ( +5) /  3397 closed ( +4) /  3809 total ( +9)
Bugs    :  900 open (+12) /  6149 closed ( +4) /  7049 total (+16)
RFE     :  233 open ( +1) /   236 closed ( +0) /   469 total ( +1)

New / Reopened Patches
______________________

set literals  (2006-08-28)
CLOSED http://python.org/sf/1547796  opened by  Georg Brandl

"for x in setliteral" peepholer optimization  (2006-08-28)
CLOSED http://python.org/sf/1548082  opened by  Georg Brandl

set comprehensions  (2006-08-29)
       http://python.org/sf/1548388  opened by  Georg Brandl

Fix for structmember conversion issues  (2006-08-29)
       http://python.org/sf/1549049  opened by  Roger Upole

Implementation of PEP 3102 Keyword Only Argument  (2006-08-31)
       http://python.org/sf/1549670  opened by  Jiwon Seo

Add a test suite for test_unittest  (2006-08-31)
       http://python.org/sf/1550272  opened by  Collin Winter

Fix numerous bugs in unittest  (2006-08-31)
       http://python.org/sf/1550273  opened by  Collin Winter

Ellipsis literal "..."  (2006-09-01)
       http://python.org/sf/1550786  opened by  Georg Brandl

make exec a function  (2006-09-01)
       http://python.org/sf/1550800  opened by  Georg Brandl

Patches Closed
______________

Allow os.listdir to accept file names longer than MAX_PATH  (2006-04-26)
       http://python.org/sf/1477350  closed by  rupole

set literals  (2006-08-28)
       http://python.org/sf/1547796  closed by  gbrandl

pybench.py error reporting broken for bad -s filename  (2006-08-25)
       http://python.org/sf/1546372  closed by  lemburg

"if x in setliteral" peepholer optimization  (2006-08-28)
       http://python.org/sf/1548082  closed by  gvanrossum

New / Reopened Bugs
___________________

Typo in Language Reference Section 3.2 Class Instances  (2006-08-28)
       http://python.org/sf/1547931  opened by  whesse_at_clarkson

curses module segfaults on invalid tparm arguments  (2006-08-28)
       http://python.org/sf/1548092  opened by  Marien Zwart

Add 'find' method to sequence types  (2006-08-28)
       http://python.org/sf/1548178  opened by  kovan

Recursion limit exceeded in the match function  (2006-08-29)
CLOSED http://python.org/sf/1548252  opened by  wojtekwu

sgmllib.sgmlparser is not thread safe  (2006-08-28)
       http://python.org/sf/1548288  opened by  Andres Riancho

whichdb too dumb  (2006-08-28)
       http://python.org/sf/1548332  opened by  Curtis Doty

filterwarnings('error') has no effect  (2006-08-29)
       http://python.org/sf/1548371  opened by  Roger Upole

C modules reloaded on certain failed imports  (2006-08-29)
       http://python.org/sf/1548687  opened by  Josiah Carlson

shlex (or perhaps cStringIO) and unicode strings  (2006-08-29)
       http://python.org/sf/1548891  opened by  Erwin S. Andreasen

bug in classlevel variabels  (2006-08-30)
CLOSED http://python.org/sf/1549499  opened by  Thomas Dybdahl Ahle

Pdb parser bug  (2006-08-30)
       http://python.org/sf/1549574  opened by  Alexander Belopolsky

urlparse return exchanged values  (2006-08-30)
CLOSED http://python.org/sf/1549589  opened by  Oscar Acena

Enhance and correct unittest's docs (redux)  (2006-08-31)
       http://python.org/sf/1550263  reopened by  fdrake

Enhance and correct unittest's docs (redux)  (2006-08-31)
       http://python.org/sf/1550263  opened by  Collin Winter

inspect module and class startlineno  (2006-09-01)
       http://python.org/sf/1550524  opened by  Ali Gholami Rudi

SWIG wrappers incompatible with 2.5c1  (2006-09-01)
       http://python.org/sf/1550559  opened by  Andrew Gregory

itertools.tee raises SystemError  (2006-09-01)
       http://python.org/sf/1550714  opened by  Alexander Belopolsky

itertools.tee raises SystemError  (2006-09-01)
CLOSED http://python.org/sf/1550761  opened by  Alexander Belopolsky

Bugs Closed
___________

x!=y and [x]=[y] (!)  (2006-08-22)
       http://python.org/sf/1544762  closed by  rhettinger

Recursion limit exceeded in the match function  (2006-08-29)
       http://python.org/sf/1548252  closed by  gbrandl

bug in classlevel variabels  (2006-08-30)
       http://python.org/sf/1549499  closed by  gbrandl

urlparse return exchanged values  (2006-08-30)
       http://python.org/sf/1549589  closed by  gbrandl

Enhance and correct unittest's docs (redux)  (2006-08-31)
       http://python.org/sf/1550263  closed by  fdrake

itertools.tee raises SystemError  (2006-09-01)
       http://python.org/sf/1550761  deleted by  belopolsky


From ncoghlan at gmail.com  Sat Sep  2 06:47:30 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 02 Sep 2006 14:47:30 +1000
Subject: [Python-Dev] Py2.5 issue: decimal context manager
 misimplemented, misdesigned, and misdocumented
In-Reply-To: <44F715DC.1090001@ewtllc.com>
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
	<44F715DC.1090001@ewtllc.com>
Message-ID: <44F90CE2.2050200@gmail.com>

Raymond Hettinger wrote:
> Please go ahead and get the patch together for localcontext().  This 
> should be an easy sell:
> 
> * simple bugs can be fixed in Py2.5.1 but API mistakes are forever.
> * currently, all of the docs, docstrings, and whatsnew are incorrect.
> * the solution has already been worked-out in PEP343 -- it's nothing new.
> * nothing else, anywhere depends on this code -- it is as safe a change 
> as we could hope for.
> 
> Neal is tough, but he's not heartless ;-)

I backported the changes and assigned the patch to Neal:
http://www.python.org/sf/1550886

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From gjcarneiro at gmail.com  Sat Sep  2 14:10:04 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 2 Sep 2006 13:10:04 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>

We have to resort to timeouts in pygtk in order to catch unix signals
in threaded mode.
The reason is this.  We call gtk_main() (mainloop function) which
blocks forever.  Suppose there are threads in the program; then any
thread can receive a signal (e.g. SIGINT).  Python catches the signal,
but doesn't do anything; it simply sets a flag in a global structure
and calls Py_AddPendingCall(), and I guess it expects someone to call
Py_MakePendingCalls().  However, the main thread is blocked calling a
C function and has no way of being notified it needs to give control
back to python to handle the signal.  Hence, we use a 100ms timeout
for polling.  Unfortunately, timeouts needlessly consume CPU time and
drain laptop batteries.
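For concreteness, a sketch of that polling workaround, assuming PyGTK's
gobject module is available:

    import gobject

    def let_python_run():
        # Returning control to the interpreter is the whole point:
        # pending Python signal handlers get a chance to run here.
        return True              # keep the timeout installed

    gobject.timeout_add(100, let_python_run)   # wake up every 100 ms
    gobject.MainLoop().run()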

According to [1], all python needs to do to avoid this problem is
block all signals in all but the main thread; then we can guarantee
signal handlers are always called from the main thread, and pygtk
doesn't need a timeout.

Another alternative would be to add a new API like
Py_AddPendingCallNotification, which would let python notify
extensions that new pending calls exist and need to be processed.

  But I would really prefer the first alternative, as it could be
fixed within python 2.5; no need to wait for 2.6.

  Please, let's make Python ready for the enterprise! [2]

[1] https://bugzilla.redhat.com/bugzilla/process_bug.cgi#c3
[2] http://perkypants.org/blog/2006/09/02/rfte-python/

From nmm1 at cus.cam.ac.uk  Sat Sep  2 15:02:43 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sat, 02 Sep 2006 14:02:43 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: Your message of "Sat, 02 Sep 2006 13:10:04 BST."
	<a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com> 
Message-ID: <E1GJV91-00041F-OF@draco.cus.cam.ac.uk>

"Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
> 
> We have to resort to timeouts in pygtk in order to catch unix signals
> in threaded mode.

A common defect of modern designs - TCP/IP is particularly objectionable
in this respect, but that battle was lost and won over two decades ago :-(

> The reason is this.  We call gtk_main() (mainloop function) which
> blocks forever.  Suppose there are threads in the program; then any
> thread can receive a signal (e.g. SIGINT).  Python catches the signal,
> but doesn't do anything; it simply sets a flag in a global structure
> and calls Py_AddPendingCall(), and I guess it expects someone to call
> Py_MakePendingCalls().  However, the main thread is blocked calling a
> C function and has no way of being notified it needs to give control
> back to python to handle the signal.  Hence, we use a 100ms timeout
> for polling.  Unfortunately, timeouts needlessly consume CPU time and
> drain laptop batteries.

Yup.

> According to [1], all python needs to do to avoid this problem is
> block all signals in all but the main thread; then we can guarantee
> signal handlers are always called from the main thread, and pygtk
> doesn't need a timeout.

1) That page is password protected, so I can't see what it says, and
am disinclined to register myself to yet another such site.

2) No way, Jose, anyway.  The POSIX signal handling model was broken
beyond redemption, even before threading was added, and the combination
is evil almost beyond belief.  That procedure is good practice, yes,
but that is NOT all that you have to do - it may be all that you CAN
do, but that is not the same.

Come back MVS (or even VMS) - all is forgiven!  That is only partly
a joke.

> Another alternative would be to add a new API like
> Py_AddPendingCallNotification, which would let python notify
> extensions that new pending calls exist and need to be processed.

Nope.  Sorry, but you can't solve a broken design by adding interfaces.

>   But I would really prefer the first alternative, as it could be
> fixed within python 2.5; no need to wait for 2.6.

It clearly should be done, assuming that Python's model is that it
doesn't want to get involved with subthread signalling (and I really,
but REALLY, recommend not doing so).  The best that can be done is to
say that all signal handling is the business of the main thread and
that, when the system bypasses that, all bets are off.

>   Please, let's make Python ready for the enterprise! [2]

Given that no Unix variant or Microsoft system is, isn't that rather
an unreasonable demand?

I am probably one of the last half-dozen people still employed in a
technical capacity who has implemented run-time systems that supported
user-level signal handling with threads/asynchronicity and allowing
for signals received while in system calls.  It would be possible to
modify/extend POSIX or Microsoft designs to support this, but currently
they don't make it possible.  There is NOTHING that Python can do but
to minimise the chaos.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From jjl at pobox.com  Sat Sep  2 17:01:52 2006
From: jjl at pobox.com (John J Lee)
Date: Sat, 2 Sep 2006 15:01:52 +0000 (UTC)
Subject: [Python-Dev] Py2.5 issue: decimal context manager
 misimplemented, misdesigned, and misdocumented
In-Reply-To: <44F6D12C.4040808@gmail.com>
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
	<44F6D12C.4040808@gmail.com>
Message-ID: <Pine.LNX.4.64.0609021458010.813@localhost>

On Thu, 31 Aug 2006, Nick Coghlan wrote:
[...]
> I committed this fix as 51664 on the trunk (although the docstrings are still
> example free because doctest doesn't understand __future__ statements).
[...]

Assuming doctest doesn't try to parse the Python code when SKIP is 
specified, I guess this would solve that little problem:

http://docs.python.org/dev/lib/doctest-options.html

"""
SKIP

     When specified, do not run the example at all. This can be useful in 
contexts where doctest examples serve as both documentation and test 
cases, and an example should be included for documentation purposes, but 
should not be checked. E.g., the example's output might be random; or the 
example might depend on resources which would be unavailable to the test 
driver.

     The SKIP flag can also be used for temporarily "commenting out" 
examples.

...

Changed in version 2.5: Constant SKIP was added.
"""


John


From ncoghlan at gmail.com  Sat Sep  2 17:27:03 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 03 Sep 2006 01:27:03 +1000
Subject: [Python-Dev] Py2.5 issue: decimal context manager
 misimplemented, misdesigned, and misdocumented
In-Reply-To: <Pine.LNX.4.64.0609021458010.813@localhost>
References: <44F4D9D2.2040804@ewtllc.com>
	<44F6B524.6060504@gmail.com>	<44F6D12C.4040808@gmail.com>
	<Pine.LNX.4.64.0609021458010.813@localhost>
Message-ID: <44F9A2C7.5060803@gmail.com>

John J Lee wrote:
> On Thu, 31 Aug 2006, Nick Coghlan wrote:
> [...]
>> I committed this fix as 51664 on the trunk (although the docstrings are still
>> example free because doctest doesn't understand __future__ statements).
> [...]
> 
> Assuming doctest doesn't try to parse the Python code when SKIP is 
> specified, I guess this would solve that little problem:
> 
> http://docs.python.org/dev/lib/doctest-options.html
> 
> """
> SKIP

A quick experiment suggests that using SKIP will solve the problem - fixing 
that can wait until 2.5.1 though. The localcontext() docstring does actually 
contain an example - it just isn't in a form that doctest will try to execute.
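For the record, a tiny sketch of how a SKIP-annotated example behaves
(illustrative names only; not the actual decimal docstring text):

    import doctest

    def localcontext_sketch():
        """Illustrative docstring.

        >>> with localcontext_sketch():     # doctest: +SKIP
        ...     do_something_with_42_digits()
        """

    if __name__ == '__main__':
        doctest.testmod()   # the skipped example is neither compiled nor run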

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From alan.mcintyre at gmail.com  Sat Sep  2 18:31:54 2006
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Sat, 02 Sep 2006 12:31:54 -0400
Subject: [Python-Dev] Windows build slave down until Tuesday-ish
Message-ID: <44F9B1FA.4010305@gmail.com>

The "x86 XP trunk" build slave will be down for a bit longer,
unfortunately.  Tropical storm Ernesto got in the way of my DSL
installation - I don't have a new install date yet, but I'm assuming
it's going to be Tuesday or later.

Alan

From gjcarneiro at gmail.com  Sat Sep  2 18:39:51 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 2 Sep 2006 17:39:51 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GJV91-00041F-OF@draco.cus.cam.ac.uk>
References: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
	<E1GJV91-00041F-OF@draco.cus.cam.ac.uk>
Message-ID: <a467ca4f0609020939s1c453e42wae9bfbc8f9af4dab@mail.gmail.com>

On 9/2/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
> > According to [1], all python needs to do to avoid this problem is
> > block all signals in all but the main thread; then we can guarantee
> > signal handlers are always called from the main thread, and pygtk
> > doesn't need a timeout.
>
> 1) That page is password protected, so I can't see what it says, and
> am disinclined to register myself to yet another such site.

  Oh, sorry, here's the comment:

(comment by Arjan van de Ven):
| afaik the kernel only sends signals to threads that don't have them blocked.
| If python doesn't want anyone but the main thread to get signals, it should just
| block signals on all but the main thread and then by nature, all signals will go
| to the main thread....


> 2) No way, Jose, anyway.  The POSIX signal handling model was broken
> beyond redemption, even before threading was added, and the combination
> is evil almost beyond belief.  That procedure is good practice, yes,
> but that is NOT all that you have to do - it may be all that you CAN
> do, but that is not the same.
>
> Nope.  Sorry, but you can't solve a broken design by adding interfaces.

  Well, Python has a broken design too; it postpones tasks and expects
to magically regain control in order to finish the job.  That often
doesn't happen!

>
> >   But I would really prefer the first alternative, as it could be
> > fixed within python 2.5; no need to wait for 2.6.
>
> It clearly should be done, assuming that Python's model is that it
> doesn't want to get involved with subthread signalling (and I really,
> but REALLY, recommend not doing so).  The best that can be done is to
> say that all signal handling is the business of the main thread and
> that, when the system bypasses that, all bets are off.

  Python is halfway there; it assumes signals are to be handled in the
main thread.  However, it _catches_ them in any thread, sets a flag,
and just waits for the next opportunity when it runs again in the main
thread.  It is precisely this "split handling" of signals that is
failing now.

  Anyway, attached is a patch that should fix the problem on POSIX
threads systems, in case anyone wants to review it.

  Cheers.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pythreads.diff
Type: text/x-patch
Size: 1030 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060902/b91250df/attachment.bin 

From raymond.hettinger at verizon.net  Sat Sep  2 19:11:58 2006
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 02 Sep 2006 10:11:58 -0700
Subject: [Python-Dev] Py2.5 issue: decimal context manager
 misimplemented, misdesigned, and misdocumented
References: <44F4D9D2.2040804@ewtllc.com> <44F6B524.6060504@gmail.com>
	<44F715DC.1090001@ewtllc.com> <44F90CE2.2050200@gmail.com>
	<ee2a432c0609012200k50684c24k46e1eb9f79e4bc28@mail.gmail.com>
Message-ID: <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1>

[Neal]
> Please review the patch and make a comment.  I did a diff between HEAD
> and 2.4 and am fine with this going in once you are happy.

I fixed a couple of documentation nits in rev 51688.
The patch is ready-to-go.
Nick, please go ahead and backport.


Raymond


From nmm1 at cus.cam.ac.uk  Sat Sep  2 20:41:59 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sat, 02 Sep 2006 19:41:59 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: Your message of "Sat, 02 Sep 2006 17:39:51 BST."
	<a467ca4f0609020939s1c453e42wae9bfbc8f9af4dab@mail.gmail.com> 
Message-ID: <E1GJaRL-0003Pm-KT@virgo.cus.cam.ac.uk>

"Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
> 
>   Oh, sorry, here's the comment:
> 
> (comment by Arjan van de Ven):
> | afaik the kernel only sends signals to threads that don't have them blocked.
> | If python doesn't want anyone but the main thread to get signals, it should just
> | block signals on all but the main thread and then by nature, all signals will go
> | to the main thread....

Well, THAT'S wrong, I am afraid!  Things ain't that simple :-(

Yes, POSIX implies that things work that way, but there are so many
get-out clauses and problems with trying to implement that specification
that such behaviour can't be relied on.

>   Well, Python has a broken design too; it postpones tasks and expects
> to magically regain control in order to finish the job.  That often
> doesn't happen!

Very true.  And that is another problem with POSIX :-(

>   Python is halfway there; it assumes signals are to be handled in the
> main thread.  However, it _catches_ them in any thread, sets a flag,
> and just waits for the next opportunity when it runs again in the main
> thread.  It is precisely this "split handling" of signals that is
> failing now. 

I agree that is not how to do it, but that code should not be removed.
Despite best attempts, there may well be circumstances under which
signals are received in a subthread, despite all attempts of the
program to ensure that the main thread gets them.

>   Anyway, attached a patch that should fix the problem in posix
> threads systems, in case anyone wants to review.

Not "fix" - "improve" :-)

I haven't looked at it, but I agree that what you have said is the
way to proceed.  The best solution is to enable the main thread for
all relevant signals, disable all subthreads, but to not rely on
any of that working in all cases.

It won't help with the problem where merely receiving a signal causes
chaos, or where blocking them does so, but there is nothing that Python
can do about that, in general.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From anthony at interlink.com.au  Sun Sep  3 05:58:40 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sun, 3 Sep 2006 13:58:40 +1000
Subject: [Python-Dev] Py2.5 issue: decimal context manager
	misimplemented, misdesigned, and misdocumented
In-Reply-To: <006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1>
References: <44F4D9D2.2040804@ewtllc.com>
	<ee2a432c0609012200k50684c24k46e1eb9f79e4bc28@mail.gmail.com>
	<006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1>
Message-ID: <200609031358.42774.anthony@interlink.com.au>

On Sunday 03 September 2006 03:11, Raymond Hettinger wrote:
> [Neal]
>
> > Please review the patch and make a comment.  I did a diff between HEAD
> > and 2.4 and am fine with this going in once you are happy.
>
> I fixed a couple of documentation nits in rev 51688.
> The patch is ready-to-go.
> Nick, please go ahead and backport.

I think this is suitable for 2.5. I'm thinking, though, that we need a second 
release candidate, given the number of changes since rc1. 



-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From aahz at pythoncraft.com  Sun Sep  3 06:06:27 2006
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 2 Sep 2006 21:06:27 -0700
Subject: [Python-Dev] Py2.5 issue: decimal context manager
	misimplemented, misdesigned, and misdocumented
In-Reply-To: <200609031358.42774.anthony@interlink.com.au>
References: <44F4D9D2.2040804@ewtllc.com>
	<ee2a432c0609012200k50684c24k46e1eb9f79e4bc28@mail.gmail.com>
	<006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1>
	<200609031358.42774.anthony@interlink.com.au>
Message-ID: <20060903040627.GA21743@panix.com>

On Sun, Sep 03, 2006, Anthony Baxter wrote:
>
> I think this is suitable for 2.5. I'm thinking, though, that we need  
> a second release candidate, given the number of changes since rc1.    

+1
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

I support the RKAB

From fdrake at acm.org  Sun Sep  3 07:01:50 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 3 Sep 2006 01:01:50 -0400
Subject: [Python-Dev] Py2.5 issue: decimal context manager
	misimplemented, misdesigned, and misdocumented
In-Reply-To: <200609031358.42774.anthony@interlink.com.au>
References: <44F4D9D2.2040804@ewtllc.com>
	<006901c6ceb2$e692f8d0$4c00000a@RaymondLaptop1>
	<200609031358.42774.anthony@interlink.com.au>
Message-ID: <200609030101.51129.fdrake@acm.org>

On Saturday 02 September 2006 23:58, Anthony Baxter wrote:
 > I think this is suitable for 2.5. I'm thinking, though, that we need a
 > second release candidate, given the number of changes since rc1.

+1


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From chrism at plope.com  Mon Sep  4 04:36:23 2006
From: chrism at plope.com (Chris McDonough)
Date: Sun, 3 Sep 2006 22:36:23 -0400
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
References: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
Message-ID: <B43ABC30-1C65-4D4D-9D76-F7D583C0C1B7@plope.com>

Would adding an API for sigprocmask help here?

(Although it has been tried before --
http://mail.python.org/pipermail/python-dev/2003-February/033016.html
and died in the womb due to threading-related issues --
http://mail.mems-exchange.org/durusmail/quixote-users/1248/)

- C

On Sep 2, 2006, at 8:10 AM, Gustavo Carneiro wrote:

> We have to resort to timeouts in pygtk in order to catch unix signals
> in threaded mode.
> The reason is this.  We call gtk_main() (mainloop function) which
> blocks forever.  Suppose there are threads in the program; then any
> thread can receive a signal (e.g. SIGINT).  Python catches the signal,
> but doesn't do anything; it simply sets a flag in a global structure
> and calls Py_AddPendingCall(), and I guess it expects someone to call
> Py_MakePendingCalls().  However, the main thread is blocked calling a
> C function and has no way of being notified it needs to give control
> back to python to handle the signal.  Hence, we use a 100ms timeout
> for polling.  Unfortunately, timeouts needlessly consume CPU time and
> drain laptop batteries.
>
> According to [1], all python needs to do to avoid this problem is
> block all signals in all but the main thread; then we can guarantee
> signal handlers are always called from the main thread, and pygtk
> doesn't need a timeout.
>
> Another alternative would be to add a new API like
> Py_AddPendingCallNotification, which would let python notify
> extensions that new pending calls exist and need to be processed.
>
>   But I would really prefer the first alternative, as it could be
> fixed within python 2.5; no need to wait for 2.6.
>
>   Please, let's make Python ready for the enterprise! [2]
>
> [1] https://bugzilla.redhat.com/bugzilla/process_bug.cgi#c3
> [2] http://perkypants.org/blog/2006/09/02/rfte-python/
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists 
> %40plope.com
>


From anthony at interlink.com.au  Mon Sep  4 09:19:39 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon, 4 Sep 2006 17:19:39 +1000
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
References: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
Message-ID: <200609041719.41488.anthony@interlink.com.au>

On Saturday 02 September 2006 22:10, Gustavo Carneiro wrote:
> According to [1], all python needs to do to avoid this problem is
> block all signals in all but the main thread; then we can guarantee
> signal handlers are always called from the main thread, and pygtk
> doesn't need a timeout.

>   But I would really prefer the first alternative, as it could be
> fixed within python 2.5; no need to wait for 2.6.

Assuming "the first alternative" is the "just block all signals in all but the 
main thread" option, there is absolutely no chance of this going into 2.5.

Signals and threads combined are a complete *nightmare* of platform-specific 
behaviour. I'm -1000 on trying to change this code now, _after_ the first 
release candidate. To say that "down that path lies madness" is like 
saying "Pacific Ocean large, wet, full of fish". 

-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From rasky at develer.com  Mon Sep  4 12:29:51 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Mon, 4 Sep 2006 12:29:51 +0200
Subject: [Python-Dev] Error while building 2.5rc1 pythoncore_pgo on VC8
References: <8dd9fd0608310336q45d2d3d3re203e871c7b384b8@mail.gmail.com>	<ed6f6h$la4$1@sea.gmane.org><8dd9fd0608310446o6008240x8bfa852b41595eab@mail.gmail.com>
	<ed6ist$28v$1@sea.gmane.org>
Message-ID: <01ca01c6d00d$0dd14a90$b803030a@trilan>

Fredrik Lundh wrote:

>> That error mentioned in that post was in "pythoncore" module.
>> My error is while compiling "pythoncore_pgo" module.
> 
> iirc, that's a partially experimental alternative build for playing
> with performance guided optimizations.  are you sure you need 
> that module ?

Oh yes, it's a 30% improvement in pystone, for free.
-- 
Giovanni Bajo

From mwh at python.net  Mon Sep  4 15:30:41 2006
From: mwh at python.net (Michael Hudson)
Date: Mon, 04 Sep 2006 14:30:41 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
	(Gustavo Carneiro's message of "Sat, 2 Sep 2006 13:10:04 +0100")
References: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
Message-ID: <2mpseboj26.fsf@starship.python.net>

"Gustavo Carneiro" <gjcarneiro at gmail.com> writes:

> According to [1], all python needs to do to avoid this problem is
> block all signals in all but the main thread;

Argh, no: then people who call system() from non-main threads end up
running subprocesses with all signals masked, which breaks other
things in very mysterious ways.  Been there...

No time to read the rest of the post, maybe in a few days...

Cheers,
mwh

-- 
  Arrrrgh, the braindamage!  It's not unlike the massively
  non-brilliant decision to use the period in abbreviations
  as well as a sentence terminator.  Had these people no
  imagination at _all_?                 -- Erik Naggum, comp.lang.lisp

From gjcarneiro at gmail.com  Mon Sep  4 15:48:54 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Mon, 4 Sep 2006 13:48:54 +0000
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <2mpseboj26.fsf@starship.python.net>
References: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
	<2mpseboj26.fsf@starship.python.net>
Message-ID: <a467ca4f0609040648l48450f58mdaab26ffd0e9b1dd@mail.gmail.com>

On 9/4/06, Michael Hudson <mwh at python.net> wrote:
> "Gustavo Carneiro" <gjcarneiro at gmail.com> writes:
>
> > According to [1], all python needs to do to avoid this problem is
> > block all signals in all but the main thread;
>
> Argh, no: then people who call system() from non-main threads end up
> running subprocesses with all signals masked, which breaks other
> things in very mysterious ways.  Been there...

  That's a very good point; I wasn't aware that child processes
inherited the signal mask from their parent processes.

> No time to read the rest of the post, maybe in a few days...

  Don't worry.  From the feedback received so far it seems that any
proposed solution has to wait for Python 2.6 :-(

  I am now thinking of something along these lines:

typedef void (*PyPendingCallNotify)(void *user_data);
PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback,
    void *user_data);
PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify callback,
    void *user_data);

  Regards.

From nmm1 at cus.cam.ac.uk  Mon Sep  4 16:05:56 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Mon, 04 Sep 2006 15:05:56 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>

"Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
>
> That's a very good point; I wasn't aware that child processes
> inherited the signals mask from their parent processes.

That's one of the few places where POSIX does describe what happens.
Well, usually.  You really don't want to know what happens when you
call something revolting, like csh or a setuid program.  This
particular mess is why I had to write my own nohup - the new POSIX
interfaces broke the existing one, and it remains broken today on
almost all systems.

>   I am now thinking of something along these lines:
> typedef void (*PyPendingCallNotify)(void *user_data);
> PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback,
>     void *user_data);
> PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify
>     callback, void *user_data);

Why would that help?  The problems are semantic, not syntactic.

Anthony Baxter isn't exaggerating the problem, despite what you may
think from his posting.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From nmm1 at cus.cam.ac.uk  Mon Sep  4 16:07:17 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Mon, 04 Sep 2006 15:07:17 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID: <E1GKF6b-0004fV-U9@draco.cus.cam.ac.uk>

Chris McDonough <chrism at plope.com> wrote:
>
> Would adding an API for sigprocmask help here?

No.  sigprocmask is a large part of the problem.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From anthony at interlink.com.au  Mon Sep  4 16:22:22 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue, 5 Sep 2006 00:22:22 +1000
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>
References: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>
Message-ID: <200609050022.23944.anthony@interlink.com.au>

On Tuesday 05 September 2006 00:05, Nick Maclaren wrote:
> Anthony Baxter isn't exaggerating the problem, despite what you may
> think from his posting.

If the SF bugtracker had a better search interface, you could see why I have 
such a bleak view of this area of Python. What's there now *mostly* works (I 
exclude freakshows like certain versions of HP/UX, AIX, SCO and the like). It 
took a hell of a lot of effort to get it to this point. 

threads + signals == tears.

Anthony

From gjcarneiro at gmail.com  Mon Sep  4 16:52:36 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Mon, 4 Sep 2006 14:52:36 +0000
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>
References: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>
Message-ID: <a467ca4f0609040752y7c051b24h620777d983590f40@mail.gmail.com>

On 9/4/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
> "Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
> >   I am now thinking of something along these lines:
> > typedef void (*PyPendingCallNotify)(void *user_data);
> > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback,
> >     void *user_data);
> > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify
> >     callback, void *user_data);
>
> Why would that help?  The problems are semantic, not syntactic.
>
> Anthony Baxter isn't exaggerating the problem, despite what you may
> think from his posting.

  You guys are tough customers to please.  I am just trying to solve a
problem here, not create a new one; you have to believe me.

  OK, let's review what we know about current python, signals, and threads:

     1. Python launches threads without touching sigprocmask;
     2. Python installs signal handlers for all signals;
     3. Signals can be delivered to any thread; let's assume (because
of point #1, among other reasons not mentioned) that we have no control
over which threads receive which signals -- they might as well be random
for all we know;
     4. Python signal handlers do almost nothing: they just set a flag
and call Py_AddPendingCall, to postpone the job of handling a signal
until a "safer" time.
     5. The function Py_MakePendingCalls() should eventually get
called at a "safer" time by user or python code.
     6. It follows that until Py_MakePendingCalls() is called, the
signal will not be handled at all!

  Now, back to explaining the problem.

     1. In PyGTK we have a gobject.MainLoop.run() method, which blocks
essentially forever in a poll() system call, and only wakes if/when it
has to process timeout or IO event;
     2. When we only have one thread, we can guarantee that e.g.
SIGINT will always be caught by the thread running the
g_main_loop_run(), so we know poll() will be interrupted and a EINTR
will be generated, giving us control temporarily back to check for
python signals;
     3. When we have multiple thread, we cannot make this assumption,
so instead we install a timeout to periodically check for signals.

  We want to get rid of timeouts.  Now my idea: add a Python API to say:
     "dear Python, please call me when you start having pending calls,
even if from a signal handler context, ok?"

From that point on, signals will get handled by Python, python calls
PyGTK, PyGTK calls a special API to safely wake up the main loop even
from a thread or signal handler, then main loop checks for signal by
calling PyErr_CheckSignals(), it is handled by Python, and the process
lives happily ever after, or die trying.

  I sincerely hope my explanation was satisfactory this time.

  Best regards.


PS: there's a "funny" comment in Py_AddPendingCall that suggests it is
not very safe against reentrancy problems:

	/* XXX Begin critical section */
	/* XXX If you want this to be safe against nested
	   XXX asynchronous calls, you'll have to work harder! */

Are signal handlers guaranteed to not be interrupted by another
signal, at least?  What about threads?

From anthony at interlink.com.au  Mon Sep  4 17:30:11 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue, 5 Sep 2006 01:30:11 +1000
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609040752y7c051b24h620777d983590f40@mail.gmail.com>
References: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>
	<a467ca4f0609040752y7c051b24h620777d983590f40@mail.gmail.com>
Message-ID: <200609050130.13189.anthony@interlink.com.au>

On Tuesday 05 September 2006 00:52, Gustavo Carneiro wrote:
>      3. Signals can be delivered to any thread, let's assume (because
> of point #1 and not others not mentioned) that we have no control over
> which threads receive which signals, might as well be random for all
> we know;


Note that some Unix variants only deliver signals to the main thread (or so 
the manpages allege, anyway).

Anthony

From exarkun at divmod.com  Mon Sep  4 17:56:00 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Mon, 4 Sep 2006 11:56:00 -0400
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>
Message-ID: <20060904155600.1717.605687145.divmod.quotient.38950@ohm>

On Mon, 04 Sep 2006 15:05:56 +0100, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
>"Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
>>
>> That's a very good point; I wasn't aware that child processes
>> inherited the signals mask from their parent processes.
>
>That's one of the few places where POSIX does describe what happens.
>Well, usually.  You really don't want to know what happens when you
>call something revolting, like csh or a setuid program.  This
>particular mess is why I had to write my own nohup - the new POSIX
>interfaces broke the existing one, and it remains broken today on
>almost all systems.
>
>>   I am now thinking of something along these lines:
>> typedef void (*PyPendingCallNotify)(void *user_data);
>> PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback,
>>     void *user_data);
>> PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify
>>     callback, void *user_data);
>
>Why would that help?  The problems are semantic, not syntactic.
>
>Anthony Baxter isn't exaggerating the problem, despite what you may
>think from his posting.
>

Dealing with threads and signals is certainly hairy.

However, that barely has anything to do with what Gustavo is talking
about.

By the time Gustavo's proposed API springs into action, the threads
already exist and the signal is already being handled by one.  So,
let's forget about threads and signals for a moment.

The problem to be solved is that one piece of code wants to communicate
a piece of information to another piece of code.

The first piece of code is in Python itself.  The second piece of code
could be from any third-party library, and Python has no way of knowing
about it - now.

Gustavo is suggesting adding a registration API so that these third-party
libraries can tell Python that they exist and are interested in this
piece of information.

Simple, no?

PyGTK would presumably implement its pending call callback by writing a
byte to a pipe which it is also passing to poll().  This lets them handle
signals in a very timely manner without constantly waking up from poll()
to see if Python wants to do any work.
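In plain Python, the wake-up pipe idea looks roughly like this (the real
hook would live in C inside PyGTK; a worker thread stands in here for
the pending-call notification):

    import os
    import select
    import threading
    import time

    rfd, wfd = os.pipe()

    def notifier():
        # stand-in for the pending-call callback: write a single byte
        time.sleep(1)
        os.write(wfd, 'x')

    threading.Thread(target=notifier).start()

    poller = select.poll()
    poller.register(rfd, select.POLLIN)
    poller.poll()             # blocks; no periodic timeout needed
    os.read(rfd, 1)
    print 'woken up; the main loop would now check for Python signals'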

This is far from a new idea - it's basically the bog standard way of
handling this situation.  It strikes me as a very useful API to add to
Python (although at this point in the 2.5 release process, not to 2.5,
sorry Gustavo).

Jean-Paul

From david.nospam.hopwood at blueyonder.co.uk  Mon Sep  4 18:19:27 2006
From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood)
Date: Mon, 04 Sep 2006 17:19:27 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609040752y7c051b24h620777d983590f40@mail.gmail.com>
References: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>
	<a467ca4f0609040752y7c051b24h620777d983590f40@mail.gmail.com>
Message-ID: <44FC520F.3070307@blueyonder.co.uk>

Gustavo Carneiro wrote:
>   OK, let's review what we know about current python, signals, and threads:
> 
>      1. Python launches threads without touching sigprocmask;
>      2. Python installs signal handlers for all signals;
>      3. Signals can be delivered to any thread, let's assume (because
> of point #1 and not others not mentioned) that we have no control over
> which threads receive which signals, might as well be random for all
> we know;
>      4. Python signal handlers do almost nothing: they just set a flag
> and call Py_AddPendingCall, to postpone the job of handling a signal
> until a "safer" time.
>      5. The function Py_MakePendingCalls() should eventually get
> called at a "safer" time by user or python code.
>      6. It follows that until Py_MakePendingCalls() is called, the
> signal will not be handled at all!
> 
>   Now, back to explaining the problem.
> 
>      1. In PyGTK we have a gobject.MainLoop.run() method, which blocks
> essentially forever in a poll() system call, and only wakes if/when it
> has to process timeout or IO event;
>      2. When we only have one thread, we can guarantee that e.g.
> SIGINT will always be caught by the thread running the
> g_main_loop_run(), so we know poll() will be interrupted and an EINTR
> will be generated, giving us back control temporarily to check for
> python signals;
>      3. When we have multiple threads, we cannot make this assumption,
> so instead we install a timeout to periodically check for signals.
> 
>   We want to get rid of timeouts.  Now my idea: add a Python API to say:
>      "dear Python, please call me when you start having pending calls,
> even if from a signal handler context, ok?"

What can be safely done from a signal handler context is *very* limited.
Calling back arbitrary Python code is certainly not safe.

Reliable asynchronous interruption of arbitrary code is a difficult problem,
but POSIX and POSIX implementations botch it particularly badly. I don't
know how to implement what you want here, but I'd endorse the comments of
Nick Maclaren and Anthony Baxter against making precipitate changes.

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>



From nmm1 at cus.cam.ac.uk  Mon Sep  4 18:24:27 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Mon, 04 Sep 2006 17:24:27 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: Your message of "Mon, 04 Sep 2006 14:52:36 -0000."
	<a467ca4f0609040752y7c051b24h620777d983590f40@mail.gmail.com> 
Message-ID: <E1GKHFL-0005l0-5O@draco.cus.cam.ac.uk>

"Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
>
>   You guys are tough customers to please.  I am just trying to solve a
> problem here, not create a new one; you have to believe me.

Oh, I believe you.

Look at it this way.  You are trying to resolve the problem that your
farm is littered with cluster bombs, and your cows keep blowing their
legs off.  Your solution is effectively saying "well, let's travel
around and pick them all up then".

>   We want to get rid of timeouts.  Now my idea: add a Python API to say:
>      "dear Python, please call me when you start having pending calls,
> even if from a signal handler context, ok?"

Yes, I know.  I have been there and done that, both academically and
(observing, as a consultant) to the vendor.  And that was on a system
that was a damn sight better engineered than any of the main ones that
Python runs on today.

I have attempted to do much EASIER tasks under both Unix and (earlier)
versions of Microsoft Windows, and failed dismally because the system
wasn't up to it.

> From that point on, signals will get handled by Python, python calls
> PyGTK, PyGTK calls a special API to safely wake up the main loop even
> from a thread or signal handler, then main loop checks for signal by
> calling PyErr_CheckSignals(), it is handled by Python, and the process
> lives happily ever after, or dies trying.

The first thing that will happen to that beautiful theory when it goes
out into Unix County or Microsoft City is that a gang of ugly facts
will find it and beat it into a pulp.

>  I sincerely hope my explanation was satisfactory this time.

Oh, it was last time.  It isn't that that is the problem.

> Are signal handlers guaranteed to not be interrupted by another
> signal, at least?  What about threads?

No and no.  In theory, what POSIX says about blocking threads should
be reliable; in my experience, it almost is, except under precisely the
circumstances that you most want it to work.



Look, I am agreeing that your basic design is right.  What I am saying
is that (a) you cannot make delivery reliable and abolish timeouts
and (b) that it is such a revoltingly system-dependent mess that I
would much rather Python didn't fiddle with it.

Do you know how signalling is misimplemented at the hardware level?
And that it is possible for a handler to be called with any of its
critical pointers (INCLUDING the global code and data pointers) in
undefined states?  Do you know how to program round that sort of
thing?

I can answer "yes" to all three - for my sins, which must be many and
grievous, for that to be the case :-(


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From david.nospam.hopwood at blueyonder.co.uk  Mon Sep  4 18:24:56 2006
From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood)
Date: Mon, 04 Sep 2006 17:24:56 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <20060904155600.1717.605687145.divmod.quotient.38950@ohm>
References: <20060904155600.1717.605687145.divmod.quotient.38950@ohm>
Message-ID: <44FC5358.70806@blueyonder.co.uk>

Jean-Paul Calderone wrote:
> PyGTK would presumably implement its pending call callback by writing a
> byte to a pipe which it is also passing to poll().

But doing that in a signal handler context invokes undefined behaviour
according to POSIX.

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>



From exarkun at divmod.com  Mon Sep  4 18:46:22 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Mon, 4 Sep 2006 12:46:22 -0400
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <44FC5358.70806@blueyonder.co.uk>
Message-ID: <20060904164622.1717.895455315.divmod.quotient.38999@ohm>

On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood <david.nospam.hopwood at blueyonder.co.uk> wrote:
>Jean-Paul Calderone wrote:
>> PyGTK would presumably implement its pending call callback by writing a
>> byte to a pipe which it is also passing to poll().
>
>But doing that in a signal handler context invokes undefined behaviour
>according to POSIX.

write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004.
Was this changed in a later edition?  Otherwise, I don't understand what you
mean by this.

Jean-Paul

From nmm1 at cus.cam.ac.uk  Mon Sep  4 19:18:41 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Mon, 04 Sep 2006 18:18:41 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID: <E1GKI5p-0006EE-RR@draco.cus.cam.ac.uk>

Jean-Paul Calderone <exarkun at divmod.com> wrote:
> On Mon, 04 Sep 2006 17:24:56 +0100,
> David Hopwood <david.nospam.hopwood at blueyon
> der.co.uk> wrote:
> >Jean-Paul Calderone wrote:
> >> PyGTK would presumably implement its pending call callback by writing a
> >> byte to a pipe which it is also passing to poll().
> >
> >But doing that in a signal handler context invokes undefined behaviour
> >according to POSIX.
> 
> write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004.
> Was this changed in a later edition?  Otherwise, I don't understand what you
> mean by this.

Try looking at the C90 or C99 standard, for a start :-(

NOTHING may safely be done in a real signal handler, except possibly
setting a value of type static volatile sig_atomic_t.  And even that
can be problematic.  And note that POSIX defers to C on what the C
language defines.  So, even if the function is async-signal-safe,
the code that calls it can't be!

POSIX's lists are complete fantasy, anyway.  Look at the one that
defines thread-safety, and then try to get your mind around what
exit being thread-safe actually implies (especially with regard to
atexit functions).


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From david.nospam.hopwood at blueyonder.co.uk  Mon Sep  4 19:24:38 2006
From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood)
Date: Mon, 04 Sep 2006 18:24:38 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <20060904164622.1717.895455315.divmod.quotient.38999@ohm>
References: <20060904164622.1717.895455315.divmod.quotient.38999@ohm>
Message-ID: <44FC6156.3000708@blueyonder.co.uk>

Jean-Paul Calderone wrote:
> On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood <david.nospam.hopwood at blueyonder.co.uk> wrote:
> 
>>Jean-Paul Calderone wrote:
>>
>>>PyGTK would presumably implement its pending call callback by writing a
>>>byte to a pipe which it is also passing to poll().
>>
>>But doing that in a signal handler context invokes undefined behaviour
>>according to POSIX.
> 
> write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004.

I stand corrected. I must have misremembered this.

-- 
David Hopwood <david.hopwood at blueyonder.co.uk>



From exarkun at divmod.com  Mon Sep  4 19:55:41 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Mon, 4 Sep 2006 13:55:41 -0400
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GKI5p-0006EE-RR@draco.cus.cam.ac.uk>
Message-ID: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm>

On Mon, 04 Sep 2006 18:18:41 +0100, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
>Jean-Paul Calderone <exarkun at divmod.com> wrote:
>> On Mon, 04 Sep 2006 17:24:56 +0100,
>> David Hopwood <david.nospam.hopwood at blueyon
>> der.co.uk> wrote:
>> >Jean-Paul Calderone wrote:
>> >> PyGTK would presumably implement its pending call callback by writing a
>> >> byte to a pipe which it is also passing to poll().
>> >
>> >But doing that in a signal handler context invokes undefined behaviour
>> >according to POSIX.
>>
>> write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004.
>> Was this changed in a later edition?  Otherwise, I don't understand what you
>> mean by this.
>
>Try looking at the C90 or C99 standard, for a start :-(
>
>NOTHING may safely be done in a real signal handler, except possibly
>setting a value of type static volatile sig_atomic_t.  And even that
>can be problematic.  And note that POSIX defers to C on what the C
>language defines.  So, even if the function is async-signal-safe,
>the code that calls it can't be!
>
>POSIX's lists are complete fantasy, anyway.  Look at the one that
>defines thread-safety, and then try to get your mind around what
>exit being thread-safe actually implies (especially with regard to
>atexit functions).
>

Thanks for expounding.  Given that it is basically impossible to do
anything useful in a signal handler according to the relevant standards
(does Python's current signal handler even avoid relying on undefined
behavior?), how would you suggest addressing this issue?

It seems to me that it is actually possible to do useful things in a
signal handler, so long as one accepts that doing so is relying on
platform specific behavior.

How hard would it be to implement this for the platforms Python supports,
rather than for a hypothetical standards-exact platform?

Jean-Paul

From nmm1 at cus.cam.ac.uk  Mon Sep  4 20:44:30 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Mon, 04 Sep 2006 19:44:30 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: Your message of "Mon, 04 Sep 2006 13:55:41 EDT."
	<20060904175541.1717.1728502156.divmod.quotient.39053@ohm> 
Message-ID: <E1GKJQs-0006g7-IA@draco.cus.cam.ac.uk>

Jean-Paul Calderone <exarkun at divmod.com> wrote:
> 
> Thanks for expounding.  Given that it is basically impossible to do
> anything useful in a signal handler according to the relevant standards
> (does Python's current signal handler even avoid relying on undefined
> behavior?), how would you suggest addressing this issue?

Much as you are doing, and I described, but the first step would be
to find out what 'most' Python people need for signal handling in
threaded programs.  This is because there is an unavoidable conflict
between portability/reliability and functionality.

I would definitely block all signals in threads, except for those that
are likely to be generated ON the thread (SIGFPE etc.)  It is a very
good idea not to touch the handling of several of those, because doing
so can cause chaos.

I would have at least two 'standard' handlers, one of which would simply
set a flag and return, and the other of which would abort.  Now, NEITHER
is a very useful specification, but providing ANY information is risky,
which is why it is critical to know what people need.

I would not TRUST the blocking of signals, so would set up handlers even
when I blocked them, and would do the minimum fiddling in the main
thread compatible with decent functionality.

I would provide a call to test if the signal flag was set, and another
to test and clear it.  This would be callable ONLY from the main thread,
and that would be checked.
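
In Python-level terms, the flag-setting variant is roughly the
following (a sketch only; the names are mine and this is not a
proposed API):

import signal
import threading

_signal_flag = False

def _flag_setting_handler(signum, frame):
    # The handler does nothing except record that the signal arrived.
    global _signal_flag
    _signal_flag = True

def test_and_clear_signal_flag():
    # Intended to be callable only from the main thread.
    assert threading.currentThread().getName() == "MainThread"
    global _signal_flag
    was_set = _signal_flag
    _signal_flag = False
    return was_set

signal.signal(signal.SIGTERM, _flag_setting_handler)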

It is possible to do better, but that starts needing serious research.

> It seems to me that it is actually possible to do useful things in a
> signal handler, so long as one accepts that doing so is relying on
> platform specific behavior.

Unfortunately, that is wrong.  That was true under MVS and VMS, but
in Unix and Microsoft systems, the problem is that the behaviour is
both platform and circumstance-dependent.  What you can do reliably
depends mostly on what is going on at the time.

For example, on many Unix and Microsoft platforms, signals received
while you are in the middle of certain functions or system calls, or
certain particular signals (often SIGFPE), call the C handler with a
bad set of global pointers or similar.  I believe that this is one of
the reasons (perhaps the main one) that some such failures so often cause
debuggers to be unable to find the stack pointer.

I have tracked a few of those down, and have occasionally identified
the cause (and even got it fixed!), but it is a murderous task, and
I know of few other people who have ever succeeded.

> How hard would it be to implement this for the platforms Python supports,
> rather than for a hypothetical standards-exact platform?

I have seen this effect on OSF/1, IRIX, Solaris, Linux and versions
of Microsoft Windows.  I have never used a modern BSD, haven't used
HP-UX since release 9, and haven't used Microsoft systems seriously
in years (though I did hang my new laptop in its GUI fairly easily).

As I say, this isn't so much a platform issue as a circumstance one.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From andreas.raab at gmx.de  Mon Sep  4 23:36:19 2006
From: andreas.raab at gmx.de (Andreas Raab)
Date: Mon, 04 Sep 2006 14:36:19 -0700
Subject: [Python-Dev] Cross-platform math functions?
Message-ID: <44FC9C53.5060304@gmx.de>

Hi -

I'm curious if there is any interest in the Python community to achieve 
better cross-platform math behavior. A quick test[1] shows a 
non-surprising difference between the platform implementations. 
Question: Is there any interest in changing the behavior to produce 
identical results across platforms (for example by utilizing fdlibm 
[2])? Since I have need for a set of cross-platform math functions I'll 
probably start with a math-compatible fdlibm module (unless somebody has 
done that already ;-)
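
For what it's worth, a first cut at such a wrapper could be as simple
as a ctypes shim (a sketch only; "libfdlibm.so" is a hypothetical name
for an fdlibm build installed as a shared library):

import ctypes

_fdlibm = ctypes.CDLL("libfdlibm.so")    # hypothetical library name
_fdlibm.cos.restype = ctypes.c_double
_fdlibm.cos.argtypes = [ctypes.c_double]

def cos(x):
    # Same answer on every platform, because it is always fdlibm's cos.
    return _fdlibm.cos(x)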

Cheers,
   - Andreas

[1] Using Python 2.4:
 >>> import math
 >>> math.cos(1.0e32)

WinXP:    -0.39929634612021897
LinuxX86: -0.49093671143542561

[2] http://www.netlib.org/fdlibm/


From gjcarneiro at gmail.com  Tue Sep  5 01:31:06 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Tue, 5 Sep 2006 00:31:06 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GKJQs-0006g7-IA@draco.cus.cam.ac.uk>
References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm>
	<E1GKJQs-0006g7-IA@draco.cus.cam.ac.uk>
Message-ID: <a467ca4f0609041631o719c5ea2s85e58c2309be452@mail.gmail.com>

  In GLib we have a child watch notification feature that relies on
the following signal handler:

static void
g_child_watch_signal_handler (int signum)
{
  child_watch_count ++;

  if (child_watch_init_state == CHILD_WATCH_INITIALIZED_THREADED)
    {
      write (child_watch_wake_up_pipe[1], "B", 1);
    }
  else
    {
      /* We count on the signal interrupting the poll in the same thread.
       */
    }
}

  Now, we've had this API for a long time already (at least 2.5
years).  I'm pretty sure it works well enough on most *nix systems.
Even if it works 99% of the time, it's way better than *failing*
*100%* of the time, which is what happens now with Python.

  All I ask is an API to add a callback that Python signal handlers
call, from signal context.  That much I'm sure is safe.  What happens
from there on will be out of Python's hands, so Python
purist^H^H^H^H^H^H developers cannot be blamed for anything that
happens next.  You can laugh at PyGTK and GLib all you want for having
"unsafe signal handling", I don't care.

  Regards.

On 9/4/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
> Jean-Paul Calderone <exarkun at divmod.com> wrote:
> >
> > Thanks for expounding.  Given that it is basically impossible to do
> > anything useful in a signal handler according to the relevant standards
> > (does Python's current signal handler even avoid relying on undefined
> > behavior?), how would you suggest addressing this issue?
>
> Much as you are doing, and I described, but the first step would be
> to find out what 'most' Python people need for signal handling in
> threaded programs.  This is because there is an unavoidable conflict
> between portability/reliability and functionality.
>
> I would definitely block all signals in threads, except for those that
> are likely to be generated ON the thread (SIGFPE etc.)  It is a very
> good idea not to touch the handling of several of those, because doing
> so can cause chaos.
>
> I would have at least two 'standard' handlers, one of which would simply
> set a flag and return, and the other of which would abort.  Now, NEITHER
> is a very useful specification, but providing ANY information is risky,
> which is why it is critical to know what people need.
>
> I would not TRUST the blocking of signals, so would set up handlers even
> when I blocked them, and would do the minimum fiddling in the main
> thread compatible with decent functionality.
>
> I would provide a call to test if the signal flag was set, and another
> to test and clear it.  This would be callable ONLY from the main thread,
> and that would be checked.
>
> It is possible to do better, but that starts needing serious research.
>
> > It seems to me that it is actually possible to do useful things in a
> > signal handler, so long as one accepts that doing so is relying on
> > platform specific behavior.
>
> Unfortunately, that is wrong.  That was true under MVS and VMS, but
> in Unix and Microsoft systems, the problem is that the behaviour is
> both platform and circumstance-dependent.  What you can do reliably
> depends mostly on what is going on at the time.
>
> For example, on many Unix and Microsoft platforms, signals received
> while you are in the middle of certain functions or system calls, or
> certain particular signals (often SIGFPE), call the C handler with a
> bad set of global pointers or similar.  I believe that this is one of
> the reasons (perhaps the main one) that some such failures so often cause
> debuggers to be unable to find the stack pointer.
>
> I have tracked a few of those down, and have occasionally identified
> the cause (and even got it fixed!), but it is a murderous task, and
> I know of few other people who have ever succeeded.
>
> > How hard would it be to implement this for the platforms Python supports,
> > rather than for a hypothetical standards-exact platform?
>
> I have seen this effect on OSF/1, IRIX, Solaris, Linux and versions
> of Microsoft Windows.  I have never used a modern BSD, haven't used
> HP-UX since release 9, and haven't used Microsoft systems seriously
> in years (though I did hang my new laptop in its GUI fairly easily).
>
> As I say, this isn't so much a platform issue as a circumstance one.
>
>
> Regards,
> Nick Maclaren,
> University of Cambridge Computing Service,
> New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
> Email:  nmm1 at cam.ac.uk
> Tel.:  +44 1223 334761    Fax:  +44 1223 334679
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/gjcarneiro%40gmail.com
>

From tim.peters at gmail.com  Tue Sep  5 01:06:50 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 4 Sep 2006 19:06:50 -0400
Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: <44FC9C53.5060304@gmx.de>
References: <44FC9C53.5060304@gmx.de>
Message-ID: <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com>

[Andreas Raab]
> I'm curious if there is any interest in the Python community to achieve
> better cross-platform math behavior. A quick test[1] shows a
> non-surprising difference between the platform implementations.
> Question: Is there any interest in changing the behavior to produce
> identical results across platforms (for example by utilizing fdlibm
> [2])? Since I have need for a set of cross-platform math functions I'll
> probably start with a math-compatible fdlibm module (unless somebody has
> done that already ;-)

Package a Python wrapper and see how popular it becomes.  Some reasons
against trying to standardize on fdlibm were explained here:

    http://mail.python.org/pipermail/python-list/2005-July/290164.html

Bottom line is I suspect that when it comes to bit-for-bit
reproducibility, fewer people care about that x-platform than care
about it x-language on the box they use.  Nothing wrong with different
modules for people with different desires.

From tim.peters at gmail.com  Tue Sep  5 04:25:01 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 4 Sep 2006 22:25:01 -0400
Subject: [Python-Dev] gcc 4.2 exposes signed integer overflows
In-Reply-To: <200608301242.28648.anthony@interlink.com.au>
References: <20060826190600.0E75911002B@bromo.msbb.uc.edu>
	<20060829201022.GA22579@code0.codespeak.net>
	<1f7befae0608291557l5b04a8f6wd1371e62a5c9c69c@mail.gmail.com>
	<200608301242.28648.anthony@interlink.com.au>
Message-ID: <1f7befae0609041925h61c184f1m8716951740b00b39@mail.gmail.com>

[Tim Peters]
>> Speaking of which, I saw no feedback on the proposed patch in
>>
>>     http://mail.python.org/pipermail/python-dev/2006-August/068502.html
>>
>> so I'll just check that in tomorrow.

[Anthony Baxter]
> This should also be backported to release24-maint and release23-maint. Let me
> know if you can't do the backport...

Done in rev 51711 on the 2.5 branch.

Done in rev 51715 on the 2.4 branch.

Done in rev 51716 on the trunk, although in the LONG_MIN way (which is
less obscure, but a more "radical" code change).

I don't care about the 2.3 branch, so leaving that to someone who
does.  Merge rev 51711 from the 2.5 branch.  It will generate a
conflict on Misc/NEWS.  Easiest to revert Misc/NEWS then and just
copy/paste the little blurb from 2.5 news at the appropriate place:

"""
- Overflow checking code in integer division ran afoul of new gcc
  optimizations.  Changed to be more standard-conforming.
"""

From rhamph at gmail.com  Tue Sep  5 05:28:37 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 4 Sep 2006 21:28:37 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GKI5p-0006EE-RR@draco.cus.cam.ac.uk>
References: <E1GKI5p-0006EE-RR@draco.cus.cam.ac.uk>
Message-ID: <aac2c7cb0609042028j29d8ac73qaa4ae70b4dea3ec@mail.gmail.com>

On 9/4/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
> Jean-Paul Calderone <exarkun at divmod.com> wrote:
> > On Mon, 04 Sep 2006 17:24:56 +0100,
> > David Hopwood <david.nospam.hopwood at blueyon
> > der.co.uk> wrote:
> > >Jean-Paul Calderone wrote:
> > >> PyGTK would presumably implement its pending call callback by writing a
> > >> byte to a pipe which it is also passing to poll().
> > >
> > >But doing that in a signal handler context invokes undefined behaviour
> > >according to POSIX.
> >
> > write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004.
> > Was this changed in a later edition?  Otherwise, I don't understand what you
> > mean by this.
>
> Try looking at the C90 or C99 standard, for a start :-(
>
> NOTHING may safely be done in a real signal handler, except possibly
> setting a value of type static volatile sig_atomic_t.  And even that
> can be problematic.  And note that POSIX defers to C on what the C
> language defines.  So, even if the function is async-signal-safe,
> the code that calls it can't be!

I don't believe that is true.  It says (or at least SUSv3 says) that:

"""  3.26 Async-Signal-Safe Function

A function that may be invoked, without restriction, from
signal-catching functions. No function is async-signal-safe unless
explicitly described as such."""

Sure, it doesn't give me a warm-fuzzy feeling of knowing why it works,
but we can expect that it magically does.  My understanding is that
threading in general is the same way...

Of course that doesn't preclude bugs in the various implementations,
but those trump the standards anyway.

-- 
Adam Olsen, aka Rhamphoryncus

From rhamph at gmail.com  Tue Sep  5 05:41:13 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 4 Sep 2006 21:41:13 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609041631o719c5ea2s85e58c2309be452@mail.gmail.com>
References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm>
	<E1GKJQs-0006g7-IA@draco.cus.cam.ac.uk>
	<a467ca4f0609041631o719c5ea2s85e58c2309be452@mail.gmail.com>
Message-ID: <aac2c7cb0609042041q357e3746t187d93be17bd5770@mail.gmail.com>

On 9/4/06, Gustavo Carneiro <gjcarneiro at gmail.com> wrote:
>   Now, we've had this API for a long time already (at least 2.5
> years).  I'm pretty sure it works well enough on most *nix systems.
> Even if it works 99% of the time, it's way better than *failing*
> *100%* of the time, which is what happens now with Python.

Failing 99% of the time is as bad as failing 100% of the time, if your
goal is to eliminate the short timeout on poll().  1% is quite a lot,
and it would probably have an annoying tendency to trigger repeatedly
when the user does certain things (not reproducible by you of course).

That said, I do hope we can get 100%, or at least enough nines that we
can increase the timeout significantly.

-- 
Adam Olsen, aka Rhamphoryncus

From nnorwitz at gmail.com  Tue Sep  5 06:12:43 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 4 Sep 2006 21:12:43 -0700
Subject: [Python-Dev] [Python-checkins] TRUNK IS UNFROZEN,
	available for 2.6 work if you are so inclined
In-Reply-To: <ec3ui8$shq$1@sea.gmane.org>
References: <200608180023.14037.anthony@interlink.com.au>
	<ec3ui8$shq$1@sea.gmane.org>
Message-ID: <ee2a432c0609042112i56d00a88hadd9ee88d8f1eb93@mail.gmail.com>

On 8/18/06, Georg Brandl <g.brandl at gmx.net> wrote:
>
> I'd like to commit this. It fixes bug 1542051.
>
> Index: Objects/exceptions.c

...

Georg,

Did you still want to fix this?  I don't remember anything happening
with it.  I don't see where _PyObject_GC_TRACK is called, so I'm not
sure why _PyObject_GC_UNTRACK is necessary.  You should probably add
the patch to the bug report and we can discuss it there.

n

From nnorwitz at gmail.com  Tue Sep  5 06:14:34 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 4 Sep 2006 21:14:34 -0700
Subject: [Python-Dev] no remaining issues blocking 2.5 release
In-Reply-To: <20060815164114.GB23991@niemeyer.net>
References: <ee2a432c0608142131g59e2739dg2b4545ca7eb19b24@mail.gmail.com>
	<20060815164114.GB23991@niemeyer.net>
Message-ID: <ee2a432c0609042114v569ac00ewf0b85ecc47a441c4@mail.gmail.com>

Gustavo,

Did you still want this addressed?  Anthony and I made some comments
on the bug/patch, but nothing has been updated.

n
--

On 8/15/06, Gustavo Niemeyer <gustavo at niemeyer.net> wrote:
> > If you have issues, respond ASAP!  The release candidate is planned to
> > be cut this Thursday/Friday.  There are only a few more days before
> > code freeze.  A branch will be made when the release candidate is cut.
>
> I'd like to see problem #1531862 fixed.  The bug is clear and the
> fix should be trivial.  I can commit a fix tonight, if the subprocess
> module author/maintainer is unavailable to check it out.
>
> --
> Gustavo Niemeyer
> http://niemeyer.net
>

From nnorwitz at gmail.com  Tue Sep  5 06:24:16 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 4 Sep 2006 21:24:16 -0700
Subject: [Python-Dev] 2.5 status
Message-ID: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>

There are 3 bugs currently listed in PEP 356 as blocking:
        http://python.org/sf/1551432 - __unicode__ breaks on exception classes
        http://python.org/sf/1550938 - improper exception w/relative import
        http://python.org/sf/1541697 - sgmllib regexp bug causes hang

Does anyone want to fix the sgmllib issue?  If not, we should revert
this week before c2 is cut.  I'm hoping that we will have *no changes*
in 2.5 final from c2.  Should there be any bugs/patches added to or
removed from the list?

The buildbots are currently humming along, but I believe all 3
versions (2.4, 2.5, and 2.6) are fine.

Test out 2.5c1+ and report all bugs!

n

From andreas.raab at gmx.de  Tue Sep  5 07:03:11 2006
From: andreas.raab at gmx.de (Andreas Raab)
Date: Mon, 04 Sep 2006 22:03:11 -0700
Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: <1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com>
References: <44FC9C53.5060304@gmx.de>
	<1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com>
Message-ID: <44FD050F.20901@gmx.de>

Tim Peters wrote:
> Package a Python wrapper and see how popular it becomes.  Some reasons
> against trying to standardize on fdlibm were explained here:
> 
>    http://mail.python.org/pipermail/python-list/2005-July/290164.html

Thanks, these are good points. About speed, do you have any good 
benchmarks available? In my experience fdlibm is quite reasonable for 
speed in the context of use by dynamic languages (i.e., counting 
allocation overheads, lookup and send performance etc) but since I'm not 
a Python expert I'd appreciate some help with realistic benchmarks.

> Bottom line is I suspect that when it comes to bit-for-bit
> reproducibility, fewer people care about that x-platform than care
> about it x-language on the box they use.  Nothing wrong with different
> modules for people with different desires.

Agreed. Thus my question if someone had already done this ;-)

Cheers,
   - Andreas


From nmm1 at cus.cam.ac.uk  Tue Sep  5 10:51:43 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Tue, 05 Sep 2006 09:51:43 +0100
Subject: [Python-Dev]  Cross-platform math functions?
Message-ID: <E1GKWel-0002uF-Kr@draco.cus.cam.ac.uk>

Andreas Raab <andreas.raab at gmx.de> wrote:
> 
> I'm curious if there is any interest in the Python community to achieve 
> better cross-platform math behavior. A quick test[1] shows a 
> non-surprising difference between the platform implementations. 
> Question: Is there any interest in changing the behavior to produce 
> identical results across platforms (for example by utilizing fdlibm 
> [2])? Since I have need for a set of cross-platform math functions I'll 
> probably start with a math-compatible fdlibm module (unless somebody has 
> done that already ;-)
> 
> [1] Using Python 2.4:
>  >>> import math
>  >>> math.cos(1.0e32)
> 
> WinXP:    -0.39929634612021897
> LinuxX86: -0.49093671143542561

Well, I hope not, but I am afraid that there is :-(

The word "better" is emotive and inaccurate.  Such calculations are
numerically meaningless, and merely encourage the confusion between
consistency and correctness.  There is a strong sense in which giving
random results between -1 and 1 would be better.

Now, I am not saying that you don't have a requirement for consistency,
but I am saying that confusing it with correctness (as has been fostered
by IEEE 754, Java etc.) is harmful.  One of the great advantages of the
wide variety of arithmetics available in the 1970s was that numerical
testing was easier and more reliable - if you got wildly different
results on two platforms, you got a strong pointer to numerical problems.

That viewpoint is regarded as heresy nowadays, but used not to be!


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From nmm1 at cus.cam.ac.uk  Tue Sep  5 11:07:12 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Tue, 05 Sep 2006 10:07:12 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID: <E1GKWtk-00033M-LR@draco.cus.cam.ac.uk>

"Adam Olsen" <rhamph at gmail.com> wrote:
> On 9/4/06, Gustavo Carneiro <gjcarneiro at gmail.com> wrote:
> 
> >   Now, we've had this API for a long time already (at least 2.5
> > years).  I'm pretty sure it works well enough on most *nix systems.
> > Even if it works 99% of the time, it's way better than *failing*
> > *100%* of the time, which is what happens now with Python.
> 
> Failing 99% of the time is as bad as failing 100% of the time, if your
> goal is to eliminate the short timeout on poll().  1% is quite a lot,
> and it would probably have an annoying tendency to trigger repeatedly
> when the user does certain things (not reproducible by you of course).

That can make it a lot WORSE than repeated failure.  At least with hard
failures, you have some hope of tracking them down in a reasonable time.
The problem with exception handling code that goes off very rarely,
under non-reproducible circumstances, is that it is almost untestable
and that bugs in it are positive nightmares.  I have been inflicted
with quite a large number in my time, and have a fairly good success
rate, but the number of people who know the tricks is decreasing.

Consider the (real) case where an unpredictable process on a large
server (64 CPUs) was failing about twice a week (detectably), with
no indication of how many failures were giving wrong answers.  We
replaced dozens of DIMMs, took days of down time and got nowhere;
it then went hard (i.e. one failure a day).  After a week's total
down time, with me spending 100% of my time on it and the vendor
allocating an expert at high priority, we cracked it.  We were very
lucky to find it so fast.

I could give you other examples that were/are there years and decades
later, because the pain threshold never got high enough to dedicate
the time (and the VERY few people with experience).  I know of at
least one such problem in generic TCP/IP (i.e. on Linux, IRIX,
AIX and possibly Solaris) that has been there for decades and causes
occasional failure in most networked applications/protocols.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From andreas.raab at gmx.de  Tue Sep  5 11:17:25 2006
From: andreas.raab at gmx.de (Andreas Raab)
Date: Tue, 05 Sep 2006 02:17:25 -0700
Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: <E1GKWel-0002uF-Kr@draco.cus.cam.ac.uk>
References: <E1GKWel-0002uF-Kr@draco.cus.cam.ac.uk>
Message-ID: <44FD40A5.8090406@gmx.de>

Nick Maclaren wrote:
> The word "better" is emotive and inaccurate.  Such calculations are
> numerically meaningless, and merely encourage the confusion between
> consistency and correctness.  There is a strong sense in which giving
> random results between -1 and 1 would be better.

I did, of course, mean more consistent (and yes, random consistent 
results would be "better" by this definition and indeed I would prefer 
that over inconsistent but more accurate results ;-)

Cheers,
   - Andreas

From gjcarneiro at gmail.com  Tue Sep  5 15:44:14 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Tue, 5 Sep 2006 14:44:14 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <aac2c7cb0609042041q357e3746t187d93be17bd5770@mail.gmail.com>
References: <20060904175541.1717.1728502156.divmod.quotient.39053@ohm>
	<E1GKJQs-0006g7-IA@draco.cus.cam.ac.uk>
	<a467ca4f0609041631o719c5ea2s85e58c2309be452@mail.gmail.com>
	<aac2c7cb0609042041q357e3746t187d93be17bd5770@mail.gmail.com>
Message-ID: <a467ca4f0609050644v6a94b8daw4131ecaad0c64b33@mail.gmail.com>

On 9/5/06, Adam Olsen <rhamph at gmail.com> wrote:
> On 9/4/06, Gustavo Carneiro <gjcarneiro at gmail.com> wrote:
> >   Now, we've had this API for a long time already (at least 2.5
> > years).  I'm pretty sure it works well enough on most *nix systems.
> > Even if it works 99% of the time, it's way better than *failing*
> > *100%* of the time, which is what happens now with Python.
>
> Failing 99% of the time is as bad as failing 100% of the time, if your
> goal is to eliminate the short timeout on poll().  1% is quite a lot,
> and it would probably have an annoying tendency to trigger repeatedly
> when the user does certain things (not reproducible by you of course).
>
> That said, I do hope we can get 100%, or at least enough nines that we
> can increase the timeout significantly.

  Anyway, I was speaking hypothetically.  I'm pretty sure writing to a
pipe is async signal safe.  It is the oldest trick in the book,
everyone uses it.  I don't have to see a written signed contract to
know that it works.

  Here's a list of web sites google found me that talk about this problem:

This one describes the pipe writing technique:
  http://www.cocoadev.com/index.pl?SignalSafety

This one presents a list of "The only routines that POSIX guarantees
to be Async-Signal-Safe":
  http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWdev/MTP/p40.html#GEN-95948

Also here:

http://www.cs.usyd.edu.au/cgi-bin/man.cgi?section=5&topic=attributes

  This is all the evidence that I need.  And again I reiterate that
whether or not async safety can be achieved in practice for all
platforms is not Python's problem.  Although I believe writing to a
pipe is 100% reliable for most platforms.  Even if it is not, any
mission critical application relying on signals for correct behaviour
should be rewritten to use unix sockets instead; end of argument.

From nmm1 at cus.cam.ac.uk  Tue Sep  5 15:53:45 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Tue, 05 Sep 2006 14:53:45 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID: <E1GKbN3-0006wP-Pw@draco.cus.cam.ac.uk>

"Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
>
> Anyway, I was speaking hypothetically.  I'm pretty sure writing to a
> pipe is async signal safe.  It is the oldest trick in the book,
> everyone uses it.  I don't have to see a written signed contract to
> know that it works.

Ah.  Well, I can assure you that it's not the oldest trick in the book,
and not everyone uses it.

> This is all the evidence that I need.  And again I reiterate that
> whether or not async safety can be achieved in practice for all
> platforms is not Python's problem.

I wish you the joy of trying to report a case where it doesn't work
to a large vendor and get them to accept that it is a bug.

> Although I believe writing to a
> pipe is 100% reliable for most platforms.  Even if it is not, any
> mission critical application relying on signals for correct behaviour
> should be rewritten to use unix sockets instead; end of argument.

Er, no.  There are lots of circumstances where that isn't feasible,
such as wanting to close down an application cleanly when the scheduler
sends it a SIGXCPU.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From gustavo at niemeyer.net  Tue Sep  5 17:28:33 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Tue, 5 Sep 2006 12:28:33 -0300
Subject: [Python-Dev] no remaining issues blocking 2.5 release
In-Reply-To: <ee2a432c0609042114v569ac00ewf0b85ecc47a441c4@mail.gmail.com>
References: <ee2a432c0608142131g59e2739dg2b4545ca7eb19b24@mail.gmail.com>
	<20060815164114.GB23991@niemeyer.net>
	<ee2a432c0609042114v569ac00ewf0b85ecc47a441c4@mail.gmail.com>
Message-ID: <20060905152833.GA12378@niemeyer.net>

> Did you still want this addressed?  Anthony and I made some comments
> on the bug/patch, but nothing has been updated.

I was waiting because I got unassigned from the bug, so I thought
the maintainer was stepping up.  I'll commit a fix for it today.

Thanks for pinging me,

-- 
Gustavo Niemeyer
http://niemeyer.net

From jimjjewett at gmail.com  Tue Sep  5 18:08:19 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 5 Sep 2006 12:08:19 -0400
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>
References: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>
Message-ID: <fb6fbf560609050908te8e0ab1vc4bed30be49f67b5@mail.gmail.com>

Reversing the order of the return tuple will break the alignment with
split/rsplit.

Why not just change which of the three strings holds the remainder in
the not-found case?

In rc1,

    "d".rpartition(".") --> ('d', '', '')

If that changes to

    "d".rpartition(".") --> ('', '', 'd')

then
(1)  the loop will terminate (see the sketch below)
(2)  rpartition will be more parallel to partition (and split),
(3)  people who used rpartition without looping to termination (and
therefore didn't catch the problem) will still be able to use their
existing working code.
(4)  the existing docstring would remain correct, though it could
still be improved.  (It says "returns S and two empty strings", but
doesn't specify the order.)
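
To make point (1) concrete, here is the sort of loop in question (a
quick sketch, not code from the patch):

def components_right_to_left(name):
    # Repeatedly chop the last dot-separated component off name.
    parts = []
    rest = name
    while rest:
        rest, sep, last = rest.rpartition(".")
        parts.append(last)
    return parts

With the rc1 behavior the final not-found step leaves the remainder in
the first slot, so rest never becomes empty and the loop spins forever;
with ('', '', 'd') it terminates, and components_right_to_left("a.b.c")
gives ['c', 'b', 'a'].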

-jJ

From rhettinger at ewtllc.com  Tue Sep  5 18:13:49 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 09:13:49 -0700
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <fb6fbf560609050908te8e0ab1vc4bed30be49f67b5@mail.gmail.com>
References: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>
	<fb6fbf560609050908te8e0ab1vc4bed30be49f67b5@mail.gmail.com>
Message-ID: <44FDA23D.2060602@ewtllc.com>

Jim Jewett wrote:

>
>Why not just change which of the three strings holds the remainder in
>the not-found case?
>  
>

That was the only change submitted.
Are you happy with what was checked-in?


Raymond



From jdahlin at async.com.br  Tue Sep  5 18:18:20 2006
From: jdahlin at async.com.br (Johan Dahlin)
Date: Tue, 05 Sep 2006 13:18:20 -0300
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GKbN3-0006wP-Pw@draco.cus.cam.ac.uk>
References: <E1GKbN3-0006wP-Pw@draco.cus.cam.ac.uk>
Message-ID: <44FDA34C.6030605@async.com.br>

Nick Maclaren wrote:
> "Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
>> Anyway, I was speaking hypothetically.  I'm pretty sure writing to a
>> pipe is async signal safe.  It is the oldest trick in the book,
>> everyone uses it.  I don't have to see a written signed contract to
>> know that it works.
> 
> Ah.  Well, I can assure you that it's not the oldest trick in the book,
> and not everyone uses it.
> 
>> This is all the evidence that I need.  And again I reiterate that
>> whether or not async safety can be achieved in practice for all
>> platforms is not Python's problem.
> 
> I wish you the joy of trying to report a case where it doesn't work
> to a large vendor and get them to accept that it is a bug.

Are you saying that we should let less commonly used platforms dictate
features and functionality for the popular ones?
I mean, who uses HP/UX, SCO and [insert your favorite flavor] as a modern
desktop system where this particular bug makes a difference?

Can't this just be enabled for platforms where it's known to work, leaving
Python as it currently is for the users of these legacy systems?

Johan


From jimjjewett at gmail.com  Tue Sep  5 18:47:26 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 5 Sep 2006 12:47:26 -0400
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <44FDA23D.2060602@ewtllc.com>
References: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>
	<fb6fbf560609050908te8e0ab1vc4bed30be49f67b5@mail.gmail.com>
	<44FDA23D.2060602@ewtllc.com>
Message-ID: <fb6fbf560609050947l35980142jea281453a4f360a8@mail.gmail.com>

> Jim Jewett wrote:
> >Why not just change which of the three strings holds the remainder in
> >the not-found case?

On 9/5/06, Raymond Hettinger <rhettinger at ewtllc.com> wrote:
> That was the only change submitted.
> Are you happy with what was checked-in?

This change looks wrong:

 PyDoc_STRVAR(rpartition__doc__,
-"S.rpartition(sep) -> (head, sep, tail)\n\
+"S.rpartition(sep) -> (tail, sep, head)\n\

It looks like the code itself does the right thing, but I wasn't quite
confident of that.

-jJ

From rhettinger at ewtllc.com  Tue Sep  5 19:10:47 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 10:10:47 -0700
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <fb6fbf560609050947l35980142jea281453a4f360a8@mail.gmail.com>
References: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>	
	<fb6fbf560609050908te8e0ab1vc4bed30be49f67b5@mail.gmail.com>	
	<44FDA23D.2060602@ewtllc.com>
	<fb6fbf560609050947l35980142jea281453a4f360a8@mail.gmail.com>
Message-ID: <44FDAF97.3050502@ewtllc.com>


>
> This change looks wrong:
>
> PyDoc_STRVAR(rpartition__doc__,
> -"S.rpartition(sep) -> (head, sep, tail)\n\
> +"S.rpartition(sep) -> (tail, sep, head)\n\
>
> It looks like the code itself does the right thing, but I wasn't quite
> confident of that.
>
It is correct.  There may be some confusion in terminology.  Head and 
tail do not mean left-side or right-side. Instead, they refer to the 
"small part chopped-off" and "the rest that is still choppable". Think 
of head and tail in the sense of car and cdr.

A post-condition invariant for both str.partition() and str.rpartition() is:

    assert sep not in head

For non-looping cases, users are likely to use different variable names 
when they unpack the tuple:

   left, middle, right = s.rpartition(p)

But when they perform multiple partitions, the "tail" or "rest" 
terminology is more appropriate for the part of the string that may 
still contain separators.
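
Concretely, with that naming (a quick illustration of my own, not taken
from the checked-in docs):

tail, sep, head = "a.b.c".rpartition(".")
assert (tail, sep, head) == ("a.b", ".", "c")
assert sep not in head          # the chopped-off piece is sep-free

head, sep, tail = "a.b.c".partition(".")
assert (head, sep, tail) == ("a", ".", "b.c")
assert sep not in head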


Raymond
 







From mcherm at mcherm.com  Tue Sep  5 19:24:46 2006
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue, 05 Sep 2006 10:24:46 -0700
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
Message-ID: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>

Jim Jewett writes:
> This change [in docs] looks wrong:
>
> PyDoc_STRVAR(rpartition__doc__,
> -"S.rpartition(sep) -> (head, sep, tail)\n\
> +"S.rpartition(sep) -> (tail, sep, head)\n\

Raymond Hettinger replies:
> It is correct.  There may be some confusion in terminology.  Head  
> and tail do not mean left-side or right-side. Instead, they refer to  
> the "small part chopped-off" and "the rest that is still choppable".  
> Think of head and tail in the sense of car and cdr.


It is incorrect. The purpose of documentation is to explain
things to users, and documentation which fails to achieve this
is not "correct". The level of confusion generated by using "head"
to refer to the last part of the string and "tail" to refer to
the beginning is quite significant.

How about something like this:

    S.partition(sep) -> (head, sep, tail)
    S.rpartition(sep) -> (tail, sep, rest)

Perhaps someone else can find something clearer than my suggestion,
but in my own head, the terms "head" and "tail" are tighly bound
with the idea of beginning and end (respectively) rather than with
the idea of "small part chopped off" and "big part that is still
choppable".

-- Michael Chermside


From barry at python.org  Tue Sep  5 19:26:15 2006
From: barry at python.org (Barry Warsaw)
Date: Tue, 5 Sep 2006 13:26:15 -0400
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <44FDAF97.3050502@ewtllc.com>
References: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>	
	<fb6fbf560609050908te8e0ab1vc4bed30be49f67b5@mail.gmail.com>	
	<44FDA23D.2060602@ewtllc.com>
	<fb6fbf560609050947l35980142jea281453a4f360a8@mail.gmail.com>
	<44FDAF97.3050502@ewtllc.com>
Message-ID: <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 5, 2006, at 1:10 PM, Raymond Hettinger wrote:

>> This change looks wrong:
>>
>> PyDoc_STRVAR(rpartition__doc__,
>> -"S.rpartition(sep) -> (head, sep, tail)\n\
>> +"S.rpartition(sep) -> (tail, sep, head)\n\
>>
>> It looks like the code itself does the right thing, but I wasn't  
>> quite
>> confident of that.
>>
> It is correct.  There may be some confusion in terminology.  Head and
> tail do not mean left-side or right-side. Instead, they refer to the
> "small part chopped-off" and "the rest that is still choppable". Think
> of head and tail in the sense of car and cdr.
>
> A post-condition invariant for both str.partition() and  
> str.rpartition() is:
>
>     assert sep not in head
>
> For non-looping cases, users will likely to use different variable  
> names
> when they unpack the tuple:
>
>    left, middle, right = s.rpartition(p)
>
> But when they perform multiple partitions, the "tail" or "rest"
> terminology is more appropriate for the part of the string that may
> still contain separators.

ISTM this is just begging for newbie (and maybe not-so-newbie)  
confusion.  Why not just document both as returning (left, sep,  
right) which seems the most obvious description of what the methods  
return?

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRP2zPHEjvBPtnXfVAQKpvQP/X1Vg9G4gZLl9R7/fnevmfeszTbqVk1Bq
V7aXYm5pTFiD27cKV2e7MKZPifob6Pg8NPjsvAh6jZU5Uj0BUQhIwgDXZpcivsTM
MykyPz8oVpSLRhu5xfYU1IZjbogoKfPQ04FkqWgtM2QUqKjiLcvwzPnzLNLVxx9r
v2LplvrqJyc=
=Tckf
-----END PGP SIGNATURE-----

From rhettinger at ewtllc.com  Tue Sep  5 19:46:01 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 10:46:01 -0700
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org>
References: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>	
	<fb6fbf560609050908te8e0ab1vc4bed30be49f67b5@mail.gmail.com>	
	<44FDA23D.2060602@ewtllc.com>
	<fb6fbf560609050947l35980142jea281453a4f360a8@mail.gmail.com>
	<44FDAF97.3050502@ewtllc.com>
	<2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org>
Message-ID: <44FDB7D9.5040108@ewtllc.com>


> ISTM this is just begging for newbie (and maybe not-so-newbie)  
> confusion.  Why not just document both as returning (left, sep,  
> right) which seems the most obvious description of what the methods  
> return?


I'm fine with that (though it's a little sad that we think the rather 
basic concepts of head and tail are beyond the grasp of typical 
pythonistas).

Changing to left/sep/right will certainly disambiguate questions about 
the ordering of the return tuple.  OTOH, there is some small loss in 
that the head/tail terminology is highly suggestive of how to use the 
function when making successive partitions.


Raymond

From fdrake at acm.org  Tue Sep  5 19:51:49 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 5 Sep 2006 13:51:49 -0400
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
Message-ID: <200609051351.50494.fdrake@acm.org>

On Tuesday 05 September 2006 13:24, Michael Chermside wrote:
 > How about something like this:
 >
 >     S.partition(sep) -> (head, sep, tail)
 >     S.rpartition(sep) -> (tail, sep, rest)

I think I prefer:

    S.partition(sep) -> (head, sep, rest)
    S.rpartition(sep) -> (tail, sep, rest)

Here, "rest" is always used for "what remains"; head/tail are somewhat more 
clear here I think.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From barry at python.org  Tue Sep  5 19:52:45 2006
From: barry at python.org (Barry Warsaw)
Date: Tue, 5 Sep 2006 13:52:45 -0400
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <44FDB7D9.5040108@ewtllc.com>
References: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>	
	<fb6fbf560609050908te8e0ab1vc4bed30be49f67b5@mail.gmail.com>	
	<44FDA23D.2060602@ewtllc.com>
	<fb6fbf560609050947l35980142jea281453a4f360a8@mail.gmail.com>
	<44FDAF97.3050502@ewtllc.com>
	<2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org>
	<44FDB7D9.5040108@ewtllc.com>
Message-ID: <76BC85F2-2184-476C-8059-A1944BBDD194@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 5, 2006, at 1:46 PM, Raymond Hettinger wrote:

>> ISTM this is just begging for newbie (and maybe not-so-newbie)   
>> confusion.  Why not just document both as returning (left, sep,   
>> right) which seems the most obvious description of what the  
>> methods  return?
>
>
> I'm fine with that (though it's a little sad that we think the  
> rather basic concepts of head and tail are beyond the grasp of  
> typical pythonistas).
>
> Changing to left/sep/right will certainly disambiguate questions  
> about the ordering of the return tuple.  OTOH, there is some small  
> loss in that the head/tail terminology is highly suggestive of how  
> to use the function when making successive partitions.

Personally, I'd rather the docstring be clear and concise rather than  
suggestive of use cases.  IMO, the latter would be better served as  
an example in the latex documentation.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRP25cXEjvBPtnXfVAQJ4EwQAuKnVxtyabdtAv/Eu9CcZ8EkcwCJYOoAT
DmgMWeml861Sn4qN6NV1vMKbXljxiKqoSBgbKdpU+FRb6TeNiCisuWA0Q9xoOfsj
Jyvy3XN54WXCUBNBnfsfUROPqxjiNGnKxYUzx2a+pjkeSSSZxDzbuplU+2ijB6w4
HJWIT4JLldA=
=u6iU
-----END PGP SIGNATURE-----

From fdrake at acm.org  Tue Sep  5 19:55:17 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 5 Sep 2006 13:55:17 -0400
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <44FDB7D9.5040108@ewtllc.com>
References: <fb6fbf560609050904r238528c2q21fc26cf5e29795f@mail.gmail.com>
	<2BFAE30C-5B69-416A-AD76-7C5AD7D53DC7@python.org>
	<44FDB7D9.5040108@ewtllc.com>
Message-ID: <200609051355.18117.fdrake@acm.org>

On Tuesday 05 September 2006 13:46, Raymond Hettinger wrote:
 > Changing to left/sep/right will certainly disambiguate questions about

left/right is definitely not helpful.  It's also ambiguous in the case 
of .rpartition(), where left and right in the input and result are different.

 > the ordering of the return tuple.  OTOH, there is some small loss in
 > that the head/tail terminology is highly suggestive of how to use the
 > function when making successive partitions.

See my previous note in this thread for another suggestion.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From jimjjewett at gmail.com  Tue Sep  5 20:02:31 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 5 Sep 2006 14:02:31 -0400
Subject: [Python-Dev] Fwd: Problem withthe API for str.rpartition()
In-Reply-To: <200609051351.50494.fdrake@acm.org>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
Message-ID: <fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>

On 9/5/06, Fred L. Drake, Jr. <fdrake at acm.org> wrote:

>     S.partition(sep) -> (head, sep, rest)
>     S.rpartition(sep) -> (tail, sep, rest)

> Here, "rest" is always used for "what remains"; head/tail are somewhat more
> clear here I think.

Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail)

Another possibility is data (for head/tail) and unparsed (for rest).

    S.partition(sep) -> (data, sep, unparsed)
    S.rpartition(sep) -> (unparsed, sep, data)

I'm not sure which is worse --
(1)  distinguishing between tail and rest
(2)  using (overly generic) jargon like unparsed and data.

Whatever the final decision, it would probably be best to add an
example to the docstring.   "a.b.c".rpartition(".") -> ("a.b", ".",
"c")

-jJ

From rhettinger at ewtllc.com  Tue Sep  5 20:06:19 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 11:06:19 -0700
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
Message-ID: <44FDBC9B.6050406@ewtllc.com>


>
> Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail)

Gads, the cure is worse than the disease.

car and cdr are starting to look pretty good ;-)


Raymond

From fdrake at acm.org  Tue Sep  5 20:10:33 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 5 Sep 2006 14:10:33 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
Message-ID: <200609051410.34201.fdrake@acm.org>

On Tuesday 05 September 2006 14:02, Jim Jewett wrote:
 > Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail)

Whichever matches reality, sure.  I've lost track of the rpartition() result 
order.  --sigh--

 > Another possibility is data (for head/tail) and unparsed (for rest).
 >
 >     S.partition(sep) -> (data, sep, unparsed)
 >     S.rpartition(sep) -> (unparsed, sep, data)

It's all data, so I think that's too contrived.

 > I'm not sure which is worse --
 > (1)  distinguishing between tail and rest
 > (2)  using (overly generic) jargon like unparsed and data.

I don't see the distinction between tail and rest as problematic.  But I've 
not used lisp for a long time.

 > Whatever the final decision, it would probably be best to add an
 > example to the docstring.   "a.b.c".rpartition(".") -> ("a.b", ".",
 > "c")

Agreed.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From barry at python.org  Tue Sep  5 20:12:16 2006
From: barry at python.org (Barry Warsaw)
Date: Tue, 5 Sep 2006 14:12:16 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FDBC9B.6050406@ewtllc.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDBC9B.6050406@ewtllc.com>
Message-ID: <6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 5, 2006, at 2:06 PM, Raymond Hettinger wrote:

>> Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail)
>
> Gads, the cure is worse than the disease.
>
> car and cdr are starting to look pretty good ;-)

LOL, the lisper in me likes that too, but I don't think it'll work. :)

Fred's disagreement notwithstanding, I still like (left, sep, right),  
but another alternative comes to mind after actually reading the  
docstring for rpartition <wink>: (before, sep, after).  Now, that's  
not ambiguous is it?  Seems to work for both partition and rpartition.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRP2+AHEjvBPtnXfVAQLiPAP+N80jHkoT5VNTtX1h2cqD4pONz+j2maCI
QXDBoODucxLDPrig8FJ3c6IcT+Uapifu8Rrvd7Vm8gSPMUsMqAgAqhqNDbXTkHVH
xLk31en2k2fdiCQKQyKJSjE1R1CaFCezByV29FK3fWvqrrxObISRnsxf/wXB6Czu
pOUNSA9LLKo=
=g+iz
-----END PGP SIGNATURE-----

From Scott.Daniels at Acm.Org  Tue Sep  5 20:16:56 2006
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Tue, 05 Sep 2006 11:16:56 -0700
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <44FDA34C.6030605@async.com.br>
References: <E1GKbN3-0006wP-Pw@draco.cus.cam.ac.uk>
	<44FDA34C.6030605@async.com.br>
Message-ID: <edket0$oim$1@sea.gmane.org>

Johan Dahlin wrote:
> Nick Maclaren wrote:
>> "Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
>>> ....  I'm pretty sure writing to a pipe is async signal safe.  It is the
>>> oldest trick in the book, everyone uses it.  I ...  know that it works.
>> Ah.  Well, I can assure you that it's not the oldest trick in the book,
>> and not everyone uses it.
> ...
> Can't this just be enabled for platforms where it's known to work and let
> Python as it currently is for the users of these legacy systems ?

Ah, but that _is_ the current state of affairs.  .5 :-)

-- Scott David Daniels
Scott.Daniels at Acm.Org


From jjl at pobox.com  Tue Sep  5 20:22:11 2006
From: jjl at pobox.com (John J Lee)
Date: Tue, 5 Sep 2006 19:22:11 +0100 (GMT Standard Time)
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <200609051351.50494.fdrake@acm.org>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
Message-ID: <Pine.WNT.4.64.0609051911440.8416@shaolin>

On Tue, 5 Sep 2006, Fred L. Drake, Jr. wrote:

> On Tuesday 05 September 2006 13:24, Michael Chermside wrote:
> > How about something like this:
> >
> >     S.partition(sep) -> (head, sep, tail)
> >     S.rpartition(sep) -> (tail, sep, rest)
>
> I think I prefer:
>
>    S.partition(sep) -> (head, sep, rest)
>    S.rpartition(sep) -> (tail, sep, rest)
>
> Here, "rest" is always used for "what remains"; head/tail are somewhat more
> clear here I think.

But isn't rest in the wrong place there, for rpartition?  That's not the 
string that you might typically call .rpartition() on a second time.  How 
about:

     S.partition(sep) -> (left, sep, rest)
     S.rpartition(sep) -> (rest, sep, right)


John

From brett at python.org  Tue Sep  5 20:25:53 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 5 Sep 2006 11:25:53 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
Message-ID: <bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>

On 9/4/06, Neal Norwitz <nnorwitz at gmail.com> wrote:
>
> There are 3 bugs currently listed in PEP 356 as blocking:
>         http://python.org/sf/1551432 - __unicode__ breaks on exception
> classes


I replied on the bug report, but might as well comment here.

The problem with this bug is that BaseException now defines a __unicode__()
method in its PyMethodDef.  That intercepts the unicode() call on the class
and it complains it was not handed an instance.  I guess the only way to fix
this is to toss out the __unicode__() method and change the tp_str function
to return Unicode as needed (unless someone else has a better idea).  Or the
bug can be closed as Won't Fix.
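
A rough sketch of the symptom from memory (output abridged, and the error
text paraphrased rather than copied from a live 2.5c1 session):

>>> unicode(ValueError('x'))   # an instance is fine: __unicode__ gets bound to it
u'x'
>>> unicode(ValueError)        # the bare class trips over the unbound __unicode__
Traceback (most recent call last):
  ...
TypeError: __unicode__() needs an instance as its argument   # wording paraphrased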

>         http://python.org/sf/1550938 - improper exception w/relative import
>         http://python.org/sf/1541697 - sgmllib regexp bug causes hang
>
> Does anyone want to fix the sgmllib issue?  If not, we should revert
> this week before c2 is cut.  I'm hoping that we will have *no changes*
> in 2.5 final from c2.  Should there be any bugs/patches added to or
> removed from the list?
>
> The buildbots are currently humming along, but I believe all 3
> versions (2.4, 2.5, and 2.6) are fine.
>
> Test out 2.5c1+ and report all bugs!
>
> n
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/brett%40python.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060905/8ac53a65/attachment-0001.htm 

From seojiwon at gmail.com  Tue Sep  5 20:33:59 2006
From: seojiwon at gmail.com (Jiwon Seo)
Date: Tue, 5 Sep 2006 11:33:59 -0700
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDBC9B.6050406@ewtllc.com>
	<6B0AEAE3-A77E-4CE3-956E-14CF31F26FD8@python.org>
Message-ID: <b008462b0609051133t2001c2f4r68f66d08769e8a4f@mail.gmail.com>

On 9/5/06, Barry Warsaw <barry at python.org> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Sep 5, 2006, at 2:06 PM, Raymond Hettinger wrote:
>
> >> Then shouldn't rpartition be S.rpartition(sep) -> (rest, sep, tail)
> >
> > Gads, the cure is worse than the disease.
> >
> > car and cdr are starting to look pretty good ;-)
>
> LOL, the lisper in me likes that too, but I don't think it'll work. :)
>

but when it comes to cadr, cddr, cdar... ;^)

I personally prefer (left, sep, right) since it's the clearest, and
there are many Python programmers whose first language is not English.

> Fred's disagreement notwithstanding, I still like (left, sep, right),
> but another alternative comes to mind after actually reading the
> docstring for rpartition <wink>: (before, sep, after).  Now, that's
> not ambiguous is it?  Seems to work for both partition and rpartition.
>
> - -Barry
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.5 (Darwin)
>
> iQCVAwUBRP2+AHEjvBPtnXfVAQLiPAP+N80jHkoT5VNTtX1h2cqD4pONz+j2maCI
> QXDBoODucxLDPrig8FJ3c6IcT+Uapifu8Rrvd7Vm8gSPMUsMqAgAqhqNDbXTkHVH
> xLk31en2k2fdiCQKQyKJSjE1R1CaFCezByV29FK3fWvqrrxObISRnsxf/wXB6Czu
> pOUNSA9LLKo=
> =g+iz
> -----END PGP SIGNATURE-----
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/seojiwon%40gmail.com
>

From rhettinger at ewtllc.com  Tue Sep  5 20:32:46 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 11:32:46 -0700
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
Message-ID: <44FDC2CE.1040902@ewtllc.com>

Jim Jewett wrote: 

>
> Another possibility is data (for head/tail) and unparsed (for rest).
>
>    S.partition(sep) -> (data, sep, unparsed)
>    S.rpartition(sep) -> (unparsed, sep, data)


This communicates very little about the ordering of the return tuple.  
Beware of overly general terms like "data" that provide no hints about 
the semantics of the method.

The one good part is that the terms are consistent between partition and 
rpartition, so that the invariant can be stated:

    assert sep not in datum

I recommend we just leave the existing head/tail wording and add an 
example which will make the meaning instantly clear:
   'www.python.org'.rpartition('.')    -->   ('www.python', '.', 'org')
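
For symmetry, the forward-direction case reads (same string, illustration only):
   'www.python.org'.partition('.')     -->   ('www', '.', 'python.org')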

Also, remember that this discussion is being held in the abstract.  An 
actual user of rpartition() is already thinking in terms of parsing from 
the end of the string.

Another thought is that strings don't really have a left and right.  
They have a beginning and end.  The left/right or top/bottom distinction 
is culture specific.


Raymond


BTW, if someone chops your ankles, does it matter which way you're 
facing to decide whether it was your feet or your head that had been 
cut off?




From rrr at ronadam.com  Tue Sep  5 20:35:40 2006
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 05 Sep 2006 13:35:40 -0500
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
Message-ID: <44FDC37C.80304@ronadam.com>

Michael Chermside wrote:
> Jim Jewett writes:
>> This change [in docs] looks wrong:
>>
>> PyDoc_STRVAR(rpartition__doc__,
>> -"S.rpartition(sep) -> (head, sep, tail)\n\
>> +"S.rpartition(sep) -> (tail, sep, head)\n\
> 
> Raymond Hettinger replies:
>> It is correct.  There may be some confusion in terminology.  Head  
>> and tail do not mean left-side or right-side. Instead, they refer to  
>> the "small part chopped-off" and "the rest that is still choppable".  
>> Think of head and tail in the sense of car and cdr.
> 
> 
> It is incorrect. The purpose of documentation is to explain
> things to users, and documentation which fails to achieve this
> is not "correct". The level of confusion generated by using "head"
> to refer to the last part of the string and "tail" to refer to
> the beginning, is quite significant.
> 
> How about something like this:
> 
>     S.partition(sep) -> (head, sep, tail)
>     S.rpartition(sep) -> (tail, sep, rest)

It isn't immediately clear to me what I will get from this.

      s.partition(sep) -> (left, sep, right)
      s.rpartition(sep) -> (left, sep, right)

Would be clearer, along with an explanation of what left and right are.

I hope this discussion is only about the words used and the
documentation and not about the actual order of what is received. I
would expect both the following should be true, and it is the current
behavior.

     ''.join(s.partition(sep)) -> s
     ''.join(s.rpartition(sep)) -> s



> Perhaps someone else can find something clearer than my suggestion,
> but in my own head, the terms "head" and "tail" are tighly bound
> with the idea of beginning and end (respectively) rather than with
> the idea of "small part chopped off" and "big part that is still
> choppable".

Maybe this?


partition(...)
     S.partition(sep) -> (left, sep, right)

     Partition a string at the first occurrence of sep from the
     left into a tuple of left, sep, and right parts.

     Returns (S, '', '') if sep is not found in S.


rpartition(...)
     S.rpartition(sep) -> (left, sep, right)

     Partition a string at the first occurrence of sep from the right
     into a tuple of left, sep, and right parts.

     Returns ('', '', S) if sep is not found in S.


I feel that terms like head, tail, and rest should be used in examples,
where their meaning will be clear from the context they are used in, but
not in the definitions, where their meanings are not obvious.

Cheers,
    Ron






From rrr at ronadam.com  Tue Sep  5 20:44:40 2006
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 05 Sep 2006 13:44:40 -0500
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FDC2C5.2080709@ronadam.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<44FDC2C5.2080709@ronadam.com>
Message-ID: <44FDC598.2000106@ronadam.com>

Ron Adam wrote:

Correcting myself...

> I hope this discussion is only about the words used and the 
> documentation and not about the actual order of what is received. I 
> would expect both the following should be true, and it is the current 
> behavior.
> 
>     ''.join(s.partition(sep)) -> s
>     ''.join(s.rpartition(sep)) -> s

>>> 'abcd'.partition('x')
('abcd', '', '')
>>> 'abcd'.rpartition('x')
('abcd', '', '')
>>>

Ok, I see Raymonds point, they are not what I expected.

Although the above is still true, the returned value for the not found condition
is inconsistent.

_Ron





From g.brandl at gmx.net  Tue Sep  5 20:49:01 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 05 Sep 2006 20:49:01 +0200
Subject: [Python-Dev] 2.5 status
In-Reply-To: <bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
	<bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
Message-ID: <edkgqv$v3m$1@sea.gmane.org>

Brett Cannon wrote:
> 
> 
> On 9/4/06, *Neal Norwitz* <nnorwitz at gmail.com
> <mailto:nnorwitz at gmail.com>> wrote:
> 
>     There are 3 bugs currently listed in PEP 356 as blocking:
>             http://python.org/sf/1551432 - __unicode__ breaks on
>     exception classes
> 
> 
> I replied on the bug report, but might as well comment here.
> 
> The problem with this bug is that BaseException now defines a
> __unicode__() method in its PyMethodDef.  That intercepts the unicode()
> call on the class and it complains it was not handed an instance.  I
> guess the only way to fix this is to toss out the __unicode__() method
> and change the tp_str function to return Unicode as needed (unless
> someone else has a better idea).  Or the bug can be closed as Won't Fix.

Throwing out the __unicode__ method is fine with me -- exceptions didn't
have one before the NeedForSpeed rewrite, so there would be no loss in
functionality.

Georg


From barry at python.org  Tue Sep  5 20:51:13 2006
From: barry at python.org (Barry Warsaw)
Date: Tue, 5 Sep 2006 14:51:13 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FDC2CE.1040902@ewtllc.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDC2CE.1040902@ewtllc.com>
Message-ID: <CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 5, 2006, at 2:32 PM, Raymond Hettinger wrote:

> Another thought is that strings don't really have a left and right.
> They have a beginning and end.  The left/right or top/bottom  
> distinction
> is culture specific.

For the target of the method, this is true, but it's not true for the  
results, which are what we're describing here.  'left' is  
whatever is to the left of the separator and 'right' is whatever is  
to the right of the separator.  Seems obvious to me.

I believe (left, sep, right) will be the clearest description for all  
users, with little chance of confusion.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRP3HIXEjvBPtnXfVAQIx5wP+MPF5tk4moX4jH0yhGvR6gKcGBusyN152
redIr0xiNqECfrIHkc756UDLn3HhB2WdEjR9pn06RzmbgePMPcGP19cjZdHGwjFK
3e4Qg8zW3cL0iCnybL4AEaoZksuHGwJpZbId9HF60GFqYdjNTKEMNIVRI7jTE9pP
zbBO6Sscnl0=
=HB4k
-----END PGP SIGNATURE-----

From rrr at ronadam.com  Tue Sep  5 20:58:30 2006
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 05 Sep 2006 13:58:30 -0500
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FDC2CE.1040902@ewtllc.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>		<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDC2CE.1040902@ewtllc.com>
Message-ID: <44FDC8D6.5090002@ronadam.com>

Raymond Hettinger wrote:

> Another thought is that strings don't really have a left and right.  
> They have a beginning and end.  The left/right or top/bottom distinction 
> is culture specific.

Well, it should have been epartition() and not rpartition() in that case.   ;-)

Is Python ever edited in languages that don't use left-to-right lines?



From rhettinger at ewtllc.com  Tue Sep  5 21:06:03 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 12:06:03 -0700
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FDC37C.80304@ronadam.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<44FDC37C.80304@ronadam.com>
Message-ID: <44FDCA9B.60101@ewtllc.com>

Ron Adam wrote:

>I hope this discussion is only about the words used and the
>documentation and not about the actual order of what is received. I
>would expect both the following should be true, and it is the current
>behavior.
>
>     ''.join(s.partition(sep)) -> s
>     ''.join(s.rpartition(sep)) -> s
>
>  
>

Right.  The only thing in question is wording for the documentation. 

The viable options on the table are:
* Leave the current wording and add a clarifying example.
* Switch to left/sep/right and add a clarifying example.

The former tells you which part can still contain a separator and 
suggests how to use the tool when successive partitions are needed.  The 
latter makes the left/right ordering clear and tells you nothing about 
which part can still have the separators in it.  That has some import 
because the use cases for rpartition() all involve strings with multiple 
separators -- if there were only one, you would just use partition().

BTW, the last check-in fixed the return value for the sep-not-found 
case, so that now:
   'a'.partition('x') -->  ('a', '', '')
   'a'.rpartition('x') -->  ('', '', 'a')

This was necessary so that looping/recursion would work and so that 
rpartition() acts as a mirror-image of partition().
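
Purely as an illustration (the helper name below is made up, not something
from the patch), that mirror image is what lets a chop-from-the-end loop
terminate cleanly:

    def split_from_end(s, sep):
        # relies on 'a'.rpartition('x') returning ('', '', 'a') when sep is absent
        pieces = []
        while True:
            rest, found, piece = s.rpartition(sep)
            pieces.append(piece)
            if not found:
                return pieces
            s = rest

    # split_from_end('www.python.org', '.')  -->  ['org', 'python', 'www']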


Raymond






From tim.peters at gmail.com  Tue Sep  5 21:07:43 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 5 Sep 2006 15:07:43 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDC2CE.1040902@ewtllc.com>
	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
Message-ID: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com>

    upto, sep, rest

in whatever order they apply.  I think of a partition-like function as
starting at some position and matching "up to" the first occurrence of
the separator (be that left or right or diagonally, "up to" is
relative to the search direction), and leaving "the rest" alone. The
docs should match that, since my mental model is correct ;-)

From brett at python.org  Tue Sep  5 21:19:52 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 5 Sep 2006 12:19:52 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To: <edkgqv$v3m$1@sea.gmane.org>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
	<bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
	<edkgqv$v3m$1@sea.gmane.org>
Message-ID: <bbaeab100609051219m53b7b2audd341529743eff22@mail.gmail.com>

On 9/5/06, Georg Brandl <g.brandl at gmx.net> wrote:
>
> Brett Cannon wrote:
> >
> >
> > On 9/4/06, *Neal Norwitz* <nnorwitz at gmail.com
> > <mailto:nnorwitz at gmail.com>> wrote:
> >
> >     There are 3 bugs currently listed in PEP 356 as blocking:
> >             http://python.org/sf/1551432 - __unicode__ breaks on
> >     exception classes
> >
> >
> > I replied on the bug report, but might as well comment here.
> >
> > The problem with this bug is that BaseException now defines a
> > __unicode__() method in its PyMethodDef.  That intercepts the unicode()
> > call on the class and it complains it was not handed an instance.  I
> > guess the only way to fix this is to toss out the __unicode__() method
> > and change the tp_str function to return Unicode as needed (unless
> > someone else has a better idea).  Or the bug can be closed as Won't Fix.
>
> Throwing out the __unicode__ method is fine with me -- exceptions didn't
> have one before the NeedForSpeed rewrite, so there would be no loss in
> functionality.



If this step is done and the tp_str function is not changed to return
Unicode as needed, PEP 352 will need to be updated.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060905/a3423370/attachment.html 

From mal at egenix.com  Tue Sep  5 21:33:54 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 05 Sep 2006 21:33:54 +0200
Subject: [Python-Dev] 2.5 status
In-Reply-To: <bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
	<bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
Message-ID: <44FDD122.3000809@egenix.com>

Brett Cannon wrote:
> On 9/4/06, Neal Norwitz <nnorwitz at gmail.com> wrote:
>>
>> There are 3 bugs currently listed in PEP 356 as blocking:
>>         http://python.org/sf/1551432 - __unicode__ breaks on exception
>> classes
> 
> 
> I replied on the bug report, but might as well comment here.
> 
> The problem with this bug is that BaseException now defines a __unicode__()
> method in its PyMethodDef.  That intercepts the unicode() call on the class
> and it complains it was not handed an instance.  I guess the only way to
> fix this is to toss out the __unicode__() method and change the tp_str function
> to return Unicode as needed (unless someone else has a better idea).  Or
> the bug can be closed as Won't Fix.

The proper fix would be to introduce a tp_unicode slot and let
this decide what to do, ie. call .__unicode__() methods on instances
and use the .__name__ on classes.

I think this would be the right way to go for Python 2.6. For
Python 2.5, just dropping this .__unicode__ method on exceptions
is probably the right thing to do.

The reason why the PyObject_Unicode() function tries to be smart
here is that we don't have a tp_unicode slot (to complement
tp_str). It's obvious that this is not perfect, but only a
work-around.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 05 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From brett at python.org  Tue Sep  5 21:41:49 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 5 Sep 2006 12:41:49 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To: <44FDD122.3000809@egenix.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
	<bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
	<44FDD122.3000809@egenix.com>
Message-ID: <bbaeab100609051241m7d878b0dtd93018b535b9ee14@mail.gmail.com>

On 9/5/06, M.-A. Lemburg <mal at egenix.com> wrote:
>
> Brett Cannon wrote:
> > On 9/4/06, Neal Norwitz <nnorwitz at gmail.com> wrote:
> >>
> >> There are 3 bugs currently listed in PEP 356 as blocking:
> >>         http://python.org/sf/1551432 - __unicode__ breaks on exception
> >> classes
> >
> >
> > I replied on the bug report, but might as well comment here.
> >
> > The problem with this bug is that BaseException now defines a
> __unicode__()
> > method in its PyMethodDef.  That intercepts the unicode() call on the
> class
> > and it complains it was not handed an instance.  I guess the only way to
> > fix this is to toss out the __unicode__() method and change the tp_str
> function
> > to return Unicode as needed (unless someone else has a better idea).  Or
> > the bug can be closed as Won't Fix.
>
> The proper fix would be to introduce a tp_unicode slot and let
> this decide what to do, ie. call .__unicode__() methods on instances
> and use the .__name__ on classes.


That was my gut reaction, and what I said on the bug report.  Kind of
surprised one doesn't already exist.

I think this would be the right way to go for Python 2.6. For
> Python 2.5, just dropping this .__unicode__ method on exceptions
> is probably the right thing to do.


Neal, do you want to rip it out or should I?

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060905/f0862cc8/attachment.htm 

From p.f.moore at gmail.com  Tue Sep  5 21:41:58 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 5 Sep 2006 20:41:58 +0100
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDC2CE.1040902@ewtllc.com>
	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
	<1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com>
Message-ID: <79990c6b0609051241x35bfd75fia7a9d8bb095e1019@mail.gmail.com>

On 9/5/06, Tim Peters <tim.peters at gmail.com> wrote:
>     upto, sep, rest
>
> in whatever order they apply.  I think of a partition-like function as
> starting at some position and matching "up to" the first occurrence of
> the separator (be that left or right or diagonally, "up to" is
> relative to the search direction), and leaving "the rest" alone. The
> docs should match that, since my mental model is correct ;-)

+1

Paul

From nmm1 at cus.cam.ac.uk  Tue Sep  5 21:44:50 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Tue, 05 Sep 2006 20:44:50 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID: <E1GKgqo-00040l-3p@draco.cus.cam.ac.uk>

Johan Dahlin <jdahlin at async.com.br> wrote:
>
> Are you saying that we should let less commonly used platforms dictate
> features and functionality for the popular ones?
> I mean, who uses HP/UX, SCO and [insert your favorite flavor] as a modern
> desktop system where this particular bug makes a difference?

You haven't been following the thread.  As I posted, this problem
occurs to a greater or lesser degree on all platforms.  This will be
my last posting on the topic, but I shall try to explain.

The first problem is in the hardware and operating system.  A signal
interrupts the thread, and passes control to a handler with a very
partial environment and (usually) information on the environment
when it was interrupted.  If it interrupted the thread in the middle
of a system call or other library routine that uses non-Python
conventions, the registers and other state may be weird.  There ARE
solutions to this, but they are unbelievably foul, and even Linux
on x86 has had trouble with this.  And, on return, everything has to
be reversed entirely transparently!

It is VERY common for there to be bugs in the C run-time system and
not rare for there to be ones in the kernel (that area of Linux has
been rewritten MANY times, for this reason).  In many cases, the
run-time system simply doesn't pretend to handle interrupts in
arbitrary code (which is where the C undefined behaviour is used by
vendors).

The second problem is that what you can do depends both on what you
were doing and how your 'primitive' is implemented.  For example, if
you call something that takes out even a very short term lock or uses
a spin loop to emulate an atomic operation, you had better not use it
if you interrupted code that was doing the same.  Your thread may
hang, crash or otherwise go bananas.  Can you guarantee that even
write is free of such things?  No, and certainly not if you are using
a debugger, a profiling library or even tracing system calls.  I have
often used programs that crashed as soon as I did one of those :-(

Related to this is that it is EXTREMELY hard to write synchronisation
primitives (mutexes etc.) that are interrupt-safe - MUCH harder than
to write thread-safe ones - and few people are even aware of the
issues.  There was a thread on some Linux kernel mailing list about
this, and even the kernel developers were having headaches thinking
about the issues.

Even if write is atomic, there are gotchas.  What if the interrupted
code is doing something to that file at the time?  Are you SURE that
an unexpected operation on it (in the same thread) won't cause the
library function or program to get confused?  And can you be sure
that the write will terminate fast enough to not cause time-critical
code to fail?  And have you studied the exact semantics of blocking
on pipes?  They are truly horrible.

So this is NOT a matter of platform X is safe and platform Y isn't.
Even Linux x86 isn't entirely safe - or wasn't, the last time I heard.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From rhettinger at ewtllc.com  Tue Sep  5 22:13:02 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 13:13:02 -0700
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>	<44FDC2CE.1040902@ewtllc.com>	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
	<1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com>
Message-ID: <44FDDA4E.2080506@ewtllc.com>

Tim Peters wrote:

>    upto, sep, rest
>
>in whatever order they apply. 
>
In the rpartition case, that would be (rest, sep, upto) which seems a 
bit cryptic.

We need some choice of words that clearly mean:
 * the chopped-off snippet (guaranteed to not contain the separator)
 * the separator if found
 * the unchopped remainder of the string (which may contain a separator).

Of course, if a clear example is added, the choice of words becomes much 
less important.


Raymond



From barry at python.org  Tue Sep  5 22:17:20 2006
From: barry at python.org (Barry Warsaw)
Date: Tue, 5 Sep 2006 16:17:20 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FDDA4E.2080506@ewtllc.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>	<44FDC2CE.1040902@ewtllc.com>	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
	<1f7befae0609051207t479b8711g4ff3b719e46ca17@mail.gmail.com>
	<44FDDA4E.2080506@ewtllc.com>
Message-ID: <C4296304-7A32-4819-A34B-A3DB9E30F388@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 5, 2006, at 4:13 PM, Raymond Hettinger wrote:

> Tim Peters wrote:
>
>>    upto, sep, rest
>>
>> in whatever order they apply.
>>
> In the rpartition case, that would be (rest, sep, upto) which seems a
> bit cryptic.
>
> We need some choice of words that clearly mean:
>  * the chopped-off snippet (guaranteed to not contain the separator)
>  * the separator if found
>  * the unchopped remainer of the string (which may contain a  
> separator).
>
> Of course, if a clear example is added, the choice of words becomes  
> much
> less important.

Ideally too, the terminology (and order) for partition and rpartition  
would be the same.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRP3bVXEjvBPtnXfVAQJSKwP9Ev3MPzum3kp4hNDJZyBmEShzPvL2WQv2
VThbxZX1MDfeDXupNwF22bFA5gF/9vZp3nToUqyAbOaPSd93hJSHOdeWdAhR2BdT
EICkzBTGCtVkbqu3Ep1N/jb9GJUvgkgNAWtRZVuTWQtJc6AanV9ssTcF6F7ipc6p
zgSWeAc0a3E=
=W7LV
-----END PGP SIGNATURE-----

From jimjjewett at gmail.com  Tue Sep  5 22:43:20 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 5 Sep 2006 16:43:20 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDC2CE.1040902@ewtllc.com>
	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
Message-ID: <fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>

I think I finally figured out where Raymond is coming from.

For Raymond, "head" is where he started processing -- for rpartition,
this is the .endswith part.

For me, "head" is the start of the data structure -- always the
.startswith part.

We won't resolve that with anything suggesting a sequential order; we
need something that makes it clear which part is the large leftover.

    S.partition(sep) -> (record, sep, remains)
    S.rpartition(sep) -> (remains, sep, record)

I do like the plural (or collective) sound of "remains".

I have no solid reasoning for "record" vs "rec" vs "onerec".  I would
welcome a word that did not suggest it would have further internal
structure.

-jJ

From barry at python.org  Tue Sep  5 22:55:44 2006
From: barry at python.org (Barry Warsaw)
Date: Tue, 5 Sep 2006 16:55:44 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDC2CE.1040902@ewtllc.com>
	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>
Message-ID: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote:

> I think I finally figured out where Raymond is coming from.
>
> For Raymond, "head" is where he started processing -- for rpartition,
> this is the .endswith part.
>
> For me, "head" is the start of the data structure -- always the
> .startswith part.
>
> We won't resolve that with anything suggesting a sequential order; we
> need something that makes it clear which part is the large leftover.

See, for me, it's all about the results of the operation, not how the  
results are (supposedly) used.  The way I think about it is that I've  
got some string and I'm looking for some split point within that  
string.  That split point is clearly the "middle" (but "sep" works  
too) and everything to the right of that split point gets returned in  
"right" while everything to the left gets returned in "left".

I'm less concerned with repeated splits because I probably have as  
many existing cases where I'm looking for the first split point as  
where I'm looking repeatedly for split points (think RFC 2822 header  
splitting -- partition will be awesome for this).
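
(A throwaway example of that header case, with a made-up header line:)

>>> name, sep, value = "X-Spam-Status: No, score=0.1".partition(":")
>>> name, value.strip()
('X-Spam-Status', 'No, score=0.1')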

The bias with these terms is clearly the English left-to-right  
order.  Actually, that brings up an interesting question: what would  
happen if you called rpartition on a unicode string representing  
Hebrew, Arabic, or other RTL language?  Do partition and rpartition  
suddenly switch directions?

If not, then I think left-sep-right are fine.  If so, then yeah, we  
probably need something else.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRP3kUHEjvBPtnXfVAQJd6wP+OBtRR22O0A+s/uHF3ACgWhrdZJdEnzEW
qimKEWmDCUuK7CFIUsJKteoNNSHjIBgZIMMdnsymgI7CPgPNuB6CUAp8KFFeYvMy
PVpMIqNFOFXGUVYf4VA7ED9S7QbbDzHJv32kUUZvbuTniYK9DVMi0O7GStsv1Kg6
insyP+W1EcU=
=4aar
-----END PGP SIGNATURE-----

From pje at telecommunity.com  Tue Sep  5 23:07:17 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 05 Sep 2006 17:07:17 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org>
References: <fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>
	<20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDC2CE.1040902@ewtllc.com>
	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>
Message-ID: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>

At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote:
>On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote:
>
> > I think I finally figured out where Raymond is coming from.
> >
> > For Raymond, "head" is where he started processing -- for rpartition,
> > this is the .endswith part.
> >
> > For me, "head" is the start of the data structure -- always the
> > .startswith part.
> >
> > We won't resolve that with anything suggesting a sequential order; we
> > need something that makes it clear which part is the large leftover.
>
>See, for me, it's all about the results of the operation, not how the
>results are (supposedly) used.  The way I think about it is that I've
>got some string and I'm looking for some split point within that
>string.  That split point is clearly the "middle" (but "sep" works
>too) and everything to the right of that split point gets returned in
>"right" while everything to the left gets returned in "left".

+1 for left/sep/right for both operations.  It's easier to remember a 
visual correlation (left,sep,right) than it is to try and think about an 
abstraction in which the order of results has something to do with what 
direction I found the separator in.

If I'm repeating from right to left, then of course the "left" is the part 
I'll want to repeat on.


From rhettinger at ewtllc.com  Tue Sep  5 23:16:53 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 05 Sep 2006 14:16:53 -0700
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>
References: <fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>	<20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>	<44FDC2CE.1040902@ewtllc.com>	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>
	<5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>
Message-ID: <44FDE945.7080801@ewtllc.com>

An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060905/9475cb5e/attachment.html 

From gjcarneiro at gmail.com  Wed Sep  6 02:21:11 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Wed, 6 Sep 2006 01:21:11 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GKgqo-00040l-3p@draco.cus.cam.ac.uk>
References: <E1GKgqo-00040l-3p@draco.cus.cam.ac.uk>
Message-ID: <a467ca4f0609051721q2b0d8405l5403cc2ac4258804@mail.gmail.com>

On 9/5/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
[...]
> Even if write is atomic, there are gotchas.  What if the interrupted
> code is doing something to that file at the time?  Are you SURE that
> an unexpected operation on it (in the same thread) won't cause the
> library function of program to get confused?

  Yes, I'm sure.  The technique is based on writing any arbitrary byte
onto a well-known pipe.  Any byte will do.  All that matters is that we
trick the kernel into realizing there is data to read on the other end
of the pipe, so that it can wake up the poll() syscall waiting on it.
Only signal handlers ever write to this file descriptor.  If one
signal handler interrupts another one, it's ok; all it takes is that
at least one of them succeeds, and the data itself is irrelevant.
Only the mainloop ever reads from the pipe.
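
For anyone who hasn't seen the trick before, a bare-bones sketch of the
idea in pure Python -- not the actual pygtk/glib code, just the general
shape of it -- looks something like this:

    import fcntl, os, select, signal

    rfd, wfd = os.pipe()                      # the well-known pipe
    flags = fcntl.fcntl(wfd, fcntl.F_GETFL)
    fcntl.fcntl(wfd, fcntl.F_SETFL, flags | os.O_NONBLOCK)   # never block in the handler

    def on_signal(signum, frame):
        # the handler does nothing except poke the pipe; the byte's value
        # is irrelevant and a failed write is harmless
        try:
            os.write(wfd, 'x')
        except OSError:
            pass

    signal.signal(signal.SIGUSR1, on_signal)

    # main loop: poll()/select() wakes up because the read end became readable
    while True:
        ready, _, _ = select.select([rfd], [], [])
        if rfd in ready:
            os.read(rfd, 512)                 # drain; the data itself is ignored
            # ... dispatch "a signal arrived" to the application here ...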

>  And can you be sure that the write will terminate fast enough to not cause time-critical code to fail?

 Time-critical code should block signals, or use a real-time OS.

>  And have you studied the exact semantics of blocking
> on pipes?  They are truly horrible.

  The pipe is changed to async mode; never blocks.  We don't care
about any data being transferred at all, only the state on the file
descriptor changing.

> So this is NOT a matter of platform X is safe and platform Y isn't.
> Even Linux x86 isn't entirely safe - or wasn't, the last time I heard.

  We can't prove write() is async-safe, but you can't prove it isn't
either.  As far as I know, write() doesn't use malloc(); it only loads
a few registers and calls some interrupt (or syscall in amd64).  It is
plausible that it is perfectly async-safe.

   And that's completely beside the point.  We only ask Python to call
a function of ours every time it handles a signal.  You are
criticizing the way pygtk or glib will handle the notification, but we
are here to discuss how Python can just give us a small hand in
solving the signals problem.  These are different problem domains.  We
don't ask Python developers to endorse any particular way of solving
our problem.  But since Python already snatches away our beloved
signals, especially SIGINT, it should at least be courteous enough to
give us just a notification when signals happen.  There is _no_ other
way.

From david.nospam.hopwood at blueyonder.co.uk  Wed Sep  6 03:08:03 2006
From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood)
Date: Wed, 06 Sep 2006 02:08:03 +0100
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org>
References: <20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>	<44FDC2CE.1040902@ewtllc.com>	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>
	<118F763E-6B49-4AC2-91CB-961F14D504A0@python.org>
Message-ID: <44FE1F73.7020206@blueyonder.co.uk>

Barry Warsaw wrote:
> The bias with these terms is clearly the English left-to-right  
> order.  Actually, that brings up an interesting question: what would  
> happen if you called rpartition on a unicode string representing  
> Hebrew, Arabic, or other RTL language?  Do partition and rpartition  
> suddenly switch directions?

What happens is that rpartition searches the string backwards in logical
order (i.e. left to right as the text is written, assuming it only contains
Hebrew or Arabic letters, and not numbers or a mixture of scripts). But this
is not "switching directions"; it's still searching backwards. You really
don't want to think of bidirectional text in terms of presentation, when
you're doing processing that should be independent of presentation.
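
A concrete (if contrived) check, written with escapes so that display order
doesn't muddy the picture:

>>> s = u"\u05d0.\u05d1.\u05d2"          # alef '.' bet '.' gimel, in logical order
>>> s.rpartition(u".")
(u'\u05d0.\u05d1', u'.', u'\u05d2')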

> If not, then I think left-sep-right are fine.  If so, then yeah, we  
> probably need something else.

+1 for (upto, sep, rest) -- and I think it should be in that order for
both partition and rpartition.

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>



From pje at telecommunity.com  Wed Sep  6 03:14:18 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 05 Sep 2006 21:14:18 -0400
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FE1F73.7020206@blueyonder.co.uk>
References: <118F763E-6B49-4AC2-91CB-961F14D504A0@python.org>
	<20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>
	<200609051351.50494.fdrake@acm.org>
	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>
	<44FDC2CE.1040902@ewtllc.com>
	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>
	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>
	<118F763E-6B49-4AC2-91CB-961F14D504A0@python.org>
Message-ID: <5.1.1.6.0.20060905211030.026352c8@sparrow.telecommunity.com>

At 02:08 AM 9/6/2006 +0100, David Hopwood wrote:
>Barry Warsaw wrote:
> > The bias with these terms is clearly the English left-to-right
> > order.  Actually, that brings up an interesting question: what would
> > happen if you called rpartition on a unicode string representing
> > Hebrew, Arabic, or other RTL language?  Do partition and rpartition
> > suddenly switch directions?
>
>What happens is that rpartition searches the string backwards in logical
>order (i.e. left to right as the text is written, assuming it only contains
>Hebrew or Arabic letters, and not numbers or a mixture of scripts). But this
>is not "switching directions"; it's still searching backwards. You really
>don't want to think of bidirectional text in terms of presentation, when
>you're doing processing that should be independent of presentation.
>
> > If not, then I think left-sep-right are fine.  If so, then yeah, we
> > probably need something else.
>
>+1 for (upto, sep, rest) -- and I think it should be in that order for
>both partition and rpartition.

It appears the problem is that one group of people thinks in terms of the 
order of the string, and the other in terms of the order of processing.

Both groups agree that both partition and rpartition should be "in the same 
order" -- but we disagree about what that means.  :)

Me, I want left/sep/right because I'm in the "string order" camp, and you 
want upto/sep/rest because you're in the "processing order" camp.


From fperez.net at gmail.com  Wed Sep  6 06:56:04 2006
From: fperez.net at gmail.com (Fernando Perez)
Date: Tue, 05 Sep 2006 22:56:04 -0600
Subject: [Python-Dev] inspect.py very slow under 2.5
Message-ID: <edlkau$27h$1@sea.gmane.org>

Hi all,

I know that the 2.5 release is extremely close, so this will probably be
2.5.1 material.  I discussed it briefly with Guido at scipy'06, and he
asked for some profile-based info, which I've only now had time to gather. 
I hope this will be of some use, as I think the problem is rather serious.

For context: I am the IPython lead developer (http://ipython.scipy.org), and
ipython is used as the base shell for several interactive environments, one
of which is the mathematics system SAGE
(http://modular.math.washington.edu/sage).  It was the SAGE lead who first
ran into this problem while testing SAGE with 2.5.

The issue is the following: ipython provides several exception reporting
modes which give a lot more information than python's default tracebacks. 
In order to generate this info, it makes extensive use of the inspect
module.  The module in ipython responsible for these fancy tracebacks is:

http://projects.scipy.org/ipython/ipython/browser/ipython/trunk/IPython/ultraTB.py

which is an enhanced port of Ka-Ping Yee's old cgitb module.

Under 2.5, the generation of one of these detailed tracebacks is /extremely/
expensive, and the cost goes up very quickly the more modules have been
imported into the current session.  While in a new ipython session the
slowdown is not crippling, under SAGE (which starts with a lot of loaded
modules) it is bad enough to make the system nearly unusable.

I'm attaching a little script which can be run to show the problem, but you
need IPython to be installed to run it.  If any of you run ubuntu, fedora,
suse or almost any other major linux distro, it's already available via the
usual channels.

In case you don't want to (or can't) run the attached code, here's a summary
of what I see on my machine (ubuntu dapper).  Using ipython under python
2.4.3, I get:

         2268 function calls (2225 primitive calls) in 0.020 CPU seconds

   Ordered by: call count
   List reduced from 127 to 32 due to restriction <0.25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      305    0.000    0.000    0.000    0.000 :0(append)
  259/253    0.010    0.000    0.010    0.000 :0(len)
      177    0.000    0.000    0.000    0.000 :0(isinstance)
       90    0.000    0.000    0.000    0.000 :0(match)
       68    0.000    0.000    0.000    0.000 ultraTB.py:539(tokeneater)
       68    0.000    0.000    0.000    0.000 tokenize.py:16
(generate_tokens)
       61    0.000    0.000    0.000    0.000 :0(span)
       57    0.000    0.000    0.000    0.000 sre_parse.py:130(__getitem__)
       56    0.000    0.000    0.000    0.000 string.py:220(lower)

etc, while running the same script under ipython/python2.5 and no other
changes gives:

         230370 function calls (229754 primitive calls) in 3.340 CPU seconds

   Ordered by: call count
   List reduced from 83 to 21 due to restriction <0.25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    55003    0.420    0.000    0.420    0.000 :0(startswith)
    45026    0.264    0.000    0.264    0.000 :0(endswith)
    20013    0.148    0.000    0.148    0.000 :0(append)
    12138    0.180    0.000    0.660    0.000 posixpath.py:156(islink)
    12138    0.192    0.000    0.192    0.000 :0(lstat)
    12138    0.180    0.000    0.288    0.000 stat.py:60(S_ISLNK)
    12138    0.108    0.000    0.108    0.000 stat.py:29(S_IFMT)
    11838    0.680    0.000    1.244    0.000 posixpath.py:56(join)
     4837    0.052    0.000    0.052    0.000 :0(len)
     4362    0.028    0.000    0.028    0.000 :0(split)
     4362    0.048    0.000    0.100    0.000 posixpath.py:47(isabs)
     3598    0.036    0.000    0.056    0.000 string.py:218(lower)
     3598    0.020    0.000    0.020    0.000 :0(lower)
     2815    0.032    0.000    0.032    0.000 :0(isinstance)
     2809    0.028    0.000    0.028    0.000 :0(join)
     2808    0.264    0.000    0.520    0.000 posixpath.py:374(normpath)
     2632    0.040    0.000    0.068    0.000 inspect.py:35(ismodule)
     2143    0.016    0.000    0.016    0.000 :0(hasattr)
     1884    0.028    0.000    0.444    0.000 posixpath.py:401(abspath)
     1557    0.016    0.000    0.016    0.000 :0(range)
     1078    0.008    0.000    0.044    0.000 inspect.py:342(getfile)


These enormous numbers of calls are the origin of the slowdown, and the more
modules have been imported, the worse it gets.

I haven't had time to dive deep into inspect.py to try and fix this, but I
figured it would be best to at least report it now.  As far as IPython and
its user projects are concerned, I'll probably hack things to overwrite
inspect.py from 2.4 over the 2.5 version in the exception reporter, because
the current code is simply unusable for detailed tracebacks.  It would be
great if this could be fixed in the trunk at some point.

I'll be happy to provide further feedback or put this information elsewhere. 
Guido suggested initially posting here, but if you prefer it on the SF
tracker (even as incomplete as this report is) I'll be glad to do so.

Regards,

f
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: traceback_timings.py
Url: http://mail.python.org/pipermail/python-dev/attachments/20060905/fb0ac8bf/attachment.asc 
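
[The attachment itself was scrubbed from the archive.  A rough stdlib-only
stand-in that exercises the same inspect-heavy code path -- it profiles
cgitb, which ultraTB was ported from, rather than IPython, and is not the
original script -- would look something like:]

    # hypothetical stand-in, not the scrubbed traceback_timings.py
    import cgitb, profile, sys

    def a(): b()
    def b(): c()
    def c(): raise ValueError("boom")

    try:
        a()
    except ValueError:
        einfo = sys.exc_info()

    # cgitb.text() leans on inspect.py much as ultraTB does
    profile.run("cgitb.text(einfo)")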

From steve at holdenweb.com  Wed Sep  6 10:14:20 2006
From: steve at holdenweb.com (Steve Holden)
Date: Wed, 06 Sep 2006 09:14:20 +0100
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FDE945.7080801@ewtllc.com>
References: <fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>	<20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>	<44FDC2CE.1040902@ewtllc.com>	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>	<5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>
	<44FDE945.7080801@ewtllc.com>
Message-ID: <edlvvp$t24$5@sea.gmane.org>

Raymond Hettinger wrote:
[...]
> That's fine with me.  I accept there will always be someone who stands 
> on their head [...]

You'd have to be some kind of contortionist to stand on your head.

willfully-misunderstanding-ly y'rs  - steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From ncoghlan at gmail.com  Wed Sep  6 10:21:54 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 06 Sep 2006 18:21:54 +1000
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>
References: <fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>	<20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>	<44FDC2CE.1040902@ewtllc.com>	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>
	<5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>
Message-ID: <44FE8522.5020703@gmail.com>

Phillip J. Eby wrote:
> At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote:
>> On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote:
>>
>>> I think I finally figured out where Raymond is coming from.
>>>
>>> For Raymond, "head" is where he started processing -- for rpartition,
>>> this is the .endswith part.
>>>
>>> For me, "head" is the start of the data structure -- always the
>>> .startswith part.
>>>
>>> We won't resolve that with anything suggesting a sequential order; we
>>> need something that makes it clear which part is the large leftover.
>> See, for me, it's all about the results of the operation, not how the
>> results are (supposedly) used.  The way I think about it is that I've
>> got some string and I'm looking for some split point within that
>> string.  That split point is clearly the "middle" (but "sep" works
>> too) and everything to the right of that split point gets returned in
>> "right" while everything to the left gets returned in "left".
> 
> +1 for left/sep/right for both operations.  It's easier to remember a 
> visual correlation (left,sep,right) than it is to try and think about an 
> abstraction in which the order of results has something to do with what 
> direction I found the separator in.

-1. The string docs are already lousy with left/right terminology that is
flat-out wrong when dealing with a script that is displayed with a
right-to-left or vertical orientation*. In reality, strings are processed such
that index 0 is the first character and index -1 is the last character,
regardless of script orientation, but you could be forgiven for not realising
that after reading the current string docs. Let's not make that particular
problem any worse.

I don't see anything wrong with Raymond's 'head, sep, tail' and 'tail, sep,
head' terminology (although noting the common postcondition 'sep not in head'
in the docstrings might be useful).

However, if we're going to use the same result tuple for both, then I'd prefer
'before, sep, after', with the partition() postcondition being 'sep not in
before' and the rpartition() postcondition being 'sep not in after'. Those
terms are accurate regardless of script orientation.
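
For example, with the current 2.5 methods and those names (a quick sketch):

>>> 'a.b.c'.partition('.')      # postcondition: sep not in before
('a', '.', 'b.c')
>>> 'a.b.c'.rpartition('.')     # postcondition: sep not in after
('a.b', '.', 'c')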

Either way, I suggest putting the postcondition in the docstring to make the 
difference between the two methods explicit.

Regards,
Nick.

* I acknowledge that Python *code* is almost certainly going to be edited in a 
left-to-right text editor, because it's an English-based programming language. 
But the strings that string methods like partition() and rpartition() are used 
with are quite likely to be coming from or written to a user interface that 
uses a native script orientation.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From steve at holdenweb.com  Wed Sep  6 10:32:19 2006
From: steve at holdenweb.com (Steve Holden)
Date: Wed, 06 Sep 2006 09:32:19 +0100
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <44FE8522.5020703@gmail.com>
References: <fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>	<20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>	<44FDC2CE.1040902@ewtllc.com>	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>	<5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>
	<44FE8522.5020703@gmail.com>
Message-ID: <edm11g$45v$2@sea.gmane.org>

Nick Coghlan wrote:
> Phillip J. Eby wrote:
> 
>>At 04:55 PM 9/5/2006 -0400, Barry Warsaw wrote:
>>
>>>On Sep 5, 2006, at 4:43 PM, Jim Jewett wrote:
>>>
>>>
>>>>I think I finally figured out where Raymond is coming from.
>>>>
>>>>For Raymond, "head" is where he started processing -- for rpartition,
>>>>this is the .endswith part.
>>>>
>>>>For me, "head" is the start of the data structure -- always the
>>>>.startswith part.
>>>>
>>>>We won't resolve that with anything suggesting a sequential order; we
>>>>need something that makes it clear which part is the large leftover.
>>>
>>>See, for me, it's all about the results of the operation, not how the
>>>results are (supposedly) used.  The way I think about it is that I've
>>>got some string and I'm looking for some split point within that
>>>string.  That split point is clearly the "middle" (but "sep" works
>>>too) and everything to the right of that split point gets returned in
>>>"right" while everything to the left gets returned in "left".
>>
>>+1 for left/sep/right for both operations.  It's easier to remember a 
>>visual correlation (left,sep,right) than it is to try and think about an 
>>abstraction in which the order of results has something to do with what 
>>direction I found the separator in.
> 
> 
> -1. The string docs are already lousy with left/right terminology that is
> flat-out wrong when dealing with a script that is displayed with a
> right-to-left or vertical orientation*. In reality, strings are processed such
> that index 0 is the first character and index -1 is the last character,
> regardless of script orientation, but you could be forgiven for not realising
> that after reading the current string docs. Let's not make that particular
> problem any worse.
> 
> I don't see anything wrong with Raymond's 'head, sep, tail' and 'tail, sep,
> head' terminology (although noting the common postcondition 'sep not in head'
> in the docstrings might be useful).
> 
> However, if we're going to use the same result tuple for both, then I'd prefer
> 'before, sep, after', with the partition() postcondition being 'sep not in
> before' and the rpartition() postcondition being 'sep not in after'. Those
> terms are accurate regardless of script orientation.
> 
> Either way, I suggest putting the postcondition in the docstring to make the 
> difference between the two methods explicit.
> 
> Regards,
> Nick.
> 
> * I acknowledge that Python *code* is almost certainly going to be edited in a 
> left-to-right text editor, because it's an English-based programming language. 
> But the strings that string methods like partition() and rpartition() are used 
> with are quite likely to be coming from or written to a user interface that 
> uses a native script orientation.
> 
Perhaps we should be thinking "beginning" and "end" here, though it 
seems as though it won't be possible to find a terminology that will be 
intuitively obvious to everyone.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From g.brandl at gmx.net  Wed Sep  6 10:39:07 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 06 Sep 2006 10:39:07 +0200
Subject: [Python-Dev] Fwd: Problem with the API for str.rpartition()
In-Reply-To: <edm11g$45v$2@sea.gmane.org>
References: <fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>	<20060905102446.4vnllkmmo1wgkc80@login.werra.lunarpages.com>	<200609051351.50494.fdrake@acm.org>	<fb6fbf560609051102h1bfd3e66qcedde9c52dd95b79@mail.gmail.com>	<44FDC2CE.1040902@ewtllc.com>	<CD96E9CD-348E-4110-A7CC-F93FF0B86117@python.org>	<fb6fbf560609051343w66f98a08j4503781d9ae7ce39@mail.gmail.com>	<5.1.1.6.0.20060905170453.0269c4e8@sparrow.telecommunity.com>	<44FE8522.5020703@gmail.com>
	<edm11g$45v$2@sea.gmane.org>
Message-ID: <edm1fb$5u5$1@sea.gmane.org>

Steve Holden wrote:

>> * I acknowledge that Python *code* is almost certainly going to be edited in a 
>> left-to-right text editor, because it's an English-based programming language. 
>> But the strings that string methods like partition() and rpartition() are used 
>> with are quite likely to be coming from or written to a user interface that 
>> uses a native script orientation.
>> 
> Perhaps we should be thinking "beginning" and "end" here, though it 
> seems as though it won't be possible to find a terminology that will be 
> intuitively obvious to everyone.

Which is why an example is absolutely necessary and will make things clear for
everyone.

Georg


From ralf at brainbot.com  Wed Sep  6 12:14:09 2006
From: ralf at brainbot.com (Ralf Schmitt)
Date: Wed, 06 Sep 2006 12:14:09 +0200
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <edlkau$27h$1@sea.gmane.org>
References: <edlkau$27h$1@sea.gmane.org>
Message-ID: <44FE9F71.3090903@brainbot.com>

Fernando Perez wrote:
> 
> These enormous numbers of calls are the origin of the slowdown, and the more
> modules have been imported, the worse it gets.


--- /exp/lib/python2.5/inspect.py	2006-08-28 11:53:36.000000000 +0200
+++ inspect.py	2006-09-06 12:10:45.000000000 +0200
@@ -444,7 +444,8 @@
      in the file and the line number indexes a line in that list.  An 
IOError
      is raised if the source code cannot be retrieved."""
      file = getsourcefile(object) or getfile(object)
-    module = getmodule(object)
+    #module = getmodule(object)
+    module = None
      if module:
          lines = linecache.getlines(file, module.__dict__)
      else:

The problem seems to originate from the module=getmodule(object) call in
findsource. If I comment out that code (or rather do module=None),
things seem to be back to normal. (linecache.getlines was called
with a None module in Python 2.4's inspect.py.)

- Ralf


From mwh at python.net  Wed Sep  6 12:34:23 2006
From: mwh at python.net (Michael Hudson)
Date: Wed, 06 Sep 2006 11:34:23 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609040752y7c051b24h620777d983590f40@mail.gmail.com>
	(Gustavo Carneiro's message of "Mon, 4 Sep 2006 14:52:36 +0000")
References: <E1GKF5I-0004eE-5Y@draco.cus.cam.ac.uk>
	<a467ca4f0609040752y7c051b24h620777d983590f40@mail.gmail.com>
Message-ID: <2m8xkxnv0w.fsf@starship.python.net>

"Gustavo Carneiro" <gjcarneiro at gmail.com> writes:

> On 9/4/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
>> "Gustavo Carneiro" <gjcarneiro at gmail.com> wrote:
>> >   I am now thinking of something along these lines:
>> > typedef void (*PyPendingCallNotify)(void *user_data);
>> > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback,
>> >     void *user_data);
>> > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify
>> >     callback, void *user_data);
>>
>> Why would that help?  The problems are semantic, not syntactic.
>>
>> Anthony Baxter isn't exaggerating the problem, despite what you may
>> think from his posting.
>
>   You guys are tough customers to please. 

Yes.

> I am just trying to solve a problem here, not create a new one; you
> have to believe me.

We believe you, but you are stirring the ashes of old problems.

>      1. In PyGTK we have a gobject.MainLoop.run() method, which blocks
> essentially forever in a poll() system call, and only wakes if/when it
> has to process a timeout or IO event;
>      2. When we only have one thread, we can guarantee that e.g.
> SIGINT will always be caught by the thread running the
> g_main_loop_run(), so we know poll() will be interrupted and an EINTR
> will be generated, giving us control back temporarily to check for
> Python signals;
>      3. When we have multiple threads, we cannot make this assumption,
> so instead we install a timeout to periodically check for signals.
>
>   We want to get rid of timeouts.  Now my idea: add a Python API to say:
>      "dear Python, please call me when you start having pending calls,
> even if from a signal handler context, ok?"

This seems a reasonable proposal.  But it's totally a Python 2.6
thing, so how about taking a deep breath, working on a patch and
submitting it when it's ready?

Having to wake a process up a few times a second is ugly and annoying,
sure, but it is not a release delaying problem.

Cheers,
mwh

-- 
  It is never worth a first class man's time to express a majority
  opinion.  By definition, there are plenty of others to do that.
                                                        -- G. H. Hardy

From ncoghlan at gmail.com  Wed Sep  6 12:54:45 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 06 Sep 2006 20:54:45 +1000
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <44FE9F71.3090903@brainbot.com>
References: <edlkau$27h$1@sea.gmane.org> <44FE9F71.3090903@brainbot.com>
Message-ID: <44FEA8F5.1000700@gmail.com>

Ralf Schmitt wrote:
> The problem seems to originate from the module=getmodule(object) in 
> findsource. If I outcomment that code (or rather do a module=None),
> things seem to be back as normal. (linecache.getlines has been called 
> with a None module in python 2.4's inspect.py).

It looks like the problem is the call to getabspath() in getmodule(). This 
happens every time, even if the file name is already in the modulesbyfile 
cache. This calls os.path.abspath() and os.path.normpath() every time that 
inspect.findsource() is called.

That can be fixed by having findsource() pass the filename argument to 
getmodule(), and adding a check of the modulesbyfile cache *before* the call 
to getabspath().
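
Roughly, the shortcut I have in mind looks like this (a sketch only, with a
made-up function name; the real change is in the patch below):

    import sys

    modulesbyfile = {}   # inspect's existing filename -> module name cache

    def getmodule_fastpath(obj, _filename=None):
        """Sketch: consult the cache before paying for getabsfile()
        (abspath/normpath) on every findsource() call."""
        if _filename is not None and _filename in modulesbyfile:
            return sys.modules.get(modulesbyfile[_filename])
        # cache miss: fall through to the existing (slower) scan of sys.modules
        return None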

Can you try this patch and see if you get 2.4 level performance back on 
Fernando's test?:

http://www.python.org/sf/1553314

(Assigned to Neal in the hopes of making 2.5rc2)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ralf at brainbot.com  Wed Sep  6 13:22:45 2006
From: ralf at brainbot.com (Ralf Schmitt)
Date: Wed, 06 Sep 2006 13:22:45 +0200
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <44FEA8F5.1000700@gmail.com>
References: <edlkau$27h$1@sea.gmane.org> <44FE9F71.3090903@brainbot.com>
	<44FEA8F5.1000700@gmail.com>
Message-ID: <44FEAF85.1000107@brainbot.com>

Nick Coghlan wrote:
> 
> It looks like the problem is the call to getabspath() in getmodule(). This 
> happens every time, even if the file name is already in the modulesbyfile 
> cache. This calls os.path.abspath() and os.path.normpath() every time that 
> inspect.findsource() is called.
> 
> That can be fixed by having findsource() pass the filename argument to 
> getmodule(), and adding a check of the modulesbyfile cache *before* the call 
> to getabspath().
> 
> Can you try this patch and see if you get 2.4 level performance back on 
> Fernando's test?:

no. this doesn't work. getmodule always iterates over 
sys.modules.values() and only returns None afterwards.
One would have to cache the bad file value, or only inspect new/changed 
modules from sys.modules.

> 
> http://www.python.org/sf/1553314
> 
> (Assigned to Neal in the hopes of making 2.5rc2)
> 
> Cheers,
> Nick.
> 


From g.brandl at gmx.net  Wed Sep  6 14:41:19 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 06 Sep 2006 14:41:19 +0200
Subject: [Python-Dev] Exception message for invalid with statement usage
Message-ID: <edmflg$na5$1@sea.gmane.org>

Current trunk:

>>> with 1:
...  print "1"
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute '__exit__'

Isn't that a bit crude? For "for i in 1" there's a better
error message, so why shouldn't the above give a
TypeError: 'int' object is not a context manager

?

Georg


From ncoghlan at gmail.com  Wed Sep  6 15:06:33 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 06 Sep 2006 23:06:33 +1000
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <44FEAF85.1000107@brainbot.com>
References: <edlkau$27h$1@sea.gmane.org> <44FE9F71.3090903@brainbot.com>
	<44FEA8F5.1000700@gmail.com> <44FEAF85.1000107@brainbot.com>
Message-ID: <44FEC7D9.80500@gmail.com>

Ralf Schmitt wrote:
> Nick Coghlan wrote:
>>
>> It looks like the problem is the call to getabspath() in getmodule(). 
>> This happens every time, even if the file name is already in the 
>> modulesbyfile cache. This calls os.path.abspath() and 
>> os.path.normpath() every time that inspect.findsource() is called.
>>
>> That can be fixed by having findsource() pass the filename argument to 
>> getmodule(), and adding a check of the modulesbyfile cache *before* 
>> the call to getabspath().
>>
>> Can you try this patch and see if you get 2.4 level performance back 
>> on Fernando's test?:
> 
> no. this doesn't work. getmodule always iterates over 
> sys.modules.values() and only returns None afterwards.
> One would have to cache the bad file value, or only inspect new/changed 
> modules from sys.modules.

Good point. I modified the patch so it does the latter (it only calls 
getabspath() again for a module if the value of module.__file__ changes).

Cheers,
Nick.


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Wed Sep  6 15:11:31 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 06 Sep 2006 23:11:31 +1000
Subject: [Python-Dev] Exception message for invalid with statement usage
In-Reply-To: <edmflg$na5$1@sea.gmane.org>
References: <edmflg$na5$1@sea.gmane.org>
Message-ID: <44FEC903.7060303@gmail.com>

Georg Brandl wrote:
> Current trunk:
> 
>>>> with 1:
> ...  print "1"
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'int' object has no attribute '__exit__'
> 
> Isn't that a bit crude? For "for i in 1" there's a better
> error message, so why shouldn't the above give a
> TypeError: 'int' object is not a context manager

The for loop has a nice error message because it starts with its own opcode, 
but the with statement translates pretty much to the code in PEP 343. There's 
a special opcode at the end to help with unwinding the stack, but at the start 
it's just normal attribute retrieval opcodes for __enter__ and __exit__.

 >>> def f():
...   with 1:
...     pass
...
 >>> dis.dis(f)
   2           0 LOAD_CONST               1 (1)
               3 DUP_TOP
               4 LOAD_ATTR                0 (__exit__)
               7 STORE_FAST               0 (_[1])
              10 LOAD_ATTR                1 (__enter__)
              13 CALL_FUNCTION            0
              16 POP_TOP
              17 SETUP_FINALLY            4 (to 24)

   3          20 POP_BLOCK
              21 LOAD_CONST               0 (None)
         >>   24 LOAD_FAST                0 (_[1])
              27 DELETE_FAST              0 (_[1])
              30 WITH_CLEANUP
              31 END_FINALLY
              32 LOAD_CONST               0 (None)
              35 RETURN_VALUE

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ralf at brainbot.com  Wed Sep  6 16:53:30 2006
From: ralf at brainbot.com (Ralf Schmitt)
Date: Wed, 06 Sep 2006 16:53:30 +0200
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <44FEC7D9.80500@gmail.com>
References: <edlkau$27h$1@sea.gmane.org>
	<44FE9F71.3090903@brainbot.com>	<44FEA8F5.1000700@gmail.com>
	<44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com>
Message-ID: <44FEE0EA.7000303@brainbot.com>

Nick Coghlan wrote:
> Ralf Schmitt wrote:
>> Nick Coghlan wrote:
>>> It looks like the problem is the call to getabspath() in getmodule(). 
>>> This happens every time, even if the file name is already in the 
>>> modulesbyfile cache. This calls os.path.abspath() and 
>>> os.path.normpath() every time that inspect.findsource() is called.
>>>
>>> That can be fixed by having findsource() pass the filename argument to 
>>> getmodule(), and adding a check of the modulesbyfile cache *before* 
>>> the call to getabspath().
>>>
>>> Can you try this patch and see if you get 2.4 level performance back 
>>> on Fernando's test?:
>> no. this doesn't work. getmodule always iterates over 
>> sys.modules.values() and only returns None afterwards.
>> One would have to cache the bad file value, or only inspect new/changed 
>> modules from sys.modules.
> 
> Good point. I modified the patch so it does the latter (it only calls 
> getabspath() again for a module if the value of module.__file__ changes).

With _filesbymodname[modname] = file changed to
_filesbymodname[modname] = f
it seems to work OK.

diff -r d41ffd2faa28 inspect.py
--- a/inspect.py	Wed Sep 06 13:01:12 2006 +0200
+++ b/inspect.py	Wed Sep 06 16:52:39 2006 +0200
@@ -403,6 +403,7 @@ def getabsfile(object, _filename=None):
      return os.path.normcase(os.path.abspath(_filename))

  modulesbyfile = {}
+_filesbymodname = {}

  def getmodule(object, _filename=None):
      """Return the module an object was defined in, or None if not 
found."""
@@ -410,17 +411,23 @@ def getmodule(object, _filename=None):
          return object
      if hasattr(object, '__module__'):
          return sys.modules.get(object.__module__)
+    if _filename is not None and _filename in modulesbyfile:
+        return sys.modules.get(modulesbyfile[_filename])
      try:
          file = getabsfile(object, _filename)
      except TypeError:
          return None
      if file in modulesbyfile:
          return sys.modules.get(modulesbyfile[file])
-    for module in sys.modules.values():
+    for modname, module in sys.modules.iteritems():
          if ismodule(module) and hasattr(module, '__file__'):
+            f = module.__file__
+            if f == _filesbymodname.get(modname, None):
+                continue
+            _filesbymodname[modname] = f
              f = getabsfile(module)
              modulesbyfile[f] = modulesbyfile[
-                os.path.realpath(f)] = module.__name__
+                os.path.realpath(f)] = modname
      if file in modulesbyfile:
          return sys.modules.get(modulesbyfile[file])
      main = sys.modules['__main__']
@@ -444,7 +451,7 @@ def findsource(object):
      in the file and the line number indexes a line in that list.  An 
IOError
      is raised if the source code cannot be retrieved."""
      file = getsourcefile(object) or getfile(object)
-    module = getmodule(object)
+    module = getmodule(object, file)
      if module:
          lines = linecache.getlines(file, module.__dict__)
      else:


From guido at python.org  Wed Sep  6 17:46:21 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 6 Sep 2006 08:46:21 -0700
Subject: [Python-Dev] Exception message for invalid with statement usage
In-Reply-To: <edmflg$na5$1@sea.gmane.org>
References: <edmflg$na5$1@sea.gmane.org>
Message-ID: <ca471dc20609060846o3656f138w3bfddd72c4d85774@mail.gmail.com>

IMO it's fine. The only time you'll see this in reality is when
someone passed you the wrong type of object by mistake, and then the
type mentioned in the message is plenty of help to debug it. Anyone with
even a slight understanding of 'with' knows it involves '__exit__',
and the line number should be a big fat hint, too.

On 9/6/06, Georg Brandl <g.brandl at gmx.net> wrote:
> Current trunk:
>
> >>> with 1:
> ...  print "1"
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'int' object has no attribute '__exit__'
>
> Isn't that a bit crude? For "for i in 1" there's a better
> error message, so why shouldn't the above give a
> TypeError: 'int' object is not a context manager
>
> ?
>
> Georg


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Wed Sep  6 22:44:31 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 6 Sep 2006 16:44:31 -0400
Subject: [Python-Dev] Cross-platform math functions?
In-Reply-To: <44FD050F.20901@gmx.de>
References: <44FC9C53.5060304@gmx.de>
	<1f7befae0609041606m13c5c24bm59ce229b27f32e9d@mail.gmail.com>
	<44FD050F.20901@gmx.de>
Message-ID: <1f7befae0609061344x61b1ae87vdd523fceb32a12d7@mail.gmail.com>

[Tim Peters]
>> Package a Python wrapper and see how popular it becomes.  Some reasons
>> against trying to standardize on fdlibm were explained here:
>>
>>    http://mail.python.org/pipermail/python-list/2005-July/290164.html

[Andreas Raab]
> Thanks, these are good points. About speed, do you have any good
> benchmarks available?

Certainly not for "typical Python use" -- doubt such a benchmark
exists.  Some people use  sqrt once in a blue moon, others make heavy
use of many libm functions over millions & millions of floats, and in
some apps extremely heavy use is made where speed is everything and
accuracy doesn't much matter at all (e.g., gross plotting).

I'd ask on numeric Python lists, and (e.g.) people working with visualization.
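
(If all you're after is raw per-call cost on one box, something quick with
timeit is easy enough -- a sketch only, with arbitrary functions and counts,
saying nothing about "typical Python use":)

    from timeit import Timer

    setup = "from math import sqrt, sin, exp, atan2"
    for stmt in ("sqrt(1.234)", "sin(1.234)", "exp(1.234)", "atan2(1.234, 5.678)"):
        # best of 3 runs of a million calls; total seconds for 1e6 calls
        # is numerically the same as microseconds per call
        best = min(Timer(stmt, setup).repeat(3, 1000000))
        print "%-22s %.3f usec/call" % (stmt, best)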

> In my experience fdlibm is quite reasonable for speed in the context of use
> by dynamic languages (i.e., counting allocation overheads, lookup and send
> performance etc)

"Reasonable" for which purpose(s), specifically?  Some people would
certainly care about a 5% slowdown, while most others wouldn't, but
one thing to avoid is pissing off the people who use a thing the most
;-)

> but since I'm not a Python expert I'd appreciate some help with realistic
> benchmarks.

As above, python-dev isn't a likely place to look for such answers.

> ...
> Agreed. Thus my question if someone had already done this ;-)

Not that I know of, although my understanding (which may be wrong) is
that glibc's current math functions started as a copy of fdlibm.

From gustavo at niemeyer.net  Thu Sep  7 01:24:23 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Wed, 6 Sep 2006 20:24:23 -0300
Subject: [Python-Dev] buildbot breakage
Message-ID: <20060906232422.GA8620@niemeyer.net>

Some buildbots will fail because they got revision r51793, and it
has a change I made to fix a problem in the subprocess module.

Please do not rollback any changes. I'm handling the issue.

Also note that there's no broken code there.  The issue in subprocess
is related to stdout/stderr handling, and I'm having trouble making
buildbot happy while keeping the new tests in place.

I apologise for any inconvenience this may cause.

-- 
Gustavo Niemeyer
http://niemeyer.net

From gustavo at niemeyer.net  Thu Sep  7 01:45:50 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Wed, 6 Sep 2006 20:45:50 -0300
Subject: [Python-Dev] buildbot breakage
In-Reply-To: <20060906232422.GA8620@niemeyer.net>
References: <20060906232422.GA8620@niemeyer.net>
Message-ID: <20060906234550.GA9265@niemeyer.net>

> Some buildbots will fail because they got revision r51793, and it
> has a change I made to fix a problem in the subprocess module.

I've removed the offending test in r51794 and buildbots should be
happy again.

One way to trigger the reported issue is to pass sys.stdout as the
stdout keyword argument, such as:

   subprocess.call([...], stdout=sys.stdout)

This breaks because it ends up closing one of the standard file descriptors
of the subprocess.

Unfortunately we can't test it that way because buildbot uses a
StringIO in sys.stdout.

I kept the test which uses stdout=1, and removed the one expecting
sys.stdout to be a "normal" file.

Sorry for the trouble,

-- 
Gustavo Niemeyer
http://niemeyer.net

From python-dev at zesty.ca  Thu Sep  7 05:38:07 2006
From: python-dev at zesty.ca (Ka-Ping Yee)
Date: Wed, 6 Sep 2006 22:38:07 -0500 (CDT)
Subject: [Python-Dev] new security doc using object-capabilities
In-Reply-To: <bbaeab100607191535p543cb0ddj66410cf985dd9b77@mail.gmail.com>
References: <bbaeab100607191535p543cb0ddj66410cf985dd9b77@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0609062237510.19634@server1.LFW.org>

Hi Brett,

Here are some comments on your proposal.  Sorry this took so long.
I apologize if any of these comments are out of date (but also look
forward to your answers to some of the questions, as they'll help
me understand some more of the details of your proposal).  Thanks!

> Introduction
> ///////////////////////////////////////
[...]
> Throughout this document several terms are going to be used.  A
> "sandboxed interpreter" is one where the built-in namespace is not the
> same as that of an interpreter whose built-ins were unaltered, which
> is called an "unprotected interpreter".

Is this a definition or an implementation choice?  As in, are you
defining "sandboxed" to mean "with altered built-ins" or just
"restricted in some way", and does the above mean to imply that
altering the built-ins is what triggers other kinds of restrictions
(as it did in Python's old restricted execution mode)?

> A "bare interpreter" is one where the built-in namespace has been
> stripped down to the bare minimum needed to run any form of basic Python
> program.  This means that all atomic types (i.e., syntactically
> supported types), ``object``, and the exceptions provided by the
> ``exceptions`` module are considered in the built-in namespace.  There
> have also been no imports executed in the interpreter.

Is a "bare interpreter" just one example of a sandboxed interpreter,
or are all sandboxed interpreters in your design initially bare (i.e.
"sandboxed" = "bare" + zero or more granted authorities)?

> The "security domain" is the boundary at which security is cared
> about.  For this discussion, it is the interpreter.

It might be clearer to say (if i understand correctly) "Each interpreter
is a separate security domain."

Many interpreters can run within a single operating system process,
right?  Could you say a bit about what sort of concurrency model you
have in mind?  How would this interact (if at all) with use of the
existing threading functionality?

> The "powerbox" is the thing that possesses the ultimate power in the
> system.  In our case it is the Python process.

This could also be the application process, right?

> Rationale
> ///////////////////////////////////////
[...]
> For instance, think of an application that supports a plug-in system
> with Python as the language used for writing plug-ins.  You do not
> want to have to examine every plug-in you download to make sure that
> it does not alter your filesystem if you can help it.  With a proper
> security model and implementation in place this hindrance of having
> to examine all code you execute should be alleviated.

I'm glad to have this use case set out early in the document, so the
reader can keep it in mind as an example while reading about the model.

> Approaches to Security
> ///////////////////////////////////////
>
> There are essentially two types of security: who-I-am
> (permissions-based) security and what-I-have (authority-based)
> security.

As Mark Miller mentioned in another message, your descriptions of
"who-I-am" security and "what-I-have" security make sense, but
they don't correspond to "permission" vs. "authority".  They
correspond to "identity-based" vs. "authority-based" security.

> Difficulties in Python for Object-Capabilities
> //////////////////////////////////////////////
[...]
> Three key requirements for providing a proper perimeter defence are
> private namespaces, immutable shared state across domains, and
> unforgeable references.

Nice summary.

> Problem of No Private Namespace
> ===============================
[...]
> The Python language has no such thing as a private namespace.

Don't local scopes count as private namespaces?  It seems clear
that they aren't designed with the intention of being exposed,
unlike other namespaces in Python.

> It also makes providing security at the object level using
> object-capabilities non-existent in pure Python code.

I don't think this is necessarily the case.  No Python code i've
ever seen expects to be able to invade the local scopes of other
functions, so you could use them as private namespaces.  There
are two ways i've seen to invade local scopes:

    (a) Use gc.get_referents to get back from a cell object
        to its contents.

    (b) Compare the cell object to another cell object, thereby
        causing __eq__ to be invoked to compare the contents of
        the cells.

So you could protect local scopes by prohibiting these or by
simply turning off access to func_closure.  It's clear that hardly
any code depends on these introspection features, so it would be
reasonable to turn them off in a sandboxed interpreter.  (It seems
you would have to turn off some introspection features anyway in
order to have reliable import guards.)
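
For instance, route (a) above looks roughly like this in current CPython
(a quick interactive sketch):

>>> def outer():
...     secret = 42
...     def inner():
...         return secret
...     return inner
...
>>> import gc
>>> f = outer()
>>> gc.get_referents(f.func_closure[0])   # reach into the closure cell
[42]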

> Problem of Mutable Shared State
> ===============================
[...]
> Regardless, sharing of state that can be influenced by another
> interpreter is not safe for object-capabilities.

Yup.

> Threat Model
> ///////////////////////////////////////

Good to see this specified here.  I like the way you've broken this
down.

> * An interpreter cannot gain abilities the Python process possesses
>   without explicitly being given those abilities.

It would be good to enumerate which abilities you're referring to in
this item.  For example, a bare interpreter should be able to allocate
memory and call most of the built-in functions, but should not be able
to open network connections.

> * An interpreter cannot influence another interpreter directly at the
>   Python level without explicitly allowing it.

You mean, without some other entity explicitly allowing it, right?
What would that other entity be -- presumably the interpreter that
spawned both of these sub-interpreters?

> * An interpreter cannot use operating system resources without being
>   explicitly given those resources.

Okay.

> * A bare Python interpreter is always trusted.

What does "trusted" mean in the above?

> * Python bytecode is always distrusted.
> * Pure Python source code is always safe on its own.

It would be helpful to clarify "safe" here.  I assume by "safe" you
mean that the Python source code can express whatever it wants,
including potentially dangerous activities, but when run in a bare
or sandboxed interpreter it cannot have harmful effects.  But then
in what sense does the "safety" have to do with the Python source code
rather than the restrictions on the interpreter?

Would it be correct to say:
  + We want to guarantee that Python source code cannot violate
    the restrictions in a restricted or bare interpreter.
  + We do not prevent arbitrary Python bytecode from violating
    these restrictions, and assume that it can.

>     + Malicious abilities are derived from C extension modules,
>       built-in modules, and unsafe types implemented in C, not from
>       pure Python source.

By "malicious" do you just mean "anything that isn't accessible to
a bare interpreter"?

> * A sub-interpreter started by another interpreter does not inherit
>   any state.

Do you envision a tree of interpreters and sub-interpreters?  Can the
levels of spawning get arbitrarily deep?

If i am visualizing your model correctly, maybe it would be useful to
introduce the term "parent", where each interpreter has as its parent
either the Python process or another interpreter.  Then you could say
that each interpreter acquires authority only by explicit granting from
its parent.  Then i have another question: can an interpreter acquire
authorities only when it is started, or can it acquire them while it is
running, and how?

> Implementation
> ///////////////////////////////////////
>
> Guiding Principles
> ========================
>
> To begin, the Python process garners all power as the powerbox.  It is
> up to the process to initially hand out access to resources and
> abilities to interpreters.  This might take the form of an interpreter
> with all abilities granted (i.e., a standard interpreter as launched
> when you execute Python), which then creates sub-interpreters with
> sandboxed abilities.  Another alternative is only creating
> interpreters with sandboxed abilities (i.e., Python being embedded in
> an application that only uses sandboxed interpreters).

This sounds like part of your design to me.  It might help to have
this earlier in the document (maybe even with an example diagram of a
tree of interpreters).

> All security measures should never have to ask who an interpreter is.
> This means that what abilities an interpreter has should not be stored
> at the interpreter level when the security can use a proxy to protect
> a resource.  This means that while supporting a memory cap can
> have a per-interpreter setting that is checked (because access to the
> operating system's memory allocator is not supported at the program
> level), protecting files and imports should not need such a per-interpreter
> protection at such a low level (because those can have extension
> module proxies to provide the security).

It might be good to declare two categories of resources -- those
protected by object hiding and those protected by a per-interpreter
setting -- and make lists.

> Backwards-compatibility will not be a hindrance upon the design or
> implementation of the security model.  Because the security model will
> inherently remove resources and abilities that existing code expects,
> it is not reasonable to expect existing code to work in a sandboxed
> interpreter.

You might qualify the last statement a bit.  For example, a Python
implementation of a pure algorithm (e.g. string processing, data
compression, etc.) would still work in a sandboxed interpreter.

> Keeping Python "pythonic" is required for all design decisions.

As Lawrence Oluyede also mentioned, it would be helpful to say a
little more about what "pythonic" means.

> Restricting what is in the built-in namespace and safe-guarding
> the interpreter (which includes safe-guarding the built-in types) is
> where security will come from.

Sounds good.

> Abilities of a Standard Sandboxed Interpreter
> =============================================
>
[...]
> * You cannot open any files directly.
> * Importation
>     + You can import any pure Python module.
>     + You cannot import any Python bytecode module.
>     + You cannot import any C extension module.
>     + You cannot import any built-in module.
> * You cannot find out any information about the operating system you
>   are running on.
> * Only safe built-ins are provided.

This looks reasonable.  This is probably a good place to itemize
exactly which built-ins are considered safe.

> Imports
> -------
>
> A proxy for protecting imports will be provided.  This is done by
> setting the ``__import__()`` function in the built-in namespace of the
> sandboxed interpreter to a proxied version of the function.
>
> The planned proxy will take in a passed-in function to use for the
> import and a whitelist of C extension modules and built-in modules to
> allow importation of.

Presumably these are passed in to the proxy's constructor.

> If an import would lead to loading an extension
> or built-in module, it is checked against the whitelist and allowed
> to be imported based on that list.  All .pyc and .pyo files will not
> be imported.  All .py files will be imported.

I'm unclear about this.  Is the whitelist a list of module names only,
or of filenames with extensions?  Does the normal path-searching process
take place or can it be restricted in some way?  Would it simplify the
security analysis to have the whitelist be a dictionary that maps module
names to absolute pathnames?

If both the .py and .pyc are present, the normal import would find the
.pyc file; would the import proxy reject such an import or ignore it
and recompile the .py instead?
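
For concreteness, the rough shape of proxy I'm imagining is something like
the sketch below (all names are made up, and detecting C extension modules --
as opposed to built-ins -- is glossed over; that's part of what I'm asking
about):

    import sys

    class ImportProxy(object):
        """Sketch of a whitelisting replacement for __import__."""
        def __init__(self, real_import, whitelist):
            self.real_import = real_import
            self.whitelist = frozenset(whitelist)  # allowed built-in/C module names

        def __call__(self, name, *args, **kwds):
            # Pure Python modules pass through; only the dangerous kinds
            # are checked against the whitelist.
            if name in sys.builtin_module_names and name not in self.whitelist:
                raise ImportError("built-in module %r is not whitelisted" % name)
            return self.real_import(name, *args, **kwds)

    # e.g., installed by the creating interpreter into the sandbox's builtins:
    #   sandbox_builtins['__import__'] = ImportProxy(__import__, ['math'])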

> It must be warned that importing any C extension module is dangerous.

Right.

> Implementing Import in Python
> +++++++++++++++++++++++++++++
>
> To help facilitate in the exposure of more of what importation
> requires (and thus make implementing a proxy easier), the import
> machinery should be rewritten in Python.

This seems like a good idea.  Can you identify which minimum essential
pieces of the import machinery have to be written in C?

> Sanitizing Built-In Types
> -------------------------
[...]
> Constructors
> ++++++++++++
>
> Almost all of Python's built-in types
> contain a constructor that allows code to create a new instance of a
> type as long as you have the type itself.  Unfortunately this does not
> work in an object-capabilities system without either providing a proxy
> to the constructor or just turning it off.

The existence of the constructor isn't (by itself) the problem.
The problem is that both of the following are true:

    (a) From any object you can get its type object.
    (b) Using any type object you can construct a new instance.

So, you can control this either by hiding the type object, separating
the constructor from the type, or disabling the constructor.
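
(The classic illustration -- file names here are made up -- given any file
object, its type is reachable and the type makes new files, without ever
touching the name 'file' or the open() built-in:)

>>> f = open('/tmp/harmless.txt', 'w')     # an object the sandbox happens to hold
>>> type(f)
<type 'file'>
>>> g = type(f)('/tmp/anything-else.txt', 'w')   # a brand new file object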

> Types whose constructors are considered dangerous are:
>
> * ``file``
>     + Will definitely use the ``open()`` built-in.
> * code objects
> * XXX sockets?
> * XXX type?
> * XXX

Looks good so far.  Not sure i see what's dangerous about 'type'.

> Filesystem Information
> ++++++++++++++++++++++
>
> When running code in a sandboxed interpreter, POLA suggests that you
> do not want to expose information about your environment on top of
> protecting its use.  This means that filesystem paths typically should
> not be exposed.  Unfortunately, Python exposes file paths all over the
> place:
>
> * Modules
>     + ``__file__`` attribute
> * Code objects
>     + ``co_filename`` attribute
> * Packages
>     + ``__path__`` attribute
> * XXX
>
> XXX how to expose safely?

It seems that in most cases, a single Python object is associated with
a single pathname.  If that's true in general, one solution would be
to provide an introspection function named 'getpath' or something
similar that would get the path associated with any object.  This
function might go in a module containing all the introspection functions,
so imports of that module could be easily restricted.
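
e.g. something along these lines (just a sketch -- the name 'getpath' and
the attribute list are invented):

    def getpath(obj):
        """Return the filesystem path associated with obj, if any."""
        for attr in ('__file__', 'co_filename'):
            path = getattr(obj, attr, None)
            if path is not None:
                return path
        pkgpath = getattr(obj, '__path__', None)   # packages
        if pkgpath:
            return pkgpath[0]
        raise TypeError('no path associated with %r' % (obj,))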

> Mutable Shared State
> ++++++++++++++++++++
>
> Because built-in types are shared between interpreters, they cannot
> expose any mutable shared state.  Unfortunately, as it stands, some
> do.  Below is a list of types that share some form of dangerous state,
> how they share it, and how to fix the problem:
>
> * ``object``
>     + ``__subclasses__()`` function
>         - Remove the function; never seen used in real-world code.
> * XXX

Okay, more to work out here. :)

> Perimeter Defences Between a Created Interpreter and Its Creator
> ----------------------------------------------------------------
>
> The plan is to allow interpreters to instantiate sandboxed
> interpreters safely.  By using the creating interpreter's abilities to
> provide abilities to the created interpreter, you make sure there is
> no escalation in abilities.

Good.

> * ``__del__`` created in sandboxed interpreter but object is cleaned
>   up in unprotected interpreter.

How do you envision the launching of a sandboxed interpreter to look?
Could you sketch out some rough code examples?  Were you thinking of
something like:

    sys.spawn(code, dict)
        code: a string containing Python source code
        dict: the global namespace in which to run the code

If you allow the parent interpreter to pass mutable objects into the
child interpreter, then the parent and child can already communicate
via the object, so '__del__' is a moot issue.  Do you want to prevent
all communication between parent and child?  It's not obvious to me
why that would be necessary.

> * Using frames to walk the frame stack back to another interpreter.

Could you just disable introspection of the frame stack?

> Making the ``sys`` Module Safe
> ------------------------------
[...]
> This means that the ``sys`` module needs to have its safe information
> separated out from the unsafe settings.

Yes.

> XXX separate modules, ``sys.settings`` and ``sys.info``, or strip
> ``sys`` to settings and put info somewhere else?  Or provide a method
> that will create a faked sys module that has the safe values copied
> into it?

I think the last suggestion above would lead to confusion.  The two
groups should have two distinct names and it should be clear which
attribute goes with which group.

> Protecting I/O
> ++++++++++++++
>
> The ``print`` keyword and the built-ins ``raw_input()`` and
> ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``.
> By exposing these attributes to the creating interpreter, one can set
> them to safe objects, such as instances of ``StringIO``.

Sounds good.

> Safe Networking
> ---------------
>
> XXX proxy on socket module, modify open() to be the constructor, etc.

Lots more to think about here. :)

> Protecting Memory Usage
> -----------------------
>
> To protect memory, low-level hooks into the memory allocator for
> Python are needed.  By hooking into the C API for memory allocation and
> deallocation a very rough running count of used memory can be kept.  This
> can be used to prevent sandboxed interpreters from using so much
> memory that it impacts the overall performance of the system.

Preventing denial-of-service is in general quite difficult, but i
applaud the attempt.  I agree with your decision to separate this
work from the rest of the security model.


-- ?!ng

From nnorwitz at gmail.com  Thu Sep  7 09:28:39 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 7 Sep 2006 00:28:39 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To: <bbaeab100609051241m7d878b0dtd93018b535b9ee14@mail.gmail.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
	<bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
	<44FDD122.3000809@egenix.com>
	<bbaeab100609051241m7d878b0dtd93018b535b9ee14@mail.gmail.com>
Message-ID: <ee2a432c0609070028t657e538dqf675e3ce45115150@mail.gmail.com>

On 9/5/06, Brett Cannon <brett at python.org> wrote:
>
> > [MAL]
> > The proper fix would be to introduce a tp_unicode slot and let
> > this decide what to do, ie. call .__unicode__() methods on instances
> > and use the .__name__ on classes.
>
> That was my gut reaction and what I said on the bug report.  Kind of
> surprised one doesn't already exist.
>
> > I think this would be the right way to go for Python 2.6. For
> > Python 2.5, just dropping this .__unicode__ method on exceptions
> > is probably the right thing to do.
>
> Neal, do you want to rip it out or should I?

Is removing __unicode__ backwards compatible with 2.4 for both
instances and exception classes?

Does everyone agree this is the proper approach?  I'm not familiar
with this code.  Brett, if everyone agrees (ie, remains silent),
please fix this and add tests and a NEWS entry.

Everyone should be looking for incompatibilities with previous
versions.  Exceptions are new and deserve special attention.  Lots of
the internals of strings (8-bit and unicode) and the struct module
changed and should be tested thoroughly.  I'm sure there are a bunch
of other things I'm not remembering.  The compiler is also an obvious
target to verify your code still works.

We're stuck with anything that makes it into 2.5, so now is the time
to fix these problems.

n

From ronaldoussoren at mac.com  Thu Sep  7 11:17:37 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 7 Sep 2006 11:17:37 +0200
Subject: [Python-Dev] 2.5 status
In-Reply-To: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
Message-ID: <E691718F-B71B-4B1E-8615-D4B70F3B4191@mac.com>


On 5-sep-2006, at 6:24, Neal Norwitz wrote:

> There are 3 bugs currently listed in PEP 356 as blocking:
>         http://python.org/sf/1551432 - __unicode__ breaks on exception classes
>         http://python.org/sf/1550938 - improper exception w/ relative import
>         http://python.org/sf/1541697 - sgmllib regexp bug causes hang
>
> Does anyone want to fix the sgmllib issue?  If not, we should revert
> this week before c2 is cut.  I'm hoping that we will have *no changes*
> in 2.5 final from c2.  Should there be any bugs/patches added to or
> removed from the list?
>
> The buildbots are currently humming along, but I believe all 3
> versions (2.4, 2.5, and 2.6) are fine.
>
> Test out 2.5c1+ and report all bugs!

I have another bug that I'd like to fix: Mac/ReadMe contains an  
error: it claims that you can build the frameworkinstall into a  
temporary directory and then move it into place, but that isn't  
actually true. The erroneous paragraph is this:

    Note that there are no references to the actual locations in the code or
    resource files, so you are free to move things around afterwards.  For
    example, you could use --enable-framework=/tmp/newversion/Library/Frameworks
    and use /tmp/newversion as the basis for an installer or something.

My proposed fix is to drop this paragraph. There is no bug report for
this yet; I was notified of this issue in a private e-mail.

Ronald

From nnorwitz at gmail.com  Thu Sep  7 11:19:35 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 7 Sep 2006 02:19:35 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To: <E691718F-B71B-4B1E-8615-D4B70F3B4191@mac.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
	<E691718F-B71B-4B1E-8615-D4B70F3B4191@mac.com>
Message-ID: <ee2a432c0609070219h5133fd18j13c5236dab83c228@mail.gmail.com>

Doc patches are fine, please fix.

n
--

On 9/7/06, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>
> On 5-sep-2006, at 6:24, Neal Norwitz wrote:
>
> > There are 3 bugs currently listed in PEP 356 as blocking:
> >         http://python.org/sf/1551432 - __unicode__ breaks on exception classes
> >         http://python.org/sf/1550938 - improper exception w/ relative import
> >         http://python.org/sf/1541697 - sgmllib regexp bug causes hang
> >
> > Does anyone want to fix the sgmllib issue?  If not, we should revert
> > this week before c2 is cut.  I'm hoping that we will have *no changes*
> > in 2.5 final from c2.  Should there be any bugs/patches added to or
> > removed from the list?
> >
> > The buildbots are currently humming along, but I believe all 3
> > versions (2.4, 2.5, and 2.6) are fine.
> >
> > Test out 2.5c1+ and report all bugs!
>
> I have another bug that I'd like to fix: Mac/ReadMe contains an
> error: it claims that you can build the frameworkinstall into a
> temporary directory and then move it into place, but that isn't
> actually true. The erroneous paragraph is this:
>
>     Note that there are no references to the actual locations in the code or
>     resource files, so you are free to move things around afterwards.  For
>     example, you could use --enable-framework=/tmp/newversion/Library/Frameworks
>     and use /tmp/newversion as the basis for an installer or something.
>
> My proposed fix is to drop this paragraph. There is no bug report for
> this yet; I was notified of this issue in a private e-mail.
>
> Ronald
>

From ncoghlan at gmail.com  Thu Sep  7 12:59:01 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 07 Sep 2006 20:59:01 +1000
Subject: [Python-Dev] inspect.py very slow under 2.5
In-Reply-To: <44FEE0EA.7000303@brainbot.com>
References: <edlkau$27h$1@sea.gmane.org>
	<44FE9F71.3090903@brainbot.com>	<44FEA8F5.1000700@gmail.com>
	<44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com>
	<44FEE0EA.7000303@brainbot.com>
Message-ID: <44FFFB75.3030903@gmail.com>

Ralf Schmitt wrote:
> Nick Coghlan wrote:
>> Good point. I modified the patch so it does the latter (it only calls 
>> getabspath() again for a module if the value of module.__file__ changes).
> 
> with _filesbymodname[modname] = file changed to _filesbymodname[modname] 
> = f
> it seems to work ok.

I checked the inspect module unit tests and discovered the test for this 
function was only covering one of the half dozen possible execution paths.

I've updated the patch on SF, and committed the fix (including PJE's and 
Neal's comments) to the trunk.

I'll backport it tomorrow night (assuming I don't hear any objections in the 
meantime :).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From murman at gmail.com  Thu Sep  7 15:37:41 2006
From: murman at gmail.com (Michael Urman)
Date: Thu, 7 Sep 2006 08:37:41 -0500
Subject: [Python-Dev] Change in file() behavior in 2.5
Message-ID: <dcbbbb410609070637j5f0f719fia856ec0724ad3ffe@mail.gmail.com>

Hi folks,

Between 2.4 and 2.5 the behavior of file or open with the mode 'wU'
has changed. In 2.4 it silently works. In 2.5 it raises a ValueError.
I can't find any more discussion on it in python-dev than tangential
mentions in this thread:
  http://mail.python.org/pipermail/python-dev/2006-June/065939.html

It is (buried) in NEWS. First I found:
  Bug #1462152: file() now checks more thoroughly for invalid mode
  strings and removes a possible "U" before passing the mode to the
  C library function.
Which seems to imply different behavior than the actual entry:
  bug #967182: disallow opening files with 'wU' or 'aU' as specified by PEP
  278.

I don't see anything in PEP 278 about a timeline, and wanted to make
sure that transitioning directly from working to raising an error was
a desired change. This actually caught a bug in an application I work
with, which used an explicit 'wU' and will stop working when people
upgrade Python but not our application.
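
A quick way to see the change (the file name is just a scratch name):

    # prints 'silently accepted' on 2.4, 'rejected: ...' on 2.5
    try:
        f = open('scratch.txt', 'wU')
    except ValueError, e:
        print 'rejected:', e
    else:
        print 'silently accepted'
        f.close()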

Thanks,
Michael
-- 
Michael Urman  http://www.tortall.net/mu/blog

From mwh at python.net  Thu Sep  7 16:15:35 2006
From: mwh at python.net (Michael Hudson)
Date: Thu, 07 Sep 2006 15:15:35 +0100
Subject: [Python-Dev] Change in file() behavior in 2.5
In-Reply-To: <dcbbbb410609070637j5f0f719fia856ec0724ad3ffe@mail.gmail.com>
	(Michael Urman's message of "Thu, 7 Sep 2006 08:37:41 -0500")
References: <dcbbbb410609070637j5f0f719fia856ec0724ad3ffe@mail.gmail.com>
Message-ID: <2m4pvjoj94.fsf@starship.python.net>

"Michael Urman" <murman at gmail.com> writes:

> Hi folks,
>
> Between 2.4 and 2.5 the behavior of file or open with the mode 'wU'
> has changed. In 2.4 it silently works. in 2.5 it raises a ValueError.
> I can't find any more discussion on it in python-dev than tangential
> mentions in this thread:
>   http://mail.python.org/pipermail/python-dev/2006-June/065939.html
>
> It is (buried) in NEWS. First I found:
>   Bug #1462152: file() now checks more thoroughly for invalid mode
>   strings and removes a possible "U" before passing the mode to the
>   C library function.
> Which seems to imply different behavior than the actual entry:
>   bug #967182: disallow opening files with 'wU' or 'aU' as specified by PEP
>   278.
>
> I don't see anything in pep278 about a timeline, and wanted to make
> sure that transitioning directly from working to raising an error was
> a desired change. 

That it was silently ignored was never intentional; it was a bug and
it was fixed.  I don't think having a release with deprecation
warnings and so on is worth it.

> This actually caught a bug in an application I work with, which used
> an explicit 'wU', that will currently stop working when people
> upgrade Python but not our application.

I would hope they wouldn't do that without careful testing anyway.

Cheers,
mwh

-- 
  No.  In fact, my eyeballs fell out just from reading this question,
  so it's a good thing I can touch-type.
                                    -- John Baez, sci.physics.research

From fperez.net at gmail.com  Thu Sep  7 17:31:20 2006
From: fperez.net at gmail.com (Fernando Perez)
Date: Thu, 07 Sep 2006 09:31:20 -0600
Subject: [Python-Dev] inspect.py very slow under 2.5
References: <edlkau$27h$1@sea.gmane.org>
	<44FE9F71.3090903@brainbot.com>	<44FEA8F5.1000700@gmail.com>
	<44FEAF85.1000107@brainbot.com> <44FEC7D9.80500@gmail.com>
	<44FEE0EA.7000303@brainbot.com> <44FFFB75.3030903@gmail.com>
Message-ID: <edpe08$ip3$1@sea.gmane.org>

Nick Coghlan wrote:

> I've updated the patch on SF, and committed the fix (including PJE's and
> Neal's comments) to the trunk.
> 
> I'll backport it tomorrow night (assuming I don't hear any objections in the
> meantime :).

I just wanted to thank you all for taking the time to work on this, even with
my 11th-hour report.  Greatly appreciated, really.

Looking forward to 2.5!

f


From grig.gheorghiu at gmail.com  Thu Sep  7 17:34:17 2006
From: grig.gheorghiu at gmail.com (Grig Gheorghiu)
Date: Thu, 7 Sep 2006 08:34:17 -0700
Subject: [Python-Dev] 'with' bites Twisted
Message-ID: <3f09d5a00609070834m35694c34u5af582dff3aa5bb4@mail.gmail.com>

When the pybot buildslave for Twisted is trying to run the Twisted test
suite via 'trial', it gets an exception:

Traceback (most recent call last):
  File "/tmp/Twisted/bin/trial", line 23, in <module>
    from twisted.scripts.trial import run
  File "/tmp/Twisted/twisted/scripts/trial.py", line 10, in <module>
    from twisted.application import app
  File "/tmp/Twisted/twisted/application/app.py", line 10, in <module>
    from twisted.application import service
  File "/tmp/Twisted/twisted/application/service.py", line 20, in <module>
    from twisted.python import components
  File "/tmp/Twisted/twisted/python/components.py", line 37, in <module>
    from zope.interface.adapter import AdapterRegistry
  File "/tmp/python-buildbot/local/lib/python2.6/site-packages/zope/interface/adapter.py",
line 201
    for with, objects in v.iteritems():
           ^
SyntaxError: invalid syntax


So the culprit in this case is really zope.interface.
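
To make the failure mode concrete: 'with' is a keyword on the 2.6 trunk
(PEP 343), so any module still using it as an identifier stops compiling.
A minimal reproduction (not the actual zope.interface code):

    >>> compile("for with, objects in v.iteritems(): pass", "<adapter>", "exec")
    Traceback (most recent call last):
      ...
    SyntaxError: invalid syntax

Renaming the loop variable in zope.interface (to 'with_', say) is enough to
restore compatibility.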

The full log is here:

http://www.python.org/dev/buildbot/community/all/x86%20RedHat%209%20trunk/builds/97/step-shell/0

Grig

-- 
http://agiletesting.blogspot.com

From exarkun at divmod.com  Thu Sep  7 18:06:13 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Thu, 7 Sep 2006 12:06:13 -0400
Subject: [Python-Dev] [Twisted-Python] Newbie question
In-Reply-To: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com>
Message-ID: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm>

On Thu, 7 Sep 2006 11:41:48 -0400, Timothy Fitz <timothyfitz at gmail.com> wrote:
>On 9/5/06, Jean-Paul Calderone <exarkun at divmod.com> wrote:
>>You cannot stop the reactor and then start it again.
>
>Why don't the reactors throw if this happens? This question comes up
>almost once a month.
>

One could just as easily ask why no one bothers to read mailing list
archives to see if their question has been answered before.

No one will ever know, it is just one of the mysteries of the universe.

Jean-Paul

From aahz at pythoncraft.com  Thu Sep  7 18:22:17 2006
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 7 Sep 2006 09:22:17 -0700
Subject: [Python-Dev] [Twisted-Python] Newbie question
In-Reply-To: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm>
References: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com>
	<20060907160613.1717.1053187541.divmod.quotient.42002@ohm>
Message-ID: <20060907162217.GA17623@panix.com>

On Thu, Sep 07, 2006, Jean-Paul Calderone wrote:
> On Thu, 7 Sep 2006 11:41:48 -0400, Timothy Fitz <timothyfitz at gmail.com> wrote:
>>On 9/5/06, Jean-Paul Calderone <exarkun at divmod.com> wrote:
>>>
>>>You cannot stop the reactor and then start it again.
>>
>>Why don't the reactors throw if this happens? This question comes up
>>almost once a month.
> 
> One could just as easily ask why no one bothers to read mailing list
> archives to see if their question has been answered before.
> 
> No one will ever know, it is just one of the mysteries of the universe.

One could also ask why this got x-posted to python-dev...
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

I support the RKAB

From skip at pobox.com  Thu Sep  7 18:31:42 2006
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 7 Sep 2006 11:31:42 -0500
Subject: [Python-Dev] [Twisted-Python] Newbie question
In-Reply-To: <20060907160613.1717.1053187541.divmod.quotient.42002@ohm>
References: <972ec5bd0609070841j6cd2a600o4c6db5567dffd790@mail.gmail.com>
	<20060907160613.1717.1053187541.divmod.quotient.42002@ohm>
Message-ID: <17664.18798.756868.339094@montanaro.dyndns.org>

    Jean-Paul> One could just as easily ask why no one bothers to read
    Jean-Paul> mailing list archives to see if their question has been
    Jean-Paul> answered before.

    Jean-Paul> No one will ever know, it is just one of the mysteries of the
    Jean-Paul> universe.

+1 QOTF...

Skip

From exarkun at divmod.com  Thu Sep  7 18:36:00 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Thu, 7 Sep 2006 12:36:00 -0400
Subject: [Python-Dev] [Twisted-Python] Newbie question
In-Reply-To: <20060907162217.GA17623@panix.com>
Message-ID: <20060907163600.1717.1300037898.divmod.quotient.42020@ohm>

Sorry, brainfart.

Jean-Paul

From kristjan at ccpgames.com  Thu Sep  7 18:56:15 2006
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=)
Date: Thu, 7 Sep 2006 16:56:15 -0000
Subject: [Python-Dev] Unicode Imports
Message-ID: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>

Hello All.
I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5) which allows unicode paths in sys.path and uses the unicode file API on Windows.
This is tried and tested on 2.5, and backported to 2.3, and is currently running on clients in China and elsewhere.  It is minimally intrusive to the importing mechanism, at the cost of some string conversion overhead (to utf8 and then back to unicode).
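
To sketch the situation the patch addresses (the path and module name below
are made up, and the exact failure depends on the platform and codepage):

    import sys

    # A plug-in directory under a user profile containing non-ASCII
    # characters.  Today the import machinery works with byte strings, so on
    # Windows a path like this cannot reliably be searched when it does not
    # fit the ANSI codepage used by the byte-string file API:
    plugin_dir = u'C:\\Documents and Settings\\\u7528\u6237\\plugins'
    sys.path.append(plugin_dir)

    import myplugin   # hypothetical module living in plugin_dir; fails today,
                      # works with the patch (unicode file API + sys.path)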
 
Cheers,
Kristján

From skip at pobox.com  Thu Sep  7 19:23:39 2006
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 7 Sep 2006 12:23:39 -0500
Subject: [Python-Dev] [Twisted-Python] Newbie question
In-Reply-To: <20060907163600.1717.1300037898.divmod.quotient.42020@ohm>
References: <20060907162217.GA17623@panix.com>
	<20060907163600.1717.1300037898.divmod.quotient.42020@ohm>
Message-ID: <17664.21915.553226.875941@montanaro.dyndns.org>


    Jean-Paul> Sorry, brainfart.

But still... QOTF ;-)

S

From amk at amk.ca  Thu Sep  7 19:39:00 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 7 Sep 2006 13:39:00 -0400
Subject: [Python-Dev] Arlington sprints to occur monthly
Message-ID: <20060907173900.GA4691@rogue.amk.ca>

Jeffrey Elkner has arranged things so that the 1-day Python sprints in
Arlington VA will now be happening every month.  Future sprints will
be on September 23rd, October 21st, November 18th, and December 16th.

See http://wiki.python.org/moin/ArlingtonSprint for directions and to
sign up.

--amk


From brett at python.org  Thu Sep  7 19:39:20 2006
From: brett at python.org (Brett Cannon)
Date: Thu, 7 Sep 2006 10:39:20 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To: <ee2a432c0609070028t657e538dqf675e3ce45115150@mail.gmail.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
	<bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
	<44FDD122.3000809@egenix.com>
	<bbaeab100609051241m7d878b0dtd93018b535b9ee14@mail.gmail.com>
	<ee2a432c0609070028t657e538dqf675e3ce45115150@mail.gmail.com>
Message-ID: <bbaeab100609071039i5515ff10v93e8904ee9089b0e@mail.gmail.com>

On 9/7/06, Neal Norwitz <nnorwitz at gmail.com> wrote:
>
> On 9/5/06, Brett Cannon <brett at python.org> wrote:
> >
> > > [MAL]
> > > The proper fix would be to introduce a tp_unicode slot and let
> > > this decide what to do, ie. call .__unicode__() methods on instances
> > > and use the .__name__ on classes.
> >
> > That was my gut reaction and what I said on the bug report.  Kind of
> > surprised one doesn't already exist.
> >
> > > I think this would be the right way to go for Python 2.6. For
> > > Python 2.5, just dropping this .__unicode__ method on exceptions
> > > is probably the right thing to do.
> >
> > Neal, do you want to rip it out or should I?
>
> Is removing __unicode__ backwards compatible with 2.4 for both
> instances and exception classes?


 Should be.  There was no proper __unicode__() originally so that's why this
whole problem came up in the first place.

> Does everyone agree this is the proper approach?  I'm not familiar
> with this code.


I am not terribly familiar with it anymore either since Georg and Richard rewrote the whole
thing.  =)

> Brett, if everyone agrees (ie, remains silent),
> please fix this and add tests and a NEWS entry.


OK.

-Brett

From anthony at interlink.com.au  Thu Sep  7 19:53:03 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 8 Sep 2006 03:53:03 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
Message-ID: <200609080353.07502.anthony@interlink.com.au>

On Friday 08 September 2006 02:56, Kristján V. Jónsson wrote:
> Hello All.
> I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5)
> which allows unicode paths in sys.path and uses the unicode file api on
> windows. This is tried and tested on 2.5, and backported to 2.3 and is
> currently running on clients in China and elsewhere.  It is minimally
> intrusive to the importing mechanism, at the cost of some string conversion
> overhead (to utf8 and then back to unicode).

As this can't be considered a bugfix (that I can see), I'd be against it being 
checked into 2.5. 


From brett at python.org  Thu Sep  7 20:26:53 2006
From: brett at python.org (Brett Cannon)
Date: Thu, 7 Sep 2006 11:26:53 -0700
Subject: [Python-Dev] new security doc using object-capabilities
In-Reply-To: <Pine.LNX.4.58.0609062237510.19634@server1.LFW.org>
References: <bbaeab100607191535p543cb0ddj66410cf985dd9b77@mail.gmail.com>
	<Pine.LNX.4.58.0609062237510.19634@server1.LFW.org>
Message-ID: <bbaeab100609071126n382a9353s3d62bca1249d28c@mail.gmail.com>

On 9/6/06, Ka-Ping Yee <python-dev at zesty.ca> wrote:
>
> Hi Brett,
>
> Here are some comments on your proposal.  Sorry this took so long.
> I apologize if any of these comments are out of date (but also look
> forward to your answers to some of the questions, as they'll help
> me understand some more of the details of your proposal).  Thanks!


I think they are slightly outdated.  The latest version of the doc is in the
bcannon-objcap branch and is named securing_python.txt (
http://svn.python.org/view/python/branches/bcannon-objcap/securing_python.txt
).

> Introduction
> > ///////////////////////////////////////
> [...]
> > Throughout this document several terms are going to be used.  A
> > "sandboxed interpreter" is one where the built-in namespace is not the
> > same as that of an interpreter whose built-ins were unaltered, which
> > is called an "unprotected interpreter".
>
> Is this a definition or an implementation choice?  As in, are you
> defining "sandboxed" to mean "with altered built-ins" or just
> "restricted in some way", and does the above mean to imply that
> altering the built-ins is what triggers other kinds of restrictions
> (as it did in Python's old restricted execution mode)?


There is no "triggering" of other restrictions.  This is an implementation
choice.  "Sandboxed" means "with altered built-ins".

> A "bare interpreter" is one where the built-in namespace has been
> > stripped down to the bare minimum needed to run any form of basic Python
> > program.  This means that all atomic types (i.e., syntactically
> > supported types), ``object``, and the exceptions provided by the
> > ``exceptions`` module are considered in the built-in namespace.  There
> > have also been no imports executed in the interpreter.
>
> Is a "bare interpreter" just one example of a sandboxed interpreter,
> or are all sandboxed interpreters in your design initially bare (i.e.
> "sandboxed" = "bare" + zero or more granted authorities)?


You build up from a bare interpreter by adding in authorities (e.g.,
providing a wrapped version of open()) to reach the level of security you
want.
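
Purely to illustrate what "providing a wrapped version of open()" might look
like (every name here is made up; nothing below is part of the actual
design):

    import os

    def make_wrapped_open(allowed_dir):
        """Return an open() replacement limited to read-only access under
        allowed_dir."""
        allowed_dir = os.path.abspath(allowed_dir)
        def wrapped_open(path, mode='r'):
            if mode not in ('r', 'rb'):
                raise IOError('sandbox: write access denied')
            full = os.path.abspath(path)
            if not full.startswith(allowed_dir + os.sep):
                raise IOError('sandbox: %r is outside the allowed directory'
                              % path)
            return open(full, mode)
        return wrapped_open

The creating interpreter would drop the result into the sandboxed
interpreter's built-in namespace, along the lines of the Interpreter()
sketch further down in this message.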

> The "security domain" is the boundary at which security is cared
> > about.  For this discussion, it is the interpreter.
>
> It might be clearer to say (if i understand correctly) "Each interpreter
> is a separate security domain."
>


> Many interpreters can run within a single operating system process,
> right?


Yes.

> Could you say a bit about what sort of concurrency model you
> have in mind?


None specifically.  Each new interpreter automatically runs in its own
Python thread, so they have essentially the same concurrency as using the
'thread' module.

> How would this interact (if at all) with use of the
> existing threading functionality?


See above.

> The "powerbox" is the thing that possesses the ultimate power in the
> > system.  In our case it is the Python process.
>
> This could also be the application process, right?


If Python is embedded, yes.

> Rationale
> > ///////////////////////////////////////
> [...]
> > For instance, think of an application that supports a plug-in system
> > with Python as the language used for writing plug-ins.  You do not
> > want to have to examine every plug-in you download to make sure that
> > it does not alter your filesystem if you can help it.  With a proper
> > security model and implementation in place this hindrance of having
> > to examine all code you execute should be alleviated.
>
> I'm glad to have this use case set out early in the document, so the
> reader can keep it in mind as an example while reading about the model.
>
> > Approaches to Security
> > ///////////////////////////////////////
> >
> > There are essentially two types of security: who-I-am
> > (permissions-based) security and what-I-have (authority-based)
> > security.
>
> As Mark Miller mentioned in another message, your descriptions of
> "who-I-am" security and "what-I-have" security make sense, but
> they don't correspond to "permission" vs. "authority".  They
> correspond to "identity-based" vs. "authority-based" security.


Right.  This was fixed the day Mark and Alan Karp made the comment.

> Difficulties in Python for Object-Capabilities
> > //////////////////////////////////////////////
> [...]
> > Three key requirements for providing a proper perimeter defence is
> > private namespaces, immutable shared state across domains, and
> > unforgeable references.
>
> Nice summary.
>
> > Problem of No Private Namespace
> > ===============================
> [...]
> > The Python language has no such thing as a private namespace.
>
> Don't local scopes count as private namespaces?  It seems clear
> that they aren't designed with the intention of being exposed,
> unlike other namespaces in Python.


Sort of.  But you can still get access to them if you have an execution
frame and they are not persistent.  Generators are worse since they
store their execution frame with the generator itself, completely exposing
the local namespace.

> It also makes providing security at the object level using
> > object-capabilities non-existent in pure Python code.


> I don't think this is necessarily the case.  No Python code i've
> ever seen expects to be able to invade the local scopes of other
> functions, so you could use them as private namespaces.  There
> are two ways i've seen to invade local scopes:
>
>     (a) Use gc.get_referents to get back from a cell object
>         to its contents.
>
>     (b) Compare the cell object to another cell object, thereby
>         causing __eq__ to be invoked to compare the contents of
>         the cells.


Or the execution frame which is exposed directly on generators.

But regardless, the comment was meant to apply to Python as it stands, not
that it couldn't be possibly tweaked somehow.
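
For concreteness, point (a) above can be shown in a stock CPython 2.x
interpreter (toy code, nothing specific to the security design):

    import gc

    def make_getter():
        secret = 'hidden'
        def getter():
            return len(secret)      # closes over 'secret' without returning it
        return getter

    g = make_getter()
    cell = g.func_closure[0]        # the closure cell wrapping 'secret'
    print gc.get_referents(cell)    # -> ['hidden']; the "private" value leaks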

> So you could protect local scopes by prohibiting these or by
> simply turning off access to func_closure.  It's clear that hardly
> any code depends on these introspection features, so it would be
> reasonable to turn them off in a sandboxed interpreter.  (It seems
> you would have to turn off some introspection features anyway in
> order to have reliable import guards.)


Maybe this can be changed in the future, but this is more than I need at the
moment so I am not going to go down that path right now.  But I added a
quick mention of this.

> Problem of Mutable Shared State
> > ===============================
> [...]
> > Regardless, sharing of state that can be influenced by another
> > interpreter is not safe for object-capabilities.
>
> Yup.
>
> > Threat Model
> > ///////////////////////////////////////
>
> Good to see this specified here.  I like the way you've broken this
> down.


The current version has more details per point than the one you read.

> * An interpreter cannot gain abilities the Python process possesses
> >   without explicitly being given those abilities.
>
> It would be good to enumerate which abilities you're referring to in
> this item.  For example, a bare interpreter should be able to allocate
> memory and call most of the built-in functions, but should not be able
> to open network connections.
>
> > * An interpreter cannot influence another interpreter directly at the
> >   Python level without explicitly allowing it.
>
> You mean, without some other entity explicitly allowing it, right?


Yep.

> What would that other entity be -- presumably the interpreter that
> spawned both of these sub-interpreters?


Sure.  You could stick something in the built-in namespace of the
sub-interpreter to use for communicating.

> * An interpreter cannot use operating system resources without being
> >   explicitly given those resources.
>
> Okay.
>
> > * A bare Python interpreter is always trusted.
>
> What does "trusted" mean in the above?


It means that if Python source code can execute within a bare interpreter it
is considered safe code.  This is covered in the new version of the doc.

> * Python bytecode is always distrusted.
> > * Pure Python source code is always safe on its own.
>
> It would be helpful to clarify "safe" here.  I assume by "safe" you
> mean that the Python source code can express whatever it wants,
> including potentially dangerous activities, but when run in a bare
> or sandboxed interpreter it cannot have harmful effects.  But then
> in what sense does the "safety" have to do with the Python source code
> rather than the restrictions on the interpreter?
>
> Would it be correct to say:
>   + We want to guarantee that Python source code cannot violate
>     the restrictions in a restricted or bare interpreter.
>   + We do not prevent arbitrary Python bytecode from violating
>     these restrictions, and assume that it can.


>     + Malicious abilities are derived from C extension modules,
> >       built-in modules, and unsafe types implemented in C, not from
> >       pure Python source.
>
> By "malicious" do you just mean "anything that isn't accessible to
> a bare interpreter"?


Anything that could harm the system or interpreter.

> * A sub-interpreter started by another interpreter does not inherit
> >   any state.
>
> Do you envision a tree of interpreters and sub-interpreters?  Can the
> levels of spawning get arbitrarily deep?


Yes and yes.

> If i am visualizing your model correctly, maybe it would be useful to
> introduce the term "parent", where each interpreter has as its parent
> either the Python process or another interpreter.  Then you could say
> that each interpreter acquires authority only by explicit granting from
> its parent.


You could, although there is no hierarchy at the implementation level.  But
it works in terms of who has a reference to whom and who gives each
interpreter their authority.


> Then i have another question: can an interpreter acquire
> authorities only when it is started, or can it acquire them while it is
> running, and how?


 Well, whatever you want to do through the built-in namespace.  So if you
pass in a mutable object like a dict and add stuff to it on the fly, I don't
see why you couldn't give new authorities on the fly.
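
A rough sketch of that idea, reusing the hypothetical Interpreter API shown
further down in this message ('authorities' and 'wrapped_listdir' are
made-up names, and the 'exec' method spelling simply follows that sketch):

    authorities = {}                     # mutable object shared with the child
    interp = interpreter.Interpreter()
    interp.builtins['authorities'] = authorities
    interp.exec("lister = authorities.get('listdir')")  # nothing granted yet

    authorities['listdir'] = wrapped_listdir            # granted on the fly
    interp.exec("files = authorities['listdir']('.')")  # child can use it now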

> Implementation
> > ///////////////////////////////////////
> >
> > Guiding Principles
> > ========================
> >
> > To begin, the Python process garners all power as the powerbox.  It is
> > up to the process to initially hand out access to resources and
> > abilities to interpreters.  This might take the form of an interpreter
> > with all abilities granted (i.e., a standard interpreter as launched
> > when you execute Python), which then creates sub-interpreters with
> > sandboxed abilities.  Another alternative is only creating
> > interpreters with sandboxed abilities (i.e., Python being embedded in
> > an application that only uses sandboxed interpreters).
>
> This sounds like part of your design to me.  It might help to have
> this earlier in the document (maybe even with an example diagram of a
> tree of interpreters).


Made Guiding Principles its own section and split off the bottom part of the
section and put it under Implementation.

> All security measures should never have to ask who an interpreter is.
> > This means that what abilities an interpreter has should not be stored
> > at the interpreter level when the security can use a proxy to protect
> > a resource.  This means that while supporting a memory cap can
> > have a per-interpreter setting that is checked (because access to the
> > operating system's memory allocator is not supported at the program
> > level), protecting files and imports should not have such a per-interpreter
> > protection at such a low level (because those can have extension
> > module proxies to provide the security).
>
> It might be good to declare two categories of resources -- those
> protected by object hiding and those protected by a per-interpreter
> setting -- and make lists.


That is rather unknown since I am constantly finding stuff that is global to
the process compared to the interpreter, so making the list seems premature.

> Backwards-compatibility will not be a hindrance upon the design or
> > implementation of the security model.  Because the security model will
> > inherently remove resources and abilities that existing code expects,
> > it is not reasonable to expect existing code to work in a sandboxed
> > interpreter.
>
> You might qualify the last statement a bit.  For example, a Python
> implementation of a pure algorithm (e.g. string processing, data
> compression, etc.) would still work in a sandboxed interpreter.


I tossed in "all" to clarify.

> Keeping Python "pythonic" is required for all design decisions.
>
> As Lawrence Oluyede also mentioned, it would be helpful to say a
> little more about what "pythonic" means.


Done in the current version.

> Restricting what is in the built-in namespace and the safe-guarding
> > the interpreter (which includes safe-guarding the built-in types) is
> > where security will come from.
>
> Sounds good.
>
> > Abilities of a Standard Sandboxed Interpreter
> > =============================================
> >
> [...]
> > * You cannot open any files directly.
> > * Importation
> >     + You can import any pure Python module.
> >     + You cannot import any Python bytecode module.
> >     + You cannot import any C extension module.
> >     + You cannot import any built-in module.
> > * You cannot find out any information about the operating system you
> >   are running on.
> > * Only safe built-ins are provided.
>
> This looks reasonable.  This is probably a good place to itemize
> exactly which built-ins are considered safe.
>
> > Imports
> > -------
> >
> > A proxy for protecting imports will be provided.  This is done by
> > setting the ``__import__()`` function in the built-in namespace of the
> > sandboxed interpreter to a proxied version of the function.
> >
> > The planned proxy will take in a passed-in function to use for the
> > import and a whitelist of C extension modules and built-in modules to
> > allow importation of.
>
> Presumably these are passed in to the proxy's constructor.


Current plan is to expose the built-in namespace, imported modules, and sys
module dict when creating an Interpreter instance.

> If an import would lead to loading an extension
> > or built-in module, it is checked against the whitelist and allowed
> > to be imported based on that list.  All .pyc and .pyo files will not
> > be imported.  All .py files will be imported.
>
> I'm unclear about this.  Is the whitelist a list of module names only,
> or of filenames with extensions?


Have not decided, but probably module name.

> Does the normal path-searching process
> take place or can it be restricted in some way?


Have not decided.

> Would it simplify the
> security analysis to have the whitelist be a dictionary that maps module
> names to absolute pathnames?


Don't know.  Protecting imports is the last thing I am going to implement
since it is the trickiest.

> If both the .py and .pyc are present, the normal import would find the
> .pyc file; would the import proxy reject such an import or ignore it
> and recompile the .py instead?


Something along those lines.
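
To keep the shape of the thing in mind, a purely illustrative sketch of such
an import guard -- most of the details (module names vs. filenames, path
searching, .py handling) are explicitly undecided above, and checking after
the import has already run would be too late for real enforcement:

    import sys

    def make_import_guard(real_import, whitelist):
        """Wrap __import__(); non-Python modules must be whitelisted."""
        def guarded_import(name, *args, **kwds):
            if name in sys.builtin_module_names and name not in whitelist:
                raise ImportError('sandbox: built-in module %r not allowed'
                                  % name)
            module = real_import(name, *args, **kwds)
            filename = getattr(module, '__file__', '')
            if (filename and not filename.endswith('.py')
                    and name not in whitelist):
                # .pyc/.pyo/.so/.pyd -- only what the whitelist names gets in
                raise ImportError('sandbox: import of %r not allowed' % name)
            return module
        return guarded_import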

> It must be warned that importing any C extension module is dangerous.
>
> Right.
>
> > Implementing Import in Python
> > +++++++++++++++++++++++++++++
> >
> > To help facilitate in the exposure of more of what importation
> > requires (and thus make implementing a proxy easier), the import
> > machinery should be rewritten in Python.
>
> This seems like a good idea.  Can you identify which minimum essential
> pieces of the import machinery have to be written in C?


Loading of C extensions, stat'ing files, reading files, etc.  Pretty much
anything that requires help from the OS.

> Sanitizing Built-In Types
> > -------------------------
> [...]
> > Constructors
> > ++++++++++++
> >
> > Almost all of Python's built-in types
> > contain a constructor that allows code to create a new instance of a
> > type as long as you have the type itself.  Unfortunately this does not
> > work in an object-capabilities system without either providing a proxy
> > to the constructor or just turning it off.
>
> The existence of the constructor isn't (by itself) the problem.
> The problem is that both of the following are true:
>
>     (a) From any object you can get its type object.
>     (b) Using any type object you can construct a new instance.
>
> So, you can control this either by hiding the type object, separating
> the constructor from the type, or disabling the constructor.


I separated the constructor or initializer (tp_new or tp_init) into a
factory function.
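
A toy demonstration in ordinary CPython of why (a) plus (b) together are the
problem (the paths are hypothetical):

    f = open('readable.txt')      # suppose the sandbox was handed this object
    file_type = type(f)           # (a) any instance gives up its type
    g = file_type('/etc/passwd')  # (b) the type's constructor builds brand-new
                                  #     instances, bypassing whatever wrapped
                                  #     open() the sandbox was given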

> Types whose constructors are considered dangerous are:
> >
> > * ``file``
> >     + Will definitely use the ``open()`` built-in.
> > * code objects
> > * XXX sockets?
> > * XXX type?
> > * XXX
>
> Looks good so far.  Not sure i see what's dangerous about 'type'.


That's why it has the question mark.  =)

> Filesystem Information
> > ++++++++++++++++++++++
> >
> > When running code in a sandboxed interpreter, POLA suggests that you
> > do not want to expose information about your environment on top of
> > protecting its use.  This means that filesystem paths typically should
> > not be exposed.  Unfortunately, Python exposes file paths all over the
> > place:
> >
> > * Modules
> >     + ``__file__`` attribute
> > * Code objects
> >     + ``co_filename`` attribute
> > * Packages
> >     + ``__path__`` attribute
> > * XXX
> >
> > XXX how to expose safely?
>
> It seems that in most cases, a single Python object is associated with
> a single pathname.  If that's true in general, one solution would be
> to provide an introspection function named 'getpath' or something
> similar that would get the path associated with any object.  This
> function might go in a module containing all the introspection functions,
> so imports of that module could be easily restricted.


That is the current thinking.

> Mutable Shared State
> > ++++++++++++++++++++
> >
> > Because built-in types are shared between interpreters, they cannot
> > expose any mutable shared state.  Unfortunately, as it stands, some
> > do.  Below is a list of types that share some form of dangerous state,
> > how they share it, and how to fix the problem:
> >
> > * ``object``
> >     + ``__subclasses__()`` function
> >         - Remove the function; never seen used in real-world code.
> > * XXX
>
> Okay, more to work out here. :)


Possibly.  I might have to wait until I am much closer to being done to
discover more places where mutable shared state is exposed in a bare
interpreter because I have not been able to think of any more.
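
The kind of leak being discussed with __subclasses__(), shown within a
single interpreter here; the point is that the same list is visible from
every interpreter in the process:

    class Private(object):     # imagine this defined in some other interpreter
        pass

    leaked = [cls for cls in object.__subclasses__()
              if cls.__name__ == 'Private']
    print leaked               # -> [<class '__main__.Private'>]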

> Perimeter Defences Between a Created Interpreter and Its Creator
> > ----------------------------------------------------------------
> >
> > The plan is to allow interpreters to instantiate sandboxed
> > interpreters safely.  By using the creating interpreter's abilities to
> > provide abilities to the created interpreter, you make sure there is
> > no escalation in abilities.
>
> Good.
>
> > * ``__del__`` created in sandboxed interpreter but object is cleaned
> >   up in unprotected interpreter.
>
> How do you envision the launching of a sandboxed interpreter to look?
> Could you sketch out some rough code examples?


>>> interp = interpreter.Interpreter()
>>> interp.builtins['open'] = wrapped_open()
>>> interp.sys_dict['path'] = []
>>> interp.exec("2 + 3")


> Were you thinking of
> something like:
>
>      sys.spawn(code, dict)
>         code: a string containing Python source code
>         dict: the global namespace in which to run the code
>
> If you allow the parent interpreter to pass mutable objects into the
> child interpreter, then the parent and child can already communicate
> via the object, so '__del__' is a moot issue.  Do you want to prevent
> all communication between parent and child?  It's not obvious to me
> why that would be necessary.


No, I don't since there should be a secure way to allow that.  The __del__
worry came up from Guido pointing out you might be able to screw with it.
But if you pass in something implemented in C you should be okay.

> * Using frames to walk the frame stack back to another interpreter.
>
> Could you just disable introspection of the frame stack?


If you don't allow importing of 'sys' then yes, and that is planned.  I just
wanted to make sure I didn't forget this needs to be protected.

I do need to check what a generator's frame exposes, though.

> Making the ``sys`` Module Safe
> > ------------------------------
> [...]
> > This means that the ``sys`` module needs to have its safe information
> > separated out from the unsafe settings.
>
> Yes.
>
> > XXX separate modules, ``sys.settings`` and ``sys.info``, or strip
> > ``sys`` to settings and put info somewhere else?  Or provide a method
> > that will create a faked sys module that has the safe values copied
> > into it?
>
> I think the last suggestion above would lead to confusion.  The two
> groups should have two distinct names and it should be clear which
> attribute goes with which group.


This is also more complicated by the fact that some things are for the
entire process while others are per interpreter.  Might have to separate
things out even more.

> Protecting I/O
> > ++++++++++++++
> >
> > The ``print`` keyword and the built-ins ``raw_input()`` and
> > ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``.
> > By exposing these attributes to the creating interpreter, one can set
> > them to safe objects, such as instances of ``StringIO``.
>
> Sounds good.
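
To make the StringIO idea concrete, a sketch using the hypothetical
Interpreter API from earlier in this message (every name here is
provisional):

    from StringIO import StringIO

    out = StringIO()
    interp = interpreter.Interpreter()
    interp.sys_dict['stdout'] = out                # 'print' in the sandbox
    interp.sys_dict['stdin'] = StringIO('42\n')    # feeds raw_input()/input()
    interp.exec("print raw_input()")
    print out.getvalue()                           # the creator reads it back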
>
> > Safe Networking
> > ---------------
> >
> > XXX proxy on socket module, modify open() to be the constructor, etc.
>
> Lots more to think about here. :)


Oh yeah.  =)

> Protecting Memory Usage
> > -----------------------
> >
> > To protect memory, low-level hooks into the memory allocator for
> > Python are needed.  By hooking into the C API for memory allocation and
> > deallocation, a very rough running count of used memory can be kept.  This
> > can be used to prevent sandboxed interpreters from using so much
> > memory that it impacts the overall performance of the system.
>
> Preventing denial-of-service is in general quite difficult, but i
> applaud the attempt.  I agree with your decision to separate this


The memory tracking has a proof-of-concept done in the bcannon-sandboxing
branch.  Not perfect, but it does show how one could go about accounting for
every byte of data in terms of what it is basically used for.

-Brett

From steve at holdenweb.com  Fri Sep  8 10:24:03 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 08 Sep 2006 09:24:03 +0100
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <200609080353.07502.anthony@interlink.com.au>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
	<200609080353.07502.anthony@interlink.com.au>
Message-ID: <edr99s$v3g$4@sea.gmane.org>

Anthony Baxter wrote:
> On Friday 08 September 2006 02:56, Kristján V. Jónsson wrote:
> 
>>Hello All.
>>I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5)
>>which allows unicode paths in sys.path and uses the unicode file api on
>>windows. This is tried and tested on 2.5, and backported to 2.3 and is
>>currently running on clients in China and elsewhere.  It is minimally
>>intrusive to the importing mechanism, at the cost of some string conversion
>>overhead (to utf8 and then back to unicode).
> 
> 
> As this can't be considered a bugfix (that I can see), I'd be against it being 
> checked into 2.5. 
> 
Are you suggesting that Python's inability to correctly handle Unicode 
path elements isn't a bug? Or simply that this inability isn't currently 
described in a bug report on Sourceforge?

I agree it's a relatively large patch for a release candidate but if 
prudence suggests deferring it, it should be a *definite* for 2.5.1 and 
subsequent releases.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From anthony at interlink.com.au  Fri Sep  8 10:58:28 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 8 Sep 2006 18:58:28 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <edr99s$v3g$4@sea.gmane.org>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
	<200609080353.07502.anthony@interlink.com.au>
	<edr99s$v3g$4@sea.gmane.org>
Message-ID: <200609081858.32277.anthony@interlink.com.au>

On Friday 08 September 2006 18:24, Steve Holden wrote:
> > As this can't be considered a bugfix (that I can see), I'd be against it
> > being checked into 2.5.
>
> Are you suggesting that Python's inability to correctly handle Unicode
> path elements isn't a bug? Or simply that this inability isn't currently
> described in a bug report on Sourceforge?

I'm suggesting that adding the ability to handle unicode paths is a *new* 
*feature*.

If people actually want to see 2.5 final ever released, they're going to have 
to accept that "oh, but just this _one_ _more_ _thing_" is not going to fly.

We're _well_ past beta1, where new features should have been added. At this 
point, we have to cut another release candidate. This is far too much to add 
during the release candidate stage.

> I agree it's a relatively large patch for a release candidate but if
> prudence suggests deferring it, it should be a *definite* for 2.5.1 and
> subsequent releases.

Possibly. I remain unconvinced. 

-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From steve at holdenweb.com  Fri Sep  8 11:19:08 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 08 Sep 2006 10:19:08 +0100
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <200609081858.32277.anthony@interlink.com.au>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>
	<200609081858.32277.anthony@interlink.com.au>
Message-ID: <edrch5$ej3$1@sea.gmane.org>

Anthony Baxter wrote:
> On Friday 08 September 2006 18:24, Steve Holden wrote:
> 
>>>As this can't be considered a bugfix (that I can see), I'd be against it
>>>being checked into 2.5.
>>
>>Are you suggesting that Python's inability to correctly handle Unicode
>>path elements isn't a bug? Or simply that this inability isn't currently
>>described in a bug report on Sourceforge?
> 
> I'm suggesting that adding the ability to handle unicode paths is a *new* 
> *feature*.
> 
That's certainly true.

> If people actually want to see 2.5 final ever released, they're going to have 
> to accept that "oh, but just this _one_ _more_ _thing_" is not going to fly.
> 
> We're _well_ past beta1, where new features should have been added. At this 
> point, we have to cut another release candidate. This is far too much to add 
> during the release candidate stage.
> 
Right. I couldn't argue for putting this in to 2.5 - it would certainly 
represent unwarranted feature creep at the rc2 stage.
> 
>>I agree it's a relatively large patch for a release candidate but if
>>prudence suggests deferring it, it should be a *definite* for 2.5.1 and
>>subsequent releases.
> 
> 
> Possibly. I remain unconvinced. 
> 

But it *is* a desirable, albeit new, feature, so I'm surprised that you 
don't appear to perceive it as such for a downstream release.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From ncoghlan at gmail.com  Fri Sep  8 11:56:27 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 08 Sep 2006 19:56:27 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <edrch5$ej3$1@sea.gmane.org>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<200609081858.32277.anthony@interlink.com.au>
	<edrch5$ej3$1@sea.gmane.org>
Message-ID: <45013E4B.4050802@gmail.com>

Steve Holden wrote:
> Anthony Baxter wrote:
>> On Friday 08 September 2006 18:24, Steve Holden wrote:
>>> I agree it's a relatively large patch for a release candidate but if
>>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and
>>> subsequent releases.
>>
>> Possibly. I remain unconvinced. 
>>
> 
> But it *is* a desirable, albeit new, feature, so I'm surprised that you 
> don't appear to perceive it as such for a downstream release.

And unlike 2.2's True/False problem, it is an *environmental* feature, rather 
than a programmatic one.

So while it's a new feature, it would merely mean that 2.5.1 works correctly 
in more environments than 2.5.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From anthony at interlink.com.au  Fri Sep  8 11:48:51 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 8 Sep 2006 19:48:51 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <edrch5$ej3$1@sea.gmane.org>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
	<200609081858.32277.anthony@interlink.com.au>
	<edrch5$ej3$1@sea.gmane.org>
Message-ID: <200609081948.55218.anthony@interlink.com.au>

On Friday 08 September 2006 19:19, Steve Holden wrote:
> But it *is* a desirable, albeit new, feature, so I'm surprised that you
> don't appear to perceive it as such for a downstream release.

Point releases (2.x.1 and suchlike) are absolutely not for new features. 
They're for bugfixes, only. It's possible that this could be considered a 
bugfix, but as I said right now I'm dubious.

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From steve at holdenweb.com  Fri Sep  8 12:28:27 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 08 Sep 2006 11:28:27 +0100
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <200609081948.55218.anthony@interlink.com.au>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
	<200609081858.32277.anthony@interlink.com.au>
	<edrch5$ej3$1@sea.gmane.org>
	<200609081948.55218.anthony@interlink.com.au>
Message-ID: <450145CB.3070601@holdenweb.com>

Anthony Baxter wrote:
> On Friday 08 September 2006 19:19, Steve Holden wrote:
> 
>>But it *is* a desirable, albeit new, feature, so I'm surprised that you
>>don't appear to perceive it as such for a downstream release.
> 
> 
> Point releases (2.x.1 and suchlike) are absolutely not for new features. 
> They're for bugfixes, only. It's possible that this could be considered a 
> bugfix, but as I said right now I'm dubious.
> 
OK, in that case I'm going to argue that the current behaviour is buggy.

I suppose your point is that, assuming the patch is correct (and it 
seems the authors are relying on it for production purposes in tens of 
thousands of installations), it doesn't change the behaviour of the 
interpreter in existing cases, and therefore it is providing a new feature.

I don't regard this as the provision of a new feature but as the removal 
of an unnecessary restriction (which I would prefer to call a bug). If 
it was *documented* somewhere that Unicode paths aren't legal I would 
find your arguments more convincing. As things stand new Python users 
would, IMHO, be within their rights to assume that arbitrary directories 
could be added to the path without breakage.

Ultimately, your call, I guess. Would it help if I added "inability to 
import from Unicode directories" as a bug? Or would you prefer to change 
the documentation to state that some directories can't be used as path 
elements <0.3 wink>?

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden

From guido at python.org  Fri Sep  8 18:29:16 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 8 Sep 2006 09:29:16 -0700
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <450145CB.3070601@holdenweb.com>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
	<200609081858.32277.anthony@interlink.com.au>
	<edrch5$ej3$1@sea.gmane.org>
	<200609081948.55218.anthony@interlink.com.au>
	<450145CB.3070601@holdenweb.com>
Message-ID: <ca471dc20609080929m233da6d6kfcc035cbd43c8313@mail.gmail.com>

On 9/8/06, Steve Holden <steve at holdenweb.com> wrote:
> Anthony Baxter wrote:
> > On Friday 08 September 2006 19:19, Steve Holden wrote:
> >
> >>But it *is* a desirable, albeit new, feature, so I'm surprised that you
> >>don't appear to perceive it as such for a downstream release.
> >
> >
> > Point releases (2.x.1 and suchlike) are absolutely not for new features.
> > They're for bugfixes, only. It's possible that this could be considered a
> > bugfix, but as I said right now I'm dubious.
> >
> OK, in that case I'm going to argue that the current behaviour is buggy.
>
> I suppose your point is that, assuming the patch is correct (and it
> seems the authors are relying on it for production purposes in tens of
> thousands of installations), it doesn't change the behaviour of the
> interpreter in existing cases, and therefore it is providing a new feature.
>
> I don't regard this as the provision of a new feature but as the removal
> of an unnecessary restriction (which I would prefer to call a bug). If
> it was *documented* somewhere that Unicode paths aren't legal I would
> find your arguments more convincing. As things stand new Python users
> would, IMHO, be within their rights to assume that arbitrary directories
> could be added to the path without breakage.
>
> Ultimately, your call, I guess. Would it help if I added "inability to
> import from Unicode directories" as a bug? Or would you prefer to change
> the documentation to state that some directories can't be used as path
> elements <0.3 wink>?

We've all heard the arguments for both sides enough times I think.

IMO it's the call of the release managers. Board members ought to
trust the release managers and not apply undue pressure.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com  Fri Sep  8 18:41:44 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 8 Sep 2006 11:41:44 -0500
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <ca471dc20609080929m233da6d6kfcc035cbd43c8313@mail.gmail.com>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
	<200609081858.32277.anthony@interlink.com.au>
	<edrch5$ej3$1@sea.gmane.org>
	<200609081948.55218.anthony@interlink.com.au>
	<450145CB.3070601@holdenweb.com>
	<ca471dc20609080929m233da6d6kfcc035cbd43c8313@mail.gmail.com>
Message-ID: <17665.40264.242710.426290@montanaro.dyndns.org>


    Guido> IMO it's the call of the release managers. Board members ought to
    Guido> trust the release managers and not apply undue pressure.

Indeed.  Let's not go whacking people with boards.  The Perl people would
just laugh at us...

Skip

From rasky at develer.com  Fri Sep  8 20:51:46 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Fri, 8 Sep 2006 20:51:46 +0200
Subject: [Python-Dev] Unicode Imports
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><edrch5$ej3$1@sea.gmane.org><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com>
	<ca471dc20609080929m233da6d6kfcc035cbd43c8313@mail.gmail.com>
Message-ID: <010301c6d377$d5df7bc0$46ba2997@bagio>

Guido van Rossum <guido at python.org> wrote:

> IMO it's the call of the release managers. Board members ought to
> trust the release managers and not apply undue pressure.


+1, but I would love to see a more formal definition of what a "bugfix" is,
which would reduce the ambiguous cases, and thus reduce the number of times the
release managers are called to pronounce.

Other projects, for instance, describe point releases as "open for regression
fixes only", which means that a patch, to be eligible for a point release, must
fix a regression (something which used to work before, and doesn't anymore).

Regressions are important because they affect people wanting to upgrade Python.
If something never worked before (like this unicode path thingie), surely
existing Python users are not affected by the bug (or they have already
workarounds in place), so that NOT having the bug fixed in a point release is
not a problem.

Anyway, I'm not pushing for this specific policy (even if I like it): I'm just
suggesting Release Managers to more formally define what should and what should
not go in a point release.

Giovanni Bajo


From rhettinger at ewtllc.com  Fri Sep  8 21:00:50 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 08 Sep 2006 12:00:50 -0700
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <010301c6d377$d5df7bc0$46ba2997@bagio>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><edrch5$ej3$1@sea.gmane.org><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com>	<ca471dc20609080929m233da6d6kfcc035cbd43c8313@mail.gmail.com>
	<010301c6d377$d5df7bc0$46ba2997@bagio>
Message-ID: <4501BDE2.6020306@ewtllc.com>

Giovanni Bajo wrote:

>
>+1, but I would love to see a more formal definition of what a "bugfix" is,
>which would reduce the ambiguous cases, and thus reduce the number of times the
>release managers are called to pronounce.
>  
>

Sorry, that is just a pipe-dream. To some degree, all bug-fixes are new 
features in that there is some behavioral difference: something will now 
work that wouldn't work before. While some cases are clear-cut (such as 
API changes), the ones that are interesting will defy definition and 
need a human judgment call as to whether a given change will help more 
than it hurts. The RMs are also strongly biased against extensive 
patches that haven't had a chance to go through a beta-cycle -- they 
don't want their releases mucked-up.


Raymond






From mal at egenix.com  Fri Sep  8 21:12:33 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 08 Sep 2006 21:12:33 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>
Message-ID: <4501C0A1.4010600@egenix.com>

Kristján V. Jónsson wrote:
> Hello All.
> I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5) which allows unicode paths in sys.path and uses the unicode file api on windows.
> This is tried and tested on 2.5, and backported to 2.3 and is currently running on clients in China and elsewhere.  It is minimally intrusive to the importing mechanism, at the cost of some string conversion overhead (to utf8 and then back to unicode).

+1 on adding it to Python 2.6.

-0 for Python 2.5.x:

Applications/modules written for Python 2.4 and 2.5 won't be expecting
Unicode strings in sys.path with all the consequences that go with it,
so this is a true change in semantics, not just a nice to have
additional feature or "bug" fix.

OTOH, those applications will just break in a different place with the
patch applied :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 08 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From martin at v.loewis.de  Fri Sep  8 22:51:09 2006
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri, 08 Sep 2006 22:51:09 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <edr99s$v3g$4@sea.gmane.org>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>
	<edr99s$v3g$4@sea.gmane.org>
Message-ID: <4501D7BD.1020006@v.loewis.de>

Steve Holden schrieb:
>> As this can't be considered a bugfix (that I can see), I'd be against it being 
>> checked into 2.5. 
>>
> Are you suggesting that Python's inability to correctly handle Unicode 
> path elements isn't a bug?

Not sure whether Anthony suggests it, but I do.

> Or simply that this inability isn't currently 
> described in a bug report on Sourceforge?

No: sys.path is specified (originally) as containing a list of byte
strings; it was extended to also support path importers (or whatever
that PEP calls them). It was never extended to support Unicode strings.
That other PEP e

> I agree it's a relatively large patch for a release candidate but if 
> prudence suggests deferring it, it should be a *definite* for 2.5.1 and 
> subsequent releases.

I'm not so sure it should. It *is* a new feature: it makes applications
possible which aren't possible today, and the documentation does not
ever suggest that these applications should have been possible. In fact,
it is common knowledge that this currently isn't supported.

Regards,
Martin

From martin at v.loewis.de  Fri Sep  8 22:52:26 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 08 Sep 2006 22:52:26 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <edrch5$ej3$1@sea.gmane.org>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<200609081858.32277.anthony@interlink.com.au>
	<edrch5$ej3$1@sea.gmane.org>
Message-ID: <4501D80A.5050008@v.loewis.de>

Steve Holden schrieb:
>>> I agree it's a relatively large patch for a release candidate but if
>>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and
>>> subsequent releases.
>>
>> Possibly. I remain unconvinced. 
>>
> 
> But it *is* a desirable, albeit new, feature, so I'm surprised that you 
> don't appear to perceive it as such for a downstream release.

Because 2.5.1 shouldn't include any new features. If it is a new feature
(which it is), it should go into 2.6.

Regards,
Martin

From martin at v.loewis.de  Fri Sep  8 22:54:43 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 08 Sep 2006 22:54:43 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <45013E4B.4050802@gmail.com>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<200609081858.32277.anthony@interlink.com.au>	<edrch5$ej3$1@sea.gmane.org>
	<45013E4B.4050802@gmail.com>
Message-ID: <4501D893.4090504@v.loewis.de>

Nick Coghlan schrieb:
>> But it *is* a desirable, albeit new, feature, so I'm surprised that you 
>> don't appear to perceive it as such for a downstream release.
> 
> And unlike 2.2's True/False problem, it is an *environmental* feature, rather 
> than a programmatic one.

Not sure what you mean by that; if you mean "thus existing applications
cannot break": this is not true. In fact, it seems that some
applications are extremely susceptible to the types of objects on
sys.path. Some applications apparently know exactly what you can and
cannot find on sys.path; changing that might break them.

Regards,
Martin

From martin at v.loewis.de  Fri Sep  8 22:56:48 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 08 Sep 2006 22:56:48 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <450145CB.3070601@holdenweb.com>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609081858.32277.anthony@interlink.com.au>	<edrch5$ej3$1@sea.gmane.org>	<200609081948.55218.anthony@interlink.com.au>
	<450145CB.3070601@holdenweb.com>
Message-ID: <4501D910.8020805@v.loewis.de>

Steve Holden schrieb:
> I don't regard this as the provision of a new feature but as the removal 
> of an unnecessary restriction (which I would prefer to call a bug).

You got the definition of "bug" wrong. Primarily, a bug is a deviation
from the specification. Extending the domain of an argument to an
existing function is a new feature.

Regards,
Martin

From martin at v.loewis.de  Fri Sep  8 22:59:57 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 08 Sep 2006 22:59:57 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <010301c6d377$d5df7bc0$46ba2997@bagio>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc><200609081858.32277.anthony@interlink.com.au><edrch5$ej3$1@sea.gmane.org><200609081948.55218.anthony@interlink.com.au><450145CB.3070601@holdenweb.com>	<ca471dc20609080929m233da6d6kfcc035cbd43c8313@mail.gmail.com>
	<010301c6d377$d5df7bc0$46ba2997@bagio>
Message-ID: <4501D9CD.80301@v.loewis.de>

Giovanni Bajo schrieb:
> +1, but I would love to see a more formal definition of what a "bugfix" is,
> which would reduce the ambiguous cases, and thus reduce the number of times the
> release managers are called to pronounce.
> 
> Other projects, for instance, describe point releases as "open for regression
> fixes only", which means that a patch, to be eligible for a point release, must
> fix a regression (something which used to work before, and doesn't anymore).

In Python, the tradition has accepted bug fixes beyond that. For
example, fixing a memory leak would also count as a bug fix.

In general, I think a "bug" is a deviation from the specification (it
might be necessary to interpret the specification first to find out
whether the implementation deviates). A bug fix is then a behavior
change so that the new behavior follows the specification, or a
specification change so that it correctly describes the behavior.

Regards,
Martin

From misa at redhat.com  Sat Sep  9 00:06:05 2006
From: misa at redhat.com (Mihai Ibanescu)
Date: Fri, 8 Sep 2006 18:06:05 -0400
Subject: [Python-Dev] Py_BuildValue and decref
Message-ID: <20060908220605.GF990@abulafia.devel.redhat.com>

Hi,

Looking at:

http://docs.python.org/api/arg-parsing.html

The description for "O" is:

"O" (object) [PyObject *]
    Store a Python object (without any conversion) in a C object pointer. The
    C program thus receives the actual object that was passed. The object's
    reference count is not increased. The pointer stored is not NULL.

There is no description of what happens when Py_BuildValue fails. Will it
decref the python object passed in? Will it not?

Looking at tupleobject.h:

/*
Another generally useful object type is a tuple of object pointers.
For Python, this is an immutable type.  C code can change the tuple items
(but not their number), and even use tuples as general-purpose arrays of
object references, but in general only brand new tuples should be mutated,
not ones that might already have been exposed to Python code.

*** WARNING *** PyTuple_SetItem does not increment the new item's reference
count, but does decrement the reference count of the item it replaces,
if not nil.  It does *decrement* the reference count if it is *not*
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
inserted in the tuple.  Similarly, PyTuple_GetItem does not increment the
returned item's reference count.
*/

So, if the call to PyTuple_SetItem fails, the value passed in is lost. Should
I expect the same thing with Py_BuildValue?
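
Reading that warning literally, the defensive pattern would be something
like this (a sketch only; the point is that the reference to item is
consumed whether or not the insert succeeds):

    PyObject *t, *item;

    t = PyTuple_New(1);
    if (t == NULL)
            return NULL;
    item = PyString_FromString("x");        /* a new reference we own */
    if (item == NULL) {
            Py_DECREF(t);
            return NULL;
    }
    /* PyTuple_SetItem steals the reference to item: on success the tuple
       owns it, on failure it has already been DECREFed, so either way we
       must not touch item again. */
    if (PyTuple_SetItem(t, 0, item) < 0) {
            Py_DECREF(t);
            return NULL;
    }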


Looking at how other modules deal with this, I picked typeobject.c:

        result = Py_BuildValue("[O]", (PyObject *)type);
        if (result == NULL) {
                Py_DECREF(to_merge);
                return NULL;
        }

so no attempt to DECREF type in the error case.


Further down...


                        if (n) {
                                state = Py_BuildValue("(NO)", state, slots);
                                if (state == NULL)
                                        goto end;
                        }

and further down:

  end:
        Py_XDECREF(cls);
        Py_XDECREF(args);
        Py_XDECREF(args2);
        Py_XDECREF(slots);
        Py_XDECREF(state);
        Py_XDECREF(names);
        Py_XDECREF(listitems);
        Py_XDECREF(dictitems);
        Py_XDECREF(copy_reg);
        Py_XDECREF(newobj);
        return res;

so it will attempt to DECREF the (non-NULL) slots in the error case.

It's probably not a big issue, since if Py_BuildValue fails you have bigger
problems than memory leaks, but it seems inconsistent to me.  Can someone who
knows the internal implementation clarify it one way or the other?

Thanks!
Misa

From barry at barrys-emacs.org  Sat Sep  9 00:18:49 2006
From: barry at barrys-emacs.org (Barry Scott)
Date: Fri, 8 Sep 2006 23:18:49 +0100
Subject: [Python-Dev] What windows tool chain do I need for python 2.5
	extensions?
Message-ID: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org>

I have the tool chains to build extensions against your binary python  
2.2, 2.3 and 2.4 on windows.

What are the tool chain requirements for building extensions against  
python 2.5 on windows?

Barry



From barry at python.org  Sat Sep  9 00:27:08 2006
From: barry at python.org (Barry Warsaw)
Date: Fri, 8 Sep 2006 18:27:08 -0400
Subject: [Python-Dev] Py_BuildValue and decref
In-Reply-To: <20060908220605.GF990@abulafia.devel.redhat.com>
References: <20060908220605.GF990@abulafia.devel.redhat.com>
Message-ID: <D7FAEFC0-59CC-49F6-9757-27244D409E94@python.org>

On Sep 8, 2006, at 6:06 PM, Mihai Ibanescu wrote:

> There is no description of what happens when Py_BuildValue fails.  
> Will it
> decref the python object passed in? Will it not?

I just want to point out that the C API documentation is pretty  
silent about the refcounting side-effects in error conditions (and  
often in success conditions too) of most Python functions.  For  
example, what are the refcounting side-effects of PyDict_SetItem() on  
val?  What if that function fails?  Has val been incref'd or  
not?  What about the side-effects on any value the new one replaces,  
both in success and failure?
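
For what it's worth, here is the defensive pattern I assume (a sketch
only, on the assumption that PyDict_SetItem borrows both key and value,
so the caller's reference stays the caller's responsibility whether the
call succeeds or fails; d and key just stand for objects we already hold):

    PyObject *val = PyInt_FromLong(42);   /* a reference we own */

    if (val == NULL)
            return NULL;
    if (PyDict_SetItem(d, key, val) < 0) {
            Py_DECREF(val);               /* still ours on failure */
            return NULL;
    }
    Py_DECREF(val);                       /* the dict holds its own reference now */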

The C API documentation has improved in documenting the refcount  
behavior for return values of many of the functions, but the only  
reliable way to know what some other side-effects are is to read the  
code.  After I perfect my human cloning techniques, I'll be assigning  
one of my minions to fix this situation (I'll bet my clean-the-kitty- 
litter-and-stalk-er-keep-tabs-on-Britney clone would love to take a  
break for a few weeks to work on this).

-Barry


From misa at redhat.com  Sat Sep  9 00:35:58 2006
From: misa at redhat.com (Mihai Ibanescu)
Date: Fri, 8 Sep 2006 18:35:58 -0400
Subject: [Python-Dev] Py_BuildValue and decref
In-Reply-To: <D7FAEFC0-59CC-49F6-9757-27244D409E94@python.org>
References: <20060908220605.GF990@abulafia.devel.redhat.com>
	<D7FAEFC0-59CC-49F6-9757-27244D409E94@python.org>
Message-ID: <20060908223558.GG990@abulafia.devel.redhat.com>

On Fri, Sep 08, 2006 at 06:27:08PM -0400, Barry Warsaw wrote:
> 
> On Sep 8, 2006, at 6:06 PM, Mihai Ibanescu wrote:
> 
> >There is no description of what happens when Py_BuildValue fails.  
> >Will it
> >decref the python object passed in? Will it not?
> 
> I just want to point out that the C API documentation is pretty  
> silent about the refcounting side-effects in error conditions (and  
> often in success conditions too) of most Python functions.  For  
> example, what is the refcounting side-effects of PyDict_SetItem() on  
> val?  What about if that function fails?  Has val been incref'd or  
> not?  What about the side-effects on any value the new one replaces,  
> both in success and failure?

In this particular case, it doesn't decref it (or so I read the code).
The relevant code is in do_mkvalue, in Python/modsupport.c:

                case 'N':
                case 'S':
                case 'O':
                if (**p_format == '&') {
                        typedef PyObject *(*converter)(void *);
                        converter func = va_arg(*p_va, converter);
                        void *arg = va_arg(*p_va, void *);
                        ++*p_format;
                        return (*func)(arg);
                }
                else {
                        PyObject *v;
                        v = va_arg(*p_va, PyObject *);
                        if (v != NULL) {
                                if (*(*p_format - 1) != 'N')
                                        Py_INCREF(v);
                        }
                        else if (!PyErr_Occurred())
                                /* If a NULL was passed
                                 * because a call that should
                                 * have constructed a value
                                 * failed, that's OK, and we
                                 * pass the error on; but if
                                 * no error occurred it's not
                                 * clear that the caller knew
                                 * what she was doing. */
                                PyErr_SetString(PyExc_SystemError,
                                        "NULL object passed to
Py_BuildValue");
                        return v;
                }
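
So, for "O", the caller's own reference is left alone whether the call
succeeds or fails, and (if I'm reading do_mkvalue right) the usual
calling pattern would simply be:

    PyObject *obj = PyInt_FromLong(42);   /* a reference we own */
    PyObject *result;

    if (obj == NULL)
            return NULL;
    result = Py_BuildValue("(O)", obj);   /* "O" INCREFs only the copy it
                                             stores; ours is untouched */
    Py_DECREF(obj);                       /* release ours, success or failure */
    if (result == NULL)
            return NULL;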


Barry, where can I ship you my cloning machine? :-)

Misa

From jcarlson at uci.edu  Sat Sep  9 00:48:59 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 08 Sep 2006 15:48:59 -0700
Subject: [Python-Dev] What windows tool chain do I need for python 2.5
	extensions?
In-Reply-To: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org>
References: <52A09F3B-0D3B-46E3-B7E5-02DC0D3BB170@barrys-emacs.org>
Message-ID: <20060908154754.F8DF.JCARLSON@uci.edu>


Barry Scott <barry at barrys-emacs.org> wrote:
> 
> I have the tool chains to build extensions against your binary python  
> 2.2, 2.3 and 2.4 on windows.
> 
> What are the tool chain requirements for building extensions against  
> python 2.5 on windows?

The compiler requirements for 2.5 on Windows are the same as for 2.4.

 - Josiah


From kbk at shore.net  Sat Sep  9 03:35:24 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri, 8 Sep 2006 21:35:24 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200609090135.k891ZOcT003051@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  413 open ( +1) /  3407 closed (+10) /  3820 total (+11)
Bugs    :  897 open ( -3) /  6167 closed (+18) /  7064 total (+15)
RFE     :  234 open ( +1) /   238 closed ( +2) /   472 total ( +3)

New / Reopened Patches
______________________

Fix decimal context management for 2.5  (2006-09-02)
CLOSED http://python.org/sf/1550886  opened by  Nick Coghlan

Fix for rpartition() end-case  (2006-09-03)
CLOSED http://python.org/sf/1551339  opened by  Raymond Hettinger

Updated spec file for 2.5 release.  (2006-09-03)
CLOSED http://python.org/sf/1551340  opened by  Sean Reifschneider

unparse.py decorator support  (2006-09-04)
       http://python.org/sf/1552024  opened by  Adal Chiriliuc

eval docstring typo  (2006-09-04)
CLOSED http://python.org/sf/1552093  opened by  Ori Avtalion

Fix error checks and leaks in setobject.c  (2006-09-05)
CLOSED http://python.org/sf/1552731  reopened by  gbrandl

Fix error checks and leaks in setobject.c  (2006-09-05)
CLOSED http://python.org/sf/1552731  opened by  Raymond Hettinger

Unicode Imports  (2006-09-05)
       http://python.org/sf/1552880  opened by  Kristján Valur

Fix inspect.py 2.5 slowdown  (2006-09-06)
CLOSED http://python.org/sf/1553314  opened by  Nick Coghlan

locale.getdefaultlocale() bug when _locale is missing  (2006-09-06)
       http://python.org/sf/1553427  opened by  STINNER Victor

UserDict New Style  (2006-09-09)
       http://python.org/sf/1555097  opened by  Indy

Performance enhancements.  (2006-09-09)
       http://python.org/sf/1555098  opened by  Indy

Patches Closed
______________

Fix decimal context management for 2.5  (2006-09-02)
       http://python.org/sf/1550886  closed by  ncoghlan

Fix for rpartition() end-case  (2006-09-02)
       http://python.org/sf/1551339  closed by  nnorwitz

Updated spec file for 2.5 release.  (2006-09-02)
       http://python.org/sf/1551340  closed by  nnorwitz

eval docstring typo  (2006-09-04)
       http://python.org/sf/1552093  closed by  nnorwitz

crash in dict_equal  (2006-08-24)
       http://python.org/sf/1546288  closed by  nnorwitz

Patches for OpenBSD 4.0  (2006-08-15)
       http://python.org/sf/1540470  closed by  nnorwitz

Fix error checks and leaks in setobject.c  (2006-09-05)
       http://python.org/sf/1552731  closed by  rhettinger

Fix error checks and leaks in setobject.c  (2006-09-05)
       http://python.org/sf/1552731  closed by  gbrandl

make exec a function  (2006-09-01)
       http://python.org/sf/1550800  closed by  gbrandl

Ellipsis literal "..."  (2006-09-01)
       http://python.org/sf/1550786  closed by  gbrandl

Fix inspect.py 2.5 slowdown  (2006-09-06)
       http://python.org/sf/1553314  closed by  ncoghlan

New / Reopened Bugs
___________________

from . import bug  (2006-09-02)
CLOSED http://python.org/sf/1550938  opened by  ganges master

random.choice(setinstance) fails  (2006-09-02)
CLOSED http://python.org/sf/1551113  opened by  Alan

Build of 2.4.3 on fedora core 5 fails to find asm/msr.h  (2006-09-02)
       http://python.org/sf/1551238  opened by  George R. Goffe

tiny bug in win32_urandom  (2006-09-03)
CLOSED http://python.org/sf/1551427  opened by  Rocco Matano

__unicode__ breaks for exception class objects  (2006-09-03)
       http://python.org/sf/1551432  opened by  Marcin 'Qrczak' Kowalczyk

Wrong link to unicode database  (2006-09-03)
CLOSED http://python.org/sf/1551669  opened by  Yevgen Muntyan

unpack list of singleton tuples not unpacking  (2006-07-11)
CLOSED http://python.org/sf/1520864  reopened by  gbrandl

UnixCCompiler runtime_library_dir uses -R instead of -Wl,-R  (2006-09-04)
CLOSED http://python.org/sf/1552304  opened by  TFKyle

PEP 290 <-> normal docu...  (2006-09-05)
CLOSED http://python.org/sf/1552618  opened by  Jens Diemer

Python polls unecessarily every 0.1 when interactive  (2006-09-05)
       http://python.org/sf/1552726  opened by  Richard Boulton

Python polls unnecessarily every 0.1 second when interactive  (2006-09-05)
       http://python.org/sf/1552726  reopened by  akuchling

subprocess.Popen(cmd, stdout=sys.stdout) fails  (2006-07-31)
CLOSED http://python.org/sf/1531862  reopened by  nnorwitz

ConfigParser converts option names to lower case on set()  (2006-09-05)
CLOSED http://python.org/sf/1552892  opened by  daniel

Pythonw doesn't get rebuilt if version number changes  (2006-09-05)
       http://python.org/sf/1552935  opened by  Jack Jansen

python 2.5 install can't find tcl/tk in /usr/lib64  (2006-09-06)
       http://python.org/sf/1553166  opened by  David Strozzi

logging.handlers.RotatingFileHandler - inconsistent mode  (2006-09-06)
       http://python.org/sf/1553496  opened by  Walker Hale

datetime.datetime.now() mangles tzinfo  (2006-09-06)
       http://python.org/sf/1553577  opened by  Skip Montanaro

Class instance apparently not destructed when expected  (2006-09-06)
       http://python.org/sf/1553819  opened by  Peter Donis

PyOS_InputHook() and related API funcs. not documented  (2006-09-07)
       http://python.org/sf/1554133  opened by  A.M. Kuchling

Bugs Closed
___________

itertools.tee raises SystemError  (2006-09-01)
       http://python.org/sf/1550714  closed by  nnorwitz

Typo in Language Reference Section 3.2 Class Instances  (2006-08-28)
       http://python.org/sf/1547931  closed by  nnorwitz

from . import bug  (2006-09-02)
       http://python.org/sf/1550938  closed by  gbrandl

tiny bug in win32_urandom  (2006-09-03)
       http://python.org/sf/1551427  closed by  gbrandl

sgmllib.sgmlparser is not thread safe  (2006-08-29)
       http://python.org/sf/1548288  closed by  gbrandl

test_anydbm segmentation fault  (2006-08-21)
       http://python.org/sf/1544106  closed by  greg

Wrong link to unicode database  (2006-09-03)
       http://python.org/sf/1551669  closed by  gbrandl

unpack list of singleton tuples not unpacking  (2006-07-11)
       http://python.org/sf/1520864  closed by  nnorwitz

UnixCCompiler runtime_library_dir uses -R instead of -Wl,-R  (2006-09-04)
       http://python.org/sf/1552304  closed by  tfkyle

gcc trunk (4.2) exposes a signed integer overflows  (2006-08-23)
       http://python.org/sf/1545668  closed by  nnorwitz

Exceptions don't call _PyObject_GC_UNTRACK(self)  (2006-08-17)
       http://python.org/sf/1542051  closed by  gbrandl

PEP 290 <-> normal docu...  (2006-09-05)
       http://python.org/sf/1552618  closed by  gbrandl

SimpleXMLRpcServer still uses sys.exc_value and sys.exc_type  (2006-07-19)
       http://python.org/sf/1525469  closed by  akuchling

unbalanced parentheses  from command line crash pdb  (2006-07-22)
       http://python.org/sf/1526834  closed by  akuchling

Python polls unnecessarily every 0.1 second when interactive  (2006-09-05)
       http://python.org/sf/1552726  closed by  akuchling

subprocess.Popen(cmd, stdout=sys.stdout) fails  (2006-07-31)
       http://python.org/sf/1531862  closed by  niemeyer

subprocess.Popen(cmd, stdout=sys.stdout) fails  (2006-07-31)
       http://python.org/sf/1531862  closed by  niemeyer

ConfigParser converts option names to lower case on set()  (2006-09-05)
       http://python.org/sf/1552892  closed by  gbrandl

SWIG wrappers incompatible with 2.5c1  (2006-09-01)
       http://python.org/sf/1550559  closed by  gbrandl

Building Python 2.4.3 on Solaris 9/10 with Sun Studio 11  (2006-05-28)
       http://python.org/sf/1496561  closed by  andyfloe

Curses module doesn't install on Solaris 2.8  (2005-10-12)
       http://python.org/sf/1324799  closed by  akuchling

New / Reopened RFE
__________________

Add traceback.print_full_exception()  (2006-09-06)
       http://python.org/sf/1553375  opened by  Michael Hoffman

Print full exceptions as they occur in logging  (2006-09-06)
       http://python.org/sf/1553380  opened by  Michael Hoffman

RFE Closed
__________

random.choice(setinstance) fails  (2006-09-02)
       http://python.org/sf/1551113  closed by  rhettinger

Add 'find' method to sequence types  (2006-08-28)
       http://python.org/sf/1548178  closed by  gbrandl


From jan-python at maka.demon.nl  Sat Sep  9 04:07:02 2006
From: jan-python at maka.demon.nl (Jan Kanis)
Date: Sat, 09 Sep 2006 04:07:02 +0200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
References: <ee2a432c0607132339o1d35c402ib13884c85115856a@mail.gmail.com>
	<09f901c6a72c$495f2690$12472597@bagio>
	<20060714112137.GA891@Andrew-iBook2.local>
	<e99iut$u6n$1@sea.gmane.org> <44B8A90C.6070309@v.loewis.de>
	<ee2a432c0607151835m741dc92dpa5cc3a7149842a1c@mail.gmail.com>
	<1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
Message-ID: <op.tfk9h00vaed6q0@e500>

At the risk of waking up a thread that was already declared dead, perhaps
this is useful.

So, what happens is that Python's signal handler sets a flag and registers
a callback. The main thread should then check the flag and make the
callback to actually do something with the signal. However, the main
thread is blocked in GTK and can't check the flag.

Nick Maclaren wrote:
...lots of reasons why you can't do anything reliably from within a signal  
handler...

As far as I understand it, what could work is this:
-PyGTK registers a callback.
-Python's signal handler does not change at all.
-All threads that run in the Python interpreter occasionally check the
flag which the signal handler sets, like the main thread does nowadays. If
it is set, the thread calls PyGTK's callback. It does not do anything else
with the signal.
-PyGTK's callback wakes up the main thread, which actually handles the
signal just like it does now.

PyGTK's callback could be called from any thread, but it would be called in
a normal context, not in a signal handler. As the signal handler does not
change, the risk of breaking anything or causing chaos is as large/small
as it is under the current scheme. However, PyGTK's problem does get
solved, as long as there is _a_ thread that returns to the interpreter
within some timeframe. It seems plausible that this will happen.
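
To make the idea concrete, a minimal sketch in C, using made-up names --
signal_tripped and pygtk_wakeup_callback are placeholders here, not
existing interpreter symbols:

    /* run occasionally by any thread that holds the GIL */
    if (signal_tripped && pygtk_wakeup_callback != NULL) {
        /* Don't consume the signal here; just wake the main thread,
           which runs the Python-level handler exactly as it does now. */
        pygtk_wakeup_callback();
    }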

From rhamph at gmail.com  Sat Sep  9 06:52:42 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 8 Sep 2006 22:52:42 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <op.tfk9h00vaed6q0@e500>
References: <ee2a432c0607132339o1d35c402ib13884c85115856a@mail.gmail.com>
	<09f901c6a72c$495f2690$12472597@bagio>
	<20060714112137.GA891@Andrew-iBook2.local>
	<e99iut$u6n$1@sea.gmane.org> <44B8A90C.6070309@v.loewis.de>
	<ee2a432c0607151835m741dc92dpa5cc3a7149842a1c@mail.gmail.com>
	<1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
	<op.tfk9h00vaed6q0@e500>
Message-ID: <aac2c7cb0609082152t76029092vb392cf1540168dc@mail.gmail.com>

On 9/8/06, Jan Kanis <jan-python at maka.demon.nl> wrote:
> At the risk of waking up a thread that was already declared dead, but
> perhaps this is usefull.

I don't think we should let this die, at least not yet.  Nick seems to
be arguing that ANY signal handler is prone to random crashes or
corruption (due to bugs).  However, we already have a signal handler,
so we should already be exposed to the random crashes/corruption.

If we're going to rely on signal handling being correct then I think
we should also rely on write() being correct.  Note that I'm not
suggesting an API that allows arbitrary signal handlers, but rather
one that calls write() on an array of prepared file descriptors
(ignoring errors).
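
Roughly, the handler side of that API could look like this (just a
sketch with made-up names -- wakeup_fds and n_wakeup_fds would be
managed by the new API; write() itself is async-signal-safe per POSIX):

    #include <unistd.h>

    static int wakeup_fds[8];   /* write ends of pipes, registered in advance */
    static int n_wakeup_fds;

    static void
    signal_handler(int sig)
    {
        unsigned char byte = (unsigned char)sig;
        int i;
        /* A real handler would also save and restore errno. */
        for (i = 0; i < n_wakeup_fds; i++)
            (void)write(wakeup_fds[i], &byte, 1);   /* ignore errors */
    }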

Ensuring modifications to that array are atomic would be tricky, but I
think it would be doable if we use a read-copy-update approach (with
two alternating signal handler functions).  I'm not sure how to ensure
there are no currently running signal handlers in another thread, though.
Maybe we would have to rip the atomic read/write stuff out of the Linux
sources to ensure it's *always* defined behavior.

Looking into the existing signalmodule.c, I see no attempts to ensure
atomic access to the Handlers data structure.  Is the current code
broken, at least on non-x86 platforms?

-- 
Adam Olsen, aka Rhamphoryncus

From rhamph at gmail.com  Sat Sep  9 06:59:48 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 8 Sep 2006 22:59:48 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <aac2c7cb0609082152t76029092vb392cf1540168dc@mail.gmail.com>
References: <ee2a432c0607132339o1d35c402ib13884c85115856a@mail.gmail.com>
	<09f901c6a72c$495f2690$12472597@bagio>
	<20060714112137.GA891@Andrew-iBook2.local>
	<e99iut$u6n$1@sea.gmane.org> <44B8A90C.6070309@v.loewis.de>
	<ee2a432c0607151835m741dc92dpa5cc3a7149842a1c@mail.gmail.com>
	<1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
	<op.tfk9h00vaed6q0@e500>
	<aac2c7cb0609082152t76029092vb392cf1540168dc@mail.gmail.com>
Message-ID: <aac2c7cb0609082159g647fde2s3b714a6d001de5be@mail.gmail.com>

On 9/8/06, Adam Olsen <rhamph at gmail.com> wrote:
> Ensuring modifications to that array are atomic would be tricky, but I
> think it would be doable if we use a read-copy-update approach (with
> two alternating signal handler functions).  Not sure how to ensure
> there's no currently running signal handlers in another thread though.
>  Maybe have to rip the atomic read/write stuff out of the Linux
> sources to ensure it's *always* defined behavior.

Doh, except that's exactly what sig_atomic_t is for.  Ah well, can't
win them all.

-- 
Adam Olsen, aka Rhamphoryncus

From ncoghlan at gmail.com  Sat Sep  9 07:55:56 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 09 Sep 2006 15:55:56 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4501D7BD.1020006@v.loewis.de>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>
	<4501D7BD.1020006@v.loewis.de>
Message-ID: <4502576C.3060604@gmail.com>

Martin v. Löwis wrote:
> Steve Holden schrieb:
>> Or simply that this inability isn't currently 
>> described in a bug report on Sourceforge?
> 
> No: sys.path is specified (originally) as containing a list of byte
> strings; it was extended to also support path importers (or whatever
> that PEP calls them). It was never extended to support Unicode strings.
> That other PEP e

That other PEP being PEP 302. That said, Unicode strings *are* permitted on 
sys.path - the import system will automatically encode them to an 8-bit string 
using the default filesystem encoding as part of the import process.

This works fine on Unix systems that use UTF-8 encoded strings to handle 
Unicode paths at the C API level, but is screwed on Windows because the 
default mbcs filesystem encoding can't handle the full range of possible 
Unicode path names (such as the Chinese directories that originally gave 
Kristján grief).

To get Unicode path names to work on Windows, you have to use the 
Windows-specific wide character API instead of the normal C API, and the 
import machinery doesn't do that.

So this is taking something that *already works properly on POSIX systems* and 
making it work on Windows as well.

>> I agree it's a relatively large patch for a release candidate but if 
>> prudence suggests deferring it, it should be a *definite* for 2.5.1 and 
>> subsequent releases.
> 
> I'm not so sure it should. It *is* a new feature: it makes applications
> possible which aren't possible today, and the documentation does not
> ever suggest that these applications should have been possible. In fact,
> it is common knowledge that this currently isn't supported.

It should already work fine on POSIX filesystems that use the default 
filesystem encoding for path names. As far as I am aware, it is only Windows 
where it doesn't work.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From martin at v.loewis.de  Sat Sep  9 09:23:32 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 09 Sep 2006 09:23:32 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4502576C.3060604@gmail.com>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>
	<4501D7BD.1020006@v.loewis.de> <4502576C.3060604@gmail.com>
Message-ID: <45026BF4.5080108@v.loewis.de>

Nick Coghlan schrieb:
> So this is taking something that *already works properly on POSIX
> systems* and making it work on Windows as well.

I doubt it does without side effects. For example, an application that
would go through sys.path, and encode everything with
sys.getfilesystemencoding() currently works, but will break if the patch
is applied and non-mbcs strings are put on sys.path.

Also, what will be the effect on __file__? What value will it have
if the module originates from a sys.path entry that is a non-mbcs
unicode string? I haven't tested the patch, but it looks like
__file__ becomes a unicode string on Windows, and remains a byte
string encoded with the file system encoding elsewhere. That's also
a change in behavior.

Regards,
Martin


From brett at python.org  Sat Sep  9 09:23:54 2006
From: brett at python.org (Brett Cannon)
Date: Sat, 9 Sep 2006 00:23:54 -0700
Subject: [Python-Dev] 2.5 status
In-Reply-To: <ee2a432c0609070028t657e538dqf675e3ce45115150@mail.gmail.com>
References: <ee2a432c0609042124w281b7979t8cb10cbaeb937374@mail.gmail.com>
	<bbaeab100609051125x6d707ca2jf79973d9d68579a7@mail.gmail.com>
	<44FDD122.3000809@egenix.com>
	<bbaeab100609051241m7d878b0dtd93018b535b9ee14@mail.gmail.com>
	<ee2a432c0609070028t657e538dqf675e3ce45115150@mail.gmail.com>
Message-ID: <bbaeab100609090023w26575e17jf06c92350a7f572a@mail.gmail.com>

On 9/7/06, Neal Norwitz <nnorwitz at gmail.com> wrote:
>
> On 9/5/06, Brett Cannon <brett at python.org> wrote:
> >
> > > [MAL]
> > > The proper fix would be to introduce a tp_unicode slot and let
> > > this decide what to do, ie. call .__unicode__() methods on instances
> > > and use the .__name__ on classes.
> >
> > That was my bug reaction  and what I said on the bug report.  Kind of
> > surprised one doesn't already exist.
> >
> > > I think this would be the right way to go for Python 2.6. For
> > > Python 2.5, just dropping this .__unicode__ method on exceptions
> > > is probably the right thing to do.
> >
> > Neal, do you want to rip it out or should I?
>
> Is removing __unicode__ backwards compatible with 2.4 for both
> instances and exception classes?
>
> Does everyone agree this is the proper approach?  I'm not familiar
> with this code.  Brett, if everyone agrees (ie, remains silent),
> please fix this and add tests and a NEWS entry.


Done.  Even updated PEP 356 for you while I was at it.  =)

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060909/19a6f83f/attachment.html 

From nmm1 at cus.cam.ac.uk  Sat Sep  9 12:56:06 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sat, 09 Sep 2006 11:56:06 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
Message-ID: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>

I was hoping to have stopped, but here are a few comments.

I agree with Jan Kanis.  That is the way to tackle this one.

"Adam Olsen" <rhamph at gmail.com> wrote:
>         
> I don't think we should let this die, at least not yet.  Nick seems to
> be arguing that ANY signal handler is prone to random crashes or
> corruption (due to bugs).  However, we already have a signal handler,
> so we should already be exposed to the random crashes/corruption.

No.  I am afraid that is a common myth and an often catastrophic mistake.
In this sort of area, NEVER assume that even apparently unrelated changes
won't cause 'working' code to misbehave.  Yes, Python is already exposed,
but it would be easy to turn a very rare failure into a more common one.

What I was actually arguing for was defensive programming.

> If we're going to rely on signal handling being correct then I think
> we should also rely on write() being correct.  Note that I'm not
> suggesting an API that allows arbitrary signal handlers, but rather
> one that calls write() on an array of prepared file descriptors
> (ignoring errors).

For your interpretation of 'correct'.  The cause of this chaos is that
the C and POSIX standards are inconsistent, even internally, and wildly
incompatible with each other.  So, even if things 'work' today, don't bet
on the next release of your favourite system behaving the same way.

It wouldn't matter if there was a de facto standard (i.e. a consensus),
but there isn't.

> Ensuring modifications to that array are atomic would be tricky, but I
> think it would be doable if we use a read-copy-update approach (with
> two alternating signal handler functions).  Not sure how to ensure
> there's no currently running signal handlers in another thread though.
>  Maybe have to rip the atomic read/write stuff out of the Linux
> sources to ensure it's *always* defined behavior.

Yes.  But even that wouldn't solve the problem, as that code is very
gcc-specific.

> Looking into the existing signalmodule.c, I see no attempts to ensure
> atomic access to the Handlers data structure.  Is the current code
> broken, at least on non-x86 platforms?

Well, at a quick glance at the actual handler (the riskiest bit):

    1) It doesn't check the signal range - bad practice, as systems
do sometimes generate wayward numbers.

    2) Handlers[sig_num].tripped = 1; is formally undefined, but
actually pretty safe.  If that breaks, nothing much will work.  It
would be better to make the int sig_atomic_t, as you say.

    3) is_tripped++; and Py_AddPendingCall(checksignals_witharg, NULL);
will work only because the handler ignores all signals in subthreads
(which is definitely NOT right, as the comments say).

Despite the implication, the code of Py_AddPendingCall is NOT safe
against simultaneous registration.  It is just plain broken, I am
afraid.  The note starting "Darn" should be a LOT stronger :-)

[ For example, think of two threads calling the function at exactly
the same time, in almost perfect step.  Oops. ]

I can't honestly promise to put any time into this in the foreseeable
future, but will try (sometime).  If anyone wants to tackle this,
please ask me for comments/help/etc.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From gjcarneiro at gmail.com  Sat Sep  9 12:59:23 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 9 Sep 2006 11:59:23 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <op.tfk9h00vaed6q0@e500>
References: <ee2a432c0607132339o1d35c402ib13884c85115856a@mail.gmail.com>
	<09f901c6a72c$495f2690$12472597@bagio>
	<20060714112137.GA891@Andrew-iBook2.local>
	<e99iut$u6n$1@sea.gmane.org> <44B8A90C.6070309@v.loewis.de>
	<ee2a432c0607151835m741dc92dpa5cc3a7149842a1c@mail.gmail.com>
	<1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
	<op.tfk9h00vaed6q0@e500>
Message-ID: <a467ca4f0609090359l1e5e54b1o1731744b4b8b0e0f@mail.gmail.com>

On 9/9/06, Jan Kanis <jan-python at maka.demon.nl> wrote:
> At the risk of waking up a thread that was already declared dead, but
> perhaps this is usefull.
>
> So, what happens is pythons signal handler sets a flag and registrers a
> callback. Then the main thread should check the flag and make the callback
> to actually do something with the signal. However the main thread is
> blocked in GTK and can't check the flag.
>
> Nick Maclaren wrote:
> ...lots of reasons why you can't do anything reliably from within a signal
> handler...
>
> As far as I understand it, what could work is this:
> -PyGTK registrers a callback.
> -Pythons signal handler does not change at all.
> -All threads that run in the Python interpreter occasionally check the
> flag which the signal handler sets, like the main thread does nowadays. If
> it is set, the thread calls PyGTKs callback. It does not do anything else
> with the signal.
> -PyGTKs callback wakes up the main thread, which actually handles the
> signal just like it does now.
>
> PyGTKs callback could be called from any thread, but it would be called in
> a normal context, not in a signal handler. As the signal handler does not
> change, the risk of breaking anything or causing chaos is as large/small
> as it is under the current scheme.

> However, PyGTKs problem does get
> solved, as long as there is _a_ thread that returns to the interpreter
> within some timeframe. It seems plausible that this will happen.

  No, it is not plausible at all.  For instance, the GnomeVFS library
usually has a pool of threads, not doing anything, waiting for some VFS
task.  It is likely that a signal will be delivered to one of these
threads, which know nothing about Python and sit idle most of the
time.

  Regards.

From gjcarneiro at gmail.com  Sat Sep  9 13:11:19 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 9 Sep 2006 12:11:19 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <aac2c7cb0609082159g647fde2s3b714a6d001de5be@mail.gmail.com>
References: <ee2a432c0607132339o1d35c402ib13884c85115856a@mail.gmail.com>
	<09f901c6a72c$495f2690$12472597@bagio>
	<20060714112137.GA891@Andrew-iBook2.local>
	<e99iut$u6n$1@sea.gmane.org> <44B8A90C.6070309@v.loewis.de>
	<ee2a432c0607151835m741dc92dpa5cc3a7149842a1c@mail.gmail.com>
	<1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
	<op.tfk9h00vaed6q0@e500>
	<aac2c7cb0609082152t76029092vb392cf1540168dc@mail.gmail.com>
	<aac2c7cb0609082159g647fde2s3b714a6d001de5be@mail.gmail.com>
Message-ID: <a467ca4f0609090411q3fa9d09bu15a966823d141733@mail.gmail.com>

On 9/9/06, Adam Olsen <rhamph at gmail.com> wrote:
> On 9/8/06, Adam Olsen <rhamph at gmail.com> wrote:
> > Ensuring modifications to that array are atomic would be tricky, but I
> > think it would be doable if we use a read-copy-update approach (with
> > two alternating signal handler functions).  Not sure how to ensure
> > there's no currently running signal handlers in another thread though.
> >  Maybe have to rip the atomic read/write stuff out of the Linux
> > sources to ensure it's *always* defined behavior.
>
> Doh, except that's exactly what sig_atomic_t is for.  Ah well, can't
> win them all.

From the glibc manual:
"""
To avoid uncertainty about interrupting access to a variable, you can
use a particular data type for which access is always atomic:
sig_atomic_t. Reading and writing this data type is guaranteed to
happen in a single instruction, so there's no way for a handler to run
"in the middle" of an access.
"""

  So, no, this is certainly not the same as the Linux kernel's atomic
operations, which let you do more interesting things like test-and-clear
or decrement-and-test atomically.  GLib has those too, and so does
Mozilla's NSPR, but only on a few architectures do they manage it
without using mutexes.  For instance, i686 onwards don't require
mutexes, only special instructions, but i386 requires mutexes.  And we
all know mutexes in signal handlers cause deadlocks :-(
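
  To make the distinction concrete, a minimal sketch -- a plain store
and a plain load are the only kinds of access sig_atomic_t makes safe:

    #include <signal.h>

    static volatile sig_atomic_t tripped;

    static void
    handler(int sig)
    {
        tripped = 1;   /* a simple store: this is all sig_atomic_t guarantees */
        /* tripped++ or "if (!tripped) tripped = 1;" is a read-modify-write,
           which sig_atomic_t does NOT make atomic */
    }

    /* ...and in normal (non-handler) code:
           if (tripped) { tripped = 0; run_the_real_handler(); }       */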

  And, yes, Py_AddPendingCall and Py_MakePendingCalls are most
certainly not async safe!  Just look at the source code of
Py_MakePendingCalls and you'll see an interesting comment...
Therefore, discussions about signal safety in whatever new API we may
add to Python should be taken with a grain of salt.

  Regards.

From gjcarneiro at gmail.com  Sat Sep  9 13:38:03 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 9 Sep 2006 12:38:03 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
Message-ID: <a467ca4f0609090438k55653721m20b2a27aa2c55dbc@mail.gmail.com>

On 9/9/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
> I was hoping to have stopped, but here are a few comments.
>
> I agree with Jan Kanis.  That is the way to tackle this one.

  Alas, it doesn't work in practice, as I already replied.

[...]
> Despite the implication, the code of Py_AddPendingCall is NOT safe
> against simultaneous registration.  It is just plain broken, I am
> afraid.  The note starting "Darn" should be a LOT stronger :-)

  Considering that this code has existed for a very long time, and
that it isn't really safe, should we even bother to try to make
signals 100% reliable?

  I remember a security-related module (Bastion?) that first
claimed to allow execution of untrusted code while protecting the
system; later, they figured out it wasn't really safe, and couldn't be
safe, so the documentation was simply changed to state not to use that
module if you need real security.

  I see the same problem here.  Python signal handling isn't _really_
100% reliable.  And it would be very hard to make Py_AddPendingCall /
Py_MakePendingCalls completely reliable.

But let's think for a moment.  Do we really _need_ to make Python's Unix
signal handling 100% reliable?  What are the uses for signals?  I can
only think of a couple: handling SIGINT to generate
KeyboardInterrupt [1], and handling fatal errors like SIGSEGV in
order to show a crash dialog and bug reporting tool.  The second use
case doesn't demand 100% reliability, and it is already handled in
recent Ubuntu Linux through /proc/sys/kernel/crashdump-helper.  The
other notable use of signals that I see is sending SIGUSR1 or SIGHUP to
a daemon to make it reload its configuration, but any competent
programmer already knows how to make the program use local sockets
instead.


[1] Although ideally Python wouldn't even have KeyboardInterrupt and
just die on Ctrl-C like any normal program.

From steve at holdenweb.com  Sat Sep  9 14:33:24 2006
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 09 Sep 2006 13:33:24 +0100
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <45026BF4.5080108@v.loewis.de>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>
	<4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de>
Message-ID: <4502B494.3080509@holdenweb.com>

Martin v. Löwis wrote:
> Nick Coghlan schrieb:
> 
>>So this is taking something that *already works properly on POSIX
>>systems* and making it work on Windows as well.
> 
> 
> I doubt it does without side effects. For example, an application that
> would go through sys.path, and encode everything with
> sys.getfilesystemencoding() currently works, but will break if the patch
> is applied and non-mbcs strings are put on sys.path.
> 
> Also, what will be the effect on __file__? What value will it have
> if the module originates from a sys.path entry that is a non-mbcs
> unicode string? I haven't tested the patch, but it looks like
> __file__ becomes a unicode string on Windows, and remains a byte
> string encoded with the file system encoding elsewhere. That's also
> a change in behavior.
> 
Just to summarise my feeling having read the words of those more 
familiar with the issues than me: it looks like this should be a 2.6 
enhancement if it's included at all. I'd like to see it go in, but there 
do seem to be problems ensuring consistent behaviour across inconsistent 
platforms.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From jan-python at maka.demon.nl  Sat Sep  9 16:06:20 2006
From: jan-python at maka.demon.nl (Jan Kanis)
Date: Sat, 09 Sep 2006 16:06:20 +0200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609090359l1e5e54b1o1731744b4b8b0e0f@mail.gmail.com>
References: <ee2a432c0607132339o1d35c402ib13884c85115856a@mail.gmail.com>
	<09f901c6a72c$495f2690$12472597@bagio>
	<20060714112137.GA891@Andrew-iBook2.local>
	<e99iut$u6n$1@sea.gmane.org> <44B8A90C.6070309@v.loewis.de>
	<ee2a432c0607151835m741dc92dpa5cc3a7149842a1c@mail.gmail.com>
	<1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
	<op.tfk9h00vaed6q0@e500>
	<a467ca4f0609090359l1e5e54b1o1731744b4b8b0e0f@mail.gmail.com>
Message-ID: <op.tfl6suzbaed6q0@e500>

On Sat, 09 Sep 2006 12:59:23 +0200, Gustavo Carneiro  
<gjcarneiro at gmail.com> wrote:

> On 9/9/06, Jan Kanis <jan-python at maka.demon.nl> wrote:
>> However, PyGTKs problem does get
>> solved, as long as there is _a_ thread that returns to the interpreter
>> within some timeframe. It seems plausible that this will happen.
>
>   No, it is not plausible at all.  For instance, the GnomeVFS library
> usually has a pool of thread, not doing anything, waiting for some VFS
> task.  It is likely that a signal will be delivered to one of these
> threads, which know nothing about Python, and sit idle most of the
> time.
>
>   Regards.

Well, perhaps it isn't plausible in all cases. However, it is dependent on
the libraries you're using, and debuggable, which broken signal handlers
apparently aren't. The approach would work if you don't use libraries that
block threads, and if the libraries that do block threads co-operate with
the interpreter. Open source libraries can be made to co-operate, and if
you don't have the source and a library doesn't work correctly, all bets
are off anyway.
But having the signal handler itself write to a pipe seems to be a cleaner
solution, if it can work reliably enough for some value of 'reliable'.

Jan

From david.nospam.hopwood at blueyonder.co.uk  Sat Sep  9 17:26:03 2006
From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood)
Date: Sat, 09 Sep 2006 16:26:03 +0100
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <45026BF4.5080108@v.loewis.de>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>
	<4502576C.3060604@gmail.com> <45026BF4.5080108@v.loewis.de>
Message-ID: <4502DD0B.2090903@blueyonder.co.uk>

Martin v. Löwis wrote:
> Nick Coghlan schrieb:
> 
>>So this is taking something that *already works properly on POSIX
>>systems* and making it work on Windows as well.
> 
> I doubt it does without side effects. For example, an application that
> would go through sys.path, and encode everything with
> sys.getfilesystemencoding() currently works, but will break if the patch
> is applied and non-mbcs strings are put on sys.path.

Huh? It won't break on any path for which it is not already broken.

You seem to be saying "Paths with non-mbcs strings shouldn't work on Windows,
because they haven't worked in the past."

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>




From martin at v.loewis.de  Sat Sep  9 17:34:19 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 09 Sep 2006 17:34:19 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4502DD0B.2090903@blueyonder.co.uk>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>	<4502576C.3060604@gmail.com>
	<45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk>
Message-ID: <4502DEFB.5030904@v.loewis.de>

David Hopwood schrieb:
>> I doubt it does without side effects. For example, an application that
>> would go through sys.path, and encode everything with
>> sys.getfilesystemencoding() currently works, but will break if the patch
>> is applied and non-mbcs strings are put on sys.path.
> 
> Huh? It won't break on any path for which it is not already broken.
> 
> You seem to be saying "Paths with non-mbcs strings shouldn't work on Windows,
> because they haven't worked in the past."

That's not what I'm saying. I'm saying that it shouldn't work in 2.5.x,
because it didn't in 2.5.0. Changing it in 2.6 is fine, along with the
incompatibilities it causes.

Regards,
Martin


From ncoghlan at gmail.com  Sat Sep  9 19:05:36 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 10 Sep 2006 03:05:36 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4502DD0B.2090903@blueyonder.co.uk>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>	<4502576C.3060604@gmail.com>
	<45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk>
Message-ID: <4502F460.5040308@gmail.com>

David Hopwood wrote:
> Martin v. Löwis wrote:
>> Nick Coghlan schrieb:
>>
>>> So this is taking something that *already works properly on POSIX
>>> systems* and making it work on Windows as well.
>> I doubt it does without side effects. For example, an application that
>> would go through sys.path, and encode everything with
>> sys.getfilesystemencoding() currently works, but will break if the patch
>> is applied and non-mbcs strings are put on sys.path.
> 
> Huh? It won't break on any path for which it is not already broken.
> 
> You seem to be saying "Paths with non-mbcs strings shouldn't work on Windows,
> because they haven't worked in the past."

I think MvL is looking at it from the point of view of consumers of the list 
of strings in sys.path, such as PEP 302 importer and loader objects, and tools 
like module_finder. Currently, the list of values in sys.path is limited to:

1. 8-bit strings
2. Unicode strings containing only characters which can be encoded using the 
default file system encoding

For PEP 302 loaders, it is currently correct for them to take the 8-bit string 
they receive and do "path.decode(sys.getfilesystemencoding())"

Kristján's patch works nicely for his application because he doesn't have to 
worry about compatibility with existing loaders and utilities. The core 
doesn't have that luxury.

We *might* be able to find a backwards compatible way to do it that could be 
put into 2.5.x, but that is effort that could more profitably be spent 
elsewhere, particularly since the state of the import system in Py3k will be 
for it to be based entirely on Unicode (as GvR pointed out last time this 
topic came up [1]).

Cheers,
Nick.

[1] http://mail.python.org/pipermail/python-dev/2006-June/066225.html



-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From martin at v.loewis.de  Sat Sep  9 19:42:17 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 09 Sep 2006 19:42:17 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4502F460.5040308@gmail.com>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>	<4502576C.3060604@gmail.com>	<45026BF4.5080108@v.loewis.de>
	<4502DD0B.2090903@blueyonder.co.uk> <4502F460.5040308@gmail.com>
Message-ID: <4502FCF9.2090403@v.loewis.de>

Nick Coghlan schrieb:
> I think MvL is looking at it from the point of view of consumers of the list 
> of strings in sys.path, such as PEP 302 importer and loader objects, and tools 
> like module_finder. Currently, the list of values in sys.path is limited to:

That, and all kinds of inspection tools. For example, when __file__ of a
module object changes to be a Unicode string (which it does under the
proposed patch), then these tools break. They currently don't break in
that way because putting arbitrary Unicode strings on sys.path doesn't
work in the first place.

Regards,
Martin

From martin at v.loewis.de  Sat Sep  9 20:10:19 2006
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 09 Sep 2006 20:10:19 +0200
Subject: [Python-Dev] Interest in a Python 2.3.6?
In-Reply-To: <05D21426-7FF5-4AEF-B757-DF0BEC5D0D74@python.org>
References: <05D21426-7FF5-4AEF-B757-DF0BEC5D0D74@python.org>
Message-ID: <4503038B.8060507@v.loewis.de>

Barry Warsaw schrieb:
> Thoughts?  I don't want to waste my time if nobody thinks a 2.3.6 would
> be useful, but I'm happy to do it if there's community support.  I'll
> also need the usual help with Windows installers and documentation updates.

I personally would consider it a waste of time. Since it wouldn't waste
*my* time, I'm -0 :-)

I think everybody has come to terms with whatever quirks Python 2.3 has.
Distributors of Python 2.3 have added whatever patches they think are
absolutely necessary. Making another release could cause confusion;
at worst, it may cause people to add special cases for 2.3.6 in
case the release contains some incompatible change that affects
existing applications.

Regards,
Martin

From david.nospam.hopwood at blueyonder.co.uk  Sat Sep  9 20:52:48 2006
From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood)
Date: Sat, 09 Sep 2006 19:52:48 +0100
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4502F460.5040308@gmail.com>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>	<4502576C.3060604@gmail.com>
	<45026BF4.5080108@v.loewis.de> <4502DD0B.2090903@blueyonder.co.uk>
	<4502F460.5040308@gmail.com>
Message-ID: <45030D80.9080105@blueyonder.co.uk>

Nick Coghlan wrote:
> David Hopwood wrote:
>> Martin v. Löwis wrote:
>>> Nick Coghlan schrieb:
>>>
>>>> So this is taking something that *already works properly on POSIX
>>>> systems* and making it work on Windows as well.
>>>
>>> I doubt it does without side effects. For example, an application that
>>> would go through sys.path, and encode everything with
>>> sys.getfilesystemencoding() currently works, but will break if the patch
>>> is applied and non-mbcs strings are put on sys.path.
>>
>> Huh? It won't break on any path for which it is not already broken.
>>
>> You seem to be saying "Paths with non-mbcs strings shouldn't work on
>> Windows, because they haven't worked in the past."
> 
> I think MvL is looking at it from the point of view of consumers of the
> list of strings in sys.path, such as PEP 302 importer and loader
> objects, and tools like module_finder. Currently, the list of values in
> sys.path is limited to:
> 
> 1. 8-bit strings
> 2. Unicode strings containing only characters which can be encoded using
> the default file system encoding

On Windows, file system pathnames can contain arbitrary Unicode characters
(well, almost). Despite the existence of "ANSI" filesystem APIs, and
regardless of what 'sys.getfilesystemencoding()' returns, the underlying
file system encoding for NTFS and FAT filesystems is UTF-16LE.

Thus, either:
 - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding
   on Windows is a bug, or
 - any program that relies on sys.getfilesystemencoding() being able to
   encode arbitrary Windows pathnames has a bug.

We need to decide which of these is the case.

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>




From martin at v.loewis.de  Sat Sep  9 21:16:45 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 09 Sep 2006 21:16:45 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <45030D80.9080105@blueyonder.co.uk>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>	<4502576C.3060604@gmail.com>	<45026BF4.5080108@v.loewis.de>
	<4502DD0B.2090903@blueyonder.co.uk>	<4502F460.5040308@gmail.com>
	<45030D80.9080105@blueyonder.co.uk>
Message-ID: <4503131D.20806@v.loewis.de>

David Hopwood schrieb:
> On Windows, file system pathnames can contain arbitrary Unicode characters
> (well, almost). Despite the existence of "ANSI" filesystem APIs, and
> regardless of what 'sys.getfilesystemencoding()' returns, the underlying
> file system encoding for NTFS and FAT filesystems is UTF-16LE.
> 
> Thus, either:
>  - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding
>    on Windows is a bug, or
>  - any program that relies on sys.getfilesystemencoding() being able to
>    encode arbitrary Windows pathnames has a bug.
> 
> We need to decide which of these is the case.

There is a third option:
- the operating system has a bug

It is actually this option that rules out the other two.
sys.getfilesystemencoding() returns "mbcs" on Windows, which means
CP_ACP. The file system encoding is an encoding that converts a
file name into a byte string. Unfortunately, on Windows, there are
file names which cannot be converted into a byte string in a standard
manner. This is an operating system bug (or mis-design; they should
have chosen UTF-8 as the byte encoding of file names, instead of
making it depend on the system locale, but they of course did so
for backwards compatibility with Windows 3.1 and 9x).

As a side note: every encoding in Python is a Unicode encoding;
so there aren't any "non-Unicode encodings".

Programs that rely on sys.getfilesystemencoding() being able to
represent arbitrary file names on Windows might have a bug;
programs that rely on sys.getfilesystemencoding() being able
to encode all elements of sys.path do not (at least not for
Python 2.5 and earlier).

Regards,
Martin


From barry at python.org  Sat Sep  9 22:41:04 2006
From: barry at python.org (Barry Warsaw)
Date: Sat, 9 Sep 2006 16:41:04 -0400
Subject: [Python-Dev] Interest in a Python 2.3.6?
In-Reply-To: <4503038B.8060507@v.loewis.de>
References: <05D21426-7FF5-4AEF-B757-DF0BEC5D0D74@python.org>
	<4503038B.8060507@v.loewis.de>
Message-ID: <204A1476-028D-4D75-98C0-BECEA3509C39@python.org>

On Sep 9, 2006, at 2:10 PM, Martin v. Löwis wrote:

> Barry Warsaw schrieb:
>> Thoughts?  I don't want to waste my time if nobody thinks a 2.3.6  
>> would
>> be useful, but I'm happy to do it if there's community support.  I'll
>> also need the usual help with Windows installers and documentation  
>> updates.
>
> I personally would consider it a waste of time. Since it wouldn't  
> waste
> *my* time, I'm -0 :-)
>
> I think everybody has arranged with whatever quirks Python 2.3 has.
> Distributors of Python 2.3 have added whatever patches they think are
> absolutely necessary. Making another release could cause confusion;
> at worst, it may cause people to special-case people for 2.3.6 in
> case the release contains some incompatible change that affects
> existing applications.

Well, there certainly hasn't been an overwhelming chorus of support  
for the idea, so I think I'll waste my time elsewhere ;).  Consider  
the offer withdrawn.

-Barry


From david.nospam.hopwood at blueyonder.co.uk  Sat Sep  9 23:22:10 2006
From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood)
Date: Sat, 09 Sep 2006 22:22:10 +0100
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <4503131D.20806@v.loewis.de>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>	<4502576C.3060604@gmail.com>	<45026BF4.5080108@v.loewis.de>
	<4502DD0B.2090903@blueyonder.co.uk>	<4502F460.5040308@gmail.com>
	<45030D80.9080105@blueyonder.co.uk> <4503131D.20806@v.loewis.de>
Message-ID: <45033082.8080209@blueyonder.co.uk>

Martin v. Löwis wrote:
> David Hopwood schrieb:
> 
>>On Windows, file system pathnames can contain arbitrary Unicode characters
>>(well, almost). Despite the existence of "ANSI" filesystem APIs, and
>>regardless of what 'sys.getfilesystemencoding()' returns, the underlying
>>file system encoding for NTFS and FAT filesystems is UTF-16LE.
>>
>>Thus, either:
>> - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding
>>   on Windows is a bug, or
>> - any program that relies on sys.getfilesystemencoding() being able to
>>   encode arbitrary Windows pathnames has a bug.
>>
>>We need to decide which of these is the case.
> 
> There is a third option:
> - the operating system has a bug

This behaviour is by design. If it is a bug, then it is a "won't ever fix --
no way, no how" bug, that Python must accomodate if it is to properly support
Unicode on Windows.

> It is actually this option that rules out the other two.
> sys.getfilesystemencoding() returns "mbcs" on Windows, which means
> CP_ACP. The file system encoding is an encoding that converts a
> file name into a byte string. Unfortunately, on Windows, there are
> file names which cannot be converted into a byte string in a standard
> manner. This is an operating system bug (or mis-design; they should
> have chosen UTF-8 as the byte encoding of file names, instead of
> making it depend on the system locale, but they of course did so
> for backwards compatibility with Windows 3.1 and 9x).

Although UTF-8 was invented (in September 1992) technically before the release
of the first version of NT supporting NTFS (NT 3.1 in July 1993), it had not
been invented before the decision to use Unicode in NTFS, or in Windows NT's
file APIs, had been made.

(I believe OS/2's HPFS did not support Unicode, even though NTFS was otherwise
almost identical to it.)

At that time, the decision to use Unicode at all was quite forward-looking;
the final version of Unicode 1.0 had only been published in June 1992
(although it had been approved earlier; see <http://www.unicode.org/history/>).

UTF-8 was only officially added to the Unicode standard in an appendix of
Unicode 2.0 (published July 1996), and only given essentially equal status to
UTF-16 and UTF-32 in Unicode 3.0 (September 1999).

> As a side note: every encoding in Python is a Unicode encoding;
> so there aren't any "non-Unicode encodings".

It was clear from context that I meant "encoding capable of representing
all Unicode characters".

> Programs that rely on sys.getfilesystemencoding() being able to
> represent arbitrary file names on Windows might have a bug;
> programs that rely on sys.getfilesystemencoding() being able
> to encode all elements of sys.path do not (at least not for
> Python 2.5 and earlier).

Elements of sys.path can be Unicode strings in Python 2.5, and should be
pathnames supported by the underlying OS. Where is it documented that there
is any further restriction on them? And why should there be any further
restriction on them?

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>




From martin at v.loewis.de  Sat Sep  9 23:55:20 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 09 Sep 2006 23:55:20 +0200
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <45033082.8080209@blueyonder.co.uk>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>	<4502576C.3060604@gmail.com>	<45026BF4.5080108@v.loewis.de>	<4502DD0B.2090903@blueyonder.co.uk>	<4502F460.5040308@gmail.com>	<45030D80.9080105@blueyonder.co.uk>
	<4503131D.20806@v.loewis.de> <45033082.8080209@blueyonder.co.uk>
Message-ID: <45033848.2020307@v.loewis.de>

David Hopwood schrieb:
> Elements of sys.path can be Unicode strings in Python 2.5, and should be
> pathnames supported by the underlying OS. Where is it documented that there
> is any further restriction on them? And why should there be any further
> restriction on them?

It's not documented in that detail; if people think it should be
documented more thoroughly, that should be done (contributions are
welcome). Changing the import machinery to deal with Unicode strings
differently cannot be done for Python 2.5, though: it cannot be done
for 2.5.0 as the release candidate has already been published, and there
is no acceptable patch available at this moment. It cannot be added
to 2.5.x as it may reasonably break existing applications.

Regards,
Martin


From jcarlson at uci.edu  Sun Sep 10 00:23:50 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 09 Sep 2006 15:23:50 -0700
Subject: [Python-Dev] Python 2.4.4  was: Interest in a Python 2.3.6?
In-Reply-To: <204A1476-028D-4D75-98C0-BECEA3509C39@python.org>
References: <4503038B.8060507@v.loewis.de>
	<204A1476-028D-4D75-98C0-BECEA3509C39@python.org>
Message-ID: <20060909151653.F8E7.JCARLSON@uci.edu>


Barry Warsaw <barry at python.org> wrote:
> Well, there certainly hasn't been an overwhelming chorus of support  
> for the idea, so I think I'll waste my time elsewhere ;).  Consider  
> the offer withdrawn.

I hope someone tries to fix one of the two bugs I listed that were
problems for 2.3 and 2.4 in 2.4.4:

http://www.python.org/sf/780714
http://www.python.org/sf/1548687

The former involves stack allocation errors in subthreads that exist
even in 2.5; it may not be fixable on Windows, and very likely is not
fixable on Linux.

The latter is fixable on all platforms.

 - Josiah


From ncoghlan at gmail.com  Sun Sep 10 04:24:38 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 10 Sep 2006 12:24:38 +1000
Subject: [Python-Dev] Unicode Imports
In-Reply-To: <45033082.8080209@blueyonder.co.uk>
References: <129CEF95A523704B9D46959C922A280002FE991D@nemesis.central.ccp.cc>	<200609080353.07502.anthony@interlink.com.au>	<edr99s$v3g$4@sea.gmane.org>	<4501D7BD.1020006@v.loewis.de>	<4502576C.3060604@gmail.com>	<45026BF4.5080108@v.loewis.de>	<4502DD0B.2090903@blueyonder.co.uk>	<4502F460.5040308@gmail.com>	<45030D80.9080105@blueyonder.co.uk>
	<4503131D.20806@v.loewis.de> <45033082.8080209@blueyonder.co.uk>
Message-ID: <45037766.8030202@gmail.com>

David Hopwood wrote:
> Martin v. Löwis wrote:
>> Programs that rely on sys.getfilesystemencoding() being able to
>> represent arbitrary file names on Windows might have a bug;
>> programs that rely on sys.getfilesystemencoding() being able
>> to encode all elements of sys.path do not (at least not for
>> Python 2.5 and earlier).
> 
> Elements of sys.path can be Unicode strings in Python 2.5, and should be
> pathnames supported by the underlying OS. Where is it documented that there
> is any further restriction on them? And why should there be any further
> restriction on them?

There's no suggestion that this limitation shouldn't be fixed - merely that 
fixing it is likely to break some applications which rely on sys.path for 
importing or introspection purposes. A 2.5.x maintenance release typically 
shouldn't break anything that worked correctly on 2.5.0, hence fixing this 
becomes a project for either 2.6 or 3.0.

To put it another way: fixing this is likely to require changes to more than 
just the interpreter core. It will also potentially require changes to all 
applications which currently expect to be able to use 
's.encode(sys.getfilesystemencoding())' to convert any Unicode path entry or 
__file__ attribute to an 8-bit string.

Doing that qualifies as correcting a language design error or limitation, but 
it would require a real stretch of the definition to qualify as a bug fix.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From greg.ewing at canterbury.ac.nz  Sun Sep 10 09:35:53 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 10 Sep 2006 19:35:53 +1200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <op.tfk9h00vaed6q0@e500>
References: <ee2a432c0607132339o1d35c402ib13884c85115856a@mail.gmail.com>
	<09f901c6a72c$495f2690$12472597@bagio>
	<20060714112137.GA891@Andrew-iBook2.local> <e99iut$u6n$1@sea.gmane.org>
	<44B8A90C.6070309@v.loewis.de>
	<ee2a432c0607151835m741dc92dpa5cc3a7149842a1c@mail.gmail.com>
	<1f7befae0607152047u43993a15ue5180b990f9a530f@mail.gmail.com>
	<op.tfk9h00vaed6q0@e500>
Message-ID: <4503C059.3070308@canterbury.ac.nz>

Jan Kanis wrote:
> However, PyGTKs problem does get  
> solved, as long as there is _a_ thread that returns to the interpreter  
> within some timeframe. It seems plausible that this will happen.

I don't see that this makes the situation much better,
as it just shifts the need for polling to another
thread.

> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/greg.ewing%40canterbury.ac.nz


From greg.ewing at canterbury.ac.nz  Sun Sep 10 09:35:59 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 10 Sep 2006 19:35:59 +1200
Subject: [Python-Dev] Py_BuildValue and decref
In-Reply-To: <D7FAEFC0-59CC-49F6-9757-27244D409E94@python.org>
References: <20060908220605.GF990@abulafia.devel.redhat.com>
	<D7FAEFC0-59CC-49F6-9757-27244D409E94@python.org>
Message-ID: <4503C05F.70508@canterbury.ac.nz>

Barry Warsaw wrote:
> I just want to point out that the C API documentation is pretty  
> silent about the refcounting side-effects in error conditions (and  
> often in success conditions too) of most Python functions.  For  
> example, what is the refcounting side-effects of PyDict_SetItem() on  
> val?  What about if that function fails?  Has val been incref'd or  
> not?  What about the side-effects on any value the new one replaces,  
> both in success and failure?

The usual principle is that the refcounting behaviour
is (or should be) independent of whether the function
succeeded or failed. In the absence of any statement
to the contrary in the docs, you should be able to
assume that.

The words used to describe the refcount behaviour of
some functions can be rather confusing, but it always
boils down to one of two cases: either the function
"borrows" a reference (and does its own incref if
needed, the caller doesn't need to care) or it "steals"
a reference (so the caller is always responsible for
doing an incref if needed before calling).

What that rather convoluted comment about PyTuple_SetItem
is trying to say is just that it *always* steals a reference,
regardless of whether it succeeds or fails. I expect the
same is true of Py_BuildValue.
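
Concretely, the two conventions look like this in caller code (just a
sketch, not taken from any real module; error handling trimmed to the
essentials):

#include <Python.h>

/* PyTuple_SetItem steals the reference to `item' whether it succeeds or
   fails; PyDict_SetItem only borrows it, so the caller keeps ownership
   of `item' in both the success and the failure case. */
static PyObject *demo(PyObject *key, PyObject *item)
{
    PyObject *t, *d;

    t = PyTuple_New(1);
    if (t == NULL)
        return NULL;

    Py_INCREF(item);                        /* give the tuple its own ref   */
    if (PyTuple_SetItem(t, 0, item) < 0) {  /* that ref is gone either way; */
        Py_DECREF(t);                       /* do NOT decref item again     */
        return NULL;
    }

    d = PyDict_New();
    if (d == NULL) {
        Py_DECREF(t);
        return NULL;
    }
    if (PyDict_SetItem(d, key, item) < 0) { /* borrowed: our refs untouched */
        Py_DECREF(d);
        Py_DECREF(t);
        return NULL;
    }
    Py_DECREF(t);                           /* the dict holds its own refs  */
    return d;
}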

--
Greg

From rhamph at gmail.com  Mon Sep 11 06:32:43 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Sun, 10 Sep 2006 22:32:43 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
Message-ID: <aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>

On 9/9/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
> I can't honestly promise to put any time into this in the forseeable
> future, but will try (sometime).  If anyone wants to tackle this,
> please ask me for comments/help/etc.

It took me a while to realize just what was wrong with my proposal,
but I did, and it led me to a new proposal.  I'd appreciate if you
could point out any holes in it.  First though, for the benefit of
those reading, I'll try to explain the (multiple!) reasons why mine
fails.

First, sig_atomic_t essentially promises that the compiler will behave
atomically and the CPU it's run on will behave locally atomically.  It
does not claim to make writes visible to other CPUs in an atomic way,
and thus you could have different bytes show up at different times.
The x86 architecture uses a very simple scheme and won't do this
(unless the compiler itself does), but other architectures will.

Second, the start of a write call may be delayed a very long time.
This means that an fd may not be written to until hours after the
signal arrived.  We can't release any fds used for such a purpose, or
else we risk writing random data to them if they get reused later.

Third, it doesn't resolve the existing problems.  If I'm going to fix
signals I should fix ALL of signals. :)

Now on to my new proposal.  I do still use write().  If you can't
accept that, I think we should rip signals out entirely, just let them
kill the process.  Not a reliable feature of any OS.

We create a single pipe and use it for all signals.  We never release
it, instead letting the OS do it when the process gets cleaned up.  We
write the signal number to it as a byte (assuming there's at most 256
unique signals).

This much would allow a GUI's poll loop to wake up when there is a
signal, and give control back to the python main loop, which could
then read off the signals and queue up their handler functions.

The only problem is when there is no GUI poll loop.  We don't want
python to have to poll the fd, we'd rather it just check a variable.
Is it possible to set/clear a flag in a sufficiently portable
(reentrant-safe, non-blocking, thread-safe) fashion?

-- 
Adam Olsen, aka Rhamphoryncus

From nnorwitz at gmail.com  Mon Sep 11 06:54:49 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 10 Sep 2006 21:54:49 -0700
Subject: [Python-Dev] 2.5c2
Message-ID: <ee2a432c0609102154o909189ct881191c2ce069dec@mail.gmail.com>

PEP 356

  http://www.python.org/dev/peps/pep-0356/

has 2.5c2 scheduled for Sept 12.  I checked in a fix for the last
blocking 2.5 issue (revert sgml infinite loop bug).  There are no
blocking issues that I know of (the PEP is up to date).

I expect Anthony will call for a freeze real soon now.  It would be
awesome if there were no more changes from now until 2.5 final!
(Changes to the trunk or the 2.4 branch are fine; updating docs for 2.5 is
also fine.)

I will be running valgrind over 2.5, but don't expect anything to show
up since the last run was pretty recent.  Coverity has no outstanding
issues and Klocwork results are pretty clean.  It's not clear if the
remaining warnings from Klocwork are real issues or not.

Keep doing a bunch of testing so we don't have any surprises in 2.5.

n

PS Scary as it sounds, I hope to have an HP-UX buildbot up and running
real soon now.  After 2.5 is out, I will fix the issues with the
cygwin bot (ie, upgrade cygwin) and get the HP-UX bot running.

From nnorwitz at gmail.com  Mon Sep 11 10:34:21 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 11 Sep 2006 01:34:21 -0700
Subject: [Python-Dev] _PyGILState_NoteThreadState should be static or not?
Message-ID: <ee2a432c0609110134r7210a098j2199c1e342547c9c@mail.gmail.com>

Michael,

In Python/pystate.c, you made this checkin:

"""
r39044 | mwh | 2005-06-20 12:52:57 -0400 (Mon, 20 Jun 2005) | 8 lines

Fix bug:  [ 1163563 ] Sub threads execute in restricted mode
basically by fixing bug 1010677 in a non-broken way.
"""

_PyGILState_NoteThreadState is declared as static on line 54, but the
definition on line 508 is not static. (HP's cc is complaining.)  I
don't see this referenced in any header file, so it seems like this
should be static?

$ grep _PyGILState_NoteThreadState */*.[ch]
Python/pystate.c:static void _PyGILState_NoteThreadState(PyThreadState* tstate);
Python/pystate.c:               _PyGILState_NoteThreadState(tstate);
Python/pystate.c:       _PyGILState_NoteThreadState(t);
Python/pystate.c:_PyGILState_NoteThreadState(PyThreadState* tstate)

n

From mwh at python.net  Mon Sep 11 10:40:23 2006
From: mwh at python.net (Michael Hudson)
Date: Mon, 11 Sep 2006 09:40:23 +0100
Subject: [Python-Dev] _PyGILState_NoteThreadState should be static or
	not?
In-Reply-To: <ee2a432c0609110134r7210a098j2199c1e342547c9c@mail.gmail.com>
References: <ee2a432c0609110134r7210a098j2199c1e342547c9c@mail.gmail.com>
Message-ID: <A23EC20F-130B-4CD9-8152-0CF464DE4591@python.net>


On 11 Sep 2006, at 09:34, Neal Norwitz wrote:

> Michael,
>
> In Python/pystate.c, you made this checkin:
>
> """
> r39044 | mwh | 2005-06-20 12:52:57 -0400 (Mon, 20 Jun 2005) | 8 lines
>
> Fix bug:  [ 1163563 ] Sub threads execute in restricted mode
> basically by fixing bug 1010677 in a non-broken way.
> """
>
> _PyGILState_NoteThreadState is declared as static on line 54, but the
> definition on line 508 is not static. (HP's cc is complaining.)  I
> don't see this referenced in any header file, it seems like this
> should be static?

It seems very likely, yes.  I think at one point (in my working copy)  
there was a call in Modules/threadmodule.c, which may partially  
account for my confusion.

Seems we have lots of HP users tracking SVN HEAD, then...

Cheers,
mwh


From anthony at interlink.com.au  Mon Sep 11 13:58:13 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon, 11 Sep 2006 21:58:13 +1000
Subject: [Python-Dev] BRANCH FREEZE: release25-maint,
	00:00UTC 12 September 2006
Message-ID: <200609112158.19000.anthony@interlink.com.au>

Ok, I haven't heard back from Martin, but I'm going to hope he's OK with 
tomorrow as a release date for 2.5rc2. If he's not, we'll try for the day 
after. In any case, I'm going to declare the release25-maint branch FROZEN as 
at 00:00 UTC on the 12th. That's about 12 hours from now.

This is for 2.5rc2. Once this is out, I'd like to see as close to zero changes 
as possible for the next week or so until 2.5 final is released.

My god, it's getting so close... 

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From gjcarneiro at gmail.com  Mon Sep 11 16:16:44 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Mon, 11 Sep 2006 15:16:44 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
Message-ID: <a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>

On 9/11/06, Adam Olsen <rhamph at gmail.com> wrote:
> Now on to my new proposal.  I do still use write().  If you can't
> accept that, I think we should rip signals out entirely, just let them
> kill the process.  Not a reliable feature of any OS.
>
> We create a single pipe and use it for all signals.  We never release
> it, instead letting the OS do it when the process gets cleaned up.  We
> write the signal number to it as a byte (assuming there's at most 256
> unique signals).
>
> This much would allow a GUI's poll loop to wake up when there is a
> signal, and give control back to the python main loop, which could
> then read off the signals and queue up their handler functions.

  I like this approach.  Not only would we get a pollable file
descriptor to notify a GUI main loop when signals arrive, we'd also
avoid the lack of async safety in Py_AddPendingCall /
Py_MakePendingCalls which affects _current_ Python code.

  Note that the file descriptor of the read end of the pipe has to
become a public Python API so that 3rd party extensions may poll it.
This is crucial.
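
A toolkit's main loop could then simply watch that descriptor alongside
its own (a rough sketch; how the read end is actually exposed is still
to be decided, so signal_fd is just a placeholder):

#include <poll.h>

static void main_loop_iteration(int signal_fd, int gui_fd)
{
    struct pollfd fds[2];

    fds[0].fd = signal_fd;  fds[0].events = POLLIN;  /* Python's signal pipe */
    fds[1].fd = gui_fd;     fds[1].events = POLLIN;  /* the toolkit's own fd */

    poll(fds, 2, -1);                /* sleep until either becomes readable */

    if (fds[0].revents & POLLIN) {
        /* Don't read the pipe here; just give control back to the
           interpreter so PyErr_CheckSignals() can drain it and run the
           Python-level handlers. */
    }
    /* ... dispatch toolkit events if fds[1] fired ... */
}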

>
> The only problem is when there is no GUI poll loop.  We don't want
> python to have to poll the fd, we'd rather it just check a variable.
> Is it possible to set/clear a flag in a sufficiently portable
> (reentrant-safe, non-blocking, thread-safe) fashion?

  It's simple.  That pipe file descriptor has to be changed to
non-blocking mode at both ends of the pipe, obviously, with fcntl.
Then, to find out whether a signal happened or not we modify
PyErr_CheckSignals() to try to read from the pipe.  If it reads bytes
from the pipe, we process the corresponding python signal handlers or
raise KeyboardInterrupt.  If the read() syscall returns zero bytes
read, we know no signal was delivered and move on.
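
The set-up side is only a few lines at interpreter startup (a sketch;
the variable names are just placeholders):

#include <unistd.h>
#include <fcntl.h>

static int signal_pipe_r = -1, signal_pipe_w = -1;

/* Create the wake-up pipe once and make both ends non-blocking, so the
   signal handler's write() can never block and read() on an empty pipe
   returns immediately instead of hanging. */
static int init_signal_pipe(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0 ||
        fcntl(fds[1], F_SETFL, O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    signal_pipe_r = fds[0];
    signal_pipe_w = fds[1];
    return 0;
}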

  The only potential problem left is that, by changing the pipe file
descriptor to non-blocking mode we can only write as many bytes to it
without reading from the other side as the pipe buffer allows.  If a
large number of signals arrive very quickly, that buffer may fill and
we lose signals.  But I think the default buffer should be more than
enough.  And normally programs don't receive lots of signals in a
small time window.  If it happens we may lose signals, but that's very
rare, and who cares anyway.

  Regards.

From eric+python-dev at trueblade.com  Mon Sep 11 20:31:45 2006
From: eric+python-dev at trueblade.com (Eric V. Smith)
Date: Mon, 11 Sep 2006 14:31:45 -0400
Subject: [Python-Dev] datetime's strftime implementation: by design or bug
Message-ID: <4505AB91.6030908@trueblade.com>

[I hope this belongs on python-dev, since it's about the design of 
something.  But if not, let me know and I'll post to c.l.py.]

I'm willing to file a bug report and patch on this, but I'd like to know 
if it's by design or not.

In datetimemodule.c, the function wrap_strftime() insists that the 
length of a format string be <= 127 chars, by forcing the length into a 
char.  This seems like a bug to me.  wrap_strftime() calls time's 
strftime(), which doesn't have this limitation because it uses size_t.

 >>> import datetime
 >>> datetime.datetime.now().strftime('x'*128)
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
MemoryError


 >>> import datetime
 >>> datetime.datetime.now().strftime('x'*256)
in wrap_strftime
totalnew=1
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
SystemError: Objects/stringobject.c:4077: bad argument to internal function


 >>> import time
 >>> time.strftime('x'*128)
'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'


But before I write this up, I'd like to know if anyone knows if this is 
by design or not.

This is reproducible with 2.4.3 on Windows, and with 2.3.3 and 2.5c1 on Linux.

Thanks.

Eric.

From tim.peters at gmail.com  Mon Sep 11 22:06:20 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 11 Sep 2006 16:06:20 -0400
Subject: [Python-Dev] datetime's strftime implementation: by design or
	bug
In-Reply-To: <4505AB91.6030908@trueblade.com>
References: <4505AB91.6030908@trueblade.com>
Message-ID: <1f7befae0609111306u587e2db9j51b58a15719797a1@mail.gmail.com>

[Eric V. Smith]
> [I hope this belongs on python-dev, since it's about the design of
> something.  But if not, let me know and I'll post to c.l.py.]
>
> I'm willing to file a bug report and patch on this, but I'd like to know
> if it's by design or not.
>
> In datetimemodule.c, the function wrap_strftime() insists that the
> length of a format string be <= 127 chars, by forcing the length into a
> char.  This seems like a bug to me.  wrap_strftime() calls time's
> strftime(), which doesn't have this limitation because it uses size_t.

Yawn ;-)  I'm very surprised the code doesn't verify that the format
size fits in a C char, but there's nothing deep about the assumption.
I expect it would work fine to just change the declarations of
`totalnew` and `usednew` from `char` to `Py_ssize_t` (for 2.5.1 and
2.6; to something else for 2.4.4 (I don't recall which C type
PyString_Size returned then -- probably `int`)), and /also/ change the
resize-and-overflow check.  The current:

 			int bigger = totalnew << 1;
 			if ((bigger >> 1) != totalnew) { /* overflow */
 				PyErr_NoMemory();
 				goto Done;
 			}

doesn't actually make sense even if it's certain that sizeof(int) is
strictly larger than sizeof(totalnew) (which C guarantees for type
`char`, but is plain false on some boxes if changed to Py_ssize_t).
Someone must have been on heavy drugs when writing that endlessly
tedious wrapper ;-)
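
Something along these lines should do once the counters are Py_ssize_t
(a sketch, keeping the names from the quoted code):

 			Py_ssize_t bigger;
 			if (totalnew > PY_SSIZE_T_MAX / 2) { /* doubling overflows */
 				PyErr_NoMemory();
 				goto Done;
 			}
 			bigger = totalnew << 1;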

> ...

From anthony at interlink.com.au  Tue Sep 12 02:54:30 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue, 12 Sep 2006 10:54:30 +1000
Subject: [Python-Dev] datetime's strftime implementation: by design or
	bug
In-Reply-To: <4505AB91.6030908@trueblade.com>
References: <4505AB91.6030908@trueblade.com>
Message-ID: <200609121054.35576.anthony@interlink.com.au>

Please log a bug - this is probably something suitable for fixing in 2.5.1. At 
the very least, if it's going to be limited to 127 characters, it should 
check that and raise a more suitable exception. 


From misa at redhat.com  Tue Sep 12 03:18:12 2006
From: misa at redhat.com (Mihai Ibanescu)
Date: Mon, 11 Sep 2006 21:18:12 -0400
Subject: [Python-Dev] Py_BuildValue and decref
In-Reply-To: <4503C05F.70508@canterbury.ac.nz>
References: <20060908220605.GF990@abulafia.devel.redhat.com>
	<D7FAEFC0-59CC-49F6-9757-27244D409E94@python.org>
	<4503C05F.70508@canterbury.ac.nz>
Message-ID: <20060912011812.GB14187@abulafia.devel.redhat.com>

On Sun, Sep 10, 2006 at 07:35:59PM +1200, Greg Ewing wrote:
> Barry Warsaw wrote:
> > I just want to point out that the C API documentation is pretty  
> > silent about the refcounting side-effects in error conditions (and  
> > often in success conditions too) of most Python functions.  For  
> > example, what is the refcounting side-effects of PyDict_SetItem() on  
> > val?  What about if that function fails?  Has val been incref'd or  
> > not?  What about the side-effects on any value the new one replaces,  
> > both in success and failure?
> 
> The usual principle is that the refcounting behaviour
> is (or should be) independent of whether the function
> succeeded or failed. In the absence of any statement
> to the contrary in the docs, you should be able to
> assume that.
> 
> The words used to describe the refcount behaviour of
> some functions can be rather confusing, but it always
> boils down to one of two cases: either the function
> "borrows" a reference (and does its own incref if
> needed, the caller doesn't need to care) or it "steals"
> a reference (so the caller is always responsible for
> doing an incref if needed before calling).
> 
> What that rather convoluted comment about PyTuple_SetItem
> is trying to say is just that it *always* steals a reference,
> regardless of whether it succeeds or fails. I expect the
> same is true of Py_BuildValue.

Given that it doesn't seem to be the case, and my quick look at the code
indicates that even internally python is inconsistent, should I file a
low-severity bug so we don't lose track of this?

Misa

From rhamph at gmail.com  Tue Sep 12 06:05:38 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 11 Sep 2006 22:05:38 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
Message-ID: <aac2c7cb0609112105l25be1d94r291f03b570ffef99@mail.gmail.com>

On 9/11/06, Gustavo Carneiro <gjcarneiro at gmail.com> wrote:
> On 9/11/06, Adam Olsen <rhamph at gmail.com> wrote:
> > This much would allow a GUI's poll loop to wake up when there is a
> > signal, and give control back to the python main loop, which could
> > then read off the signals and queue up their handler functions.
>
>   I like this approach.  Not only we would get a poll-able file
> descriptor to notify a GUI main loop when signals arrive, we'd also
> avoid the lack of async safety in Py_AddPendingCall /
> Py_MakePendingCalls which affects _current_ Python code.
>
>   Note that the file descriptor of the read end of the pipe has to
> become a public Python API so that 3rd party extensions may poll it.
> This is crucial.

Yeah, so long as Python still does the actual reading.


> > The only problem is when there is no GUI poll loop.  We don't want
> > python to have to poll the fd, we'd rather it just check a variable.
> > Is it possible to set/clear a flag in a sufficiently portable
> > (reentrant-safe, non-blocking, thread-safe) fashion?
>
>   It's simple.  That pipe file descriptor has to be changed to
> non-blocking mode in both ends of the pipe, obviously, with fcntl.
> Then, to find out whether a signal happened or not we modify
> PyErr_CheckSignals() to try to read from the pipe.  If it reads bytes
> from the pipe, we process the corresponding python signal handlers or
> raise KeyboardInterrupt.  If the read() syscall returns zero bytes
> read, we know no signal was delivered and move on.

Aye, but my point was that a syscall is costly, and we'd like to avoid
it if possible.

We'll probably have to benchmark it though, to find out if it's worth
the hassle.


>   The only potential problem left is that, by changing the pipe file
> descriptor to non-blocking mode we can only write as many bytes to it
> without reading from the other side as the pipe buffer allows.  If a
> large number of signals arrive very quickly, that buffer may fill and
> we lose signals.  But I think the default buffer should be more than
> enough.  And normally programs don't receive lots of signals in a
> small time window.  If it happens we may lose signals, but that's very
> rare, and who cares anyway.

Indeed, we need to document very clearly that:
* Signals may be dropped if there is a burst
* Signals may be delayed for a very long time, and if you replace a
previous handler your new handler may get signals intended for the old
handler

-- 
Adam Olsen, aka Rhamphoryncus

From greg.ewing at canterbury.ac.nz  Tue Sep 12 06:08:07 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Sep 2006 16:08:07 +1200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
Message-ID: <450632A7.40504@canterbury.ac.nz>

Gustavo Carneiro wrote:
>   The only potential problem left is that, by changing the pipe file
> descriptor to non-blocking mode we can only write as many bytes to it
> without reading from the other side as the pipe buffer allows.  If a
> large number of signals arrive very quickly, that buffer may fill and
> we lose signals.

That might be an argument for *not* trying to
communicate the signal number by the value
written to the pipe, but keep a separate set
of signal-pending flags, and just use the pipe
as a way of indicating that *something* has
happened.
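
Roughly (a sketch, assuming NSIG from <signal.h> and a wake-up pipe
created elsewhere; the names are placeholders):

#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t pending[NSIG]; /* which signals have fired   */
static volatile sig_atomic_t any_pending;   /* cheap "anything happened?" */
static int wakeup_fd;                       /* write end of the pipe      */

static void handler(int signum)
{
    char dummy = 0;
    pending[signum] = 1;
    any_pending = 1;
    write(wakeup_fd, &dummy, 1);  /* the byte's value is irrelevant */
}

The reading side drains the pipe and scans pending[], so a full pipe
buffer can at worst delay a signal, never lose it.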

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Sep 12 06:33:40 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Sep 2006 16:33:40 +1200
Subject: [Python-Dev] Py_BuildValue and decref
In-Reply-To: <20060912011812.GB14187@abulafia.devel.redhat.com>
References: <20060908220605.GF990@abulafia.devel.redhat.com>
	<D7FAEFC0-59CC-49F6-9757-27244D409E94@python.org>
	<4503C05F.70508@canterbury.ac.nz>
	<20060912011812.GB14187@abulafia.devel.redhat.com>
Message-ID: <450638A4.6020903@canterbury.ac.nz>

Mihai Ibanescu wrote:

> Given that it doesn't seem to be the case, and my quick look at the code
> indicates that even internally python is inconsistent, should I file a
> low-severity bug so we don't lose track of this?

I'd say so, yes. A function whose refcount behaviour
differs when it fails is awkward to use safely
at best, impossible at worst (if there's no way
of finding out what needs to be decrefed in
order to clean up properly).

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From rhamph at gmail.com  Tue Sep 12 07:05:00 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 11 Sep 2006 23:05:00 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <450632A7.40504@canterbury.ac.nz>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
	<450632A7.40504@canterbury.ac.nz>
Message-ID: <aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>

On 9/11/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Gustavo Carneiro wrote:
> >   The only potential problem left is that, by changing the pipe file
> > descriptor to non-blocking mode we can only write as many bytes to it
> > without reading from the other side as the pipe buffer allows.  If a
> > large number of signals arrive very quickly, that buffer may fill and
> > we lose signals.
>
> That might be an argument for *not* trying to
> communicate the signal number by the value
> written to the pipe, but keep a separate set
> of signal-pending flags, and just use the pipe
> as a way of indicating that *something* has
> happened.

That brings you back to how you access the flags variable.  At best it
is very difficult, requiring unique assembly code for every supported
platform.  At worst, some platforms may not have any way to do it from
an interrupt context..

A possible alternative is to keep a set of flags for every thread, but
that requires the threads poll their variable regularly, and possibly
a wake-up pipe for each thread..

-- 
Adam Olsen, aka Rhamphoryncus

From greg.ewing at canterbury.ac.nz  Tue Sep 12 08:35:41 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Sep 2006 18:35:41 +1200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
	<450632A7.40504@canterbury.ac.nz>
	<aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>
Message-ID: <4506553D.1020307@canterbury.ac.nz>

Adam Olsen wrote:

> That brings you back to how you access the flags variable.

The existing signal handler sets a flag, doesn't it?
So it couldn't be any more broken than the current
implementation.

If we get too paranoid about this, we'll just end
up deciding that signals can't be used for anything,
at all, ever. That doesn't seem very helpful,
although technically I suppose it would solve
the problem. :-)

My own conclusion from all this is that if you
can't rely on writing to a variable in one part
of your program and reading it back in another,
then computer architectures have become far
too clever for their own good. :-(

--
Greg

From rhamph at gmail.com  Tue Sep 12 08:59:58 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 12 Sep 2006 00:59:58 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <4506553D.1020307@canterbury.ac.nz>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
	<450632A7.40504@canterbury.ac.nz>
	<aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>
	<4506553D.1020307@canterbury.ac.nz>
Message-ID: <aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>

On 9/12/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Adam Olsen wrote:
>
> > That brings you back to how you access the flags variable.
>
> The existing signal handler sets a flag, doesn't it?
> So it couldn't be any more broken than the current
> implementation.
>
> If we get too paranoid about this, we'll just end
> up deciding that signals can't be used for anything,
> at all, ever. That doesn't seem very helpful,
> although technically I suppose it would solve
> the problem. :-)
>
> My own conclusion from all this is that if you
> can't rely on writing to a variable in one part
> of your program and reading it back in another,
> then computer architectures have become far
> too clever for their own good. :-(

They've been that way for a long, long time.  The irony is that x86 is
immensely stupid in this regard, and as a result most programmers
remain unaware of it.

Other architectures have much more interesting read/write and cache
reordering semantics, and the code is certainly broken there.  C
leaves it undefined with good reason.

My previous mention of using a *single* flag may survive corruption
simply because we can tolerate false positives.  Signal handlers would
write 0xFFFFFFFF, the poll loop would check if *any* bit is set.  If
so, write 0x0, read off the fd, then loop around and check it again.
If the start of the read() acts as a write-barrier it SHOULD guarantee
we don't miss any positive writes.

Hmm, if that works we should be able to generalize it for all the
other flags too.  Something to think about anyway...

-- 
Adam Olsen, aka Rhamphoryncus

From gjcarneiro at gmail.com  Tue Sep 12 19:15:48 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Tue, 12 Sep 2006 18:15:48 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
	<450632A7.40504@canterbury.ac.nz>
	<aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>
	<4506553D.1020307@canterbury.ac.nz>
	<aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>
Message-ID: <a467ca4f0609121015i6dd3b245o1db1eb9b87fc7fe7@mail.gmail.com>

On 9/12/06, Adam Olsen <rhamph at gmail.com> wrote:
> On 9/12/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > Adam Olsen wrote:
> >
> > > That brings you back to how you access the flags variable.
> >
> > The existing signal handler sets a flag, doesn't it?
> > So it couldn't be any more broken than the current
> > implementation.
> >
> > If we get too paranoid about this, we'll just end
> > up deciding that signals can't be used for anything,
> > at all, ever. That doesn't seem very helpful,
> > although technically I suppose it would solve
> > the problem. :-)
> >
> > My own conclusion from all this is that if you
> > can't rely on writing to a variable in one part
> > of your program and reading it back in another,
> > then computer architectures have become far
> > too clever for their own good. :-(
>
> They've been that way for a long, long time.  The irony is that x86 is
> immensely stupid in this regard, and as a result most programmers
> remain unaware of it.
>
> Other architectures have much more interesting read/write and cache
> reordering semantics, and the code is certainly broken there.  C
> leaves it undefined with good reason.
>
> My previous mention of using a *single* flag may survive corruption
> simply because we can tolerate false positives.  Signal handlers would
> write 0xFFFFFFFF, the poll loop would check if *any* bit is set.  If
> so, write 0x0, read off the fd, then loop around and check it again.
> If the start of the read() acts as a write-barrier it SHOULD guarantee
> we don't miss any positive writes.

  Why write 0xFFFFFFFF?  Why can't the variable be of a "volatile
char" type?  Assuming sizeof(char) == 1, please don't tell me
architecture XPTO will write the value 4 bits at a time! :P

  I see your point of using a flag to avoid the read() syscall most of
the time.  Slightly more complex, but possibly worth it.

  I was going to describe a possible race condition, then wrote the
code below to help explain it, modified it slightly, and now I think
the race is gone.  In any case, the code might be helpful to check if
we are in sync.  Let me know if you spot any  race condition I missed.


static volatile char signal_flag;
static int signal_pipe_r, signal_pipe_w;

PyErr_CheckSignals()
{
  if (signal_flag) {
     char signum;
     signal_flag = 0;
     while (read(signal_pipe_r, &signum, 1) == 1)
         process_signal(signum);
  }
}

static void
signal_handler(int signum)
{
   char signum_c = signum;
   signal_flag = 1;
   write(signal_pipe_w, &signum_c, 1);
}

From jcarlson at uci.edu  Tue Sep 12 19:37:54 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 12 Sep 2006 10:37:54 -0700
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>
References: <4506553D.1020307@canterbury.ac.nz>
	<aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>
Message-ID: <20060912090921.F918.JCARLSON@uci.edu>


"Adam Olsen" <rhamph at gmail.com> wrote:
> 
> On 9/12/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > Adam Olsen wrote:
> >
> > > That brings you back to how you access the flags variable.
> >
> > The existing signal handler sets a flag, doesn't it?
> > So it couldn't be any more broken than the current
> > implementation.
[snip]
> My previous mention of using a *single* flag may survive corruption
> simply because we can tolerate false positives.  Signal handlers would
> write 0xFFFFFFFF, the poll loop would check if *any* bit is set.  If
> so, write 0x0, read off the fd, then loop around and check it again.
> If the start of the read() acts as a write-barrier it SHOULD guarantee
> we don't miss any positive writes.
[snip]

I've been lurking on this thread for a while, but I'm thinking that just
a single file handle with a poll/read (if the poll succeeds) would be
fine.  So what if you miss a signal if there is a burst of signal
activity?  If users want a *good* IPC mechanism, then they can use any
one of the known-good IPC mechanisms defined for their platform (mmap,
named pipes, unnamed pipes, sockets (unix domain, udp, tcp), etc.), not
an IPC mechanism that has historically (at least in Python) been
generally unreliable.

Also, I wouldn't be surprised if the majority of signals are from the
set: SIGHUP, SIGTERM, SIGKILL, none of which should be coming in at a
high rate.


 - Josiah


From eric+python-dev at trueblade.com  Tue Sep 12 21:40:07 2006
From: eric+python-dev at trueblade.com (Eric V. Smith)
Date: Tue, 12 Sep 2006 15:40:07 -0400
Subject: [Python-Dev] datetime's strftime implementation: by design or
 bug
In-Reply-To: <200609121054.35576.anthony@interlink.com.au>
References: <4505AB91.6030908@trueblade.com>
	<200609121054.35576.anthony@interlink.com.au>
Message-ID: <45070D17.1090302@trueblade.com>

Anthony Baxter wrote:
> Please log a bug - this is probably something suitable for fixing in 2.5.1. At 
> the very least, if it's going to be limited to 127 characters, it should 
> check that and raise a more suitable exception. 

[First time sent from wrong address, sorry if this is a dupe.]

Done.  The patch is at http://python.org/sf/1557390.


From rhamph at gmail.com  Tue Sep 12 22:53:49 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 12 Sep 2006 14:53:49 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609121015i6dd3b245o1db1eb9b87fc7fe7@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
	<450632A7.40504@canterbury.ac.nz>
	<aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>
	<4506553D.1020307@canterbury.ac.nz>
	<aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>
	<a467ca4f0609121015i6dd3b245o1db1eb9b87fc7fe7@mail.gmail.com>
Message-ID: <aac2c7cb0609121353u2a4432eq35caaf522416ea34@mail.gmail.com>

On 9/12/06, Gustavo Carneiro <gjcarneiro at gmail.com> wrote:
> On 9/12/06, Adam Olsen <rhamph at gmail.com> wrote:
> > My previous mention of using a *single* flag may survive corruption
> > simply because we can tolerate false positives.  Signal handlers would
> > write 0xFFFFFFFF, the poll loop would check if *any* bit is set.  If
> > so, write 0x0, read off the fd, then loop around and check it again.
> > If the start of the read() acts as a write-barrier it SHOULD guarantee
> > we don't miss any positive writes.
>
>   Why write 0xFFFFFFFF?  Why can't the variable be of a "volatile
> char" type?  Assuming sizeof(char) == 1, please don't tell me
> architecture XPTO will write the value 4 bits at a time! :P

Nope.  It'll write 32 bits, then break that up into 8 bits :)
Although, at the moment I can't fathom what harm that would cause...

For the record, all volatile does is prevent compiler reordering
across sequence points.

Interestingly, it seems "volatile sig_atomic_t" is the correct way to
declare a variable for (single-threaded) signal handling.  Odd that
volatile didn't show up in any of the previous documentation I read..


>   I see your point of using a flag to avoid the read() syscall most of
> the time.  Slightly more complex, but possibly worth it.
>
>   I was going to describe a possible race condition, then wrote the
> code below to help explain it, modified it slightly, and now I think
> the race is gone.  In any case, the code might be helpful to check if
> we are in sync.  Let me know if you spot any  race condition I missed.
>
>
> static volatile char signal_flag;
> static int signal_pipe_r, signal_pipe_w;
>
> PyErr_CheckSignals()
> {
>   if (signal_flag) {
>      char signum;
>      signal_flag = 0;
>      while (read(signal_pipe_r, &signum, 1) == 1)
>          process_signal(signum);
>   }
> }

I'd prefer this to be a "while (signal_flag)" instead, although it
should technically work either way.


> static void
> signal_handler(int signum)
> {
>    char signum_c = signum;
>    signal_flag = 1;
>    write(signal_pipe_w, &signum_c, 1);
> }

This is wrong.  PyErr_CheckSignals could check and clear signal_flag
before you reach the write() call.  "signal_flag = 1" should come
after.

-- 
Adam Olsen, aka Rhamphoryncus

From martin at v.loewis.de  Tue Sep 12 23:38:39 2006
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 12 Sep 2006 23:38:39 +0200
Subject: [Python-Dev] Subversion 1.4
Message-ID: <450728DF.1020104@v.loewis.de>

As many of you probably know: Subversion 1.4 has been released.
It is safe to upgrade to this version, even if the repository
server (for us svn.python.org) stays at an older version: they
can interoperate just fine.

There is one major pitfall:

Subversion 1.4 changes the format of the working copy file structure
(.svn/format goes from 4 to 8). This new format is more efficient:
for a Python checkout, it saves about 15 MiB (out of 125 MiB). Also,
several operations (e.g. svn status) are faster. Subversion performs
a silent upgrade of the existing working copy on the first operation
(I believe on the first modifying operation).

However, this new format is not compatible with older clients; you
need 1.4 clients to access an upgraded working copy. So if you
use the same working copy with different clients (e.g. command
line and TortoiseSVN, or from different systems through NFS), you
either need to upgrade all clients, or else you should stay
away from 1.4. Alternatively, you can have different checkouts
for 1.3 and 1.4 clients, of course.

Just in case you didn't know.

Regards,
Martin

From rhamph at gmail.com  Wed Sep 13 01:03:54 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 12 Sep 2006 17:03:54 -0600
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609121015i6dd3b245o1db1eb9b87fc7fe7@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
	<450632A7.40504@canterbury.ac.nz>
	<aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>
	<4506553D.1020307@canterbury.ac.nz>
	<aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>
	<a467ca4f0609121015i6dd3b245o1db1eb9b87fc7fe7@mail.gmail.com>
Message-ID: <aac2c7cb0609121603o5395d1f0l98ab878f14a24323@mail.gmail.com>

On 9/12/06, Gustavo Carneiro <gjcarneiro at gmail.com> wrote:
> On 9/12/06, Adam Olsen <rhamph at gmail.com> wrote:
> > My previous mention of using a *single* flag may survive corruption
> > simply because we can tolerate false positives.  Signal handlers would
> > write 0xFFFFFFFF, the poll loop would check if *any* bit is set.  If
> > so, write 0x0, read off the fd, then loop around and check it again.
> > If the start of the read() acts as a write-barrier it SHOULD guarantee
> > we don't miss any positive writes.

> PyErr_CheckSignals()
> {
>   if (signal_flag) {
>      char signum;
>      signal_flag = 0;
>      while (read(signal_pipe_r, &signum, 1) == 1)
>          process_signal(signum);
>   }
> }

The more I think about this the less I like relying on read() imposing
a hardware write barrier.  Unless somebody can say otherwise, I think
we'd be better off putting dummy
PyThread_acquire_lock/PyThread_release_lock calls in there.
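
I.e. something like this on the reading side (just a sketch; the lock
would be allocated once with PyThread_allocate_lock() at startup):

#include "pythread.h"

static PyThread_type_lock barrier_lock;

/* Acquiring and immediately releasing an uncontended lock is a portable,
   if heavyweight, way to get memory-barrier semantics, because the lock
   primitives have to issue the barriers themselves. */
static void signal_memory_barrier(void)
{
    PyThread_acquire_lock(barrier_lock, WAIT_LOCK);
    PyThread_release_lock(barrier_lock);
}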

-- 
Adam Olsen, aka Rhamphoryncus

From martin at v.loewis.de  Wed Sep 13 05:36:34 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 13 Sep 2006 05:36:34 +0200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <E1GJaRL-0003Pm-KT@virgo.cus.cam.ac.uk>
References: <E1GJaRL-0003Pm-KT@virgo.cus.cam.ac.uk>
Message-ID: <45077CC2.9070601@v.loewis.de>

Nick Maclaren schrieb:
>> (coment by Arjan van de Ven):
>> | afaik the kernel only sends signals to threads that don't have them blocked.
>> | If python doesn't want anyone but the main thread to get signals, it
>> should just
>> | block signals on all but the main thread and then by nature, all
>> signals will go
>> | to the main thread....
> 
> Well, THAT'S wrong, I am afraid!  Things ain't that simple :-(
> 
> Yes, POSIX implies that things work that way, but there are so many
> get-out clauses and problems with trying to implement that specification
> that such behaviour can't be relied on.

Can you please give one example for each (one get-out clause, and
one problem with trying to implement that).

I fail to see why it isn't desirable to make all signals occur
in the main thread, on systems where this is possible.

Regards,
Martin

From martin at v.loewis.de  Wed Sep 13 05:38:08 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 13 Sep 2006 05:38:08 +0200
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <2mpseboj26.fsf@starship.python.net>
References: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
	<2mpseboj26.fsf@starship.python.net>
Message-ID: <45077D20.5070400@v.loewis.de>

Michael Hudson schrieb:
>> According to [1], all python needs to do to avoid this problem is
>> block all signals in all but the main thread;
> 
> Argh, no: then people who call system() from non-main threads end up
> running subprocesses with all signals masked, which breaks other
> things in very mysterious ways.  Been there...

Python should register a pthread_atfork handler then, which clears
the signal mask. Would that not work?
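
I.e. something like this (a sketch; error handling omitted):

#include <pthread.h>
#include <signal.h>

/* Runs in the child right after fork(): drop whatever mask the forking
   (non-main) thread had, so exec'ed programs start with nothing blocked. */
static void unblock_signals_in_child(void)
{
    sigset_t empty;
    sigemptyset(&empty);
    pthread_sigmask(SIG_SETMASK, &empty, NULL);
}

static void install_atfork_handler(void)
{
    pthread_atfork(NULL, NULL, unblock_signals_in_child);
}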

Regards,
Martin


From mwh at python.net  Wed Sep 13 10:14:42 2006
From: mwh at python.net (Michael Hudson)
Date: Wed, 13 Sep 2006 09:14:42 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <45077D20.5070400@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Wed,
	13 Sep 2006 05:38:08 +0200")
References: <a467ca4f0609020510t2da5a1dbwa82e01d299befebd@mail.gmail.com>
	<2mpseboj26.fsf@starship.python.net> <45077D20.5070400@v.loewis.de>
Message-ID: <2mlkookwst.fsf@starship.python.net>

"Martin v. L?wis" <martin at v.loewis.de> writes:

> Michael Hudson schrieb:
>>> According to [1], all python needs to do to avoid this problem is
>>> block all signals in all but the main thread;
>> 
>> Argh, no: then people who call system() from non-main threads end up
>> running subprocesses with all signals masked, which breaks other
>> things in very mysterious ways.  Been there...
>
> Python should register a pthread_atfork handler then, which clears
> the signal mask. Would that not work?

Not for system() at least:

http://mail.python.org/pipermail/python-dev/2003-December/041303.html

Cheers,
mwh

-- 
  ROOSTA:  Ever since you arrived on this planet last night you've
           been going round telling people that you're Zaphod
           Beeblebrox, but that they're not to tell anyone else.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 7

From anthony at interlink.com.au  Wed Sep 13 13:57:40 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Wed, 13 Sep 2006 21:57:40 +1000
Subject: [Python-Dev] RELEASED Python 2.5 (release candidate 2)
Message-ID: <200609132157.44342.anthony@interlink.com.au>

On behalf of the Python development team and the Python
community, I'm happy to announce the second RELEASE
CANDIDATE of Python 2.5.

After the first release candidate a number of new bugfixes
have been applied to the Python 2.5 code. In the interests
of making 2.5 the best release possible, we've decided to
put out a second (and hopefully last) release candidate. We
plan for a 2.5 final in a week's time.

This is not yet the final release - it is not suitable for
production use. It is being released to solicit feedback
and hopefully expose bugs, as well as allowing you to
determine how changes in 2.5 might impact you. As a release
candidate, this is one of your last chances to test the new
code in 2.5 before the final release. *Please* try this
release out and let us know about any problems you find.

In particular, note that changes to improve Python's support
of 64 bit systems might require authors of C extensions
to change their code. More information (as well as source
distributions and Windows and Universal Mac OS X installers)
is available from the 2.5 website:

    http://www.python.org/2.5/

As of this release, Python 2.5 is now in *feature freeze*.
Unless absolutely necessary, no functionality changes will
be made between now and the final release of Python 2.5.

The new features in Python 2.5 are described in Andrew
Kuchling's What's New In Python 2.5. It's available from the
2.5 web page.

Amongst the language features added are conditional
expressions, the with statement, the merge of try/except
and try/finally into try/except/finally, enhancements to
generators to produce a coroutine kind of functionality, and
a brand new AST-based compiler implementation.

New modules added include hashlib, ElementTree, sqlite3,
wsgiref, uuid and ctypes. In addition, a new profiling
module "cProfile" was added.

Enjoy this new release,
Anthony

Anthony Baxter
anthony at python.org
Python Release Manager
(on behalf of the entire python-dev team)

From gjcarneiro at gmail.com  Wed Sep 13 15:17:03 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Wed, 13 Sep 2006 14:17:03 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <aac2c7cb0609121353u2a4432eq35caaf522416ea34@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
	<450632A7.40504@canterbury.ac.nz>
	<aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>
	<4506553D.1020307@canterbury.ac.nz>
	<aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>
	<a467ca4f0609121015i6dd3b245o1db1eb9b87fc7fe7@mail.gmail.com>
	<aac2c7cb0609121353u2a4432eq35caaf522416ea34@mail.gmail.com>
Message-ID: <a467ca4f0609130617v450820dawb5f9ff1d69f41275@mail.gmail.com>

On 9/12/06, Adam Olsen <rhamph at gmail.com> wrote:
> On 9/12/06, Gustavo Carneiro <gjcarneiro at gmail.com> wrote:
> > On 9/12/06, Adam Olsen <rhamph at gmail.com> wrote:
> > > My previous mention of using a *single* flag may survive corruption
> > > simply because we can tolerate false positives.  Signal handlers would
> > > write 0xFFFFFFFF, the poll loop would check if *any* bit is set.  If
> > > so, write 0x0, read off the fd, then loop around and check it again.
> > > If the start of the read() acts as a write-barrier it SHOULD guarantee
> > > we don't miss any positive writes.
> >
> >   Why write 0xFFFFFFFF?  Why can't the variable be of a "volatile
> > char" type?  Assuming sizeof(char) == 1, please don't tell me
> > architecture XPTO will write the value 4 bits at a time! :P
>
> Nope.  It'll write 32 bits, then break that up into 8 bits :)
> Although, at the moment I can't fathom what harm that would cause...

  Hmm... it means that to write those 8 bits the processor / compiler
may need to 1. read 32 bits from memory to a register, 2. modify 8
bits of the register, 3. write back those 32 bits.  Shouldn't affect
our case, but maybe it's better to use sig_atomic_t in any case.

> For the record, all volatile does is prevent compiler reordering
> across sequence points.

  It makes the compiler aware that the value may change at any time,
outside the current context/function, so it always re-reads it from
memory instead of assuming that a previously loaded register value is
still correct.

> > static volatile char signal_flag;
> > static int signal_pipe_r, signal_pipe_w;
> >
> > PyErr_CheckSignals()
> > {
> >   if (signal_flag) {
> >      char signum;
> >      signal_flag = 0;
> >      while (read(signal_pipe_r, &signum, 1) == 1)
> >          process_signal(signum);
> >   }
> > }
>
> I'd prefer this to be a "while (signal_flag)" instead, although it
> should technically work either way.

  I guess we can use while instead of if.

>
> > static void
> > signal_handler(int signum)
> > {
> >    char signum_c = signum;
> >    signal_flag = 1;
> >    write(signal_pipe_w, &signum_c, 1);
> > }
>
> This is wrong.  PyErr_CheckSignals could check and clear signal_flag
> before you reach the write() call.  "signal_flag = 1" should come
> after.

  Yes, good catch.

  I don't understand the memory barrier concern in your other email.
I know little on the subject, but from what I could find out, memory
barriers are used to avoid reordering of multiple read and write
operations.  In this case, however, we have a single value at stake;
there's nothing to reorder.

  Except perhaps that "signal_flag = 0" could be delayed... If it is
delayed until after the while (read (...)...) loop below we could get
in trouble. I see your point now... :|

  But I think that a system call has to act as a memory barrier,
because the CPU has to jump into kernel space, a completely different
context, so it _has_ to flush pending memory operations sooner or
later.

Round two:

#include <unistd.h>     /* read(), write() */
#include <signal.h>     /* sig_atomic_t */

static volatile sig_atomic_t signal_flag;
static int signal_pipe_r, signal_pipe_w;   /* signal_pipe_r assumed non-blocking */

int
PyErr_CheckSignals(void)
{
   while (signal_flag) {
     char signum;
     signal_flag = 0;
     /* drain the pipe; each byte is one pending signal number */
     while (read(signal_pipe_r, &signum, 1) == 1)
         process_signal(signum);
   }
   return 0;
}

static void
signal_handler(int signum)
{
    char signum_c = signum;
    /* write the byte first, then raise the flag, so PyErr_CheckSignals
       cannot clear the flag before the byte is in the pipe */
    write(signal_pipe_w, &signum_c, 1);
    signal_flag = 1;
}

-- 
Gustavo J. A. M. Carneiro
"The universe is always one step beyond logic."

From skip at pobox.com  Wed Sep 13 19:46:39 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 13 Sep 2006 12:46:39 -0500
Subject: [Python-Dev] Maybe we should have a C++ extension for testing...
Message-ID: <17672.17407.88122.884957@montanaro.dyndns.org>


Building Python with C and then linking in extensions written in or wrapped
with C++ can present problems, at least in some situations.  I don't know if
it's kosher to build that way, but folks do.  We're bumping into such
problems at work using Solaris 10 and Python 2.4 (building matplotlib, which
is largely written in C++), and it appears others have similar problems:

    http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6395191
    http://mail.python.org/pipermail/patches/2005-June/017820.html
    http://mail.python.org/pipermail/python-bugs-list/2005-November/030900.html

I attached a comment to the third item yesterday (even though it was
closed).

One of our C++ gurus (that's definitely not me!) patched the Python source
to include <wchar.h> at the top of Python.h.  That seems to have solved our
problems, but it only treats the symptom.  I got to thinking: should we
a) encourage people to compile Python with a C++ compiler if most/all of
their extensions are written in C++ anyway (does that even work if one or
more extensions are written in C?), or b) should the standard distribution
maybe include a toy extension written in C++ whose sole purpose is to test
for cross-language problems?

Either/or/neither/something else?
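
For what it's worth, option b) wouldn't need much on the build side; a
minimal sketch, with made-up names and a hypothetical one-file C++
source, might look like:

# setup.py -- build a toy C++ extension purely to exercise cross-language linking
from distutils.core import setup, Extension

setup(name='_testcpp',
      version='0.0',
      ext_modules=[Extension('_testcpp',
                             sources=['_testcpp.cpp'],  # hypothetical C++ source file
                             language='c++')])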

Skip

From dinov at exchange.microsoft.com  Wed Sep 13 20:05:32 2006
From: dinov at exchange.microsoft.com (Dino Viehland)
Date: Wed, 13 Sep 2006 11:05:32 -0700
Subject: [Python-Dev] .pyc file has different result for value
 "1.79769313486232e+308" than .py file
Message-ID: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>

We've noticed a strange occurrence on Python 2.4.3 w/ the floating point value 1.79769313486232e+308 and how it interacts w/ a .pyc.  Given x.py:

def foo():
        print str(1.79769313486232e+308)
        print str(1.79769313486232e+308) == "1.#INF"


The 1st time you run this you get the correct value, but if you reload the module after a .pyc is created then you get different results (and the generated byte code appears to have changed).

Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import x
>>> import dis
>>> dis.dis(x.foo)
  2           0 LOAD_GLOBAL              0 (str)
              3 LOAD_CONST               1 (1.#INF)
              6 CALL_FUNCTION            1
              9 PRINT_ITEM
             10 PRINT_NEWLINE

  3          11 LOAD_GLOBAL              0 (str)
             14 LOAD_CONST               1 (1.#INF)
             17 CALL_FUNCTION            1
             20 LOAD_CONST               2 ('1.#INF')
             23 COMPARE_OP               2 (==)
             26 PRINT_ITEM
             27 PRINT_NEWLINE
             28 LOAD_CONST               0 (None)
             31 RETURN_VALUE
>>> reload(x)
<module 'x' from 'x.pyc'>
>>> dis.dis(x.foo)
  2           0 LOAD_GLOBAL              0 (str)
              3 LOAD_CONST               1 (1.0)
              6 CALL_FUNCTION            1
              9 PRINT_ITEM
             10 PRINT_NEWLINE

  3          11 LOAD_GLOBAL              0 (str)
             14 LOAD_CONST               1 (1.0)
             17 CALL_FUNCTION            1
             20 LOAD_CONST               2 ('1.#INF')
             23 COMPARE_OP               2 (==)
             26 PRINT_ITEM
             27 PRINT_NEWLINE
             28 LOAD_CONST               0 (None)
             31 RETURN_VALUE
>>> ^Z

From tim.peters at gmail.com  Wed Sep 13 20:39:26 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 13 Sep 2006 14:39:26 -0400
Subject: [Python-Dev] .pyc file has different result for value
	"1.79769313486232e+308" than .py file
In-Reply-To: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <1f7befae0609131139v375c9492g6c464e164aa74f9b@mail.gmail.com>

[Dino Viehland]
> We've noticed a strange occurance on Python 2.4.3 w/ the floating point
> value 1.79769313486232e+308 and how it interacts w/ a .pyc.  Given x.py:
>
> def foo():
>         print str(1.79769313486232e+308)
>         print str(1.79769313486232e+308) == "1.#INF"
>
>
> The 1st time you run this you get the correct value, but if you reload the module
> after a .pyc is created then you get different results (and the generated byte code
> appears to have changed).
> ...

Exhaustively explained in this recent thread:

http://mail.python.org/pipermail/python-list/2006-August/355986.html

From dinov at exchange.microsoft.com  Thu Sep 14 00:04:46 2006
From: dinov at exchange.microsoft.com (Dino Viehland)
Date: Wed, 13 Sep 2006 15:04:46 -0700
Subject: [Python-Dev] .pyc file has different result for value
 "1.79769313486232e+308" than .py file
In-Reply-To: <1f7befae0609131139v375c9492g6c464e164aa74f9b@mail.gmail.com>
References: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<1f7befae0609131139v375c9492g6c464e164aa74f9b@mail.gmail.com>
Message-ID: <7AD436E4270DD54A94238001769C22273E9618FB88@DF-GRTDANE-MSG.exchange.corp.microsoft.com>

Thanks for the link - it's a good explanation.

FYI I've opened a bug against the VC++ team to fix their round tripping on floating point values (doesn't sound like it'll make the next release, but hopefully it'll make it someday).

-----Original Message-----
From: Tim Peters [mailto:tim.peters at gmail.com]
Sent: Wednesday, September 13, 2006 11:39 AM
To: Dino Viehland
Cc: python-dev at python.org; Haibo Luo
Subject: Re: [Python-Dev] .pyc file has different result for value "1.79769313486232e+308" than .py file

[Dino Viehland]
> We've noticed a strange occurance on Python 2.4.3 w/ the floating
> point value 1.79769313486232e+308 and how it interacts w/ a .pyc.  Given x.py:
>
> def foo():
>         print str(1.79769313486232e+308)
>         print str(1.79769313486232e+308) == "1.#INF"
>
>
> The 1st time you run this you get the correct value, but if you reload
> the module after a .pyc is created then you get different results (and
> the generated byte code appears to have changed).
> ...

Exhaustively explained in this recent thread:

http://mail.python.org/pipermail/python-list/2006-August/355986.html

From tim.peters at gmail.com  Thu Sep 14 00:29:44 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 13 Sep 2006 18:29:44 -0400
Subject: [Python-Dev] .pyc file has different result for value
	"1.79769313486232e+308" than .py file
In-Reply-To: <7AD436E4270DD54A94238001769C22273E9618FB88@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C22273E9618F9D8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
	<1f7befae0609131139v375c9492g6c464e164aa74f9b@mail.gmail.com>
	<7AD436E4270DD54A94238001769C22273E9618FB88@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <1f7befae0609131529t353b9986t1e185afe8dc61a28@mail.gmail.com>

[Dino Viehland]
> FYI I've opened a bug against the VC++ team to fix their round tripping on floating
> point values (doesn't sound like it'll make the next release, but hopefully it'll make it
> someday).

Cool!  That would be helpful to many languages implemented in C/C++
relying on the platform {float, double}<->string library routines.

Note that the latest revision of the C standard ("C99") specifies
strings for infinities and NaNs that conforming implementations must
accept (for example, "inf").  It would be nice to accept those too,
for portability; "most" Python platforms already do.  In fact, this is
the primary reason people running on, e.g., Linux, resist upgrading to
Windows ;-)
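
A hedged way to see what your platform's string->double routine accepts
(on C libraries that don't take the C99 spellings this simply raises
ValueError):

for s in ("inf", "-inf", "nan"):
    try:
        print s, "->", float(s)
    except ValueError:
        print s, "-> rejected by this platform's strtod"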

From anthony at interlink.com.au  Thu Sep 14 02:58:08 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu, 14 Sep 2006 10:58:08 +1000
Subject: [Python-Dev] release is done,
	but release25-maint branch remains near-frozen
Message-ID: <200609141058.13625.anthony@interlink.com.au>

Ok - we're looking at a final release in 7 days' time. I really, really, really 
don't want to have to cut an rc3, so unless it's a seriously critical 
brown-paper-bag bug, let's hold off on the checkins. Documentation, I don't 
mind so much - particularly fixes for any formatting errors.


-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From nnorwitz at gmail.com  Thu Sep 14 09:48:57 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 14 Sep 2006 00:48:57 -0700
Subject: [Python-Dev] fun threading problem
Message-ID: <ee2a432c0609140048m45187de3if550dbd2a8d98998@mail.gmail.com>

On everyone's favorite platform (HP-UX), the following code consistently fails:

###
from thread import start_new_thread, allocate_lock
from time import sleep

def bootstrap():
    from os import fork ; fork()
    allocate_lock().acquire()

start_new_thread(bootstrap, ())
sleep(.1)
###

The error is:
Fatal Python error: Invalid thread state for this thread

This code was whittled down from test_socketserver which fails in the
same way.  It doesn't matter what value is passed to sleep as long as
it's greater than 0.  I also tried changing the sleep to a while 1:
pass and the same problem occurred.  So there isn't a huge interaction
of APIs, only:  fork, allocate_lock.acquire and start_new_thread.

HP-UX seems to be more sensitive to various threading issues.  In
Modules/_testcapimodule.c, I had to make this modification:

Index: Modules/_testcapimodule.c
===================================================================
--- Modules/_testcapimodule.c   (revision 51875)
+++ Modules/_testcapimodule.c   (working copy)
@@ -665,6 +665,9 @@
        PyThread_acquire_lock(thread_done, 1);  /* wait for thread to finish */
        Py_END_ALLOW_THREADS

+       /* Release lock we acquired above.  This is required on HP-UX. */
+       PyThread_release_lock(thread_done);
+
        PyThread_free_lock(thread_done);
        Py_RETURN_NONE;
 }

Without that patch, there would be this error:

sem_destroy: Device busy
sem_init: Device busy
Fatal Python error: UNREF invalid object
ABORT instruction (core dumped)

Anyone have any ideas?

n

From aahz at pythoncraft.com  Thu Sep 14 16:31:14 2006
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 14 Sep 2006 07:31:14 -0700
Subject: [Python-Dev] fun threading problem
In-Reply-To: <ee2a432c0609140048m45187de3if550dbd2a8d98998@mail.gmail.com>
References: <ee2a432c0609140048m45187de3if550dbd2a8d98998@mail.gmail.com>
Message-ID: <20060914143114.GB20596@panix.com>

On Thu, Sep 14, 2006, Neal Norwitz wrote:
>
> On everyones favorite platform (HP-UX), the following code
> consistently fails:

Which exact HP-UX?  I remember from my ancient days that each HP-UX
version completely changes the way threading works -- dunno whether
that's still true.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"LL YR VWL R BLNG T S"  -- www.nancybuttons.com

From kbk at shore.net  Fri Sep 15 04:59:08 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Thu, 14 Sep 2006 22:59:08 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200609150259.k8F2x8cT031149@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  416 open ( +3) /  3408 closed ( +1) /  3824 total ( +4)
Bugs    :  898 open ( +1) /  6180 closed (+13) /  7078 total (+14)
RFE     :  234 open ( +0) /   238 closed ( +0) /   472 total ( +0)

New / Reopened Patches
______________________

email parser incorrectly breaks headers with a CRLF at 8192  (2006-09-10)
       http://python.org/sf/1555570  opened by  Tony Meyer

datetime's strftime limits strings to 127 chars  (2006-09-12)
       http://python.org/sf/1557390  opened by  Eric V. Smith

Add RLIMIT_SBSIZE to resource module  (2006-09-12)
       http://python.org/sf/1557515  opened by  Eric Huss

missing imports ctypes in documentation examples  (2006-09-13)
       http://python.org/sf/1557890  opened by  Daniele Varrazzo

Patches Closed
______________

UserDict New Style  (2006-09-08)
       http://python.org/sf/1555097  closed by  rhettinger

New / Reopened Bugs
___________________

Bug in the match function  (2006-09-09)
CLOSED http://python.org/sf/1555496  opened by  wojtekwu

Please include pliblist for all plattforms  (2006-09-09)
       http://python.org/sf/1555501  opened by  Guido Guenther

sgmllib should allow angle brackets in quoted values  (2006-06-11)
       http://python.org/sf/1504333  reopened by  nnorwitz

Move fpectl elsewhere in library reference  (2006-09-11)
       http://python.org/sf/1556261  opened by  Michael Hoffman

datetime's strftime limits strings to 127 chars  (2006-09-11)
       http://python.org/sf/1556784  opened by  Eric V. Smith

datetime's strftime limits strings to 127 chars  (2006-09-12)
CLOSED http://python.org/sf/1557037  opened by  Eric V. Smith

typo in encoding name in email package  (2006-09-12)
       http://python.org/sf/1556895  opened by  Guillaume Rousse

2.5c1 Core dump during 64-bit make on Solaris 9 Sparc  (2006-09-12)
       http://python.org/sf/1557490  opened by  Tony Bigbee

xlc 6 does not like bufferobject.c line22  (2006-09-13)
       http://python.org/sf/1557983  opened by  prueba uno

apache2 - mod_python - python2.4 core dump  (2006-09-14)
CLOSED http://python.org/sf/1558223  opened by  ThurnerRupert

Tru64 make install failure  (2006-09-14)
       http://python.org/sf/1558802  opened by  Ralf W. Grosse-Kunstleve

2.5c2 macosx installer aborts during "GUI Applications"  (2006-09-14)
       http://python.org/sf/1558983  opened by  Evan

Bugs Closed
___________

datetime.datetime.now() mangles tzinfo  (2006-09-06)
       http://python.org/sf/1553577  closed by  nnorwitz

__unicode__ breaks for exception class objects  (2006-09-03)
       http://python.org/sf/1551432  closed by  bcannon

Bug in the match function  (2006-09-09)
       http://python.org/sf/1555496  closed by  tim_one

Recently introduced sgmllib regexp bug hangs Python  (2006-08-16)
       http://python.org/sf/1541697  closed by  nnorwitz

logging.handlers.RotatingFileHandler - inconsistent mode  (2006-09-06)
       http://python.org/sf/1553496  closed by  vsajip

datetime's strftime limits strings to 127 chars  (2006-09-12)
       http://python.org/sf/1557037  closed by  ericvsmith

apache2 - mod_python - python2.4 core dump  (2006-09-13)
       http://python.org/sf/1558223  closed by  nnorwitz


From nnorwitz at gmail.com  Fri Sep 15 07:51:54 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 14 Sep 2006 22:51:54 -0700
Subject: [Python-Dev] fun threading problem
In-Reply-To: <20060914143114.GB20596@panix.com>
References: <ee2a432c0609140048m45187de3if550dbd2a8d98998@mail.gmail.com>
	<20060914143114.GB20596@panix.com>
Message-ID: <ee2a432c0609142251o69e34d22mf25cbc60ae2ce7f0@mail.gmail.com>

On 9/14/06, Aahz <aahz at pythoncraft.com> wrote:
> On Thu, Sep 14, 2006, Neal Norwitz wrote:
> >
> > On everyones favorite platform (HP-UX), the following code
> > consistently fails:
>
> Which exact HP-UX?  I remember from my ancient days that each HP-UX
> version completely changes the way threading works -- dunno whether
> that's still true.

 HP-UX 11i v2 on PA-RISC

td191 on http://www.testdrive.hp.com/current.shtml

From sanxiyn at gmail.com  Wed Sep 13 09:46:05 2006
From: sanxiyn at gmail.com (Sanghyeon Seo)
Date: Wed, 13 Sep 2006 16:46:05 +0900
Subject: [Python-Dev] IronPython and AST branch
Message-ID: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com>

CPython 2.5, which will be released Real Soon Now, is the first
version to ship with the new "AST branch", which has been in development
for a long time.

The AST branch uses ASDL, the Abstract Syntax Description Language
(http://asdl.sourceforge.net/), to describe the Abstract Syntax Tree
data structure used by the CPython compiler. In theory this is language
independent, and the same file could be used to generate C# source
files.

Having the same AST across Python implementations would be good for
applications and libraries that use an implementation's internal
parser and compiler. Currently, code using CPython's parser module or
compiler package can't be easily ported to IronPython.

What do you think?

-- 
Seo Sanghyeon

From dan.eloff at gmail.com  Wed Sep 13 23:14:46 2006
From: dan.eloff at gmail.com (Dan Eloff)
Date: Wed, 13 Sep 2006 16:14:46 -0500
Subject: [Python-Dev] Thank you all
Message-ID: <4817b6fc0609131414j60c50400r13df42e2c6abf38e@mail.gmail.com>

I was just browsing what's new in Python 2.5 at
http://docs.python.org/dev/whatsnew/

As I was reading I found myself thinking how almost every improvement
made a programming task I commonly bump into a little easier. Take the
with statement, or the new partition method for strings, or the
defaultdict (which I think was previously available, but I only now
realized what it does), or the unified try/except/finally, or the
conditional expression, etc

Then I remembered my reaction was much like that when python 2.4 was
released, and before that when Python 2.3 was released.

Every time a new version of python rolls around, my life gets a little easier.

I just want to say thank you, very much, from the bottom of my heart,
to everyone here who chooses to spend some of their free time working
on improving Python. Whether it be fixing bugs, writing documentation,
optimizing things, or adding new/updating modules or features, I want
you all to know I really appreciate your efforts. Your hard work has
long ago made Python into my favourite programming language, and the
gap only continues to grow. I think most people here and on
comp.lang.python feel the same way. It's just too often that people
(me) will find the 1% of things that aren't quite right and will focus
on that, rather than look at the 99% of things that are done very
well. So now, while I'm thinking about it, I want to take the
opportunity to say thank you for the 99% of Python that all of you
have done such a good job on.

-Dan

From martin at v.loewis.de  Sat Sep 16 08:39:07 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 16 Sep 2006 08:39:07 +0200
Subject: [Python-Dev] Thank you all
In-Reply-To: <4817b6fc0609131414j60c50400r13df42e2c6abf38e@mail.gmail.com>
References: <4817b6fc0609131414j60c50400r13df42e2c6abf38e@mail.gmail.com>
Message-ID: <450B9C0B.4070906@v.loewis.de>

Dan Eloff schrieb:
> I just want to say thank you, very much, from the bottom of my heart,
> to everyone here who chooses to spend some of their free time working
> on improving Python.

Hi Dan,

I can't really speak for all the other contributors (but maybe in
this case I can): Thanks for the kind words. While we know in
principle that many users appreciate our work, it is heartening
to actually hear (or read) the praise.

Regards,
Martin

From arigo at tunes.org  Sat Sep 16 13:11:11 2006
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 16 Sep 2006 13:11:11 +0200
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
Message-ID: <20060916111111.GA27757@code0.codespeak.net>

Hi all,

There are more cases of signed integer overflows in the CPython source
code base...

That's on a 64-bit machine:

    [GCC 4.1.2 20060715 (prerelease) (Debian 4.1.1-9)] on linux2
    abs(-sys.maxint-1) == -sys.maxint-1

I'd expect the same breakage everywhere when GCC 4.2 is used.  Note that
the above is Python 2.4.4c0 - apparently Python 2.3 compiled with GCC
4.1.2 works, although that doesn't make much sense to me because
intobject.c didn't change here - 2.3, 2.4, 2.5, trunk are all the same.
Both tested Pythons are Debian packages, not in-house compiled.

Humpf!  Looks like one or two people need to do a quick last-minute
review of all places trying to deal with -sys.maxint-1, and replace them
all with the "official" fix from Tim [SF 1545668].


A bientot,

Armin

From fabiofz at gmail.com  Sat Sep 16 18:49:01 2006
From: fabiofz at gmail.com (Fabio Zadrozny)
Date: Sat, 16 Sep 2006 13:49:01 -0300
Subject: [Python-Dev] Grammar change in classdef
Message-ID: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>

I've been porting the grammar for pydev to version 2.5 and I've seen
that you can now declare a class in the format: class B():pass
(without the testlist)

-- from the grammar: classdef: 'class' NAME ['(' [testlist] ')'] ':' suite

I think that this change should be presented at
http://docs.python.org/dev/whatsnew/whatsnew25.html

I'm saying that because I've only stumbled upon it by accident -- and
I wasn't able to find any explanation on the reason or semantics of
the change...

Thanks,

Fabio

From l.oluyede at gmail.com  Sat Sep 16 18:58:13 2006
From: l.oluyede at gmail.com (Lawrence Oluyede)
Date: Sat, 16 Sep 2006 18:58:13 +0200
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
Message-ID: <9eebf5740609160958n6fa7ae80hd7d7c737166e6367@mail.gmail.com>

> I think that this change should be presented at
> http://docs.python.org/dev/whatsnew/whatsnew25.html

It's already listed there: http://docs.python.org/dev/whatsnew/other-lang.html


-- 
Lawrence
http://www.oluyede.org/blog

From martin at v.loewis.de  Sat Sep 16 19:22:34 2006
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 16 Sep 2006 19:22:34 +0200
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the path
Message-ID: <450C32DA.9030601@v.loewis.de>

The test suite currently (2.5) has two failures on Windows
if Python is installed into a directory with a space in it
(such as "Program Files"). The failing tests are test_popen
and test_cmd_line.

The test_cmd_line failure is shallow: the test fails to properly
quote sys.executable when passing it to os.popen. I propose to
fix this in Python 2.5.1; see #1559413

test_popen is more tricky. This code has always failed AFAICT,
except that the test itself is a recent addition. The test tries
to pass the following command to os.popen

"c:\Program Files\python25\python.exe" -c "import sys;print sys.version"

For some reason, os.popen doesn't directly start Python as
a new process, but instead invokes

cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print
sys.version"

Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC)
in os.popen?

In any case, cmd.exe fails to execute this, claiming that c:\Program
is not a valid executable. It would run

cmd.exe /c "c:\Program Files\python25\python.exe"

just fine, so apparently the problem is with arguments that have
multiple pairs of quotes. I found, through experimentation, that it
*will* accept

cmd.exe /c ""c:\Program Files\python25\python.exe" -c "import sys;print
sys.version""

(i.e. doubling the quotes at the beginning and the end). I'm not quite
sure what algorithm cmd.exe uses for parsing, but it appears that
adding a pair of quotes works in all cases (at least those I could think
of). See # 1559298

Here are my questions:
1. Should this be fixed before the final release of Python 2.5?
2. If not, should it be fixed in Python 2.5.1? I'd say not: there
   is a potential of breaking existing applications. Applications
   might be aware of this mess, and deliberately add a pair of
   quotes already. If popen then adds yet another pair of quotes,
   cmd.exe will again fail.
3. If not, should this be fixed in 2.6 in the way I propose in
   the patch (i.e. add quotes around the command line)?
   Or can anybody propose a different fix?
4. How should we deal with different values of COMSPEC? Should
   this patch only apply for cmd.exe, or should we assume that
   other shells are quirk-compatible with cmd.exe in this
   respect (or that people stopped setting COMSPEC, anyway)?

Any comments appreciated,

Martin

From ncoghlan at gmail.com  Sat Sep 16 19:28:48 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Sep 2006 03:28:48 +1000
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
Message-ID: <450C3450.4010401@gmail.com>

Fabio Zadrozny wrote:
> I've been porting the grammar for pydev to version 2.5 and I've seen
> that you can now declare a class in the format: class B():pass
> (without the testlist)
> 
> -- from the grammar: classdef: 'class' NAME ['(' [testlist] ')'] ':' suite
> 
> I think that this change should be presented at
> http://docs.python.org/dev/whatsnew/whatsnew25.html
> 
> I'm saying that because I've only stumbled upon it by accident -- and
> I wasn't able to find any explanation on the reason or semantics of
> the change...

Lawrence already noted that this is covered by the What's New document 
(semantically, it's identical to omitting the parentheses entirely).

As for the reason: it makes it possible to use the same style for classes 
without bases as is used for functions without arguments. Prior to this 
change, there was a sharp break in the class syntax, such that if you got rid 
of the last base class you had to get rid of the parentheses as well.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From exarkun at divmod.com  Sat Sep 16 19:38:06 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Sat, 16 Sep 2006 13:38:06 -0400
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the
 path
In-Reply-To: <450C32DA.9030601@v.loewis.de>
Message-ID: <20060916173806.1717.491882186.divmod.quotient.51076@ohm>

On Sat, 16 Sep 2006 19:22:34 +0200, "\"Martin v. Löwis\"" <martin at v.loewis.de> wrote:
>The test suite currently (2.5) has two failures on Windows
>if Python is installed into a directory with a space in it
>(such as "Program Files"). The failing tests are test_popen
>and test_cmd_line.
>
>The test_cmd_line failure is shallow: the test fails to properly
>quote sys.executable when passing it to os.popen. I propose to
>fix this in Python 2.5.1; see #1559413
>
>test_popen is more tricky. This code has always failed AFAICT,
>except that the test itself is a recent addition. The test tries
>to pass the following command to os.popen
>
>"c:\Program Files\python25\python.exe" -c "import sys;print sys.version"
>
>For some reason, os.popen invokes doesn't directly start Python as
>a new process, but install invokes
>
>cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print
>sys.version"
>
>Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC)
>in os.popen?

I would guess it was done to force cmd.exe-style argument parsing in the
subprocess, which is optional on Win32.

>
>In any case, cmd.exe fails to execute this, claiming that c:\Program
>is not a valid executable. It would run
>
>cmd.exe /c "c:\Program Files\python25\python.exe"
>
>just fine, so apparently, the problem is with argument that have
>multiple pairs of quotes. I found, through experimentation, that it
>*will* accept
>
>cmd.exe /c ""c:\Program Files\python25\python.exe" -c "import sys;print
>sys.version""
>
>(i.e. doubling the quotes at the beginning and the end). I'm not quite
>sure what algorithm cmd.exe uses for parsing, but it appears that
>adding a pair of quotes works in all cases (at least those I could think
>of). See # 1559298

You can find the quoting/dequoting rules used by cmd.exe documented on msdn:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/_pluslang_Parsing_C.2b2b_.Command.2d.Line_Arguments.asp

Interpreting them is something of a challenge (my favorite part is how the
examples imply that the final argument is automatically uppercased ;)

Here is an attempted implementation of the quoting rules:

http://twistedmatrix.com/trac/browser/trunk/twisted/python/win32.py#L41

Whether or not it is correct is probably a matter of discussion.  If you find
a more generally correct solution, I would certainly like to know about it.

Jean-Paul

From brett at python.org  Sat Sep 16 20:33:56 2006
From: brett at python.org (Brett Cannon)
Date: Sat, 16 Sep 2006 11:33:56 -0700
Subject: [Python-Dev] IronPython and AST branch
In-Reply-To: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com>
References: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com>
Message-ID: <bbaeab100609161133y224ae51as384989e8fe4942e2@mail.gmail.com>

On 9/13/06, Sanghyeon Seo <sanxiyn at gmail.com> wrote:
>
> CPython 2.5, which will be released Real Soon Now, is the first
> version to ship with new "AST branch", which have been in development
> for a long time.
>
> AST branch uses ASDL, Abstract Syntax Description Language
> http://asdl.sourceforge.net/ to describe Abstract Syntax Tree data
> structure used by CPython compiler. In theory this is language
> independant, and the same file could be used to generate C# source
> files.


It would be nice, but see below.

> Having the same AST for Python implementations will be good for
> applications and libraries using Python implementations's internal
> parsers and compilers. Currently, those using CPython parser module or
> compiler package can't be easily ported to IronPython.
>
> What do you think?


I talked to Jim Hugunin about this very topic at the last PyCon.  He
pointed out that IronPython was started before he knew about the AST branch,
so that's why he didn't use it.  Plus, by the time he did know, it was too
late to switch right then and there.

As for making the AST branch itself more of a standard, I have talked to
Jeremy Hylton about that and he didn't like the idea, at least for now.  The
reason for keeping it "experimental" in terms of exposure at the Python
level is that we do not want to lock ourselves down to some AST spec that we
end up changing in the future.  It's the same reasoning behind not
officially documenting the marshal format; we want the flexibility.
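
For the curious, 2.5 does expose the new AST at the Python level, but
only through the undocumented _ast module; roughly along these lines, if
I remember the What's New example correctly:

import _ast
tree = compile("x = 1 + 2", "<string>", "exec", _ast.PyCF_ONLY_AST)
print tree                    # an _ast.Module instance
print tree.body[0].__class__  # _ast.Assign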

How best to resolve all of this, I don't know.  I completely understand not
wanting to lock ourselves down to an AST too soon.  Might need to wait a
little while after the AST has been out in the wild to see what the user
response is and then make a decision.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060916/81f42b9e/attachment.htm 

From steve at holdenweb.com  Sat Sep 16 20:41:42 2006
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 16 Sep 2006 14:41:42 -0400
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the
	path
In-Reply-To: <450C32DA.9030601@v.loewis.de>
References: <450C32DA.9030601@v.loewis.de>
Message-ID: <eehgfq$9b7$1@sea.gmane.org>

Martin v. Löwis wrote:
> The test suite currently (2.5) has two failures on Windows
> if Python is installed into a directory with a space in it
> (such as "Program Files"). The failing tests are test_popen
> and test_cmd_line.
> 
> The test_cmd_line failure is shallow: the test fails to properly
> quote sys.executable when passing it to os.popen. I propose to
> fix this in Python 2.5.1; see #1559413
> 
> test_popen is more tricky. This code has always failed AFAICT,
> except that the test itself is a recent addition. The test tries
> to pass the following command to os.popen
> 
> "c:\Program Files\python25\python.exe" -c "import sys;print sys.version"
> 
> For some reason, os.popen invokes doesn't directly start Python as
> a new process, but install invokes
> 
> cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print
> sys.version"
> 
> Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC)
> in os.popen?
> 
> In any case, cmd.exe fails to execute this, claiming that c:\Program
> is not a valid executable. It would run
> 
> cmd.exe /c "c:\Program Files\python25\python.exe"
> 
> just fine, so apparently, the problem is with argument that have
> multiple pairs of quotes. I found, through experimentation, that it
> *will* accept
> 
> cmd.exe /c ""c:\Program Files\python25\python.exe" -c "import sys;print
> sys.version""
> 
> (i.e. doubling the quotes at the beginning and the end). I'm not quite
> sure what algorithm cmd.exe uses for parsing, but it appears that
> adding a pair of quotes works in all cases (at least those I could think
> of). See # 1559298
> 
> Here are my questions:
> 1. Should this be fixed before the final release of Python 2.5?
> 2. If not, should it be fixed in Python 2.5.1? I'd say not: there
>    is a potential of breaking existing applications. Applications
>    might be aware of this mess, and deliberately add a pair of
>    quotes already. If popen then adds yet another pair of quotes,
>    cmd.exe will again fail.
> 3. If not, should this be fixed in 2.6 in the way I propose in
>    the patch (i.e. add quotes around the command line)?
>    Or can anybody propose a different fix?
> 4. How should we deal with different values of COMSPEC? Should
>    this patch only apply for cmd.exe, or should we assume that
>    other shells are quirk-compatible with cmd.exe in this
>    respect (or that people stopped setting COMSPEC, anyway)?
> 
> Any comments appreciated,
> 
1. Because this is almost certainly Windows version-dependent I would 
suggest that you definitely hold off trying to fix it for 2.5 - it would 
almost certainly make another RC necessary, and even that wouldn't 
guarantee the required testing (I sense that Windows versions get rather 
less pre-release testing than others).

2. I agree with your opinion: anyone for whom this is an important issue 
has almost certainly addressed it with their own (version-dependent) 
workarounds that will break with the change.

3/4. Tricky. I don't think it would be wise to assume 
quirk-compatibility across all Windows command processors. On balance I 
suspect we should just alter the documentation to note that quirks in 
the underlying platform may result in unexpected behavior on quoted 
arguments, perhaps with an example or two.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From tim.peters at gmail.com  Sat Sep 16 21:49:52 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 16 Sep 2006 15:49:52 -0400
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the
	path
In-Reply-To: <450C32DA.9030601@v.loewis.de>
References: <450C32DA.9030601@v.loewis.de>
Message-ID: <1f7befae0609161249u751e9a8oe651b1ca81be1879@mail.gmail.com>

[Martin v. Löwis]
> ...
> Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC)
> in os.popen?

Absolutely necessary, as any number of shell gimmicks can be used in
the passed string, same as on non-Windows boxes; e.g.,

>>> import os
>>> os.environ['STR'] = 'SSL'
>>> p = os.popen("findstr %STR% *.py | sort")
>>> print p.read()
build_ssl.py:        print " None of these versions appear suitable
for building OpenSSL"
build_ssl.py:        print "Could not find an SSL directory in '%s'" %
(sources,)
build_ssl.py:        print "Found an SSL directory at '%s'" % (best_name,)
build_ssl.py:    # Look for SSL 2 levels up from pcbuild - ie, same
place zlib etc all live.
...

That illustrates envar substitution and setting up a pipe in the
passed string, and people certainly do things like that.

These are the MS docs for cmd.exe's inscrutable quoting rules after /C:

"""
If /C or /K is specified, then the remainder of the command line after
the switch is processed as a command line, where the following logic is
used to process quote (") characters:

    1.  If all of the following conditions are met, then quote characters
        on the command line are preserved:

        - no /S switch
        - exactly two quote characters
        - no special characters between the two quote characters,
          where special is one of: &<>()@^|
        - there are one or more whitespace characters between
          the two quote characters
        - the string between the two quote characters is the name
          of an executable file.

    2.  Otherwise, old behavior is to see if the first character is
        a quote character and if so, strip the leading character and
        remove the last quote character on the command line, preserving
        any text after the last quote character.
"""

Your

    cmd.exe /c "c:\Program Files\python25\python.exe"

example fit clause #1 above.

    cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print
sys.version"

fails the "exactly two quote characters" part of #1, so falls into #2,
and after stripping the first and last quotes leaves the senseless:

cmd.exe /c c:\Program Files\python25\python.exe" -c "import sys;print
sys.version

> (i.e. doubling the quotes at the beginning and the end) [works]

And that follows from the above, although not for a reason any sane
person would guess :-(

I personally wouldn't change anything here for 2.5.  It's a minefield,
and people who care a lot already have their own workarounds in place,
which we'd risk breaking.  It remains a minefield for newbies, but
we're really just passing on cmd.exe's behaviors.  People are
well-advised to accept the installer's default directory.
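
For anyone who does need to spawn the interpreter through os.popen
today, the extra-quoting workaround Martin found amounts to something
like this (a sketch only, and only relevant when the path has spaces):

import os, sys

cmd = '"%s" -c "import sys; print sys.version"' % sys.executable
# Wrap the whole command in one more pair of quotes; per rule #2 above,
# cmd.exe strips just that outermost pair and the inner quoting survives.
p = os.popen('"%s"' % cmd)
print p.read()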

From talin at acm.org  Sat Sep 16 22:32:55 2006
From: talin at acm.org (Talin)
Date: Sat, 16 Sep 2006 13:32:55 -0700
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <450C3450.4010401@gmail.com>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
	<450C3450.4010401@gmail.com>
Message-ID: <450C5F77.30107@acm.org>

Nick Coghlan wrote:
> As for the reason: it makes it possible to use the same style for classes 
> without bases as is used for functions without arguments. Prior to this 
> change, there was a sharp break in the class syntax, such that if you got rid 
> of the last base class you had to get rid of the parentheses as well.

Is the result a new-style or classic-style class? It would be nice if 
using the empty parens forced a new-style class...

-- Talin

From gjcarneiro at gmail.com  Sat Sep 16 22:54:13 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sat, 16 Sep 2006 21:54:13 +0100
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <450C5F77.30107@acm.org>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
	<450C3450.4010401@gmail.com> <450C5F77.30107@acm.org>
Message-ID: <a467ca4f0609161354p1ea4b931pca6aabc8d7662d89@mail.gmail.com>

On 9/16/06, Talin <talin at acm.org> wrote:
> Nick Coghlan wrote:
> > As for the reason: it makes it possible to use the same style for classes
> > without bases as is used for functions without arguments. Prior to this
> > change, there was a sharp break in the class syntax, such that if you got rid
> > of the last base class you had to get rid of the parentheses as well.
>
> Is the result a new-style or classic-style class? It would be nice if
> using the empty parens forced a new-style class...

  That was my first thought as well.  Unfortunately a quick test shows
that class Foo(): creates an old style class instead :(

-- 
Gustavo J. A. M. Carneiro
"The universe is always one step beyond logic."

From fabiofz at gmail.com  Sat Sep 16 23:48:59 2006
From: fabiofz at gmail.com (Fabio Zadrozny)
Date: Sat, 16 Sep 2006 18:48:59 -0300
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <9eebf5740609160958n6fa7ae80hd7d7c737166e6367@mail.gmail.com>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
	<9eebf5740609160958n6fa7ae80hd7d7c737166e6367@mail.gmail.com>
Message-ID: <cfb578b20609161448j157df45ew8826214daf4f6bb@mail.gmail.com>

On 9/16/06, Lawrence Oluyede <l.oluyede at gmail.com> wrote:
> > I think that this change should be presented at
> > http://docs.python.org/dev/whatsnew/whatsnew25.html
>
> It's already listed there: http://docs.python.org/dev/whatsnew/other-lang.html
>

Thanks... also, I don't know if the empty yield statement is mentioned
too (I couldn't find it either).

Cheers,

Fabio

From l.oluyede at gmail.com  Sat Sep 16 23:57:08 2006
From: l.oluyede at gmail.com (Lawrence Oluyede)
Date: Sat, 16 Sep 2006 23:57:08 +0200
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <a467ca4f0609161354p1ea4b931pca6aabc8d7662d89@mail.gmail.com>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
	<450C3450.4010401@gmail.com> <450C5F77.30107@acm.org>
	<a467ca4f0609161354p1ea4b931pca6aabc8d7662d89@mail.gmail.com>
Message-ID: <9eebf5740609161457m2de173c2w82a068bdd0d8225a@mail.gmail.com>

>   That was my first thought as well.  Unfortunately a quick test shows
> that class Foo(): creates an old style class instead :(

I think that's because, until it's safe to break things, we will
stick with classic by default...

-- 
Lawrence
http://www.oluyede.org/blog

From talin at acm.org  Sun Sep 17 00:12:25 2006
From: talin at acm.org (Talin)
Date: Sat, 16 Sep 2006 15:12:25 -0700
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <9eebf5740609161457m2de173c2w82a068bdd0d8225a@mail.gmail.com>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>	
	<450C3450.4010401@gmail.com> <450C5F77.30107@acm.org>	
	<a467ca4f0609161354p1ea4b931pca6aabc8d7662d89@mail.gmail.com>
	<9eebf5740609161457m2de173c2w82a068bdd0d8225a@mail.gmail.com>
Message-ID: <450C76C9.5060201@acm.org>

Lawrence Oluyede wrote:
>>   That was my first thought as well.  Unfortunately a quick test shows
>> that class Foo(): creates an old style class instead :(
> 
> I think that's because until it'll be safe to break things we will
> stick with classic by default...

But in this case nothing will be broken, since the () syntax was 
formerly not allowed, so it won't appear in any existing code. So it 
would have been a good opportunity to shift over to increased use of 
new-style classes without breaking anything.

Thus, 'class Foo:' would create a classic class, but 'class Foo():' 
would create a new-style class.

However, once it's released as 2.5 that will no longer be the case, as 
people might start to use () to indicate a classic class. Oh well.

-- Talin

From greg.ewing at canterbury.ac.nz  Sun Sep 17 01:22:53 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 17 Sep 2006 11:22:53 +1200
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <450C5F77.30107@acm.org>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
	<450C3450.4010401@gmail.com> <450C5F77.30107@acm.org>
Message-ID: <450C874D.9080302@canterbury.ac.nz>

Talin wrote:
> Is the result a new-style or classic-style class? It would be nice if 
> using the empty parens forced a new-style class...

No, it wouldn't, IMO. Too subtle a clue.

Best to just wait for Py3k when all classes will
be new-style.

--
Greg

From brett at python.org  Sun Sep 17 02:45:57 2006
From: brett at python.org (Brett Cannon)
Date: Sat, 16 Sep 2006 17:45:57 -0700
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <450C76C9.5060201@acm.org>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>
	<450C3450.4010401@gmail.com> <450C5F77.30107@acm.org>
	<a467ca4f0609161354p1ea4b931pca6aabc8d7662d89@mail.gmail.com>
	<9eebf5740609161457m2de173c2w82a068bdd0d8225a@mail.gmail.com>
	<450C76C9.5060201@acm.org>
Message-ID: <bbaeab100609161745p5b0af2cew2ede5d50c5ba02e@mail.gmail.com>

On 9/16/06, Talin <talin at acm.org> wrote:
>
> Lawrence Oluyede wrote:
> >>   That was my first thought as well.  Unfortunately a quick test shows
> >> that class Foo(): creates an old style class instead :(
> >
> > I think that's because until it'll be safe to break things we will
> > stick with classic by default...
>
> But in this case nothing will be broken, since the () syntax was
> formerly not allowed, so it won't appear in any existing code. So it
> would have been a good opportunity to shift over to increased usage
> new-style classes without breaking anything.
>
> Thus, 'class Foo:' would create a classic class, but 'class Foo():'
> would create a new-style class.
>
> However, once it's released as 2.5 that will no longer be the case, as
> people might start to use () to indicate a classic class. Oh well.


We didn't want there to suddenly be a way to make a new-style class that
didn't explicitly subclass 'object'.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060916/f30361fe/attachment.htm 

From ncoghlan at gmail.com  Sun Sep 17 11:00:13 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Sep 2006 19:00:13 +1000
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <450C5F77.30107@acm.org>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>	<450C3450.4010401@gmail.com>
	<450C5F77.30107@acm.org>
Message-ID: <450D0E9D.5010001@gmail.com>

Talin wrote:
> Nick Coghlan wrote:
>> As for the reason: it makes it possible to use the same style for classes 
>> without bases as is used for functions without arguments. Prior to this 
>> change, there was a sharp break in the class syntax, such that if you got rid 
>> of the last base class you had to get rid of the parentheses as well.
> 
> Is the result a new-style or classic-style class? It would be nice if 
> using the empty parens forced a new-style class...

This was considered & rejected by Guido as too subtle a distinction. So you 
still need to set __metaclass__=type (or inherit from such a class) to get a 
new-style class.
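
The difference is easy to see interactively (2.5 behaviour):

class A(): pass          # empty parens: still a classic class in 2.5
class B(object): pass    # explicit base: new-style

print type(A)   # <type 'classobj'>
print type(B)   # <type 'type'>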

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Sun Sep 17 11:01:25 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Sep 2006 19:01:25 +1000
Subject: [Python-Dev] Grammar change in classdef
In-Reply-To: <cfb578b20609161448j157df45ew8826214daf4f6bb@mail.gmail.com>
References: <cfb578b20609160949h4a483dbbv803691d18e873e30@mail.gmail.com>	<9eebf5740609160958n6fa7ae80hd7d7c737166e6367@mail.gmail.com>
	<cfb578b20609161448j157df45ew8826214daf4f6bb@mail.gmail.com>
Message-ID: <450D0EE5.3020806@gmail.com>

Fabio Zadrozny wrote:
> On 9/16/06, Lawrence Oluyede <l.oluyede at gmail.com> wrote:
>>> I think that this change should be presented at
>>> http://docs.python.org/dev/whatsnew/whatsnew25.html
>> It's already listed there: http://docs.python.org/dev/whatsnew/other-lang.html
>>
> 
> Thanks... also, I don't know if the empty yield statement is mentioned
> too (I couldn't find it either).

It's part of the PEP 342 changes. However, I don't believe AMK mentioned that 
part explicitly in the What's New.
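
The change itself is tiny: in 2.5 a bare yield is accepted and is
equivalent to yielding None, e.g.

def pauses():
    yield          # bare yield, same as 'yield None' (part of the PEP 342 changes)
    yield None

print list(pauses())   # [None, None]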

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Sun Sep 17 11:40:41 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 17 Sep 2006 19:40:41 +1000
Subject: [Python-Dev] IronPython and AST branch
In-Reply-To: <bbaeab100609161133y224ae51as384989e8fe4942e2@mail.gmail.com>
References: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com>
	<bbaeab100609161133y224ae51as384989e8fe4942e2@mail.gmail.com>
Message-ID: <450D1819.2080803@gmail.com>

Brett Cannon wrote:
> As for making the AST branch itself more of a standard, I have talked to 
> Jeremy Hylton about that and he didn't like the idea, at least for now.  
> The reasons for keeping it as "experimental" in terms of exposure at the 
> Python level is that we do not want to lock ourselves down to some AST 
> spec that we end up changing in the future.  It's the same reasoning 
> behind not officially documenting the marshal format; we want the 
> flexibility.
> 
> How best to resolve all of this, I don't know.  I completely understand 
> not wanting to lock ourselves down to an AST too soon.  Might need to 
> wait a little while after the AST has been out in the wild to see what 
> the user response is and then make a decision.

One of the biggest issues I have with the current AST is that I don't believe 
it really gets the "slice" and "extended slice" terminology correct (it uses 
'extended slice' to refer to multi-dimensional indexing, but the normal 
meaning of that phrase is to refer to the use of a step argument for a slice [1])

Cheers,
Nick.

[1]
http://www.python.org/doc/2.3.5/whatsnew/section-slices.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From howarth at bromo.msbb.uc.edu  Sun Sep 17 14:51:38 2006
From: howarth at bromo.msbb.uc.edu (Jack Howarth)
Date: Sun, 17 Sep 2006 08:51:38 -0400 (EDT)
Subject: [Python-Dev] python, lipo and the future?
Message-ID: <20060917125138.C4142110010@bromo.msbb.uc.edu>

    I am curious if there are any plans to support
the functionality provided by lipo on MacOS X to
create a python release that could operate at either
32-bit or 64-bit on Darwin ppc and Darwin intel? My
understanding was that the Linux developers are very
interested in lipo as well, as an approach to avoiding
the difficulty of maintaining separate lib directories
for 32-bit and 64-bit libraries. Thanks in advance for
any insights on this issue.
              Jack

From ronaldoussoren at mac.com  Sun Sep 17 18:15:22 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Sep 2006 18:15:22 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <20060917125138.C4142110010@bromo.msbb.uc.edu>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>
Message-ID: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>


On Sep 17, 2006, at 2:51 PM, Jack Howarth wrote:

>     I am curious if there are any plans to support
> the functionality provided by lipo on MacOS X to
> create a python release that could operate at either
> 32-bit or 64-bit on Darwin ppc and Darwin intel?

We already support universal binaries for PPC and x86; adding PPC64  
and x86-64 to the mix should be relatively straightforward, but it  
isn't a complete no-brainer.

One problem is that python's configure script detects the sizes of  
various types and those values will be different on 32-bit and 64-bit  
flavours. Another problem is that Tiger's 64-bit support is pretty  
limited, basically just the Unix APIs, which means you cannot have a  
4-way universal python interpreter without upsetting anyone with a 64- 
bit machine :-).



> My
> understanding was that the linux developers are very
> interested in lipo as well as an approach to avoid
> the difficulty of maintaining separate lib directories
> for 32 and 64-bit libraries. Thanks in advance for
> any insights on this issue.

OS X uses the Mach-O binary format, which natively supports fat  
binaries; I don't know whether ELF (the Linux binary format) supports  
fat binaries.

Ronald

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2157 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060917/20dad179/attachment.bin 

From martin at v.loewis.de  Sun Sep 17 18:53:04 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 17 Sep 2006 18:53:04 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>
	<09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
Message-ID: <450D7D70.5080505@v.loewis.de>

Ronald Oussoren schrieb:
> One problem is that python's configure script detects the sizes of
> various types and those values will be different on 32-bit and 64-bit
> flavours.

FWIW, the PC build "solves" this problem by providing a hand-crafted
pyconfig.h file, instead of using an autoconf-generated one.
That could work for OSX as well, although it is tedious to keep
the hand-crafted file up-to-date.

For the PC, this isn't really a problem, since Windows doesn't suddenly
grow new features, at least not those that configure checks for. So
forking pyconfig.h once and then customizing it for universal binaries
might work.

Another approach would be to override architecture-specific defines.
For example, a block

#ifdef __APPLE__
#include "pyosx.h"
#endif

could be added to the end of pyconfig.h, and then pyosx.h would have

#undef SIZEOF_LONG

#if defined(__i386__) || defined(__ppc__)
#define SIZEOF_LONG 4
#elif defined(__amd64__) || defined(__ppc64__)
#define SIZEOF_LONG 8
#else
#error unsupported architecture
#endif

Out of curiosity: how do the current universal binaries deal with this
issue?

Regards,
Martin

From jcarlson at uci.edu  Sun Sep 17 20:03:31 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 17 Sep 2006 11:03:31 -0700
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <450D7D70.5080505@v.loewis.de>
References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de>
Message-ID: <20060917105909.F9A7.JCARLSON@uci.edu>


"Martin v. L?wis" <martin at v.loewis.de> wrote:
> Out of curiosity: how do the current universal binaries deal with this
> issue?

If I remember correctly, you usually do two completely independent
compile runs (optionally on the same machine, with different configure or
macro definitions), then use a packager provided by Apple to merge the
results for each binary/.so to be distributed. Each additional platform
would just be a new compile run.


 - Josiah


From ronaldoussoren at mac.com  Sun Sep 17 20:31:38 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Sep 2006 20:31:38 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <450D7D70.5080505@v.loewis.de>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>
	<09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de>
Message-ID: <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com>


On Sep 17, 2006, at 6:53 PM, Martin v. Löwis wrote:

> Ronald Oussoren schrieb:
>> One problem is that python's configure script detects the sizes of
>> various types and those values will be different on 32-bit and 64-bit
>> flavours.
>
> FWIW, the PC build "solves" this problem by providing a hand-crafted
> pyconfig.h file, instead of using an autoconf-generated one.
> That could work for OSX as well, although it is tedious to keep
> the hand-crafted file up-to-date.
>
> For the PC, this isn't really a problem, since Windows doesn't  
> suddenly
> grow new features, at least not those that configure checks for. So
> forking pyconfig.h once and then customizing it for universal binaries
> might work.
>
> Another approach would be to override architecture-specific defines.
> For example, a block
>
> #ifdef __APPLE__
> #include "pyosx.h"
> #endif

That's what I had started on before Bob came up with the endianness
check that is in pyconfig.h.in at the moment. I'd prefer to do this
instead of manually maintaining a fork of pyconfig.h; my guess is that
pyconfig.h is a lot less likely to grow new size-related macros than
new feature-related ones.

One possible issue here is that distutils has an API for fetching
definitions from pyconfig.h; code that uses this to detect
architecture features could cause problems.
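For example (a sketch only; SIZEOF_LONG is just one value such code
might look at, and the comparison with struct is mine):

from distutils import sysconfig
import struct

# Read the #defines straight out of pyconfig.h, the way third-party
# build code sometimes does. parse_config_h() is not a preprocessor,
# so with architecture-dependent #if blocks in pyconfig.h it can
# easily report a value that does not match the running interpreter.
config = sysconfig.parse_config_h(open(sysconfig.get_config_h_filename()))

print config.get('SIZEOF_LONG')   # whatever pyconfig.h says
print struct.calcsize('l')        # what this interpreter actually uses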

>
> could be added to the end of pyconfig.h, and then pyosx.h would have
>
> #undef SIZEOF_LONG
>
> #if defined(__i386__) || defined(__ppc__)
> #define SIZEOF_LONG 4
> #elif defined(__amd64__) || defined(__ppc64__)
> #define SIZEOF_LONG 8
> #else
> #error unsupported architecture
> #endif
>
> Out of curiosity: how do the current universal binaries deal with this
> issue?

The sizes of basic types are the same on PPC32 and x86, which helps a
lot. The byte order is different, but we can use GCC feature checks
there. The relevant bit of pyconfig.h.in:

#ifdef __BIG_ENDIAN__
#define WORDS_BIGENDIAN 1
#else
#ifndef __LITTLE_ENDIAN__
#undef WORDS_BIGENDIAN
#endif
#endif

Users of pyconfig.h will see the correct definition of  
WORDS_BIGENDIAN regardless of the architecture that was used to  
create the file.
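At the Python level both halves of such a build report their own
values at run time; a quick sanity check (just an illustration):

import sys, struct

print sys.byteorder         # 'big' when the ppc slice runs, 'little' on i386
print struct.calcsize('P')  # pointer size in bytes: 4 for 32-bit, 8 for 64-bit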

One of the announced features of OSX 10.5 is 64-bit support
throughout the system, and I definitely want to see if we can get
4-way universal support on such systems. As I don't have a system
that is capable of running 64-bit code, I'm not going to worry too
much about this right now :-)

Ronald

From martin at v.loewis.de  Sun Sep 17 20:35:34 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 17 Sep 2006 20:35:34 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <20060917105909.F9A7.JCARLSON@uci.edu>
References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>	<450D7D70.5080505@v.loewis.de>
	<20060917105909.F9A7.JCARLSON@uci.edu>
Message-ID: <450D9576.5070700@v.loewis.de>

Josiah Carlson schrieb:
> "Martin v. L?wis" <martin at v.loewis.de> wrote:
>> Out of curiosity: how do the current universal binaries deal with this
>> issue?
> 
> If I remember correctly, usually you do two completely independant
> compile runs (optionally on the same machine with different configure or
> macro definitions, then use a packager provided by Apple to merge the
> results for each binary/so to be distributed. Each additional platform
> would just be a new compile run.

It's true that the compiler is invoked twice; however, I very much
doubt that configure is run twice. Doing so would cause the Makefile
to be regenerated and the build to start from scratch. It would find
the object files from the previous run and either overwrite them all
or leave them in place.

The gcc driver on OSX can invoke cc1/as twice and then combine the
resulting object files into a single one (I'm not sure whether or not
it does so by invoking lipo).

Regards,
Martin

From fabiofz at gmail.com  Sun Sep 17 20:38:42 2006
From: fabiofz at gmail.com (Fabio Zadrozny)
Date: Sun, 17 Sep 2006 15:38:42 -0300
Subject: [Python-Dev] New relative import issue
Message-ID: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>

I've been playing with the new features and there's one thing about
the new relative import that I find a little strange and I'm not sure
this was intended...

When you do a "from . import xxx", it will always fail if you're in a
top-level module; and when executing any module, the directory of the
module automatically goes onto the pythonpath, thus making all the
relative imports in that structure fail.

E.g.:

/foo/bar/imp1.py <-- has a "from . import imp2"
/foo/bar/imp2.py

If I now put a test case (or any other module I'd like as the main
module) at:
/foo/bar/mytest.py

then importing imp1 from it will always fail.
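Concretely, this is roughly what happens (a sketch; the error text is
what I'd expect from 2.5, so treat it as illustrative):

# /foo/bar/mytest.py
import imp1    # found as a top-level module, since /foo/bar is on sys.path
               # imp1's "from . import imp2" then dies with something like
               # ValueError: Attempted relative import in non-package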

The solutions I see would be:
- only use the pythonpath actually defined by the user (and don't put
the current directory in the pythonpath)
- make relative imports work even if they reach some directory in the
pythonpath (making it work as an absolute import that would only
search the current directory structure)

Or is this actually a bug? (I'm using Python 2.5 rc2.)

I took another look at http://docs.python.org/dev/whatsnew/pep-328.html
and the example shows:

pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py

with the main.py doing a "from . import string", which is what I was
trying to accomplish...

Cheers,

Fabio

From ronaldoussoren at mac.com  Sun Sep 17 20:50:08 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Sep 2006 20:50:08 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <20060917105909.F9A7.JCARLSON@uci.edu>
References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de>
	<20060917105909.F9A7.JCARLSON@uci.edu>
Message-ID: <73A4A5EE-76EE-41EF-8514-6EB8F828BE2B@mac.com>


On Sep 17, 2006, at 8:03 PM, Josiah Carlson wrote:

>
> "Martin v. L?wis" <martin at v.loewis.de> wrote:
>> Out of curiosity: how do the current universal binaries deal with  
>> this
>> issue?
>
> If I remember correctly, usually you do two completely independant
> compile runs (optionally on the same machine with different  
> configure or
> macro definitions, then use a packager provided by Apple to merge the
> results for each binary/so to be distributed. Each additional platform
> would just be a new compile run.

That's the hard way to do things. If you don't mind spending some
time checking the code you're trying to compile, you can usually
tweak header files and use '-arch ppc -arch i386' to build a
universal binary in one go. This is a lot more convenient when
building universal binaries and is what's used to build Python as a
universal binary.

Ronald

From bob at redivi.com  Sun Sep 17 20:46:41 2006
From: bob at redivi.com (Bob Ippolito)
Date: Sun, 17 Sep 2006 11:46:41 -0700
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <450D9576.5070700@v.loewis.de>
References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de> <20060917105909.F9A7.JCARLSON@uci.edu>
	<450D9576.5070700@v.loewis.de>
Message-ID: <6a36e7290609171146k5b90b13v78d31f013f36d282@mail.gmail.com>

On 9/17/06, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> Josiah Carlson schrieb:
> > "Martin v. L?wis" <martin at v.loewis.de> wrote:
> >> Out of curiosity: how do the current universal binaries deal with this
> >> issue?
> >
> > If I remember correctly, usually you do two completely independant
> > compile runs (optionally on the same machine with different configure or
> > macro definitions, then use a packager provided by Apple to merge the
> > results for each binary/so to be distributed. Each additional platform
> > would just be a new compile run.

Sometimes this is done, but usually people just use
CC="cc -arch i386 -arch ppc". Most of the time that Just Works, unless
the source depends on autoconf gunk for endianness-related issues.

> It's true that the compiler is invoked twice, however, I very much doubt
> that configure is run twice. Doing so would cause the Makefile being
> regenerated, and the build starting from scratch. It would find the
> object files from the previous run, and either all overwrite them, or
> leave them in place.
>
> The gcc driver on OSX allows to invoke cc1/as two times, and then
> combines the resulting object files into a single one (not sure whether
> or not by invoking lipo).
>

That's exactly what it does. The gcc frontend ensures that cc1/as is
invoked exactly as many times as there are -arch flags, and the result
is lipo'ed together. This also means that you get to see a copy of all
warnings and errors for each -arch flag.

-bob

From ronaldoussoren at mac.com  Sun Sep 17 20:53:03 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Sep 2006 20:53:03 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <450D9576.5070700@v.loewis.de>
References: <09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de>
	<20060917105909.F9A7.JCARLSON@uci.edu>
	<450D9576.5070700@v.loewis.de>
Message-ID: <4AC70E8C-85DF-4BF3-9B4B-C19B3D1118CC@mac.com>


On Sep 17, 2006, at 8:35 PM, Martin v. Löwis wrote:

> Josiah Carlson schrieb:
>> "Martin v. L?wis" <martin at v.loewis.de> wrote:
>>> Out of curiosity: how do the current universal binaries deal with  
>>> this
>>> issue?
>>
>> If I remember correctly, usually you do two completely independant
>> compile runs (optionally on the same machine with different  
>> configure or
>> macro definitions, then use a packager provided by Apple to merge the
>> results for each binary/so to be distributed. Each additional  
>> platform
>> would just be a new compile run.
>
> It's true that the compiler is invoked twice, however, I very much  
> doubt
> that configure is run twice. Doing so would cause the Makefile being
> regenerated, and the build starting from scratch. It would find the
> object files from the previous run, and either all overwrite them, or
> leave them in place.
>
> The gcc driver on OSX allows to invoke cc1/as two times, and then
> combines the resulting object files into a single one (not sure  
> whether
> or not by invoking lipo).

IIRC the gcc driver calls lipo when multiple -arch flags are present
on the command line. This is very convenient, especially when
combined with distutils: universal builds of Python will automatically
build universal extensions as well, without major patches to distutils.
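A plain setup.py is all that's needed; nothing architecture-specific
goes into it (the module and file names below are made up):

from distutils.core import setup, Extension

# distutils reuses the compiler and flags recorded when the interpreter
# was built, so a universal Python passes its -arch flags along and the
# resulting extension comes out universal as well.
setup(name='demo',
      version='1.0',
      ext_modules=[Extension('_demo', ['demo.c'])])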

Ronald


From martin at v.loewis.de  Sun Sep 17 20:56:18 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 17 Sep 2006 20:56:18 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>
	<09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de>
	<5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com>
Message-ID: <450D9A52.6010209@v.loewis.de>

Ronald Oussoren schrieb:
> The sizes of basic types are the same on PPC32 and x86 which helps a
> lot.

Ah, right. This was the missing piece of the puzzle.

> The byteorder is different, but we can use GCC feature checks
> there. The relevant bit of pyconfig.h.in:
> 
> #ifdef __BIG_ENDIAN__
> #define WORDS_BIGENDIAN 1
> #else
> #ifndef __LITTLE_ENDIAN__
> #undef WORDS_BIGENDIAN
> #endif
> #endif

Yes, I remember this change very well.

> One of the announced features of osx 10.5 is 64-bit support throughout
> the system and I definitely want to see if we can get 4-way universal
> support on such systems. As I don't have a system that is capable of
> running 64-bit code  I'm not going to worry too much about this right
> now :-)

Isn't this a size issue, also? There might be very few users of a 64-bit
binary (fewer even on PPC64 than on AMD64).

In addition: how does the system choose whether to create a 32-bit
or a 64-bit process if the python binary is fat?

Regards,
Martin

P.S.: for distutils, I think adding special cases for retrieving
pyconfig.h items would be necessary. In addition, I think Python should
expose some of these in the image, e.g. as
sys.platform_config.SIZEOF_INT.
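(Purely as illustration of the kind of thing I mean -- the attribute
name is only a suggestion, and the values here are simply derived at
run time:)

import struct

# a stand-in for the proposed sys.platform_config, filled with values
# the running interpreter can already determine about itself
class _PlatformConfig:
    SIZEOF_INT = struct.calcsize('i')
    SIZEOF_LONG = struct.calcsize('l')
    SIZEOF_VOID_P = struct.calcsize('P')

print _PlatformConfig.SIZEOF_INT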


From ronaldoussoren at mac.com  Sun Sep 17 21:11:20 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Sep 2006 21:11:20 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <450D9A52.6010209@v.loewis.de>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>
	<09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de>
	<5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com>
	<450D9A52.6010209@v.loewis.de>
Message-ID: <C9ECF90C-7E1B-4BF0-BC73-16C62C392133@mac.com>


On Sep 17, 2006, at 8:56 PM, Martin v. Löwis wrote:

>
>> One of the announced features of osx 10.5 is 64-bit support  
>> throughout
>> the system and I definitely want to see if we can get 4-way universal
>> support on such systems. As I don't have a system that is capable of
>> running 64-bit code  I'm not going to worry too much about this right
>> now :-)
>
> Isn't this a size issue, also? There might be very few users of a  
> 64-bit
> binary (fewer even on PPC64 than on AMD64).

On Tiger it's primarily a usability issue: 64-bit binaries can't use
most of the system APIs because only the unix API (libSystem) is
64-bit at the moment.

The size of the python installer would grow significantly for a 4-way
universal distribution; it would be almost twice as large as the
current distribution ("almost" because only binaries would grow in
size, python source files and data files wouldn't grow in size).

>
> In addition: how does the system chose whether to create a 32-bit
> or a 64-bit process if the python binary is fat?

It should take the best fit: on 32-bit processors it picks the 32-bit
version and on 64-bit processors it picks the 64-bit one. This
probably means that we'll have to ship multiple versions of the
python executable; otherwise Tiger (10.4) users would end up with an
interpreter that cannot use OSX-specific APIs.

Ronald

From Jack.Jansen at cwi.nl  Sun Sep 17 21:29:52 2006
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Sun, 17 Sep 2006 21:29:52 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <C9ECF90C-7E1B-4BF0-BC73-16C62C392133@mac.com>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>
	<09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de>
	<5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com>
	<450D9A52.6010209@v.loewis.de>
	<C9ECF90C-7E1B-4BF0-BC73-16C62C392133@mac.com>
Message-ID: <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl>

Just wondering: is it a good idea in the first place to create a  
universal 32/64 bit Python on MacOSX?

On MacOS you don't pay a penalty or anything for running in 32-bit  
mode on any current hardware, so the choice of whether to use 32 or  
64 bits really depends on the application. A single Python  
interpreter that can run in both 32 and 64 bit mode would possibly  
make this more difficult rather than easier. I think I'd prefer a  
situation where we have python32 and python64 (with both being ppc/ 
intel fat) and python being a symlink to either, at the end-users'  
discretion.

For extension modules it's different, though: there it would be nice  
to be able to have a single module that could load into any Python  
(32/64 bit, Intel/PPC) on any applicable MacOSX version.
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma  
Goldman



From howarth at bromo.msbb.uc.edu  Sun Sep 17 21:37:14 2006
From: howarth at bromo.msbb.uc.edu (Jack Howarth)
Date: Sun, 17 Sep 2006 15:37:14 -0400 (EDT)
Subject: [Python-Dev] python, lipo and the future?
Message-ID: <20060917193714.BCA73110010@bromo.msbb.uc.edu>

Martin,
    I believe if you use the Xcode project management the
Universal binary creation is automated. Currently they
support the i386/ppc binaries but once Leopard comes
out you will see i386/x86_64/ppc/ppc64 binaries for
shared libraries.
             Jack

From ronaldoussoren at mac.com  Sun Sep 17 21:52:21 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Sep 2006 21:52:21 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>
	<09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>
	<450D7D70.5080505@v.loewis.de>
	<5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com>
	<450D9A52.6010209@v.loewis.de>
	<C9ECF90C-7E1B-4BF0-BC73-16C62C392133@mac.com>
	<415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl>
Message-ID: <4F28D5B1-B0FC-49D8-9839-1E1338188FA9@mac.com>


On Sep 17, 2006, at 9:29 PM, Jack Jansen wrote:

> Just wondering: is it a good idea in the first place to create a
> universal 32/64 bit Python on MacOSX?
>
> On MacOS you don't pay a penalty or anything for running in 32-bit
> mode on any current hardware, so the choice of whether to use 32 or
> 64 bits really depends on the application. A single Python
> interpreter that can run in both 32 and 64 bit mode would possibly
> make this more difficult rather than easier. I think I'd prefer a
> situation where we have python32 and python64 (with both being ppc/
> intel fat) and python being a symlink to either, at the end-users'
> discretion.
>
> For extension modules it's different, though: there it would be nice
> to be able to have a single module that could load into any Python
> (32/64 bit, Intel/PPC) on any applicable MacOSX version.

A 4-way universal python framework could be useful, but I agree that
the python executable shouldn't be 64-bit. I'm not too happy about a
symlink that selects which version you get to use; wouldn't
'python' (32-bit) and 'python-64' (64-bit) be just as good? That way
the user doesn't have to set up anything, and it helps reinforce
the message that 64-bit isn't necessarily better than 32-bit.

Having a 4-way universal framework would IMO be preferable to two
separate python installs; that would just increase the confusion.
There are too many python distributions for the mac anyway. A major
stumbling block for a 4-way universal installation is the
availability of binary packages for (popular) third-party packages.
This is not really relevant for python-dev, but I'd prefer having no
64-bit support in the default installer over a 64-bit capable
installation where it is very hard to get popular packages to work.

BTW, several sites on the interweb claim that x86-64 runs faster than
plain x86 due to a larger register set. All my machines are 32-bit, so
I can't check whether this is relevant for Python (let alone Python on
OSX).

Ronald
> --
> Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
> If I can't dance I don't want to be part of your revolution -- Emma
> Goldman
>
>


From ronaldoussoren at mac.com  Sun Sep 17 21:56:21 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 17 Sep 2006 21:56:21 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <20060917193714.BCA73110010@bromo.msbb.uc.edu>
References: <20060917193714.BCA73110010@bromo.msbb.uc.edu>
Message-ID: <90D1893C-568B-4499-A054-CB4E63B5FFD6@mac.com>


On Sep 17, 2006, at 9:37 PM, Jack Howarth wrote:

> Martin,
>     I believe if you use the Xcode project management the
> Universal binary creation is automated. Currently they
> support the i386/ppc binaries but once Leopard comes
> out you will see i386/x86_64/ppc/ppc64 binaries for
> shared libraries.

That's not really relevant for python: python is built using
makefiles, not an Xcode project (and I'd like to keep it that way).

BTW, Xcode 2.4 can already build 4-way universal binaries, and Tiger
supports 64-bit unix programs. On my system, file on
/usr/lib/libSystem.B.dylib (the unix/C library) says:
$ file /usr/lib/libSystem.B.dylib
/usr/lib/libSystem.B.dylib: Mach-O universal binary with 3 architectures
/usr/lib/libSystem.B.dylib (for architecture ppc64): Mach-O 64-bit dynamically linked shared library ppc64
/usr/lib/libSystem.B.dylib (for architecture i386):  Mach-O dynamically linked shared library i386
/usr/lib/libSystem.B.dylib (for architecture ppc):   Mach-O dynamically linked shared library ppc


On the new Mac Pros, and probably the Core 2-based iMacs as well,
libSystem also contains an x86-64 version.

Ronald

>              Jack


From martin at v.loewis.de  Sun Sep 17 22:27:42 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 17 Sep 2006 22:27:42 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>	<09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>	<450D7D70.5080505@v.loewis.de>	<5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com>	<450D9A52.6010209@v.loewis.de>	<C9ECF90C-7E1B-4BF0-BC73-16C62C392133@mac.com>
	<415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl>
Message-ID: <450DAFBE.2000704@v.loewis.de>

Jack Jansen schrieb:
> Just wondering: is it a good idea in the first place to create a  
> universal 32/64 bit Python on MacOSX?

I wonder about the same thing.

> For extension modules it's different, though: there it would be nice  
> to be able to have a single module that could load into any Python  
> (32/64 bit, Intel/PPC) on any applicable MacOSX version.

That seems to suggest that the standard distribution should indeed
provide a four-times fat binary, at least for libpython: AFAIU,
to build extension modules that way, all target architectures must
be supported in all necessary libraries on the build machine (somebody
will surely correct me if that's wrong).

Regards,
Martin

From martin at v.loewis.de  Sun Sep 17 22:34:49 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 17 Sep 2006 22:34:49 +0200
Subject: [Python-Dev] python, lipo and the future?
In-Reply-To: <4F28D5B1-B0FC-49D8-9839-1E1338188FA9@mac.com>
References: <20060917125138.C4142110010@bromo.msbb.uc.edu>	<09DAD3D0-D5DE-4396-ADF4-4E9EFC2D9EF9@mac.com>	<450D7D70.5080505@v.loewis.de>	<5F30D695-B932-493B-9834-6FB3D9A9BAF0@mac.com>	<450D9A52.6010209@v.loewis.de>	<C9ECF90C-7E1B-4BF0-BC73-16C62C392133@mac.com>	<415D3487-C340-4A29-A9C1-032B1E6BB058@cwi.nl>
	<4F28D5B1-B0FC-49D8-9839-1E1338188FA9@mac.com>
Message-ID: <450DB169.5060608@v.loewis.de>

Ronald Oussoren schrieb:
> BTW. several sites on the interweb claim that x86-64 runs faster than
> plain x86 due to a larger register set. All my machines are 32-bit so I
> can't check if this is relevant for Python (let alone Python on OSX).

That is plausible. OTOH, the AMD64 binaries will often require twice
as much main memory, as all pointers double their size, and the Python
implementation (or most OO languages, for that matter) is full of
pointers. So it will be more efficient only until it starts swapping.
(there is also a negative effect of larger pointers on the processor
 cache; the impact of this effect is hard to estimate).

Regards,
Martin



From anthony at interlink.com.au  Mon Sep 18 06:35:52 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon, 18 Sep 2006 14:35:52 +1000
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <20060916111111.GA27757@code0.codespeak.net>
References: <20060916111111.GA27757@code0.codespeak.net>
Message-ID: <200609181435.59238.anthony@interlink.com.au>

On Saturday 16 September 2006 21:11, Armin Rigo wrote:
> Hi all,
>
> There are more cases of signed integer overflows in the CPython source
> code base...
>
> That's on a 64-bits machine:
>
>     [GCC 4.1.2 20060715 (prerelease) (Debian 4.1.1-9)] on linux2
>     abs(-sys.maxint-1) == -sys.maxint-1

> Humpf!  Looks like one person or two need to do a quick last-minute
> review of all places trying to deal with -sys.maxint-1, and replace them
> all with the "official" fix from Tim [SF 1545668].

Ick. We're now less than 24 hours from the scheduled release date for 2.5
final. There seem to be a couple of approaches here:

1. Someone (it won't be me, I'm flat out with work and paperwriting today) 
reviews the code and fixes it
2. We leave it for a 2.5.1. I'm expecting (based on the number of bugs found 
and fixed during the release cycle) that we'll probably need a 2.5.1 in about 
3 months.
3. We delay the release until it's fixed.

I'm strongly leaning towards (2) at this point. (1) would probably require
another release candidate, while (3) would result in another release
candidate and a massive amount of sobbing from a lot of people (including me).





-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From anthony at interlink.com.au  Mon Sep 18 06:40:43 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon, 18 Sep 2006 14:40:43 +1000
Subject: [Python-Dev] BRANCH FREEZE/IMMINENT RELEASE: Python 2.5 (final).
	2006-09-19, 00:00UTC
Message-ID: <200609181440.46961.anthony@interlink.com.au>

Ok, time to bring down the hammer. The release25-maint branch is absolutely 
frozen to everyone but the release team from 00:00UTC, Tuesday 19th September. 
That's just under 20 hours from now. This is for Python 2.5 FINAL, so anyone 
who breaks this release will make me very, very sad. Based on the last few 
releases, I'd expect the release process to take around 18 hours (timezones 
are a swine). 

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From tim.peters at gmail.com  Mon Sep 18 06:58:31 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Sep 2006 00:58:31 -0400
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <200609181435.59238.anthony@interlink.com.au>
References: <20060916111111.GA27757@code0.codespeak.net>
	<200609181435.59238.anthony@interlink.com.au>
Message-ID: <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>

[Armin Rigo]
>> There are more cases of signed integer overflows in the CPython source
>> code base...
>>
>> That's on a 64-bits machine:
>>
>>     [GCC 4.1.2 20060715 (prerelease) (Debian 4.1.1-9)] on linux2
>>     abs(-sys.maxint-1) == -sys.maxint-1
><
>> Humpf!  Looks like one person or two need to do a quick last-minute
>> review of all places trying to deal with -sys.maxint-1, and replace them
>> all with the "official" fix from Tim [SF 1545668].

[Anthony Baxter]
> Ick. We're now less than 24 hours from the scheduled release date for 2.5
> final. There seems to be a couple of approaches here:
>
> 1. Someone (it won't be me, I'm flat out with work and paperwriting today)
>    reviews the code and fixes it
> 2. We leave it for a 2.5.1. I'm expecting (based on the number of bugs found
>    and fixed during the release cycle) that we'll probably need a 2.5.1 in about
>    3 months.
> 3. We delay the release until it's fixed.
>
> I'm strongly leaning towards (2) at this point. (1) would probably require
> another release candidate, while (3) would result in another release
> candidate and massive amount of sobbing from a lot of people (including me).

I ignored this since I don't have a box where problems are visible (&
nobody responded to my request to check my last flying-blind "fix" on
a box where it mattered).

Given that these are weird, unlikely-in-real-life endcase bugs
specific to a single compiler, #2 is the natural choice.

BTW, did anyone try compiling Python with -fwrapv on a box where it
matters?  I doubt that Python's speed is affected one way or the
other, and if adding wrapv makes the problems go away, that would be
an easy last-second workaround for all possible such problems (which
of course could get fixed "for real" for 2.5.1, provided someone cares
enough to dig into it).

From martin at v.loewis.de  Mon Sep 18 08:26:07 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 18 Sep 2006 08:26:07 +0200
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>
References: <20060916111111.GA27757@code0.codespeak.net>	<200609181435.59238.anthony@interlink.com.au>
	<1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>
Message-ID: <450E3BFF.3000700@v.loewis.de>

> BTW, did anyone try compiling Python with -fwrapv on a box where it
> matters?  I doubt that Python's speed is affected one way or the
> other, and if adding wrapv makes the problems go away, that would be
> an easy last-second workaround for all possible such problems (which
> of course could get fixed "for real" for 2.5.1, provided someone cares
> enough to dig into it).

It's not so easy to add this option: configure needs to be taught to
check whether the option is supported first; to test it, you ideally
need an installation where it is supported, and one where it isn't.

I've added a note to README indicating that GCC 4.2 shouldn't be
used to compile Python. I don't consider this a terrible limitation,
especially since GCC 4.2 isn't released, yet.

OTOH, I get the same problem that Armin gets (abs(-sys.maxint-1)
is negative) also on a 32-bit system, with Debian's gcc 4.1.2
(which also isn't released, yet), so it appears that the problem
is already with gcc 4.1.

On my system, adding -fwrapv indeed solves the problem
(tested for abs()). So I added this to the README also.

Regards,
Martin

From nnorwitz at gmail.com  Mon Sep 18 08:29:56 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 17 Sep 2006 23:29:56 -0700
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <450E3BFF.3000700@v.loewis.de>
References: <20060916111111.GA27757@code0.codespeak.net>
	<200609181435.59238.anthony@interlink.com.au>
	<1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>
	<450E3BFF.3000700@v.loewis.de>
Message-ID: <ee2a432c0609172329y13e21a34ke5b0ab193dc01b96@mail.gmail.com>

I also tested the fix (see patch below) for the abs() issue and it
seemed to work with 4.1.1 on 64-bit.  I'll apply the patch to head and
2.5, and add a test after 2.5 is out.
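A rough sketch of the kind of regression test I mean (not the final
test, just the behaviour it needs to pin down):

import sys, unittest

class AbsOverflowTest(unittest.TestCase):
    def test_abs_of_most_negative_int(self):
        # abs(-sys.maxint-1) cannot fit in a plain int, so it must
        # come back as a positive long, never as a negative number
        self.assertEqual(abs(-sys.maxint-1), sys.maxint+1)
        self.assert_(abs(-sys.maxint-1) > 0)

if __name__ == '__main__':
    unittest.main()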

I have no idea how to search for these problems.  I know that xrange
can't display -sys.maxint-1 properly, but I think it otherwise works
correctly.

n
--

Index: Objects/intobject.c
===================================================================
--- Objects/intobject.c (revision 51886)
+++ Objects/intobject.c (working copy)
@@ -763,7 +763,7 @@
        register long a, x;
        a = v->ob_ival;
        x = -a;
-       if (a < 0 && x < 0) {
+       if (a < 0 && (unsigned long)x == 0-(unsigned long)x) {
                PyObject *o = PyLong_FromLong(a);
                if (o != NULL) {
                        PyObject *result = PyNumber_Negative(o);


On 9/17/06, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > BTW, did anyone try compiling Python with -fwrapv on a box where it
> > matters?  I doubt that Python's speed is affected one way or the
> > other, and if adding wrapv makes the problems go away, that would be
> > an easy last-second workaround for all possible such problems (which
> > of course could get fixed "for real" for 2.5.1, provided someone cares
> > enough to dig into it).
>
> It's not so easy to add this option: configure needs to be taught to
> check whether the option is supported first; to test it, you ideally
> need an installation where it is supported, and one where it isn't.
>
> I've added a note to README indicating that GCC 4.2 shouldn't be
> used to compile Python. I don't consider this a terrible limitation,
> especially since GCC 4.2 isn't released, yet.
>
> OTOH, I get the same problem that Armin gets (abs(-sys.maxint-1)
> is negative) also on a 32-bit system, with Debian's gcc 4.1.2
> (which also isn't released, yet), so it appears that the problem
> is already with gcc 4.1.
>
> On my system, adding -fwrapv indeed solves the problem
> (tested for abs()). So I added this to the README also.
>
> Regards,
> Martin

From martin at v.loewis.de  Mon Sep 18 08:56:26 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 18 Sep 2006 08:56:26 +0200
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <ee2a432c0609172329y13e21a34ke5b0ab193dc01b96@mail.gmail.com>
References: <20060916111111.GA27757@code0.codespeak.net>	
	<200609181435.59238.anthony@interlink.com.au>	
	<1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>	
	<450E3BFF.3000700@v.loewis.de>
	<ee2a432c0609172329y13e21a34ke5b0ab193dc01b96@mail.gmail.com>
Message-ID: <450E431A.8010507@v.loewis.de>

Neal Norwitz schrieb:
> I also tested the fix (see patch below) for the abs() issue and it
> seemed to work for 4.1.1 on 64-bit.  I'll apply the patch to head and
> 2.5 and a test after 2.5 is out.

Please also add it to 2.4.

> Index: Objects/intobject.c
> ===================================================================
> --- Objects/intobject.c (revision 51886)
> +++ Objects/intobject.c (working copy)
> @@ -763,7 +763,7 @@
>        register long a, x;
>        a = v->ob_ival;
>        x = -a;
> -       if (a < 0 && x < 0) {
> +       if (a < 0 && (unsigned long)x == 0-(unsigned long)x) {

Hmm. Shouldn't this drop 'x' and use 'a' instead? If a is
-sys.maxint-1, -a is already undefined.

Regards,
Martin

P.S. As for finding these problems, I would have hoped that
-ftrapv could help - unfortunately, gcc breaks with this
option (consumes incredible amounts of memory).

From nnorwitz at gmail.com  Mon Sep 18 08:59:39 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 17 Sep 2006 23:59:39 -0700
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <450E431A.8010507@v.loewis.de>
References: <20060916111111.GA27757@code0.codespeak.net>
	<200609181435.59238.anthony@interlink.com.au>
	<1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>
	<450E3BFF.3000700@v.loewis.de>
	<ee2a432c0609172329y13e21a34ke5b0ab193dc01b96@mail.gmail.com>
	<450E431A.8010507@v.loewis.de>
Message-ID: <ee2a432c0609172359l758e61b9j3da3b9366a307bb0@mail.gmail.com>

On 9/17/06, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> Neal Norwitz schrieb:
> > I also tested the fix (see patch below) for the abs() issue and it
> > seemed to work for 4.1.1 on 64-bit.  I'll apply the patch to head and
> > 2.5 and a test after 2.5 is out.
>
> Please also add it to 2.4.

Yes

>
> > Index: Objects/intobject.c
> > ===================================================================
> > --- Objects/intobject.c (revision 51886)
> > +++ Objects/intobject.c (working copy)
> > @@ -763,7 +763,7 @@
> >        register long a, x;
> >        a = v->ob_ival;
> >        x = -a;
> > -       if (a < 0 && x < 0) {
> > +       if (a < 0 && (unsigned long)x == 0-(unsigned long)x) {
>
> Hmm. Shouldn't this drop 'x' and use 'a' instead? If a is
> -sys.maxint-1, -a is already undefined.

Yes, probably.  I didn't review carefully.

> P.S. As for finding these problems, I would have hoped that
> -ftrapv could help - unfortunately, gcc breaks with this
> option (consumes incredible amounts of memory).

I'm getting a crash when running test_builtin and test_calendar (at
least) with gcc 4.1.1 on amd64.  It's happening in pymalloc, though I
don't know what the cause is.  I thought I tested with gcc 4.1 before,
but probably would have been in debug mode.

n
--
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 22020)]
PyObject_Malloc (nbytes=40) at obmalloc.c:746
746                             if ((pool->freeblock = *(block **)bp) != NULL) {
(gdb) p bp
$1 = (block *) 0x2a9558d41800 <Address 0x2a9558d41800 out of bounds>
(gdb) l
741                              * Pick up the head block of its free list.
742                              */
743                             ++pool->ref.count;
744                             bp = pool->freeblock;
745                             assert(bp != NULL);
746                             if ((pool->freeblock = *(block **)bp) != NULL) {
747                                     UNLOCK();
748                                     return (void *)bp;
749                             }
750                             /*
(gdb) p *pool
$2 = {ref = {_padding = 0x1a <Address 0x1a out of bounds>, count = 26},
  freeblock = 0x2a9558d41800 <Address 0x2a9558d41800 out of bounds>,
  nextpool = 0x2a95eac000, prevpool = 0x620210, arenaindex = 0, szidx = 4,
  nextoffset = 4088, maxnextoffset = 4056}
(gdb) p size
$3 = 4

From arigo at tunes.org  Mon Sep 18 11:13:14 2006
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 18 Sep 2006 11:13:14 +0200
Subject: [Python-Dev] New relative import issue
In-Reply-To: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
Message-ID: <20060918091314.GA26814@code0.codespeak.net>

Hi Fabio,

On Sun, Sep 17, 2006 at 03:38:42PM -0300, Fabio Zadrozny wrote:
> I've been playing with the new features and there's one thing about
> the new relative import that I find a little strange and I'm not sure
> this was intended...

My (limited) understanding of the motivation for relative imports is
that they are only here as a transitional feature.  Fully-absolute
imports are the official future.

Neither relative nor fully-absolute imports address the fact that in any
multi-package project I've been involved with, there is some kind of
sys.path hackery required (or even custom import hooks).  Indeed, there
is no clean way from a test module 'foo.bar.test.test_hello' to import
'foo.bar.hello': the top-level directory must first be inserted into
sys.path magically.
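(The usual hack, for reference -- a sketch only, using the package
names from the example above:)

# foo/bar/test/test_hello.py
import os, sys

# insert the directory that *contains* the top-level 'foo' package,
# four levels up from this file, so the absolute import below works
_top = os.path.abspath(__file__)
for _ in range(4):
    _top = os.path.dirname(_top)
sys.path.insert(0, _top)

from foo.bar import hello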

> /foo/bar/imp1.py <-- has a "from . import imp2"
> /foo/bar/imp2.py
> 
> if I now put a test-case (or any other module I'd like as the main module) at:
> /foo/bar/mytest.py
> 
> if it imports imp1, it will always fail.

Indeed: foo/bar/mytest.py must do 'import foo.bar.imp1' or 'from foo.bar
import imp1', and then it works (if sys.path was properly hacked first,
of course).  (I'm not sure, but I think that this is not so much a
language design decision as a consequence of the complexities of
import.c, which is the largest C source file in CPython and steadily
growing.)


A bientot,

Armin

From ncoghlan at gmail.com  Mon Sep 18 13:25:03 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 18 Sep 2006 21:25:03 +1000
Subject: [Python-Dev] New relative import issue
In-Reply-To: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
Message-ID: <450E820F.4000505@gmail.com>

Fabio Zadrozny wrote:
> I've been playing with the new features and there's one thing about
> the new relative import that I find a little strange and I'm not sure
> this was intended...
> 
> When you do a from . import xxx, it will always fail if you're in a
> top-level module, and when executing any module, the directory of the
> module will automatically go into the pythonpath, thus making all the
> relative imports in that structure fail.

Correct. Relative imports are based on __name__ and don't work properly if
__name__ does not properly reflect the module's position in the package
hierarchy (usually because the module is the main module, so __name__ is set
to '__main__').

This is noted briefly in PEP 328 [1], with the current workarounds explained 
in more detail in PEP 338 [2].

Cheers,
Nick.

[1]
http://www.python.org/dev/peps/pep-0328/#relative-imports-and-name

[2]
http://www.python.org/dev/peps/pep-0338/#import-statements-and-the-main-module

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From martin at v.loewis.de  Mon Sep 18 16:02:29 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 18 Sep 2006 16:02:29 +0200
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <ee2a432c0609172359l758e61b9j3da3b9366a307bb0@mail.gmail.com>
References: <20060916111111.GA27757@code0.codespeak.net>	
	<200609181435.59238.anthony@interlink.com.au>	
	<1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>	
	<450E3BFF.3000700@v.loewis.de>	
	<ee2a432c0609172329y13e21a34ke5b0ab193dc01b96@mail.gmail.com>	
	<450E431A.8010507@v.loewis.de>
	<ee2a432c0609172359l758e61b9j3da3b9366a307bb0@mail.gmail.com>
Message-ID: <450EA6F5.4040100@v.loewis.de>

Neal Norwitz schrieb:
> I'm getting a crash when running test_builtin and test_calendar (at
> least) with gcc 4.1.1 on amd64.  It's happening in pymalloc, though I
> don't know what the cause is.  I thought I tested with gcc 4.1 before,
> but probably would have been in debug mode.

Can't really check right now, but it might be that this is just the
limitation that a debug obmalloc doesn't work on 64-bit systems.
There is a header at each block with a fixed size of 4 bytes, even
though it should be 8 bytes on 64-bit systems. This header is there
only in a debug build.

Regards,
Martin


From devik at cdi.cz  Mon Sep 18 15:46:02 2006
From: devik at cdi.cz (Martin Devera)
Date: Mon, 18 Sep 2006 15:46:02 +0200
Subject: [Python-Dev] deja-vu .. python locking
Message-ID: <450EA31A.6060500@cdi.cz>

Hello,

as someone has written in the FAQ, every now and then someone starts a
thread about finer-grained locking in Python.
OK, here is one.

I don't want to start a flamewar. I only seek suggestions and constructive
criticism. I have some ideas that are new in this context (I believe), and
I only want to make them public in case someone finds them interesting.

Comments are welcome.
Martin

------------
Round 1, Greg Stein's patch
  The patch removes the GIL from version 1.6 and replaces the locking of
  list, dict and other structures with finer-grained locking.
  The major slowdown seems to be in the list and dict structures; dicts
  are used for object attributes and these are accessed quite often.

  Because (IIUC) the mutex struct is quite heavy, dicts and lists are
  locked via a pool of locks. When you lock such a pooled lock you
  have to lock two locks in reality: one locks the pool itself, the other
  locks the pooled lock (the second locking can be omitted in the
  non-contended case because locks in the pool are in the locked state).
  One lock takes about 25 cycles on a UP P4 (mainly the pipeline flush
  during the memory barrier) and can be even more expensive (hundreds
  of cycles) due to cacheline moves between CPUs on an SMP machine.
  The "global" pool lock is subject to cacheline ping-pong as it will
  often be reacquired by competing CPUs.
  In mappinglookup there is lookmapping guarded by this locking scheme;
  lookmapping itself takes about 20 cycles in the best (one hopes typical)
  case, plus the compareobj cost (for string keys, say 50..100 cycles?).
  Thus locking/unlocking the read takes 50..100 cycles while the operation
  itself is 70-120 cycles.
  One might expect about a 50% slowdown in the dict read path.

RCU-like locking
  The solution I have in mind is similar to RCU. In Python we have a
  quiescent state - when a thread returns to the main loop of the interpreter.
  Let's add an "owner_thread" field to each locked object. It reflects the
  last thread (its id) which called any lockable method on the object.
  Each LOCK operation looks like:
  while (ob->owner_thread != self_thread()) {
	 prev = ob->owner_thread;
	 unlock_mutex(thread_mutex[self_thread()]);
	 /* wait for the owning thread to reach a quiescent state */
	 lock_mutex(thread_mutex[prev]);
	 ob->owner_thread = self_thread();
	 unlock_mutex(thread_mutex[prev]);
	 lock_mutex(thread_mutex[self_thread()]);
  }
  No unlock is done - we own the object now and can use it without locking
  (until we return to the interpreter loop or we call LOCK on another object).
  For non-shared objects there is only the penalty of the
  ob->owner_thread != self_thread() condition. Not sure about Windows, but
  on recent Linuxes one can use the %gs register as the thread id, so the
  compare is about 3 cycles (and owner_thread should be in the object's
  cacheline anyway).
  In the contended case there is some cache ping-pong on ob and the mutex,
  but that is expected.

Deadlocks
  Object ownership is held for a long time - from acquiring it in LOCK to the
  thread's next quiescent state. Thus when two threads each want to step on
  the other's object, they will deadlock. A simple solution is to extend the
  set of quiescent states: a thread also reaches one when it releases its
  thread_mutex in the main loop (and immediately reacquires it). Additionally,
  it can release it just before it is going to wait on another thread's mutex,
  as in LOCK (already in the code above). If you use LOCK correctly, then
  while you are LOCKing an object you can't be in a vulnerable part of any
  OTHER object, so other threads can take ownership of your own objects
  during that time.
  One may also want to release this lock when going to lock a mutex in the
  threading package and in other places where the GIL is released today.
  However, I admit that I have done no formal proof regarding deadlocks; I
  plan to do one if nobody can find another flaw in the proposal.

Big reader lock
  While the above scheme might work well, it would impose a performance
  penalty on shared dicts which are almost read-only (module.__dict__).
  For these, similar locking can be used, except that a writer has to wait
  until ALL other threads enter a quiescent state (taking all their locks),
  then perform the change and unlock them all. Readers can read without any
  locking.

Compatibility with 3rd party modules
  I've read this argument on the python-dev list. Maybe I'm not understanding
  something, but would it be so complex for Py_InitModule4 to use an extra
  flag in apiver, for example? When at least one non-freethreaded module is
  loaded, locking is done in the good old way...

From martin at v.loewis.de  Mon Sep 18 16:18:59 2006
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 18 Sep 2006 16:18:59 +0200
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the
 path
In-Reply-To: <20060916173806.1717.491882186.divmod.quotient.51076@ohm>
References: <20060916173806.1717.491882186.divmod.quotient.51076@ohm>
Message-ID: <450EAAD3.4010002@v.loewis.de>

Jean-Paul Calderone schrieb:
> You can find the quoting/dequoting rules used by cmd.exe documented on msdn:
> 
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/_pluslang_Parsing_C.2b2b_.Command.2d.Line_Arguments.asp
> 
> Interpreting them is something of a challenge (my favorite part is how the
> examples imply that the final argument is automatically uppercased ;)

That doesn't talk about cmd.exe, does it? It rather looks like the
procedure used to create argc/argv when calling main() in the
C run-time library.

If cmd.exe used these rules, the current Python code should
be fine, AFAICT.

Regards,
Martin

From martin at v.loewis.de  Mon Sep 18 16:29:32 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 18 Sep 2006 16:29:32 +0200
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the
 path
In-Reply-To: <1f7befae0609161249u751e9a8oe651b1ca81be1879@mail.gmail.com>
References: <450C32DA.9030601@v.loewis.de>
	<1f7befae0609161249u751e9a8oe651b1ca81be1879@mail.gmail.com>
Message-ID: <450EAD4C.5020902@v.loewis.de>

Tim Peters schrieb:
> These are the MS docs for cmd.exe's inscrutable quoting rules after /C:
> 
> """
> If /C or /K is specified, then the remainder of the command line after
> the switch is processed as a command line, where the following logic is
> used to process quote (") characters:
> 
>     1.  If all of the following conditions are met, then quote characters
>         on the command line are preserved:

I couldn't make sense of the German translation; reading over the
English version several times, I think I now understand what it does
(not that I truly understand *why* it does that, probably because too
 many people complained that it would strip off quotes when the
 program name had a space in it).

> I personally wouldn't change anything here for 2.5.  It's a minefield,
> and people who care a lot already have their own workarounds in place,
> which we'd risk breaking.  It remains a minefield for newbies, but
> we're really just passing on cmd.exe's behaviors.

So what do you suggest for 2.6? "Fix" it (i.e. make sure that the
target process is invoked with the same command line that is
passed to popen)? Or leave it as-is and just document the limitations
better? It's non-obvious that popen uses %COMSPEC% /c.
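(For code that just needs to dodge the issue today, something like the
following sidesteps cmd.exe entirely -- the paths here are made up:)

import subprocess

# Passing an argument list goes straight to CreateProcess via
# list2cmdline, so cmd.exe never gets a chance to re-interpret the
# quoting around a program path that contains spaces.
p = subprocess.Popen([r'C:\Program Files\Some Tool\tool.exe', 'arg one'],
                     stdout=subprocess.PIPE)
output = p.stdout.read()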

(Another problem is that the error message from cmd.exe gets discarded;
 that should get fixed regardless)

> People are well-advised to accept the installer's default directory.

That's very true, but difficult to communicate. Too many people actually
complain about that, and some even bring reasonable arguments (such
as the ACL in c:\ being too permissive for a software installation).

Regards,
Martin

From martin at v.loewis.de  Mon Sep 18 16:46:40 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 18 Sep 2006 16:46:40 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450EA31A.6060500@cdi.cz>
References: <450EA31A.6060500@cdi.cz>
Message-ID: <450EB150.90700@v.loewis.de>

Martin Devera schrieb:
> RCU like locking
>   Solution I have in mind is similar to RCU. In Python we have quiscent
>   state - when a thread returns to main loop of interpreter.

There might be a terminology problem here. RCU is read-copy-update,
right? I fail to see the copy (copy data structure to be modified)
and update (replace original pointer with pointer to copy) part.
Does this play a role in that scheme? If so, what specific structure
is copied for, say, a list or a dict?

This confusion makes it very difficult for me to understand your
proposal, so I can't comment much on it. If you think it could work,
just go ahead and create an implementation.

Regards,
Martin

From devik at cdi.cz  Mon Sep 18 17:06:47 2006
From: devik at cdi.cz (Martin Devera)
Date: Mon, 18 Sep 2006 17:06:47 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450EB150.90700@v.loewis.de>
References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de>
Message-ID: <450EB607.70801@cdi.cz>

Martin v. Löwis wrote:
> Martin Devera schrieb:
>> RCU like locking
>>   Solution I have in mind is similar to RCU. In Python we have quiscent
>>   state - when a thread returns to main loop of interpreter.
> 
> There might be a terminology problem here. RCU is read-copy-update,
> right? I fail to see the copy (copy data structure to be modified)
> and update (replace original pointer with pointer to copy) part.
> Do this play a role in that scheme? If so, what specific structure
> is copied for, say, a list or a dict?
> 
> This confusion makes it very difficult for me to understand your
> proposal, so I can't comment much on it. If you think it could work,
> just go ahead and create an implementation.

That is why I used the word "similar". I see the similarity in the way the
safe "delete" phase of RCU is achieved. I probably selected a bad title for
the text; it is because I was reading about the RCU implementation in the
Linux kernel and discovered that the idea of postponing critical code to
some safe point in the future might work in the Python interpreter.

So you are right. It is not RCU. It only uses a technique similar to the one
RCU uses for freeing the old copy of data.

It is based on the assumption that an object is typically used by a single
thread. You must lock it anyway, just in case another thread steps on it.
The idea is that each object is "owned" by a thread. The owner can use its
objects without locking. If a thread wants to use a foreign object then it
has to wait for the owning thread to go to some safe place (out of the
interpreter, into a LOCK of another object...). This is done with a
per-thread lock, and it is necessary because the owner does no locking; thus
you can be sure that nobody is using the object while the former owner is
somewhere outside of it.

Regarding implementation, I wanted to get some opinions before starting to
implement something as big as this patch. Probably someone can look and say,
hey, it is stupid, you forgot that... FILL_IN ... ;-)

I hope I explained it better this time; I know my English is not the best.
At least it's worse than my Python :-)

thanks for your time, Martin

From exarkun at divmod.com  Mon Sep 18 17:22:07 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Mon, 18 Sep 2006 11:22:07 -0400
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450EB607.70801@cdi.cz>
Message-ID: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>

On Mon, 18 Sep 2006 17:06:47 +0200, Martin Devera <devik at cdi.cz> wrote:
>Martin v. Löwis wrote:
>> Martin Devera schrieb:
>>> RCU like locking
>>>   Solution I have in mind is similar to RCU. In Python we have quiscent
>>>   state - when a thread returns to main loop of interpreter.
>>
>> There might be a terminology problem here. RCU is read-copy-update,
>> right? I fail to see the copy (copy data structure to be modified)
>> and update (replace original pointer with pointer to copy) part.
>> Do this play a role in that scheme? If so, what specific structure
>> is copied for, say, a list or a dict?
>>
>> This confusion makes it very difficult for me to understand your
>> proposal, so I can't comment much on it. If you think it could work,
>> just go ahead and create an implementation.
>
>It is why I used a word "similar". I see the similarity in a way to archieve
>safe "delete" phase of RCU. Probably I selected bad title for the text. It
>is because I was reading about RCU implementation in Linux kernel and
>I discovered that the idea of postponing critical code to some safe point in
>future might work in Python interpreter.
>
>So that you are right. It is not RCU. It only uses similar technique as RCU
>uses for free-ing old copy of data.
>
>It is based on assumption that an object is typicaly used by single thread. 

Which thread owns builtins?  Or module dictionaries?  If two threads are
running the same function and share no state except their globals, won't
they constantly be thrashing on the module dictionary?  Likewise, if the
same method is running in two different threads, won't they thrash on the
class dictionary?

Jean-Paul

From tim.peters at gmail.com  Mon Sep 18 17:27:00 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Sep 2006 11:27:00 -0400
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <450EA6F5.4040100@v.loewis.de>
References: <20060916111111.GA27757@code0.codespeak.net>
	<200609181435.59238.anthony@interlink.com.au>
	<1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>
	<450E3BFF.3000700@v.loewis.de>
	<ee2a432c0609172329y13e21a34ke5b0ab193dc01b96@mail.gmail.com>
	<450E431A.8010507@v.loewis.de>
	<ee2a432c0609172359l758e61b9j3da3b9366a307bb0@mail.gmail.com>
	<450EA6F5.4040100@v.loewis.de>
Message-ID: <1f7befae0609180827p7ce60142u8c3cd3d9f3c9483@mail.gmail.com>

[Neal Norwitz]
>> I'm getting a crash when running test_builtin and test_calendar (at
>> least) with gcc 4.1.1 on amd64.  It's happening in pymalloc, though I
>> don't know what the cause is.  I thought I tested with gcc 4.1 before,
>> but probably would have been in debug mode.

Neil, in context it was unclear whether you were using trapv at the
time.  Were you?

[Martin v. Löwis]
> Can't really check right now, but it might be that this is just the
> limitation that a debug obmalloc doesn't work on 64-bit systems.
> There is a header at each block with a fixed size of 4 bytes, even
> though it should be 8 bytes on 64-bit systems. This header is there
> only in a debug build.

Funny then how all the 64-bit buildbots manage to pass running debug builds ;-)

As of revs 46637 + 46638 (3-4 months ago), debug-build obmalloc uses
sizeof(size_t) bytes for each of its header and trailer debugging
fields.

Before then, the debug-build obmalloc was "safe" in this respect:  if
it /needed/ to store more than 4 bytes in a debug bookkeeping field,
it assert-failed in a debug build.  That would happen if and only if a
call to malloc/realloc requested >= 2**32 bytes, so was never provoked
by Python's test suite.  As of rev 46638, that limitation should have
gone away.

From devik at cdi.cz  Mon Sep 18 19:08:16 2006
From: devik at cdi.cz (Martin Devera)
Date: Mon, 18 Sep 2006 19:08:16 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>
References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>
Message-ID: <450ED280.8010409@cdi.cz>

>> So that you are right. It is not RCU. It only uses similar technique as RCU
>> uses for free-ing old copy of data.
>>
>> It is based on assumption that an object is typicaly used by single thread. 
> 
> Which thread owns builtins?  Or module dictionaries?  If two threads are
> running the same function and share no state except their globals, won't
> they constantly be thrashing on the module dictionary?  Likewise, if the
> same method is running in two different threads, won't they thrash on the
> class dictionary?

As I've written in the "Big reader lock" paragraph of the original proposal, these
objects could be handled by not blocking in the read path and waiting for all other
threads to "come home" before modifying.
The locking mode could be selected either by something like __locking__ or by
detecting the mode automatically.
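
(To make the "come home" idea a bit more concrete, here is a rough Python-level
analogy; it is an illustrative sketch only, written against current Python, not
the proposed C implementation, and the names such as BigReaderLock are made up
for the example.)

import threading

class BigReaderLock(object):
    def __init__(self):
        self._reader_locks = []
        self._registry = threading.Lock()      # protects _reader_locks
        self._writer_mutex = threading.Lock()  # serialises writers

    def register_reader(self):
        # Called once per reader thread; the returned lock is held for as
        # long as that thread is "inside" the interpreter/object.
        lock = threading.Lock()
        lock.acquire()
        with self._registry:
            self._reader_locks.append(lock)
        return lock

    @staticmethod
    def safe_point(my_lock):
        # A reader calls this at its quiescent points: it briefly releases
        # its personal lock so that a pending writer can get through.
        my_lock.release()
        my_lock.acquire()

    def write(self, mutate):
        # Assumes the writing thread does not hold a reader lock of its own
        # here (otherwise it would block on itself).
        with self._writer_mutex:
            with self._registry:
                for lock in self._reader_locks:   # wait for every thread
                    lock.acquire()                # to reach a safe point
                try:
                    mutate()                      # nobody is "inside" now
                finally:
                    for lock in self._reader_locks:
                        lock.release()

A reader would call safe_point() wherever the real interpreter returns to its
main loop; refcounting is deliberately ignored in this sketch.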


From nnorwitz at gmail.com  Mon Sep 18 19:27:25 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 18 Sep 2006 10:27:25 -0700
Subject: [Python-Dev] Before 2.5 - More signed integer overflows
In-Reply-To: <1f7befae0609180827p7ce60142u8c3cd3d9f3c9483@mail.gmail.com>
References: <20060916111111.GA27757@code0.codespeak.net>
	<200609181435.59238.anthony@interlink.com.au>
	<1f7befae0609172158q7d6add0dwf411380b5591c782@mail.gmail.com>
	<450E3BFF.3000700@v.loewis.de>
	<ee2a432c0609172329y13e21a34ke5b0ab193dc01b96@mail.gmail.com>
	<450E431A.8010507@v.loewis.de>
	<ee2a432c0609172359l758e61b9j3da3b9366a307bb0@mail.gmail.com>
	<450EA6F5.4040100@v.loewis.de>
	<1f7befae0609180827p7ce60142u8c3cd3d9f3c9483@mail.gmail.com>
Message-ID: <ee2a432c0609181027yd786d0bhd18a5e76c81afc79@mail.gmail.com>

On 9/18/06, Tim Peters <tim.peters at gmail.com> wrote:
> [Neal Norwitz]
> >> I'm getting a crash when running test_builtin and test_calendar (at
> >> least) with gcc 4.1.1 on amd64.  It's happening in pymalloc, though I
> >> don't know what the cause is.  I thought I tested with gcc 4.1 before,
> >> but probably would have been in debug mode.
>
> Neil, in context it was unclear whether you were using trapv at the
> time.  Were you?

No trapv, just ./configure --without-pydebug IIRC.  I should have sent
a msg last night, but was too tired.  I got the same crash (I think)
with gcc 3.4.4, so it's almost certainly due to an outstanding
change, not Python's or gcc's fault.

n

From rasky at develer.com  Mon Sep 18 20:45:49 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Mon, 18 Sep 2006 20:45:49 +0200
Subject: [Python-Dev] Testsuite fails on Windows if a space is in the
	path
References: <450C32DA.9030601@v.loewis.de><1f7befae0609161249u751e9a8oe651b1ca81be1879@mail.gmail.com>
	<450EAD4C.5020902@v.loewis.de>
Message-ID: <09c401c6db52$a8d1a2b0$b803030a@trilan>

Martin v. Löwis wrote:

>> People are well-advised to accept the installer's default directory.
>
> That's very true, but difficult to communicate. Too many people
> actually
> complain about that, and some even bring reasonable arguments (such
> as the ACL in c:\ being too permissive for a software installation).

Besides, it won't be allowed in Vista with the default user permissions.
-- 
Giovanni Bajo


From pje at telecommunity.com  Mon Sep 18 21:45:20 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 18 Sep 2006 15:45:20 -0400
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450ED280.8010409@cdi.cz>
References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>
	<20060918152207.1717.107889227.divmod.quotient.52985@ohm>
Message-ID: <5.1.1.6.0.20060918154011.026fb190@sparrow.telecommunity.com>

At 07:08 PM 9/18/2006 +0200, Martin Devera wrote:
> >> So that you are right. It is not RCU. It only uses similar technique 
> as RCU
> >> uses for free-ing old copy of data.
> >>
> >> It is based on assumption that an object is typicaly used by single 
> thread.
> >
> > Which thread owns builtins?  Or module dictionaries?  If two threads are
> > running the same function and share no state except their globals, won't
> > they constantly be thrashing on the module dictionary?  Likewise, if the
> > same method is running in two different threads, won't they thrash on the
> > class dictionary?
>
>As I've written in "Big reader lock" paragraph of the original proposal, these
>objects could be handled by not blocking in read path and wait for all other
>threads to "come home" before modifying.

Changing an object's reference count is modifying it, and most accesses to 
get the dictionaries themselves involve refcount changes.  Your plan, so 
far, does not appear to have any solution for reducing this overhead.

Module globals aren't so bad, in that you'd only have to lock and refcount 
when frames are created and destroyed.  But access to class dictionaries to 
obtain methods happens a lot more often, and refcounting is involved there 
as well.

So, I think for your plan to work, you would have to eliminate reference 
counting, in order to bring the lock overhead down to a manageable level.


From martin at v.loewis.de  Mon Sep 18 22:21:51 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 18 Sep 2006 22:21:51 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450EB607.70801@cdi.cz>
References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de>
	<450EB607.70801@cdi.cz>
Message-ID: <450EFFDF.1020307@v.loewis.de>

Martin Devera schrieb:
> It is based on assumption that an object is typicaly used by single 
> thread. You must lock it anyway just for case if another thread steps
> on it. The idea is that each object is "owned" by a thread. Owner can
> use its objects without locking. If a thread wants to use foreign
> object then it has to wait for owning thread to go to some safe place
> (out of interpreter, into LOCK of other object..). It is done by
> per-thread lock and it is neccessary because owner does no locking, 
> thus you can be sure that nobody it using the object when former
> owner is somewhere out of the object.

Ah, I think I understand now. First the minor critique: I believe
the locking algorithm isn't thread-safe:

  while (ob->owner_thread != self_thread()) {
	 unlock_mutex(thread_mutex[self_thread()])
	// wait for owning thread to go to quiscent state
	 lock_mutex(thread_mutex[ob->owner_thread])
	 ob->owner_thread = self_thread()
	 unlock_mutex(thread_mutex[ob->owner_thread])
	 lock_mutex(thread_mutex[self_thread()])
  }

If two threads are competing for the same object held by a third
thread, they may simultaneously enter the while loop, and then
simultaneously try to lock the owner_thread. Now, one will win,
and own the object. Later, the other will gain the lock, and
unconditionally overwrite ownership. This will cause two threads
to own the object, which is an error.

The more fundamental critique is: Why? It seems you do this
to improve efficiency, (implicitly) claiming that it is
more efficient to keep holding the lock, instead of releasing
and re-acquiring it each time.

I claim that this doesn't really matter: any reasonable
mutex implementation will be "fast" if there is no lock
contention. On locking, it will not invoke any system
call if the lock is currently not held (but just
atomically test-and-set some field of the lock); on
unlocking, it will not invoke any system call if
the wait list is empty. As you also need to test, there
shouldn't be much of a performance difference.

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Tue Sep 19 05:46:59 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Sep 2006 15:46:59 +1200
Subject: [Python-Dev] New relative import issue
In-Reply-To: <20060918091314.GA26814@code0.codespeak.net>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
Message-ID: <450F6833.60603@canterbury.ac.nz>

Armin Rigo wrote:

> My (limited) understanding of the motivation for relative imports is
> that they are only here as a transitional feature.  Fully-absolute
> imports are the official future.

Guido does seem to have a dislike for relative imports,
but I don't really understand why. The usefulness of
being able to make a package self-contained and movable
to another place in the package hierarchy without hacking
it seems self-evident to me.

What's happening in Py3k? Will relative imports still
exist?

> there
> is no clean way from a test module 'foo.bar.test.test_hello' to import
> 'foo.bar.hello': the top-level directory must first be inserted into
> sys.path magically.

I've felt for a long time that problems like this
wouldn't arise so much if there were a closer
connection between the package hierarchy and the
file system structure. There really shouldn't be
any such thing as sys.path -- the view that any
given module has of the package namespace should
depend only on where it is, not on the history of
how it came to be invoked.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From jcarlson at uci.edu  Tue Sep 19 06:18:24 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 18 Sep 2006 21:18:24 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <450F6833.60603@canterbury.ac.nz>
References: <20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
Message-ID: <20060918210603.07EA.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Armin Rigo wrote:
> > there
> > is no clean way from a test module 'foo.bar.test.test_hello' to import
> > 'foo.bar.hello': the top-level directory must first be inserted into
> > sys.path magically.
> 
> I've felt for a long time that problems like this
> wouldn't arise so much if there were a closer
> connection between the package hierarchy and the
> file system structure. There really shouldn't be
> any such thing as sys.path -- the view that any
> given module has of the package namespace should
> depend only on where it is, not on the history of
> how it came to be invoked.

Wait, wait, wait.  If I remember correctly, one of the use-cases cited
was for sub-packages of a single larger package to be able to import
other sub-packages, via 'from ..subpackage2 import module2'.  That is to
say, given a package structure like...

.../__init__.py
.../subpackage1/module1.py
.../subpackage1/__init__.py
.../subpackage2/module2.py
.../subpackage2/__init__.py

Running module1.py, with an import line that read:
    from ..subpackage2 import module2

... would import module2 from subpackage2

Testing this in the beta I have installed tells me:

Traceback (most recent call last):
  File "module1.py", line 1, in <module>
    from ..subpackage2 import module2
ValueError: Relative importpath too deep

While I can understand why this is the case (if one is going to be
naming modules relative to __main__ or otherwise, then unless one preserves
the number of leading '.' characters, giving module2 a __name__ of
__main__..subpackage2.module2 or ..subpackage2.module2, the naming can get
confusing), it does remove a very important feature.

Guido suggested I make up a PEP way back in March or so, but I was
slowed by actually implementing __main__-relative naming (which is
currently incomplete).

As it stands, in order to "work around" this particular feature, one
would need to write a 'loader' to handle importing and/or main() calling
in subpackage1/module1.py .


 - Josiah


From brett at python.org  Tue Sep 19 06:15:50 2006
From: brett at python.org (Brett Cannon)
Date: Mon, 18 Sep 2006 21:15:50 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <450F6833.60603@canterbury.ac.nz>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
Message-ID: <bbaeab100609182115w12e6f4dcl6e0b4dbf6b7a38a6@mail.gmail.com>

On 9/18/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
> Armin Rigo wrote:
>
> > My (limited) understanding of the motivation for relative imports is
> > that they are only here as a transitional feature.  Fully-absolute
> > imports are the official future.
>
> Guido does seem to have a dislike for relative imports,
> but I don't really understand why. The usefulness of
> being able to make a package self-contained and movable
> to another place in the package hierarchy without hacking
> it seems self-evident to me.


It is more that relative imports used to be implicit, and thus there was no
clear way to indicate that an import was being done using a relative path
as opposed to an absolute one.

> What's happening in Py3k? Will relative imports still
> exist?


Using the dot notation, yes they will exist in Py3K.

-Brett

From greg.ewing at canterbury.ac.nz  Tue Sep 19 06:42:47 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Sep 2006 16:42:47 +1200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450EB607.70801@cdi.cz>
References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de>
	<450EB607.70801@cdi.cz>
Message-ID: <450F7547.2080900@canterbury.ac.nz>

Martin Devera wrote:

> Regarding implementation, I wanted to look for some opinions before starting to
> implement something as big as this patch. Probably someone can look and say, hey
> it is stupit, you forgot that.... FILL_IN ... ;-)

If I understand correctly, your suggestion for avoiding
deadlock relies on the fact that a given thread can really
only have one object locked at a time, i.e. after you
LOCK an object you can only assume you own it until
you LOCK another object or return to some quiescent
state. Is this right?

If so, the question is whether it's sufficient to be
able to lock just one object at a time. Maybe it is,
but some more formal consideration of that might be
a good idea.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Sep 19 06:58:41 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Sep 2006 16:58:41 +1200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450ED280.8010409@cdi.cz>
References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>
	<450ED280.8010409@cdi.cz>
Message-ID: <450F7901.9030106@canterbury.ac.nz>

Martin Devera wrote:

> As I've written in "Big reader lock" paragraph of the original proposal, these
> objects could be handled by not blocking in read path

But as was just pointed out, because of refcounting,
there's really no such thing as read-only access to
an object. What *looks* like read-only access at the
Python level involves refcount updates just from the
act of touching the object.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Sep 19 07:13:46 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Sep 2006 17:13:46 +1200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <5.1.1.6.0.20060918154011.026fb190@sparrow.telecommunity.com>
References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>
	<20060918152207.1717.107889227.divmod.quotient.52985@ohm>
	<5.1.1.6.0.20060918154011.026fb190@sparrow.telecommunity.com>
Message-ID: <450F7C8A.7070500@canterbury.ac.nz>

Phillip J. Eby wrote:

> So, I think for your plan to work, you would have to eliminate reference 
> counting, in order to bring the lock overhead down to a manageable level.

There's a possibility it wouldn't be atrociously bad.
It seems like it would only add three instructions or so of
overhead to most refcount operations.

How much this would reduce performance depends on
what percentage of time is currently used by refcounting.
Are there any figures for that?

A quick way of getting an idea of how much effect
it would have might be to change Py_INCREF and
Py_DECREF to go through the relevant motions, and
see what timings are produced for single-threaded
code. It wouldn't be a working implementation, but
you'd find out pretty quickly if it were going to
be a disaster.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From devik at cdi.cz  Tue Sep 19 09:39:28 2006
From: devik at cdi.cz (Martin Devera)
Date: Tue, 19 Sep 2006 09:39:28 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450F7901.9030106@canterbury.ac.nz>
References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>
	<450ED280.8010409@cdi.cz> <450F7901.9030106@canterbury.ac.nz>
Message-ID: <450F9EB0.3090206@cdi.cz>

Greg Ewing wrote:
> Martin Devera wrote:
> 
>> As I've written in "Big reader lock" paragraph of the original 
>> proposal, these
>> objects could be handled by not blocking in read path
> 
> But as was just pointed out, because of refcounting,
> there's really no such thing as read-only access to
> an object. What *looks* like read-only access at the
> Python level involves refcount updates just from the
> act of touching the object.
> 

Yes, I was thinking about atomic inc/dec (locked inc/dec on x86)
as used in G. Stein's patch.
I have to admit that I haven't measured its performance; I was
hoping for a decent one. But from http://www.linuxjournal.com/article/6993
it seems that an atomic inc is rather expensive too (75ns on a 1.8GHz P4) :-(

Greg, what change do you have in mind regarding that "3 instruction
addition" to refcounting?
thanks, Martin

From devik at cdi.cz  Tue Sep 19 09:51:18 2006
From: devik at cdi.cz (Martin Devera)
Date: Tue, 19 Sep 2006 09:51:18 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450EFFDF.1020307@v.loewis.de>
References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de>
	<450EB607.70801@cdi.cz> <450EFFDF.1020307@v.loewis.de>
Message-ID: <450FA176.3090801@cdi.cz>

> Ah, I think I understand now. First the minor critique: I believe
> the locking algorithm isn't thread-safe:
> 
>   while (ob->owner_thread != self_thread()) {
> 	 unlock_mutex(thread_mutex[self_thread()])
> 	// wait for owning thread to go to quiscent state
> 	 lock_mutex(thread_mutex[ob->owner_thread])
> 	 ob->owner_thread = self_thread()
> 	 unlock_mutex(thread_mutex[ob->owner_thread])
> 	 lock_mutex(thread_mutex[self_thread()])
>   }
> 
> If two threads are competing for the same object held by a third
> thread, they may simultaneously enter the while loop, and then
> simultaneously try to lock the owner_thread. Now, one will win,
> and own the object. Later, the other will gain the lock, and
> unconditionally overwrite ownership. This will cause two threads
> to own the objects, which is an error.

oops ... well, it seems like a very stupid error on my side. Yes, you are
absolutely right, I'll have to rethink it. I hope it is possible
to do it in a correct way...

> The more fundamental critique is: Why? It seems you do this
> to improve efficiency, (implicitly) claiming that it is
> more efficient to keep holding the lock, instead of releasing
> and re-acquiring it each time.
> 
> I claim that this doesn't really matter: any reasonable
> mutex implementation will be "fast" if there is no lock
> contention. On locking, it will not invoke any system
> call if the lock is currently not held (but just
> atomically test-and-set some field of the lock); on
> unlocking, it will not invoke any system call if
> the wait list is empty. As you also need to test, there
> shouldn't be much of a performance difference.

I measured it. A lock op in futex-based Linux locking is the same
speed as a Windows critical section, and it is about 30 cycles on my
P4 1.8GHz in the uncontended case.
As explained in the already mentioned http://www.linuxjournal.com/article/6993,
it seems to be due to the pipeline flush during the cmpxchg insn.
And there will be a cacheline transfer penalty, which is much larger. So
mutex locking will take time comparable with the protected code itself
(assuming fast code like a dict/list read).
A single compare will take ten times less.
Am I missing something?

thanks, Martin

From phd at phd.pp.ru  Tue Sep 19 11:47:38 2006
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Tue, 19 Sep 2006 13:47:38 +0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <450F6833.60603@canterbury.ac.nz>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
Message-ID: <20060919094738.GC27707@phd.pp.ru>

On Tue, Sep 19, 2006 at 03:46:59PM +1200, Greg Ewing wrote:
> There really shouldn't be
> any such thing as sys.path -- the view that any
> given module has of the package namespace should
> depend only on where it is

   I do not understand this. Can you show an example? Imagine I have two
servers, Linux and FreeBSD, and on Linux python is in /usr/bin, home is
/home/phd, on BSD these are /usr/local/bin and /usr/home/phd. I have some
modules in site-packages and some modules in $HOME/lib/python. How can I
move programs from one server to the other without rewriting them (how can
I avoid putting full paths to modules)? I use PYTHONPATH manipulation - it's
enough to write a shell script that starts the daemons once and use it for
many years. How can I do this without sys.path?!

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From ncoghlan at gmail.com  Tue Sep 19 12:16:59 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 19 Sep 2006 20:16:59 +1000
Subject: [Python-Dev] New relative import issue
In-Reply-To: <20060918210603.07EA.JCARLSON@uci.edu>
References: <20060918091314.GA26814@code0.codespeak.net>	<450F6833.60603@canterbury.ac.nz>
	<20060918210603.07EA.JCARLSON@uci.edu>
Message-ID: <450FC39B.9070200@gmail.com>

Josiah Carlson wrote:
> As it stands, in order to "work around" this particular feature, one
> would need to write a 'loader' to handle importing and/or main() calling
> in subpackage1/module1.py .

Yup. At the moment, you can rely on PEP 328 or on PEP 338, but not both at 
the same time. This was previously discussed back in June/July with Anthony 
convincing me that the solution to the current poor interaction shouldn't be 
rushed [1].

It is, however, pretty trivial to write a runpy.run_module based launcher that 
will execute your module and use something other than "__name__ == '__main__'" 
to indicate that the module is the main module. By letting run_module set 
__name__ normally, relative imports will "just work".

For example:

#mypkg/launch.py
# Runs a script, using the global _launched to indicate whether or not
# the module is the main module

import sys

if "_launched" not in globals():
    _launched = False
if (__name__ == "__main__") or _launched:
    import runpy
    # Run the module specified as the next command line argument
    if len(sys.argv) < 2:
        print >> sys.stderr, "No module specified for execution"
    else:
        del sys.argv[0] # Make the requested module sys.argv[0]
        runpy.run_module(sys.argv[0],
                         init_globals=dict(_launched=True),
                         alter_sys=True)
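
(For illustration only: a module run through this launcher would test the
_launched flag, using the same globals() check as above, instead of
"__name__ == '__main__'", and might be started with something like
"python -m mypkg.launch mypkg.subpackage1.module1", assuming the package is
importable; the module and command names here are just an example.)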

Cheers,
Nick.

[1] http://mail.python.org/pipermail/python-dev/2006-July/067077.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From steve at holdenweb.com  Tue Sep 19 14:40:45 2006
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 19 Sep 2006 08:40:45 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <450F6833.60603@canterbury.ac.nz>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
Message-ID: <eeooet$v1m$1@sea.gmane.org>

Greg Ewing wrote:
> Armin Rigo wrote:
> 
> 
>>My (limited) understanding of the motivation for relative imports is
>>that they are only here as a transitional feature.  Fully-absolute
>>imports are the official future.
> 
> 
> Guido does seem to have a dislike for relative imports,
> but I don't really understand why. The usefulness of
> being able to make a package self-contained and movable
> to another place in the package hierarchy without hacking
> it seems self-evident to me.
> 
> What's happening in Py3k? Will relative imports still
> exist?
> 
> 
>>there
>>is no clean way from a test module 'foo.bar.test.test_hello' to import
>>'foo.bar.hello': the top-level directory must first be inserted into
>>sys.path magically.
> 
> 
> I've felt for a long time that problems like this
> wouldn't arise so much if there were a closer
> connection between the package hierarchy and the
> file system structure. There really shouldn't be
> any such thing as sys.path -- the view that any
> given module has of the package namespace should
> depend only on where it is, not on the history of
> how it came to be invoked.
> 
This does, of course, assume that you're importing modules from the 
filestore, which assumption is no longer valid in the presence of PEP 
302 importers.

The current initialization code actually looks for os.py as a means of 
establishing path elements. This should really be better integrated with 
the PEP 302 mechanism: ideally Python should work on systems that don't 
rely on filestore for import (even though for the foreseeable future all 
systems will continue to do this).

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From anthony at python.org  Tue Sep 19 14:39:48 2006
From: anthony at python.org (Anthony Baxter)
Date: Tue, 19 Sep 2006 22:39:48 +1000
Subject: [Python-Dev] RELEASED Python 2.5 (FINAL)
Message-ID: <200609192239.57508.anthony@python.org>

It's been nearly 20 months since the last major release
of Python (2.4), and 5 months since the first alpha
release of this cycle, so I'm absolutely thrilled to be
able to say:

    On behalf of the Python development team
    and the Python community, I'm happy to
    announce the FINAL release of Python 2.5.

This is a *production* release of Python 2.5. Yes, that's
right, it's finally here.

Python 2.5 is probably the most significant new release
of Python since 2.2, way back in the dark ages of 2001.
There's been a wide variety of changes and additions,
both user-visible and underneath the hood. In addition,
we've switched to SVN for development and now use Buildbot
to do continuous testing of the Python codebase.

Much more information (as well as source distributions
and Windows and Universal Mac OS X installers) is available
from the 2.5 website:

    http://www.python.org/2.5/

The new features in Python 2.5 are described in Andrew
Kuchling's What's New In Python 2.5. It's available
from the 2.5 web page.

Amongst the new features of Python 2.5 are conditional
expressions, the with statement, the merge of try/except
and try/finally into try/except/finally, enhancements
to generators to produce coroutine functionality, and
a brand new AST-based compiler implementation underneath
the hood. There's a variety of smaller new features as
well.

New to the standard library are hashlib, ElementTree,
sqlite3, wsgiref, uuid and ctypes. As well, a new
higher-performance profiling module (cProfile) was
added.

Extra-special thanks on behalf of the entire Python
community should go out to Neal Norwitz, who's done
absolutely sterling work in shepherding Python 2.5
through to its final release.

Enjoy this new release, (and Woo-HOO! It's done!)
Anthony

Anthony Baxter
anthony at python.org
Python Release Manager
(on behalf of the entire python-dev team)

From anthony at interlink.com.au  Tue Sep 19 16:06:00 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Wed, 20 Sep 2006 00:06:00 +1000
Subject: [Python-Dev] release25-maint branch - please keep frozen for a day
	or two more.
Message-ID: <200609200006.06585.anthony@interlink.com.au>

Could people please treat the release25-maint branch as frozen for a day or 
two, just in case we have to cut an ohmygodnononokillme release? Thanks,
Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From steve at holdenweb.com  Tue Sep 19 16:19:30 2006
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 19 Sep 2006 10:19:30 -0400
Subject: [Python-Dev] release25-maint branch - please keep frozen for a
 day or two more.
In-Reply-To: <200609200006.06585.anthony@interlink.com.au>
References: <200609200006.06585.anthony@interlink.com.au>
Message-ID: <eeou82$mu3$1@sea.gmane.org>

Anthony Baxter wrote:
> Could people please treat the release25-maint branch as frozen for a day or 
> two, just in case we have to cut an ohmygodnononokillme release? Thanks,

Otherwise to be known as 2.5.005?

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From martin at v.loewis.de  Tue Sep 19 20:13:29 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 19 Sep 2006 20:13:29 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450FA176.3090801@cdi.cz>
References: <450EA31A.6060500@cdi.cz> <450EB150.90700@v.loewis.de>
	<450EB607.70801@cdi.cz> <450EFFDF.1020307@v.loewis.de>
	<450FA176.3090801@cdi.cz>
Message-ID: <45103349.3020203@v.loewis.de>

Martin Devera schrieb:
> I measured it. Lock op in futex based linux locking is of the same
> speed as windows critical section and it is about 30 cycles on my
> P4 1.8GHz in uncontented case.
> As explained in already mentioned http://www.linuxjournal.com/article/6993
> it seems due to pipeline flush during cmpxchg insn.
> And there will be cacheline transfer penalty which is much larger. So
> that mutex locking will take time comparable with protected code itself
> (assuming fast code like dict/list read).
> Single compare will take ten times less.
> Am I missing something ?

I'll have to wait for your revised algorithm, but likely, you will need
some kind of memory barrier also, or else it can't work in the
multi-processor case.

In any case, to judge whether 30 cycles is a lot or a little,
measurements of the alternative approach are necessary.

Regards,
Martin

From michael.walter at gmail.com  Tue Sep 19 21:34:56 2006
From: michael.walter at gmail.com (Michael Walter)
Date: Tue, 19 Sep 2006 21:34:56 +0200
Subject: [Python-Dev] Download URL typo
Message-ID: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>

Hiho,

in case no one has noticed yet: the "Windows MSI Installer" link at
http://www.python.org/download/releases/2.5/ points to Python 2.4!

Regards,
Michael

From martin at v.loewis.de  Tue Sep 19 23:09:38 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 19 Sep 2006 23:09:38 +0200
Subject: [Python-Dev] Download URL typo
In-Reply-To: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>
References: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>
Message-ID: <45105C92.9030002@v.loewis.de>

Michael Walter schrieb:
> in case noone didn't notice yet: the "Windows MSI Installer" link at
> http://www.python.org/download/releases/2.5/ points to Python 2.4!

Why is this a problem? The link is actually correct: The MSI
documentation is the same.

Regards,
Martin

From martin at v.loewis.de  Tue Sep 19 23:45:27 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 19 Sep 2006 23:45:27 +0200
Subject: [Python-Dev] Download URL typo
In-Reply-To: <45105C92.9030002@v.loewis.de>
References: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>
	<45105C92.9030002@v.loewis.de>
Message-ID: <451064F7.2000200@v.loewis.de>

Martin v. Löwis schrieb:
> Michael Walter schrieb:
>> in case noone didn't notice yet: the "Windows MSI Installer" link at
>> http://www.python.org/download/releases/2.5/ points to Python 2.4!
> 
> Why is this a problem? The link is actually correct: The MSI
> documentation is the same.

I reconsidered. Even though the documentation was nearly correct
(except that one limitation went away long ago), it's probably better
to have the documentation state "2.5" throughout. So I copied it,
changed the version numbers, and changed the links to refer to the
copy.

Regards,
Martin


From greg.ewing at canterbury.ac.nz  Wed Sep 20 01:54:08 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 20 Sep 2006 11:54:08 +1200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <450F9EB0.3090206@cdi.cz>
References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>
	<450ED280.8010409@cdi.cz> <450F7901.9030106@canterbury.ac.nz>
	<450F9EB0.3090206@cdi.cz>
Message-ID: <45108320.6090109@canterbury.ac.nz>

Martin Devera wrote:

> Greg, what change do you have in mind regarding that "3 instruction
> addition" to refcounting ?

I don't have any change in mind. If even an atomic inc
is too expensive, it seems there's no hope for us.

--
Greg

From greg.ewing at canterbury.ac.nz  Wed Sep 20 02:06:41 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 20 Sep 2006 12:06:41 +1200
Subject: [Python-Dev] New relative import issue
In-Reply-To: <eeooet$v1m$1@sea.gmane.org>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <eeooet$v1m$1@sea.gmane.org>
Message-ID: <45108611.7090009@canterbury.ac.nz>

Steve Holden wrote:

> This does, of course, assume that you're importing modules from the 
> filestore, which assumption is no longer valid in the presence of PEP 
> 302 importers.

Well, you need to allow for a sufficiently abstract
notion of "filesystem".

I haven't really thought it through in detail. It
just seems as though it would be a lot less confusing
if you could figure out from static information which
module will get imported by a given import statement,
instead of having it depend on the history of run-time
modifications to sys.path. One such kind of static
information is the layout of the filesystem.

--
Greg

From steve at holdenweb.com  Wed Sep 20 03:04:26 2006
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 19 Sep 2006 21:04:26 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <45108611.7090009@canterbury.ac.nz>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>	<20060918091314.GA26814@code0.codespeak.net>	<450F6833.60603@canterbury.ac.nz>
	<eeooet$v1m$1@sea.gmane.org> <45108611.7090009@canterbury.ac.nz>
Message-ID: <eeq41b$pp0$1@sea.gmane.org>

Greg Ewing wrote:
> Steve Holden wrote:
> 
> 
>>This does, of course, assume that you're importing modules from the 
>>filestore, which assumption is no longer valid in the presence of PEP 
>>302 importers.
> 
> 
> Well, you need to allow for a sufficiently abstract
> notion of "filesystem".
> 
For some value of "sufficiently" ...

> I haven't really thought it through in detail. It
> just seems as though it would be a lot less confusing
> if you could figure out from static information which
> module will get imported by a given import statement,
> instead of having it depend on the history of run-time
> modifications to sys.path. One such kind of static
> information is the layout of the filesystem.
> 
Less confusing, but sadly also less realistic.

I suspect what's really needed is *more* importer behavior rather than 
less but, like you, I haven't yet thought it through in detail.

All I *can* tell you is that once you start importing modules from a database,
the whole import mechanism starts to look a bit under-specified and
over-complicated.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From steve at holdenweb.com  Wed Sep 20 03:14:12 2006
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 19 Sep 2006 21:14:12 -0400
Subject: [Python-Dev] Download URL typo
In-Reply-To: <451064F7.2000200@v.loewis.de>
References: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>	<45105C92.9030002@v.loewis.de>
	<451064F7.2000200@v.loewis.de>
Message-ID: <eeq4jl$r4o$1@sea.gmane.org>

Martin v. Löwis wrote:
> Martin v. Löwis schrieb:
> 
>>Michael Walter schrieb:
>>
>>>in case noone didn't notice yet: the "Windows MSI Installer" link at
>>>http://www.python.org/download/releases/2.5/ points to Python 2.4!
>>
>>Why is this a problem? The link is actually correct: The MSI
>>documentation is the same.
> 
> 
> I reconsidered. Even though the documentation was nearly correct
> (except that one limitation went away long ago), it's probably better
> to have the documentation state "2.5" throughout. So I copied it,
> changed the version numbers, and changed the links to refer to the
> copy.
> 
As I write this, the situation is an ugly mess, since the most visible link is 
just plain wrong. The page

   http://www.python.org/download/releases/2.5/

has a block at the top right whose last link is "Windows MSI installer". 
That links to

   http://www.python.org/download/releases/2.5/msi/

which *also* has a block at the top right whose last link is "Windows 
MSI installer". Unfortunately that takes you to

   http://www.python.org/download/releases/2.5/msi/msi

by which time you have completely lost contact with any style sheet, and 
  despite the potential infinite regress have still not located the 
actual installer. The correct link is in-line:

   http://www.python.org/download/releases/2.5/python-2.5.msi

I think the next time we redesign the web production system we should 
take the release managers' needs into consideration. They should have a 
simple form to fill in, with defaults already provided. As indeed should 
many other people ...

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From devik at cdi.cz  Wed Sep 20 08:00:35 2006
From: devik at cdi.cz (Martin Devera)
Date: Wed, 20 Sep 2006 08:00:35 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <45108320.6090109@canterbury.ac.nz>
References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>
	<450ED280.8010409@cdi.cz> <450F7901.9030106@canterbury.ac.nz>
	<450F9EB0.3090206@cdi.cz> <45108320.6090109@canterbury.ac.nz>
Message-ID: <4510D903.3090002@cdi.cz>

Greg Ewing wrote:
> Martin Devera wrote:
> 
>> Greg, what change do you have in mind regarding that "3 instruction
>> addition" to refcounting ?
> 
> I don't have any change in mind. If even an atomic inc
> is too expensive, it seems there's no hope for us.

Just out of curiosity, would it be a big problem to remove refcounting and live
with garbage collection only? I'm not sure if some parts of Python code
depend on exact refcount behaviour (I guess they should not).
Probably not for the mainstream, but maybe as a compile-time option, as part
of a free-threading solution only for those who need it.
Even if you can do a fast atomic inc/dec, it forces the cacheline holding the
refcounter to ping-pong between the caches of the referencing CPUs (for read-only
class dicts, for example), so you can probably never get good SMP
scalability.
Consider a main memory latency of 100ns on an 8-way 2GHz SMP system where
parallel computation within the same Python class is going on on all CPUs.
If you manage to do a lot of class references in a loop, say 6400
instructions apart (quite realistic), then at any moment at least one CPU
will be blocked on that inc/dec, so you lose one CPU to overhead...

From martin at v.loewis.de  Wed Sep 20 08:33:38 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 20 Sep 2006 08:33:38 +0200
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <4510D903.3090002@cdi.cz>
References: <20060918152207.1717.107889227.divmod.quotient.52985@ohm>	<450ED280.8010409@cdi.cz>
	<450F7901.9030106@canterbury.ac.nz>	<450F9EB0.3090206@cdi.cz>
	<45108320.6090109@canterbury.ac.nz> <4510D903.3090002@cdi.cz>
Message-ID: <4510E0C2.7060506@v.loewis.de>

Martin Devera schrieb:
> Just from curiosity, would be a big problem removing refcounting and live
> with garbage collection only ? I'm not sure if some parts of py code
> depends on exact refcnt behaviour (I guess it should not).

Now, this gives a true sense of deja-vu. Python applications often rely on
reference counting (in particular, on the fact that releasing a file object
will immediately close the file), despite the language reference
saying that this is not a Python feature, just a feature of the
implementation. In addition, implementing a tracing garbage
collector would either be tedious or have additional consequences
for the semantics: with a conservative GC, some objects may never
get collected; with a precise GC, you have to declare GC roots
at the C level. Things get more complicated if the GC is also
compacting. See the current thread on the py3k list.

Regards,
Martin

From martin at v.loewis.de  Wed Sep 20 08:48:18 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 20 Sep 2006 08:48:18 +0200
Subject: [Python-Dev] Download URL typo
In-Reply-To: <eeq4jl$r4o$1@sea.gmane.org>
References: <877e9a170609191234y6a5f2fa8g8f9e9aecf6bcdab3@mail.gmail.com>	<45105C92.9030002@v.loewis.de>	<451064F7.2000200@v.loewis.de>
	<eeq4jl$r4o$1@sea.gmane.org>
Message-ID: <4510E432.4060402@v.loewis.de>

Steve Holden schrieb:
> That links to
> 
>    http://www.python.org/download/releases/2.5/msi/
> 
> which *also* has a block at the top right whose last link is "Windows 
> MSI installer". Unfortunately that takes you to
> 
>    http://www.python.org/download/releases/2.5/msi/msi

I noticed, but my pyramid fu is not good enough to fix it.
Should I submit a pyramid/web site bug report? Or can you fix it?

Notice that the Highlights page behaves the same way, whereas
the License and Bugs pages work correctly. I can't really spot
a difference in the sources: the subnav.yml files are identical
in all these.

Actually, looking more closely, it appears that the "working"
pages have a line

    subnav: !fragment subnav.yml

in content.yml; this seems to make a difference. What does that
line mean?

Regards,
Martin

From jcarlson at uci.edu  Wed Sep 20 09:49:59 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 20 Sep 2006 00:49:59 -0700
Subject: [Python-Dev] deja-vu .. python locking
In-Reply-To: <4510D903.3090002@cdi.cz>
References: <45108320.6090109@canterbury.ac.nz> <4510D903.3090002@cdi.cz>
Message-ID: <20060920003827.0814.JCARLSON@uci.edu>


Martin Devera <devik at cdi.cz> wrote:
[snip]
> Even if you can do fast atomic inc/dec, it forces cacheline with
> refcounter to ping-pong between caches of referencing cpus (for read only
> class dicts for example) so that you can probably never get good SMP
> scalability.

That's ok.  Why?  Because according to Guido, the GIL isn't going away:
http://mail.python.org/pipermail/python-3000/2006-April/001072.html
... so ruminations about refcounting, GC, etc., at least with regards to
removing the GIL towards some sort of "free threading" Python, are
likely to go nowhere.  Unless someone is able to translate the codebase
into using such methods, show how it is not (significantly) more
difficult to program extensions for, show a mild to moderate slowdown on
single processors, and prove actual speedup on multiple processors.  But
even then it will be a difficult sell, as it would require possibly
radical rewrites for all of the hundreds or thousands of CPython
extensions currently being developed and maintained.


 - Josiah


From theller at python.net  Wed Sep 20 12:02:21 2006
From: theller at python.net (Thomas Heller)
Date: Wed, 20 Sep 2006 12:02:21 +0200
Subject: [Python-Dev] Exceptions and slicing
Message-ID: <eer3je$eqe$1@sea.gmane.org>

Is it an oversight that exception instances no longer support
slicing in Python 2.5?

This code works in 2.4, but no longer in 2.5:

try:
    open("", "r")
except IOError, details:
    print details[:]

Thomas


From brett at python.org  Wed Sep 20 20:07:51 2006
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Sep 2006 11:07:51 -0700
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To: <eer3je$eqe$1@sea.gmane.org>
References: <eer3je$eqe$1@sea.gmane.org>
Message-ID: <bbaeab100609201107r6cab76eapc5624b0eb6cbe8d8@mail.gmail.com>

On 9/20/06, Thomas Heller <theller at python.net> wrote:
>
> Is it an oversight that exception instances do no longer support
> slicing in Python 2.5?
>
> This code works in 2.4, but no longer in 2.5:
>
> try:
>     open("", "r")
> except IOError, details:
>     print details[:]


Technically, yes.  There is no entry in the sq_slice field for the
PySequenceMethods struct.  Although you can get to the list of arguments by
going through the 'args' attribute if you need a quick fix.
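
For example, something along these lines should keep working as a stopgap (an
untested sketch in 2.x syntax, not the final fix):

try:
    open("", "r")
except IOError, details:
    print details.args[:]   # same data the old slicing exposed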

I have a fix in my checkout that I will check into the trunk shortly and
into 25-maint as soon as Anthony unfreezes it.

-Brett

From theller at python.net  Wed Sep 20 21:38:39 2006
From: theller at python.net (Thomas Heller)
Date: Wed, 20 Sep 2006 21:38:39 +0200
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To: <bbaeab100609201107r6cab76eapc5624b0eb6cbe8d8@mail.gmail.com>
References: <eer3je$eqe$1@sea.gmane.org>
	<bbaeab100609201107r6cab76eapc5624b0eb6cbe8d8@mail.gmail.com>
Message-ID: <ees5c5$evq$1@sea.gmane.org>

Brett Cannon schrieb:
> On 9/20/06, Thomas Heller <theller at python.net> wrote:
>>
>> Is it an oversight that exception instances do no longer support
>> slicing in Python 2.5?
>>
>> This code works in 2.4, but no longer in 2.5:
>>
>> try:
>>     open("", "r")
>> except IOError, details:
>>     print details[:]
> 
> 
> Technically, yes.  There is no entry in the sq_slice field for the
> PySequenceMethods struct.  Although you can get to the list of arguments by
> going through the 'args' attribute if you need a quick fix.

Well, Nick Coghlan pointed out in private email:

>> According to PEP 352 it should have at most been deprecated along with the 
>> rest of Exception.__getitem__:
>>
>> "This also means providing a __getitem__ method is unneeded for exceptions and 
>> thus will be deprecated as well."

> I have a fix in my checkout that I will check into the trunk shortly and
> into 25-maint as soon as Anthony unfreezes it.

I was not aware before I posted that tuple-unpacking of exceptions still works,
so this is another possibility:
    except WindowsError, (errno, message):


What I find worse about WindowsError in particular is two things:

1. The __str__ of a WindowsError instance hides the 'real' windows
error number.  So, in 2.4 "print error_instance" would print
for example:

  [Errno 1002] Das Fenster kann die gesendete Nachricht nicht verarbeiten.
    
while in 2.5:

  [Error 22] Das Fenster kann die gesendete Nachricht nicht verarbeiten.

because the new mapping of windows error codes to posix error codes creates
EINVAL (22) when no corresponding posix error code exists.

2. How would one write portable exception handling for Python 2.4 and 2.5?

I have code like this:

try:
    do something
except WindowsError, details:
    if not details.errno in (TYPE_E_REGISTRYACCESS, TYPE_E_CANTLOADLIBRARY):
        raise

This doesn't work in 2.5 any longer, because I would have to use details.winerror
instead of details.errno.

The two portable possibilities I found are these, but neither is elegant imo:

  except WindowsError, (winerrno, message):
or
  except WindowsError, details:
      winerrno = details[0]

And the latter still uses __getitem__ which may go away according to PEP 352.

Thomas


From martin at v.loewis.de  Wed Sep 20 21:58:36 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 20 Sep 2006 21:58:36 +0200
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To: <ees5c5$evq$1@sea.gmane.org>
References: <eer3je$eqe$1@sea.gmane.org>	<bbaeab100609201107r6cab76eapc5624b0eb6cbe8d8@mail.gmail.com>
	<ees5c5$evq$1@sea.gmane.org>
Message-ID: <45119D6C.2050005@v.loewis.de>

Thomas Heller schrieb:
> 1. The __str__ of a WindowsError instance hides the 'real' windows
> error number.  So, in 2.4 "print error_instance" would print
> for example:
> 
>   [Errno 1002] Das Fenster kann die gesendete Nachricht nicht verarbeiten.
>     
> while in 2.5:
> 
>   [Error 22] Das Fenster kann die gesendete Nachricht nicht verarbeiten.

That's a bug. I changed the string deliberately from Errno to error to
indicate that it is not an errno, but a GetLastError. Can you come up
with a patch?

> 2. How would one write portable exception handling for Python 2.4 and 2.5?
> 
> I have code like this:
> 
> try:
>     do something
> except WindowsError, details:
>     if not details.errno in (TYPE_E_REGISTRYACCESS, TYPE_E_CANTLOADLIBRARY):
>         raise
> 
> Doesn't work in 2.5 any longer, because I would have to use details.winerror
> instead of e.errno.

Portable code should do

def winerror(exc):
  try:
     return exc.winerror
  except AttributeError: #2.4 and earlier
     return exc.errno

and then

try:
    do something
except WindowsError, details:
    if winerror(details) not in (TYPE_E_REGISTRYACCESS,
                                 TYPE_E_CANTLOADLIBRARY):
        raise

Regards,
Martin


From brett at python.org  Wed Sep 20 22:04:49 2006
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Sep 2006 13:04:49 -0700
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To: <ees5c5$evq$1@sea.gmane.org>
References: <eer3je$eqe$1@sea.gmane.org>
	<bbaeab100609201107r6cab76eapc5624b0eb6cbe8d8@mail.gmail.com>
	<ees5c5$evq$1@sea.gmane.org>
Message-ID: <bbaeab100609201304m4da01f19w1f98d69f5ba49431@mail.gmail.com>

On 9/20/06, Thomas Heller <theller at python.net> wrote:
>
> Brett Cannon schrieb:
> > On 9/20/06, Thomas Heller <theller at python.net> wrote:
> >>
> >> Is it an oversight that exception instances do no longer support
> >> slicing in Python 2.5?
> >>
> >> This code works in 2.4, but no longer in 2.5:
> >>
> >> try:
> >>     open("", "r")
> >> except IOError, details:
> >>     print details[:]
> >
> >
> > Technically, yes.  There is no entry in the sq_slice field for the
> > PySequenceMethods struct.  Although you can get to the list of arguments
> by
> > going through the 'args' attribute if you need a quick fix.
>
> Well, Nick Coghlan pointed out in private email:
>
> >> According to PEP 352 it should have at most been deprecated along with
> the
> >> rest of Exception.__getitem__:
> >>
> >> "This also means providing a __getitem__ method is unneeded for
> exceptions and
> >> thus will be deprecated as well."


Right, the deprecation of __getitem__ is not scheduled until Python 2.9, so this
was a regression (there was never a test for it before PEP 352 was written).  The
fix is now in, so your code should work again from a trunk checkout.  I will
backport it when the freeze is lifted.

-Brett

From theller at python.net  Wed Sep 20 22:11:58 2006
From: theller at python.net (Thomas Heller)
Date: Wed, 20 Sep 2006 22:11:58 +0200
Subject: [Python-Dev] Exceptions and slicing
In-Reply-To: <45119D6C.2050005@v.loewis.de>
References: <eer3je$eqe$1@sea.gmane.org>	<bbaeab100609201107r6cab76eapc5624b0eb6cbe8d8@mail.gmail.com>	<ees5c5$evq$1@sea.gmane.org>
	<45119D6C.2050005@v.loewis.de>
Message-ID: <ees7ad$m33$1@sea.gmane.org>

Martin v. Löwis schrieb:
> Thomas Heller schrieb:
>> 1. The __str__ of a WindowsError instance hides the 'real' windows
>> error number.  So, in 2.4 "print error_instance" would print
>> for example:
>> 
>>   [Errno 1002] Das Fenster kann die gesendete Nachricht nicht verarbeiten.
>>     
>> while in 2.5:
>> 
>>   [Error 22] Das Fenster kann die gesendete Nachricht nicht verarbeiten.
> 
> That's a bug. I changed the string deliberately from Errno to error to
> indicate that it is not an errno, but a GetLastError. Can you come up
> with a patch?

Yes, but not today.

>> 2. How would one write portable exception handling for Python 2.4 and 2.5?
>> 
> Portable code should do
> 
> def winerror(exc):
>   try:
>      return exc.winerror
>   except AttributeError: #2.4 and earlier
>      return exc.errno
> 
> and then
> 
>  try:
>      do something
>  except WindowsError, details:
>      if not winerror(details) in (TYPE_E_REGISTRYACCESS,
> YPE_E_CANTLOADLIBRARY):
>          raise

Ok (sigh ;-).


Thanks,
Thomas


From kbk at shore.net  Thu Sep 21 02:02:59 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed, 20 Sep 2006 20:02:59 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200609210002.k8L02xNN002362@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  419 open ( +3) /  3410 closed ( +2) /  3829 total ( +5)
Bugs    :  910 open (+12) /  6185 closed ( +5) /  7095 total (+17)
RFE     :  235 open ( +1) /   238 closed ( +0) /   473 total ( +1)

New / Reopened Patches
______________________

Practical ctypes example  (2006-09-15)
       http://python.org/sf/1559219  opened by  leppton

pyclbr reports different module for Class and Function  (2006-09-18)
       http://python.org/sf/1560617  opened by  Peter Otten

Exec stacks in python 2.5  (2006-09-18)
       http://python.org/sf/1560695  opened by  Chaza

Patches Closed
______________

test_grp.py doesn't skip special NIS entry, fails  (2006-06-22)
       http://python.org/sf/1510987  closed by  martineau

New / Reopened Bugs
___________________

some section links (previous, up, next) missing last word  (2006-09-15)
       http://python.org/sf/1559142  opened by  Tim Smith

time.strptime() access non exitant attribute in calendar.py  (2006-09-15)
CLOSED http://python.org/sf/1559515  opened by  betatim

shutil.copyfile incomplete on NTFS  (2006-09-16)
       http://python.org/sf/1559684  opened by  Roger Upole

gcc trunk (4.2) exposes a signed integer overflows  (2006-08-24)
       http://python.org/sf/1545668  reopened by  arigo

2.5c2 pythonw does not execute  (2006-09-16)
       http://python.org/sf/1559747  opened by  Ron Platten

list.sort does nothing when both cmp and key are given  (2006-09-16)
CLOSED http://python.org/sf/1559818  opened by  Marcin 'Qrczak' Kowalczyk

confusing error msg from random.randint  (2006-09-17)
       http://python.org/sf/1560032  opened by  paul rubin

Tutorial: incorrect info about package importing and mac  (2006-09-17)
       http://python.org/sf/1560114  opened by  C L

Better/faster implementation of os.path.split  (2006-09-17)
CLOSED http://python.org/sf/1560161  opened by  Michael Gebetsroither

Better/faster implementation of os.path.basename/dirname  (2006-09-17)
       http://python.org/sf/1560179  reopened by  gbrandl

Better/faster implementation of os.path.basename/dirname  (2006-09-17)
       http://python.org/sf/1560179  opened by  Michael Gebetsroither

copy() method of dictionaries is not "deep"  (2006-09-17)
       http://python.org/sf/1560327  reopened by  gbrandl

copy() method of dictionaries is not "deep"  (2006-09-17)
       http://python.org/sf/1560327  opened by  daniel hahler

python 2.5 fails to build with --as-needed  (2006-09-18)
       http://python.org/sf/1560984  opened by  Chaza

mac installer profile patch vs. .bash_login  (2006-09-19)
       http://python.org/sf/1561243  opened by  Ronald Oussoren

-xcode=pic32 option is not supported on Solaris x86 Sun C  (2006-09-19)
       http://python.org/sf/1561333  opened by  James Lick

Dedent with Italian keyboard  (2006-09-20)
       http://python.org/sf/1562092  opened by  neclepsio

Fails to install on Fedora Core 5  (2006-09-20)
       http://python.org/sf/1562171  opened by  Mark Summerfield

IDLE Hung up after open script by command line...  (2006-09-20)
       http://python.org/sf/1562193  opened by  Faramir^

uninitialized memory read in parsetok()  (2006-09-20)
       http://python.org/sf/1562308  opened by  Luke Moore

Bugs Closed
___________

2.5c2 macosx installer aborts during "GUI Applications"  (2006-09-15)
       http://python.org/sf/1558983  closed by  ronaldoussoren

time.strptime() access non existant attribute in calendar.py  (2006-09-15)
       http://python.org/sf/1559515  closed by  bcannon

list.sort does nothing when both cmp and key are given  (2006-09-16)
       http://python.org/sf/1559818  closed by  qrczak

Better/faster implementation of os.path.split  (2006-09-17)
       http://python.org/sf/1560161  deleted by  einsteinmg

Better/faster implementation of os.path.basename/dirname  (2006-09-17)
       http://python.org/sf/1560179  deleted by  einsteinmg

copy() method of dictionaries is not "deep"  (2006-09-17)
       http://python.org/sf/1560327  closed by  gbrandl

New / Reopened RFE
__________________

Exception need structured information associated with them  (2006-09-15)
       http://python.org/sf/1559549  opened by  Ned Batchelder

String searching performance improvement  (2006-09-19)
CLOSED http://python.org/sf/1561634  opened by  Nick Welch

RFE Closed
__________

String searching performance improvement  (2006-09-19)
       http://python.org/sf/1561634  deleted by  mackstann


From guido at python.org  Thu Sep 21 04:17:43 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Sep 2006 19:17:43 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <45108611.7090009@canterbury.ac.nz>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <eeooet$v1m$1@sea.gmane.org>
	<45108611.7090009@canterbury.ac.nz>
Message-ID: <ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>

On 9/19/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> I haven't really thought it through in detail. It
> just seems as though it would be a lot less confusing
> if you could figure out from static information which
> module will get imported by a given import statement,
> instead of having it depend on the history of run-time
> modifications to sys.path. One such kind of static
> information is the layout of the filesystem.

Eek? If there are two third-party top-level packages A and B, by
different third parties, and A depends on B, how should A find B if
not via sys.path or something that is sufficiently equivalent as to
have the same problems? Surely every site shouldn't be required to
install A and B in the same location (or in the same location relative
to each other).

I sympathize with the problems that exist with the current import
mechanism, really, I do. Google feels the pain every day (alas,
Google's requirements are a bit unusual, so they alone can't provide
much guidance for a solution). But if you combine the various
requirements: zip imports, import hooks of various sorts, different
permissions for the owners of different packages that must cooperate,
versioning issues (Python versions as well as package versions),
forwards compatibility, backwards compatibility, ease of development,
ease of packaging, ease of installation, supporting the conventions of
vastly different platforms, data files mixed in with the source code
(sometimes with their own search path), and probably several other
requirements that I'm forgetting right now, it's just not an easy
problem.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu Sep 21 04:20:30 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Sep 2006 19:20:30 -0700
Subject: [Python-Dev] IronPython and AST branch
In-Reply-To: <450D1819.2080803@gmail.com>
References: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com>
	<bbaeab100609161133y224ae51as384989e8fe4942e2@mail.gmail.com>
	<450D1819.2080803@gmail.com>
Message-ID: <ca471dc20609201920w4d0e287ev90b87361b05c1c8e@mail.gmail.com>

On 9/17/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> One of the biggest issues I have with the current AST is that I don't believe
> it really gets the "slice" and "extended slice" terminology correct (it uses
> 'extended slice' to refer to multi-dimensional indexing, but the normal
> meaning of that phrase is to refer to the use of a step argument for a slice [1])

The two were introduced together and were referred to together as
"extended slicing" at the time, so I'm not sure who is confused.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Thu Sep 21 12:22:27 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 21 Sep 2006 20:22:27 +1000
Subject: [Python-Dev] Removing __del__
In-Reply-To: <fb6fbf560609200543l70f16862p750da26b18eb66da@mail.gmail.com>
References: <20060919053609.vp8duwukq7sw4w48@login.werra.lunarpages.com>	
	<451123A2.7040701@gmail.com>
	<fb6fbf560609200543l70f16862p750da26b18eb66da@mail.gmail.com>
Message-ID: <451267E3.60405@gmail.com>

Adding pydev back in, since these seem like reasonable questions to me :)

Jim Jewett wrote:
> On 9/20/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>              # Create a class with the same instance attributes
>>              # as the original
>>              class attr_holder(object):
>>                  pass
>>              finalizer_arg = attr_holder()
>>              finalizer_arg.__dict__ = self.__dict__
> 
> Does this really work?

It works for normal user-defined classes at least:

 >>> class C1(object):
...     pass
...
 >>> class C2(object):
...     pass
...
 >>> a = C1()
 >>> b = C2()
 >>> b.__dict__ = a.__dict__
 >>> a.x = 1
 >>> b.x
1

> (1)  for classes with a dictproxy of some sort, you might get either a
> copy (which isn't updated)

Classes that change the way __dict__ is handled would probably need to define 
their own __del_arg__.

> (2)  for other classes, self might be added to the dict later

Yeah, that's the strongest argument I know of against having that default 
fallback - it can easily lead to a strong reference from sys.finalizers into 
an otherwise unreachable cycle. I believe it currently takes two __del__ 
methods to prevent a cycle from being collected, whereas in this setup it 
would only take one.

OTOH, fixing it would be much easier than it is now (by setting __del_args__ 
to something that holds only the subset of attributes that require finalization).

> and of course, if it isn't added later, then it doesn't have the full
> power of current finalizers -- just the __close__ subset.

True, but most finalizers I've seen don't really *need* the full power of the 
current __del__. They only need to get at a couple of their internal members 
in order to explicitly release external resources.

And more sophisticated usage is still possible by assigning an appropriate 
value to __del_arg__.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at iinet.net.au  Thu Sep 21 12:24:28 2006
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Thu, 21 Sep 2006 20:24:28 +1000
Subject: [Python-Dev] Removing __del__
In-Reply-To: <451267E3.60405@gmail.com>
References: <20060919053609.vp8duwukq7sw4w48@login.werra.lunarpages.com>	
	<451123A2.7040701@gmail.com>
	<fb6fbf560609200543l70f16862p750da26b18eb66da@mail.gmail.com>
	<451267E3.60405@gmail.com>
Message-ID: <4512685C.3070603@iinet.net.au>

Nick Coghlan wrote:
> Adding pydev back in, since these seem like reasonable questions to me :)

D'oh, that should have been python-3000 not python-dev :(

Sorry for the noise, folks.

Cheers,
Nick.


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Thu Sep 21 13:10:24 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 21 Sep 2006 21:10:24 +1000
Subject: [Python-Dev] IronPython and AST branch
In-Reply-To: <ca471dc20609201920w4d0e287ev90b87361b05c1c8e@mail.gmail.com>
References: <5b0248170609130046w4e5bd012s63ecf46cbcfb8d2b@mail.gmail.com>	
	<bbaeab100609161133y224ae51as384989e8fe4942e2@mail.gmail.com>	
	<450D1819.2080803@gmail.com>
	<ca471dc20609201920w4d0e287ev90b87361b05c1c8e@mail.gmail.com>
Message-ID: <45127320.6060308@gmail.com>

Guido van Rossum wrote:
> On 9/17/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> One of the biggest issues I have with the current AST is that I don't 
>> believe
>> it really gets the "slice" and "extended slice" terminology correct 
>> (it uses
>> 'extended slice' to refer to multi-dimensional indexing, but the normal
>> meaning of that phrase is to refer to the use of a step argument for a 
>> slice [1])
> 
> The two were introduced together and were referred to together as
> "extended slicing" at the time, so I'm not sure who is confused.

Ah, that would explain it then - I first encountered the phrase 'extended 
slicing' in the context of the Python 2.3 additions to the builtin types, so I 
didn't realise it referred to all __getitem__ based non-mapping lookups, 
rather than just the start:stop:step form of slicing.
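
To make the distinction concrete (a purely illustrative class whose
__getitem__ just returns whatever subscript it is given):

 >>> class Show(object):
 ...     def __getitem__(self, item):
 ...         return item
 ...
 >>> Show()[1:10:2]        # slice with a step
 slice(1, 10, 2)
 >>> Show()[1:2, 3:4]      # multi-dimensional subscript
 (slice(1, 2, None), slice(3, 4, None))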

Given that additional bit of history, I don't think changing the name of the 
AST node is worth the hassle - I'll just have to recalibrate my brain :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From anthony at interlink.com.au  Thu Sep 21 13:12:03 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu, 21 Sep 2006 21:12:03 +1000
Subject: [Python-Dev] release25-maint is UNFROZEN
Message-ID: <200609212112.04923.anthony@interlink.com.au>

Ok - it's been 48 hours, and I've not seen any brown-paper-bag bugs, so I'm 
declaring the 2.5 maintenance branch open for business. As specified in 
PEP-006, this is a maintenance branch only suitable for bug fixes. No 
functionality changes should be checked in without discussion and agreement 
on python-dev first.

Thanks to everyone for helping make 2.5 happen. It's been a long slog there, 
but I think we can all be proud of the result.

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From g.brandl at gmx.net  Thu Sep 21 14:31:24 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 21 Sep 2006 14:31:24 +0200
Subject: [Python-Dev] GCC 4.x incompatibility
Message-ID: <eeu0mt$fc8$1@sea.gmane.org>

Is it noted somewhere that building Python with GCC 4.x results in
problems such as abs(-sys.maxint-1) being negative?

I think this is something users may want to know.
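
A quick interactive check for whether a particular build is affected (output
shown for a 32-bit machine; affected builds reportedly give a negative
result instead):

  >>> import sys
  >>> abs(-sys.maxint-1)
  2147483648L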


Perhaps the "Known Bugs" page at 
http://www.python.org/download/releases/2.5/bugs/ is the right place to put
this info.

Georg


From arigo at tunes.org  Thu Sep 21 14:35:11 2006
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 21 Sep 2006 14:35:11 +0200
Subject: [Python-Dev] release25-maint is UNFROZEN
In-Reply-To: <200609212112.04923.anthony@interlink.com.au>
References: <200609212112.04923.anthony@interlink.com.au>
Message-ID: <20060921123510.GA22457@code0.codespeak.net>

Hi Anthony,

On Thu, Sep 21, 2006 at 09:12:03PM +1000, Anthony Baxter wrote:
> Thanks to everyone for helping make 2.5 happen. It's been a long slog there, 
> but I think we can all be proud of the result.

Thanks for the hassle!  I've got another bit of it for you, though.  The
frozen 2.5 documentation doesn't seem to be available on-line.  At
least, the doc links from the release page point to the 'dev' 2.6a0
version, and the URL following the common scheme -
http://www.python.org/doc/2.5/ - doesn't work.


A bientot,

Armin

From steve at holdenweb.com  Thu Sep 21 14:47:57 2006
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 21 Sep 2006 08:47:57 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>	<20060918091314.GA26814@code0.codespeak.net>	<450F6833.60603@canterbury.ac.nz>
	<eeooet$v1m$1@sea.gmane.org>	<45108611.7090009@canterbury.ac.nz>
	<ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
Message-ID: <eeu1kc$inm$1@sea.gmane.org>

Guido van Rossum wrote:
> On 9/19/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
>>I haven't really thought it through in detail. It
>>just seems as though it would be a lot less confusing
>>if you could figure out from static information which
>>module will get imported by a given import statement,
>>instead of having it depend on the history of run-time
>>modifications to sys.path. One such kind of static
>>information is the layout of the filesystem.
> 
> 
> Eek? If there are two third-party top-level packages A and B, by
> different third parties, and A depends on B, how should A find B if
> not via sys.path or something that is sufficiently equivalent as to
> have the same problems? Surely every site shouldn't be required to
> install A and B in the same location (or in the same location relative
> to each other).
> 
> I sympathize with the problems that exist with the current import
> mechanism, really, I do. Google feels the pain every day (alas,
> Google's requirements are a bit unusual, so they alone can't provide
> much guidance for a solution). But if you combine the various
> requirements: zip imports, import hooks of various sorts, different
> permissions for the owners of different packages that must cooperate,
> versioning issues (Python versions as well as package versions),
> forwards compatibility, backwards compatibility, ease of development,
> ease of packaging, ease of installation, supporting the conventions of
> vastly different platforms, data files mixed in with the source code
> (sometimes with their own search path), and probably several other
> requirements that I'm forgetting right now, it's just not an easy
> problem.
> 
But you're the BDFL! You mean to tell me there are some problems you 
can't solve?!?!?!?!?

shocked-and-amazed-ly y'rs  - steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From rasky at develer.com  Thu Sep 21 15:05:38 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 21 Sep 2006 15:05:38 +0200
Subject: [Python-Dev] New relative import issue
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com><20060918091314.GA26814@code0.codespeak.net><450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>
Message-ID: <05af01c6dd7e$a2209560$e303030a@trilan>

Oleg Broytmann wrote:

>> There really shouldn't be
>> any such thing as sys.path -- the view that any
>> given module has of the package namespace should
>> depend only on where it is
>
>    I do not understand this. Can you show an example? Imagine I have
> two servers, Linux and FreeBSD, and on Linux python is in /usr/bin,
> home is /home/phd, on BSD these are /usr/local/bin and /usr/home/phd.
> I have some modules in site-packages and some modules in
> $HOME/lib/python. How can I move programs from one server to the
> other without rewriting them (how can I avoid putting full paths to
> modules in them)? I use PYTHONPATH manipulation - it's enough to write a shell
> script that starts daemons once and use it for many years. How can I
> do this without sys.path?!

My idea (and interpretation of Greg's statement) is that a module/package
should be able to live with either relative imports within itself, or fully
absolute imports. No sys.path *hackery* should ever be necessary to access
modules in sibling namespaces. Either it's an absolute import, or a relative
(internal) import. A sibling import is a symptom of wrong design of the
packages.

This is how I usually design my packages at least. There might be valid use
cases for doing sys.path hackery, but I have yet to find them.
-- 
Giovanni Bajo


From theller at python.net  Thu Sep 21 15:24:52 2006
From: theller at python.net (Thomas Heller)
Date: Thu, 21 Sep 2006 15:24:52 +0200
Subject: [Python-Dev] Small Py3k task: fix modulefinder.py
In-Reply-To: <ca471dc20608291442p3d92790ema7aa35f85d38156a@mail.gmail.com>
References: <ca471dc20608291442p3d92790ema7aa35f85d38156a@mail.gmail.com>
Message-ID: <eeu3r3$rd8$2@sea.gmane.org>

Guido van Rossum wrote:
> Is anyone familiar enough with modulefinder.py to fix its breakage in
> Py3k? It chokes in a nasty way (exceeding the recursion limit) on the
> relative import syntax. I suspect this is also a problem for 2.5, when
> people use that syntax; hence the cross-post. There's no unittest for
> modulefinder.py, but I believe py2exe depends on it (and of course
> freeze.py, but who uses that still?)
> 

I'm not (yet) using relative imports in 2.5 or Py3k, but have not been able
to reproduce the recursion limit problem.  Can you describe the package
that fails?

Thanks,
Thomas


From gustavo at niemeyer.net  Thu Sep 21 15:42:49 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Thu, 21 Sep 2006 10:42:49 -0300
Subject: [Python-Dev] dict.discard
Message-ID: <20060921134249.GA9238@niemeyer.net>

Hey guys,

After trying to use it a few times with no success :-), I'd like
to include a new method, dict.discard, mirroring set.discard:

  >>> print set.discard.__doc__
  Remove an element from a set if it is a member.
  
  If the element is not a member, do nothing.
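
For concreteness, the proposed behaviour expressed as a plain helper
function (a sketch only, not the actual method):

    def discard(d, key):
        """Remove d[key] if it exists; do nothing if the key is missing."""
        if key in d:
            del d[key]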

Comments?

-- 
Gustavo Niemeyer
http://niemeyer.net

From fdrake at acm.org  Thu Sep 21 15:49:25 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 21 Sep 2006 09:49:25 -0400
Subject: [Python-Dev] dict.discard
In-Reply-To: <20060921134249.GA9238@niemeyer.net>
References: <20060921134249.GA9238@niemeyer.net>
Message-ID: <200609210949.25822.fdrake@acm.org>

On Thursday 21 September 2006 09:42, Gustavo Niemeyer wrote:
 > After trying to use it a few times with no success :-), I'd like
 > to include a new method, dict.discard, mirroring set.discard:
 >
 >   >>> print set.discard.__doc__
 >   Remove an element from a set if it is a member.
 >
 >   If the element is not a member, do nothing.

Would the argument be the key, or the pair?  I'd guess the key.

If so, there's the 2-arg flavor of dict.pop():

  >>> d = {}
  >>> d.pop("key", None)

It's not terribly obvious, but does the job without enlarging the dict API.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From gustavo at niemeyer.net  Thu Sep 21 16:07:04 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Thu, 21 Sep 2006 11:07:04 -0300
Subject: [Python-Dev] dict.discard
In-Reply-To: <200609210949.25822.fdrake@acm.org>
References: <20060921134249.GA9238@niemeyer.net>
	<200609210949.25822.fdrake@acm.org>
Message-ID: <20060921140704.GA10159@niemeyer.net>

> Would the argument be the key, or the pair?  I'd guess the key.

Right, the key.

> If so, there's the 2-arg flavor of dict.pop():
> 
>   >>> d = {}
>   >>> d.pop("key", None)
> 
> It's not terribly obvious, but does the job without enlarging
> the dict API.

Yeah, this looks good.  I don't think I've ever used it like this.

-- 
Gustavo Niemeyer
http://niemeyer.net

From guido at python.org  Thu Sep 21 16:22:04 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 21 Sep 2006 07:22:04 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <05af01c6dd7e$a2209560$e303030a@trilan>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
Message-ID: <ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>

On 9/21/06, Giovanni Bajo <rasky at develer.com> wrote:
> >> Greg Ewing wrote:
> >> There really shouldn't be
> >> any such thing as sys.path -- the view that any
> >> given module has of the package namespace should
> >> depend only on where it is

> My idea (and interpretation of Greg's statement) is that a module/package
> should be able to live with either relative imports within itself, or fully
> absolute imports. No sys.path *hackery* should ever be necessary to access
> modules in sibling namespaces. Either it's an absolute import, or a relative
> (internal) import. A sibling import is a symptom of wrong design of the
> packages.
>
> This is how I usually design my packages at least. There might be valid use
> cases for doing sys.path hackery, but I have yet to find them.

While I agree with your idea(l), I don't think that's what Greg meant.
He clearly says "sys.path should not exist at all".

I do think it's fair to use sibling imports (using from ..sibling
import module) from inside subpackages of the same package; I couldn't
tell if you were against that or not.

sys.path exists to stitch together the toplevel module/package
namespace from diverse sources.

Import hooks and sys.path hackery exist so that module/package sources
don't have to be restricted to the filesystem (as well as to allow
unbridled experimentation by those so inclined :-).

I think one missing feature is a mechanism whereby you can say "THIS
package (gives top-level package name) lives HERE (gives filesystem
location of package)" without adding the parent of HERE to sys.path
for all module searches. I think Phillip Eby's egg system might
benefit from this.

Another missing feature is a mechanism whereby you can use a
particular file as the main script without adding its directory to the
front of sys.path.

Other missing features have to do with versioning constraints. But
that quickly gets extremely messy.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mak at trisoft.com.pl  Thu Sep 21 18:23:11 2006
From: mak at trisoft.com.pl (Grzegorz Makarewicz)
Date: Thu, 21 Sep 2006 18:23:11 +0200
Subject: [Python-Dev] win32 - results from Lib/test - 2.5 release-maint
Message-ID: <4512BC6F.3090907@trisoft.com.pl>

Hi,

- *.txt files for unicode tests are downloaded from the internet - I don't
like this.
- __db.004 isn't removed after tests
- init_types is declared static in python/python-ast.c and can't be
imported from PC/config.c.
- python_d -u regrtest.py -u bsddb -u curses -uall -v = dies after
testInfinitRecursion without any message, just disappears from tasks
and doesn't show anything

mak


From martin at v.loewis.de  Thu Sep 21 20:13:12 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 21 Sep 2006 20:13:12 +0200
Subject: [Python-Dev] GCC 4.x incompatibility
In-Reply-To: <eeu0mt$fc8$1@sea.gmane.org>
References: <eeu0mt$fc8$1@sea.gmane.org>
Message-ID: <4512D638.3040909@v.loewis.de>

Georg Brandl wrote:
> Is it noted somewhere that building Python with GCC 4.x results in
> problems such as abs(-sys.maxint-1) being negative?

Yes, it's in the README (although it claims problems only exist with
4.1 and 4.2; 4.0 seems to work fine for me).

> I think this is something users may want to know.

See what I wrote. Users are advised to either not use that compiler,
or add -fwrapv.

Regards,
Martin

From brett at python.org  Thu Sep 21 20:19:13 2006
From: brett at python.org (Brett Cannon)
Date: Thu, 21 Sep 2006 11:19:13 -0700
Subject: [Python-Dev] win32 - results from Lib/test - 2.5 release-maint
In-Reply-To: <4512BC6F.3090907@trisoft.com.pl>
References: <4512BC6F.3090907@trisoft.com.pl>
Message-ID: <bbaeab100609211119m7df42357hf9c64bbaaecb59cb@mail.gmail.com>

On 9/21/06, Grzegorz Makarewicz <mak at trisoft.com.pl> wrote:
>
> Hi,
>
> - *.txt files for unicode tests are downloaded from the internet - I don't
> like this.


Then don't use the urlfetch resource when running regrtest.py (which you did
specify when you ran with ``-uall``).

> - __db.004 isn't removed after tests
> - init_types is declared static in python/python-ast.c and can't be
> imported from PC/config.c.
> - python_d -u regrtest.py -u bsddb -u curses -uall -v = dies after
> testInfinitRecursion without any message, just disappears from tasks
> and doesn't show anything


Please file a bug report for each of these issues.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060921/8ca024ed/attachment.htm 

From martin at v.loewis.de  Thu Sep 21 20:20:55 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 21 Sep 2006 20:20:55 +0200
Subject: [Python-Dev] win32 - results from Lib/test - 2.5 release-maint
In-Reply-To: <4512BC6F.3090907@trisoft.com.pl>
References: <4512BC6F.3090907@trisoft.com.pl>
Message-ID: <4512D807.5060501@v.loewis.de>

Please submit a patch to sf.net/projects/python.

> - *.txt files for unicode tests are downloaded from the internet - I don't 
> like this.

What files specifically? Could it be that you passed -u urlfetch
or -u all? If so, then just don't.

> - init_types is declared static in python/python-ast.c and can't be 
> imported from PC/config.c.

Why is that a problem? PC/config.c refers to Modules/_typesmodule.c.

Regards,
Martin

From nnorwitz at gmail.com  Thu Sep 21 21:17:09 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 21 Sep 2006 12:17:09 -0700
Subject: [Python-Dev] release25-maint is UNFROZEN
In-Reply-To: <20060921123510.GA22457@code0.codespeak.net>
References: <200609212112.04923.anthony@interlink.com.au>
	<20060921123510.GA22457@code0.codespeak.net>
Message-ID: <ee2a432c0609211217y6ca0f45rbfc07afc9b9f2846@mail.gmail.com>

On 9/21/06, Armin Rigo <arigo at tunes.org> wrote:
> Hi Anthony,
>
> On Thu, Sep 21, 2006 at 09:12:03PM +1000, Anthony Baxter wrote:
> > Thanks to everyone for helping make 2.5 happen. It's been a long slog there,
> > but I think we can all be proud of the result.
>
> Thanks for the hassle!  I've got another bit of it for you, though.  The
> frozen 2.5 documentation doesn't seem to be available on-line.  At
> least, the doc links from the release page point to the 'dev' 2.6a0
> version, and the URL following the common scheme -
> http://www.python.org/doc/2.5/ - doesn't work.

I got  http://docs.python.org/dev/2.5/ working last night.  So when
the 2.5 docs are updated these pages will reflect that.

http://docs.python.org/ should point to the 2.5 docs too.  I looked at
making these changes, but was confused about what needed to be done.

n

From kbk at shore.net  Thu Sep 21 22:15:24 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Thu, 21 Sep 2006 16:15:24 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary ** REVISED **
Message-ID: <200609212015.k8LKFONN031921@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  420 open ( +4) /  3410 closed ( +2) /  3830 total ( +6)
Bugs    :  915 open (+17) /  6186 closed ( +6) /  7101 total (+23)
RFE     :  235 open ( +1) /   238 closed ( +0) /   473 total ( +1)

New / Reopened Patches
______________________

Practical ctypes example  (2006-09-15)
       http://python.org/sf/1559219  opened by  leppton

test_popen fails on Windows if installed to "Program Files"  (2006-09-15)
       http://python.org/sf/1559298  opened by  Martin v. Löwis

test_cmd_line fails on Windows  (2006-09-15)
       http://python.org/sf/1559413  opened by  Martin v. Löwis

pyclbr reports different module for Class and Function  (2006-09-18)
       http://python.org/sf/1560617  opened by  Peter Otten

Exec stacks in python 2.5  (2006-09-18)
       http://python.org/sf/1560695  opened by  Chaza

Python 2.5 fails with -Wl,--as-needed in LDFLAGS  (2006-09-21)
       http://python.org/sf/1562825  opened by  Chaza

Patches Closed
______________

test_grp.py doesn't skip special NIS entry, fails  (2006-06-22)
       http://python.org/sf/1510987  closed by  martineau

Add RLIMIT_SBSIZE to resource module  (2006-09-13)
       http://python.org/sf/1557515  closed by  loewis

New / Reopened Bugs
___________________

some section links (previous, up, next) missing last word  (2006-09-15)
       http://python.org/sf/1559142  opened by  Tim Smith

time.strptime() access non exitant attribute in calendar.py  (2006-09-15)
CLOSED http://python.org/sf/1559515  opened by  betatim

shutil.copyfile incomplete on NTFS  (2006-09-16)
       http://python.org/sf/1559684  opened by  Roger Upole

gcc trunk (4.2) exposes a signed integer overflows  (2006-08-24)
       http://python.org/sf/1545668  reopened by  arigo

2.5c2 pythonw does not execute  (2006-09-16)
CLOSED http://python.org/sf/1559747  opened by  Ron Platten

list.sort does nothing when both cmp and key are given  (2006-09-16)
CLOSED http://python.org/sf/1559818  opened by  Marcin 'Qrczak' Kowalczyk

confusing error msg from random.randint  (2006-09-17)
       http://python.org/sf/1560032  opened by  paul rubin

Tutorial: incorrect info about package importing and mac  (2006-09-17)
       http://python.org/sf/1560114  opened by  C L

Better/faster implementation of os.path.split  (2006-09-17)
CLOSED http://python.org/sf/1560161  opened by  Michael Gebetsroither

Better/faster implementation of os.path.basename/dirname  (2006-09-17)
       http://python.org/sf/1560179  reopened by  gbrandl

Better/faster implementation of os.path.basename/dirname  (2006-09-17)
       http://python.org/sf/1560179  opened by  Michael Gebetsroither

copy() method of dictionaries is not "deep"  (2006-09-17)
       http://python.org/sf/1560327  reopened by  gbrandl

copy() method of dictionaries is not "deep"  (2006-09-17)
       http://python.org/sf/1560327  opened by  daniel hahler

strftime('%z') behaving differently with/without time arg.  (2006-09-18)
       http://python.org/sf/1560794  opened by  Knut Aksel Røysland

python 2.5 fails to build with --as-needed  (2006-09-18)
       http://python.org/sf/1560984  opened by  Chaza

mac installer profile patch vs. .bash_login  (2006-09-19)
       http://python.org/sf/1561243  opened by  Ronald Oussoren

-xcode=pic32 option is not supported on Solaris x86 Sun C  (2006-09-19)
       http://python.org/sf/1561333  opened by  James Lick

Dedent with Italian keyboard  (2006-09-20)
       http://python.org/sf/1562092  opened by  neclepsio

Fails to install on Fedora Core 5  (2006-09-20)
       http://python.org/sf/1562171  opened by  Mark Summerfield

IDLE Hung up after open script by command line...  (2006-09-20)
       http://python.org/sf/1562193  opened by  Faramir^

uninitialized memory read in parsetok()  (2006-09-20)
       http://python.org/sf/1562308  opened by  Luke Moore

asyncore.dispatcher.set_reuse_addr not documented.  (2006-09-20)
       http://python.org/sf/1562583  opened by  Noah Spurrier

Spurious tab/space warning  (2006-09-21)
       http://python.org/sf/1562716  opened by  torhu

Spurious Tab/space error  (2006-09-21)
       http://python.org/sf/1562719  opened by  torhu

decimal module borks thread  (2006-09-21)
       http://python.org/sf/1562822  opened by  Jaster

MacPython ignores user-installed Tcl/Tk  (2006-09-21)
       http://python.org/sf/1563046  opened by  Russell Owen

code.InteractiveConsole() and closed sys.stdout  (2006-09-21)
       http://python.org/sf/1563079  opened by  Skip Montanaro

Bugs Closed
___________

2.5c2 macosx installer aborts during "GUI Applications"  (2006-09-15)
       http://python.org/sf/1558983  closed by  ronaldoussoren

time.strptime() access non existant attribute in calendar.py  (2006-09-15)
       http://python.org/sf/1559515  closed by  bcannon

2.5c2 pythonw does not execute  (2006-09-16)
       http://python.org/sf/1559747  closed by  loewis

list.sort does nothing when both cmp and key are given  (2006-09-16)
       http://python.org/sf/1559818  closed by  qrczak

Better/faster implementation of os.path.split  (2006-09-17)
       http://python.org/sf/1560161  deleted by  einsteinmg

Better/faster implementation of os.path.basename/dirname  (2006-09-17)
       http://python.org/sf/1560179  deleted by  einsteinmg

copy() method of dictionaries is not "deep"  (2006-09-17)
       http://python.org/sf/1560327  closed by  gbrandl

UCS-4 tcl not found on SUSE 10.1 with tcl and tk 8.4.12-14  (2006-09-02)
       http://python.org/sf/1550791  closed by  loewis

New / Reopened RFE
__________________

Exception need structured information associated with them  (2006-09-15)
       http://python.org/sf/1559549  opened by  Ned Batchelder

String searching performance improvement  (2006-09-19)
CLOSED http://python.org/sf/1561634  opened by  Nick Welch

RFE Closed
__________

String searching performance improvement  (2006-09-19)
       http://python.org/sf/1561634  deleted by  mackstann


From p.f.moore at gmail.com  Thu Sep 21 22:22:55 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 21 Sep 2006 21:22:55 +0100
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
Message-ID: <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>

On 9/21/06, Guido van Rossum <guido at python.org> wrote:
> I think one missing feature is a mechanism whereby you can say "THIS
> package (gives top-level package name) lives HERE (gives filesystem
> location of package)" without adding the parent of HERE to sys.path
> for all module searches. I think Phillip Eby's egg system might
> benefit from this.

This is pretty easy to do with a custom importer on sys.meta_path.
Getting the details right is a touch fiddly, but it's conceptually
straightforward.

Hmm, I might play with this - a set of PEP 302 importers to completely
customise the import mechanism. The never-completed "phase 2" of the
PEP included a reimplementation of the built in import mechanism as
hooks. Is there any interest in this actually happening? I've been
looking for an interesting coding project for a while (although I
never have any free time...)

Paul.

From guido at python.org  Thu Sep 21 22:54:45 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 21 Sep 2006 13:54:45 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
Message-ID: <ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.com>

On 9/21/06, Paul Moore <p.f.moore at gmail.com> wrote:
> On 9/21/06, Guido van Rossum <guido at python.org> wrote:
> > I think one missing feature is a mechanism whereby you can say "THIS
> > package (gives top-level package name) lives HERE (gives filesystem
> > location of package)" without adding the parent of HERE to sys.path
> > for all module searches. I think Phillip Eby's egg system might
> > benefit from this.
>
> This is pretty easy to do with a custom importer on sys.meta_path.
> Getting the details right is a touch fiddly, but it's conceptually
> straightforward.

Isn't the main problem how to specify a bunch of these in the
environment? Or can this be done through .pkg files? Those aren't
cheap either though -- it would be best if the work was only done when
the package is actually needed.

> Hmm, I might play with this - a set of PEP 302 importers to completely
> customise the import mechanism. The never-completed "phase 2" of the
> PEP included a reimplementation of the built in import mechanism as
> hooks. Is there any interest in this actually happening? I've been
> looking for an interesting coding project for a while (although I
> never have any free time...)

There's a general desire to reimplement import entirely in Python for
more flexibility. I believe Brett Cannon is working on this.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From p.f.moore at gmail.com  Thu Sep 21 23:23:08 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 21 Sep 2006 22:23:08 +0100
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
	<ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.com>
Message-ID: <79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com>

On 9/21/06, Guido van Rossum <guido at python.org> wrote:
> On 9/21/06, Paul Moore <p.f.moore at gmail.com> wrote:
> > On 9/21/06, Guido van Rossum <guido at python.org> wrote:
> > > I think one missing feature is a mechanism whereby you can say "THIS
> > > package (gives top-level package name) lives HERE (gives filesystem
> > > location of package)" without adding the parent of HERE to sys.path
> > > for all module searches. I think Phillip Eby's egg system might
> > > benefit from this.
> >
> > This is pretty easy to do with a custom importer on sys.meta_path.
> > Getting the details right is a touch fiddly, but it's conceptually
> > straightforward.
>
> Isn't the main problem how to specify a bunch of these in the
> environment? Or can this be done through .pkg files? Those aren't
> cheap either though -- it would be best if the work was only done when
> the package is actually needed.

Hmm, I wasn't thinking of the environment. I pretty much never use
PYTHONPATH, so I tend to forget about that aspect. I was assuming an
importer object with a "register(package_name, filesystem_path)"
method. Then register the packages you want in your code, or in
site.py.

I've attached a trivial proof of concept.
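
Roughly along these lines (a minimal sketch, not the attached file itself;
the class name and the example path are just illustrative):

    import imp
    import os
    import sys

    class PackageRegistry(object):
        """PEP 302 meta_path hook: map a top-level package name to that
        package's own directory (the one holding its __init__.py), without
        putting the parent directory on sys.path for every other import."""

        def __init__(self):
            self._dirs = {}   # top-level package name -> package directory

        def register(self, package_name, filesystem_path):
            self._dirs[package_name] = filesystem_path

        def find_module(self, fullname, path=None):
            # Only claim registered top-level names; once the package is
            # loaded, its submodules are found via its __path__ as usual.
            if fullname in self._dirs:
                return self
            return None

        def load_module(self, fullname):
            if fullname in sys.modules:
                return sys.modules[fullname]
            pkg_dir = self._dirs[fullname]
            # Search only the registered parent directory for this one name.
            file_, pathname, description = imp.find_module(
                fullname, [os.path.dirname(pkg_dir)])
            try:
                module = imp.load_module(fullname, file_, pathname, description)
            finally:
                if file_:
                    file_.close()
            module.__loader__ = self
            return module

    registry = PackageRegistry()
    sys.meta_path.append(registry)
    # e.g. in site.py or application start-up code:
    # registry.register('mypackage', '/opt/libs/mypackage')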

But yes, you'd need to consider the environment. Possibly just have an
initialisation function called at load time (I'm assuming the new hook
is defined in a system module of some sort - I mean when that system
module is loaded) which parses an environment variable and issues a
set of register() calls.

Paul.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: imphook.py
Type: text/x-python
Size: 1247 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060921/56e5b9b7/attachment.py 

From pje at telecommunity.com  Thu Sep 21 23:23:13 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 21 Sep 2006 17:23:13 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.co
 m>
References: <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
	<cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
Message-ID: <5.1.1.6.0.20060921170416.027233b8@sparrow.telecommunity.com>

At 01:54 PM 9/21/2006 -0700, Guido van Rossum wrote:
>On 9/21/06, Paul Moore <p.f.moore at gmail.com> wrote:
> > On 9/21/06, Guido van Rossum <guido at python.org> wrote:
> > > I think one missing feature is a mechanism whereby you can say "THIS
> > > package (gives top-level package name) lives HERE (gives filesystem
> > > location of package)" without adding the parent of HERE to sys.path
> > > for all module searches. I think Phillip Eby's egg system might
> > > benefit from this.
> >
> > This is pretty easy to do with a custom importer on sys.meta_path.
> > Getting the details right is a touch fiddly, but it's conceptually
> > straightforward.
>
>Isn't the main problem how to specify a bunch of these in the
>environment?

Yes, that's exactly the problem, assuming that by environment you mean the 
operating environment, as opposed to e.g. os.environ.  (Environment 
variables are problematic for installation purposes, as on Unix-y systems 
there is no one obvious place to set them, and on Windows, the one obvious 
place is one that the user may have no permissions for!)


From grig.gheorghiu at gmail.com  Thu Sep 21 23:34:40 2006
From: grig.gheorghiu at gmail.com (Grig Gheorghiu)
Date: Thu, 21 Sep 2006 14:34:40 -0700
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
Message-ID: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>

One of the Pybots buildslaves has been failing the 'test' step, with
the culprit being test_itertools:

test_itertools
test test_itertools failed -- Traceback (most recent call last):
  File
"/Users/builder/pybots/pybot/trunk.osaf-x86/build/Lib/test/test_itertools.py",
line 62, in test_count
    self.assertEqual(repr(c), 'count(-9)')
AssertionError: 'count(4294967287)' != 'count(-9)'

This started to happen after
<http://svn.python.org/view?rev=51950&view=rev>.

The complete log for the test step on that buildslave is here:

http://www.python.org/dev/buildbot/community/all/x86%20OSX%20trunk/builds/19/step-test/0

Grig


-- 
http://agiletesting.blogspot.com

From p.f.moore at gmail.com  Thu Sep 21 23:39:05 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 21 Sep 2006 22:39:05 +0100
Subject: [Python-Dev] New relative import issue
In-Reply-To: <5.1.1.6.0.20060921170416.027233b8@sparrow.telecommunity.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
	<5.1.1.6.0.20060921170416.027233b8@sparrow.telecommunity.com>
Message-ID: <79990c6b0609211439q65fba28cu31611fa7291eabf1@mail.gmail.com>

On 9/21/06, Phillip J. Eby <pje at telecommunity.com> wrote:
> >Isn't the main problem how to specify a bunch of these in the
> >environment?
>
> Yes, that's exactly the problem, assuming that by environment you mean the
> operating environment, as opposed to e.g. os.environ.

Hmm, now I don't understand again. What "operating environment" might
there be, other than

- os.environ
- code that gets executed before the import

?

There are clearly application design issues involved here (application
configuration via initialisation files, plugin registries, etc etc).
But in purely technical terms, don't they all boil down to executing a
registration function (either directly by the user, or by the
application on behalf of the user)?

I don't think I'd expect anything at the language (or base library)
level beyond a registration function and possibly an OS environment
check.

Paul.

From jackdied at jackdied.com  Thu Sep 21 23:50:19 2006
From: jackdied at jackdied.com (Jack Diederich)
Date: Thu, 21 Sep 2006 17:50:19 -0400
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
Message-ID: <20060921215019.GA6677@performancedrivers.com>

The python binary is out of step with the test_itertools.py version.
You can generate this same error on your own box by reverting the
change to itertoolsmodule.c but leaving the new test in test_itertools.py

I don't know why this only happened on that OSX buildslave.
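
(For what it's worth, 4294967287 is just -9 read as an unsigned 32-bit
value, which at least fits the stale-binary explanation:

  >>> 2**32 - 9
  4294967287L
)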

On Thu, Sep 21, 2006 at 02:34:40PM -0700, Grig Gheorghiu wrote:
> One of the Pybots buildslaves has been failing the 'test' step, with
> the culprit being test_itertools:
> 
> test_itertools
> test test_itertools failed -- Traceback (most recent call last):
>   File
> "/Users/builder/pybots/pybot/trunk.osaf-x86/build/Lib/test/test_itertools.py",
> line 62, in test_count
>     self.assertEqual(repr(c), 'count(-9)')
> AssertionError: 'count(4294967287)' != 'count(-9)'
> 
> This started to happen after
> <http://svn.python.org/view?rev=51950&view=rev>.
> 
> The complete log for the test step on that buildslave is here:
> 
> http://www.python.org/dev/buildbot/community/all/x86%20OSX%20trunk/builds/19/step-test/0
> 
> Grig
> 
> 
> -- 
> http://agiletesting.blogspot.com
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/jackdied%40jackdied.com
> 

From brett at python.org  Thu Sep 21 23:55:52 2006
From: brett at python.org (Brett Cannon)
Date: Thu, 21 Sep 2006 14:55:52 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
	<ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.com>
Message-ID: <bbaeab100609211455l722d63a5w9c747f58c3b3db16@mail.gmail.com>

On 9/21/06, Guido van Rossum <guido at python.org> wrote:
>
> On 9/21/06, Paul Moore <p.f.moore at gmail.com> wrote:
> [SNIP]
> > Hmm, I might play with this - a set of PEP 302 importers to completely
> > customise the import mechanism. The never-completed "phase 2" of the
> > PEP included a reimplementation of the built in import mechanism as
> > hooks. Is there any interest in this actually happening? I've been
> > looking for an interesting coding project for a while (although I
> > never have any free time...)
>
> There's a general desire to reimplement import entirely in Python for
> more flexibility. I believe Brett Cannon is working on this.


Since I need to control imports to the point of being able to deny importing
built-in and extension modules, I was planning on re-implementing the import
system to use PEP 302 importers, possibly in pure Python for ease of use.
That could then serve as a starting point for possible Py3k improvements
to the import system.
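
To illustrate the kind of thing a PEP 302 meta_path hook makes possible -- a
toy sketch only, not the planned implementation, and the blocked name is
arbitrary:

    import sys

    class ImportBlocker(object):
        """Refuse to import the named modules.  meta_path hooks are
        consulted before the normal built-in/frozen and sys.path-based
        import machinery, so even extension modules can be blocked."""

        def __init__(self, *names):
            self.blocked = frozenset(names)

        def find_module(self, fullname, path=None):
            if fullname in self.blocked:
                return self
            return None

        def load_module(self, fullname):
            raise ImportError("import of %r is blocked" % fullname)

    sys.meta_path.insert(0, ImportBlocker('_socket'))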

But either way I will be messing with the import system in the relatively
near future.  If you want to help, Paul (or anyone else), just send me an
email and we can try to coordinate something (I plan to do the work in the
sandbox as a separate thing from my security stuff).

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060921/536ef637/attachment.htm 

From grig.gheorghiu at gmail.com  Fri Sep 22 00:28:04 2006
From: grig.gheorghiu at gmail.com (Grig Gheorghiu)
Date: Thu, 21 Sep 2006 15:28:04 -0700
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <20060921215019.GA6677@performancedrivers.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
Message-ID: <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>

On 9/21/06, Jack Diederich <jackdied at jackdied.com> wrote:
> The python binary is out of step with the test_itertools.py version.
> You can generate this same error on your own box by reverting the
> change to itertoolsmodule.c but leaving the new test in test_itertools.py
>
> I don't know why this only happened on that OSX buildslave

Not sure what you mean by out of step. The binary was built out of the
very latest itertoolsmodule.c, and test_itertools.py was also updated
from svn. So they're both in sync IMO. That test passes successfully
on all the other buildslaves in the Pybots farm (x86 Ubuntu, Debian,
Gentoo, RH9, AMD-64 Ubuntu)

Grig

From guido at python.org  Fri Sep 22 00:28:43 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 21 Sep 2006 15:28:43 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
	<ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.com>
	<79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com>
Message-ID: <ca471dc20609211528m61472450lbb0977d350407363@mail.gmail.com>

On 9/21/06, Paul Moore <p.f.moore at gmail.com> wrote:
> On 9/21/06, Guido van Rossum <guido at python.org> wrote:
> > Isn't the main problem how to specify a bunch of these in the
> > environment? Or can this be done through .pkg files? Those aren't
> > cheap either though -- it would be best if the work was only done when
> > the package is actually needed.
>
> Hmm, I wasn't thinking of the environment. I pretty much never use
> PYTHONPATH, so I tend to forget about that aspect.

As Phillip understood, I meant the environment to include the
filesystem (and on Windows, the registry -- in fact, Python on Windows
*has* exactly such a mechanism in the registry, although I believe
it's rarely used these days -- it was done by Mark Hammond to support
COM servers, I believe.)

> I was assuming an
> importer object with a "register(package_name, filesystem_path)"
> method. Then register the packages you want in your code, or in
> site.py.

Neither is an acceptable method for an installer tool (e.g. eggs) to
register new packages. It needs to be some kind of data file or set of
data files.

> But yes, you'd need to consider the environment. Possibly just have an
> initialisation function called at load time (I'm assuming the new hook
> is defined in a system module of some sort - I mean when that system
> module is loaded) which parses an environment variable and issues a
> set of register() calls.

os.environ is useless because there's no way for a package installer
to set it for all users.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri Sep 22 00:50:28 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 21 Sep 2006 18:50:28 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609211528m61472450lbb0977d350407363@mail.gmail.co
 m>
References: <79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com>
	<cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
	<ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.com>
	<79990c6b0609211423o72fa04fcgbdfb48655c5ab777@mail.gmail.com>
Message-ID: <5.1.1.6.0.20060921184915.02f4a4a8@sparrow.telecommunity.com>

At 03:28 PM 9/21/2006 -0700, Guido van Rossum wrote:
>os.environ is useless because there's no way for a package installer
>to set it for all users.

Or even for *one* user!  :)


From greg.ewing at canterbury.ac.nz  Fri Sep 22 01:37:37 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Sep 2006 11:37:37 +1200
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<eeooet$v1m$1@sea.gmane.org> <45108611.7090009@canterbury.ac.nz>
	<ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
Message-ID: <45132241.7030307@canterbury.ac.nz>

Guido van Rossum wrote:

> Eek? If there are two third-party top-level packages A and B, by
> different third parties, and A depends on B, how should A find B if
> not via sys.path or something that is sufficiently equivalent as to
> have the same problems?

Some kind of configuration mechanism is needed, but
I don't see why it can't be a static, declarative one
rather than computed at run time.

Whoever installs package A should be responsible for
setting up whatever environment is necessary around it
for it to find package B and anything else it directly
depends on.

The program C which uses package A needs to be told
where to find it. But C doesn't need to know where to
find B, the dependency on which is an implementation
detail of A, because A already knows where to find it.

In the Eiffel world, there's a thing called an ACE
(Assembly of Classes in Eiffel), which is a kind of
meta-language for describing the shape of the class
namespace, and it allows each source file to have its
own unique view of that namespace. I'm groping towards
something equivalent for the Python module namespace.
(AMP - Assembly of Modules in Python?)

--
Greg


From greg.ewing at canterbury.ac.nz  Fri Sep 22 02:07:13 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Sep 2006 12:07:13 +1200
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<eeooet$v1m$1@sea.gmane.org> <45108611.7090009@canterbury.ac.nz>
	<ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
Message-ID: <45132931.1070803@canterbury.ac.nz>

Another thought on static module namespace configuration:
It would make things a *lot* easier for py2exe, py2app
and the like that have to figure out what packages
a program depends on without running the program.

--
Greg

From pje at telecommunity.com  Fri Sep 22 02:15:41 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 21 Sep 2006 20:15:41 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <45132931.1070803@canterbury.ac.nz>
References: <ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
	<cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <eeooet$v1m$1@sea.gmane.org>
	<45108611.7090009@canterbury.ac.nz>
	<ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
Message-ID: <5.1.1.6.0.20060921201311.02735808@sparrow.telecommunity.com>

At 12:07 PM 9/22/2006 +1200, Greg Ewing wrote:
>Another thought on static module namespace configuration:
>It would make things a *lot* easier for py2exe, py2app
>and the like that have to figure out what packages
>a program depends on without running the program.

Setuptools users already explicitly declare what projects their projects 
depend on; this is how easy_install can then find and install those 
dependencies.  So, there is at least one system already available for 
Python that manages this type of thing, and my understanding is 
that the py2exe and py2app developers plan to support using this dependency 
information in the future.
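
For instance, a project's setup.py declares its dependencies roughly like
this (the project names below are made up):

    from setuptools import setup, find_packages

    setup(
        name='MyApp',
        version='0.1',
        packages=find_packages(),
        # easy_install resolves and installs these declared dependencies;
        # tools like py2exe/py2app could reuse the same metadata instead
        # of guessing dependencies by scanning imports.
        install_requires=['Twisted>=2.4', 'SomeOtherProject'],
    )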


From guido at python.org  Fri Sep 22 02:17:24 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 21 Sep 2006 17:17:24 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <45132241.7030307@canterbury.ac.nz>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <eeooet$v1m$1@sea.gmane.org>
	<45108611.7090009@canterbury.ac.nz>
	<ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
	<45132241.7030307@canterbury.ac.nz>
Message-ID: <ca471dc20609211717w2ed59224l2dfd9f57adbee1a4@mail.gmail.com>

On 9/21/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>
> > Eek? If there are two third-party top-level packages A and B, by
> > different third parties, and A depends on B, how should A find B if
> > not via sys.path or something that is sufficiently equivalent as to
> > have the same problems?
>
> Some kind of configuration mechanism is needed, but
> I don't see why it can't be a static, declarative one
> rather than computed at run time.

That would preclude writing the code that interprets the static data
in Python itself.

Despite the good use cases I think that's a big showstopper.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Fri Sep 22 02:17:24 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Sep 2006 12:17:24 +1200
Subject: [Python-Dev] New relative import issue
In-Reply-To: <05af01c6dd7e$a2209560$e303030a@trilan>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
Message-ID: <45132B94.6050401@canterbury.ac.nz>

Giovanni Bajo wrote:

> My idea (and interpretation of Greg's statement) is that a module/package
> should be able to live with either relative imports within itself, or fully
> absolute imports.

I think it goes further than that -- each module should
(potentially) have its own unique view of the module
namespace, defined at the time the module is installed,
that can't be disturbed by anything that any other module
does.

--
Greg

From guido at python.org  Fri Sep 22 02:18:41 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 21 Sep 2006 17:18:41 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <5.1.1.6.0.20060921201311.02735808@sparrow.telecommunity.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <eeooet$v1m$1@sea.gmane.org>
	<45108611.7090009@canterbury.ac.nz>
	<ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
	<45132931.1070803@canterbury.ac.nz>
	<5.1.1.6.0.20060921201311.02735808@sparrow.telecommunity.com>
Message-ID: <ca471dc20609211718r1333b029q9342fc2c0552ce9e@mail.gmail.com>

I think it would be worth writing up a PEP to describe this, if it's
to become a de-facto standard. That might be a better path towards
standardization than just checking in the code... :-/

--Guido

On 9/21/06, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 12:07 PM 9/22/2006 +1200, Greg Ewing wrote:
> >Another thought on static module namespace configuration:
> >It would make things a *lot* easier for py2exe, py2app
> >and the like that have to figure out what packages
> >a program depends on without running the program.
>
> Setuptools users already explicitly declare what projects their projects
> depend on; this is how easy_install can then find and install those
> dependencies.  So, there is at least one system already available for
> Python that manages this type of thing already, and my understanding is
> that the py2exe and py2app developers plan to support using this dependency
> information in the future.
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri Sep 22 02:20:03 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 21 Sep 2006 17:20:03 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <45132B94.6050401@canterbury.ac.nz>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz> <20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<45132B94.6050401@canterbury.ac.nz>
Message-ID: <ca471dc20609211720u7ea3714fnfa8e15fc6c4d151d@mail.gmail.com>

On 9/21/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Giovanni Bajo wrote:
>
> > My idea (and interpretation of Greg's statement) is that a module/package
> > should be able to live with either relative imports within itself, or fully
> > absolute imports.
>
> I think it goes further than that -- each module should
> (potentially) have its own unique view of the module
> namespace, defined at the time the module is installed,
> that can't be disturbed by anything that any other module
> does.

Well, maybe. But there's also the requirement that if packages A and B
both import C, they should get the same C. Having multiple versions of
the same package loaded simultaneously sounds like a recipe for
disaster.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Fri Sep 22 02:21:01 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Sep 2006 12:21:01 +1200
Subject: [Python-Dev] list.discard? (Re: dict.discard)
In-Reply-To: <20060921134249.GA9238@niemeyer.net>
References: <20060921134249.GA9238@niemeyer.net>
Message-ID: <45132C6D.9010806@canterbury.ac.nz>

Gustavo Niemeyer wrote:

>   >>> print set.discard.__doc__
>   Remove an element from a set if it is a member.

Actually I'd like this for lists. Often I find myself
writing

   if x not in somelist:
     somelist.remove(x)

A single method for doing this would be handy, and
more efficient.

--
Greg

From fdrake at acm.org  Fri Sep 22 02:28:00 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 21 Sep 2006 20:28:00 -0400
Subject: [Python-Dev] list.discard? (Re: dict.discard)
In-Reply-To: <45132C6D.9010806@canterbury.ac.nz>
References: <20060921134249.GA9238@niemeyer.net>
	<45132C6D.9010806@canterbury.ac.nz>
Message-ID: <200609212028.00824.fdrake@acm.org>

On Thursday 21 September 2006 20:21, Greg Ewing wrote:
 >    if x not in somelist:
 >      somelist.remove(x)

I'm just guessing you really meant "if x in somelist".  ;-)


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From greg.ewing at canterbury.ac.nz  Fri Sep 22 02:40:03 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Sep 2006 12:40:03 +1200
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
Message-ID: <451330E3.1020701@canterbury.ac.nz>

Guido van Rossum wrote:

> While I agree with your idea(l), I don't think that's what Greg meant.
> He clearly say "sys.path should not exist at all".

Refining that a bit, I don't think there should be
a *single* sys.path for the whole program -- more
like each module having its own sys.path. And, at
least in most cases, its contents should be set
up from static information that exists outside the
program, established when the module is installed.

One reason for this is the lack of any absolute
notion of a "program". What is a program on one
level can be part of a larger program on another
level. For example, a module with test code that
is run when it's invoked as a main script.
Sometimes it's a program of its own, other times
it's not. And it shouldn't *matter* whether it's
a program or not when it comes to being able to
find other modules that it needs to import. So
using a piece of program-wide shared state for
this seems wrong.

--
Greg

From scott+python-dev at scottdial.com  Fri Sep 22 02:32:52 2006
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Thu, 21 Sep 2006 20:32:52 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609211720u7ea3714fnfa8e15fc6c4d151d@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>	<20060918091314.GA26814@code0.codespeak.net>	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>	<05af01c6dd7e$a2209560$e303030a@trilan>	<45132B94.6050401@canterbury.ac.nz>
	<ca471dc20609211720u7ea3714fnfa8e15fc6c4d151d@mail.gmail.com>
Message-ID: <45132F34.801@scottdial.com>

Guido van Rossum wrote:
> On 9/21/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> I think it goes further than that -- each module should
>> (potentially) have its own unique view of the module
>> namespace, defined at the time the module is installed,
>> that can't be disturbed by anything that any other module
>> does.
> 
> Well, maybe. But there's also the requirement that if packages A and B
> both import C, they should get the same C. Having multiple versions of
> the same package loaded simultaneously sounds like a recipe for
> disaster.

I have exactly this scenario in my current codebase for a server. It was 
absolutely necessary for me to update a module in Twisted because all 
other solutions I could come up with were less desirable. Either I send 
my patch upstream and wait (can't wait), or I fork out another version 
and place it at the top of sys.path (this seems ok). Except an even 
better solution is to maintain my own subset of Twisted, because I am 
localized to a particularly small corner of the codebase. I can continue 
to use upstream updates to the rest of Twisted without any fussing about 
merging changes and so forth. And if Twisted was allowed to decide how 
it saw its own world, then I would have to go back to maintaining my own 
complete branch.

While I don't strictly need to be able to do this, I wanted to at least 
raise my hand and say, "I abuse this facet of the current import 
mechanism."

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From jcarlson at uci.edu  Fri Sep 22 03:02:07 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 21 Sep 2006 18:02:07 -0700
Subject: [Python-Dev] list.discard? (Re: dict.discard)
In-Reply-To: <45132C6D.9010806@canterbury.ac.nz>
References: <20060921134249.GA9238@niemeyer.net>
	<45132C6D.9010806@canterbury.ac.nz>
Message-ID: <20060921175719.0842.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Gustavo Niemeyer wrote:
> 
> >   >>> print set.discard.__doc__
> >   Remove an element from a set if it is a member.
> 
> Actually I'd like this for lists. Often I find myself
> writing
> 
>    if x not in somelist:
>      somelist.remove(x)
> 
> A single method for doing this would be handy, and
> more efficient.

A marginal calling-time improvement; but we are still talking about a
linear-time containment test.

I'm -0, if only because I've never really found the need to use
list.remove(), so this API expansion doesn't feel necessary to me.

 - Josiah


From pje at telecommunity.com  Fri Sep 22 03:40:57 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 21 Sep 2006 21:40:57 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <451330E3.1020701@canterbury.ac.nz>
References: <ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
Message-ID: <5.1.1.6.0.20060921212856.027284d8@sparrow.telecommunity.com>

At 12:40 PM 9/22/2006 +1200, Greg Ewing wrote:
>Guido van Rossum wrote:
>
> > While I agree with your idea(l), I don't think that's what Greg meant.
> > He clearly say "sys.path should not exist at all".
>
>Refining that a bit, I don't think there should be
>a *single* sys.path for the whole program -- more
>like each module having its own sys.path. And, at
>least in most cases, its contents should be set
>up from static information that exists outside the
>program, established when the module is installed.

Now I'm a little bit more in agreement with you, but not by much.  :)

In the Java world, the OSGi framework (which is a big inspiration for many 
aspects of setuptools) effectively has a sys.path per installed project 
(modulo certain issues for dynamic imports, which I'll get into more below).

And, when Bob Ippolito and I were first designing the egg runtime system, 
we considered using some hacks to do the same thing, such that you actually 
*could* have more than one version of a package present and being used by 
two other packages that require it.

However, we eventually tossed it as a YAGNI; dependency resolution is too 
damn hard already without also having to deal with crud like C extensions 
only being able to be loaded once, etc., in addition to the necessary 
import hackery.

So, in principle, it's a good idea.  In practice, I don't think it can be 
achieved on the CPython platform.  You need something like a Jython or 
IronPython or a PyPy+ctypes-ish platform, where all the C is effectively 
abstracted behind some barrier that prevents the module problem from 
occurring, and you could in principle load the modules in different 
interpreters (or "class loaders" in the Java context).

Amusingly, this is one of the few instances where every Python *except* 
CPython is probably in a better position to implement the feature...  :)


>One reason for this is the lack of any absolute
>notion of a "program". What is a program on one
>level can be part of a larger program on another
>level. For example, a module with test code that
>is run when it's invoked as a main script.
>Sometimes it's a program of its own, other times
>it's not. And it shouldn't *matter* whether it's
>a program or not when it comes to being able to
>find other modules that it needs to import. So
>using a piece of program-wide shared state for
>this seems wrong.

An interesting point about this is that it coincidentally solves the 
problem of dynamic interpretation of meta-syntactic features.  That is, if 
there is a static way to know what modules are accessible to the module, 
then the resolution of meta-syntax features like macros or custom 
statements is simpler than if a runtime resolution is required.

Of course, in all of this you're still ignoring the part where some modules 
may need to perform dynamic or "weak" imports.  So at least *some* portion 
of a module's import path *is* dependent on the notion of the "current 
program".

(See the documentation for the "Importing" package at 
http://peak.telecommunity.com/DevCenter/Importing for an explanation of 
"weak" importing.)


From jackdied at jackdied.com  Fri Sep 22 04:08:58 2006
From: jackdied at jackdied.com (Jack Diederich)
Date: Thu, 21 Sep 2006 22:08:58 -0400
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
Message-ID: <20060922020858.GB6677@performancedrivers.com>

On Thu, Sep 21, 2006 at 03:28:04PM -0700, Grig Gheorghiu wrote:
> On 9/21/06, Jack Diederich <jackdied at jackdied.com> wrote:
> > The python binary is out of step with the test_itertools.py version.
> > You can generate this same error on your own box by reverting the
> > change to itertoolsmodule.c but leaving the new test in test_itertools.py
> >
> > I don't know why this only happened on that OSX buildslave
> 
> Not sure what you mean by out of step. The binary was built out of the
> very latest itertoolsmodule.c, and test_itertools.py was also updated
> from svn. So they're both in sync IMO. That tests passes successfully
> on all the other buildslaves in the Pybots farm (x86 Ubuntu, Debian,
> Gentoo, RH9, AMD-64 Ubuntu)
> 

When I saw the failure, first I cursed (a lot).  Then I followed the repr
all the way down into stringobject.c, no dice.  Then I noticed that the
failure is exactly what you get if the test was updated but the old
module wasn't.

Faced with the choice of believing in a really strange platform specific 
bug in a commonly used routine that resulted in exactly the failure caused 
by one of the two files being updated or believing a failure occurred in the
long chain of networks, disks, file systems, build tools, and operating 
systems that would result in only one of the files being updated -
I went with the latter.

I'll continue in my belief until my dying day or until someone with OSX
confirms it is a bug, whichever comes first.

not-gonna-sweat-it-ly,

-Jack

From grig.gheorghiu at gmail.com  Fri Sep 22 04:16:33 2006
From: grig.gheorghiu at gmail.com (Grig Gheorghiu)
Date: Thu, 21 Sep 2006 19:16:33 -0700
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <20060922020858.GB6677@performancedrivers.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
Message-ID: <3f09d5a00609211916g58d3aabam683517fe047f352b@mail.gmail.com>

On 9/21/06, Jack Diederich <jackdied at jackdied.com> wrote:
> On Thu, Sep 21, 2006 at 03:28:04PM -0700, Grig Gheorghiu wrote:
> > On 9/21/06, Jack Diederich <jackdied at jackdied.com> wrote:
> > > The python binary is out of step with the test_itertools.py version.
> > > You can generate this same error on your own box by reverting the
> > > change to itertoolsmodule.c but leaving the new test in test_itertools.py
> > >
> > > I don't know why this only happened on that OSX buildslave
> >
> > Not sure what you mean by out of step. The binary was built out of the
> > very latest itertoolsmodule.c, and test_itertools.py was also updated
> > from svn. So they're both in sync IMO. That tests passes successfully
> > on all the other buildslaves in the Pybots farm (x86 Ubuntu, Debian,
> > Gentoo, RH9, AMD-64 Ubuntu)
> >
>
> When I saw the failure, first I cursed (a lot).  Then I followed the repr
> all the way down into stringobject.c, no dice.  Then I noticed that the
> failure is exactly what you get if the test was updated but the old
> module wasn't.
>
> Faced with the choice of believing in a really strange platform specific
> bug in a commonly used routine that resulted in exactly the failure caused
> by one of the two files being updated or believing a failure occurred in the
> long chain of networks, disks, file systems, build tools, and operating
> systems that would result in only one of the files being updated -
> I went with the latter.
>
> I'll continue in my belief until my dying day or until someone with OSX
> confirms it is a bug, whichever comes first.
>
> not-gonna-sweat-it-ly,
>
> -Jack
> _______________________________________________

OK, sorry for having caused you so much grief....I'll investigate some
more on the Pybots side and I'll let you know what I find.

Grig

From grig.gheorghiu at gmail.com  Fri Sep 22 04:31:17 2006
From: grig.gheorghiu at gmail.com (Grig Gheorghiu)
Date: Thu, 21 Sep 2006 19:31:17 -0700
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <3f09d5a00609211916g58d3aabam683517fe047f352b@mail.gmail.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
	<3f09d5a00609211916g58d3aabam683517fe047f352b@mail.gmail.com>
Message-ID: <3f09d5a00609211931j3cf5a084y8e4ea2603a65034@mail.gmail.com>

On 9/21/06, Grig Gheorghiu <grig.gheorghiu at gmail.com> wrote:
> On 9/21/06, Jack Diederich <jackdied at jackdied.com> wrote:
> > On Thu, Sep 21, 2006 at 03:28:04PM -0700, Grig Gheorghiu wrote:
> > > On 9/21/06, Jack Diederich <jackdied at jackdied.com> wrote:
> > > > The python binary is out of step with the test_itertools.py version.
> > > > You can generate this same error on your own box by reverting the
> > > > change to itertoolsmodule.c but leaving the new test in test_itertools.py
> > > >
> > > > I don't know why this only happened on that OSX buildslave
> > >
> > > Not sure what you mean by out of step. The binary was built out of the
> > > very latest itertoolsmodule.c, and test_itertools.py was also updated
> > > from svn. So they're both in sync IMO. That tests passes successfully
> > > on all the other buildslaves in the Pybots farm (x86 Ubuntu, Debian,
> > > Gentoo, RH9, AMD-64 Ubuntu)
> > >
> >
> > When I saw the failure, first I cursed (a lot).  Then I followed the repr
> > all the way down into stringobject.c, no dice.  Then I noticed that the
> > failure is exactly what you get if the test was updated but the old
> > module wasn't.
> >
> > Faced with the choice of believing in a really strange platform specific
> > bug in a commonly used routine that resulted in exactly the failure caused
> > by one of the two files being updated or believing a failure occurred in the
> > long chain of networks, disks, file systems, build tools, and operating
> > systems that would result in only one of the files being updated -
> > I went with the latter.
> >
> > I'll continue in my belief until my dying day or until someone with OSX
> > confirms it is a bug, whichever comes first.
> >
> > not-gonna-sweat-it-ly,
> >
> > -Jack
> > _______________________________________________
>
> OK, sorry for having caused you so much grief....I'll investigate some
> more on the Pybots side and I'll let you know what I find.
>
> Grig
>


Actually, that test fails also in the official Python buildbot farm,
on a g4 osx machine. See

http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-test/0

So it looks like it's an OS X specific issue.

Grig

From mhammond at skippinet.com.au  Fri Sep 22 04:29:45 2006
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Fri, 22 Sep 2006 12:29:45 +1000
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609211528m61472450lbb0977d350407363@mail.gmail.com>
Message-ID: <13c301c6ddee$f8964370$050a0a0a@enfoldsystems.local>

Guido writes:

> As Phillip understood, I meant the environment to include the
> filesystem (and on Windows, the registry -- in fact, Python on Windows
> *has* exactly such a mechanism in the registry, although I believe
> it's rarely used these days -- it was done by Mark Hammond to support
> COM servers I believe.)

It is rarely used these days due to the fact it is truly global to the
machine.  These days, it is not uncommon to have multiple copies of the same
Python version installed on the same box - generally installed privately
into an application by the vendor - eg, Komodo and Open Office both do this
to some degree.

The problem with a global registry is that *all* Python installations
honoured it.  This meant bugs in the vendor applications, as their 'import
foo' did *not* locate the foo module inside their directory, but instead
loaded a completely unrelated one, which promptly crashed.

A mechanism similar to .pth files, where the "declaration" of a module's
location is stored privately to the installation seems a more workable
approach.

Mark


From skip at pobox.com  Fri Sep 22 05:08:56 2006
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 21 Sep 2006 22:08:56 -0500
Subject: [Python-Dev] list.discard? (Re: dict.discard)
In-Reply-To: <45132C6D.9010806@canterbury.ac.nz>
References: <20060921134249.GA9238@niemeyer.net>
	<45132C6D.9010806@canterbury.ac.nz>
Message-ID: <17683.21448.909748.200493@montanaro.dyndns.org>


    Greg> Actually I'd like [discard] for lists.

It's obvious for sets and dictionaries that there is only one thing to
discard and that after the operation you're guaranteed the key no longer
exists.  Would you want the same semantics for lists or the semantics of
list.remove where it only removes the first instance?

When I want to remove something from a list I typically write:

   while x in somelist:
       somelist.remove(x)

not "if" as in your example.

Skip

From jcarlson at uci.edu  Fri Sep 22 05:44:30 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 21 Sep 2006 20:44:30 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <45132241.7030307@canterbury.ac.nz>
References: <ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
	<45132241.7030307@canterbury.ac.nz>
Message-ID: <20060921183846.0845.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> Guido van Rossum wrote:
> 
> > Eek? If there are two third-party top-level packages A and B, by
> > different third parties, and A depends on B, how should A find B if
> > not via sys.path or something that is sufficiently equivalent as to
> > have the same problems?
> 
> Some kind of configuration mechanism is needed, but
> I don't see why it can't be a static, declarative one
> rather than computed at run time.

I could be missing something, or be completely off-topic, but why not
both, or really a mechanism to define:
1. Installation time package location (register package X in the package
registry at path Y and persist across Python sessions).
2. Runtime package location (package X is in path Y, do not persist
across Python sessions).

With 1 and 2, we remove the need for .pth files, all packages to be
installed into Lib/site-packages, and sys.path manipulation.  You want
access to package X?
    packages.register('X', '~/mypackages/X')
    packages.register('X', '~/mypackages/X', persist=True)
    packages.register('X', '~/mypackages/X', systemwide=True)


This can be implemented with a fairly simple package registry, contained
within a (small) SQLite database (which is conveniently shipped in
Python 2.5).  There can be a system-wide database that all users use as
a base, with a user-defined package registry (per user) where the
system-wide packages can be augmented.

With a little work, it could even be possible to define importers during
registration (filesystem, zip, database, etc.) or include a tracing
mechanism for py2exe/distutils/py2app/cx_freeze/etc. (optionally writing
to a simplified database-like file for distribution so that SQLite
doesn't need to be shipped).
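
A very rough sketch of what I mean (the 'packages' name, the API, and the
schema here are all made up):

    import os
    import sqlite3

    class PackageRegistry(object):
        # Illustrative only: maps package names to filesystem locations,
        # optionally persisting them in a small SQLite database.
        def __init__(self, dbpath):
            self.db = sqlite3.connect(dbpath)
            self.db.execute('CREATE TABLE IF NOT EXISTS packages '
                            '(name TEXT PRIMARY KEY, path TEXT)')
            self.runtime = {}          # non-persistent registrations

        def register(self, name, path, persist=False):
            path = os.path.abspath(os.path.expanduser(path))
            if persist:
                self.db.execute(
                    'INSERT OR REPLACE INTO packages VALUES (?, ?)',
                    (name, path))
                self.db.commit()
            else:
                self.runtime[name] = path

        def lookup(self, name):
            if name in self.runtime:
                return self.runtime[name]
            row = self.db.execute(
                'SELECT path FROM packages WHERE name = ?',
                (name,)).fetchone()
            if row:
                return row[0]
            return None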


 - Josiah


From martin at v.loewis.de  Fri Sep 22 06:09:41 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 22 Sep 2006 06:09:41 +0200
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <20060922020858.GB6677@performancedrivers.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>	<20060921215019.GA6677@performancedrivers.com>	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
Message-ID: <45136205.6060603@v.loewis.de>

Jack Diederich schrieb:
> Faced with the choice of believing in a really strange platform specific 
> bug in a commonly used routine that resulted in exactly the failure caused 
> by one of the two files being updated or believing a failure occurred in the
> long chain of networks, disks, file systems, build tools, and operating 
> systems that would result in only one of the files being updated -
> I went with the latter.

Please reconsider how subversion works. It has the notion of atomic
commits, so you either get the entire change, or none at all.

Fortunately, the buildbot keeps logs of everything it does:

http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-svn/0

shows

U    Lib/test/test_itertools.py
U    Modules/itertoolsmodule.c
Updated to revision 51950.

So it said it updated both files. But perhaps it didn't build them?
Let's check:


http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-compile/0

has this:

building 'itertools' extension

gcc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp
-mno-fused-madd -g -Wall -Wstrict-prototypes -I.
-I/Users/buildslave/bb/trunk.psf-g4/build/./Include
-I/Users/buildslave/bb/trunk.psf-g4/build/./Mac/Include -I./Include -I.
-I/usr/local/include -I/Users/buildslave/bb/trunk.psf-g4/build/Include
-I/Users/buildslave/bb/trunk.psf-g4/build -c
/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.c -o
build/temp.macosx-10.3-ppc-2.6/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.o

gcc -bundle -undefined dynamic_lookup
build/temp.macosx-10.3-ppc-2.6/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.o
-L/usr/local/lib -o build/lib.macosx-10.3-ppc-2.6/itertools.so

So itertools.so is regenerated, as it should; qed.

Regards,
Martin

From pje at telecommunity.com  Fri Sep 22 06:10:45 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 22 Sep 2006 00:10:45 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <20060921183846.0845.JCARLSON@uci.edu>
References: <45132241.7030307@canterbury.ac.nz>
	<ca471dc20609201917k4df3d19dof01f70a846f4b30d@mail.gmail.com>
	<45132241.7030307@canterbury.ac.nz>
Message-ID: <5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com>

At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote:
>This can be implemented with a fairly simple package registry, contained
>within a (small) SQLite database (which is conveniently shipped in
>Python 2.5).  There can be a system-wide database that all users use as
>a base, with a user-defined package registry (per user) where the
>system-wide packages can be augmented.

As far as I can tell, you're ignoring that per-user must *also* be 
per-version, and per-application.  Each application or runtime environment 
needs its own private set of information like this.

Next, putting the installation data inside a database instead of 
per-installation-unit files presents problems of its own.  While some 
system packaging tools allow install/uninstall scripts to run, they are 
often frowned upon, and can be unintentionally bypassed.

These are just a few of the issues that come to mind.  Realistically 
speaking, .pth files are currently the most effective mechanism we have, 
and there actually isn't much that can be done to improve upon them.

What's more needed are better mechanisms for creating and managing Python 
"environments" (to use a term coined by Ian Bicking and Jim Fulton over on 
the distutils-sig), which are individual contexts in which Python 
applications run.  Some current tools in development by Ian and Jim include:

      http://cheeseshop.python.org/pypi/workingenv.py/
      http://cheeseshop.python.org/pypi/zc.buildout/

I don't know that much about either, but I do know that for zc.buildout, 
Jim's goal was to be able to install Python scripts with pre-baked sys.path 
based on package dependencies from setuptools, and as far as I know, he 
achieved it.
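
The generated scripts boil down to something like the following -- the
exact code zc.buildout writes differs, and the paths here are invented
for illustration:

    #!/usr/bin/python
    import sys
    # "pre-baked" sys.path: the script knows exactly which eggs it needs
    sys.path[0:0] = [
        '/srv/myapp/eggs/SomePackage-1.0-py2.5.egg',
        '/srv/myapp/eggs/AnotherPackage-0.3-py2.5.egg',
    ]

    import somepackage.main
    somepackage.main.run()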

Anyway, system-wide and per-user environment information isn't nearly 
sufficient to address the issues that people have when developing and 
deploying multiple applications on a server, or even using multiple 
applications on a client installation (e.g. somebody using both the 
Enthought Python IDE and Chandler on the same machine).  These relatively 
simple use cases rapidly demonstrate the inadequacy of system-wide or 
per-user configuration of what packages are available.


From steve at holdenweb.com  Fri Sep 22 06:38:24 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 22 Sep 2006 00:38:24 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>	<20060918091314.GA26814@code0.codespeak.net>	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>	<05af01c6dd7e$a2209560$e303030a@trilan>	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>
Message-ID: <eevpag$d4a$1@sea.gmane.org>

Paul Moore wrote:
> On 9/21/06, Guido van Rossum <guido at python.org> wrote:
> 
>>I think one missing feature is a mechanism whereby you can say "THIS
>>package (gives top-level package name) lives HERE (gives filesystem
>>location of package)" without adding the parent of HERE to sys.path
>>for all module searches. I think Phillip Eby's egg system might
>>benefit from this.
> 
> 
> This is pretty easy to do with a custom importer on sys.meta_path.
> Getting the details right is a touch fiddly, but it's conceptually
> straightforward.
> 
> Hmm, I might play with this - a set of PEP 302 importers to completely
> customise the import mechanism. The never-completed "phase 2" of the
> PEP included a reimplementation of the built in import mechanism as
> hooks. Is there any interest in this actually happening? I've been
> looking for an interesting coding project for a while (although I
> never have any free time...)
> 
My interest in such a project would be in replacing a bunch of legacy C 
code with varying implementations of the import mechanism in pure Python 
strictly according to the dictates of PEP 302, using sys.path_hooks and 
sys.path (meta_path is for future consideration ;-).
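
As a point of reference, the sort of meta_path hook Paul mentions really
is only a few lines; everything below is illustrative, not code from any
existing project:

    import imp
    import os
    import sys

    class NamedLocationFinder(object):
        # Hypothetical hook: map a top-level package name to the package
        # directory without adding its parent to sys.path for other
        # imports.
        def __init__(self, mapping):
            self.mapping = mapping      # e.g. {'A': '/opt/thirdparty/A'}

        def find_module(self, fullname, path=None):
            if '.' not in fullname and fullname in self.mapping:
                return self
            return None

        def load_module(self, fullname):
            if fullname in sys.modules:
                return sys.modules[fullname]
            pkgdir = self.mapping[fullname]
            fileobj, pathname, description = imp.find_module(
                fullname, [os.path.dirname(pkgdir)])
            try:
                return imp.load_module(fullname, fileobj, pathname,
                                       description)
            finally:
                if fileobj:
                    fileobj.close()

    sys.meta_path.append(NamedLocationFinder({'A': '/opt/thirdparty/A'}))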

Some readers may remember a lightning talk I gave at OSCON about three 
years ago. In that I discussed a database structure that allowed 
different implementations of modules to be loaded according to 
compatibility requirements established as a result of testing.

Although I now have a working database import mechanism based on PEP 302 
it's by no means obvious how that can be used exclusively (in other 
words, replacing the current import mechanism: the present 
implementation relies on an import of MySQLdb, which has many 
dependencies that clearly must be importable before the DB mechanism is 
in place). And I certainly haven't followed up by establishing the 
compatibility data that such an implementation would require.

Has anyone done any work on (for example) zip-only distributions?

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From steve at holdenweb.com  Fri Sep 22 06:41:38 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 22 Sep 2006 00:41:38 -0400
Subject: [Python-Dev] list.discard? (Re: dict.discard)
In-Reply-To: <200609212028.00824.fdrake@acm.org>
References: <20060921134249.GA9238@niemeyer.net>	<45132C6D.9010806@canterbury.ac.nz>
	<200609212028.00824.fdrake@acm.org>
Message-ID: <eevpgh$d4a$2@sea.gmane.org>

Fred L. Drake, Jr. wrote:
> On Thursday 21 September 2006 20:21, Greg Ewing wrote:
>  >    if x not in somelist:
>  >      somelist.remove(x)
> 
> I'm just guessing you really meant "if x in somelist".  ;-)
> 
No you aren't, that's clearly an *informed* guess.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From jackdied at jackdied.com  Fri Sep 22 06:58:03 2006
From: jackdied at jackdied.com (Jack Diederich)
Date: Fri, 22 Sep 2006 00:58:03 -0400
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <45136205.6060603@v.loewis.de>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
	<45136205.6060603@v.loewis.de>
Message-ID: <20060922045803.GC6677@performancedrivers.com>

On Fri, Sep 22, 2006 at 06:09:41AM +0200, "Martin v. Löwis" wrote:
> Jack Diederich schrieb:
> > Faced with the choice of believing in a really strange platform specific 
> > bug in a commonly used routine that resulted in exactly the failure caused 
> > by one of the two files being updated or believing a failure occurred in the
> > long chain of networks, disks, file systems, build tools, and operating 
> > systems that would result in only one of the files being updated -
> > I went with the latter.
> 
> Please reconsider how subversion works. It has the notion of atomic
> commits, so you either get the entire change, or none at all.
> 
> Fortunately, the buildbot keeps logs of everything it does:
> 
> http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-svn/0
> 
> shows
> 
> U    Lib/test/test_itertools.py
> U    Modules/itertoolsmodule.c
> Updated to revision 51950.
> 
> So it said it updated both files. But perhaps it didn't build them?
> Let's check:
> 
> 
> http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/1449/step-compile/0
> 
> has this:
> 
> building 'itertools' extension
> 
> gcc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp
> -mno-fused-madd -g -Wall -Wstrict-prototypes -I.
> -I/Users/buildslave/bb/trunk.psf-g4/build/./Include
> -I/Users/buildslave/bb/trunk.psf-g4/build/./Mac/Include -I./Include -I.
> -I/usr/local/include -I/Users/buildslave/bb/trunk.psf-g4/build/Include
> -I/Users/buildslave/bb/trunk.psf-g4/build -c
> /Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.c -o
> build/temp.macosx-10.3-ppc-2.6/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.o
> 
> gcc -bundle -undefined dynamic_lookup
> build/temp.macosx-10.3-ppc-2.6/Users/buildslave/bb/trunk.psf-g4/build/Modules/itertoolsmodule.o
> -L/usr/local/lib -o build/lib.macosx-10.3-ppc-2.6/itertools.so
> 
> So itertools.so is regenerated, as it should; qed.
> 

I should leave the tongue-in-cheek bombast to Tim and Frederik, especially
when dealing with what might be an OS & machine specific bug.  The next
checkin and re-test will or won't highlight a failure and certainly someone 
with a g4 will try it out before 2.5.1 goes out so we'll know if it was a 
fluke soonish. The original error was mine, I typed "Size_t" instead of 
"Ssize_t" and while my one-char patch might also be wrong (I hope not, I'm 
red-faced enough as is) we should find out soon enough.

-Jack

From nnorwitz at gmail.com  Fri Sep 22 07:23:54 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 21 Sep 2006 22:23:54 -0700
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <20060922045803.GC6677@performancedrivers.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
	<45136205.6060603@v.loewis.de>
	<20060922045803.GC6677@performancedrivers.com>
Message-ID: <ee2a432c0609212223q7dcd46edl37a61ea2399ebc8c@mail.gmail.com>

On 9/21/06, Jack Diederich <jackdied at jackdied.com> wrote:
>
> I should leave the tounge-in-cheek bombast to Tim and Frederik, especially
> when dealing with what might be an OS & machine specific bug.  The next
> checkin and re-test will or won't highlight a failure and certainly someone
> with a g4 will try it out before 2.5.1 goes out so we'll know if it was a
> fluke soonish. The original error was mine, I typed "Size_t" instead of
> "Ssize_t" and while my one-char patch might also be wrong (I hope not, I'm
> red-faced enough as is) we should find out soon enough.

It looks like %zd of a negative number is treated as an unsigned
number on OS X, even though the man page says it should be signed.

"""
The z modifier, when applied to a d or i conversion, indicates that
the argument is of a signed type equivalent in size to a size_t.
"""

The program below returns -123 on Linux and 4294967173 on OS X.

n
--
#include <stdio.h>
int main()
{
    char buffer[256];
    if (sprintf(buffer, "%zd", (size_t)-123) < 0)
        return 1;
    printf("%s\n", buffer);
    return 0;
}

From tim.peters at gmail.com  Fri Sep 22 08:12:07 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 22 Sep 2006 02:12:07 -0400
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <ee2a432c0609212223q7dcd46edl37a61ea2399ebc8c@mail.gmail.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
	<45136205.6060603@v.loewis.de>
	<20060922045803.GC6677@performancedrivers.com>
	<ee2a432c0609212223q7dcd46edl37a61ea2399ebc8c@mail.gmail.com>
Message-ID: <1f7befae0609212312h28a961ffhbead29c6bab3c0f6@mail.gmail.com>

[Neal Norwitz]
> It looks like %zd of a negative number is treated as an unsigned
> number on OS X, even though the man page says it should be signed.
>
> """
> The z modifier, when applied to a d or i conversion, indicates that
> the argument is of a signed type equivalent in size to a size_t.
> """

It's not just some man page ;-), this is required by the C99 standard
(which introduced the `z` length modifier -- and it's the `d` or `i`
here that imply `signed`, `z` is only supposed to specify the width of
the integer type, and can also be applied to codes for unsigned
integer types, like %zu and %zx).

> The program below returns -123 on Linux and 4294967173 on OS X.
>
> n
> --
> #include <stdio.h>
> int main()
> {
>     char buffer[256];
>       if(sprintf(buffer, "%zd", (size_t)-123) < 0)
>         return 1;
>      printf("%s\n", buffer);
>      return 0;
> }

Well, to be strictly anal, while the result of

    (size_t)-123

is defined, the result of casting /that/ back to a signed type of the
same width is not defined.  Maybe your compiler was "doing you a
favor" ;-)

From jackdied at jackdied.com  Fri Sep 22 07:43:16 2006
From: jackdied at jackdied.com (Jack Diederich)
Date: Fri, 22 Sep 2006 01:43:16 -0400
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <ee2a432c0609212223q7dcd46edl37a61ea2399ebc8c@mail.gmail.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
	<45136205.6060603@v.loewis.de>
	<20060922045803.GC6677@performancedrivers.com>
	<ee2a432c0609212223q7dcd46edl37a61ea2399ebc8c@mail.gmail.com>
Message-ID: <20060922054316.GD6677@performancedrivers.com>

On Thu, Sep 21, 2006 at 10:23:54PM -0700, Neal Norwitz wrote:
> On 9/21/06, Jack Diederich <jackdied at jackdied.com> wrote:
> >
> > I should leave the tounge-in-cheek bombast to Tim and Frederik, especially
> > when dealing with what might be an OS & machine specific bug.  The next
> > checkin and re-test will or won't highlight a failure and certainly someone
> > with a g4 will try it out before 2.5.1 goes out so we'll know if it was a
> > fluke soonish. The original error was mine, I typed "Size_t" instead of
> > "Ssize_t" and while my one-char patch might also be wrong (I hope not, I'm
> > red-faced enough as is) we should find out soon enough.
> 
> It looks like %zd of a negative number is treated as an unsigned
> number on OS X, even though the man page says it should be signed.
> 
> """
> The z modifier, when applied to a d or i conversion, indicates that
> the argument is of a signed type equivalent in size to a size_t.
> """
> 
> The program below returns -123 on Linux and 4294967173 on OS X.
> 
> n
> --
> #include <stdio.h>
> int main()
> {
>     char buffer[256];
>       if(sprintf(buffer, "%zd", (size_t)-123) < 0)
>         return 1;
>      printf("%s\n", buffer);
>      return 0;
> }

Consider me blushing even harder for denying the power of the buildbot
(and against all evidence).  Yikes, didn't any other tests trigger this?

sprat:~/src/python-head# find ./ -name '*.c' | xargs grep '%zd' | wc -l
65

-Jack

From nnorwitz at gmail.com  Fri Sep 22 08:37:37 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 21 Sep 2006 23:37:37 -0700
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <1f7befae0609212312h28a961ffhbead29c6bab3c0f6@mail.gmail.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
	<45136205.6060603@v.loewis.de>
	<20060922045803.GC6677@performancedrivers.com>
	<ee2a432c0609212223q7dcd46edl37a61ea2399ebc8c@mail.gmail.com>
	<1f7befae0609212312h28a961ffhbead29c6bab3c0f6@mail.gmail.com>
Message-ID: <ee2a432c0609212337x40eab644tbe1aa0de2227d06a@mail.gmail.com>

On 9/21/06, Tim Peters <tim.peters at gmail.com> wrote:
>
> Well, to be strictly anal, while the result of
>
>     (size_t)-123
>
> is defined, the result of casting /that/ back to a signed type of the
> same width is not defined.  Maybe your compiler was "doing you a
> favor" ;-)

I also tried with a cast to an ssize_t and replacing %zd with an %zi.
None of them make a difference; all return an unsigned value.  This is
with powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple
Computer, Inc. build 4061).  Although I would expect the issue is in
the std C library rather than the compiler.

Forcing PY_FORMAT_SIZE_T to be "l" instead of "z" fixes this problem.

BTW, this is the same issue on Mac OS X:

>>> struct.pack('=b', -599999)
__main__:1: DeprecationWarning: 'b' format requires 4294967168 <= number <= 127
'A'

n
--

From jcarlson at uci.edu  Fri Sep 22 09:08:03 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 22 Sep 2006 00:08:03 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com>
References: <20060921183846.0845.JCARLSON@uci.edu>
	<5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com>
Message-ID: <20060921233257.0848.JCARLSON@uci.edu>


"Phillip J. Eby" <pje at telecommunity.com> wrote:
> 
> At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote:
> >This can be implemented with a fairly simple package registry, contained
> >within a (small) SQLite database (which is conveniently shipped in
> >Python 2.5).  There can be a system-wide database that all users use as
> >a base, with a user-defined package registry (per user) where the
> >system-wide packages can be augmented.
> 
> As far as I can tell, you're ignoring that per-user must *also* be 
> per-version, and per-application.  Each application or runtime environment 
> needs its own private set of information like this.

Having a different database per Python version is not significantly
different than having a different Python binary for each Python version. 
About the only (annoying) nit is that the systemwide database needs to
be easily accessable to the Python runtime, and is possibly volatile. 
Maybe a symlink in the same path as the actual Python binary on *nix,
and the file located next to the binary on Windows.

I didn't mention the following because I thought it would be superfluous,
but it seems that I should have stated it right out.  My thoughts were
that on startup, Python would first query the 'system' database, caching
its results in a dictionary, then query the user's listing, updating the
dictionary as necessary, then unload the databases.  On demand, when
code runs packages.register(), if both persist and systemwide are False,
it just updates the dictionary. If either are true, it opens up and
updates the relevant database.

With such a semantic, every time Python gets run, every instance gets
its own private set of paths, derived from the system database, user
database, and runtime-defined packages.
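
In other words, the effective mapping is built up by simple layering,
something like (illustrative only):

    def load_package_map(system_db, user_db, runtime_registrations):
        # each argument is anything mapping package name -> path
        paths = {}
        paths.update(system_db)              # machine-wide defaults
        paths.update(user_db)                # per-user overrides
        paths.update(runtime_registrations)  # register(..., persist=False)
        return paths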


> Next, putting the installation data inside a database instead of 
> per-installation-unit files presents problems of its own.  While some 
> system packaging tools allow install/uninstall scripts to run, they are 
> often frowned upon, and can be unintentionally bypassed.

This is easily remedied with a proper 'packages' implementation:

    python -Mpackages name path

Note that Python could auto-insert standard library and site-packages
'packages' on startup (creating the initial dictionary, then the
systemwide, then the user, ...).


> These are just a few of the issues that come to mind.  Realistically 
> speaking, .pth files are currently the most effective mechanism we have, 
> and there actually isn't much that can be done to improve upon them.

Except that .pth files are only usable in certain (likely) system paths,
that the user may not have write access to.  There have previously been
proposals to add support for .pth files in the path of the run .py file,
but they don't seem to have gotten any support.


> What's more needed are better mechanisms for creating and managing Python 
> "environments" (to use a term coined by Ian Bicking and Jim Fulton over on 
> the distutils-sig), which are individual contexts in which Python 
> applications run.  Some current tools in development by Ian and Jim include:
> 
> Anyway, system-wide and per-user environment information isn't nearly 
> sufficient to address the issues that people have when developing and 
> deploying multiple applications on a server, or even using multiple 
> applications on a client installation (e.g. somebody using both the 
> Enthought Python IDE and Chandler on the same machine).  These relatively 
> simple use cases rapidly demonstrate the inadequacy of system-wide or 
> per-user configuration of what packages are available.

It wouldn't be terribly difficult to add environment switching and
environment derivation (copying or linked, though copying would be
simpler).

    packages.derive_environment(parent_environment)
    packages.register(name, path, env=environment)
    packages.use(environment)

It also wouldn't be terribly difficult to set up environments that
required certain packages...

    packages.new_environment(environment, *required_packages, test=True)

To verify that the Python installation has the required packages, then
later...

    packages.new_environment(environment, *required_packages, persist=True)


I believe that most of the concerns that you have brought up can be
addressed, and I think that it could be far nicer to deal with than the
current sys.path hackery. The system database location is a bit annoying,
but I lack the *nix experience to say where such a database could or
should be located.

 - Josiah


From fredrik at pythonware.com  Fri Sep 22 10:35:50 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 22 Sep 2006 10:35:50 +0200
Subject: [Python-Dev] list.discard? (Re: dict.discard)
References: <20060921134249.GA9238@niemeyer.net>
	<45132C6D.9010806@canterbury.ac.nz>
Message-ID: <ef0796$i9a$1@sea.gmane.org>

Greg Ewing wrote:

> Actually I'd like this for lists. Often I find myself
> writing
>
>   if x not in somelist:
>     somelist.remove(x)
>
> A single method for doing this would be handy, and
> more efficient.

there is a single method that does this, of course, but you have to sprinkle
some sugar on it:

    try:
        somelist.remove(x)
    except ValueError: pass
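
or, wrapped up as a helper if you find yourself doing it a lot:

    def discard(somelist, x):
        # list analogue of set.discard: remove x if present, else do nothing
        try:
            somelist.remove(x)
        except ValueError:
            pass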

</F> 




From ronaldoussoren at mac.com  Fri Sep 22 11:40:41 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 22 Sep 2006 11:40:41 +0200
Subject: [Python-Dev] test_itertools fails for trunk on x86 OS X machine
In-Reply-To: <ee2a432c0609212337x40eab644tbe1aa0de2227d06a@mail.gmail.com>
References: <3f09d5a00609211434m91a0b91y26ceb558f0664c9@mail.gmail.com>
	<20060921215019.GA6677@performancedrivers.com>
	<3f09d5a00609211528x5b0b60c9q1222aaf5961e0d82@mail.gmail.com>
	<20060922020858.GB6677@performancedrivers.com>
	<45136205.6060603@v.loewis.de>
	<20060922045803.GC6677@performancedrivers.com>
	<ee2a432c0609212223q7dcd46edl37a61ea2399ebc8c@mail.gmail.com>
	<1f7befae0609212312h28a961ffhbead29c6bab3c0f6@mail.gmail.com>
	<ee2a432c0609212337x40eab644tbe1aa0de2227d06a@mail.gmail.com>
Message-ID: <1020669.1158918041341.JavaMail.ronaldoussoren@mac.com>

 
On Friday, September 22, 2006, at 08:38AM, Neal Norwitz <nnorwitz at gmail.com> wrote:

>On 9/21/06, Tim Peters <tim.peters at gmail.com> wrote:
>>
>> Well, to be strictly anal, while the result of
>>
>>     (size_t)-123
>>
>> is defined, the result of casting /that/ back to a signed type of the
>> same width is not defined.  Maybe your compiler was "doing you a
>> favor" ;-)
>
>I also tried with a cast to an ssize_t and replacing %zd with an %zi.
>None of them make a difference; all return an unsigned value.  This is
>with powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple
>Computer, Inc. build 4061).  Although i would expect the issue is in
>the std C library rather than the compiler.
>
>Forcing PY_FORMAT_SIZE_T to be "l" instead of "z" fixes this problem.
>
>BTW, this is the same issue on Mac OS X:
>
>>>> struct.pack('=b', -599999)
>__main__:1: DeprecationWarning: 'b' format requires 4294967168 <= number <= 127

Has anyone filed a bug at bugreport.apple.com about this (that is '%zd' not behaving as the documentation says it should behave)? I'll file a bug (as well), but the more people tell Apple about this the more likely it is that someone will fix this.

Ronald

>'A'
>
>n
>--
>_______________________________________________
>Python-Dev mailing list
>Python-Dev at python.org
>http://mail.python.org/mailman/listinfo/python-dev
>Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com
>
>

From ncoghlan at gmail.com  Fri Sep 22 13:09:44 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 22 Sep 2006 21:09:44 +1000
Subject: [Python-Dev] New relative import issue
In-Reply-To: <bbaeab100609211455l722d63a5w9c747f58c3b3db16@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>	<20060918091314.GA26814@code0.codespeak.net>	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>	<05af01c6dd7e$a2209560$e303030a@trilan>	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>	<79990c6b0609211322v54e0977ao2009e98b61d2915d@mail.gmail.com>	<ca471dc20609211354g61a11cfbseb6c070d2e42aa42@mail.gmail.com>
	<bbaeab100609211455l722d63a5w9c747f58c3b3db16@mail.gmail.com>
Message-ID: <4513C478.6040307@gmail.com>

Brett Cannon wrote:
> But either way I will be messing with the import system in the 
> relatively near future.  If you want to help, Paul (or anyone else), 
> just send me an email and we can try to coordinate something (plan to do 
> the work in the sandbox as a separate thing from my security stuff).

Starting with pkgutil.get_loader and removing the current dependency on 
imp.find_module and imp.load_module would probably be a decent way to start.
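
Something along these lines, for example (just a sketch of the entry
point, not a design):

    import pkgutil

    # pkgutil.get_loader() already hands back a PEP 302-style loader
    # (currently a wrapper around imp.find_module); a reworked import
    # system could return native loaders with the same interface.
    loader = pkgutil.get_loader('email')
    module = loader.load_module('email')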

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From fdrake at acm.org  Fri Sep 22 14:44:55 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Sep 2006 08:44:55 -0400
Subject: [Python-Dev] [Python-checkins]  release25-maint is UNFROZEN
In-Reply-To: <20060921123510.GA22457@code0.codespeak.net>
References: <200609212112.04923.anthony@interlink.com.au>
	<20060921123510.GA22457@code0.codespeak.net>
Message-ID: <200609220844.55724.fdrake@acm.org>

On Thursday 21 September 2006 08:35, Armin Rigo wrote:
 > Thanks for the hassle!  I've got another bit of it for you, though.  The
 > frozen 2.5 documentation doesn't seem to be available on-line.  At
 > least, the doc links from the release page point to the 'dev' 2.6a0
 > version, and the URL following the common scheme -
 > http://www.python.org/doc/2.5/ - doesn't work.

This should mostly be working now.  The page at www.python.org/doc/2.5/ 
isn't "really" right, but will do the trick.  Hopefully I'll be able to work 
out how these pages should be updated properly at the Arlington sprint this 
weekend, at which point I can update PEP 101 appropriately and make sure this 
gets done when releases are made.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From rokkamraja at gmail.com  Fri Sep 22 15:19:44 2006
From: rokkamraja at gmail.com (Raja Rokkam)
Date: Fri, 22 Sep 2006 18:49:44 +0530
Subject: [Python-Dev] Python network Programming
Message-ID: <357297a00609220619x3e968d30p7fcafbb7683e5a69@mail.gmail.com>

Hi,
   I am currently doing my final year project, "Secure Mobile Robot
Management". I have done the theoretical aspects of it so far and am now
thinking about coding it.

I would like to code in Python, but I am new to Python network programming.
Some features of my project are:

1. Each robot can send data to any other robot.
2. Each robot can receive data from any other robot.
3. Every robot has at least one other bot in its communication range.
4. The maximum size of a data packet is limited to 35 bytes.
5. Each mobile robot maintains a table with routes.
6. All the routes stored in the routing table include a field named
   life-time.
7. A route discovery process is initiated if there is no known route to
   another bot.
8. There is no server.
9. Every bot should be able to process the data from other bots, and both
   multicast and unicast need to be supported.

Assume the environment is a gridded mesh with the bots exploring the area.
They need to perform a set of tasks (assume finding some locations which
are dangerous, or something like that).

My main concern is how to go about modifying the headers so that everything
fits in 35 bytes. I would like to know how to proceed, and whether there are
any links or resources on this. How do I modify the headers so that it all
fits in 35 bytes?

Thank You,
Raja.

From fredrik at pythonware.com  Fri Sep 22 15:32:50 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 22 Sep 2006 15:32:50 +0200
Subject: [Python-Dev] Python network Programming
References: <357297a00609220619x3e968d30p7fcafbb7683e5a69@mail.gmail.com>
Message-ID: <ef0om2$ebp$1@sea.gmane.org>

Raja Rokkam wrote:

> I would like to code in Python, but I am new to Python network programming

wrong list: python-dev is for people who develop the python core, not people
who want to develop *in* python.

see

    http://www.python.org/community/lists/

for a list of more appropriate forums.

cheers /F 




From pje at telecommunity.com  Fri Sep 22 18:25:01 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 22 Sep 2006 12:25:01 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <20060921233257.0848.JCARLSON@uci.edu>
References: <5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com>
	<20060921183846.0845.JCARLSON@uci.edu>
	<5.1.1.6.0.20060921235853.03e01748@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com>

At 12:08 AM 9/22/2006 -0700, Josiah Carlson wrote:
>"Phillip J. Eby" <pje at telecommunity.com> wrote:
> >
> > At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote:
> > >This can be implemented with a fairly simple package registry, contained
> > >within a (small) SQLite database (which is conveniently shipped in
> > >Python 2.5).  There can be a system-wide database that all users use as
> > >a base, with a user-defined package registry (per user) where the
> > >system-wide packages can be augmented.
> >
> > As far as I can tell, you're ignoring that per-user must *also* be
> > per-version, and per-application.  Each application or runtime environment
> > needs its own private set of information like this.
>
>Having a different database per Python version is not significantly
>different than having a different Python binary for each Python version.

You misunderstood me: I mean that the per-user database must be able to 
store information for *different Python versions*.  Having a single 
per-user database without the ability to include configuration for more 
than one Python version (analogous to the current situation with the 
distutils per-user config file) is problematic.

In truth, a per-user configuration is just a special case of the real need: 
to have per-application environments.  In effect, a per-user environment is 
a fallback for not having an application environment, and the system 
environment is a fallback for not having a user environment.


>About the only (annoying) nit is that the systemwide database needs to
>be easily accessible to the Python runtime, and is possibly volatile.
>Maybe a symlink in the same path as the actual Python binary on *nix,
>and the file located next to the binary on Windows.
>
>I didn't mention the following because I thought it would be superfluous,
>but it seems that I should have stated it right out.  My thoughts were
>that on startup, Python would first query the 'system' database, caching
>its results in a dictionary, then query the user's listing, updating the
>dictionary as necessary, then unload the databases.  On demand, when
>code runs packages.register(), if both persist and systemwide are False,
>it just updates the dictionary. If either are true, it opens up and
>updates the relevant database.

Using a database as the primary mechanism for managing import locations 
simply isn't workable.  You might as well suggest that each environment 
consist of a single large zipfile containing the packages in question: this 
would actually be *more* practical (and fast!) in terms of Python startup, 
and is no different from having a database with respect to the need for 
installation and uninstallation to modify a central file!

I'm not proposing we do that -- I'm just pointing out why using an actual 
database isn't really workable, considering that it has all of the 
disadvantages of a big zipfile, and none of the advantages (like speed, 
having code already written that supports it, etc.)


>This is easily remedied with a proper 'packages' implementation:
>
>     python -Mpackages name path
>
>Note that Python could auto-insert standard library and site-packages
>'packages' on startup (creating the initial dictionary, then the
>systemwide, then the user, ...).

I presume here you're suggesting a way to select a runtime environment from 
the command line, which would certainly be a good idea.


> > These are just a few of the issues that come to mind.  Realistically
> > speaking, .pth files are currently the most effective mechanism we have,
> > and there actually isn't much that can be done to improve upon them.
>
>Except that .pth files are only usable in certain (likely) system paths,
>that the user may not have write access to.  There have previously been
>proposals to add support for .pth files in the path of the run .py file,
>but they don't seem to have gotten any support.

Setuptools works around this by installing an enhancement for the 'site' 
module that extends .pth support to include all PYTHONPATH 
directories.  The enhancement delegates to the original site module after 
recording data about sys.path that the site module destroys at startup.
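
(For context, the stdlib hook that processes .pth files for a given directory
is site.addsitedir; a minimal sketch, with a made-up path:)

    import site

    # Reads every .pth file in the directory, appends the directories they
    # list to sys.path, and executes any lines that start with "import".
    site.addsitedir('/home/user/extra-packages')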



>I believe that most of the concerns that you have brought up can be
>addressed,

Well, as I said, I've already dealt with them, using .pth files, for the 
use cases I care about.  Ian Bicking and Jim Fulton have also gone farther 
with work on tools to create environments with greater isolation or more 
fixed version linkages than what setuptools does.  (Setuptools-generated 
environments dynamically select requirements based on available versions at 
runtime, while Ian and Jim's tools create environments whose inter-package 
linkages are frozen at installation time.)


>and I think that it could be far nicer to deal with than the
>current sys.path hackery.

I'm not sure of that, since I don't yet know how your approach would deal 
with namespace packages, which are distributed in pieces and assembled 
later.  For example, many PEAK and Zope distributions live in the peak.* 
and zope.* package namespaces, but are installed separately, and glued 
together via __path__ changes (see the pkgutil docs).

Thus, if you are talking about a packagename->importer mapping, it has to 
take into consideration the possibility of multiple import locations for 
the same package.


>  The system database location is a bit annoying,
>but I lack the *nix experience to say where such a database could or
>should be located.

This issue is a triviality compared to the more fundamental flaws (or at 
any rate, holes) in what you're currently proposing.  I wouldn't worry 
about it at all right now.

That having been said, I find the discussion stimulating, because I do plan 
to revisit the environments issue in setuptools 0.7, so who knows what 
ideas may come up?


From fuzzyman at voidspace.org.uk  Fri Sep 22 19:43:42 2006
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 22 Sep 2006 18:43:42 +0100
Subject: [Python-Dev] Suggestion for a new built-in - flatten
Message-ID: <451420CE.8070003@voidspace.org.uk>

Hello all,

I have a suggestion for a new Python built in function: 'flatten'.

This would (as if it needs explanation) take a single sequence, where 
each element can be a sequence (or iterable ?) nested to an arbitrary 
depth. It would return a flattened list. A useful restriction could be 
that it wouldn't expand strings :-)

I've needed this several times, and recently twice at work. There are 
several implementations in the Python cookbook. When I posted on my blog 
recently asking for one-liners to flatten a list of lists (only 1 level 
of nesting), I had 26 responses, several of them saying it was a problem 
they had encountered before.

There are also numerous places on the web bewailing the lack of this as 
a built-in. All of this points to the fact that it is something that 
would be appreciated as a built-in.

There is an implementation already in Tkinter:

    from _tkinter import _flatten as flatten

There are several different possible approaches in pure Python, but is 
this an idea that has legs ?

All the best,


Michael Foord
http://www.voidspace.org.uk/python/index.shtml




From theller at python.net  Fri Sep 22 20:10:23 2006
From: theller at python.net (Thomas Heller)
Date: Fri, 22 Sep 2006 20:10:23 +0200
Subject: [Python-Dev] Relative import bug?
Message-ID: <ef18ug$b4a$1@sea.gmane.org>

Consider a package containing these files:

a/__init__.py
a/b/__init__.py
a/b/x.py
a/b/y.py

If x.py contains this:

"""
from ..b import y
import a.b.x
from ..b import x
"""

Python trunk and Python 2.5 both complain:

Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import a.b.x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "a\b\x.py", line 2, in <module>
    from ..b import x
ImportError: cannot import name x
>>>

A bug?

Thomas


From pje at telecommunity.com  Fri Sep 22 20:44:55 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 22 Sep 2006 14:44:55 -0400
Subject: [Python-Dev] Relative import bug?
In-Reply-To: <ef18ug$b4a$1@sea.gmane.org>
Message-ID: <5.1.1.6.0.20060922143559.02f03498@sparrow.telecommunity.com>

At 08:10 PM 9/22/2006 +0200, Thomas Heller wrote:
>Consider a package containing these files:
>
>a/__init__.py
>a/b/__init__.py
>a/b/x.py
>a/b/y.py
>
>If x.py contains this:
>
>"""
>from ..b import y
>import a.b.x
>from ..b import x
>"""
>
>Python trunk and Python 2.5 both complain:
>
>Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] 
>on win32
>Type "help", "copyright", "credits" or "license" for more information.
> >>> import a.b.x
>Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "a\b\x.py", line 2, in <module>
>     from ..b import x
>ImportError: cannot import name x
> >>>
>
>A bug?

If it is, it has nothing to do with relative importing per se.  Note that 
changing it to "from a.b import x" produces the exact same error.

This looks like a "standard" circular import bug.  What's happening is that 
the first import doesn't set "a.b.x = x" until after a.b.x is fully 
imported.  But subsequent "import a.b.x" statements don't set it either, 
because they are satisfied by finding 'a.b.x' in sys.modules.  So, when the 
'from ... import x' runs, it tries to get the 'x' attribute of 'a.b' 
(whether it gets a.b relatively or absolutely), and fails.

If you make the last import be "import a.b.x as x", you'll get a better 
error message:

Traceback (most recent call last):
   File "<string>", line 1, in <module>
   File "a/b/x.py", line 3, in <module>
     import a.b.x as x
AttributeError: 'module' object has no attribute 'x'

But the entire issue is a bug that exists in Python 2.4, and possibly prior 
versions as well.
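
(One possible workaround, sketched here for a/b/x.py and not the only option,
is to defer the attribute lookup until after the package import has finished:)

    # a/b/x.py -- sidestep the circular lookup by not touching a.b.x at import time
    from ..b import y
    import a.b                # the parent package is already in sys.modules

    def get_x():
        # a.b.x is resolved lazily, after the import machinery has set the
        # 'x' attribute on the parent package
        return a.b.x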


From dave at boost-consulting.com  Fri Sep 22 20:45:17 2006
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 22 Sep 2006 14:45:17 -0400
Subject: [Python-Dev] Pep 353: Py_ssize_t advice
Message-ID: <871wq3eo5e.fsf@pereiro.luannocracy.com>


Pep 353 advises the use of this incantation:

  #if PY_VERSION_HEX < 0x02050000
  typedef int Py_ssize_t;
  #define PY_SSIZE_T_MAX INT_MAX
  #define PY_SSIZE_T_MIN INT_MIN
  #endif

I just wanted to point out that this advice could lead to library
header collisions when multiple 3rd parties decide to follow it.  I
suggest it be changed to something like:

  #if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)
  typedef int Py_ssize_t;
  #define PY_SSIZE_T_MAX INT_MAX
  #define PY_SSIZE_T_MIN INT_MIN
  #endif

(C++ allows restating of typedefs; if C allows it, that should be
something like):

  #if PY_VERSION_HEX < 0x02050000
  typedef int Py_ssize_t;
  # if !defined(PY_SSIZE_T_MIN)
  #  define PY_SSIZE_T_MAX INT_MAX
  #  define PY_SSIZE_T_MIN INT_MIN
  # endif
  #endif

You may say that library developers should know better, but I just had
an argument with a very bright guy who didn't get it at first.

Thanks, and HTH.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From skip at pobox.com  Fri Sep 22 20:46:15 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 22 Sep 2006 13:46:15 -0500
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <451420CE.8070003@voidspace.org.uk>
References: <451420CE.8070003@voidspace.org.uk>
Message-ID: <17684.12151.408448.905468@montanaro.dyndns.org>


    Michael> There are several different possible approaches in pure Python,
    Michael> but is this an idea that has legs ?

Why not add it to itertools?  Then, if you need a true list, just call
list() on the returned iterator.

Skip

From jcarlson at uci.edu  Fri Sep 22 20:57:10 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 22 Sep 2006 11:57:10 -0700
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <451420CE.8070003@voidspace.org.uk>
References: <451420CE.8070003@voidspace.org.uk>
Message-ID: <20060922114820.0851.JCARLSON@uci.edu>


Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> 
> Hello all,
> 
> I have a suggestion for a new Python built in function: 'flatten'.

This has been brought up many times.  I'm -1 on its inclusion, if only
because it's a fairly simple 9-line function (at least the trivial
version I came up with), and not all X-line functions should be in the
standard library.  Also, while I have had need for such a function in
the past, I have found that I haven't needed it in a few years.
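
(For concreteness, a sketch of the sort of short function being referred to,
with strings treated as atoms rather than expanded character by character:)

    def flatten(iterable):
        # Yield the leaves of an arbitrarily nested iterable, left to right.
        for item in iterable:
            if hasattr(item, '__iter__') and not isinstance(item, basestring):
                for element in flatten(item):
                    yield element
            else:
                yield item

    # list(flatten([1, [2, [3, 'abc']], (4, 5)])) -> [1, 2, 3, 'abc', 4, 5]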


 - Josiah


From brett at python.org  Fri Sep 22 21:01:28 2006
From: brett at python.org (Brett Cannon)
Date: Fri, 22 Sep 2006 12:01:28 -0700
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <17684.12151.408448.905468@montanaro.dyndns.org>
References: <451420CE.8070003@voidspace.org.uk>
	<17684.12151.408448.905468@montanaro.dyndns.org>
Message-ID: <bbaeab100609221201o123791cah59bd5a18bdf09f08@mail.gmail.com>

On 9/22/06, skip at pobox.com <skip at pobox.com> wrote:
>
>
>     Michael> There are several different possible approaches in pure
> Python,
>     Michael> but is this an idea that has legs ?
>
> Why not add it to itertools?  Then, if you need a true list, just call
> list() on the returned iterator.


Yeah, this is a better solution.  flatten() just doesn't scream "built-in!"
to me.

-Brett

From bob at redivi.com  Fri Sep 22 21:05:19 2006
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 22 Sep 2006 12:05:19 -0700
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <20060922114820.0851.JCARLSON@uci.edu>
References: <451420CE.8070003@voidspace.org.uk>
	<20060922114820.0851.JCARLSON@uci.edu>
Message-ID: <6a36e7290609221205y392c6defy19ea2a004b82725a@mail.gmail.com>

On 9/22/06, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> >
> > Hello all,
> >
> > I have a suggestion for a new Python built in function: 'flatten'.
>
> This has been brought up many times.  I'm -1 on its inclusion, if only
> because it's a fairly simple 9-line function (at least the trivial
> version I came up with), and not all X-line functions should be in the
> standard library.  Also, while I have had need for such a function in
> the past, I have found that I haven't needed it in a few years.

I think instead of adding a flatten function perhaps we should think
about adding something like Erlang's "iolist" support. The idea is
that methods like "writelines" should be able to take nested iterators
and consume any object they find that implements the buffer protocol.

-bob

From ferringb at gmail.com  Fri Sep 22 21:26:37 2006
From: ferringb at gmail.com (Brian Harring)
Date: Fri, 22 Sep 2006 12:26:37 -0700
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <6a36e7290609221205y392c6defy19ea2a004b82725a@mail.gmail.com>
References: <451420CE.8070003@voidspace.org.uk>
	<20060922114820.0851.JCARLSON@uci.edu>
	<6a36e7290609221205y392c6defy19ea2a004b82725a@mail.gmail.com>
Message-ID: <20060922192637.GA10582@seldon>

On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote:
> On 9/22/06, Josiah Carlson <jcarlson at uci.edu> wrote:
> >
> > Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> > >
> > > Hello all,
> > >
> > > I have a suggestion for a new Python built in function: 'flatten'.
> >
> > This has been brought up many times.  I'm -1 on its inclusion, if only
> > because it's a fairly simple 9-line function (at least the trivial
> > version I came up with), and not all X-line functions should be in the
> > standard library.  Also, while I have had need for such a function in
> > the past, I have found that I haven't needed it in a few years.
> 
> I think instead of adding a flatten function perhaps we should think
> about adding something like Erlang's "iolist" support. The idea is
> that methods like "writelines" should be able to take nested iterators
> and consume any object they find that implements the buffer protocol.

Which is no different than just passing in a generator/iterator that 
does the flattening.

I don't much see the point in gumming up the file protocol with this 
special casing; there will still be requests for a flattener elsewhere.

If flattening were added, it should definitely be a general object, not 
special casing in one method, in my opinion.
~harring

From theller at python.net  Fri Sep 22 21:28:24 2006
From: theller at python.net (Thomas Heller)
Date: Fri, 22 Sep 2006 21:28:24 +0200
Subject: [Python-Dev] Relative import bug?
In-Reply-To: <5.1.1.6.0.20060922143559.02f03498@sparrow.telecommunity.com>
References: <ef18ug$b4a$1@sea.gmane.org>
	<5.1.1.6.0.20060922143559.02f03498@sparrow.telecommunity.com>
Message-ID: <ef1dgn$rgb$1@sea.gmane.org>

Phillip J. Eby schrieb:
> At 08:10 PM 9/22/2006 +0200, Thomas Heller wrote:
>>If x.py contains this:
>>
>>"""
>>from ..b import y
>>import a.b.x
>>from ..b import x
>>"""
...
>>ImportError: cannot import name x
>> >>>
>>
>>A bug?
> 
> If it is, it has nothing to do with relative importing per se.  Note that 
> changing it to "from a.b import x" produces the exact same error.
> 
> This looks like a "standard" circular import bug.

Of course.  Thanks.

Thomas


From glyph at divmod.com  Fri Sep 22 21:29:28 2006
From: glyph at divmod.com (glyph at divmod.com)
Date: Fri, 22 Sep 2006 15:29:28 -0400
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <451420CE.8070003@voidspace.org.uk>
Message-ID: <20060922192928.1717.1975026622.divmod.quotient.57018@ohm>

On Fri, 22 Sep 2006 18:43:42 +0100, Michael Foord <fuzzyman at voidspace.org.uk> wrote:

>I have a suggestion for a new Python built in function: 'flatten'.

This seems superficially like a good idea, but I think adding it to Python anywhere would do a lot more harm than good.  I can see that consensus is already strongly against a builtin, but I think it would be bad to add to itertools too.

Flattening always *seems* to be a trivial and obvious operation.  "I just need something that takes a group of deeply structured data and turns it into a group of shallowly structured data.".  Everyone that has this requirement assumes that their list of implicit requirements for "flattening" is the obviously correct one.

This wouldn't be a problem, except that everyone has a different idea of those requirements. :)

Here are a few issues.

What do you do when you encounter a dict?  You can treat it as its keys(), its values(), or its items().

What do you do when you encounter an iterable object?

What order do you flatten set()s in?  (and, ha ha, do you Set the same?)

How are user-defined flattening behaviors registered?  Is it a new special method, a registration API?

How do you pass information about the flattening in progress to the user-defined behaviors?

If you do something special to iterables, do you special-case strings?  Why or why not?

What do you do if you encounter a function?  This is kind of a trick question, since Nevow's "flattener" *calls* functions as it encounters them, then treats the *result* of calling them as further input.

If you don't think that functions are special, what about *generator* functions?  How do you tell the difference?  What about functions that return generators but aren't themselves generators?  What about functions that return non-generator iterators?  What about pre-generated generator objects (if you don't want to treat iterables as special, are generators special?).

Do you produce the output as a structured list or an iterator that works incrementally?

Also, at least Nevow uses "flatten" to mean "serialize to bytes", not "produce a flat list", and I imagine at least a few other web frameworks do as well.  That starts to get into encoding issues.

If you make a decision one way or another on any of these questions of policy, you are going to make flatten() useless to a significant portion of its potential userbase.  The only difference between having it in the standard library and not is that if it's there, they'll spend an hour being confused by the weird way that it's dealing with <insert your favorite data type here> rather than just doing the "obvious" thing, and they'll take a minute to write the 10-line function that they need.  Without the standard library, they'll skip to step 2 and save a lot of time.
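
(A tiny illustration of the policy problem, using nothing more exotic than a
dict -- both results are the "obvious" one to somebody:)

    data = [1, {'a': 2}]

    # policy A: a dict flattens to its keys (what plain iteration gives you)
    flat_keys = [1] + list({'a': 2})                              # -> [1, 'a']

    # policy B: a dict flattens to its items
    flat_items = [1] + [x for kv in {'a': 2}.items() for x in kv]  # -> [1, 'a', 2]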

I would love to see a unified API that figured out all of these problems, and put them together into a (non-stdlib) library that anyone interested could use for a few years to work the kinks out.  Although it might be nice to have a simple "flatten" interface, I don't think that it would ever be simple enough to stick into a builtin; it would just be the default instance of the IncrementalDestructuringProcess class with the most popular (as determined by polling users of the library after a year or so) IncrementalDestructuringTypePolicy.

From jcarlson at uci.edu  Fri Sep 22 21:42:17 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 22 Sep 2006 12:42:17 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com>
References: <20060921233257.0848.JCARLSON@uci.edu>
	<5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com>
Message-ID: <20060922112345.084E.JCARLSON@uci.edu>


"Phillip J. Eby" <pje at telecommunity.com> wrote:
> At 12:08 AM 9/22/2006 -0700, Josiah Carlson wrote:
> >"Phillip J. Eby" <pje at telecommunity.com> wrote:
> > > At 08:44 PM 9/21/2006 -0700, Josiah Carlson wrote:
[snip]
> You misunderstood me: I mean that the per-user database must be able to 
> store information for *different Python versions*.  Having a single 
> per-user database without the ability to include configuration for more 
> than one Python version (analagous to the current situation with the 
> distutils per-user config file) is problematic.

Just like having different systemwide databases for each Python version
makes sense, why wouldn't we have different user databases for each
Python version?  Something like ~/.python_packages.2.6 and
~/.python_packages.3.0

Also, by separating out the files per Python version, we can guarantee
database compatibility for any fixed Python series (2.5.x, etc.).
I don't know if the internal organization of SQLite databases changes
between revisions in a backwards compatible way, so this may not
actually be a concern (it is with bsddb).


> In truth, a per-user configuration is just a special case of the real need: 
> to have per-application environments.  In effect, a per-user environment is 
> a fallback for not having an appplication environment, and the system 
> environment is a fallback for not having a user environment.

I think you are mostly correct.  The reason you are not completely
correct is that if I were to install psyco, and I want all applications
that could use it to use it (they guard the psyco import with a
try/except), I merely need to register the package in the systemwide (or
user) package registry.  No need to muck about with each environment I
(or my installed applications) have defined, it just works.  Is it a
"fallback"?  Sure, but I prefer to call them "convenient defaults".


> >I didn't mention the following because I thought it would be superfluous,
> >but it seems that I should have stated it right out.  My thoughts were
> >that on startup, Python would first query the 'system' database, caching
> >its results in a dictionary, then query the user's listing, updating the
> >dictionary as necessary, then unload the databases.  On demand, when
> >code runs packages.register(), if both persist and systemwide are False,
> >it just updates the dictionary. If either are true, it opens up and
> >updates the relevant database.
> 
> Using a database as the primary mechanism for managing import locations 
> simply isn't workable.

Why?  Remember that this database isn't anything other than a
persistence mechanism that has pre-built locking semantics for
multi-process opening, reading, writing, and closing.  Given proper
cross-platform locking, we could use any persistence mechanism as a
replacement: miniconf, pickle, marshal, whatever.


> You might as well suggest that each environment 
> consist of a single large zipfile containing the packages in question: this 
> would actually be *more* practical (and fast!) in terms of Python startup, 
> and is no different from having a database with respect to the need for 
> installation and uninstallation to modify a central file!

We should remember the sizes of databases that (I expect) will be
common: we are talking about maybe 30k if a user has installed every
package in PyPI.  And after the initial query, everything will be stored
in a dictionary or dictionary-like object, offering faster query times
than even a zip file (though loading the module/package from disk won't
have its performance improved).


> I'm not proposing we do that -- I'm just pointing out why using an actual 
> database isn't really workable, considering that it has all of the 
> disadvantages of a big zipfile, and none of the advantages (like speed, 
> having code already written that supports it, etc.)

SQLite is pretty fast.  And for startup, we are really only performing a
single query per database, "SELECT * FROM package_registry".  It will end
up reading the entire database, but these databases will generally be
small, perhaps a few dozen rows, maybe a few thousand if we have set up
a bunch of installation-time application environments.
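
(A minimal sketch of what that startup query could look like with the sqlite3
module that ships with 2.5; the table layout here is purely hypothetical:)

    import sqlite3

    def load_registry(path):
        # Read the whole (small) registry into a plain dict: name -> location.
        conn = sqlite3.connect(path)
        try:
            rows = conn.execute("SELECT name, location FROM package_registry")
            return dict(rows)
        finally:
            conn.close()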


> >This is easily remedied with a proper 'packages' implementation:
> >
> >     python -Mpackages name path
> >
> >Note that Python could auto-insert standard library and site-packages
> >'packages' on startup (creating the initial dictionary, then the
> >systemwide, then the user, ...).
> 
> I presume here you're suggesting a way to select a runtime environment from 
> the command line, which would certainly be a good idea.

Actually, I'm offering a way of *registering* a package with the
repository from the command line.  I'm of the opinion that setting the
environment via command line for the subsequent Python runs is a bad
idea, but then again, I have been using wxPython's wxversion method for
a while to select which wxPython installation I want to use, and find
things like:

    import wxversion
    wxversion.ensureMinimal('2.6-unicode', optionsRequired=True)

To be exactly the amount of control I want, where I want it.

Further, a non-command-line mechanism to handle environment would save
people from mucking up their Python runtime environment if they forget
to switch it back to a 'default'.


With a package registry (perhaps as I have been describing, perhaps
something different), all of the disparate ways of choosing a version of
a library during import can be removed in favor of a single mechanism. 
This single mechanism could handle things like the wxPython
'ensureMinimal', perhaps even 'ensure exact' or 'use latest'.


> > > These are just a few of the issues that come to mind.  Realistically
> > > speaking, .pth files are currently the most effective mechanism we have,
> > > and there actually isn't much that can be done to improve upon them.
> >
> >Except that .pth files are only usable in certain (likely) system paths,
> >that the user may not have write access to.  There have previously been
> >proposals to add support for .pth files in the path of the run .py file,
> >but they don't seem to have gotten any support.
> 
> Setuptools works around this by installing an enhancement for the 'site' 
> module that extends .pth support to include all PYTHONPATH 
> directories.  The enhancement delegates to the original site module after 
> recording data about sys.path that the site module destroys at startup.

But wasn't there a recent discussion describing how keeping persistent
environment variables is a PITA both during install and runtime?
Extending .pth files to PYTHONPATH seems to me like a hack meant to work
around the fact that Python doesn't have a package registry.  And really,
all of the current sys.path + .pth + PYTHONPATH stuff could be subsumed
into a *single* mechanism.

I'm of the opinion that the current system of paths, etc., is a bit
cumbersome.  And I think that we can do better, either with the
mechanism I am describing, or otherwise.


> >I believe that most of the concerns that you have brought up can be
> >addressed,
> 
> Well, as I said, I've already dealt with them, using .pth files, for the 
> use cases I care about.  Ian Bicking and Jim Fulton have also gone farther 
> with work on tools to create environments with greater isolation or more 
> fixed version linkages than what setuptools does.  (Setuptools-generated 
> environments dynamically select requirements based on available versions at 
> runtime, while Ian and Jim's tools create environments whose inter-package 
> linkages are frozen at installation time.)

All of these cases could be handled by a properly designed package
registry mechanism.


> >and I think that it could be far nicer to deal with than the
> >current sys.path hackery.
> 
> I'm not sure of that, since I don't yet know how your approach would deal 
> with namespace packages, which are distributed in pieces and assembled 
> later.  For example, many PEAK and Zope distributions live in the peak.* 
> and zope.* package namespaces, but are installed separately, and glued 
> together via __path__ changes (see the pkgutil docs).

    packages.register('zope', '/path/to/zope')

And if the installation path is different:

    packages.register('zope.subpackage', '/different/path/to/subpackage/')

Otherwise the importer will know where the zope (or peak) package exists
in the filesystem (or otherwise), and search it whenever 'from zope
import ...' is performed.


> Thus, if you are talking about a packagename->importer mapping, it has to 
> take into consideration the possibility of multiple import locations for 
> the same package.

Indeed.  But this is not any different than the "multiple import
locations for any absolute import" in all Pythons.  Only now we don't
need to rely on sys.path, .pth, PYTHONPATH, or monkey patching site.py,
and we don't need to be adding packages to the root of the absolute
import hierarchy: I can add my own package/module to the email package
if I want, and I don't even need to bork the system install to do it.


 - Josiah


From bob at redivi.com  Fri Sep 22 21:42:16 2006
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 22 Sep 2006 12:42:16 -0700
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <20060922192637.GA10582@seldon>
References: <451420CE.8070003@voidspace.org.uk>
	<20060922114820.0851.JCARLSON@uci.edu>
	<6a36e7290609221205y392c6defy19ea2a004b82725a@mail.gmail.com>
	<20060922192637.GA10582@seldon>
Message-ID: <6a36e7290609221242j165f23bfq22c0502b7afe9ffa@mail.gmail.com>

On 9/22/06, Brian Harring <ferringb at gmail.com> wrote:
> On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote:
> > On 9/22/06, Josiah Carlson <jcarlson at uci.edu> wrote:
> > >
> > > Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> > > >
> > > > Hello all,
> > > >
> > > > I have a suggestion for a new Python built in function: 'flatten'.
> > >
> > > This has been brought up many times.  I'm -1 on its inclusion, if only
> > > because it's a fairly simple 9-line function (at least the trivial
> > > version I came up with), and not all X-line functions should be in the
> > > standard library.  Also, while I have had need for such a function in
> > > the past, I have found that I haven't needed it in a few years.
> >
> > I think instead of adding a flatten function perhaps we should think
> > about adding something like Erlang's "iolist" support. The idea is
> > that methods like "writelines" should be able to take nested iterators
> > and consume any object they find that implements the buffer protocol.
>
> Which is no different then just passing in a generator/iterator that
> does flattening.
>
> Don't much see the point in gumming up the file protocol with this
> special casing; still will have requests for a flattener elsewhere.
>
> If flattening was added, should definitely be a general obj, not a
> special casing in one method in my opinion.

I disagree, the reason for iolist is performance and convenience; the
required indirection of having to explicitly call a flattener function
removes some optimization potential and makes it less convenient to
use.

While there certainly should be a general mechanism available to
perform the task (easily accessible from C), the user would be better
served by not having to explicitly call itertools.iterbuffers every
time they want to write recursive iterables of stuff.

-bob

From fuzzyman at voidspace.org.uk  Fri Sep 22 21:55:18 2006
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 22 Sep 2006 20:55:18 +0100
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <20060922192928.1717.1975026622.divmod.quotient.57018@ohm>
References: <20060922192928.1717.1975026622.divmod.quotient.57018@ohm>
Message-ID: <45143FA6.3020600@voidspace.org.uk>

glyph at divmod.com wrote:
> On Fri, 22 Sep 2006 18:43:42 +0100, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>
>   
>> I have a suggestion for a new Python built in function: 'flatten'.
>>     
>
> This seems superficially like a good idea, but I think adding it to Python anywhere would do a lot more harm than good.  I can see that consensus is already strongly against a builtin, but I think it would be bad to add to itertools too.
>
> Flattening always *seems* to be a trivial and obvious operation.  "I just need something that takes a group of deeply structured data and turns it into a group of shallowly structured data.".  Everyone that has this requirement assumes that their list of implicit requirements for "flattening" is the obviously correct one.
>
> This wouldn't be a problem except that everyone has a different idea of those requirements:).
>
> Here are a few issues.
>
> What do you do when you encounter a dict?  You can treat it as its keys(), its values(), or its items().
>
> What do you do when you encounter an iterable object?
>
> What order do you flatten set()s in?  (and, ha ha, do you Set the same?)
>
> How are user-defined flattening behaviors registered?  Is it a new special method, a registration API?
>
> How do you pass information about the flattening in progress to the user-defined behaviors?
>
> If you do something special to iterables, do you special-case strings?  Why or why not?
>
>   
If you consume iterables and only special-case strings, then none of 
the issues you raise above seem to be a problem.

Sets and dictionaries are both iterable.

If it's not iterable it's an element.

I'd prefer to see this as a built-in; lots of people seem to want it, IMHO.

Having it in itertools is a good compromise.

> What do you do if you encounter a function?  This is kind of a trick question, since Nevow's "flattener" *calls* functions as it encounters them, then treats the *result* of calling them as further input.
>   
That doesn't sound like what anyone would normally expect.


> If you don't think that functions are special, what about *generator* functions?  How do you tell the difference?  What about functions that return generators but aren't themselves generators?  What about functions that return non-generator iterators?  What about pre-generated generator objects (if you don't want to treat iterables as special, are generators special?).
>
>   
What does the list constructor do with these? Do the same.

> Do you produce the output as a structured list or an iterator that works incrementally?
>   
Either would be fine. I had in mind a list, but converting an iterator 
into a list is trivial.

> Also, at least Nevow uses "flatten" to mean "serialize to bytes", not "produce a flat list", and I imagine at least a few other web frameworks do as well.  That starts to get into encoding issues.
>
>   
That's not a use of the term I've come across. On the other hand, I've heard 
of flatten in the context of nested data structures many times.

> If you make a decision one way or another on any of these questions of policy, you are going to make flatten() useless to a significant portion of its potential userbase.  The only difference between having it in the standard library and not is that if it's there, they'll spend an hour being confused by the weird way that it's dealing with <insert your favorite data type here> rather than just doing the "obvious" thing, and they'll take a minute to write the 10-line function that they need.  Without the standard library, they'll skip to step 2 and save a lot of time.
>   
I think that you're over-complicating it and that the term flatten is 
really fairly straightforward, especially if it's clearly documented in 
terms of consuming iterables.

All the best,


Michael Foord
http://www.voidspace.org.uk


> I would love to see a unified API that figured out all of these problems, and put them together into a (non-stdlib) library that anyone interested could use for a few years to work the kinks out.  Although it might be nice to have a simple "flatten" interface, I don't think that it would ever be simple enough to stick into a builtin; it would just be the default instance of the IncrementalDestructuringProcess class with the most popular (as determined by polling users of the library after a year or so) IncrementalDestructuringTypePolicy.





From jcarlson at uci.edu  Fri Sep 22 22:17:23 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 22 Sep 2006 13:17:23 -0700
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <6a36e7290609221242j165f23bfq22c0502b7afe9ffa@mail.gmail.com>
References: <20060922192637.GA10582@seldon>
	<6a36e7290609221242j165f23bfq22c0502b7afe9ffa@mail.gmail.com>
Message-ID: <20060922131249.0854.JCARLSON@uci.edu>


"Bob Ippolito" <bob at redivi.com> wrote:
> On 9/22/06, Brian Harring <ferringb at gmail.com> wrote:
> > On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote:
> > > I think instead of adding a flatten function perhaps we should think
> > > about adding something like Erlang's "iolist" support. The idea is
> > > that methods like "writelines" should be able to take nested iterators
> > > and consume any object they find that implements the buffer protocol.
> >
> > Which is no different then just passing in a generator/iterator that
> > does flattening.
> >
> > Don't much see the point in gumming up the file protocol with this
> > special casing; still will have requests for a flattener elsewhere.
> >
> > If flattening was added, should definitely be a general obj, not a
> > special casing in one method in my opinion.
> 
> I disagree, the reason for iolist is performance and convenience; the
> required indirection of having to explicitly call a flattener function
> removes some optimization potential and makes it less convenient to
> use.

Sorry Bob, but I disagree.  In the few times where I've needed to 'write
a list of buffers to a file handle', I have found iterating over the
buffers to be sufficient.  And honestly, in all of my time dealing
with socket and file IO, I've never needed to write a list of iterators
of buffers.  Not to say that YAGNI, but I'd like to see an example where
1) it was being used in the wild, and 2) where it would be a measurable
speedup.

 - Josiah


From martin at v.loewis.de  Fri Sep 22 22:14:42 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 22 Sep 2006 22:14:42 +0200
Subject: [Python-Dev] Pep 353: Py_ssize_t advice
In-Reply-To: <871wq3eo5e.fsf@pereiro.luannocracy.com>
References: <871wq3eo5e.fsf@pereiro.luannocracy.com>
Message-ID: <45144432.6010304@v.loewis.de>

David Abrahams schrieb:
>   #if PY_VERSION_HEX < 0x02050000
>   typedef int Py_ssize_t;
>   #define PY_SSIZE_T_MAX INT_MAX
>   #define PY_SSIZE_T_MIN INT_MIN
>   #endif
> 
> I just wanted to point out that this advice could lead to library
> header collisions when multiple 3rd parties decide to follow it.  I
> suggest it be changed to something like:
> 
>   #if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)

Strictly speaking, this shouldn't be necessary. C allows redefinition
of an object-like macro if the replacement list is identical (for
some definition of identical which applies if the fragment is
copied literally from the PEP).

So I assume you had a non-identical replacement list?  Can you share
what alternative definition you were using?

In any case, I still think this is good practice, so I added it
to the PEP.

> (C++ allows restating of typedefs; if C allows it, that should be
> something like):

C also allows this; still, our advice would be that these three
names always get defined together - if that is followed, having
a single guard macro should suffice.  PY_SSIZE_T_MIN, as you propose,
should be sufficient.

Regards,
Martin


From rhettinger at ewtllc.com  Fri Sep 22 22:14:58 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 22 Sep 2006 13:14:58 -0700
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <451420CE.8070003@voidspace.org.uk>
Message-ID: <B6FAC926EFE7B348B12F29CF7E4A93D401CF46A3@hammer.office.bhtrader.com>

[Michael Foord]
>I have a suggestion for a new Python built in function: 'flatten'.
> ...
> There are several different possible approaches in pure Python, 
> but is this an idea that has legs ?

No legs.

It has been discussed ad nauseam on comp.lang.python.  People seem to
enjoy writing their own versions of flatten more than finding legitimate
use cases that don't already have trivial solutions.

A general-purpose flattener needs some way to be told what is atomic and
what can be further subdivided.  Also, it is not obvious how the algorithm
should be extended to cover inputs with tree-like data structures with
data at the nodes as well as the leaves (preorder, postorder, inorder
traversal, etc.).

I say use your favorite cookbook approach and leave it out of the
language.


Raymond

From martin at v.loewis.de  Fri Sep 22 22:21:35 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 22 Sep 2006 22:21:35 +0200
Subject: [Python-Dev] GCC patch for catching errors in PyArg_ParseTuple
Message-ID: <451445CF.7080407@v.loewis.de>

I wrote a patch for the GCC trunk to add an
__attribute__((format(PyArg_ParseTuple, 2, 3)))
declaration to functions (this specific declaration
should go to PyArg_ParseTuple only).

With that patch, parameter types are compared with the string parameter
(if that's a literal), and errors are reported if there is a type
mismatch (provided -Wformat is given).

I'll post more about this patch in the near future, and commit
some bug fixes I found with it, but here is the patch, in
a publish-early fashion.

There is little chance that this can go into GCC (as it is too
specific), so it likely needs to be maintained separately.
It was written for the current trunk, but hopefully applies
to most recent releases.

Regards,
Martin

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pyformat.diff
Url: http://mail.python.org/pipermail/python-dev/attachments/20060922/588484ab/attachment-0001.diff 

From pje at telecommunity.com  Fri Sep 22 22:25:49 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 22 Sep 2006 16:25:49 -0400
Subject: [Python-Dev] New relative import issue
In-Reply-To: <20060922112345.084E.JCARLSON@uci.edu>
References: <5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com>
	<20060921233257.0848.JCARLSON@uci.edu>
	<5.1.1.6.0.20060922120555.0270b420@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20060922160541.028188e8@sparrow.telecommunity.com>

At 12:42 PM 9/22/2006 -0700, Josiah Carlson wrote:
> > You might as well suggest that each environment
> > consist of a single large zipfile containing the packages in question: 
> this
> > would actually be *more* practical (and fast!) in terms of Python startup,
> > and is no different from having a database with respect to the need for
> > installation and uninstallation to modify a central file!
>
>We should remember that the sizes of databases that (I expect) will be
>common, we are talking about maybe 30k if a user has installed every
>package in pypi.  And after the initial query, everything will be stored
>in a dictionary or dictionary-like object, offering faster query times
>than even a zip file

Measure it.  Be sure to include the time to import SQLite vs. the time to 
import the zipimport module.


>SQLite is pretty fast.  And for startup, we are really only performing a
>single query per database "SELECT * FROM package_registry".  It will end
>up reading the entire database, but these databases will be generally
>small, perhaps a few dozen rows, maybe a few thousand if we have set up
>a bunch of installation-time application environments.

Again, seriously, compare this against a zipfile.  You'll find that there's 
absolutely no comparison between reading this and reading a zipfile central 
directory -- which also results in an in-memory cache that can then be used 
to seek() directly to the module.


>Actually, I'm offering a way of *registering* a package with the
>repository from the command line.  I'm of the opinion that setting the
>environment via command line for the subsequent Python runs is a bad
>idea, but then again, I have been using wxPython's wxversion method for
>a while to select which wxPython installation I want to use, and find
>things like:
>
>     import wxversion
>     wxversion.ensureMinimal('2.6-unicode', optionsRequired=True)
>
>To be exactly the amount of control I want, where I want it.

Well, that's already easy to do for arbitrary packages and arbitrary 
versions with setuptools.  Eggs installed in "multi-version" mode are added 
to sys.path at runtime if/when they are requested.


>With a package registry (perhaps as I have been describing, perhaps
>something different), all of the disparate ways of choosing a version of
>a library during import can be removed in favor of a single mechanism.
>This single mechanism could handle things like the wxPython
>'ensureMinimal', perhaps even 'ensure exact' or 'use latest'.

This discussion is mostly making me realize that sys.path is exactly the 
right thing to have, and that the only thing that actually needs fixing is 
universal .pth support, and maybe some utility functions for better 
sys.path manipulation within .pth files.  I suggest that there is no way an 
arbitrary "registry" implementation is going to be faster than reading 
lines from a text file.


> > Setuptools works around this by installing an enhancement for the 'site'
> > module that extends .pth support to include all PYTHONPATH
> > directories.  The enhancement delegates to the original site module after
> > recording data about sys.path that the site module destroys at startup.
>
>But wasn't there a recent discussion describing how keeping persistant
>environment variables is a PITA both during install and runtime?

Yes, exactly.


>Extending .pth files to PYTHONPATH seems to me like a hack meant to work
>around the fact that Python doesn't have a package registry.  And really,
>all of the current sys.path + .pth + PYTHONPATH stuff could be subsumed
>into a *single* mechanism.

Sure -- I suggest that the single mechanism is none other than 
*sys.path*.  The .pth files, PYTHONPATH, and a new command-line option 
are merely ways to set it.

All of the discussion that's taken place here has sufficed at this point to 
convince me that sys.path isn't broken at all, and doesn't need 
fixing.  Some tweaks to 'site' and maybe a new command-line option will 
suffice to clean everything up quite nicely.

I say this because all of the version and dependency management things that 
people are asking about can already be achieved by setuptools, so clearly 
the underlying machinery is fine.  It wasn't until this message of yours 
that I realized that you are trying to solve a bunch of problems that are 
quite solvable within the existing machinery.  I was mainly interested in 
cleaning up the final awkwardness that's effectively caused by lack of .pth 
support for the startup script directory.


> > I'm not sure of that, since I don't yet know how your approach would deal
> > with namespace packages, which are distributed in pieces and assembled
> > later.  For example, many PEAK and Zope distributions live in the peak.*
> > and zope.* package namespaces, but are installed separately, and glued
> > together via __path__ changes (see the pkgutil docs).
>
>     packages.register('zope', '/path/to/zope')
>
>And if the installation path is different:
>
>     packages.register('zope.subpackage', '/different/path/to/subpackage/')
>
>Otherwise the importer will know where the zope (or peak) package exists
>in the filesystem (or otherwise), and search it whenever 'from zope
>import ...' is performed.

If you're talking about replacing the current import machinery, you would 
have to leave this to Py3K, otherwise all you've done is add a *new* import 
hook, i.e. a "sys.package_loaders" dictionary or some such.

If you wanted something like that now, of course, you could slap an 
importer into sys.meta_path that then did a lookup in 
sys.package_loaders.  Getting this mechanism bootstrapped, however, is left 
as an exercise for the reader.  ;)

Note, by the way, that it might be quite possible to do away with 
everything but sys.meta_path in Py3K, prepopulated with such an importer 
(along with ones to support builtin and frozen modules).  You could then 
import a backward-compatibility module that would add support for sys.path 
and for package __path__ attributes, by adding a new entry to 
sys.meta_path.  But this is strictly a pipe dream where Python 2.x is 
concerned.
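
(A rough sketch of the kind of sys.meta_path hook being talked about; the
registry dict and path are hypothetical, and real code would also need to cope
with packages assembled from several locations:)

    import sys
    import pkgutil

    # hypothetical mapping of registered package names to the directory
    # that contains them (i.e. the parent directory of the package)
    package_loaders = {'zope': '/path/to/zope-root'}

    class RegistryFinder(object):
        def find_module(self, fullname, path=None):
            if fullname in package_loaders:
                # delegate the filesystem work to pkgutil's PEP 302 emulation
                importer = pkgutil.ImpImporter(package_loaders[fullname])
                return importer.find_module(fullname)
            return None   # fall through to the normal sys.path machinery

    sys.meta_path.append(RegistryFinder())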


From bob at redivi.com  Fri Sep 22 22:34:23 2006
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 22 Sep 2006 13:34:23 -0700
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <20060922131249.0854.JCARLSON@uci.edu>
References: <20060922192637.GA10582@seldon>
	<6a36e7290609221242j165f23bfq22c0502b7afe9ffa@mail.gmail.com>
	<20060922131249.0854.JCARLSON@uci.edu>
Message-ID: <6a36e7290609221334q7ec72a5cu5000347ee13248fa@mail.gmail.com>

On 9/22/06, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Bob Ippolito" <bob at redivi.com> wrote:
> > On 9/22/06, Brian Harring <ferringb at gmail.com> wrote:
> > > On Fri, Sep 22, 2006 at 12:05:19PM -0700, Bob Ippolito wrote:
> > > > I think instead of adding a flatten function perhaps we should think
> > > > about adding something like Erlang's "iolist" support. The idea is
> > > > that methods like "writelines" should be able to take nested iterators
> > > > and consume any object they find that implements the buffer protocol.
> > >
> > > Which is no different then just passing in a generator/iterator that
> > > does flattening.
> > >
> > > Don't much see the point in gumming up the file protocol with this
> > > special casing; still will have requests for a flattener elsewhere.
> > >
> > > If flattening was added, should definitely be a general obj, not a
> > > special casing in one method in my opinion.
> >
> > I disagree, the reason for iolist is performance and convenience; the
> > required indirection of having to explicitly call a flattener function
> > removes some optimization potential and makes it less convenient to
> > use.
>
> Sorry Bob, but I disagree.  In the few times where I've needed to 'write
> a list of buffers to a file handle', I find iterating over the
> buffers to be sufficient.  And honestly, in all of my time dealing
> with socket and file IO, I've never needed to write a list of iterators
> of buffers.  Not to say that YAGNI, but I'd like to see an example where
> 1) it was being used in the wild, and 2) where it would be a measurable
> speedup.

The primary use for this is structured data, mostly file formats,
where you can't write the beginning until you have a bunch of
information about the entire structure such as the number of items or
the count of bytes when serialized. An efficient way to do that is
just to build a bunch of nested lists that you can use to calculate
the size (iolist_size(...) in Erlang) instead of having to write a
visitor that constructs a new flat list or writes to StringIO first. I
suppose in the most common case, for performance reasons, you would
want to restrict this to sequences only (as in PySequence_Fast)
because iolist_size(...) should be non-destructive (or else it has to
flatten into a new list anyway).
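
A rough Python sketch of the idea, for concreteness (iolist_size and
write_iolist are illustrative names, not anything from an existing
library; strings are the leaf type and everything else is treated as a
nested sequence):

    def iolist_size(iolist):
        # Non-destructive byte count over a nested structure of strings;
        # restricting the nesting to real sequences means the structure
        # can be walked more than once.
        if isinstance(iolist, str):
            return len(iolist)
        return sum(iolist_size(item) for item in iolist)

    def write_iolist(fileobj, iolist):
        # Write the same structure out without ever building a flat copy.
        if isinstance(iolist, str):
            fileobj.write(iolist)
        else:
            for item in iolist:
                write_iolist(fileobj, item)

    # e.g. a length-prefixed record assembled without flattening first:
    body = ["spam", ["eggs", "ham"], "!"]
    record = [str(iolist_size(body)), ":", body]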

I've definitely done this before in Python, most recently here:
http://svn.red-bean.com/bob/flashticle/trunk/flashticle/

The flatten function in this case is flashticle.util.iter_only, and
it's used in flashticle.actions, flashticle.amf, flashticle.flv,
flashticle.swf, and flashticle.remoting.

-bob

From dave at boost-consulting.com  Sat Sep 23 00:17:13 2006
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 22 Sep 2006 18:17:13 -0400
Subject: [Python-Dev] Pep 353: Py_ssize_t advice
In-Reply-To: <45144432.6010304@v.loewis.de> (Martin v. =?utf-8?Q?L=C3=B6wi?=
	=?utf-8?Q?s's?= message of "Fri, 22 Sep 2006 22:14:42 +0200")
References: <871wq3eo5e.fsf@pereiro.luannocracy.com>
	<45144432.6010304@v.loewis.de>
Message-ID: <87k63vzguu.fsf@pereiro.luannocracy.com>

"Martin v. L?wis" <martin at v.loewis.de> writes:

> David Abrahams schrieb:
>>   #if PY_VERSION_HEX < 0x02050000
>>   typedef int Py_ssize_t;
>>   #define PY_SSIZE_T_MAX INT_MAX
>>   #define PY_SSIZE_T_MIN INT_MIN
>>   #endif
>> 
>> I just wanted to point out that this advice could lead to library
>> header collisions when multiple 3rd parties decide to follow it.  I
>> suggest it be changed to something like:
>> 
>>   #if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)
>
> Strictly speaking, this shouldn't be necessary. C allows redefinition
> of an object-like macro if the replacement list is identical (for
> some definition of identical which applies if the fragment is
> copied literally from the PEP).
>
So I assume you had a non-identical replacement list?

No:

a. I didn't actually experience a collision; I only anticipated it

b. We were using C++, which IIRC does not allow such redefinition

c. anyway you'll get a nasty warning, which for some people will be
   just as bad as an error

> Can you share what alternative definition you were using?
>
> In any case, I still think this is good practice, so I added it
> to the PEP.

Thanks,

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From rasky at develer.com  Sat Sep 23 00:42:33 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 23 Sep 2006 00:42:33 +0200
Subject: [Python-Dev] GCC patch for catching errors in PyArg_ParseTuple
References: <451445CF.7080407@v.loewis.de>
Message-ID: <00c401c6de98$64cefd80$4bbd2997@bagio>

Martin v. Löwis wrote:

>> I'll post more about this patch in the near future, and commit
>> some bug fixes I found with it, but here is the patch, in
>> a publish-early fashion.
>>
>> There is little chance that this can go into GCC (as it is too
>> specific), so it likely needs to be maintained separately.
>> It was written for the current trunk, but hopefully applies
>> to most recent releases.

A way to avoid maintaining this patch forever would be to devise a way to make
format syntax "pluggable" / "scriptable". There have been previous discussions
on the GCC mailing lists.

Giovanni Bajo


From typo_pl at hotmail.com  Sat Sep 23 00:46:32 2006
From: typo_pl at hotmail.com (Johnny Lee)
Date: Fri, 22 Sep 2006 22:46:32 +0000
Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code
Message-ID: <BAY112-W5EEFA3A998FA1B8EA7AEE9E210@phx.gbl>




Hello,

My name is Johnny Lee. I have developed a *ahem* perl script which
scans C/C++ source files for typos.  I ran the typo.pl script on the
released Python 2.5 source code.  The scan took about two minutes and
produced ~340 typos.  After spending about 13 minutes weeding out the
obvious false positives, 149 typos remain.

One of the pros/cons of the script is that it doesn't need to be
integrated into the build process to work.  It just searches for files
with typical C/C++ source code file extensions and scans them.  The
downside is that if a source file is not included in the build process,
then the script is scanning an irrelevant file.  Unless you aid the
script via some parameters, it will scan all the code, even stuff
inside #ifdefs that wouldn't normally be compiled.

You can access the list of typos from
<http://www.geocities.com/typopl/typoscan.htm>
The Perl 1999 paper can be read at
<http://www.geocities.com/typopl/index.htm>

I've mapped the Python memory-related calls PyMem_Alloc, PyMem_Realloc,
etc. to the same behaviour as the C std library malloc, realloc, etc.
since Include\pymem.h seems to map them to those calls.  If that
assumption is not valid, then you can ignore typos that involve those
PyMem_XXX calls.

The Python 2.5 typos can be classified into 7 types.

1) if (X = 0)
Assignment within an if statement.  Typically a false positive, but
sometimes it catches something valid.  In Python's case, the one typo
is: if (status = ERROR_MORE_DATA), but the previous code statement
returns an error code into the status variable.

2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size);
If realloc() fails, it will return NULL.  If you assign the return
value to the same variable you passed into realloc, then you've
overwritten the variable and possibly leaked the memory that the
variable pointed to.

3) if (CreateFileMapping == IHV)
On Win32, the CreateFileMapping() API will return NULL on failure, not
INVALID_HANDLE_VALUE.  The Python code does not check for NULL though.

4) if ((X!=0) || (X!=1))
The problem with code of this type is that it doesn't work.  In the
Python case, we have in a large if statement:
quotetabs && ((data[in]!='\t')||(data[in]!=' '))
Now if data[in] == '\t', then it will fail the first data[in]
comparison but it will pass the second data[in] comparison.  Typically
you want "&&", not "||".

5) using API result w/no check
There are several APIs that should be checked for success before using
the returned ptrs/cookies, i.e. malloc, realloc, and fopen among
others.

6) XX;;
Just being anal here.  Two semicolons in a row.  The second one is
extraneous.

7) extraneous test for non-NULL ptr
Several memory calls that free memory accept NULL ptrs, so testing for
NULL before calling them is redundant and wastes code space.  Now some
codepaths may be time-critical, but probably not all, and smaller code
usually helps.

If you have any questions or comments, feel free to email.  I hope
this scan is useful.

Thanks for your time,
J

From jcarlson at uci.edu  Sat Sep 23 02:03:45 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 22 Sep 2006 17:03:45 -0700
Subject: [Python-Dev] New relative import issue
In-Reply-To: <5.1.1.6.0.20060922160541.028188e8@sparrow.telecommunity.com>
References: <20060922112345.084E.JCARLSON@uci.edu>
	<5.1.1.6.0.20060922160541.028188e8@sparrow.telecommunity.com>
Message-ID: <20060922134229.0857.JCARLSON@uci.edu>


"Phillip J. Eby" <pje at telecommunity.com> wrote:
> At 12:42 PM 9/22/2006 -0700, Josiah Carlson wrote:
[snip]
> Measure it.  Be sure to include the time to import SQLite vs. the time to 
> import the zipimport module.
[snip]
> Again, seriously, compare this against a zipfile.  You'll find that there's 
> absolutely no comparison between reading this and reading a zipfile central 
> directory -- which also results in an in-memory cache that can then be used 
> to seek() directly to the module.

They are not directly comparable.  The registry of packages can do more
than zipimport in terms of package naming and hierarchy, but it's not an
importer; it's a conceptual replacement of sys.path.  I have already
stated that the actual imports from this registry won't be any faster,
as it will still need to read modules/packages from disk *after* it has
decided on a list of paths to check for the package/module.  Further,
whether we use SQLite, or any one of a number of other persistence
mechanisms, such a choice should depend on a few things (speed being one
of them, though maybe not the *only* consideration).  Perhaps even a zip
file whose 'files' are named with the desired package hierarchy, and
whose contents are something like:

    import imp
    globals().update(imp.load_XXX(...).__dict__)
    del imp


> >Actually, I'm offering a way of *registering* a package with the
> >repository from the command line.  I'm of the opinion that setting the
> >environment via command line for the subsequent Python runs is a bad
> >idea, but then again, I have been using wxPython's wxversion method for
> >a while to select which wxPython installation I want to use, and find
> >things like:
> >
> >     import wxversion
> >     wxversion.ensureMinimal('2.6-unicode', optionsRequired=True)
> >
> >To be exactly the amount of control I want, where I want it.
> 
> Well, that's already easy to do for arbitrary packages and arbitrary 
> versions with setuptools.  Eggs installed in "multi-version" mode are added 
> to sys.path at runtime if/when they are requested.

Why do we have to use eggs or setuptools to get a feature that
*arguably* should have existed a decade ago in core Python?

The core functionality I'm talking about is:

    packages.register(name, path, env=None, system=False, persist=False)
    #system==True implies persist==True

    packages.copy_env(fr_env, to_env)
    packages.use_env(env)

    packages.check(name, version=None)

    packages.use(name, version)

With those 5 functions and a few tricks, we can replace all user-level
.pth and PYTHONPATH use, and sys.path manipulation done by other
3rd-party packages (setuptools, etc.) is easily handled and supported.
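
A minimal in-memory sketch of what such a registry might look like
(purely illustrative; none of these names exist anywhere today, and the
env/system/persist handling is reduced to almost nothing):

    import sys

    _registry = {}        # (env, name) -> path
    _current_env = None

    def register(name, path, env=None, system=False, persist=False):
        # system/persist would control writing to a shared on-disk store;
        # this sketch keeps everything in memory.
        _registry[(env, name)] = path

    def copy_env(fr_env, to_env):
        for (env, name), path in list(_registry.items()):
            if env == fr_env:
                _registry[(to_env, name)] = path

    def use_env(env):
        global _current_env
        _current_env = env

    def check(name, version=None):
        return (_current_env, name) in _registry or (None, name) in _registry

    def use(name, version=None):
        path = _registry.get((_current_env, name)) or _registry.get((None, name))
        if path is not None and path not in sys.path:
            sys.path.insert(0, path)

    # e.g. register('zope', '/path/to/zope', env='netserver')
    #      use_env('netserver'); use('zope'); import zope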


> >With a package registry (perhaps as I have been describing, perhaps
> >something different), all of the disparate ways of choosing a version of
> >a library during import can be removed in favor of a single mechanism.
> >This single mechanism could handle things like the wxPython
> >'ensureMinimal', perhaps even 'ensure exact' or 'use latest'.
> 
> This discussion is mostly making me realize that sys.path is exactly the 
> right thing to have, and that the only thing that actually need fixing is 
> universal .pth support, and maybe some utility functions for better 
> sys.path manipulation within .pth files.  I suggest that there is no way an 
> arbitrary "registry" implementation is going to be faster than reading 
> lines from a text file.
> 
> > > Setuptools works around this by installing an enhancement for the 'site'
> > > module that extends .pth support to include all PYTHONPATH
> > > directories.  The enhancement delegates to the original site module after
> > > recording data about sys.path that the site module destroys at startup.
> >
> >But wasn't there a recent discussion describing how keeping persistant
> >environment variables is a PITA both during install and runtime?
> 
> Yes, exactly.

You have confused me, because not only have you just said "we use
PYTHONPATH as a solution", but you have just acknowledged that using
PYTHONPATH is not reasonable as a solution.  You have also just said
that we need to add features to .pth support so that it is more usable.

So, sys.path "is exactly the right thing to have", but we need to add
more features to make it better.

Ok, here's a sample .pth file if we are willing to make it better (in my
opinion):

    zope,/path/to/zope,3.2.1,netserver
    zope.subpackage,/path/to/subpackage,.1.1,netserver

That's a CSV file with rows defining packages, and columns in order:
package name, path to package, version, and a semicolon-separated list
of environments that this package is available in (a leading semicolon,
or a double semicolon says that it is available when no environment is
specified).

With a base sys.path, a dictionary mapping environment -> packages
built from .pth files, and a simple function, one can generate an
applicable sys.path on demand via some choose_environment() call.
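
For concreteness, a quick sketch of how such a file might be consumed
(choose_environment and the column handling are only illustrative, and
error handling is omitted):

    import csv, sys

    def load_package_rows(pth_file):
        # Each row: name, path, version, semicolon-separated environments.
        for name, path, version, envs in csv.reader(open(pth_file)):
            yield name, path, version, envs.split(';')

    def choose_environment(pth_file, env=None, base_path=None):
        new_path = list(base_path if base_path is not None else sys.path)
        for name, path, version, envs in load_package_rows(pth_file):
            everywhere = '' in envs        # leading/double semicolon
            if everywhere or (env is not None and env in envs):
                new_path.append(path)
        return new_path

    # e.g. sys.path[:] = choose_environment('packages.pth', 'netserver')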

This is, effectively, a variant of what I was suggesting, only with
a different persistence representation.


> >Extending .pth files to PYTHONPATH seems to me like a hack meant to work
> >around the fact that Python doesn't have a package registry.  And really,
> >all of the current sys.path + .pth + PYTHONPATH stuff could be subsumed
> >into a *single* mechanism.
> 
> Sure -- I suggest that the single mechanism is none other than 
> *sys.path*.  The .pth files, PYTHONPATH, and a new command-line option 
> merely being ways to set it.

I guess we disagree on what is meant by "single" in this context.


> All of the discussion that's taken place here has sufficed at this point to 
> convince me that sys.path isn't broken at all, and doesn't need 
> fixing.  Some tweaks to 'site' and maybe a new command-line option will 
> suffice to clean everything up quite nicely.
> 
> I say this because all of the version and dependency management things that 
> people are asking about can already be achieved by setuptools, so clearly 
> the underlying machinery is fine.  It wasn't until this message of yours 
> that I realized that you are trying to solve a bunch of problems that are 
> quite solvable within the existing machinery.  I was mainly interested in 
> cleaning up the final awkwardness that's effectively caused by lack of .pth 
> support for the startup script directory.

Indeed, everything is solvable within the existing machinery.  But it's
not a question of whether it's solvable, it's a question of whether we
can make things better.  When I have had occasion to use .pth files,
I've been somewhat disappointed.  Given even the few functions I've
defined for an API, or the .pth variant I described, I know I wouldn't
be disappointed in trying to set up independent package version
installations, application environments, etc.  They all come fairly
naturally.


> > > I'm not sure of that, since I don't yet know how your approach would deal
> > > with namespace packages, which are distributed in pieces and assembled
> > > later.  For example, many PEAK and Zope distributions live in the peak.*
> > > and zope.* package namespaces, but are installed separately, and glued
> > > together via __path__ changes (see the pkgutil docs).
> >
> >     packages.register('zope', '/path/to/zope')
> >
> >And if the installation path is different:
> >
> >     packages.register('zope.subpackage', '/different/path/to/subpackage/')
> >
> >Otherwise the importer will know where the zope (or peak) package exists
> >in the filesystem (or otherwise), and search it whenever 'from zope
> >import ...' is performed.
> 
> If you're talking about replacing the current import machinery, you would 
> have to leave this to Py3K, otherwise all you've done is add a *new* import 
> hook, i.e. a "sys.package_loaders" dictionary or some such.

It could coexist happily next to sys.path-based machinery, and it is
likely easier for it to do so (replacing the sys.path bits in the core
language is more work than I would be willing to do).


> If you wanted something like that now, of course, you could slap an 
> importer into sys.meta_path that then did a lookup in 
> sys.package_loaders.  Getting this mechanism bootstrapped, however, is left 
> as an exercise for the reader.  ;)

I just about cry every time I think about adding an import hook.  If
others think that this functionality has legs to stand on, I may just
have to get help from experienced users.


> Note, by the way, that it might be quite possible to do away with 
> everything but sys.meta_path in Py3K, prepopulated with such an importer 
> (along with ones to support builtin and frozen modules).  You could then 
> import a backward-compatibility module that would add support for sys.path 
> and for package __path__ attributes, by adding a new entry to 
> sys.meta_path.  But this is strictly a pipe dream where Python 2.x is 
> concerned.

Indeed, actually removing sys.path from 2.x is a non-starter.  But
replacing user-level modifications of sys.path with calls to a registry? 
That seems possible, if not desirable, from a "let us not monkey patch
the Python runtime" perspective.


 - Josiah


From glyph at divmod.com  Sat Sep 23 02:35:04 2006
From: glyph at divmod.com (glyph at divmod.com)
Date: Fri, 22 Sep 2006 20:35:04 -0400
Subject: [Python-Dev] Suggestion for a new built-in - flatten
In-Reply-To: <45143FA6.3020600@voidspace.org.uk>
Message-ID: <20060923003504.1717.1400241516.divmod.quotient.57242@ohm>

On Fri, 22 Sep 2006 20:55:18 +0100, Michael Foord <fuzzyman at voidspace.org.uk> wrote:

>glyph at divmod.com wrote:
>>On Fri, 22 Sep 2006 18:43:42 +0100, Michael Foord 
>><fuzzyman at voidspace.org.uk> wrote:

>>This wouldn't be a problem except that everyone has a different idea of 
>>those requirements:).

You didn't really address this, and it was my main point.  In fact, you more or less made my point for me.  You just assume that the type of application you have in mind right now is the only one that wants to use a flatten function, and dismiss out of hand any uses that I might have in mind.

>If you consume iterables, and only special case strings - then none of the 
>issues you raise above seem to be a problem.

You have just made two major policy decisions about the flattener without presenting a specific use case or set of use cases it is meant to be restricted to.

For example, you suggest special casing strings.  Why?  Your guideline otherwise is to follow what the iter() or list() functions do.  What about user-defined classes which subclass str and implement __iter__?
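
For reference, the kind of flattener being discussed is roughly the
following sketch (iterables are recursed into, strings are treated as
leaves; it is illustrative only, not anyone's actual proposal, and it
dodges exactly the str-subclass question just raised):

    def flatten(iterable):
        # Yield leaves depth-first; strings count as atomic even though
        # they are themselves iterable.
        for item in iterable:
            if isinstance(item, str):
                yield item
                continue
            try:
                subiter = iter(item)
            except TypeError:
                yield item
            else:
                for leaf in flatten(subiter):
                    yield leaf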

>Sets and dictionaries are both iterable.
>
>If it's not iterable it's an element.
>
>I'd prefer to see this as a built-in, lots of people seem to want it. IMHO

Can you give specific examples?  The only significant use of a flattener I'm intimately familiar with (Nevow) works absolutely nothing like what you described.

>Having it in itertools is a good compromise.

No need to compromise with me.  I am not in a position to reject your change.  No particular reason for me to make any concessions either: I'm simply trying to communicate the fact that I think this is a terrible idea, not come to an agreement with you about how progress might be made.  Absolutely no changes on this front are A-OK by me :).

You have made a case for the fact that, perhaps, you should have a utility library which all your projects could use, for consistency and to avoid repeating yourself, since you have a clearly defined need for what a flattener should do.  I haven't read anything that indicates there's a good reason for this function to be in the standard library.  What are the use cases?

It's definitely better for the core language to define lots of basic types so that you can say something in a library like "returns a dict mapping strings to ints" without having a huge argument about what "dict" and "string" and "int" mean.  What's the benefit to having everyone flatten things the same way, though?  Flattening really isn't that common of an operation, and in the cases where it's needed, a unified approach would only help if you had two flattenable data-structures from different libraries which needed to be combined.  I can't say I've ever seen a case where that would happen, let alone for it to be common enough that there should be something in the core language to support it.

>>What do you do if you encounter a function?  This is kind of a trick 
>>question, since Nevow's "flattener" *calls* functions as it encounters 
>>them, then treats the *result* of calling them as further input.
>>
>Sounds like not what anyone would normally expect.

Of course not.  My point is that there is nothing that anyone would "normally" expect from a flattener except a few basic common features.  Bob's use-case is completely different from yours, for example: he's talking about flattening to support high-performance I/O.

>What does the list constructor do with these ? Do the same.

>>> list('hello')
['h', 'e', 'l', 'l', 'o']

What more can I say?

>>Do you produce the output as a structured list or an iterator that works 
>>incrementally?

>Either would be fine. I had in mind a list, but converting an iterator into 
>a list is trivial.

There are applications where this makes a big difference.  Bob, for example, suggested that this should only work on structures that support the PySequence_Fast operations.

>>Also, at least Nevow uses "flatten" to mean "serialize to bytes", not 
>>"produce a flat list", and I imagine at least a few other web frameworks do 
>>as well.  That starts to get into encoding issues.

>Not a use of the term I've come across. On the other hand I've heard of 
>flatten in the context of nested data-structures many times.

Nevertheless the only respondent even mildly in favor of your proposal so far also mentions flattening sequences of bytes, although not quite as directly.

>I think that you're over complicating it and that the term flatten is really 
>fairly straightforward. Especially if it's clearly documented in terms of 
>consuming iterables.

And I think that you're over-simplifying.  If you can demonstrate that there is really a broad consensus that this sort of thing is useful in a wide variety of applications, then sure, I wouldn't complain too much.  But I've spent a LOT of time thinking about what "flattening" is, and several applications that I've worked on have very different ideas about how it should work, and I see very little benefit to unifying them.  That's just the work of one programmer; I have to assume that the broader domain of all applications which do structure flattening is much more diverse.

From greg.ewing at canterbury.ac.nz  Sat Sep 23 03:34:56 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 23 Sep 2006 13:34:56 +1200
Subject: [Python-Dev] list.discard? (Re: dict.discard)
In-Reply-To: <17683.21448.909748.200493@montanaro.dyndns.org>
References: <20060921134249.GA9238@niemeyer.net>
	<45132C6D.9010806@canterbury.ac.nz>
	<17683.21448.909748.200493@montanaro.dyndns.org>
Message-ID: <45148F40.40802@canterbury.ac.nz>

skip at pobox.com wrote:

> It's obvious for sets and dictionaries that there is only one thing to
> discard and that after the operation you're guaranteed the key no longer
> exists.  Would you want the same semantics for lists or the semantics of
> list.remove where it only removes the first instance?

In my use cases I usually know that there is either
zero or one occurrence in the list.

But maybe it would be more useful to have a remove_all()
method, whose behaviour with zero occurrences would just
be a special case.

Or maybe remove() should just do nothing if the item is
not found. I don't think I've ever found getting an exception
from it to be useful, and I've often found it a nuisance.
What experiences have others had with it?
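
As free functions, just to make the two behaviours concrete (the names
are made up):

    def discard(lst, item):
        # list.remove() semantics, minus the ValueError when absent.
        try:
            lst.remove(item)
        except ValueError:
            pass

    def remove_all(lst, item):
        # Remove every occurrence; zero occurrences is just the trivial case.
        lst[:] = [x for x in lst if x != item]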

--
Greg

From nnorwitz at gmail.com  Sat Sep 23 06:51:38 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 22 Sep 2006 21:51:38 -0700
Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code
In-Reply-To: <BAY112-W5EEFA3A998FA1B8EA7AEE9E210@phx.gbl>
References: <BAY112-W5EEFA3A998FA1B8EA7AEE9E210@phx.gbl>
Message-ID: <ee2a432c0609222151k2bf1a211u44d9e44dcc6bbf5d@mail.gmail.com>

On 9/22/06, Johnny Lee <typo_pl at hotmail.com> wrote:
>
> Hello,
> My name is Johnny Lee. I have developed a *ahem* perl script which scans
> C/C++ source files for typos.

Hi Johnny.

Thanks for running your script, even if it is written in Perl and was run
on Windows. :-)

> The Python 2.5 typos can be classified into 7 types.
>
> 2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size);
> If realloc() fails, it will return NULL. If you assign the return value to
> the same variable you passed into realloc,
> then you've overwritten the variable and possibly leaked the memory that the
> variable pointed to.

A bunch of these warnings were accurate and a bunch were not.  There
were 2 reasons for the false positives.  1) The pointer was aliased,
thus not lost, 2) On failure, we exited (Parser/*.c)

> 4) if ((X!=0) || (X!=1))

These 2 cases occurred in binascii.  I have no idea if the warning is
right or the code is.

> 6) XX;;
> Just being anal here. Two semicolons in a row. Second one is extraneous.

I already checked in a fix for these on HEAD.  Hard for even me to
screw up those fixes. :-)

> 7) extraneous test for non-NULL ptr
> Several memory calls that free memory accept NULL ptrs.
> So testing for NULL before calling them is redundant and wastes code space.
> Now some codepaths may be time-critical, but probably not all, and smaller
> code usually helps.

I ignored these as I'm not certain all the platforms we run on accept
free(NULL).

Below is my categorization of the warnings except #7.  Hopefully
someone will fix all the real problems in the first batch.

Thanks again!

n
--

# Problems
Objects\fileobject.c (338):     realloc overwrite src if NULL; 17:
file->f_setbuf=(char*)PyMem_Realloc(file->f_setbuf,bufsize)
Objects\fileobject.c (342):     using PyMem_Realloc result w/no check
30: setvbuf(file->f_fp, file->f_setbuf, type, bufsize);
[file->f_setbuf]
Objects\listobject.c (2619):    using PyMem_MALLOC result w/no check
30: garbage[i] = selfitems[cur]; [garbage]
Parser\myreadline.c (144):      realloc overwrite src if NULL; 17:
p=(char*)PyMem_REALLOC(p,n+incr)
Modules\_csv.c (564):           realloc overwrite src if NULL; 17:
self->field=PyMem_Realloc(self->field,self->field_size)
Modules\_localemodule.c (366):  realloc overwrite src if NULL; 17:
buf=PyMem_Realloc(buf,n2)
Modules\_randommodule.c (290):  realloc overwrite src if NULL; 17:
key=(unsigned#long*)PyMem_Realloc(key,bigger*sizeof(*key))
Modules\arraymodule.c (1675):   realloc overwrite src if NULL; 17:
self->ob_item=(char*)PyMem_REALLOC(self->ob_item,itemsize*self->ob_size)
Modules\cPickle.c (536):        realloc overwrite src if NULL; 17:
self->buf=(char*)realloc(self->buf,n)
Modules\cPickle.c (592):        realloc overwrite src if NULL; 17:
self->buf=(char*)realloc(self->buf,bigger)
Modules\cPickle.c (4369):       realloc overwrite src if NULL; 17:
self->marks=(int*)realloc(self->marks,s*sizeof(int))
Modules\cStringIO.c (344):      realloc overwrite src if NULL; 17:
self->buf=(char*)realloc(self->buf,self->buf_size)
Modules\cStringIO.c (380):      realloc overwrite src if NULL; 17:
oself->buf=(char*)realloc(oself->buf,oself->buf_size)
Modules\_ctypes\_ctypes.c (2209):       using PyMem_Malloc result w/no
check 30: memset(obj->b_ptr, 0, dict->size); [obj->b_ptr]
Modules\_ctypes\callproc.c (1472):      using PyMem_Malloc result w/no
check 30: strcpy(conversion_mode_encoding, coding);
[conversion_mode_encoding]
Modules\_ctypes\callproc.c (1478):      using PyMem_Malloc result w/no
check 30: strcpy(conversion_mode_errors, mode);
[conversion_mode_errors]
Modules\_ctypes\stgdict.c (362):        using PyMem_Malloc result w/no
check 30: memset(stgdict->ffi_type_pointer.elements, 0,
[stgdict->ffi_type_pointer.elements]
Modules\_ctypes\stgdict.c (376):        using PyMem_Malloc result w/no
check 30: memset(stgdict->ffi_type_pointer.elements, 0,
[stgdict->ffi_type_pointer.elements]

# No idea if the code or tool is right.
Modules\binascii.c (1161)
Modules\binascii.c (1231)

# Platform specific files.  I didn't review and won't fix without testing.
Python\thread_lwp.h (107):      using malloc result w/no check 30:
lock->lock_locked = 0; [lock]
Python\thread_os2.h (141):      using malloc result w/no check 30:
(long)sem)); [sem]
Python\thread_os2.h (155):      using malloc result w/no check 30:
lock->is_set = 0; [lock]
Python\thread_pth.h (133):      using malloc result w/no check 30:
memset((void *)lock, '\0', sizeof(pth_lock)); [lock]
Python\thread_solaris.h (48):   using malloc result w/no check 30:
funcarg->func = func; [funcarg]
Python\thread_solaris.h (133):  using malloc result w/no check 30:
if(mutex_init(lock,USYNC_THREAD,0)) [lock]

# Who cares about these modules.
Modules\almodule.c:182
Modules\svmodule.c:547

# Not a problem.
Parser\firstsets.c (76)
Parser\grammar.c (40)
Parser\grammar.c (59)
Parser\grammar.c (83)
Parser\grammar.c (102)
Parser\node.c (95)
Parser\pgen.c (52)
Parser\pgen.c (69)
Parser\pgen.c (126)
Parser\pgen.c (438)
Parser\pgen.c (462)
Parser\tokenizer.c (797)
Parser\tokenizer.c (869)
Modules\_bsddb.c (2633)
Modules\_csv.c (1069)
Modules\arraymodule.c (1871)
Modules\gcmodule.c (1363)
Modules\zlib\trees.c (375)

From martin at v.loewis.de  Sat Sep 23 07:27:20 2006
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 23 Sep 2006 07:27:20 +0200
Subject: [Python-Dev] Pep 353: Py_ssize_t advice
In-Reply-To: <87k63vzguu.fsf@pereiro.luannocracy.com>
References: <871wq3eo5e.fsf@pereiro.luannocracy.com>	<45144432.6010304@v.loewis.de>
	<87k63vzguu.fsf@pereiro.luannocracy.com>
Message-ID: <4514C5B8.1070903@v.loewis.de>

David Abrahams schrieb:
> b. We were using C++, which IIRC does not allow such redefinition

You remember incorrectly. 16.3/2 (cpp.replace) says

# An identifier currently defined as a macro without use of lparen (an
# object-like macro) may be redefined by another #define preprocessing
# directive provided that the second definition is an object-like macro
# definition and the two replacement lists are identical, otherwise the
# program is ill-formed.

> c. anyway you'll get a nasty warning, which for some people will be 
> just as bad as an error

Try for yourself. You get the warning only if the redefinition is not
identical to the original definition (or an object-like macro is
redefined as a function-like macro or vice versa).

Regards,
Martin

From martin at v.loewis.de  Sat Sep 23 07:33:05 2006
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Sep 2006 07:33:05 +0200
Subject: [Python-Dev] GCC patch for catching errors in PyArg_ParseTuple
In-Reply-To: <00c401c6de98$64cefd80$4bbd2997@bagio>
References: <451445CF.7080407@v.loewis.de>
	<00c401c6de98$64cefd80$4bbd2997@bagio>
Message-ID: <4514C711.9090003@v.loewis.de>

Giovanni Bajo schrieb:
> A way not to maintain this patch forever would be to devise a way to make
> format syntax "pluggable" / "scriptable". There have been previous discussions
> on the GCC mailing lists.

Perhaps. I very much doubt that this can or will be done, in a way that
would support PyArg_ParseTuple. It's probably easier to replace
PyArg_ParseTuple with something that can be statically checked by any
compiler.

Regards,
Martin

From anthony at interlink.com.au  Sat Sep 23 10:40:02 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sat, 23 Sep 2006 18:40:02 +1000
Subject: [Python-Dev] AST structure and maintenance branches
Message-ID: <200609231840.03859.anthony@interlink.com.au>

I'd like to propose that the AST format returned by passing PyCF_ONLY_AST to 
compile() get the same guarantee in maintenance branches as the bytecode 
format - that is, unless it's absolutely necessary, we'll keep it the same. 
Otherwise anyone trying to write tools to manipulate the AST is in for a 
massive world of hurt.
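
For anyone who hasn't played with it, the flag is used roughly like
this (illustrative only):

    import _ast

    # Ask compile() for the AST instead of a code object.
    tree = compile("x = 1 + 2", "<string>", "exec", _ast.PyCF_ONLY_AST)
    print(type(tree))          # an _ast.Module instance
    print(type(tree.body[0]))  # an _ast.Assign node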

Anyone have any problems with this, or can it be added to PEP 6?

Anthony

From barry at barrys-emacs.org  Sat Sep 23 14:06:34 2006
From: barry at barrys-emacs.org (Barry Scott)
Date: Sat, 23 Sep 2006 13:06:34 +0100
Subject: [Python-Dev] Maybe we should have a C++ extension for testing...
In-Reply-To: <17672.17407.88122.884957@montanaro.dyndns.org>
References: <17672.17407.88122.884957@montanaro.dyndns.org>
Message-ID: <EF66DADD-CA2A-4C00-8949-EB1726B1D0B2@barrys-emacs.org>


On Sep 13, 2006, at 18:46, skip at pobox.com wrote:

>
> Building Python with C and then linking in extensions written in or wrapped
> with C++ can present problems, at least in some situations.  I don't know if
> it's kosher to build that way, but folks do.  We're bumping into such
> problems at work using Solaris 10 and Python 2.4 (building matplotlib, which
> is largely written in C++), and it appears others have similar problems:
>
>     http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6395191
>     http://mail.python.org/pipermail/patches/2005-June/017820.html
>     http://mail.python.org/pipermail/python-bugs-list/2005-November/030900.html
>
> I attached a comment to the third item yesterday (even though it was
> closed).
>
> One of our C++ gurus (that's definitely not me!) patched the Python source
> to include <wchar.h> at the top of Python.h.  That seems to have solved our
> problems, but seems to be a symptomatic fix.  I got to thinking, should we
> a) encourage people to compile Python with a C++ compiler if most/all of
> their extensions are written in C++ anyway (does that even work if one or
> more extensions are written in C?), or b) should the standard distribution
> maybe include a toy extension written in C++ whose sole purpose is to test
> for cross-language problems?

Mixing of C and C++ code is fully supported by the compilers and  
linkers.
There is no need to compile the python core as C++ code, indeed if you
did only C++ extension could use it!

In the distant past there were problems with some unix distributions
linking Python in such a way that C++ code would not initialise.  The
major distributions seem to have sorted these problems out.
But clearly Solaris has a problem.

It would be worth finding out why it was necessary to include <wchar.h>
to fix the problems.  If you do add a C++ test extension it will need
to do whatever it was that <wchar.h> fixes.

From what I can remember, attempts to use std::cout would fail and I
think static object initialisation would fail.  The test code would
need to do all these things and verify they are working.

Barry (PyCXX cxx.sourceforge.net)




From martin at v.loewis.de  Sat Sep 23 14:43:55 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Sep 2006 14:43:55 +0200
Subject: [Python-Dev] Maybe we should have a C++ extension for testing...
In-Reply-To: <17672.17407.88122.884957@montanaro.dyndns.org>
References: <17672.17407.88122.884957@montanaro.dyndns.org>
Message-ID: <45152C0B.5070202@v.loewis.de>

skip at pobox.com schrieb:
> One of our C++ gurus (that's definitely not me!) patched the Python source
> to include <wchar.h> at the top of Python.h.  That seems to have solved our
> problems, but seems to be a symptomatic fix.

Indeed. The right fix is likely different, and relates to the question
what API Sun defines in its header files, and which of these which gcc
version uses.

> I got to thinking, should we
> a) encourage people to compile Python with a C++ compiler if most/all of
> their extensions are written in C++ anyway (does that even work if one or
> more extensions are written in C?)

I can't see how this could help. The problem you have is specific to
Solaris, and specific to using GCC on Solaris. This is just a tiny
fraction of Python users. Without further investigation, it might
be even depending on the specific version of GCC being used (and
the specific Solaris version).

> or b) should the standard distribution
> maybe include a toy extension written in C++ whose sole purpose is to test
> for cross-language problems?

Again, this isn't likely to help. If such a problem exists, it is only
found when somebody builds Python on that platform. You are perhaps the
first one to do so in this specific combination, so you would have
encountered the problem first. Would that have helped you?

> Either/or/neither/something else?

Something else. Find and understand all platform quirks on platforms we
encounter, and come up with a solution. Fix them one by one, as we
encounter them, and document all work-arounds being made, so we can
take them out when the system disappears (or subsequent releases fix
the platform bugs).

Doing so requires a good understanding of C and C++, of course.

Regards,
Martin

From dave at boost-consulting.com  Sat Sep 23 15:14:24 2006
From: dave at boost-consulting.com (David Abrahams)
Date: Sat, 23 Sep 2006 09:14:24 -0400
Subject: [Python-Dev] Pep 353: Py_ssize_t advice
In-Reply-To: <4514C5B8.1070903@v.loewis.de> (Martin v. =?utf-8?Q?L=C3=B6wi?=
	=?utf-8?Q?s's?= message of "Sat, 23 Sep 2006 07:27:20 +0200")
References: <871wq3eo5e.fsf@pereiro.luannocracy.com>
	<45144432.6010304@v.loewis.de> <87k63vzguu.fsf@pereiro.luannocracy.com>
	<4514C5B8.1070903@v.loewis.de>
Message-ID: <874puy209b.fsf@pereiro.luannocracy.com>

"Martin v. L?wis" <martin at v.loewis.de> writes:

>> c. anyway you'll get a nasty warning, which for some people will be 
>> just as bad as an error
>
> Try for yourself. You get the warning only if the redefinition is not
> identical to the original definition (or an object-like macro is
> redefined as a function-like macro or vice versa).

I'm confident that whether you get the warning otherwise is dependent
both on the compiler and the compiler-flags you use.

But this question is academic now, I think, since you accepted my
suggestion.
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From skip at pobox.com  Sat Sep 23 15:27:12 2006
From: skip at pobox.com (skip at pobox.com)
Date: Sat, 23 Sep 2006 08:27:12 -0500
Subject: [Python-Dev] list.discard? (Re: dict.discard)
In-Reply-To: <45148F40.40802@canterbury.ac.nz>
References: <20060921134249.GA9238@niemeyer.net>
	<45132C6D.9010806@canterbury.ac.nz>
	<17683.21448.909748.200493@montanaro.dyndns.org>
	<45148F40.40802@canterbury.ac.nz>
Message-ID: <17685.13872.773665.230012@montanaro.dyndns.org>


    Greg> Or maybe remove() should just do nothing if the item is not
    Greg> found. 

If that's the case, I'd argue that dict.remove and set.remove should behave
the same way, making .discard unnecessary.  OTOH, perhaps lists should grow
a .discard method.

Skip

From glassfordm at hotmail.com  Fri Sep 22 14:24:32 2006
From: glassfordm at hotmail.com (Michael Glassford)
Date: Fri, 22 Sep 2006 08:24:32 -0400
Subject: [Python-Dev] Python 2.5 bug? Changes in behavior of traceback module
Message-ID: <ef0km3$ujn$1@sea.gmane.org>

In Python 2.4, traceback.print_exc() and traceback.format_exc() silently 
do nothing if there is no active exception; in Python 2.5, they raise an 
exception. Not too difficult to handle, but unexpected (and a pain if 
you use it in a lot of places). I assume it was an unintentional change?

Mike



In Python 2.4:

 >>> import traceback
 >>> traceback.print_exc()
None
 >>> traceback.format_exc()
'None\n'



In Python 2.5:

 >>> import traceback
 >>> traceback.print_exc()
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File 
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", 
line 227, in print_exc
     print_exception(etype, value, tb, limit, file)
   File 
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", 
line 126, in print_exception
     lines = format_exception_only(etype, value)
   File 
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", 
line 176, in format_exception_only
     stype = etype.__name__
AttributeError: 'NoneType' object has no attribute '__name__'
 >>> traceback.format_exc()
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File 
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", 
line 236, in format_exc
     return ''.join(format_exception(etype, value, tb, limit))
   File 
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", 
line 145, in format_exception
     list = list + format_exception_only(etype, value)
   File 
"/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/traceback.py", 
line 176, in format_exception_only
     stype = etype.__name__
AttributeError: 'NoneType' object has no attribute '__name__'


From skip at pobox.com  Sat Sep 23 15:37:38 2006
From: skip at pobox.com (skip at pobox.com)
Date: Sat, 23 Sep 2006 08:37:38 -0500
Subject: [Python-Dev] Maybe we should have a C++ extension for testing...
In-Reply-To: <45152C0B.5070202@v.loewis.de>
References: <17672.17407.88122.884957@montanaro.dyndns.org>
	<45152C0B.5070202@v.loewis.de>
Message-ID: <17685.14498.20311.248692@montanaro.dyndns.org>


    Martin> The problem you have is specific to Solaris, and specific to
    Martin> using GCC on Solaris. 

So can we fix this in pyport.h or with suitable Configure script
machinations?  Even though the current patch we're using is trivial I'd
really like to avoid patching the Python distribution when we install it.

Skip

From martin at v.loewis.de  Sat Sep 23 16:23:17 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Sep 2006 16:23:17 +0200
Subject: [Python-Dev] Maybe we should have a C++ extension for testing...
In-Reply-To: <17685.14498.20311.248692@montanaro.dyndns.org>
References: <17672.17407.88122.884957@montanaro.dyndns.org>
	<45152C0B.5070202@v.loewis.de>
	<17685.14498.20311.248692@montanaro.dyndns.org>
Message-ID: <45154355.4040000@v.loewis.de>

skip at pobox.com schrieb:
>     Martin> The problem you have is specific to Solaris, and specific to
>     Martin> using GCC on Solaris. 
> 
> So can we fix this in pyport.h or with suitable Configure script
> machinations?  Even though the current patch we're using is trivial I'd
> really like to avoid patching the Python distribution when we install it.

Yes. However, to do so, somebody would have to understand the problem
in detail first.

Regards,
Martin

From mwh at python.net  Sat Sep 23 16:59:14 2006
From: mwh at python.net (Michael Hudson)
Date: Sat, 23 Sep 2006 15:59:14 +0100
Subject: [Python-Dev] AST structure and maintenance branches
In-Reply-To: <200609231840.03859.anthony@interlink.com.au> (Anthony Baxter's
	message of "Sat, 23 Sep 2006 18:40:02 +1000")
References: <200609231840.03859.anthony@interlink.com.au>
Message-ID: <2mu02yk4sd.fsf@starship.python.net>

Anthony Baxter <anthony at interlink.com.au> writes:

> I'd like to propose that the AST format returned by passing PyCF_ONLY_AST to 
> compile() get the same guarantee in maintenance branches as the bytecode 
> format - that is, unless it's absolutely necessary, we'll keep it the same. 
> Otherwise anyone trying to write tools to manipulate the AST is in for a 
> massive world of hurt.
>
> Anyone have any problems with this, or can it be added to PEP 6?

Sounds like a good idea.

Cheers,
mwh

-- 
  Reading Slashdot can [...] often be worse than useless, especially
  to young and budding programmers: it can give you exactly the wrong
  idea about the technical issues it raises.
 -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#reasons

From gh at ghaering.de  Sat Sep 23 19:31:00 2006
From: gh at ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=)
Date: Sat, 23 Sep 2006 19:31:00 +0200
Subject: [Python-Dev] Need help with C - problem in sqlite3 module
Message-ID: <45156F54.3010606@ghaering.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Looks like I don't know C so well after all ...

Apparently at least gcc on Linux exports all symbols by default that are
not static. This creates problems with Python extensions that export
symbols that are also used in other contexts. For example some people use
Python and the sqlite3 module under Apache, and the sqlite3 module exports
a symbol cache_init, but cache_init is also used by Apache's mod_cache
module. Thus there are crashes when using the sqlite3 module that only
occur in the mod_python context.

Can somebody with more knowledge about C tell me how to fix the sqlite3
module or compiler settings for distutils so that this does not happen?

Of course this only happens because the sqlite3 module is distributed among
multiple .c files and thus I couldn't make everything "static".

Thanks in advance.

- -- Gerhard
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFFW9UdIO4ozGCH14RApFQAKC+BJd8mGlCXJa89swOcMvASoj6GgCfZxf+
tZ/iVO8xTEV7qNeXBcDT0WU=
=lX07
-----END PGP SIGNATURE-----

From jeremy.kloth at 4suite.org  Sat Sep 23 20:07:29 2006
From: jeremy.kloth at 4suite.org (Jeremy Kloth)
Date: Sat, 23 Sep 2006 12:07:29 -0600
Subject: [Python-Dev] Need help with C - problem in sqlite3 module
In-Reply-To: <45156F54.3010606@ghaering.de>
References: <45156F54.3010606@ghaering.de>
Message-ID: <200609231207.30418.jeremy.kloth@4suite.org>

On Saturday, September 23, 2006 11:31 am, Gerhard Häring wrote:
> Looks like I don't know C so well after all ...
>
> Apparently at least gcc on Linux exports all symbols by default that are
> not static. This creates problems with Python extensions that export
> symbols that are also used in other contexts. For example some people use
> Python and the sqlite3 module under Apache, and the sqlite3 module exports
> a symbol cache_init, but cache_init is also used by Apache's mod_cache
> module. Thus there are crashes when using the sqlite3 module that only
> occur in the mod_python context.
>
> Can somebody with more knowledge about C tell me how to fix the sqlite3
> module or compiler settings for distutils so that this does not happen?
>
> Of course this only happens because the sqlite3 module is distributed among
> multiple .c files and thus I couldn't make everything "static".

GCC's symbol visibility is supposed to address this exact problem.  It would 
be nice if -fvisibility=hidden was used to build Python (and its extensions) 
by default on supported platforms/compilers.  It shouldn't be much of an 
issue wrt. exported symbols as they already need to be tracked for Windows 
where symbols are hidden by default (unlike traditional *nix).

-- 
Jeremy Kloth
http://4suite.org/

From brett at python.org  Sat Sep 23 21:12:05 2006
From: brett at python.org (Brett Cannon)
Date: Sat, 23 Sep 2006 12:12:05 -0700
Subject: [Python-Dev] AST structure and maintenance branches
In-Reply-To: <200609231840.03859.anthony@interlink.com.au>
References: <200609231840.03859.anthony@interlink.com.au>
Message-ID: <bbaeab100609231212q65c07693ub739e0a29a74bd05@mail.gmail.com>

On 9/23/06, Anthony Baxter <anthony at interlink.com.au> wrote:
>
> I'd like to propose that the AST format returned by passing PyCF_ONLY_AST
> to
> compile() get the same guarantee in maintenance branches as the bytecode
> format - that is, unless it's absolutely necessary, we'll keep it the
> same.
> Otherwise anyone trying to write tools to manipulate the AST is in for a
> massive world of hurt.
>
> Anyone have any problems with this, or can it be added to PEP 6?


Works for me.

-Brett

From david.nospam.hopwood at blueyonder.co.uk  Sat Sep 23 22:00:29 2006
From: david.nospam.hopwood at blueyonder.co.uk (David Hopwood)
Date: Sat, 23 Sep 2006 21:00:29 +0100
Subject: [Python-Dev] Pep 353: Py_ssize_t advice
In-Reply-To: <45144432.6010304@v.loewis.de>
References: <871wq3eo5e.fsf@pereiro.luannocracy.com>
	<45144432.6010304@v.loewis.de>
Message-ID: <4515925D.6010408@blueyonder.co.uk>

Martin v. Löwis wrote:
> David Abrahams schrieb:
> 
>>(C++ allows restating of typedefs; if C allows it, that should be
>>something like):
> 
> C also allows this; [...]

This is nitpicking, since you agreed the change to the PEP, but are you
sure that C allows this?

From C99 + TC1 + TC2 (http://www.open-std.org/JTC1/SC22/WG14/www/standards):

# 6.2.2  Linkages of identifiers
#
# 6  The following identifiers have no linkage: an identifier declared
#    to be anything other than an object or a function; [...]

(i.e. typedef identifiers have no linkage)

# 6.7  Declarations
#
# Constraints
# 3  If an identifier has no linkage, there shall be no more than one
#    declaration of the identifier (in a declarator or type specifier)
#    with the same scope and in the same name space, except for tags as
#    specified in 6.7.2.3.

# 6.7.2.3  Tags
#
# Constraints
# 1  A specific type shall have its content defined at most once.

(There is nothing else in 6.7.2.3 that applies to typedefs.)

Since 6.7 (3) and 6.7.2.3 (1) are constraints, I read this as saying that
a C99 implementation must produce a diagnostic if a typedef is redeclared
in the same scope. If the program is run despite the diagnostic, its behaviour
is undefined.

Several C compilers I've used in the past have needed the idempotence guard
on typedefs, in any case.

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>




From martin at v.loewis.de  Sat Sep 23 22:18:32 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Sep 2006 22:18:32 +0200
Subject: [Python-Dev] Need help with C - problem in sqlite3 module
In-Reply-To: <45156F54.3010606@ghaering.de>
References: <45156F54.3010606@ghaering.de>
Message-ID: <45159698.9080405@v.loewis.de>

Gerhard Häring schrieb:
> Apparently at least gcc on Linux exports all symbols by default that are
> not static.

Correct. Various factors influence run-time symbol binding, though.

> This creates problems with Python extensions that export
> symbols that are also used in other contexts. For example some people use
> Python and the sqlite3 module under Apache, and the sqlite3 module exports
> a symbol cache_init, but cache_init is also used by Apache's mod_cache
> module. Thus there are crashes when using the sqlite3 module that only
> occur in the mod_python context.
> 
> Can somebody with more knowledge about C tell me how to fix the sqlite3
> module or compiler settings for distutils so that this does not happen?

The only reliable way is to do renaming. This was one of the primary
reasons of the "grand renaming" in Python, where the Py prefix was
introduced.

> Of course this only happens because the sqlite3 module is distributed among
> multiple .c files and thus I couldn't make everything "static".

In the specific case, I can't understand that reason. cache_init is
declared in cache.c, and only used in cache.c (to fill a tp_init slot).
So just make the symbol static.

As a lesson learned, you should go through the module and make
all functions static, then see which functions really need to be
extern.  You should then rename those functions, say by adding
a PySQLite prefix.  All remaining dynamic symbols should then have
the PySQLite prefix, except for init_sqlite3.

In fact, since most operations in Python go through function
pointers, there is typically very little need for extern
functions in a Python extension module, even if that module
consists of multiple C files.

Regards,
Martin

P.S. Currently, on my system, the following symbols are extern in
this module

00005890 T _authorizer_callback
0000dec0 A __bss_start
00007600 T _build_column_name
00005df0 T _build_py_params
00007ee0 T build_row_cast_map
00004880 T cache_dealloc
00004990 T cache_display
00004b90 T cache_get
00004da0 T cache_init
00004930 T cache_setup_types
0000d4a0 D CacheType
00004e80 T check_connection
00009f60 T check_remaining_sql
00005420 T check_thread
00006430 T _connection_begin
00005cb0 T connection_call
000068d0 T connection_close
000061c0 T connection_commit
000059b0 T connection_create_aggregate
00005ab0 T connection_create_function
000057a0 T connection_cursor
00006530 T connection_dealloc
00005320 T connection_execute
00005220 T connection_executemany
00005120 T connection_executescript
00006970 T connection_init
00006700 T connection_rollback
000056d0 T connection_set_authorizer
000050e0 T connection_setup_types
0000d5e0 D ConnectionType
0000ded8 B converters
000094d0 T converters_init
00007110 T cursor_close
00007190 T cursor_dealloc
00008d90 T cursor_execute
00008d50 T cursor_executemany
000072e0 T cursor_executescript
00007c90 T cursor_fetchall
00007d30 T cursor_fetchmany
00007e10 T cursor_fetchone
000070b0 T cursor_getiter
00007530 T cursor_init
00007b50 T cursor_iternext
000070e0 T cursor_setup_types
0000d980 D CursorType
0000decc B DatabaseError
0000ded4 B DataError
00005bb0 T _drop_unused_statement_references
0000dec0 A _edata
0000def0 B _enable_callback_tracebacks
0000defc A _end
0000dee8 B Error
00007710 T _fetch_one_row
00006cb0 T _final_callback
0000aac4 T _fini
00006830 T flush_statement_cache
00006fa0 T _func_callback
00007e60 T _get_converter
00003bd4 T _init
00009520 T init_sqlite3
0000deec B IntegrityError
0000ded0 B InterfaceError
0000dedc B InternalError
00008dd0 T microprotocols_adapt
00009040 T microprotocols_add
000090e0 T microprotocols_init
000047a0 T new_node
00004810 T node_dealloc
0000d3e0 D NodeType
0000def8 B NotSupportedError
0000dee4 B OperationalError
0000dee0 B OptimizedUnicode
00009ae0 T prepare_protocol_dealloc
00009ac0 T prepare_protocol_init
00009b10 T prepare_protocol_setup_types
0000dec8 B ProgrammingError
0000dec4 B psyco_adapters
00008fc0 T psyco_microprotocols_adapt
000070c0 T pysqlite_noop
00008110 T _query_execute
00006690 T reset_all_statements
0000dd20 D row_as_mapping
00009b50 T row_dealloc
00009e40 T row_init
00009bd0 T row_length
00009c40 T row_setup_types
00009c80 T row_subscript
0000dd40 D RowType
0000a910 T _seterror
00005fc0 T _set_result
00006c70 T _sqlite3_result_error
0000dc60 D SQLitePrepareProtocolType
0000aa30 T _sqlite_step_with_busyhandler
0000a2a0 T statement_bind_parameter
0000a530 T statement_bind_parameters
0000a7f0 T statement_create
0000a0f0 T statement_dealloc
0000a080 T statement_finalize
00009f30 T statement_mark_dirty
0000a210 T statement_recompile
0000a190 T statement_reset
0000a040 T statement_setup_types
0000de00 D StatementType
000076a0 T unicode_from_string
0000def4 B Warning


From martin at v.loewis.de  Sat Sep 23 22:19:39 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Sep 2006 22:19:39 +0200
Subject: [Python-Dev] Need help with C - problem in sqlite3 module
In-Reply-To: <200609231207.30418.jeremy.kloth@4suite.org>
References: <45156F54.3010606@ghaering.de>
	<200609231207.30418.jeremy.kloth@4suite.org>
Message-ID: <451596DB.7060609@v.loewis.de>

Jeremy Kloth schrieb:
> GCC's symbol visibility is supposed to address this exact problem.  It would 
> be nice if -fvisibility=hidden was used to build Python (and its extensions) 
> by default on supported platforms/compilers.  It shouldn't be much of an 
> issue wrt. exported symbols as they already need to be tracked for Windows 
> where symbols are hidden by default (unlike traditional *nix).

Of course, this doesn't help on systems where gcc isn't used. So for
Python itself, we should always look for a solution that works across
compilers.

Regards,
Martin

From martin at v.loewis.de  Sat Sep 23 22:21:06 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Sep 2006 22:21:06 +0200
Subject: [Python-Dev] Pep 353: Py_ssize_t advice
In-Reply-To: <4515925D.6010408@blueyonder.co.uk>
References: <871wq3eo5e.fsf@pereiro.luannocracy.com>	<45144432.6010304@v.loewis.de>
	<4515925D.6010408@blueyonder.co.uk>
Message-ID: <45159732.2000205@v.loewis.de>

David Hopwood schrieb:
>>> (C++ allows restating of typedefs; if C allows it, that should be
>>> something like):
>> C also allows this; [...]
> 
> This is nitpicking, since you agreed the change to the PEP, but are you
> sure that C allows this?

I was sure, but I was also wrong. Thanks for pointing that out.

Regards,
Martin

From arigo at tunes.org  Sun Sep 24 00:15:11 2006
From: arigo at tunes.org (Armin Rigo)
Date: Sun, 24 Sep 2006 00:15:11 +0200
Subject: [Python-Dev] New relative import issue
In-Reply-To: <ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
Message-ID: <20060923221510.GA21803@code0.codespeak.net>

Hi Guido,

On Thu, Sep 21, 2006 at 07:22:04AM -0700, Guido van Rossum wrote:
> sys.path exists to stitch together the toplevel module/package
> namespace from diverse sources.
> 
> Import hooks and sys.path hackery exist so that module/package sources
> don't have to be restricted to the filesystem (as well as to allow
> unbridled experimentation by those so inclined :-).

This doesn't match my experience, which is that sys.path hackery is
required in any project that is larger than one directory, but is not
itself a library.  The basic assumption is that I don't want to put
whole applications in 'site-packages' or in my $PYTHONPATH; I would like
them to work in a self-contained, zero-installation way, much like they
do if they are built from several modules in a single directory.

For example, consider an application with the following structure:

   myapp/
      main.py
      a/
         __init__.py
         b.py
         test_b.py
      c/
         __init__.py

This theoretical example shows main.py (the main entry point) at the
root of the package directories - it is the only place where it can be
if it needs to import the packages a and c.  The module a.b can import
c, too (and this is not bad design - think about c as a package
regrouping utilities that make sense for the whole application).  But
then the testing script test_b.py cannot import the whole application
any more.  Imports of a or c will fail, and even a relative import of b
will crash when b tries to import c.  The only way I can think of is to
insert the root directory in sys.path from within test_b.py, and then
use absolute imports.
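
Concretely, the kind of hack I mean looks more or less like this at the top
of test_b.py (just a sketch):

   import os, sys
   # climb two directories up from a/test_b.py to reach the myapp/ root
   _root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
   if _root not in sys.path:
       sys.path.insert(0, _root)

   import c              # absolute imports now work
   from a import b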

(For example, to support this way of organizing applications, the 'py'
lib provides a call py.magic.autopath() that can be dropped at the start
of test_b.py.  It hacks sys.path by guessing the "real" root according
to how many levels of __init__.py there are...)


A bientot,

Armin.

From krcmar at datinel.cz  Sun Sep 24 00:14:41 2006
From: krcmar at datinel.cz (Milan Krcmar)
Date: Sun, 24 Sep 2006 00:14:41 +0200
Subject: [Python-Dev] Minipython
Message-ID: <20060923221441.GB5227@hornet.din.cz>

I would like to run Python scripts on an embedded MIPS Linux platform
having only 2 MiB of flash ROM and 16 MiB of RAM for everything.

Current (2.5) stripped and gzipped (I am going to use a compressed
filesystem) CPython binary, compiled with defaults on a i386/glibc
Linux, results in 500 KiB of "flash". How to make the Python interpreter
even smaller?

- can I completely drop lexical analysis of source code and compilation
  to bytecode? does it account for much of the interpreter's size?

- should I drop "useless" compiled-in modules? (what I need is a
  replacement for advanced bash scripting, being able to write more
  complex scripts and avoid forking tens of processes for things like
  searching the filesystem, formatting dates, etc.)

I don't want to re-invent the wheel, but all my attempts at finding
Python for embedded systems ended in instructions for embedding
Python in another program :-)

Can you give me any information to start with? I would prefer stripping
current version of Python rather than returning to a years-old (but
smaller) version and remembering what of the new syntax/functionality to
avoid.

TIA, Milan

From rasky at develer.com  Sun Sep 24 01:11:06 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Sun, 24 Sep 2006 01:11:06 +0200
Subject: [Python-Dev] New relative import issue
References: <cfb578b20609171138r7098cc11j38cb8962dbaef430@mail.gmail.com>
	<20060918091314.GA26814@code0.codespeak.net>
	<450F6833.60603@canterbury.ac.nz>
	<20060919094738.GC27707@phd.pp.ru>
	<05af01c6dd7e$a2209560$e303030a@trilan>
	<ca471dc20609210722i620d0371g43add23268844be6@mail.gmail.com>
	<20060923221510.GA21803@code0.codespeak.net>
Message-ID: <06e501c6df65$8c521b80$4bbd2997@bagio>

Armin Rigo wrote:

> This doesn't match my experience, which is that sys.path hackery is
> required in any project that is larger than one directory, but is not
> itself a library.  [...]

>    myapp/
>       main.py
>       a/
>          __init__.py
>          b.py
>          test_b.py
>       c/
>          __init__.py
>
> This theoretical example shows main.py (the main entry point) at the
> root of the package directories - it is the only place where it can be
> if it needs to import the packages a and c.  The module a.b can import
> c, too (and this is not bad design - think about c as a package
> regrouping utilities that make sense for the whole application).  But
> then the testing script test_b.py cannot import the whole application
> any more.  Imports of a or c will fail, and even a relative import of
> b will crash when b tries to import c.  The only way I can think of
> is to insert the root directory in sys.path from within test_b.py,
> and then use absolute imports.

This also matches my experience, but I never used sys.path hackery for this
kind of thing. I either set PYTHONPATH while I work on "myapp" (which I
consider not such a big deal after all, and surely much less invasive than
adding Python code that tweaks sys.path into all the tests), or, even
more simply, I run the test from the myapp main directory (manually typing
"myapp/a/test_b.py").

There is also another possibility, which is having a smarter test framework
where you can specify substrings of test names: I don't know py.test in detail,
but in my own framework I can say something like "./run_tests.py PAT", which
basically means "recursively discover and run all files named test_NAME,
where PAT is a substring of NAME".
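
Just to give the flavour, here is a rough sketch of the shape of such a script
(this is not my actual framework, just an illustration):

    # run_tests.py -- rough sketch, not the real thing
    import os
    import sys

    def discover(pattern, root='.'):
        # Yield paths of test_NAME.py files whose NAME part contains pattern.
        for dirpath, dirnames, filenames in os.walk(root):
            for fn in filenames:
                if fn.startswith('test_') and fn.endswith('.py') and pattern in fn[5:-3]:
                    yield os.path.join(dirpath, fn)

    if __name__ == '__main__':
        pattern = sys.argv[1:] and sys.argv[1] or ''
        for path in discover(pattern):
            print path      # or hand it over to unittest and run it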

> (For example, to support this way of organizing applications, the 'py'
> lib provides a call py.magic.autopath() that can be dropped at the
> start of test_b.py.  It hacks sys.path by guessing the "real" root
> according to how many levels of __init__.py there are...)

Since I consider this more of an environmental problem, I would not find
satisfying any kind of solution at the single module level (and even less so
one requiring so much guess-work as this one).

Giovanni Bajo


From rasky at develer.com  Sun Sep 24 01:16:54 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Sun, 24 Sep 2006 01:16:54 +0200
Subject: [Python-Dev] Minipython
References: <20060923221441.GB5227@hornet.din.cz>
Message-ID: <070101c6df66$5bc1e5d0$4bbd2997@bagio>

Milan Krcmar wrote:

> Current (2.5) stripped and gzipped (I am going to use a compressed
> filesystem) CPython binary, compiled with defaults on a i386/glibc
> Linux, results in 500 KiB of "flash". How to make the Python
> interpreter even smaller?

In my experience, the biggest gain can be obtained by dropping the rarely-used
CJK codecs (for Asian languages). That should sum up to almost 800K
(uncompressed), IIRC. After that, I once had to strip down the binary even
more, and found out (by guesswork and inspection of map files) that there is no
other low-hanging fruit. By carefully selecting which modules to link in, I was
able to reduce it by another 300K or so, but nothing really incredible. I would
also suggest -ffunction-sections in these cases, but you might already know
that.

Giovanni Bajo


From rasky at develer.com  Sun Sep 24 01:29:27 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Sun, 24 Sep 2006 01:29:27 +0200
Subject: [Python-Dev] Removing __del__
References: <324634B71B159D469BCEB616678A6B94F94C3B@ingdexs1.ingdirect.com><B6FAC926EFE7B348B12F29CF7E4A93D401CF46A1@hammer.office.bhtrader.com><008901c6de94$d2072ed0$4bbd2997@bagio><20060922235602.GA3427@panix.com><6a36e7290609221735hcbd3df2ne41406323ce5fd72@mail.gmail.com><039d01c6def1$46df1ef0$4bbd2997@bagio><6a36e7290609230222w1fe8dfaam4780a1fd81481cd0@mail.gmail.com><03bb01c6def4$257b6c70$4bbd2997@bagio>
	<8764fesfsj.fsf@qrnik.zagroda>
Message-ID: <07b101c6df68$1ca8dfa0$4bbd2997@bagio>

Marcin 'Qrczak' Kowalczyk wrote:

>> 1) There's a way to destruct the handle BEFORE __del__ is called,
>> which would require killing the weakref / deregistering the
>> finalization hook.
>
> Weakrefs should have a method which runs their callback and
> unregisters them.
>
>> 2) The objects required in the destructor can be mutated / changed
>> during the lifetime of the instance. For instance, a class that
>> wraps Win32 FindFirstFirst/FindFirstNext and support transparent
>> directory recursion needs something similar.
>
> Listing files with transparent directory recursion can be implemented
> in terms of listing files of a given directory, such that a finalizer
> is only used with the low level object.
>
>> Another example is a class which creates named temporary files
>> and needs to remove them on finalization. It might need to create
>> several different temporary files (say, self.handle is the filename
>> in that case)[1], so the filename needed in the destructor changes
>> during the lifetime of the instance.
>
> Again: move the finalizer to a single temporary file object, and refer
> to such object instead of a raw handle.

Yes, I know Python is Turing-complete even without __del__, but that is not my
point. The fact that we can enhance weakrefs and find a very complicated way to
solve the problems which __del__ currently solves easily does not change
anything. People are still proposing to drop a feature which is perceived as
"easy" by users, and to replace it with a complicated set of workarounds, which
are prone to mistakes, more verbose, and harder to learn and maintain.

I'm totally in favor of the general idea of dropping rarely used features (like
__var in the other thread). I just can't see how dropping __del__ makes things
easier, while it surely makes life a lot harder for the legitimate users of it.

Giovanni Bajo


From mwh at python.net  Sun Sep 24 01:36:41 2006
From: mwh at python.net (Michael Hudson)
Date: Sun, 24 Sep 2006 00:36:41 +0100
Subject: [Python-Dev] Minipython
In-Reply-To: <20060923221441.GB5227@hornet.din.cz> (Milan Krcmar's message
	of "Sun, 24 Sep 2006 00:14:41 +0200")
References: <20060923221441.GB5227@hornet.din.cz>
Message-ID: <2mhcyyjgty.fsf@starship.python.net>

Milan Krcmar <krcmar at datinel.cz> writes:

> I would like to run Python scripts on an embedded MIPS Linux platform
> having only 2 MiB of flash ROM and 16 MiB of RAM for everything.
>
> Current (2.5) stripped and gzipped (I am going to use a compressed
> filesystem) CPython binary, compiled with defaults on a i386/glibc
> Linux, results in 500 KiB of "flash". How to make the Python interpreter
> even smaller?
>
> - can I completely drop lexical analysis of source code and compilation
>   to bytecode? does it account for much of the interpreter's size?

I don't think there's a configure flag for this or anything, and it
might be a bit hairy to do it, but it's possible and it would probably
save a bit.

There is a configure option to remove unicode support.  It's not
terribly well supported and stops working every now and again, but
it's probably much easier to start with.

There was at one point and may still be an option to not include the
complex type.

> - should I drop "useless" compiled-in modules? (what I need is a
>   replacement for advanced bash scripting, being able to write more
>   complex scripts and avoid forking tens of processes for things like
>   searching the filesystem, formatting dates, etc.)

Yes, definitely.

> I don't want to re-invent the wheel, but all my attempts at finding
> Python for embedded systems ended in instructions for embedding
> Python in another program :-)
>
> Can you give me any information to start with? I would prefer stripping
> current version of Python rather than returning to a years-old (but
> smaller) version and remembering what of the new syntax/functionality to
> avoid.

Well, I would start by looking at what is taking up the space...

Cheers,
mwh

-- 
  C++ is a siren song.  It *looks* like a HLL in which you ought to
  be able to write an application, but it really isn't.
                                       -- Alain Picard, comp.lang.lisp

From martin at v.loewis.de  Sun Sep 24 06:49:34 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 24 Sep 2006 06:49:34 +0200
Subject: [Python-Dev] Minipython
In-Reply-To: <20060923221441.GB5227@hornet.din.cz>
References: <20060923221441.GB5227@hornet.din.cz>
Message-ID: <45160E5E.8040406@v.loewis.de>

Milan Krcmar schrieb:
> Can you give me any information to start with? I would prefer stripping
> current version of Python rather than returning to a years-old (but
> smaller) version and remembering what of the new syntax/functionality to
> avoid.

I would start with dropping support for dynamic loading of extension
modules, and link all necessary modules statically.

Then, do what Michael Hudson says: find out what is taking up space.

size */*.o|sort -n

should give a good starting point; on my system, I get

[...]
  29356    1416     156   30928    78d0 Objects/classobject.o
  30663       0       0   30663    77c7 Objects/unicodectype.o
  33530     480     536   34546    86f2 Python/Python-ast.o
  33624    1792     616   36032    8cc0 Objects/longobject.o
  36603      16     288   36907    902b Python/ceval.o
  36710    2532       0   39242    994a Modules/_sre.o
  39169    9473    1032   49674    c20a Objects/stringobject.o
  52965       0      36   53001    cf09 Python/compile.o
  66197    4592     436   71225   11639 Objects/typeobject.o
  74111    9779    1160   85050   14c3a Objects/unicodeobject.o

Michael already mentioned you can drop unicodeobject if you want
to. compile.o would also offer savings, but stripping it might
not be easy. Dropping _sre is quite easy. If you manage to
drop compile.o, then dropping Python-ast.o (along with the
rest of the compiler) should also be possible.
unicodectype will go away if the Unicode type goes, but can
probably be removed separately. And so on.

When you come to a solution that satisfies your needs,
don't forget to document it somewhere.

Regards,
Martin

From krcmar at datinel.cz  Sun Sep 24 10:37:55 2006
From: krcmar at datinel.cz (Milan Krcmar)
Date: Sun, 24 Sep 2006 10:37:55 +0200
Subject: [Python-Dev] Minipython
In-Reply-To: <45160E5E.8040406@v.loewis.de>
References: <20060923221441.GB5227@hornet.din.cz>
	<45160E5E.8040406@v.loewis.de>
Message-ID: <20060924083755.GA27480@hornet.din.cz>

Thank you people. I'm going to try to strip unneeded things and let you
know the result.

Along with running Python on an embedded system, I am considering two
more things. Suppose the system to be a small Linux router, which, after
the kernel starts, merely configures lots of parameters of the kernel
and then runs some daemons for gathering statistics and allowing remote
control of the host.

Python helps mainly in the startup phase, configuring the kernel according
to human-readable configuration files. So far this has been done with shell
scripts. Python is not as suitable for running external processes and
process pipes as a shell, but I'd like to write a module (at least) that
helps with this, in the spirit of scsh (a "Scheme shell",
http://www.scsh.net).

A more advanced solution is to replace the system's init (/sbin/init) with
Python. It should even speed up startup, as it will not need to run a shell
many times. To avoid running other processes, I want to "port them" to
Python. Processes for kernel configuration, like iproute2, iptables etc.,
are often built on top of their own libraries, which can be used as a
starting point. (Yes, it does matter: at startup, routers run such
processes hundreds of times.)

Milan

On Sun, Sep 24, 2006 at 06:49:34AM +0200, "Martin v. Löwis" wrote:
> Milan Krcmar schrieb:
> > Can you give me any information to start with? I would prefer stripping
> > current version of Python rather than returning to a years-old (but
> > smaller) version and remembering what of the new syntax/functionality to
> > avoid.
> 
> I would start with dropping support for dynamic loading of extension
> modules, and link all necessary modules statically.
> 
> Then, do what Michael Hudson says: find out what is taking up space.
> 
> size */*.o|sort -n
> 
> should give a good starting point; on my system, I get
> 
> [...]
>   29356    1416     156   30928    78d0 Objects/classobject.o
>   30663       0       0   30663    77c7 Objects/unicodectype.o
>   33530     480     536   34546    86f2 Python/Python-ast.o
>   33624    1792     616   36032    8cc0 Objects/longobject.o
>   36603      16     288   36907    902b Python/ceval.o
>   36710    2532       0   39242    994a Modules/_sre.o
>   39169    9473    1032   49674    c20a Objects/stringobject.o
>   52965       0      36   53001    cf09 Python/compile.o
>   66197    4592     436   71225   11639 Objects/typeobject.o
>   74111    9779    1160   85050   14c3a Objects/unicodeobject.o
> 
> Michael already mentioned you can drop unicodeobject if you want
> to. compile.o would also offer savings, but stripping it might
> not be easy. Dropping _sre is quite easy. If you manage to
> drop compile.o, then dropping Python-ast.o (along with the
> rest of the compiler) should also be possible.
> unicodectype will go away if the Unicode type goes, but can
> probably be removed separately. And so on.
> 
> When you come to a solution that satisfies your needs,
> don't forget to document it somewhere.
> 
> Regards,
> Martin

From gjcarneiro at gmail.com  Sun Sep 24 14:07:33 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sun, 24 Sep 2006 13:07:33 +0100
Subject: [Python-Dev] PyErr_CheckSignals error return value
Message-ID: <a467ca4f0609240507x4d4715cas5e719fe0c1787320@mail.gmail.com>

int PyErr_CheckSignals()

Documentation for PyErr_CheckSignals [1] says "If an exception is
raised the error indicator is set and the function returns 1;
otherwise the function returns 0.".  But the code I see tells me the
function returns -1 on error.  What to do?  Fix the code, or the
documentation?

[1] http://docs.python.org/api/exceptionHandling.html#l2h-115
-- 
Gustavo J. A. M. Carneiro
"The universe is always one step beyond logic."

From g.brandl at gmx.net  Sun Sep 24 14:50:50 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 24 Sep 2006 14:50:50 +0200
Subject: [Python-Dev] Python 2.5 bug? Changes in behavior of traceback
	module
In-Reply-To: <ef0km3$ujn$1@sea.gmane.org>
References: <ef0km3$ujn$1@sea.gmane.org>
Message-ID: <ef5uvb$e2h$1@sea.gmane.org>

Michael Glassford wrote:
> In Python 2.4, traceback.print_exc() and traceback.format_exc() silently 
> do nothing if there is no active exception; in Python 2.5, they raise an 
> exception. Not too difficult to handle, but unexpected (and a pain if 
> you use it in a lot of places). I assume it was an unintentional change?

This was certainly an unintentional change while restructuring some
internal traceback routines.

It's now fixed in SVN.

Georg


From gjcarneiro at gmail.com  Sun Sep 24 16:17:42 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sun, 24 Sep 2006 15:17:42 +0100
Subject: [Python-Dev] Signals, threads, blocking C functions
In-Reply-To: <a467ca4f0609130617v450820dawb5f9ff1d69f41275@mail.gmail.com>
References: <E1GM0VK-0003n4-7b@draco.cus.cam.ac.uk>
	<aac2c7cb0609102132s3654f9bm519f31f0a9d65ce9@mail.gmail.com>
	<a467ca4f0609110716i26a336beg3ff0def4536723b0@mail.gmail.com>
	<450632A7.40504@canterbury.ac.nz>
	<aac2c7cb0609112205l52034601wfef5c4c1e790ca04@mail.gmail.com>
	<4506553D.1020307@canterbury.ac.nz>
	<aac2c7cb0609112359m3ff4ccb3t3f301b9d37052efb@mail.gmail.com>
	<a467ca4f0609121015i6dd3b245o1db1eb9b87fc7fe7@mail.gmail.com>
	<aac2c7cb0609121353u2a4432eq35caaf522416ea34@mail.gmail.com>
	<a467ca4f0609130617v450820dawb5f9ff1d69f41275@mail.gmail.com>
Message-ID: <a467ca4f0609240717v79b6eae2u5a48076fa879bae4@mail.gmail.com>

-> http://www.python.org/sf/1564547

-- 
Gustavo J. A. M. Carneiro
"The universe is always one step beyond logic."

From python at rcn.com  Mon Sep 25 01:21:27 2006
From: python at rcn.com (python at rcn.com)
Date: Sun, 24 Sep 2006 19:21:27 -0400 (EDT)
Subject: [Python-Dev] list.discard? (Re: dict.discard)
Message-ID: <20060924192127.AFZ50059@ms09.lnh.mail.rcn.net>

> When I want to remove something from a list I typically write:
>
>   while x in somelist:
>       somelist.remove(x)

An O(n) version of removeall:

   somelist[:] = [e for e in somelist if e != x]


Raymond

From unknown_kev_cat at hotmail.com  Mon Sep 25 01:25:27 2006
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Sun, 24 Sep 2006 19:25:27 -0400
Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code
References: <BAY112-W5EEFA3A998FA1B8EA7AEE9E210@phx.gbl>
	<ee2a432c0609222151k2bf1a211u44d9e44dcc6bbf5d@mail.gmail.com>
Message-ID: <ef7458$idq$1@sea.gmane.org>


"Neal Norwitz" <nnorwitz at gmail.com> wrote in message 
news:ee2a432c0609222151k2bf1a211u44d9e44dcc6bbf5d at mail.gmail.com...

> I ignored these as I'm not certain all the platforms we run on accept
> free(NULL).
>
That sounds like exactly what the autotools are designed for. You simply use 
free(), and have autoconf check for support of free(NULL).
If free(NULL) is broken, then a macro is defined:
"#define free(p)  ((p) == NULL ? (void)0 : free(p))"
Or something like that.

Note that this does not clutter up the main program any.
In fact it simplifies it.

It also potentially speeds up platforms with a working free,
without any negative speed implications for other platforms.

The only downside is a slight, presumably negligible, increase
in build time.



From mwh at python.net  Mon Sep 25 11:08:08 2006
From: mwh at python.net (Michael Hudson)
Date: Mon, 25 Sep 2006 10:08:08 +0100
Subject: [Python-Dev] Typo.pl scan of Python 2.5 source code
In-Reply-To: <ee2a432c0609222151k2bf1a211u44d9e44dcc6bbf5d@mail.gmail.com>
	(Neal Norwitz's message of "Fri, 22 Sep 2006 21:51:38 -0700")
References: <BAY112-W5EEFA3A998FA1B8EA7AEE9E210@phx.gbl>
	<ee2a432c0609222151k2bf1a211u44d9e44dcc6bbf5d@mail.gmail.com>
Message-ID: <2m64fcjouf.fsf@starship.python.net>

"Neal Norwitz" <nnorwitz at gmail.com> writes:

> I ignored these as I'm not certain all the platforms we run on accept
> free(NULL).

It's mandated by C99, and I don't *think* it changed from the previous
version (I only have a bootleg copy of C99 :).

Cheers,
mwh

-- 
  TRSDOS: Friendly old lizard. Or, at least, content to sit there
  eating flies.            -- Jim's pedigree of operating systems, asr

From talin at acm.org  Mon Sep 25 11:27:57 2006
From: talin at acm.org (Talin)
Date: Mon, 25 Sep 2006 02:27:57 -0700
Subject: [Python-Dev] Minipython
In-Reply-To: <20060924083755.GA27480@hornet.din.cz>
References: <20060923221441.GB5227@hornet.din.cz>	<45160E5E.8040406@v.loewis.de>
	<20060924083755.GA27480@hornet.din.cz>
Message-ID: <4517A11D.1050004@acm.org>

Milan Krcmar wrote:
> Thank you people. I'm going to try to strip unneeded things and let you
> know the result.
> 
> Along with running Python on an embedded system, I am considering two
> more things. Suppose the system to be a small Linux router, which, after
> the kernel starts, merely configures lots of parameters of the kernel
> and then runs some daemons for gathering statistics and allowing remote
> control of the host.
> 
> Python helps mainly in the startup phase, configuring the kernel according
> to human-readable configuration files. So far this has been done with shell
> scripts. Python is not as suitable for running external processes and
> process pipes as a shell, but I'd like to write a module (at least) that
> helps with this, in the spirit of scsh (a "Scheme shell",
> http://www.scsh.net).
> 
> A more advanced solution is to replace the system's init (/sbin/init) with
> Python. It should even speed up startup, as it will not need to run a shell
> many times. To avoid running other processes, I want to "port them" to
> Python. Processes for kernel configuration, like iproute2, iptables etc.,
> are often built on top of their own libraries, which can be used as a
> starting point. (Yes, it does matter: at startup, routers run such
> processes hundreds of times.)
> 
> Milan

One alternative you might want to look into is the language "Lua" 
(www.lua.org), which is similar to Python in some respects (also has 
some similarities to Javascript), but specifically optimized for 
embedding in larger apps - meaning that it has a much smaller footprint, 
a much smaller standard library, fewer built-in data types and so on. 
(For example, dicts, lists, and objects are all merged into a single 
type called a 'table', which is just a generic indexable container.) 
Lua's C API consists of just a few dozen functions.

It's not as powerful as Python of course, although it's surprisingly 
powerful for its size - it has closures, continuations, and all of the 
goodness you would expect from a modern language. Lua provides 
'meta-mechanisms' for extending the language rather than implementing 
language features directly. So even though it's not a pure 
object-oriented language, it provides mechanisms for implementing 
classes and inheritance. And it's fast, since it has less baggage to 
carry around.

It has a few warts - for example, I don't like the fact that referring 
to an undefined variable silently returns nil instead of returning an 
error, but I suppose in some environments that's a feature.

A lot of game companies use Lua for embedded scripting languages in 
their games. (Console-based games in particular have strict memory 
requirements, since there's no virtual memory on consoles.)

-- Talin


From glassfordm at hotmail.com  Mon Sep 25 14:23:51 2006
From: glassfordm at hotmail.com (Michael Glassford)
Date: Mon, 25 Sep 2006 08:23:51 -0400
Subject: [Python-Dev] Python 2.5 bug? Changes in behavior of traceback
	module
In-Reply-To: <ef5uvb$e2h$1@sea.gmane.org>
References: <ef0km3$ujn$1@sea.gmane.org> <ef5uvb$e2h$1@sea.gmane.org>
Message-ID: <4517CA57.3040500@hotmail.com>

Thanks!

Mike

Georg Brandl wrote:
> Michael Glassford wrote:
>> In Python 2.4, traceback.print_exc() and traceback.format_exc() silently 
>> do nothing if there is no active exception; in Python 2.5, they raise an 
>> exception. Not too difficult to handle, but unexpected (and a pain if 
>> you use it in a lot of places). I assume it was an unintentional change?
> 
> This was certainly an unintentional change while restructuring some
> internal traceback routines.
> 
> It's now fixed in SVN.
> 
> Georg
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org
> 


From steven.bethard at gmail.com  Tue Sep 26 06:04:37 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 25 Sep 2006 22:04:37 -0600
Subject: [Python-Dev] python-dev summary for 2006-08-01 to 2006-08-15
Message-ID: <d11dcfba0609252104v443dba49i27900438c39d0fc8@mail.gmail.com>

Sorry about the delay.  Here's the summary for the first half of
August.  As always, comments and corrections are greatly appreciated.


=========
Summaries
=========

--------------------------------
Mixing str and unicode dict keys
--------------------------------

Ralf Schmitt noted that in Python head, inserting str and unicode keys
to the same dictionary would sometimes raise UnicodeDecodeErrors::

    >>> d = {}
    >>> d[u'm\xe1s'] = 1
    >>> d['m\xe1s'] = 1
    Traceback (most recent call last):
      ...
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in
position 1: ordinal not in range(128)

This error showed up as a result of Armin Rigo's `patch to stop dict
lookup from hiding exceptions`_, which meant that the
UnicodeDecodeError raised when a str object is compared to a non-ASCII
unicode object was no longer silenced.  In the end, people agreed that
UnicodeDecodeError should not be raised for equality comparisons, and
in general, ``__eq__()`` methods should not raise exceptions. But
comparing str and unicode objects is often a programming error, so in
addition to just returning False, equality comparisons on str and
non-ASCII unicode now issue a warning with the UnicodeDecodeError
message.

.. _patch to stop dict lookup from hiding exceptions:
http://bugs.python.org/1497053
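
With that change in place, the comparison itself should behave roughly like
this (a sketch, not captured interpreter output)::

    >>> 'm\xe1s' == u'm\xe1s'    # emits a warning instead of raising
    False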

Contributing threads:

- `unicode hell/mixing str and unicode as dictionary keys
<http://mail.python.org/pipermail/python-dev/2006-August/067926.html>`__
- `Dicts are broken Was: unicode hell/mixing str and unicode
asdictionarykeys
<http://mail.python.org/pipermail/python-dev/2006-August/067978.html>`__
- `Dicts are broken ...
<http://mail.python.org/pipermail/python-dev/2006-August/067992.html>`__
- `Dict suppressing exceptions
<http://mail.python.org/pipermail/python-dev/2006-August/068090.html>`__

-----------------------
Rounding floats to ints
-----------------------

Bob Ippolito pointed out a long-standing bug in the struct module
where floats were automatically converted to ints. Michael Urman
showed a simple case that would provoke an exception if the bug were
fixed::

    pack('>H', round(value * 32768))

The source of this bug is the expectation that ``round()`` returns an
int, when it actually returns a float.  There was then some discussion
about splitting the round functionality into two functions:
``__builtin__.round()`` which would round floats to ints, and
``math.round()`` which would round floats to floats.  There was also
some discussion about the optional argument to ``round()`` which
currently specifies the number of decimal places to round to -- a
number of folks felt that it was a mistake to round to *decimal*
places when a float can only truly reflect *binary* places.

In the end, there were no definite conclusions about the future of
``round()``, but it seemed like the discussion might be resumed on the
Python 3000 list.

Contributing threads:

- `struct module and coercing floats to integers
<http://mail.python.org/pipermail/python-dev/2006-July/067798.html>`__
- `Rounding float to int directly (Re: struct module and coercing
floats to integers)
<http://mail.python.org/pipermail/python-dev/2006-July/067819.html>`__
- `Rounding float to int directly (Re: struct module and coercing
floats to integers)
<http://mail.python.org/pipermail/python-dev/2006-August/067867.html>`__
- `Rounding float to int directly ...
<http://mail.python.org/pipermail/python-dev/2006-August/067873.html>`__
- `struct module and coercing floats to integers
<http://mail.python.org/pipermail/python-dev/2006-August/067911.html>`__

---------------------------
Assigning to function calls
---------------------------

Neal Becker proposed that code like ``X() += 2`` be allowed so that you
could call __iadd__ on objects immediately after creation. People
pointed out that allowing augmented *assignment* is misleading when no
assignment can occur, and it would be better just to call the method
directly, e.g. ``X().__iadd__(2)``.

Contributing threads:

- `SyntaxError: can't assign to function call
<http://mail.python.org/pipermail/python-dev/2006-August/068081.html>`__
- `Split augmented assignment into two operator sets? [Re:
SyntaxError: can't assign to function call]
<http://mail.python.org/pipermail/python-dev/2006-August/068148.html>`__

---------------------------------------
PEP 357: Integer clipping and __index__
---------------------------------------

After some further discussion on the `__index__ issue`_ of last
fortnight, Travis E. Oliphant proposed `a patch for __index__`_ that
introduced three new C API functions:

* PyIndex_Check(obj) -- checks for nb_index
* PyObject* PyNumber_Index(obj) -- calls nb_index if possible or
raises a TypeError
* Py_ssize_t PyNumber_AsSsize_t(obj, err) -- converts the object to a
Py_ssize_t, raising err on overflow

After a few minor edits, this patch was checked in.
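
At the Python level the new slot corresponds to the ``__index__()`` special
method, so (as a rough illustration with a made-up class) any object can opt
in to being used wherever an integer index is required::

    class HalfIndex(object):
        def __init__(self, n):
            self.n = n
        def __index__(self):        # the Python-level face of nb_index
            return self.n // 2

    seq = range(10)
    print seq[HalfIndex(6):]        # slicing accepts it: [3, 4, 5, 6, 7, 8, 9]
    print 'ab' * HalfIndex(4)       # so does sequence repetition: 'abab'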

.. _`__index__ issue`:
http://www.python.org/dev/summary/2006-07-16_2006-07-31/#pep-357-integer-clipping-and-index
.. _`a patch for __index__`: http://bugs.python.org/1538606

Contributing threads:

- `Bad interaction of __index__ and sequence repeat
<http://mail.python.org/pipermail/python-dev/2006-August/067870.html>`__
- `__index__ clipping
<http://mail.python.org/pipermail/python-dev/2006-August/068091.html>`__
- `Fwd: [Python-checkins] r51236 - in python/trunk:
Doc/api/abstract.tex Include/abstract.h Include/object.h
Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c
Modules/mmapmodule.c Modules/operator.c Objects/abstract.c
Objects/classobject.c Objects/
<http://mail.python.org/pipermail/python-dev/2006-August/068204.html>`__
- `Fwd: [Python-checkins] r51236 - in python/trunk:
Doc/api/abstract.tex Include/abstract.h Include/object.h
Lib/test/test_index.py Misc/NEWS Modules/arraymodule.c
Modules/mmapmodule.c Modules/operator.c Objects/abstract.c
Objects/class <http://mail.python.org/pipermail/python-dev/2006-August/068209.html>`__

----------------------------
OpenSSL and Windows binaries
----------------------------

Jim Jewett pointed out that a default build of OpenSSL includes the
patented IDEA cipher, and asked whether that needed to be kept out of
the Windows binary versions.  There was some concern about dropping a
feature, but Gregory P. Smith pointed out that IDEA isn't directly
exposed to any Python user, and suggested that IDEA should never be
required by any sane SSL connection.  Martin v. Löwis promised to look
into making the change.

Contributing threads:

- `windows 2.5 build: use OpenSSL for hashlib [bug 1535502]
<http://mail.python.org/pipermail/python-dev/2006-August/068009.html>`__
- `openSSL and windows binaries - license
<http://mail.python.org/pipermail/python-dev/2006-August/068055.html>`__

----------------------------
Type of range object members
----------------------------

Alexander Belopolsky proposed making the members of the ``range()``
object use Py_ssize_t instead of C longs.  Guido indicated that this
was basically wasted effort -- in the long run, the members should be
PyObject* so that they can handle Python longs correctly, so
converting them to Py_ssize_t would be an intermediate step that
wouldn't help in the transition.

There was then some discussion about the int and long types in Python
3000, with Guido suggesting two separate implementations that would be
mostly hidden at the Python level.

Contributing thread:

- `Type of range object members
<http://mail.python.org/pipermail/python-dev/2006-August/068230.html>`__

------------------------
Distutils version number
------------------------

A user noted that Python 2.4.3 shipped with distutils 2.4.1 and the
version number of distutils in the repository was only 2.4.0 and
requested that Python 2.5 include the newer distutils.  In fact, the
newest distutils was already the one in the repository but the version
number had not been appropriately bumped. For a short while, the
distutils number was automatically generated from the Python one, but
Marc-Andre Lemburg volunteered to manually bump it so that it would be
easier to use the SVN distutils with a different Python version.

Contributing threads:

- `Which version of distutils to ship with Python 2.5?
<http://mail.python.org/pipermail/python-dev/2006-August/067869.html>`__
- `no remaining issues blocking 2.5 release
<http://mail.python.org/pipermail/python-dev/2006-August/068240.html>`__

-------------------------------------
Dict containment and unhashable items
-------------------------------------

tomer filiba suggested that dict.__contains__ should return False
instead of raising a TypeError in situations like::

    >>> a={1:2, 3:4}
    >>> [] in a
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: list objects are unhashable

Guido suggested that swallowing the TypeError here would be a mistake
as it would also swallow any TypeErrors produced by faulty
``__hash__()`` methods.
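
Code that really wants the forgiving behaviour can get it at the call site
instead (a sketch; note that it shares the weakness Guido pointed out, since
it also hides TypeErrors raised by a faulty ``__hash__()``)::

    def contains(d, key):
        # Treat unhashable keys as simply absent.
        try:
            return key in d
        except TypeError:
            return False

    print contains({1: 2, 3: 4}, [])    # False instead of a TypeError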

Contributing threads:

- `dict containment annoyance
<http://mail.python.org/pipermail/python-dev/2006-August/068198.html>`__
- `NotHashableError? (Re: dict containment annoyance)
<http://mail.python.org/pipermail/python-dev/2006-August/068218.html>`__

-------------------------------
Returning longs from __hash__()
-------------------------------

Armin Rigo pointed out that Python 2.5's change that allows id() to
return ints or longs would have caused some breakage for custom hash
functions like::

    def __hash__(self):
        return id(self)

Though it has long been documented that the result of ``id()`` is not
suitable as a hash value, code like this is apparently common.  So
Martin v. Löwis and Armin arranged for ``PyLong_Type.tp_hash`` to be
called in the code for ``hash()``.

Contributing thread:

- `returning longs from __hash__()
<http://mail.python.org/pipermail/python-dev/2006-August/068043.html>`__

----------------------
instancemethod builtin
----------------------

Nick Coghlan suggested adding an ``instancemethod()`` builtin along
the lines of ``staticmethod()`` and ``classmethod()`` which would
allow arbitrary callables to act more like functions.  In particular,
Nick was considering code like::

    class C(object):
        method = some_callable

Currently, if ``some_callable`` did not define the ``__get__()``
method, ``C().method`` would not bind the ``C`` instance as the first
argument.  By introducing ``instancemethod()``, this problem could be
solved like::

    class C(object):
        method = instancemethod(some_callable)

There wasn't much of a reaction one way or another, so it looked like
the idea would at least temporarily be shelved.
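
For illustration only (this is not Nick's patch, just a rough sketch of the
idea), such a wrapper could be written as a small descriptor::

    import functools

    class instancemethod(object):
        # Bind the instance as the first argument of an arbitrary callable.
        def __init__(self, func):
            self.func = func
        def __get__(self, obj, objtype=None):
            if obj is None:
                return self.func
            return functools.partial(self.func, obj)

    def some_callable(obj, arg):
        return obj, arg

    class C(object):
        method = instancemethod(some_callable)

    print C().method(42)    # the C instance is passed as the first argument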

Contributing thread:

- `2.6 idea: a 'function' builtin to parallel classmethod and
staticmethod <http://mail.python.org/pipermail/python-dev/2006-August/068189.html>`__

--------------------------------
Unicode versions and unicodedata
--------------------------------

Armin Ronacher noted that Python 2.5 implements Unicode 4.1 but while
a ucd_3_2_0 object is available (implementing Unicode 3.2), no
ucd_4_1_0 object is available.  Martin v. Löwis explained that the
ucd_3_2_0 object is only available because IDNA needs it, and that
there are no current plans to expose any other Unicode versions (and
that ucd_3_2_0 may go away when IDNA no longer needs it).
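
For reference, both database versions can be inspected at the interpreter
prompt (a sketch, assuming a 2.5 build)::

    >>> import unicodedata
    >>> unicodedata.unidata_version
    '4.1.0'
    >>> unicodedata.ucd_3_2_0.unidata_version
    '3.2.0'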

Contributing thread:

- `Unicode Data in Python2.5 is missing a ucd_4_1_0 object
<http://mail.python.org/pipermail/python-dev/2006-August/068126.html>`__


==================
Previous Summaries
==================
- `Release manager pronouncement needed: PEP 302 Fix
<http://mail.python.org/pipermail/python-dev/2006-August/068050.html>`__


===============
Skipped Threads
===============
- `clock_gettime() vs. gettimeofday()?
<http://mail.python.org/pipermail/python-dev/2006-August/067879.html>`__
- `Strange memo behavior from cPickle
<http://mail.python.org/pipermail/python-dev/2006-August/067881.html>`__
- `internal weakref API should be Py_ssize_t?
<http://mail.python.org/pipermail/python-dev/2006-August/067885.html>`__
- `Weekly Python Patch/Bug Summary
<http://mail.python.org/pipermail/python-dev/2006-August/067888.html>`__
- `Releasemanager, please approve #1532975
<http://mail.python.org/pipermail/python-dev/2006-August/067889.html>`__
- `FW: using globals
<http://mail.python.org/pipermail/python-dev/2006-August/067892.html>`__
- `TRUNK FREEZE 2006-07-03, 00:00 UTC for 2.5b3
<http://mail.python.org/pipermail/python-dev/2006-August/067898.html>`__
- `segmentation fault in Python 2.5b3 (trunk:51066)
<http://mail.python.org/pipermail/python-dev/2006-August/067921.html>`__
- `using globals
<http://mail.python.org/pipermail/python-dev/2006-August/067947.html>`__
- `uuid module - byte order issue
<http://mail.python.org/pipermail/python-dev/2006-August/067948.html>`__
- `RELEASED Python 2.5 (beta 3)
<http://mail.python.org/pipermail/python-dev/2006-August/067959.html>`__
- `TRUNK is UNFROZEN
<http://mail.python.org/pipermail/python-dev/2006-August/067961.html>`__
- `2.5 status <http://mail.python.org/pipermail/python-dev/2006-August/067963.html>`__
- `Python 2.5b3 and AIX 4.3 - It Works
<http://mail.python.org/pipermail/python-dev/2006-August/067985.html>`__
- `More tracker demos online
<http://mail.python.org/pipermail/python-dev/2006-August/067996.html>`__
- `need an SSH key removed
<http://mail.python.org/pipermail/python-dev/2006-August/067999.html>`__
- `BZ2File.writelines should raise more meaningful exceptions
<http://mail.python.org/pipermail/python-dev/2006-August/068005.html>`__
- `test_mailbox on Cygwin
<http://mail.python.org/pipermail/python-dev/2006-August/068006.html>`__
- `cgi.FieldStorage DOS (sf bug #1112549)
<http://mail.python.org/pipermail/python-dev/2006-August/068011.html>`__
- `2.5b3, commit r46372 regressed PEP 302 machinery (sf not letting me
post) <http://mail.python.org/pipermail/python-dev/2006-August/068012.html>`__
- `free(): invalid pointer
<http://mail.python.org/pipermail/python-dev/2006-August/068042.html>`__
- `should i put this on the bug tracker ?
<http://mail.python.org/pipermail/python-dev/2006-August/068045.html>`__
- `Is this a bug?
<http://mail.python.org/pipermail/python-dev/2006-August/068076.html>`__
- `httplib and bad response chunking
<http://mail.python.org/pipermail/python-dev/2006-August/068087.html>`__
- `cgi DoS attack
<http://mail.python.org/pipermail/python-dev/2006-August/068092.html>`__
- `DRAFT: python-dev summary for 2006-07-01 to 2006-07-15
<http://mail.python.org/pipermail/python-dev/2006-August/068098.html>`__
- `SimpleXMLWriter missing from elementtree
<http://mail.python.org/pipermail/python-dev/2006-August/068106.html>`__
- `DRAFT: python-dev summary for 2006-07-16 to 2006-07-31
<http://mail.python.org/pipermail/python-dev/2006-August/068136.html>`__
- `Is module clearing still necessary? [Re: Is this a bug?]
<http://mail.python.org/pipermail/python-dev/2006-August/068146.html>`__
- `PyThreadState_SetAsyncExc bug?
<http://mail.python.org/pipermail/python-dev/2006-August/068158.html>`__
- `Elementtree and Namespaces in 2.5
<http://mail.python.org/pipermail/python-dev/2006-August/068171.html>`__
- `Errors after running make test
<http://mail.python.org/pipermail/python-dev/2006-August/068176.html>`__
- `What is the status of file.readinto?
<http://mail.python.org/pipermail/python-dev/2006-August/068188.html>`__
- `Recent logging spew
<http://mail.python.org/pipermail/python-dev/2006-August/068196.html>`__
- `[Python-3000] Python 2.5 release schedule (was: threading, part 2)
<http://mail.python.org/pipermail/python-dev/2006-August/068199.html>`__
- `test_socketserver failure on cygwin
<http://mail.python.org/pipermail/python-dev/2006-August/068216.html>`__
- `ANN: byteplay - a bytecode assembler/disassembler
<http://mail.python.org/pipermail/python-dev/2006-August/068225.html>`__
- `Arlington VA sprint on Sept. 23
<http://mail.python.org/pipermail/python-dev/2006-August/068226.html>`__
- `IDLE patches - bugfix or not?
<http://mail.python.org/pipermail/python-dev/2006-August/068266.html>`__
- `Four issue trackers submitted for Infrastructue Committee's tracker
search <http://mail.python.org/pipermail/python-dev/2006-August/068287.html>`__

From anthony at interlink.com.au  Tue Sep 26 06:14:43 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue, 26 Sep 2006 14:14:43 +1000
Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18
Message-ID: <200609261414.46940.anthony@interlink.com.au>

The plan for 2.4.4 is to have a release candidate on October 11th, and a final 
release on October 18th. This is very likely to be the last ever 2.4 release, 
after which 2.4.4 joins 2.3.5 and earlier in the old folks home, where it can 
live out its remaining life with dignity and respect. 

If you know of any backports that should go in, please make sure you get them 
done before the 11th.

Anthony

From fredrik at pythonware.com  Tue Sep 26 08:23:10 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 26 Sep 2006 08:23:10 +0200
Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18
In-Reply-To: <200609261414.46940.anthony@interlink.com.au>
References: <200609261414.46940.anthony@interlink.com.au>
Message-ID: <efah0e$l8b$1@sea.gmane.org>

Anthony Baxter wrote:
> The plan for 2.4.4 is to have a release candidate on October 11th, and a final
> release on October 18th. This is very likely to be the last ever 2.4 release, 
> after which 2.4.4 joins 2.3.5 and earlier in the old folks home

"finally leaves school" is a more correct description, I think.  my 2.3 
and 2.4 installations are in a pretty good shape, after all...

</F>


From skip at pobox.com  Wed Sep 27 02:40:43 2006
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 26 Sep 2006 19:40:43 -0500
Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18
In-Reply-To: <200609261414.46940.anthony@interlink.com.au>
References: <200609261414.46940.anthony@interlink.com.au>
Message-ID: <17689.51339.120140.761854@montanaro.dyndns.org>


    Anthony> The plan for 2.4.4 is to have a release candidate on October
    Anthony> 11th, and a final release on October 18th. This is very likely
    Anthony> to be the last ever 2.4 release, after which 2.4.4 joins 2.3.5
    Anthony> and earlier in the old folks home, where it can live out its
    Anthony> remaining life with dignity and respect.

    Anthony> If you know of any backports that should go in, please make
    Anthony> sure you get them done before the 11th.

John Hunter (matplotlib author) recently made me aware of a problem with
code.InteractiveConsole.  It doesn't protect itself from the user closing
sys.stdout:

    % ./python.exe 
    Python 2.4.4c0 (#2, Sep 26 2006, 06:26:16) 
    [GCC 4.0.0 (Apple Computer, Inc. build 5026)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import code
    >>> c = code.InteractiveConsole()
    >>> c.interact()
    Python 2.4.4c0 (#2, Sep 26 2006, 06:26:16) 
    [GCC 4.0.0 (Apple Computer, Inc. build 5026)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> import sys
    >>> sys.stdout.close()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/Users/skip/src/python-svn/release24-maint/Lib/code.py", line 234, in interact
        line = self.raw_input(prompt)
      File "/Users/skip/src/python-svn/release24-maint/Lib/code.py", line 277, in raw_input
        return raw_input(prompt)
    ValueError: I/O operation on closed file

I think the right thing is for InteractiveConsole to dup sys.std{in,out,err}
and do its own thing for its raw_input() method instead of naively calling
the raw_input() builtin.  I outlined a solution for ipython in a message to
ipython-dev, but even better would be if InteractiveConsole itself was
fixed:

    John Hunter alerted me to a segfault problem in code.InteractiveConsole
    when sys.stdout is closed.  This problem is present in Python up to
    2.4.3 as far as I can tell, but is fixed in later versions of Python
    (2.5, 2.4.4 when it's released, svn trunk).  Even with that fix, if the
    user calls sys.stdout.close() you'll get a ValueError and your console
    will be useless.  I took a look at the code in Python that the
    InteractiveConsole class exercises and see that the cause is that the
    naive raw_input() method simply calls the raw_input() builtin.  That
    function gets the "stdin" and "stdout" objects from the sys module and
    there's no way to override that behavior.

    In my opinion, the best thing to do would be to subclass
    InteractiveConsole and provide a more robust raw_input() method.
    Ideally, I think you'd want to dup() the file descriptors for
    sys.{stdin,stdout} and use those instead of calling the builtin
    raw_input().  Something like (untested):

        class IC(code.InteractiveConsole):
            def __init__(self):
                code.InteractiveConsole.__init__(self)
                self.input = os.fdopen(os.dup(sys.stdin.fileno()))
                self.output = os.fdopen(os.dup(sys.stdout.fileno()))
                self.error = os.fdopen(os.dup(sys.stderr.fileno()))

            def raw_input(self, prompt=""):
                if prompt:
                    self.output.write(prompt)
                    self.output.flush()
                return self.input.readline()

            def write(self, data):
                self.error.write(data)

    Also, the runcode() method will have to be overridden to use self.output
    instead of sys.stdout.  Those couple changes should (hopefully) insulate
    IPython from such user wackiness.

I'm happy to work up a patch for 2.4.4, 2.5.1 and 2.6.0.  Does this group
think that's the right route to take?

Skip


From skip at pobox.com  Wed Sep 27 12:50:04 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 27 Sep 2006 05:50:04 -0500
Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18
In-Reply-To: <17689.51339.120140.761854@montanaro.dyndns.org>
References: <200609261414.46940.anthony@interlink.com.au>
	<17689.51339.120140.761854@montanaro.dyndns.org>
Message-ID: <17690.22364.692472.235533@montanaro.dyndns.org>


    Anthony> If you know of any backports that should go in, please make
    Anthony> sure you get them done before the 11th.

    skip> John Hunter (matplotlib author) recently made me aware of a
    skip> problem with code.InteractiveConsole.  It doesn't protect itself
    skip> from the user closing sys.stdout:

    ...

I attached a patch for code.py to

    http://sourceforge.net/support/tracker.php?aid=1563079

If someone wants to take a peek, that would be appreciated.  It seems to me
that it certainly should go into 2.5.1 and 2.6.  Whether it's deemed serious
enough to go into 2.4.4 is another question.

Skip

From skip at pobox.com  Wed Sep 27 17:28:46 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 27 Sep 2006 10:28:46 -0500
Subject: [Python-Dev] [SECUNIA] "buffer overrun in repr() for unicode
	strings" Potential Vulnerability (fwd)
Message-ID: <17690.39086.849178.331542@montanaro.dyndns.org>


This came in to the webmaster address and was also addressed to a number of
individuals (apparently the SF project admins).  It looks like it would
be of general interest to this group.

Looking through this message and the various bug tracker items it's not
clear to me if Secunia wants to know if the patch (which I believe has
already been applied to all three active svn branches) is the source of the
problem or if they want to know if it solves the buffer overrun problem.
Are they suggesting that 10*size should be the character multiple in all
cases?

Skip

-------------- next part --------------
An embedded message was scrubbed...
From: Secunia Research <vuln at secunia.com>
Subject: [SECUNIA] "buffer overrun in repr() for unicode strings" Potential
	Vulnerability
Date: Wed, 27 Sep 2006 15:18:46 +0200
Size: 5508
Url: http://mail.python.org/pipermail/python-dev/attachments/20060927/fdfd4bdf/attachment.mht 

From amk at amk.ca  Wed Sep 27 18:40:04 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Wed, 27 Sep 2006 12:40:04 -0400
Subject: [Python-Dev] List of candidate 2.4.4 bugs?
Message-ID: <20060927164004.GA12389@localhost.localdomain>

Is anyone maintaining a list of candidate bugs to be fixed in 2.4.4?
If not, should we start a wiki page for the purpose?

--amk


From martin at v.loewis.de  Wed Sep 27 18:56:14 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 27 Sep 2006 18:56:14 +0200
Subject: [Python-Dev] List of candidate 2.4.4 bugs?
In-Reply-To: <20060927164004.GA12389@localhost.localdomain>
References: <20060927164004.GA12389@localhost.localdomain>
Message-ID: <451AAD2E.1070705@v.loewis.de>

A.M. Kuchling schrieb:
> Is anyone maintaining a list of candidate bugs to be fixed in 2.4.4?

I don't think so. Also, I see little chance that many bugs will be fixed
that aren't already. People should really do constant backporting,
instead of starting backports when a subminor release is made.

Of course, there are some things that people remember and want to see
fixed, but they are pretty arbitrary.

Regards,
Martin


From amk at amk.ca  Wed Sep 27 19:35:42 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Wed, 27 Sep 2006 13:35:42 -0400
Subject: [Python-Dev] List of candidate 2.4.4 bugs?
In-Reply-To: <451AAD2E.1070705@v.loewis.de>
References: <20060927164004.GA12389@localhost.localdomain>
	<451AAD2E.1070705@v.loewis.de>
Message-ID: <20060927173542.GA10686@rogue.amk.ca>

On Wed, Sep 27, 2006 at 06:56:14PM +0200, "Martin v. Löwis" wrote:
> I don't think so. Also, I see little chance that many bugs will be fixed
> that aren't already. People should really do constant backporting,
> instead of starting backports when a subminor release is made.

Agreed.  

One reason I often don't backport a bug fix is that I'm not sure if
there will be another bugfix release; if not, it's wasted effort, and
I wasn't sure if a 2.4.4 release was ever going to happen.  After
2.4.4, will there be a 2.4.5 or is that too unlikely?

I've done an 'svn log' on the modules I'm familiar with (curses, zlib,
gzip) and will look at backporting the results.

Grepping for 'backport candidate' in 'svn log -r37910:HEAD' turns up
30-odd checkins that contain the phrase:

r51728 r51669 r47171 
r47061 r46991 r46882 r46879
r46878 r46602
r46589 r45234 r41842
r41696 r41531 r39767 r39743 r39739 r39650
r39645
r39595 r39594 r39491
r39135 r39044 r39030 r39012 r38932
r38927 r38887 
r38826 r38781
r38772 r38745

--amk

From fredrik at pythonware.com  Wed Sep 27 19:25:13 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 27 Sep 2006 19:25:13 +0200
Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18
In-Reply-To: <17689.51339.120140.761854@montanaro.dyndns.org>
References: <200609261414.46940.anthony@interlink.com.au>
	<17689.51339.120140.761854@montanaro.dyndns.org>
Message-ID: <efec5p$n4v$2@sea.gmane.org>

skip at pobox.com wrote:

> I think the right thing is for InteractiveConsole to dup sys.std{in,out,err}
> and do its own thing for its raw_input() method instead of naively calling
> the raw_input() builtin.

what guarantees that sys.stdin etc has a valid and dup:able fileno when 
the console is instantiated ?

</F>


From skip at pobox.com  Wed Sep 27 20:06:34 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 27 Sep 2006 13:06:34 -0500
Subject: [Python-Dev] 2.4.4c1 October 11, 2.4.4 final October 18
In-Reply-To: <efec5p$n4v$2@sea.gmane.org>
References: <200609261414.46940.anthony@interlink.com.au>
	<17689.51339.120140.761854@montanaro.dyndns.org>
	<efec5p$n4v$2@sea.gmane.org>
Message-ID: <17690.48554.84183.999508@montanaro.dyndns.org>


    Fredrik> skip at pobox.com wrote:
    >> I think the right thing is for InteractiveConsole to dup
    >> sys.std{in,out,err} and do its own thing for its raw_input() method
    >> instead of naively calling the raw_input() builtin.

    Fredrik> what guarantees that sys.stdin etc has a valid and dup:able
    Fredrik> fileno when the console is instantiated ?

Nothing, I suppose.  I'm just concerned that the InteractiveConsole instance
keep working after its interact() method is called.

Skip

From jimjjewett at gmail.com  Wed Sep 27 20:10:16 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 27 Sep 2006 14:10:16 -0400
Subject: [Python-Dev] openssl - was: 2.4.4c1 October 11,
	2.4.4 final October 18
Message-ID: <fb6fbf560609271110j7fffb11an4ddb751a0fc61b1b@mail.gmail.com>

OpenSSL should probably be upgraded to 0.9.8.c (or possibly 0.9.7.k)
because of the security patch.

    http://www.openssl.org/
    http://www.openssl.org/news/secadv_20060905.txt

I'm not sure which version shipped with the 2.4 windows binaries, but
externals (for 2.5) still points to 0.9.8.a, which is vulnerable.

openssl has also patched 0.9.7.k (0.9.7 was released in 2003) and the
patch itself

    http://www.openssl.org/news/patch-CVE-2006-4339.txt

should apply to 0.9.6 (released in 2000).

-jJ

From martin at v.loewis.de  Wed Sep 27 20:31:11 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 27 Sep 2006 20:31:11 +0200
Subject: [Python-Dev] openssl - was: 2.4.4c1 October 11,
	2.4.4 final October 18
In-Reply-To: <fb6fbf560609271110j7fffb11an4ddb751a0fc61b1b@mail.gmail.com>
References: <fb6fbf560609271110j7fffb11an4ddb751a0fc61b1b@mail.gmail.com>
Message-ID: <451AC36F.5020404@v.loewis.de>

Jim Jewett schrieb:
> OpenSSL should probably be upgraded to 0.9.8.c (or possibly 0.9.7.k)
> because of the security patch.
> 
>    http://www.openssl.org/
>    http://www.openssl.org/news/secadv_20060905.txt
> 
> I'm not sure which version shipped with the 2.4 windows binaries, but
> externals (for 2.5) still points to 0.9.8.a, which is vulnerable.

If there is any change, it should be to 0.9.7k; we shouldn't switch to
a new "branch" of OpenSSL in micro releases.

However, I'm uncertain whether I can do the update in the next few weeks.

Regards,
Martin

From martin at v.loewis.de  Wed Sep 27 21:20:08 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 27 Sep 2006 21:20:08 +0200
Subject: [Python-Dev] List of candidate 2.4.4 bugs?
In-Reply-To: <20060927173542.GA10686@rogue.amk.ca>
References: <20060927164004.GA12389@localhost.localdomain>	<451AAD2E.1070705@v.loewis.de>
	<20060927173542.GA10686@rogue.amk.ca>
Message-ID: <451ACEE8.2020604@v.loewis.de>

A.M. Kuchling schrieb:
> One reason I often don't backport a bug is because I'm not sure if
> there will be another bugfix release; if not, it's wasted effort, and
> I wasn't sure if a 2.4.4 release was ever going to happen.  After
> 2.4.4, will there be a 2.4.5 or is that too unlikely?

The "tradition" seems to be that there will be one last bug fix release
after a feature release is made; IOW, two branches are always maintained
(the trunk and the last release). Following this tradition, there
wouldn't be another 2.4.x release (and I think Anthony already said so).
Likewise, 2.5 will be maintained until 2.6 is released, and one last
2.5.x release will be made shortly after 2.6.

Regards,
Martin


From gustavo at niemeyer.net  Wed Sep 27 22:15:21 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Wed, 27 Sep 2006 17:15:21 -0300
Subject: [Python-Dev] Minipython
In-Reply-To: <20060923221441.GB5227@hornet.din.cz>
References: <20060923221441.GB5227@hornet.din.cz>
Message-ID: <20060927201521.GA24770@niemeyer.net>

> I would like to run Python scripts on an embedded MIPS Linux platform
> having only 2 MiB of flash ROM and 16 MiB of RAM for everything.
(...)

Have you looked at Python for S60 and Python for the Maemo platform?

If not directly useful, they should provide some hints.

[1] http://opensource.nokia.com/projects/pythonfors60/
[2] http://pymaemo.sf.net

-- 
Gustavo Niemeyer
http://niemeyer.net

From brett at python.org  Wed Sep 27 23:11:30 2006
From: brett at python.org (Brett Cannon)
Date: Wed, 27 Sep 2006 14:11:30 -0700
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
	Python source
Message-ID: <bbaeab100609271411o2cb77d2am5ae7443567f1d9e4@mail.gmail.com>

I am at the point with my security work that I need to consider how I am
going to restrict importing modules.  My current plan is to basically
implement phase 2 of PEP 302 and control imports through what importer
objects are provided.  This work should lead to a meta_path importer for
built-ins and then path_hooks importers for .py, .pyc, and extension
modules.
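
For concreteness, the built-in handling could be as small as the following
sketch (illustrative only, not the planned code; the class name is made up):

    import imp
    import sys

    class BuiltinImporter(object):
        def find_module(self, fullname, path=None):
            # Claim only modules compiled into the interpreter.
            if imp.is_builtin(fullname):
                return self
            return None

        def load_module(self, fullname):
            if fullname in sys.modules:
                return sys.modules[fullname]
            return imp.init_builtin(fullname)

    sys.meta_path.append(BuiltinImporter())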

But it has been suggested here that the import machinery be rewritten in
Python.  Now I have never touched the import code since it has always had
the reputation of being less than friendly to work with.  I am asking for
opinions from people who have worked with the import machinery before: is
it so bad that it is worth trying to re-implement the import semantics in
pure Python, or should I, in the interest of time, just work with the C
code?  Basically I will end up breaking built-in, .py, .pyc, and extension
modules into individual importers and then have a chaining class that acts
as a combined .pyc/.py importer (this will also make writing out to .pyc
files an optional step of the .py import).

Any opinions on this would be greatly appreciated.  I need to get back to
my supervisor by the end of the day Friday with a decision as to whether I
think it is worth the rewrite.  If you are interested in helping with the
Python rewrite (or with the work in general if it stays in C), please let
me know, since if enough people want to help with the Python rewrite it
might help offset the extra time needed to make it work.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060927/60f301e2/attachment.htm 

From pje at telecommunity.com  Thu Sep 28 00:31:34 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 27 Sep 2006 18:31:34 -0400
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
 Python source
In-Reply-To: <bbaeab100609271411o2cb77d2am5ae7443567f1d9e4@mail.gmail.co
 m>
Message-ID: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>

At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote:
>But it has been suggested here that the import machinery be rewritten in 
>Python.  Now I have never touched the import code since it has always had 
>the reputation of being less than friendly to work with.  I am asking for 
>opinions from people who have worked with the import machinery before if 
>it is so bad that it is worth trying to re-implement the import semantics 
>in pure Python or if in the name of time to just work with the C 
>code.  Basically I will end up breaking up built-in, .py, .pyc, and 
>extension modules into individual importers and then have a chaining class 
>to act as a combined .pyc/.py combination importer (this will also make 
>writing out to .pyc files an optional step of the .py import).

The problem you would run into here would be supporting zip imports.  It 
would probably be more useful to have a mapping of file types to "format 
handlers", because then a filesystem importer or zip importer would be 
able to work with any .py/.pyc/.pyo/whatever formats, along with any new 
ones that are invented, without reinventing the wheel.

Thus, whether it's file import, zip import, web import, or whatever, the 
same handlers would be reusable, and when people invent new extensions like 
.ptl, .kid, etc., they can just register format handlers instead.

Format handlers could of course be based on the PEP 302 protocol, and 
simply accept a "parent importer" with a get_data() method.  So, let's say 
you have a PyImporter:

     class PyImporter:
         def __init__(self, parent_importer):
             self.parent = parent_importer

         def find_module(self, fullname):
             path = fullname.split('.')[-1]+'.py'
             try:
                 source = self.parent.get_data(path)
             except IOError:
                 return None
             else:
                 return PySourceLoader(source)

See what I mean?  The importers and loaders thus don't have to do direct 
filesystem operations.
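
For completeness, the PySourceLoader used above might look roughly like
this (a sketch only; filename handling and .pyc write-back are hand-waved):

     class PySourceLoader:
         def __init__(self, source):
             self.source = source

         def load_module(self, fullname):
             import imp, sys
             mod = sys.modules.setdefault(fullname, imp.new_module(fullname))
             mod.__loader__ = self
             code = compile(self.source, '<source>', 'exec')
             exec code in mod.__dict__
             return mod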

Of course, to fully support .pyc timestamp checking and writeback, you'd 
need some sort of "stat" or "getmtime" feature on the parent importer, as 
well as perhaps an optional "save_data" method.  These would be extensions 
to PEP 302, but welcome ones.

Anyway, based on my previous work with pkg_resource, pkgutil, zipimport, 
import.c, etc. I would say this is how I'd want to structure a 
reimplementation of the core system.  And if it were for Py3K, I'd probably 
treat sys.path and all the import hooks associated with it as a single 
meta-importer on sys.meta_path -- listed after a meta-importer for handling 
frozen and built-in modules.  (I.e., the meta-importer that uses sys.path 
and its path hooks would be last on sys.meta_path.)

In other words, sys.meta_path is really the only critical import hook from 
the raw interpreter's point of view.  sys.path, however, (along with 
sys.path_hooks and sys.path_importer_cache) is critical from the 
perspective of users, applications, etc., as there has to be some way to 
get things onto Python's path in the first place.


From brett at python.org  Thu Sep 28 01:11:33 2006
From: brett at python.org (Brett Cannon)
Date: Wed, 27 Sep 2006 16:11:33 -0700
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
	Python source
In-Reply-To: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
Message-ID: <bbaeab100609271611u3e5c14dqa49b6f674e817861@mail.gmail.com>

On 9/27/06, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote:
> >But it has been suggested here that the import machinery be rewritten in
> >Python.  Now I have never touched the import code since it has always had
> >the reputation of being less than friendly to work with.  I am asking for
> >opinions from people who have worked with the import machinery before if
> >it is so bad that it is worth trying to re-implement the import semantics
> >in pure Python or if in the name of time to just work with the C
> >code.  Basically I will end up breaking up built-in, .py, .pyc, and
> >extension modules into individual importers and then have a chaining
> class
> >to act as a combined .pyc/.py combination importer (this will also make
> >writing out to .pyc files an optional step of the .py import).
>
> The problem you would run into here would be supporting zip imports.


I have not looked at zipimport so I don't know the exact issue in terms of
how it hooks into the import machinery.  But a C level API will most likely
be needed.

  It
> would probably be more useful to have a mapping of file types to "format
> handlers", because then a filesystem importer or zip importer would then
> be
> able to work with any .py/.pyc/.pyo/whatever formats, along with any new
> ones that are invented, without reinventing the wheel.


So you are saying the zipimporter would then pull out of the zip file the
individual file to import and pass that to the format-specific importer?

Thus, whether it's file import, zip import, web import, or whatever, the
> same handlers would be reusable, and when people invent new extensions
> like
> .ptl, .kid, etc., they can just register format handlers instead.


So a separation of data store from data interpretation for importation.  My
only worry is a possible explosion of checks for the various data types.  If
you are using the file data store and had .py, .pyc, .so, module.so, .ptl,
and .kid registered, that might suck in terms of performance.  And I am
assuming for a web import that it would decide based on the extension of the
resulting web address?  And checking for the various types might not work
well for other data store types.  Guess you would need a way to register
with the data store exactly what types of data interpretation you might want
to check.

The other option is to just have the data store do its magic and somehow
know what kind of data interpretation is needed for the string returned
(e.g., a database data store might implicitly only store .py code and thus
know that it will only return a string of source).  Then that string and the
supposed file extension are passed to the next step of creating a module
from that data string.

Format handlers could of course be based on the PEP 302 protocol, and
> simply accept a "parent importer" with a get_data() method.  So, let's say
> you have a PyImporter:
>
>      class PyImporter:
>          def __init__(self, parent_importer):
>              self.parent = parent_importer
>
>          def find_module(self, fullname):
>              path = fullname.split('.')[-1]+'.py'
>              try:
>                  source = self.parent.get_data(path)
>              except IOError:
>                  return None
>              else:
>                  return PySourceLoader(source)
>
> See what I mean?  The importers and loaders thus don't have to do direct
> filesystem operations.


I think so.  Basically you want more of a way to stack imports so that the
basic importers are just passed the string of what they are supposed to load
from.  Other importers higher in the chain can handle getting that string.

Of course, to fully support .pyc timestamp checking and writeback, you'd
> need some sort of "stat" or "getmtime" feature on the parent importer, as
> well as perhaps an optional "save_data" method.  These would be extensions
> to PEP 302, but welcome ones.


Could pass the string representing the location of where the string came
from.  That would allow for the required stat calls for .pyc files as needed
without having to implement methods just for this one use case.

Anyway, based on my previous work with pkg_resource, pkgutil, zipimport,
> import.c, etc. I would say this is how I'd want to structure a
> reimplementation of the core system.  And if it were for Py3K, I'd
> probably
> treat sys.path and all the import hooks associated with it as a single
> meta-importer on sys.meta_path -- listed after a meta-importer for
> handling
> frozen and built-in modules.  (I.e., the meta-importer that uses sys.path
> and its path hooks would be last on sys.meta_path.)


Ah, interesting idea!  Could even go as far as removing sys.path and just
making it an attribute of the base importer if you really wanted to make it
just meta_path for imports.

In other words, sys.meta_path is really the only critical import hook from
> the raw interpreter's point of view.  sys.path, however, (along with
> sys.path_hooks and sys.path_importer_cache) is critical from the
> perspective of users, applications, etc., as there has to be some way to
> get things onto Python's path in the first place.
>
>
Yeah, I think I get it.  I don't know how much it simplifies things for
users but I think it might make it easier for alternative import writers.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060927/0fda24ae/attachment.htm 

From pje at telecommunity.com  Thu Sep 28 01:41:18 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 27 Sep 2006 19:41:18 -0400
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
 Python source
In-Reply-To: <bbaeab100609271611u3e5c14dqa49b6f674e817861@mail.gmail.com
 >
References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>

At 04:11 PM 9/27/2006 -0700, Brett Cannon wrote:


>On 9/27/06, Phillip J. Eby 
><<mailto:pje at telecommunity.com>pje at telecommunity.com> wrote:
>>At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote:
>> >But it has been suggested here that the import machinery be rewritten in
>> >Python.  Now I have never touched the import code since it has always had
>> >the reputation of being less than friendly to work with.  I am asking for
>> >opinions from people who have worked with the import machinery before if
>> >it is so bad that it is worth trying to re-implement the import semantics
>> >in pure Python or if in the name of time to just work with the C
>> >code.  Basically I will end up breaking up built-in, .py, .pyc, and
>> >extension modules into individual importers and then have a chaining class
>> >to act as a combined .pyc/.py combination importer (this will also make
>> >writing out to .pyc files an optional step of the .py import).
>>
>>The problem you would run into here would be supporting zip imports.
>
>I have not looked at zipimport so I don't know the exact issue in terms of 
>how it hooks into the import machinery.  But a C level API will most 
>likely be needed.

I was actually assuming you planned to reimplement that in Python as well, 
and hence the need for the storage/format separation.


>>   It
>>would probably be more useful to have a mapping of file types to "format
>>handlers", because then a filesystem importer or zip importer would then be
>>able to work with any .py/.pyc/.pyo/whatever formats, along with any new
>>ones that are invented, without reinventing the wheel.
>
>So you are saying the zipimporter would then pull out of the zip file the 
>individual file to import and pass that to the format-specific importer?

No, I'm saying that the zipimporter would simply call the format importers 
in sequence, as in your original concept.  However, these importers would 
call *back* to the zipimporter to ask if the file they are looking for is 
there.


>>Thus, whether it's file import, zip import, web import, or whatever, the
>>same handlers would be reusable, and when people invent new extensions like
>>.ptl, .kid, etc., they can just register format handlers instead.
>
>So a sepration of data store from data interpretation for importation.  My 
>only worry is a possible explosion of checks for the various data 
>types.  If you are using the file data store and had .py, .pyc, .so, 
>module.so , .ptl, and .kid registered that might suck in terms of 
>performance hit.

Look at it this way: the parent importer can always pull a directory 
listing once and cache it for the duration of its calls to the child 
importers.  In practice, however, I suspect that the stat calls will be 
faster.  In the case of a zipimport parent, the zip directory is already 
cached.

Also, keep in mind that most imports will likely occur *before* any special 
additional types get registered, so the hits will be minimal.  And the more 
of sys.path is taken up by zip files, the less of a hit it will be for each 
query.


>   And I am assuming for a web import that it would decide based on the 
> extension of the resulting web address?

No - you'd effectively end up doing a web hit for each possible 
extension.  Which would suck, but that's what caching is 
for.  Realistically, you wouldn't want to do web-based imports without some 
disk-based caching anyway.

>   And checking for the various types might not work well for other data 
> store types.  Guess you would need a way to register with the data store 
> exactly what types of data interpretation you might want to check.

No, you just need a method on the parent importer like get_data().


>Other option is to just have the data store do its magic and somehow know 
>what kind of data interpretation is needed for the string returned (e.g., 
>a database data store might implicitly only store .py code and thus know 
>that it will only return a string of source).  Then that string and the 
>supposed file extension is passed ot the next step of creating a module 
>from that data string.

Again, all that's way more complex than you need; you can do the same thing 
by just raising IOError from get_data() when asked for something that's not 
a .py.


>>Format handlers could of course be based on the PEP 302 protocol, and
>>simply accept a "parent importer" with a get_data() method.  So, let's say
>>you have a PyImporter:
>>
>>      class PyImporter:
>>          def __init__(self, parent_importer):
>>              self.parent = parent_importer
>>
>>          def find_module(self, fullname):
>>              path = fullname.split('.')[-1]+'.py'
>>              try:
>>                  source = self.parent.get_data(path)
>>              except IOError:
>>                  return None
>>              else:
>>                  return PySourceLoader(source)
>>
>>See what I mean?  The importers and loaders thus don't have to do direct
>>filesystem operations.
>
>I think so.  Basically you want more of a way to stack imports so that the 
>basic importers are just passed the string of what it is supposed to load 
>from.  Other importers higher in the chain can handle getting that string.

No, they're full importers; they're not passed "a string".  The only 
difference between this and your original idea of an importer chain is that 
I'm saying the chained format-specific importers need to know who their 
"parent" importer (the data store) is, so they can be data-store 
independent.  Everything else can be done with that, and perhaps a few 
extra parent importer methods for stat, save, etc.
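
To make that concrete, the skeleton of such a "parent" (storage) importer
might look roughly like this (all names hypothetical):

     class ChainedStorageImporter:
         def __init__(self, format_importer_factories):
             # Each child importer gets a reference back to its data store.
             self.children = [factory(self)
                              for factory in format_importer_factories]

         def get_data(self, path):
             # Subclasses (filesystem, zip, web, ...) supply the bytes.
             raise IOError(path)

         def find_module(self, fullname, path=None):
             for child in self.children:
                 loader = child.find_module(fullname)
                 if loader is not None:
                     return loader
             return None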


>>Of course, to fully support .pyc timestamp checking and writeback, you'd
>>need some sort of "stat" or "getmtime" feature on the parent importer, as
>>well as perhaps an optional "save_data" method.  These would be extensions
>>to PEP 302, but welcome ones.
>
>Could pass the string representing the location of where the string came 
>from.  That would allow for the required stat calls for .pyc files as 
>needed without having to implement methods just for this one use case.

Huh?  In order to know if a .pyc is up to date, you need the st_mtime of 
the .py file.  That can't be done in the parent importer without giving it 
format knowledge, which goes against the point of the exercise.  Thus, 
something like stat() and save() methods need to be available on the 
parent, if it can support them.


>>Anyway, based on my previous work with pkg_resource, pkgutil, zipimport,
>>import.c , etc. I would say this is how I'd want to structure a
>>reimplementation of the core system.  And if it were for Py3K, I'd probably
>>treat sys.path and all the import hooks associated with it as a single
>>meta-importer on sys.meta_path -- listed after a meta-importer for handling
>>frozen and built-in modules.  (I.e., the meta-importer that uses sys.path
>>and its path hooks would be last on sys.meta_path.)
>
>Ah, interesting idea!  Could even go as far as removing sys.path and just 
>making it an attribute of the base importer if you really wanted to make 
>it just meta_path for imports.

Perhaps, but then that just means you have to have a new variable for 
'sys.path_importer' or some such, just to get at it.  (i.e., code won't be 
able to assume it's always the last item in sys.meta_path).  So this seems 
wasteful and changing things just for the sake of change, vs. just keeping 
the other PEP 302 sys variables.  I just think the *implementation* of them 
can move to sys.meta_path, as that simplifies the main __import__ function 
down to just calling meta_path importers in sequence, modulo some package 
issues.

One other rather tricky matter is that the sys.path meta-importer has to 
deal with package __path__ management...  and actually, meta_path importers 
are supposed to receive a copy of sys.path...  ugh.  Well, it was a nice 
idea, but I guess you can't actually implement sys.path using a meta_path 
importer.  :(  For Py3K, we could drop the path argument to find_module() 
and manage it, but it can't be done and still allow current meta_path hooks 
to work right.


>>In other words, sys.meta_path is really the only critical import hook from
>>the raw interpreter's point of view.  sys.path, however, (along with
>>sys.path_hooks and sys.path_importer_cache) is critical from the
>>perspective of users, applications, etc., as there has to be some way to
>>get things onto Python's path in the first place.
>
>Yeah, I think I get it.  I don't know how much it simplifies things for 
>users but I think it might make it easier for alternative import writers.

That was the idea, yes.  :)


From brett at python.org  Thu Sep 28 02:26:14 2006
From: brett at python.org (Brett Cannon)
Date: Wed, 27 Sep 2006 17:26:14 -0700
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
	Python source
In-Reply-To: <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>
References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>
Message-ID: <bbaeab100609271726v43424f87s89aa51a637e5ea33@mail.gmail.com>

On 9/27/06, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> At 04:11 PM 9/27/2006 -0700, Brett Cannon wrote:
>
>
> >On 9/27/06, Phillip J. Eby
> ><<mailto:pje at telecommunity.com>pje at telecommunity.com> wrote:
> >>At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote:
> >> >But it has been suggested here that the import machinery be rewritten
> in
> >> >Python.  Now I have never touched the import code since it has always
> had
> >> >the reputation of being less than friendly to work with.  I am asking
> for
> >> >opinions from people who have worked with the import machinery before
> if
> >> >it is so bad that it is worth trying to re-implement the import
> semantics
> >> >in pure Python or if in the name of time to just work with the C
> >> >code.  Basically I will end up breaking up built-in, .py, .pyc, and
> >> >extension modules into individual importers and then have a chaining
> class
> >> >to act as a combined .pyc/.py combination importer (this will also
> make
> >> >writing out to .pyc files an optional step of the .py import).
> >>
> >>The problem you would run into here would be supporting zip imports.
> >
> >I have not looked at zipimport so I don't know the exact issue in terms
> of
> >how it hooks into the import machinery.  But a C level API will most
> >likely be needed.
>
> I was actually assuming you planned to reimplement that in Python as well,
> and hence the need for the storage/format separation.


I was not explicitly planning on it.

>>   It
> >>would probably be more useful to have a mapping of file types to "format
> >>handlers", because then a filesystem importer or zip importer would then
> be
> >>able to work with any .py/.pyc/.pyo/whatever formats, along with any new
> >>ones that are invented, without reinventing the wheel.
> >
> >So you are saying the zipimporter would then pull out of the zip file the
> >individual file to import and pass that to the format-specific importer?
>
> No, I'm saying that the zipimporter would simply call the format importers
> in sequence, as in your original concept.  However, these importers would
> call *back* to the zipimporter to ask if the file they are looking for is
> there.


Ah, OK.  So for importing 'email', the zipimporter would call the .pyc
importer and it would ask the zipimporter, "can you get me email.pyc?" and
if it said no it would move on to asking the .py importer for email.py, etc.

>>Thus, whether it's file import, zip import, web import, or whatever, the
> >>same handlers would be reusable, and when people invent new extensions
> like
> >>.ptl, .kid, etc., they can just register format handlers instead.
> >
> >So a sepration of data store from data interpretation for
> importation.  My
> >only worry is a possible explosion of checks for the various data
> >types.  If you are using the file data store and had .py, .pyc, .so,
> >module.so , .ptl, and .kid registered that might suck in terms of
> >performance hit.
>
> Look at it this way: the parent importer can always pull a directory
> listing once and cache it for the duration of its calls to the child
> importers.  In practice, however, I suspect that the stat calls will be
> faster.  In the case of a zipimport parent, the zip directory is already
> cached.
>
> Also, keep in mind that most imports will likely occur *before* any
> special
> additional types get registered, so the hits will be minimal.  And the
> more
> of sys.path is taken up by zip files, the less of a hit it will be for
> each
> query.


That's fine.  Just thinking about how the current situation sucks for NFS
but how caching just isn't done.  But obviously this could change.

>   And I am assuming for a web import that it would decide based on the
> > extension of the resulting web address?
>
> No - you'd effectively end up doing a web hit for each possible
> extension.  Which would suck, but that's what caching is
> for.  Realistically, you wouldn't want to do web-based imports without
> some
> disk-based caching anyway.
>
> >   And checking for the various types might not work well for other data
> > store types.  Guess you would need a way to register with the data store
> > exactly what types of data interpretation you might want to check.
>
> No, you just need a method on the parent importer like get_data().
>
>
> >Other option is to just have the data store do its magic and somehow know
> >what kind of data interpretation is needed for the string returned (e.g.,
> >a database data store might implicitly only store .py code and thus know
> >that it will only return a string of source).  Then that string and the
> >supposed file extension is passed ot the next step of creating a module
> >from that data string.
>
> Again, all that's way more complex than you need; you can do the same
> thing
> by just raising IOError from get_data() when asked for something that's
> not
> a .py.
>
>
> >>Format handlers could of course be based on the PEP 302 protocol, and
> >>simply accept a "parent importer" with a get_data() method.  So, let's
> say
> >>you have a PyImporter:
> >>
> >>      class PyImporter:
> >>          def __init__(self, parent_importer):
> >>              self.parent = parent_importer
> >>
> >>          def find_module(self, fullname):
> >>              path = fullname.split('.')[-1]+'.py'
> >>              try:
> >>                  source = self.parent.get_data(path)
> >>              except IOError:
> >>                  return None
> >>              else:
> >>                  return PySourceLoader(source)
> >>
> >>See what I mean?  The importers and loaders thus don't have to do direct
> >>filesystem operations.
> >
> >I think so.  Basically you want more of a way to stack imports so that
> the
> >basic importers are just passed the string of what it is supposed to load
> >from.  Other importers higher in the chain can handle getting that
> string.
>
> No, they're full importers; they're not passed "a string".  The only
> difference between this and your original idea of an importer chain is
> that
> I'm saying the chained format-specific importers need to know who their
> "parent" importer (the data store) is, so they can be data-store
> independent.  Everything else can be done with that, and perhaps a few
> extra parent importer methods for stat, save, etc.


OK.

>>Of course, to fully support .pyc timestamp checking and writeback, you'd
> >>need some sort of "stat" or "getmtime" feature on the parent importer,
> as
> >>well as perhaps an optional "save_data" method.  These would be
> extensions
> >>to PEP 302, but welcome ones.
> >
> >Could pass the string representing the location of where the string came
> >from.  That would allow for the required stat calls for .pyc files as
> >needed without having to implement methods just for this one use case.
>
> Huh?  In order to know if a .pyc is up to date, you need the st_mtime of
> the .py file.  That can't be done in the parent importer without giving it
> format knowledge, which goes against the point of the exercise.


Sorry, I thought .pyc files decided whether they needed to be recompiled
based on the stat info of the .py and .pyc files, not on data stored within
the .pyc.

  Thus,
> something like stat() and save() methods need to be available on the
> parent, if it can support them.
>
>
> >>Anyway, based on my previous work with pkg_resource, pkgutil, zipimport,
> >>import.c , etc. I would say this is how I'd want to structure a
> >>reimplementation of the core system.  And if it were for Py3K, I'd
> probably
> >>treat sys.path and all the import hooks associated with it as a single
> >>meta-importer on sys.meta_path -- listed after a meta-importer for
> handling
> >>frozen and built-in modules.  (I.e., the meta-importer that uses
> sys.path
> >>and its path hooks would be last on sys.meta_path.)
> >
> >Ah, interesting idea!  Could even go as far as removing sys.path and just
> >making it an attribute of the base importer if you really wanted to make
> >it just meta_path for imports.
>
> Perhaps, but then that just means you have to have a new variable for
> 'sys.path_importer' or some such, just to get at it.  (i.e., code won't be
> able to assume it's always the last item in sys.meta_path).  So this seems
> wasteful and changing things just for the sake of change, vs. just keeping
> the other PEP 302 sys variables.  I just think the *implementation* of
> them
> can move to sys.meta_path, as that simplifies the main __import__ function
> down to just calling meta_path importers in sequence, modulo some package
> issues.
>
> One other rather tricky matter is that the sys.path meta-importer has to
> deal with package __path__ management...  and actually, meta_path
> importers
> are supposed to receive a copy of sys.path...  ugh.  Well, it was a nice
> idea, but I guess you can't actually implement sys.path using a meta_path
> importer.  :(  For Py3K, we could drop the path argument to find_module()
> and manage it, but it can't be done and still allow current meta_path
> hooks
> to work right.


Ah, true.

>>In other words, sys.meta_path is really the only critical import hook from
> >>the raw interpreter's point of view.  sys.path, however, (along with
> >>sys.path_hooks and sys.path_importer_cache) is critical from the
> >>perspective of users, applications, etc., as there has to be some way to
> >>get things onto Python's path in the first place.
> >
> >Yeah, I think I get it.  I don't know how much it simplifies things for
> >users but I think it might make it easier for alternative import writers.
>
> That was the idea, yes.  :)



=)
-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060927/6718c9f1/attachment.html 

From pje at telecommunity.com  Thu Sep 28 02:41:15 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 27 Sep 2006 20:41:15 -0400
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
 Python source
In-Reply-To: <bbaeab100609271726v43424f87s89aa51a637e5ea33@mail.gmail.co
 m>
References: <5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com>

At 05:26 PM 9/27/2006 -0700, Brett Cannon wrote:
>Ah, OK.  So for importing 'email', the zipimporter would call the .pyc 
>importer and it would ask the zipimporter, "can you get me email.pyc?" and 
>if it said no it would move on to asking the .py importer for email.py, etc.

Yes, exactly.


>That's fine.  Just thinking about how the current situation sucks for NFS 
>but how caching just isn't done.  But obvoiusly this could change.

Well, with this design, you can have a CachingFilesystemImporter as your 
storage mechanism to speed things up.
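
A minimal sketch of that idea (the class name comes from the suggestion
above; the details are invented):

     import os

     class CachingFilesystemImporter:
         def __init__(self, dirname):
             self.dirname = dirname
             # One directory listing, cached for all the child importers.
             self._names = set(os.listdir(dirname))

         def get_data(self, path):
             if path not in self._names:
                 raise IOError(path)
             f = open(os.path.join(self.dirname, path), 'rb')
             try:
                 return f.read()
             finally:
                 f.close()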


>> >>Of course, to fully support .pyc timestamp checking and writeback, you'd
>> >>need some sort of "stat" or "getmtime" feature on the parent importer, as
>> >>well as perhaps an optional "save_data" method.  These would be extensions
>> >>to PEP 302, but welcome ones.
>> >
>> >Could pass the string representing the location of where the string came
>> >from.  That would allow for the required stat calls for .pyc files as
>> >needed without having to implement methods just for this one use case.
>>
>>Huh?  In order to know if a .pyc is up to date, you need the st_mtime of
>>the .py file.  That can't be done in the parent importer without giving it
>>format knowledge, which goes against the point of the exercise.
>
>Sorry, thought .pyc files based whether they needed to be recompiled based 
>on the stat info on the .py and .pyc file, not on data stored from within 
>the .pyc .

It's not just that (although I believe it's also the case that there is a 
timestamp inside .pyc), it's that to do the check in the parent importer, 
the parent importer would have to know that there is such a thing as 
.py-and-.pyc.  The whole point of this design is that the parent importer 
doesn't have to know *anything* about filename extensions OR how those 
files are formatted internally.  In this scheme, adding more child 
importers is sufficient to add all the special handling needed for 
.py/.pyc-style schemes.

Of course, for maximum flexibility, you might want get_stream() and 
get_file() methods optionally available, since a .so loader really needs a 
file, and .pyc might want to read in two stages.  But the child importers 
can be defensively coded so as to be able to live with only a 
parent.get_data(), if necessary, and do the enhanced behaviors only if 
stat() or get_stream() or write_data() etc. attributes are available on the 
parent.
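
For example, a child importer's up-to-date check could be written
defensively along these lines (a sketch only; the helper name is made up,
and it assumes the usual .pyc layout of a 4-byte magic number followed by a
4-byte little-endian timestamp):

     import struct

     def pyc_up_to_date(parent, py_path, pyc_data):
         stat = getattr(parent, 'stat', None)
         if stat is None:
             return True    # no way to check; trust the .pyc
         try:
             py_mtime = stat(py_path).st_mtime
         except (IOError, OSError):
             return True    # no source to compare against
         pyc_mtime = struct.unpack('<l', pyc_data[4:8])[0]
         return long(py_mtime) == pyc_mtime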

If we get some standards for these additional attributes, we can document 
them as standard PEP 302 extensions.

The format importer mechanism might want to have something like 
'sys.import_formats' as a list of importer classes (or factories).  Parent 
(storage) importer classes would then create instances to use.

If you add a new format importer to sys.import_formats, you would of course 
need to clear sys.path_importer_cache, so that the individual importers are 
rebuilt on the next import, and thus they will create new child importer 
chains.

Yeah, that pretty much ought to do it.


From xah at xahlee.org  Thu Sep 28 12:49:22 2006
From: xah at xahlee.org (xah lee)
Date: Thu, 28 Sep 2006 03:49:22 -0700
Subject: [Python-Dev] Python Doc problems
Message-ID: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>

There are a lot of reports on the lousy state of python docs. I'm not
much in the python community so i don't know whether the developers are
doing anything about it.

anyway, i've rewritten Python's RE module documentation, at:
  http://xahlee.org/perl-python/python_re-write/lib/module-re.html
and have recently made the term of user clear.

may i ask what the python developers are doing about the python docs? Are
you guys aware that there are rampant criticisms of the python docs and
many diverse attempts by various individuals to rewrite the docs by
starting another wiki or site?

   Xah
   xah at xahlee.org
  http://xahlee.org/





From amk at amk.ca  Thu Sep 28 14:12:55 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 28 Sep 2006 08:12:55 -0400
Subject: [Python-Dev] Collecting 2.4.4 fixes
Message-ID: <20060928121255.GE5511@localhost.localdomain>

I've put some candidate fixes and listed some tasks at
<http://wiki.python.org/moin/Python24Fixes>.

--amk



From jeremy at alum.mit.edu  Thu Sep 28 16:30:25 2006
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 28 Sep 2006 10:30:25 -0400
Subject: [Python-Dev] AST structure and maintenance branches
In-Reply-To: <200609231840.03859.anthony@interlink.com.au>
References: <200609231840.03859.anthony@interlink.com.au>
Message-ID: <e8bf7a530609280730l1fce4e0ai2b67b3c7f0708481@mail.gmail.com>

On 9/23/06, Anthony Baxter <anthony at interlink.com.au> wrote:
> I'd like to propose that the AST format returned by passing PyCF_ONLY_AST to
> compile() get the same guarantee in maintenance branches as the bytecode
> format - that is, unless it's absolutely necessary, we'll keep it the same.
> Otherwise anyone trying to write tools to manipulate the AST is in for a
> massive world of hurt.
>
> Anyone have any problems with this, or can it be added to PEP 6?

It's possible we should poll developers of other Python
implementations and find out if anyone has objections to supporting
this AST format.  But in principle, it sounds like a good idea to me.

Jeremy

From anthony at interlink.com.au  Thu Sep 28 16:37:16 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 29 Sep 2006 00:37:16 +1000
Subject: [Python-Dev] AST structure and maintenance branches
In-Reply-To: <e8bf7a530609280730l1fce4e0ai2b67b3c7f0708481@mail.gmail.com>
References: <200609231840.03859.anthony@interlink.com.au>
	<e8bf7a530609280730l1fce4e0ai2b67b3c7f0708481@mail.gmail.com>
Message-ID: <200609290037.18885.anthony@interlink.com.au>

On Friday 29 September 2006 00:30, Jeremy Hylton wrote:
> On 9/23/06, Anthony Baxter <anthony at interlink.com.au> wrote:
> > I'd like to propose that the AST format returned by passing PyCF_ONLY_AST
> > to compile() get the same guarantee in maintenance branches as the
> > bytecode format - that is, unless it's absolutely necessary, we'll keep
> > it the same. Otherwise anyone trying to write tools to manipulate the AST
> > is in for a massive world of hurt.
> >
> > Anyone have any problems with this, or can it be added to PEP 6?
>
> It's possible we should poll developers of other Python
> implementations and find out if anyone has objections to supporting
> this AST format.  But in principle, it sounds like a good idea to me.

I think it's extremely likely that the AST format will change over time - 
with major releases. I'd just like to guarantee that we won't mess with it 
other than that.

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From jeremy at alum.mit.edu  Thu Sep 28 16:42:15 2006
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 28 Sep 2006 10:42:15 -0400
Subject: [Python-Dev] AST structure and maintenance branches
In-Reply-To: <200609290037.18885.anthony@interlink.com.au>
References: <200609231840.03859.anthony@interlink.com.au>
	<e8bf7a530609280730l1fce4e0ai2b67b3c7f0708481@mail.gmail.com>
	<200609290037.18885.anthony@interlink.com.au>
Message-ID: <e8bf7a530609280742n67f40957l49fe9a65515efc4@mail.gmail.com>

On 9/28/06, Anthony Baxter <anthony at interlink.com.au> wrote:
> On Friday 29 September 2006 00:30, Jeremy Hylton wrote:
> > On 9/23/06, Anthony Baxter <anthony at interlink.com.au> wrote:
> > > I'd like to propose that the AST format returned by passing PyCF_ONLY_AST
> > > to compile() get the same guarantee in maintenance branches as the
> > > bytecode format - that is, unless it's absolutely necessary, we'll keep
> > > it the same. Otherwise anyone trying to write tools to manipulate the AST
> > > is in for a massive world of hurt.
> > >
> > > Anyone have any problems with this, or can it be added to PEP 6?
> >
> > It's possible we should poll developers of other Python
> > implementations and find out if anyone has objections to supporting
> > this AST format.  But in principle, it sounds like a good idea to me.
>
> I think it's extremely likely that the AST format will change over time -
> with major releases. I'd just like to guarantee that we won't mess with it
> other than that.

Good point.  I'm fine with the change, then.

Jeremy

From jcarlson at uci.edu  Thu Sep 28 19:40:24 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 28 Sep 2006 10:40:24 -0700
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
Message-ID: <20060928095951.08BF.JCARLSON@uci.edu>


xah lee <xah at xahlee.org> wrote:
> There are a lot reports on the lousy state of python docs. I'm not  
> much in the python community so i don't know what the developers are  
> doing anything about it.

I don't know about everyone else, but when I receive comments like "the
docs are lousy, fix them", it is more than a bit difficult to know where
to start, and/or what would be better.

Case-by-case examples of "the phrasing of the docs here is confusing"
are helpful, as are actual documentation patches (even plain text is
fine).  While I have heard comments along the lines of "the docs could
be better", I've never heard the claim that the Python docs are "lousy".


> anyway, i've rewrote the Python's RE module documentation, at:
>   http://xahlee.org/perl-python/python_re-write/lib/module-re.html
> and have recently made the term of user clear.

Aside from a few sections in the original docs, and also some sections
in your docs, about the only part of the original docs that I find
unclear is that some sections do not have function names sorted
lexically.  This is confusing compared to other module documentation
available in the stdlib.

I would also like to make one comment about your updated docs (I didn't
read them all; I'm on vacation): in the section about 'Regex Functions'
you used r'\w+@\w+\.com' as a regular expression for an email address in
information about the search() function. This particular RE will only
give results for the simplest of email addresses. I understand that you
wanted to provide an example, but providing a generally broken example
will be detrimental to newer Python RE users, especially those who were
looking for a regular expression for email addresses.  I would say slim
it down to domain names, but even the correct RE for domain names (with
or without internationalization) is ugly.  I don't currently have an
idea of what kind of example would be simple and illustrative, but maybe
someone else has an idea.

> may i ask what the python developers is doing about the python's  
> docs? Are you guys aware, that there are rampant criticisms of python  
> docs and many diverse tries by various individuals to rewrite the doc  
> by starting another wiki or site?

If there are "rampant criticisms" of the Python docs, then those that
are complaining should take specific examples of their complaints to the
sourceforge bug tracker and submit documentation patches for the
relevant sections.  And personally, I've not noticed that criticisms of
the Python docs are "rampant", but maybe there is some "I hate Python
docs" newsgroup or mailing list that I'm not subscribed to.

While I personally think that having a wiki attached to the
documentation is a decent idea, I fear that we would run into a
situation like php, where the documentation is so atrocious that users
need to comment on basically every function in every package to
understand what the heck is going on.


 - Josiah


From brett at python.org  Thu Sep 28 20:25:25 2006
From: brett at python.org (Brett Cannon)
Date: Thu, 28 Sep 2006 11:25:25 -0700
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
	Python source
In-Reply-To: <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com>
References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com>
Message-ID: <bbaeab100609281125p275ad390x2bb9eaac857b617a@mail.gmail.com>

On 9/27/06, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> At 05:26 PM 9/27/2006 -0700, Brett Cannon wrote:
> >Ah, OK.  So for importing 'email', the zipimporter would call the .pyc
> >importer and it would ask the zipimporter, "can you get me email.pyc?"
> and
> >if it said no it would move on to asking the .py importer for email.py,
> etc.
>
> Yes, exactly.
>
>
> >That's fine.  Just thinking about how the current situation sucks for NFS
> >but how caching just isn't done.  But obvoiusly this could change.
>
> Well, with this design, you can have a CachingFilesystemImporter as your
> storage mechanism to speed things up.
>
>
> >> >>Of course, to fully support .pyc timestamp checking and writeback,
> you'd
> >> >>need some sort of "stat" or "getmtime" feature on the parent
> importer, as
> >> >>well as perhaps an optional "save_data" method.  These would be
> extensions
> >> >>to PEP 302, but welcome ones.
> >> >
> >> >Could pass the string representing the location of where the string
> came
> >> >from.  That would allow for the required stat calls for .pyc files as
> >> >needed without having to implement methods just for this one use case.
> >>
> >>Huh?  In order to know if a .pyc is up to date, you need the st_mtime of
> >>the .py file.  That can't be done in the parent importer without giving
> it
> >>format knowledge, which goes against the point of the exercise.
> >
> >Sorry, thought .pyc files based whether they needed to be recompiled
> based
> >on the stat info on the .py and .pyc file, not on data stored from within
> >the .pyc .
>
> It's not just that (although I believe it's also the case that there is a
> timestamp inside .pyc), it's that to do the check in the parent importer,
> the parent importer would have to know that there is such a thing as
> .py-and-.pyc.  The whole point of this design is that the parent importer
> doesn't have to know *anything* about filename extensions OR how those
> files are formatted internally.  In this scheme, adding more child
> importers is sufficient to add all the special handling needed for
> .py/.pyc-style schemes.
>
> Of course, for maximum flexibility, you might want get_stream() and
> get_file() methods optionally available, since a .so loader really needs a
> file, and .pyc might want to read in two stages.  But the child importers
> can be defensively coded so as to be able to live with only a
> parent.get_data(), if necessary, and do the enhanced behaviors only if
> stat() or get_stream() or write_data() etc. attributes are available on
> the
> parent.


Yeah, how to get the proper information to the data importers is going to be
the trick.

If we get some standards for these additional attributes, we can document
> them as standard PEP 302 extensions.
>
> The format importer mechanism might want to have something like
> 'sys.import_formats' as a list of importer classes (or factories).  Parent
> (storage) importer classes would then create instances to use.
>
> If you add a new format importer to sys.import_formats, you would of
> course
> need to clear sys.path_importer_cache, so that the individual importers
> are
> rebuilt on the next import, and thus they will create new child importer
> chains.
>
> Yeah, that pretty much ought to do it.



I will think about it, but I am still trying to get answers from people to
the original question of how bad the C code is compared to rewriting import
in Python.  =)

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060928/979594a0/attachment.html 

From pje at telecommunity.com  Thu Sep 28 20:35:30 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Sep 2006 14:35:30 -0400
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
 Python source
In-Reply-To: <bbaeab100609281125p275ad390x2bb9eaac857b617a@mail.gmail.co
 m>
References: <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20060928143335.02e08360@sparrow.telecommunity.com>

At 11:25 AM 9/28/2006 -0700, Brett Cannon wrote:
>I will think about it, but I am still trying to get the original question 
>of how bad the C code is compared to rewriting import in Python from 
>people.  =)

I would say that the C code is *delicate*, not necessarily bad.  In most 
ways, it's rather straightforward, it's actually the requirements that are 
complex.  :)

A Python implementation, however, would be a good idea to have around for 
PyPy, Py3K, and other versions of Python, and as a refactoring basis for 
writing any new C code.


From tomerfiliba at gmail.com  Thu Sep 28 20:40:34 2006
From: tomerfiliba at gmail.com (tomer filiba)
Date: Thu, 28 Sep 2006 20:40:34 +0200
Subject: [Python-Dev] weakref enhancements
Message-ID: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>

i'd like to suggest adding weak attributes and weak methods to the std
weakref module.

weakattrs are weakly-referenced attributes. when the value they reference
is no longer strongly-referenced by something else, the weakattrs "nullify"
themselves.
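
roughly, the weakattr idea boils down to a descriptor like this (a sketch
only, not the recipe's actual code):

    import weakref

    class weakattr(object):
        def __init__(self, name):
            self.slot = '_weakattr_' + name

        def __set__(self, obj, value):
            # keep only a weak reference to the assigned value
            setattr(obj, self.slot, weakref.ref(value))

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            ref = getattr(obj, self.slot, None)
            if ref is None:
                return None
            return ref()   # None once the value has been collected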

weakmethod is a method decorator, like classmethod et al, that returns
"weakly bound" methods. a weakmethod's im_self is a weakref.proxy to
`self`, which means holding the bound method will not keep the entire
instance alive; instead, using it once the instance is gone raises a
ReferenceError.
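
and a stripped-down weakmethod could be a descriptor-returning decorator
like this (again a sketch for new-style classes, not the recipe's actual
code):

    import weakref

    def weakmethod(func):
        class _weakly_bound(object):
            def __get__(self, obj, objtype=None):
                if obj is None:
                    return func
                proxy = weakref.proxy(obj)
                def bound(*args, **kwargs):
                    # using the proxy after the instance is gone
                    # raises ReferenceError
                    return func(proxy, *args, **kwargs)
                return bound
        return _weakly_bound()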

i think these two features are quite useful, and being part of the stdlib,
would provide programmers with easy-to-use solutions to object-aliveness
issues.

more info, examples, and suggested implementation:
* http://sebulba.wikispaces.com/recipe+weakattr
* http://sebulba.wikispaces.com/recipe+weakmethod


-tomer
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060928/7c3f1fc2/attachment.html 

From theller at python.net  Thu Sep 28 20:54:02 2006
From: theller at python.net (Thomas Heller)
Date: Thu, 28 Sep 2006 20:54:02 +0200
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
	Python source
In-Reply-To: <5.1.1.6.0.20060928143335.02e08360@sparrow.telecommunity.com>
References: <5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com>	<5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>	<5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>	<5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com>
	<bbaeab100609281125p275ad390x2bb9eaac857b617a@mail.gmail.co m>
	<5.1.1.6.0.20060928143335.02e08360@sparrow.telecommunity.com>
Message-ID: <efh5ob$qsm$2@sea.gmane.org>

Phillip J. Eby schrieb:
> At 11:25 AM 9/28/2006 -0700, Brett Cannon wrote:
>>I will think about it, but I am still trying to get the original question 
>>of how bad the C code is compared to rewriting import in Python from 
>>people.  =)
> 
> I would say that the C code is *delicate*, not necessarily bad.  In most 
> ways, it's rather straightforward, it's actually the requirements that are 
> complex.  :)
> 
> A Python implementation, however, would be a good idea to have around for 
> PyPy, Py3K, and other versions of Python, and as a refactoring basis for 
> writing any new C code.

FYI, Gordon McMillan had a Python 'model' of the import mechanism in his
"iu.py" (not sure if that was really its name).  It was part of his
installer utility; maybe the code still lives in the PyInstaller project.
IIRC, parts of PEP 302 were inspired by his code.

Thomas


From rhettinger at ewtllc.com  Thu Sep 28 21:02:20 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Thu, 28 Sep 2006 12:02:20 -0700
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
Message-ID: <451C1C3C.5010004@ewtllc.com>

tomer filiba wrote:

> i'd like to suggest adding weak attributes and weak methods to the std 
> weakref
> module.

 . . .

>
> i think these two features are quite useful, and being part of the 
> stdlib, would
> provide programmers with easy-to-use solutions to object-aliveness issues.
>
> more info, examples, and suggested implementation:
> * http://sebulba.wikispaces.com/recipe+weakattr
> * http://sebulba.wikispaces.com/recipe+weakmethod 
> <http://sebulba.wikispaces.com/recipe+weakmethod>
>

I'm sceptical that these would find use in practice.  The cited links 
have only toy examples and as motivation reference Greg Ewing's posting 
saying only "I'm thinking it would be nice . . . This could probably be 
done fairly easily with a property descriptor." 

Also, I question the utility of maintaining a weakref to a method or 
attribute instead of holding one for the object or class.  As long as 
the enclosing object or class lives, so too will their methods and 
attributes.  So what is the point of a tighter weakref granularity?

So, before being entertained for addition to the standard library, this 
idea should probably first be posted as an ASPN recipe, then we can see 
if any use cases emerge in actual practice.  Then we could look at 
sample code fragments to see if any real-world code is actually improved 
with the new toys.  My bet is that very few will emerge, that most would 
be better served by a simple decorator, and that an expanding weakref 
zoo will only make the module more difficult to learn.


Raymond







From python at rcn.com  Thu Sep 28 21:23:31 2006
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 28 Sep 2006 12:23:31 -0700
Subject: [Python-Dev] weakref enhancements
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
	<451C1C3C.5010004@ewtllc.com>
Message-ID: <011401c6e333$96b35450$ea146b0a@RaymondLaptop1>

> Also, I question the utility of maintaining a weakref to a method or
> attribute instead of holding one for the object or class.

Strike that paragraph -- the proposed weakattrs have references away from the 
object, not to the object.


Raymond 

From p.f.moore at gmail.com  Thu Sep 28 21:25:55 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 28 Sep 2006 20:25:55 +0100
Subject: [Python-Dev] difficulty of implementing phase 2 of PEP 302 in
	Python source
In-Reply-To: <efh5ob$qsm$2@sea.gmane.org>
References: <5.1.1.6.0.20060927181456.03e8c088@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927192728.02df6310@sparrow.telecommunity.com>
	<5.1.1.6.0.20060927203149.028c5e90@sparrow.telecommunity.com>
	<5.1.1.6.0.20060928143335.02e08360@sparrow.telecommunity.com>
	<efh5ob$qsm$2@sea.gmane.org>
Message-ID: <79990c6b0609281225t27df9e4al7aa491c2d7008d6a@mail.gmail.com>

> Phillip J. Eby schrieb:

> > I would say that the C code is *delicate*, not necessarily bad.  In most
> > ways, it's rather straightforward, it's actually the requirements that are
> > complex.  :)

From what I recall, that's right. The C code's main disadvantage is
that it isn't very well commented (as far as I recall) and there's no
documentation of precisely what it's trying to achieve (insofar as
there isn't a precise spec for how importing works in the Python docs,
covering all the subtleties of things like package imports, package
__path__ entries, reloading, etc etc...)

> > A Python implementation, however, would be a good idea to have around for
> > PyPy, Py3K, and other versions of Python, and as a refactoring basis for
> > writing any new C code.

It would also provide the basis of a much better spec - both because a
clear spec would need to be established before you could write it, and
because Python code is inherently readable...
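(For anyone following along, here is a minimal, self-contained sketch of
the PEP 302 hooks that such a Python implementation would be built around;
the module name is made up and the finder is deliberately trivial:)

    import imp, sys

    class DemoFinder(object):
        """Minimal PEP 302 meta_path hook: claim one module and load it."""
        def find_module(self, fullname, path=None):
            if fullname == 'virtual_demo':      # illustrative name only
                return self                     # we act as the loader too
            return None

        def load_module(self, fullname):
            mod = sys.modules.setdefault(fullname, imp.new_module(fullname))
            mod.__file__ = '<virtual>'
            mod.__loader__ = self
            mod.answer = 42
            return mod

    sys.meta_path.append(DemoFinder())
    import virtual_demo
    print virtual_demo.answer                   # -> 42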

On 9/28/06, Thomas Heller <theller at python.net> wrote:
> FYI, Gordon McMillan had a Python 'model' of the import mechanism in his,
> (not sure if it was really named) "iu.py".  It was part of his installer utility,
> maybe the code still lives in the PyInstaller project.  IIRC, parts of pep 302 were
> inspired by his code.

That's right. Lots of the path importer and metapath stuff came from iu.py.

I have an oldish copy (Installer 5b5_2, from 2003) if you can't get it
anywhere else...

Paul.

From tomerfiliba at gmail.com  Thu Sep 28 21:57:32 2006
From: tomerfiliba at gmail.com (tomer filiba)
Date: Thu, 28 Sep 2006 21:57:32 +0200
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <011401c6e333$96b35450$ea146b0a@RaymondLaptop1>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
	<451C1C3C.5010004@ewtllc.com>
	<011401c6e333$96b35450$ea146b0a@RaymondLaptop1>
Message-ID: <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com>

> I'm sceptical that these would find use in practice.
> [..]
> Also, I question the utility of maintaining a weakref to a method or
> attribute instead of holding one for the object or class.  As long as
> the enclosing object or class lives, so too will their methods and
> attributes.  So what is the point of a tighter weakref granularity?

i didn't just come up with them "out of boredom"; i have had specific
use cases for these, mainly in rpyc3000... but since the rpyc3000
code base is still far from completion, i don't want to give examples
at this early stage.

however, these two are theoretically useful, so i refactored them out
of my code into recipes.


-tomer

On 9/28/06, Raymond Hettinger <python at rcn.com> wrote:
>
> > Also, I question the utility of maintaining a weakref to a method or
> > attribute instead of holding one for the object or class.
>
> Strike that paragraph -- the proposed weakattrs have references away from
> the
> object, not to the object.
>
>
> Raymond
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060928/80869972/attachment.html 

From aleaxit at gmail.com  Thu Sep 28 23:14:13 2006
From: aleaxit at gmail.com (Alex Martelli)
Date: Thu, 28 Sep 2006 14:14:13 -0700
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
	<451C1C3C.5010004@ewtllc.com>
	<011401c6e333$96b35450$ea146b0a@RaymondLaptop1>
	<1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com>
Message-ID: <e8a0972d0609281414y5f2b448bt46914d06bb62325e@mail.gmail.com>

On 9/28/06, tomer filiba <tomerfiliba at gmail.com> wrote:
> > I'm sceptical that these would find use in practice.
> > [..]
> > Also, I question the utility of maintaining a weakref to a method or
> > attribute instead of holding one for the object or class.  As long as
> > the enclosing object or class lives, so too will their methods and
> > attributes.  So what is the point of a tighter weakref granularity?
>
> i didn't just came up with them "out of boredom", i have had specific
> use cases for these, mainly in rpyc3000... but since the rpyc300
> code base is still far from completion, i don't want to give examples
> at this early stage.
>
> however, these two are theoretically useful, so i refactored them out
> of my code into recipes.

I've had use cases for "weakrefs to bound methods" (and there IS a
Cookbook recipe for them), as follows: sometimes I'm maintaining a
container of callables, which may be of various kinds including
functions, bound methods, etc; but I'd like the mere presence of a
callable in the container not to keep the callable alive (especially
when the callable in turn keeps alive an object with possibly massive
state). In practice I use wrapping and tricks, but it would be nice to
have cleaner standard library support for this. (Often the container
needs to be some form of Queue.Queue, since queues of callables are
a pattern I use very often to dispatch work requests to worker threads
in a thread pool).
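(A quick illustration, in CPython, of why a plain weakref is not enough
here and some wrapping trick is needed -- the bound method object is
created on the fly and dies immediately:)

    import weakref

    class Worker(object):
        def run(self):
            print "working"

    w = Worker()
    r = weakref.ref(w.run)   # weakref to the temporary bound-method object
    print r()                # None -- the bound method is already gone,
                             # even though w itself is still alive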


Alex

From rhettinger at ewtllc.com  Fri Sep 29 01:08:56 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Thu, 28 Sep 2006 16:08:56 -0700
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <e8a0972d0609281414y5f2b448bt46914d06bb62325e@mail.gmail.com>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>	<451C1C3C.5010004@ewtllc.com>	<011401c6e333$96b35450$ea146b0a@RaymondLaptop1>	<1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com>
	<e8a0972d0609281414y5f2b448bt46914d06bb62325e@mail.gmail.com>
Message-ID: <451C5608.3020000@ewtllc.com>

[Alex Martelli]

>I've had use cases for "weakrefs to boundmethods" (and there IS a
>Cookbook recipe for them),
>
Weakmethods make some sense (though they raise the question of why bound 
methods are being kept when the underlying object is no longer in use -- 
possibly as an unintended side-effect of aggressive optimization).

I'm more concerned about weakattr which hides the weak referencing from 
client code when it is usually the client that needs to know about the 
refcounts:

   n = SomeClass(x)
   obj.a = n
   del n                  # hmm, what happens now?

If obj.a is a weakattr, then n gets vaporized; otherwise, it lives.

It is clearer and less error-prone to keep the responsibility with the 
caller:

   n = SomeClass(x)
   obj.a = weakref.proxy(n)
   del n                 # now, it is clear what happens

The wiki-space example shows objects that directly assign a copy of self 
to an attribute of self. Even in that simplified, self-referential 
example, it is clear that correct functioning (when __del__ gets called) 
depends on knowing whether or not assignments are creating references.  
Accordingly, the code would be better off if the weak-referencing 
assignment were made explicit rather than hiding the weak-referencing 
wrapper in a descriptor.



Raymond

From bob at redivi.com  Fri Sep 29 01:39:08 2006
From: bob at redivi.com (Bob Ippolito)
Date: Thu, 28 Sep 2006 16:39:08 -0700
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <451C5608.3020000@ewtllc.com>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
	<451C1C3C.5010004@ewtllc.com>
	<011401c6e333$96b35450$ea146b0a@RaymondLaptop1>
	<1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com>
	<e8a0972d0609281414y5f2b448bt46914d06bb62325e@mail.gmail.com>
	<451C5608.3020000@ewtllc.com>
Message-ID: <6a36e7290609281639r38aa53dbk6f1f64be58cfaad2@mail.gmail.com>

On 9/28/06, Raymond Hettinger <rhettinger at ewtllc.com> wrote:
> [Alex Martelli]
>
> >I've had use cases for "weakrefs to boundmethods" (and there IS a
> >Cookbook recipe for them),
> >
> Weakmethods make some sense (though they raise the question of why bound
> methods are being kept when the underlying object is no longer in use --
> possibly as unintended side-effect of aggressive optimization).

There are *definitely* use cases for keeping bound methods around.

Contrived example:

    one_of = set([1,2,3,4]).__contains__
    filter(one_of, [2,4,6,8,10])

-bob

From raymond.hettinger at verizon.net  Fri Sep 29 03:03:43 2006
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 28 Sep 2006 18:03:43 -0700
Subject: [Python-Dev] weakref enhancements
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com><451C1C3C.5010004@ewtllc.com><011401c6e333$96b35450$ea146b0a@RaymondLaptop1><1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com><e8a0972d0609281414y5f2b448bt46914d06bb62325e@mail.gmail.com><451C5608.3020000@ewtllc.com>
	<6a36e7290609281639r38aa53dbk6f1f64be58cfaad2@mail.gmail.com>
Message-ID: <006f01c6e363$1bf64670$ea146b0a@RaymondLaptop1>

> There are *definitely* use cases for keeping bound methods around.
>
> Contrived example:
>
>    one_of = set([1,2,3,4]).__contains__
>    filter(one_of, [2,4,6,8,10])

ISTM, the example shows the (undisputed) utility of regular bound methods.

How does it show the need for methods bound weakly to the underlying object,
where the underlying can be deleted while the bound method persists, alive but 
unusable?


Raymond 


From bob at redivi.com  Fri Sep 29 03:13:14 2006
From: bob at redivi.com (Bob Ippolito)
Date: Thu, 28 Sep 2006 18:13:14 -0700
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <006f01c6e363$1bf64670$ea146b0a@RaymondLaptop1>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
	<451C1C3C.5010004@ewtllc.com>
	<011401c6e333$96b35450$ea146b0a@RaymondLaptop1>
	<1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com>
	<e8a0972d0609281414y5f2b448bt46914d06bb62325e@mail.gmail.com>
	<451C5608.3020000@ewtllc.com>
	<6a36e7290609281639r38aa53dbk6f1f64be58cfaad2@mail.gmail.com>
	<006f01c6e363$1bf64670$ea146b0a@RaymondLaptop1>
Message-ID: <6a36e7290609281813j1517017bga3304284dd6325a@mail.gmail.com>

On 9/28/06, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > There are *definitely* use cases for keeping bound methods around.
> >
> > Contrived example:
> >
> >    one_of = set([1,2,3,4]).__contains__
> >    filter(one_of, [2,4,6,8,10])
>
> ISTM, the example shows the (undisputed) utility of regular bound methods.
>
> How does it show the need for methods bound weakly to the underlying object,
> where the underlying can be deleted while the bound method persists, alive but
> unusable?

It doesn't. I seem to have misinterpreted your "Weakmethods have some
use (...)" sentence. Sorry for the noise.

-bob

From greg.ewing at canterbury.ac.nz  Fri Sep 29 03:13:46 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Sep 2006 13:13:46 +1200
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <451C1C3C.5010004@ewtllc.com>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
	<451C1C3C.5010004@ewtllc.com>
Message-ID: <451C734A.2060203@canterbury.ac.nz>

Raymond Hettinger wrote:
> Also, I question the utility of maintaining a weakref to a method or 
> attribute instead of holding one for the object or class.  As long as 
> the enclosing object or class lives, so too will their methods and 
> attributes.  So what is the point of a tighter weakref granularity?

I think you're misunderstanding what the OP means. A
weak attribute isn't a weak reference to an attribute,
it's an attribute that holds a weak reference and is
automatically dereferenced when you access it.

A frequent potential use case is parent-child relationships.
To avoid creating cycles you'd like to make the link
from child to parent weak, but doing that with raw
weakrefs is somewhat tedious and doesn't feel worth
the bother. If I could just declare the attribute
to be weak and then use it like a normal attribute
from then on, I would probably use this technique
more often.
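For concreteness, a minimal sketch of such a descriptor (illustrative
only -- names and details differ from the recipe on the wiki):

    import weakref

    class weakattr(object):
        """Data descriptor that stores only a weak reference to its value."""
        def __init__(self, name):
            self._name = '_weak_' + name

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            ref = getattr(obj, self._name, None)
            if ref is None:
                return None
            return ref()                 # None once the referent is gone

        def __set__(self, obj, value):
            # value must be weak-referenceable
            setattr(obj, self._name, weakref.ref(value))

    class Child(object):
        parent = weakattr('parent')      # child -> parent link, no cycle

        def __init__(self, parent):
            self.parent = parent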

> So, before being entertained for addition to the standard library, this 
> idea should probably first be posted as an ASPN recipe,

That's a reasonable idea.

--
Greg

From stephen at xemacs.org  Fri Sep 29 02:49:35 2006
From: stephen at xemacs.org (stephen at xemacs.org)
Date: Fri, 29 Sep 2006 09:49:35 +0900
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <20060928095951.08BF.JCARLSON@uci.edu>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
Message-ID: <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>

Josiah Carlson writes:

 > fine).  While I have heard comments along the lines of "the docs could
 > be better", I've never heard the claim that the Python docs are "lousy".

FYI, I have heard this, recently, from Tom Lord (the developer of
Arch, rx, guile, etc.).  Since he also took a swipe at Emacsen, I
pressed him on what he meant.  He immediately backtracked on "(all)
Python docs" and "lousy", but did say that in his opinion scripting
languages that provide docstrings have lost a fair amount of coherence
in their documentation, and that Python's are consistent with the
general trend.  (He's started using Python relatively recently and
does not claim a historical perspective.)

What is lost according to him is information about how the elements of
a module work together.  The docstrings tend to be narrowly focused on
the particular function or variable, and too often discuss
implementation details.  On the other hand, manuals tend to become
either tutorials or compendia of the docstrings.

 > If there are "rampant criticisms" of the Python docs, then those that
 > are complaining should take specific examples of their complaints to the
 > sourceforge bug tracker and submit documentation patches for the
 > relevant sections.

What they *should* do, but don't, is not necessarily a reflection on
the accuracy of what they say.

FWIW ... I find the documentation for the language, the standard
library, and the Python applications I use quite adequate for my own
use.


From stephen at xemacs.org  Fri Sep 29 03:29:00 2006
From: stephen at xemacs.org (stephen at xemacs.org)
Date: Fri, 29 Sep 2006 10:29:00 +0900
Subject: [Python-Dev]  Python Doc problems
In-Reply-To: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
Message-ID: <17692.30428.14501.569620@uwakimon.sk.tsukuba.ac.jp>

xah lee writes:

 > anyway, i've rewrote the Python's RE module documentation, at:
 >   http://xahlee.org/perl-python/python_re-write/lib/module-re.html

-1

The current docs could be improved (but not by me, at least not
today), but I don't consider the general direction of Xah's edits
desirable.  Eg, the current table of contents is just as accurate and
more precise than Xah's top node, which makes navigation faster for
someone who knows what he forgot.<wink>  In general his changes
improve the "narrative flow", but for me that's a very low priority in
a reference manual, while the cost in loss of navigability of his
changes is pretty high for me.


From steve at holdenweb.com  Fri Sep 29 03:41:01 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 29 Sep 2006 02:41:01 +0100
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <17692.30428.14501.569620@uwakimon.sk.tsukuba.ac.jp>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<17692.30428.14501.569620@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <efhthr$5h4$1@sea.gmane.org>

stephen at xemacs.org wrote:
> xah lee writes:
> 
>  > anyway, i've rewrote the Python's RE module documentation, at:
>  >   http://xahlee.org/perl-python/python_re-write/lib/module-re.html
> 
> -1
> 
> The current docs could be improved (but not by me, at least not
> today), but I don't consider the general direction of Xah's edits
> desirable.  Eg, the current table of contents is just as accurate and
> more precise than Xah's top node, which makes navigation faster for
> someone who knows what he forgot.<wink>  In general his changes
> improve the "narrative flow", but for me that's a very low priority in
> a reference manual, while the cost in loss of navigability of his
> changes is pretty high for me.
> 
'Fraid that doesn't get him any nearer his hundred bucks, then. Xah: the 
money is still on offer should you choose to rewrite until the criteria 
are satisfied.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From barry at python.org  Fri Sep 29 03:59:24 2006
From: barry at python.org (Barry Warsaw)
Date: Thu, 28 Sep 2006 21:59:24 -0400
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 28, 2006, at 8:49 PM, <stephen at xemacs.org> wrote:

> What is lost according to him is information about how the elements of
> a module work together.  The docstrings tend to be narrowly focused on
> the particular function or variable, and too often discuss
> implementation details.  On the other hand, manuals tend to become
> either tutorials or compedia of the docstrings.

There's no doubt that writing good documentation is an art form.   
There's also the pull between wanting to write reference docs for  
those who know what they've forgotten (I love that phrase!) and  
writing the introductory or "how it hangs together" documentation.   
It's not easy at all, and some of Python's documentation does better  
at this than others.  In the vast array of FOSS and for-pay docs I've  
read in my career, Python actually ain't too bad. :)

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRRx+AXEjvBPtnXfVAQJgNgP8D9f2ZUqIDTUmQU8BRx4iqjbXQANrdHt1
usZCwguIS4pa0pmUp73E514y+tDs1UzU1E2I2itIifqtKXZuPOSZYG/DWcg4h8vh
KPCygqSDNiW5dr77UP4QBXk3DOoj68E/WpLWOquoLB/eOYWOa08lh+XEJ9ShHF1F
WfHMygrtpqk=
=vEEN
-----END PGP SIGNATURE-----

From greg.ewing at canterbury.ac.nz  Fri Sep 29 04:24:23 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 29 Sep 2006 14:24:23 +1200
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
	<34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org>
Message-ID: <451C83D7.5090705@canterbury.ac.nz>

Barry Warsaw wrote:

> There's also the pull between wanting to write reference docs for  
> those who know what they've forgotten (I love that phrase!) and  
> writing the introductory or "how it hangs together" documentation. 

The trick to this, I think, is not to try to make the same
piece of documentation serve both purposes.

An example of a good way to do it is the original Inside
Macintosh series. Each chapter started with a narrative-style
"About this module" kind of section, that introduced the
relevant concepts and explained how they fitted together,
without going into low-level details. Then there was a
"Reference" section that systematically went through and
gave all the details of the API.

While Inside Mac could often be criticised for omitting
rather important info in either section now and then, I
think they had the basic structure of the docs right.

--
Greg

From tomerfiliba at gmail.com  Fri Sep 29 09:33:35 2006
From: tomerfiliba at gmail.com (tomer filiba)
Date: Fri, 29 Sep 2006 09:33:35 +0200
Subject: [Python-Dev] weakref enhancements
In-Reply-To: <451C5608.3020000@ewtllc.com>
References: <1d85506f0609281140n324db9f5g206de1a13a3e55c5@mail.gmail.com>
	<451C1C3C.5010004@ewtllc.com>
	<011401c6e333$96b35450$ea146b0a@RaymondLaptop1>
	<1d85506f0609281257j4187a573n756b38b4ea072a8a@mail.gmail.com>
	<e8a0972d0609281414y5f2b448bt46914d06bb62325e@mail.gmail.com>
	<451C5608.3020000@ewtllc.com>
Message-ID: <1d85506f0609290033l1b276ea7j3a833c57c281c343@mail.gmail.com>

this may still be premature, but i see people have misunderstood the purpose.

weakattrs are not likely to be used "externally", outside the scope of
the object.
they are just meant to provide an easy-to-use means of not holding cyclic
references between parents and children.

many graph-like structures, e.g., rpyc's nodes and proxies, are interconnected
in both directions, and weakattrs help to solve that: i don't want a proxy of a node
to keep the node alive.

weakmethods are used very similarly. nodes have a method called
"getmodule" that performs remote importing of modules. i expose these
modules as a namespace object, so you could do:
>>> mynode.modules.sys
or
>>> mynode.modules.xml.dom.minidom.parseString
instead of
>>> mynode.getmodule("xml.dom.minidom").parseString

here's a sketch:

class ModuleNamespace:
    def __init__(self, importerfunc):
        self.importerfunc = importerfunc

class Node:
    def __init__(self, stream):
        ....
        self.modules = ModuleNamespace(self.getmodule)

    @weakmethod
    def getmodule(self, name):
        ....

i define this getmodule method as a *weakmethod*, so the mere existence
of the ModuleNamespace instance will not keep the node alive. when the
node loses all external references, the ModuleNamespace should just
"commit suicide", and allow the node to be reclaimed.

yes, you can do all of these with today's weakref, but it takes quite
a lot of hassle to manually set up weakproxies every time.
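for reference, here is one possible shape for the weakmethod decorator used
in the sketch above (illustrative only, not the recipe's exact code):

    import weakref

    def weakmethod(func):
        """Make the bound form of func hold only a weak reference to self.

        Calling the method after the instance has been collected raises
        ReferenceError.
        """
        class _descr(object):
            def __get__(self, obj, objtype=None):
                if obj is None:
                    return func
                selfref = weakref.ref(obj)
                def bound(*args, **kwargs):
                    target = selfref()
                    if target is None:
                        raise ReferenceError("self has been garbage collected")
                    return func(target, *args, **kwargs)
                return bound
        return _descr()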


-tomer

On 9/29/06, Raymond Hettinger <rhettinger at ewtllc.com> wrote:
> [Alex Martelli]
>
> >I've had use cases for "weakrefs to boundmethods" (and there IS a
> >Cookbook recipe for them),
> >
> Weakmethods make some sense (though they raise the question of why bound
> methods are being kept when the underlying object is no longer in use --
> possibly as unintended side-effect of aggressive optimization).
>
> I'm more concerned about weakattr which hides the weak referencing from
> client code when it is usually the client that needs to know about the
> refcounts:
>
>    n = SomeClass(x)
>    obj.a = n
>    del n                  # hmm, what happens now?
>
> If obj.a is a weakattr, then n get vaporized; otherwise, it lives.
>
> It is clearer and less error-prone to keep the responsibility with the
> caller:
>
>    n = SomeClass(x)
>    obj.a = weakref.proxy(n)
>    del n                 # now, it is clear what happens
>
> The wiki-space example shows objects that directly assign a copy of self
> to an attribute of self. Even in that simplified, self-referential
> example, it is clear that correct functioning (when __del__ gets called)
> depends knowing whether or not assignments are creating references.
> Accordingly, the code would be better-off if the weak-referencing
> assignment was made explicit rather than hiding the weak-referencing
> wrapper in a descriptor.
>
>
>
> Raymond
>

From nick at craig-wood.com  Fri Sep 29 10:14:02 2006
From: nick at craig-wood.com (Nick Craig-Wood)
Date: Fri, 29 Sep 2006 09:14:02 +0100
Subject: [Python-Dev] Caching float(0.0)
Message-ID: <20060929081402.GB19781@craig-wood.com>

I just discovered that in a program of mine it was wasting 7MB out of
200MB by storing multiple copies of 0.0.  I found this a bit surprising
since I'm used to small ints and strings being cached.

I added the apparently nonsensical lines

+        if age == 0.0:
+            age = 0.0                   # return a common object for the common case

and got 7MB of memory back!

Eg :-

Python 2.5c1 (r25c1:51305, Aug 19 2006, 18:23:29) 
[GCC 4.1.2 20060814 (prerelease) (Debian 4.1.1-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a=0.0
>>> print id(a), id(0.0)
134738828 134738844
>>> 

Is there any reason why float() shouldn't cache the value of 0.0 since
it is by far and away the most common value?

A full cache of floats probably doesn't make much sense though, since
there are so many 'more' of them than integers and defining 'small'
isn't obvious.
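An application-level version of the same trick looks something like this
(illustrative only; it is not a change to float() itself, it should only be
fed floats, and it folds -0.0 into +0.0 because the two compare equal):

    _shared = {0.0: 0.0, 1.0: 1.0, -1.0: -1.0}

    def intern_float(x):
        # Return one shared object for the most common values,
        # and pass everything else through untouched.
        return _shared.get(x, x)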

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick

From Jack.Jansen at cwi.nl  Fri Sep 29 11:25:50 2006
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Fri, 29 Sep 2006 11:25:50 +0200
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <451C83D7.5090705@canterbury.ac.nz>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
	<34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org>
	<451C83D7.5090705@canterbury.ac.nz>
Message-ID: <32936525-6318-45DA-A8A9-57D3755C4F10@cwi.nl>


On 29-sep-2006, at 4:24, Greg Ewing wrote:
> An example of a good way to do it is the original Inside
> Macintosh series. Each chapter started with a narrative-style
> "About this module" kind of section, that introduced the
> relevant concepts and explained how they fitted together,
> without going into low-level details. Then there was a
> "Reference" section that systematically went through and
> gave all the details of the API.

Yep, this is exactly what I often miss in the Python library docs.
The module intro sections often do contain the "executive
summary" of the module, so that you can quickly see whether this
module could indeed help you solve the problem at hand.
But then you go straight to descriptions of classes and methods,
and there is often no info on how things are plumbed together, both
within the module (how the classes relate to each other) and
more globally (how this module relates to others, see also).

A similar thing occurs one level higher in the library hierarchy:
the section introductions are little more than a list of all the
modules in the section.
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma  
Goldman


-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2255 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060929/7f604aa0/attachment.bin 

From amk at amk.ca  Fri Sep 29 14:10:35 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Fri, 29 Sep 2006 08:10:35 -0400
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20060929121035.GA4884@localhost.localdomain>

On Fri, Sep 29, 2006 at 09:49:35AM +0900, stephen at xemacs.org wrote:
> What is lost according to him is information about how the elements of
> a module work together.  The docstrings tend to be narrowly focused on
> the particular function or variable, and too often discuss
> implementation details.  

I agree with this, and am not very interested in tools such as epydoc
for this reason.  In such autogenerated documentation, you wind up
with a list of every single class and function, and both trivial and
important classes are given exactly the same emphasis.  Such docs are
useful as a reference when you know what class you need to look at,
but then pydoc also works well for that purpose.

--amk

From ndbecker2 at gmail.com  Fri Sep 29 14:20:48 2006
From: ndbecker2 at gmail.com (Neal Becker)
Date: Fri, 29 Sep 2006 08:20:48 -0400
Subject: [Python-Dev] os.unlink() closes file?
Message-ID: <efj34l$ct9$1@sea.gmane.org>

It seems (I haven't looked at source) that os.unlink() will close the file?

If so, please make this optional.  It breaks the unix idiom for making a
temporary file.

(Yes, I know there is a tempfile module, but I need some behavior it doesn't
implement so I want to do it myself).
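For reference, the idiom I mean is roughly this (sketch only; the path and
the lack of collision/error handling are purely illustrative):

    import os

    path = '/tmp/scratch-%d' % os.getpid()
    f = open(path, 'w+b')
    os.unlink(path)          # removes the directory entry only
    f.write('still here')    # the open file object keeps the data reachable
    f.seek(0)
    print f.read()           # -> 'still here'
    f.close()                # now the space is actually reclaimed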



From ronaldoussoren at mac.com  Fri Sep 29 14:36:23 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 29 Sep 2006 14:36:23 +0200
Subject: [Python-Dev] os.unlink() closes file?
In-Reply-To: <efj34l$ct9$1@sea.gmane.org>
References: <efj34l$ct9$1@sea.gmane.org>
Message-ID: <10638710.1159533383149.JavaMail.ronaldoussoren@mac.com>

 
On Friday, September 29, 2006, at 02:22PM, Neal Becker <ndbecker2 at gmail.com> wrote:

>It seems (I haven't looked at source) that os.unlink() will close the file?
>
>If so, please make this optional.  It breaks the unix idiom for making a
>temporary file.
>
>(Yes, I know there is a tempfile module, but I need some behavior it doesn't
>implement so I want to do it myself).

On what platform? Do you have a script that demonstrates your problem? If yes, please file a bug in the bugtracker at http://www.sf.net/projects/python.

AFAIK os.unlink doesn't close files, and I cannot reproduce this problem (python2.3 on Solaris 9).

Ronald


From skip at pobox.com  Fri Sep 29 15:05:18 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 29 Sep 2006 08:05:18 -0500
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <20060929121035.GA4884@localhost.localdomain>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
	<20060929121035.GA4884@localhost.localdomain>
Message-ID: <17693.6670.189595.646482@montanaro.dyndns.org>


    Andrew> In such autogenerated documentation, you wind up with a list of
    Andrew> every single class and function, and both trivial and important
    Andrew> classes are given exactly the same emphasis.  

I find this true where I work as well.  Doxygen is used as a documentation
generation tool for our C++ class libraries.  Too many people use it as a
crutch to avoid writing documentation altogether.  It's worse in many
ways than tools like epydoc, because you don't need to write any docstrings
(or specially formatted comments) to generate reams and reams of virtual
paper.  This sort of documentation is all but useless for a Python
programmer like myself.  I don't really need to know the five syntactic
constructor variants.  I need to know how to use the classes which have been
exposed to me.

I guess this is a long-winded way of saying, "me too".

Skip

From ndbecker2 at gmail.com  Fri Sep 29 15:18:17 2006
From: ndbecker2 at gmail.com (Neal Becker)
Date: Fri, 29 Sep 2006 09:18:17 -0400
Subject: [Python-Dev] os.unlink() closes file?
References: <efj34l$ct9$1@sea.gmane.org>
	<10638710.1159533383149.JavaMail.ronaldoussoren@mac.com>
Message-ID: <efj6gd$peo$1@sea.gmane.org>

Ronald Oussoren wrote:

>  
> On Friday, September 29, 2006, at 02:22PM, Neal Becker
> <ndbecker2 at gmail.com> wrote:
> 
>>It seems (I haven't looked at source) that os.unlink() will close the
>>file?
>>
>>If so, please make this optional.  It breaks the unix idiom for making a
>>temporary file.
>>
>>(Yes, I know there is a tempfile module, but I need some behavior it
>>doesn't implement so I want to do it myself).
> 
> On what platform? Do you have a script that demonstrates your problem? If
> yes, please file a bug in the bugtracker at
> http://www.sf.net/projects/python.
> 
> AFAIK os.unlink doesn't close files, and I cannot reproduce this problem
> (python2.3 on Solaris 9).
> 
Sorry, my mistake.



From fredrik at pythonware.com  Fri Sep 29 17:11:02 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 29 Sep 2006 17:11:02 +0200
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To: <20060929081402.GB19781@craig-wood.com>
References: <20060929081402.GB19781@craig-wood.com>
Message-ID: <efjd24$jaf$1@sea.gmane.org>

Nick Craig-Wood wrote:

> Is there any reason why float() shouldn't cache the value of 0.0 since
> it is by far and away the most common value?

says who ?

(I just checked the program I'm working on, and my analysis tells me 
that the most common floating point value in that program is 121.216, 
which occurs 32 times.  from what I can tell, 0.0 isn't used at all.)

</F>


From kristjan at ccpgames.com  Fri Sep 29 17:18:17 2006
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=)
Date: Fri, 29 Sep 2006 15:18:17 -0000
Subject: [Python-Dev] Caching float(0.0)
Message-ID: <129CEF95A523704B9D46959C922A280002FE99A1@nemesis.central.ccp.cc>

Acting on this excellent advice, I have patched in a reuse for -1.0, 0.0 and 1.0 for EVE Online.  We use vectors and stuff a lot, and 0.0 is very, very common.  I'll report on the refcount of this for you shortly.

K 

> -----Original Message-----
> From: python-dev-bounces+kristjan=ccpgames.com at python.org 
> [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] 
> On Behalf Of Fredrik Lundh
> Sent: 29. september 2006 15:11
> To: python-dev at python.org
> Subject: Re: [Python-Dev] Caching float(0.0)
> 
> Nick Craig-Wood wrote:
> 
> > Is there any reason why float() shouldn't cache the value 
> of 0.0 since 
> > it is by far and away the most common value?
> 
> says who ?
> 
> (I just checked the program I'm working on, and my analysis 
> tells me that the most common floating point value in that 
> program is 121.216, which occurs 32 times.  from what I can 
> tell, 0.0 isn't used at all.)
> 
> </F>
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/kristjan%40c
cpgames.com
> 

From kristjan at ccpgames.com  Fri Sep 29 18:11:25 2006
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=)
Date: Fri, 29 Sep 2006 16:11:25 -0000
Subject: [Python-Dev] Caching float(0.0)
Message-ID: <129CEF95A523704B9D46959C922A280002FE99A2@nemesis.central.ccp.cc>

Well gentlemen, I did gather some stats on the frequency of PyFloat_FromDouble().
Out of the first 1000 different floats allocated, we get this frequency distribution once our server has started up:

    value          count   (from a debugger watch on the counting vector)
    0.0           410612
    1.0           107838
    0.75           25487
    5.0            22557
    10000.0        18530
    -1.0           14950
    2.0            14460
    1500.0         13470
    100.0          11913
    0.5            11497
    3.0             9833
    20.0            9019
    0.9             8954
    10.0            8377
    4.0             7890
    0.05            7732
    1000.0          7456
    0.4             7427
    -100.0          7071
    5000.0          6851
    1000000.0       6503
    0.07            6071

(here I omit the rest).
In addition, my shared 0.0 double has some 200000 references at this point.
0.0 is very, very common.  The same can be said about the integral values up to 5.0, as well as -1.0.
I think I will add a simple cache for these values for EVE.
something like:
int i = (int) fval;
if ((double)i == fval && i >= -1 && i < 6) {
	/* table[0..6] holds the shared floats for -1.0 .. 5.0 */
	Py_INCREF(table[i + 1]);
	return table[i + 1];
}



Cheers,

Kristj?n
> -----Original Message-----
> From: python-dev-bounces+kristjan=ccpgames.com at python.org 
> [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] 
> On Behalf Of Kristj?n V. J?nsson
> Sent: 29. september 2006 15:18
> To: Fredrik Lundh; python-dev at python.org
> Subject: Re: [Python-Dev] Caching float(0.0)
> 
> Acting on this excellent advice, I have patched in a reuse 
> for -1.0, 0.0 and 1.0 for EVE Online.  We use vectors and 
> stuff a lot, and 0.0 is very, very common.  I'll report on 
> the refcount of this for you shortly.
> 
> K 
> 
> > -----Original Message-----
> > From: python-dev-bounces+kristjan=ccpgames.com at python.org
> > [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org]
> > On Behalf Of Fredrik Lundh
> > Sent: 29. september 2006 15:11
> > To: python-dev at python.org
> > Subject: Re: [Python-Dev] Caching float(0.0)
> > 
> > Nick Craig-Wood wrote:
> > 
> > > Is there any reason why float() shouldn't cache the value
> > of 0.0 since
> > > it is by far and away the most common value?
> > 
> > says who ?
> > 
> > (I just checked the program I'm working on, and my analysis 
> tells me 
> > that the most common floating point value in that program 
> is 121.216, 
> > which occurs 32 times.  from what I can tell, 0.0 isn't 
> used at all.)
> > 
> > </F>
> > 
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > http://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe: 
> > http://mail.python.org/mailman/options/python-dev/kristjan%40c
> cpgames.com
> > 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/kristjan%40c
cpgames.com
> 

From lcaamano at gmail.com  Fri Sep 29 18:49:23 2006
From: lcaamano at gmail.com (Luis P Caamano)
Date: Fri, 29 Sep 2006 12:49:23 -0400
Subject: [Python-Dev] PEP 355 status
Message-ID: <c56e219d0609290949y1e3a645bqeb9af243441682a4@mail.gmail.com>

What's the status of PEP 355, Path - Object oriented filesystem paths?

We'd like to start using the current reference implementation but we'd
like to do it in a manner that minimizes any changes needed when Path
becomes part of stdlib.

In particular, the reference implementation in
http://wiki.python.org/moin/PathModule names the class 'path' instead
of 'Path', which seems like a source of name conflict problems.

How would you recommend one starts using it now, as is or renaming
class path to Path?

Thanks

-- 
Luis P Caamano
Atlanta, GA USA

From jason.orendorff at gmail.com  Fri Sep 29 19:47:54 2006
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Fri, 29 Sep 2006 13:47:54 -0400
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To: <efjd24$jaf$1@sea.gmane.org>
References: <20060929081402.GB19781@craig-wood.com>
	<efjd24$jaf$1@sea.gmane.org>
Message-ID: <bb8868b90609291047m2ed2ed0q1aa9708b4de82092@mail.gmail.com>

On 9/29/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> (I just checked the program I'm working on, and my analysis tells me
> that the most common floating point value in that program is 121.216,
> which occurs 32 times.  from what I can tell, 0.0 isn't used at all.)

*bemused look*  Fredrik, can you share the reason why this number
occurs 32 times in this program?  I don't mean to imply anything by
that; it just sounds like it might be a fun story.  :)

Anyway, this kind of static analysis is probably more entertaining
than relevant.  For your enjoyment, the most-used float literals in
python25\Lib, omitting test directories, are:

1e-006: 5 hits
4.0: 6 hits
0.05: 7 hits
6.0: 8 hits
0.5: 13 hits
2.0: 25 hits
0.0: 36 hits
1.0: 62 hits

There are two hits each for -1.0 and -0.5.

In my own Python code, I don't even have enough float literals to bother with.

-j

From nmm1 at cus.cam.ac.uk  Fri Sep 29 20:03:42 2006
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Fri, 29 Sep 2006 19:03:42 +0100
Subject: [Python-Dev] Caching float(0.0)
Message-ID: <E1GTMi6-0005AC-Ed@draco.cus.cam.ac.uk>

"Jason Orendorff" <jason.orendorff at gmail.com> wrote:
>
> Anyway, this kind of static analysis is probably more entertaining
> than relevant.  ...

Well, yes.  One can tell that by the piffling little counts being
bandied about!  More seriously, yes, it is Well Known that 0.0 is
the Most Common Floating-Point Number in most numerical codes; a
lot of older (and perhaps modern) sparse matrix algorithms use that
to save space.

In the software floating-point for which I have started to draft some
example code, but have had to shelve (no, I haven't forgotten), the
values I predefine are Invalid, Missing, True Zero and Approximate
Zero.  The infinities and infinitesimals (a.k.a. signed zeroes)
could also be included, but are less common and more complicated.
And so could common integers and fractions.

It is generally NOT worth doing a cache lookup for genuinely
numerical code, as the common cases that are not the above rarely
account for enough of the numbers to be worth it.  I did a fair
amount of investigation looking for compressibility at one time,
and that conclusion jumped out at me.

The exact best choice depends entirely on what you are doing.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679

From guido at python.org  Fri Sep 29 21:03:03 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Sep 2006 12:03:03 -0700
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To: <E1GTMi6-0005AC-Ed@draco.cus.cam.ac.uk>
References: <E1GTMi6-0005AC-Ed@draco.cus.cam.ac.uk>
Message-ID: <ca471dc20609291203q350efd06wc1b8638afb04d97e@mail.gmail.com>

I see some confusion in this thread.

If a *LITERAL* 0.0 (or any other float literal) is used, you only get
one object, no matter how many times it is used.

But if the result of a *COMPUTATION* returns 0.0, you get a new object
for each such result. If you have 70 MB worth of zeros, that's clearly
computation results, not literals.

Attempts to remove literal references from source code won't help much.

I'm personally +0 on caching computational results with common float
values such as 0 and small (positive or negative) powers of two, e.g.
0.5, 1.0, 2.0.
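A quick way to see the difference in CPython (illustrative only; object
identity of floats is an implementation detail):

    def literals():
        a = 0.0
        b = 0.0
        return a is b        # True: both names load the same constant object

    def computed(x):
        a = x - x
        b = x - x
        return a is b        # False: each arithmetic result is a new float

    print literals()         # True
    print computed(1.0)      # False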

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From simon at brunningonline.net  Fri Sep 29 21:11:13 2006
From: simon at brunningonline.net (Simon Brunning)
Date: Fri, 29 Sep 2006 20:11:13 +0100
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <451C83D7.5090705@canterbury.ac.nz>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
	<34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org>
	<451C83D7.5090705@canterbury.ac.nz>
Message-ID: <8c7f10c60609291211u5804a9fdi20e09adfd7b56d74@mail.gmail.com>

On 9/29/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> An example of a good way to do it is the original Inside
> Macintosh series. Each chapter started with a narrative-style
> "About this module" kind of section, that introduced the
> relevant concepts and explained how they fitted together,
> without going into low-level details. Then there was a
> "Reference" section that systematically went through and
> gave all the details of the API.

The "How to use this module" sections sound like /F's "The Python
Standard Library", of which I keep the dead tree version on my desk
and the PDF version on my hard drive for when I'm coding in the pub. It
or something like it would be a superb addition to the (already very
good IMHO) Python docs.

-- 
Cheers,
Simon B,
simon at brunningonline.net

From fredrik at pythonware.com  Fri Sep 29 21:27:05 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 29 Sep 2006 21:27:05 +0200
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <8c7f10c60609291211u5804a9fdi20e09adfd7b56d74@mail.gmail.com>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>	<20060928095951.08BF.JCARLSON@uci.edu>	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>	<34B6804F-9F83-4415-8C2F-BEDD6CD9F63B@python.org>	<451C83D7.5090705@canterbury.ac.nz>
	<8c7f10c60609291211u5804a9fdi20e09adfd7b56d74@mail.gmail.com>
Message-ID: <efjs28$9g7$1@sea.gmane.org>

Simon Brunning wrote:

> The "How to use this module" sections sound like /F's "The Python
> Standard Library", of which I keep the dead tree version on my desk
> and the PDF vesion on my hard drive for when I'm coding in the pub. It
> or something like it would be a superb addition to the (already very
> good IMHO) Python docs.

that's what my old seealso proposal was supposed to address:

     http://effbot.org/zone/idea-seealso.htm

the standard library's seealso file is here:

     http://effbot.org/librarybook/seealso.xml

</F>


From guido at python.org  Fri Sep 29 21:29:31 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Sep 2006 12:29:31 -0700
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <20060929121035.GA4884@localhost.localdomain>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>
	<20060929121035.GA4884@localhost.localdomain>
Message-ID: <ca471dc20609291229i53234deatd105834ce464ebdf@mail.gmail.com>

On 9/29/06, A.M. Kuchling <amk at amk.ca> wrote:
> On Fri, Sep 29, 2006 at 09:49:35AM +0900, stephen at xemacs.org wrote:
> > What is lost according to him is information about how the elements of
> > a module work together.  The docstrings tend to be narrowly focused on
> > the particular function or variable, and too often discuss
> > implementation details.
>
> I agree with this, and am not very interested in tools such as epydoc
> for this reason.  In such autogenerated documentation, you wind up
> with a list of every single class and function, and both trivial and
> important classes are given exactly the same emphasis.  Such docs are
> useful as a reference when you know what class you need to look at,
> but then pydoc also works well for that purpose.

Right.

BTW isn't xah a well-known troll? (There are exactly 666 Google hits
for the query ``xah troll'' -- draw your own conclusions. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri Sep 29 21:38:22 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Sep 2006 12:38:22 -0700
Subject: [Python-Dev] PEP 355 status
In-Reply-To: <c56e219d0609290949y1e3a645bqeb9af243441682a4@mail.gmail.com>
References: <c56e219d0609290949y1e3a645bqeb9af243441682a4@mail.gmail.com>
Message-ID: <ca471dc20609291238l4c3f40a4t6106d78432a0905e@mail.gmail.com>

I would recommend not using it. IMO it's an amalgam of unrelated
functionality (much like the Java equivalent BTW) and the existing os
and os.path modules work just fine. Those who disagree with me haven't
done a very good job of convincing me, so I expect this PEP to remain
in limbo indefinitely, until it is eventually withdrawn or rejected.

--Guido

On 9/29/06, Luis P Caamano <lcaamano at gmail.com> wrote:
> What's the status of PEP 355, Path - Object oriented filesystem paths?
>
> We'd like to start using the current reference implementation but we'd
> like to do it in a manner that minimizes any changes needed when Path
> becomes part of stdlib.
>
> In particular, the reference implementation in
> http://wiki.python.org/moin/PathModule names the class 'path' instead
> of 'Path', which seems like a source of name conflict problems.
>
> How would you recommend one starts using it now, as is or renaming
> class path to Path?
>
> Thanks
>
> --
> Luis P Caamano
> Atlanta, GA USA
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From lcaamano at gmail.com  Fri Sep 29 22:15:44 2006
From: lcaamano at gmail.com (Luis P Caamano)
Date: Fri, 29 Sep 2006 16:15:44 -0400
Subject: [Python-Dev] PEP 355 status
In-Reply-To: <ca471dc20609291238l4c3f40a4t6106d78432a0905e@mail.gmail.com>
References: <c56e219d0609290949y1e3a645bqeb9af243441682a4@mail.gmail.com>
	<ca471dc20609291238l4c3f40a4t6106d78432a0905e@mail.gmail.com>
Message-ID: <c56e219d0609291315y62eb48b8tb9d2c290047eb8d7@mail.gmail.com>

Thanks for your reply, that's the kind of info I was looking for to
decide what to do.  Good enough, I'll move on then.

Thanks

-- 
Luis P Caamano
Atlanta, GA USA

On 9/29/06, Guido van Rossum <guido at python.org> wrote:
> I would recommend not using it. IMO it's an amalgam of unrelated
> functionality (much like the Java equivalent BTW) and the existing os
> and os.path modules work just fine. Those who disagree with me haven't
> done a very good job of convincing me, so I expect this PEP to remain
> in limbo indefinitely, until it is eventually withdrawn or rejected.
>
> --Guido
>
> On 9/29/06, Luis P Caamano <lcaamano at gmail.com> wrote:
> > What's the status of PEP 355, Path - Object oriented filesystem paths?
> >
> > We'd like to start using the current reference implementation but we'd
> > like to do it in a manner that minimizes any changes needed when Path
> > becomes part of stdlib.
> >
> > In particular, the reference implementation in
> > http://wiki.python.org/moin/PathModule names the class 'path' instead
> > of 'Path', which seems like a source of name conflict problems.
> >
> > How would you recommend one starts using it now, as is or renaming
> > class path to Path?
> >
> > Thanks
> >
> > --
> > Luis P Caamano
> > Atlanta, GA USA
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > http://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

From g.brandl at gmx.net  Fri Sep 29 22:18:16 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 29 Sep 2006 22:18:16 +0200
Subject: [Python-Dev] PEP 355 status
In-Reply-To: <ca471dc20609291238l4c3f40a4t6106d78432a0905e@mail.gmail.com>
References: <c56e219d0609290949y1e3a645bqeb9af243441682a4@mail.gmail.com>
	<ca471dc20609291238l4c3f40a4t6106d78432a0905e@mail.gmail.com>
Message-ID: <efjv29$kes$1@sea.gmane.org>

Shouldn't that paragraph be added to the PEP (e.g. under a "Status" subheading)?

enjoying-top-posting-ly,
Georg

Guido van Rossum wrote:
> I would recommend not using it. IMO it's an amalgam of unrelated
> functionality (much like the Java equivalent BTW) and the existing os
> and os.path modules work just fine. Those who disagree with me haven't
> done a very good job of convincing me, so I expect this PEP to remain
> in limbo indefinitely, until it is eventually withdrawn or rejected.
> 
> --Guido
> 
> On 9/29/06, Luis P Caamano <lcaamano at gmail.com> wrote:
>> What's the status of PEP 355, Path - Object oriented filesystem paths?
>>
>> We'd like to start using the current reference implementation but we'd
>> like to do it in a manner that minimizes any changes needed when Path
>> becomes part of stdlib.
>>
>> In particular, the reference implementation in
>> http://wiki.python.org/moin/PathModule names the class 'path' instead
>> of 'Path', which seems like a source of name conflict problems.
>>
>> How would you recommend one starts using it now, as is or renaming
>> class path to Path?
>>
>> Thanks
>>
>> --
>> Luis P Caamano
>> Atlanta, GA USA
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> http://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>>
> 
> 


From bjourne at gmail.com  Fri Sep 29 23:48:37 2006
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Fri, 29 Sep 2006 23:48:37 +0200
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <20060928095951.08BF.JCARLSON@uci.edu>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
Message-ID: <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com>

> If there are "rampant criticisms" of the Python docs, then those that
> are complaining should take specific examples of their complaints to the
> sourceforge bug tracker and submit documentation patches for the
> relevant sections.  And personally, I've not noticed that criticisms of
> the Python docs are "rampant", but maybe there is some "I hate Python
> docs" newsgroup or mailing list that I'm not subscribed to.

Meh! The number one complaint IS that you have to take your complaints
to the sourceforge bug tracker and submit documentation patches. For
documentation changes, that is way too much overhead for too little
gain. But thankfully I think there are people working on fixing those
problems, which is very nice.

-- 
mvh Bj?rn

From tzot at mediconsa.com  Sat Sep 30 01:27:25 2006
From: tzot at mediconsa.com (Christos Georgiou)
Date: Sat, 30 Sep 2006 02:27:25 +0300
Subject: [Python-Dev] Tix not included in 2.5 for Windows
Message-ID: <efka52$l43$1@sea.gmane.org>

Does anyone know why this happens? I can't find any information pointing to 
this being deliberate.

I just upgraded to 2.5 on Windows (after making sure I can build extensions 
with the freeware VC++ Toolkit 2003) and some of my programs stopped 
operating. I saw in a French forum that someone else had the same problem, 
and what they did was to copy the relevant files from a 2.4.3 installation. 
I did the same, and it seems it works, with only a console message appearing 
as soon as a root window is created:

attempt to provide package Tix 8.1 failed: package Tix 8.1.8.4 provided 
instead

Cheers. 



From jcarlson at uci.edu  Sat Sep 30 01:54:10 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 29 Sep 2006 16:54:10 -0700
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com>
References: <20060928095951.08BF.JCARLSON@uci.edu>
	<740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com>
Message-ID: <20060929164528.08DD.JCARLSON@uci.edu>


"BJ?rn Lindqvist" <bjourne at gmail.com> wrote:
> > If there are "rampant criticisms" of the Python docs, then those that
> > are complaining should take specific examples of their complaints to the
> > sourceforge bug tracker and submit documentation patches for the
> > relevant sections.  And personally, I've not noticed that criticisms of
> > the Python docs are "rampant", but maybe there is some "I hate Python
> > docs" newsgroup or mailing list that I'm not subscribed to.
> 
> Meh! The number one complaint IS that you have to take your complaints
> to the sourceforge bug tracker and submit documentation patches. For
> documentation changes, that is way too much overhead for too little
> gain. But thankfully I think there are people working on fixing those
> problems which is very nice.

Are you telling me that people want to be able to complain into the
ether and get their complaints heard?  I hope not, because that would be
insane.  Also, "doc patches" are basically "the function foo() should be
documented as ..."; users don't need to know or learn TeX.  Should there
be an easier method of submitting doc fixes, etc.? Sure. But people are
still going to need to actually *report* the fixes they want, which they
aren't doing in *any* form now.


 - Josiah


From brett at python.org  Sat Sep 30 02:23:55 2006
From: brett at python.org (Brett Cannon)
Date: Fri, 29 Sep 2006 17:23:55 -0700
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>
	<20060928095951.08BF.JCARLSON@uci.edu>
	<740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com>
Message-ID: <bbaeab100609291723k4f3aa602h74eb973655472e24@mail.gmail.com>

On 9/29/06, BJ?rn Lindqvist <bjourne at gmail.com> wrote:
>
> > If there are "rampant criticisms" of the Python docs, then those that
> > are complaining should take specific examples of their complaints to the
> > sourceforge bug tracker and submit documentation patches for the
> > relevant sections.  And personally, I've not noticed that criticisms of
> > the Python docs are "rampant", but maybe there is some "I hate Python
> > docs" newsgroup or mailing list that I'm not subscribed to.
>
> Meh! The number one complaint IS that you have to take your complaints
> to the sourceforge bug tracker and submit documentation patches. For
> documentation changes, that is way too much overhead for too little
> gain. But thankfully I think there are people working on fixing those
> problems which is very nice.


The PSF Infrastructure committee has already met and drafted our
suggestions.  Expect a post to the list on Monday or Tuesday outlining our
recommendation on a new issue tracker.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060929/6760e511/attachment.html 

From greg.ewing at canterbury.ac.nz  Sat Sep 30 02:57:55 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 30 Sep 2006 12:57:55 +1200
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To: <20060929081402.GB19781@craig-wood.com>
References: <20060929081402.GB19781@craig-wood.com>
Message-ID: <451DC113.4040002@canterbury.ac.nz>

Nick Craig-Wood wrote:

> Is there any reason why float() shouldn't cache the value of 0.0 since
> it is by far and away the most common value?

1.0 might be another candidate for caching.

Although the fact that nobody has complained about this
before suggests that it might not be a frequent enough
problem to be worth the effort.

--
Greg

From bob at redivi.com  Sat Sep 30 03:15:15 2006
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 29 Sep 2006 18:15:15 -0700
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To: <451DC113.4040002@canterbury.ac.nz>
References: <20060929081402.GB19781@craig-wood.com>
	<451DC113.4040002@canterbury.ac.nz>
Message-ID: <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com>

On 9/29/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Nick Craig-Wood wrote:
>
> > Is there any reason why float() shouldn't cache the value of 0.0 since
> > it is by far and away the most common value?
>
> 1.0 might be another candidate for caching.
>
> Although the fact that nobody has complained about this
> before suggests that it might not be a frequent enough
> problem to be worth the effort.

My guess is that people do have this problem, they just don't know
where that memory has gone. I know I don't count objects unless I have
a process that's leaking memory or it grows so big that I notice (by
swapping or chance).

That said, I've never noticed this particular issue.. but I deal with
mostly strings. I have had issues with the allocator a few times that
I had to work around, but not this sort of issue.

-bob

From rrr at ronadam.com  Sat Sep 30 03:15:04 2006
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 29 Sep 2006 20:15:04 -0500
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <20060929164528.08DD.JCARLSON@uci.edu>
References: <20060928095951.08BF.JCARLSON@uci.edu>	<740c3aec0609291448p38b8f3ebp75f0608ba6a99259@mail.gmail.com>
	<20060929164528.08DD.JCARLSON@uci.edu>
Message-ID: <451DC518.10609@ronadam.com>

Josiah Carlson wrote:
> "BJ?rn Lindqvist" <bjourne at gmail.com> wrote:
>>> If there are "rampant criticisms" of the Python docs, then those that
>>> are complaining should take specific examples of their complaints to the
>>> sourceforge bug tracker and submit documentation patches for the
>>> relevant sections.  And personally, I've not noticed that criticisms of
>>> the Python docs are "rampant", but maybe there is some "I hate Python
>>> docs" newsgroup or mailing list that I'm not subscribed to.
>> Meh! The number one complaint IS that you have to take your complaints
>> to the sourceforge bug tracker and submit documentation patches. For
>> documentation changes, that is way too much overhead for too little
>> gain. But thankfully I think there are people working on fixing those
>> problems which is very nice.
> 
> Are you telling me that people want to be able to complain into the
> ether and get their complaints heard?  I hope not, because that would be
> insane.  Also, "doc patches" are basically "the function foo() should be
> documented as ...", users don't need to know or learn TeX.  Should there
> be an easier method of submitting doc fixes, etc.? Sure. But people are
> still going to need to actually *report* the fixes they want, which they
> aren't doing in *any* form now.

Maybe a doc fix day (similar to the bug fix day) would be good.  That way we can
report a lot of minor doc fixes at once and then they can be fixed in batches.

Examples of things that I think may be thought of as too trivial to report but
that affect readability and ease of use with Python's help() function ...

A *lot* of doc strings have lines that wrap when they are displayed by Python's
help() in a standard 80-column console window.

There are also two (maybe more) modules that have single backslash characters in
their doc strings that get eaten when viewed by pydoc.

     cookielib.py   -  has single '\'s in a diagram.

     SimpleXMLRPCServer.py  - line 31... code example with line continuation.

I wonder if a double \\ should also be allowed as a line continuation so that
doctests would look and work OK in doc strings when viewed by Python's help()?
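
One workaround that needs no pydoc change is to mark such a doc string as raw
(or to double the backslashes) so the trailing backslash is not consumed as a
line continuation at compile time.  A minimal sketch, with a made-up function:

    def frobnicate():
        r"""Frobnicate something (made-up example).

        A continued shell command, preserved because the doc string is raw:

            $ python setup.py build \
                  install
        """

    # Without the r prefix, the backslash-newline pair would be consumed at
    # compile time and the two lines joined before help() ever saw them.
    print frobnicate.__doc__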


Anyway, if someone wants to search for other things of that type, they can play
around with the hacked-together tool included below.  Setting the limit low
enough that indented methods don't wrap under the help() function brings up
several thousand instances; I'm hoping most of those are duplicated/inherited
doc strings.

Many of those are documented format lines with the form ...

    name( long_list_of_arguments ... ) -> long_list_of_return_types ...


Rather than fix all of those, I'm changing the version of pydoc I've been 
rewriting to wordwrap lines.  Although that's not the prettiest solution, it's 
better than breaking the indented margin.


     Have fun...  ;-)

Ron


"""
     Find doc string lines that are longer than n characters.

     Dedenting the doc strings before testing may give more
     meaningful results.
"""
import sys
import os
import inspect
import types

class NullType(object):
     """ A simple Null object to use when None is a valid
     argument, or when redirecting print to Null. """
     def write(self, *args):
         pass
     def __repr__(self):
         return "Null"
Null = NullType()


check = 'CHECKING__________'
checkerr = 'ERROR CHECKING____'
err_obj = []
err_num = 0
stdout = sys.stdout
stderr = sys.stderr
seporator = '--------------------------------------------------------'
linelength = 100

def main():
     sys_path = sys.path
     # remove invalid dirs
     for f in sys_path[:]:
         try:
             os.listdir(f)
         except:
             sys_path.remove(f)
     #checkmodule('__builtin__')
     for mod in sys.builtin_module_names:
         checkmodule(mod)
     for dir_ in sys.path:
         for f in os.listdir(dir_):
             if f.endswith('.py') or f.endswith('.pyw') or f.endswith('.pyd'):
                 try:
                     checkmodule(f.partition('.')[0])
                 except Exception:
                     print seporator
                     print checkerr, f, err_obj
                     print '   %s: %s' % (sys.exc_type.__name__, sys.exc_value)
     print seporator

def checkmodule(modname):
     global err_obj
     err_obj = [modname]
     # Silent text printed on import.
     sys.stdout = sys.stderr = Null
     try:
         module = __import__(modname)
     finally:
         sys.stdout = stdout
         sys.stderr = stderr
     try:
         checkobj(module)               # module doc string
         for o1 in dir(module):
             obj1 = getattr(module, o1)
             err_obj = [modname, o1]
             checkobj(obj1)             # class and function doc strings
             for o2 in dir(obj1):
                 obj2 = getattr(obj1, o2)
                 err_obj = [modname, o1, o2]
                 checkobj(obj2)              # method doc strings
     finally:
         del module

def checkobj(obj):
     global err_num
     if not hasattr(obj, '__doc__'):
         return
     doc = str(obj.__doc__)
     err_obj.append('__doc__')
     lines = doc.split('\n')
     longlines = [x for x in lines if len(x) > linelength]
     if longlines:
         err_num += 1
         print seporator
         print '#%i: %s' % (err_num, '.'.join([str(x) for x in err_obj]))
         print
         for x in longlines:
             print len(x), repr(x.strip())

if __name__ == '__main__':
     main()






From glyph at divmod.com  Sat Sep 30 06:52:58 2006
From: glyph at divmod.com (glyph at divmod.com)
Date: Sat, 30 Sep 2006 00:52:58 -0400
Subject: [Python-Dev] PEP 355 status
In-Reply-To: <ca471dc20609291238l4c3f40a4t6106d78432a0905e@mail.gmail.com>
Message-ID: <20060930045258.1717.223590987.divmod.quotient.63544@ohm>


On Fri, 29 Sep 2006 12:38:22 -0700, Guido van Rossum <guido at python.org> wrote:
>I would recommend not using it. IMO it's an amalgam of unrelated
>functionality (much like the Java equivalent BTW) and the existing os
>and os.path modules work just fine. Those who disagree with me haven't
>done a very good job of convincing me, so I expect this PEP to remain
>in limbo indefinitely, until it is eventually withdrawn or rejected.

Personally I don't like the path module in question either, and I think that PEP 355 presents an exceptionally weak case, but I do believe that there are several serious use-cases for "object oriented" filesystem access.  Twisted has a module for doing this:

    http://twistedmatrix.com/trac/browser/trunk/twisted/python/filepath.py

I hope to one day propose this module as a replacement, or update, for PEP 355, but I have neither the time nor the motivation to do it currently.  I wouldn't propose it now; it is, for example, mostly undocumented, missing some useful functionality, and has some weird warts (for example, the name of the path-as-string attribute is "path").

However, since it's come up I thought I'd share a few of the use-cases for the general feature, and the things that Twisted has done with it.

1: Testing.  If you want to provide filesystem stubs to test code which interacts with the filesystem, it is fragile and extremely complex to temporarily replace the 'os' module; you have to provide a replacement which knows about all the hairy string manipulations one can perform on paths, and you'll almost always forget some weird platform feature.  If you have an object with a narrow interface to duck-type instead (for example, a "walk" method which returns similar objects, or an "open" method which returns a file-like object), mocking the appropriate parts of it in a test is a lot easier.  The proposed PEP 355 module can be used for this, but its interface is pretty wide and implicit (and portions of it are platform-specific), and because it is also a string you may still have to deal with platform-specific features in tests (or even mixed os.path manipulations, on the same object).

This is especially helpful when writing tests for error conditions that are difficult to reproduce on an actual filesystem, such as a network filesystem becoming unavailable.
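
A rough sketch of the kind of narrow, duck-typed stub this makes possible; the
names below (FakePath, read_config) are invented for illustration and are not
Twisted's API:

    import StringIO

    class FakePath(object):
        """In-memory stand-in for a path object in tests.

        Only the narrow interface the code under test relies on
        (child() and open()) needs to exist.
        """
        def __init__(self, children=None, content=''):
            self._children = children or {}
            self._content = content
        def child(self, name):
            return self._children[name]
        def open(self):
            return StringIO.StringIO(self._content)

    def read_config(path):
        # Code under test: depends only on the narrow path interface.
        return path.child('app.cfg').open().read()

    # No real filesystem, and no monkey-patched os module, is involved.
    root = FakePath(children={'app.cfg': FakePath(content='debug = 1\n')})
    assert read_config(root) == 'debug = 1\n'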

2: Fast failure, or for lack of a better phrase, "type correctness".  PEP 355 gets close to this idea when it talks about datetimes and sockets not being strings.  In many cases, code that manipulates filesystems is passing around 'str' or 'unicode' objects, and may be accidentally passed the contents of a file rather than its name, leading to a bizarre failure further down the line.  FilePath fails immediately with an "unsupported operand types" TypeError in that case.  It also provides nice, immediate feedback at the prompt that the object you're dealing with is supposed to be a filesystem path, with no confusion as to whether it represents a relative or absolute path, or a path relative to a particular directory.  Again, the PEP 355 module's subclassing of strings creates problems, because you don't get an immediate and obvious exception if you try to interpolate it with a non-path-name string; it silently "succeeds".
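
The fail-fast behaviour is easy to picture with a toy class (FilePathLike here
is an illustrative stand-in, not the real FilePath):

    class FilePathLike(object):
        """Toy path object that is deliberately not a string subclass."""
        def __init__(self, path):
            self.path = path

    p = FilePathLike('/etc/passwd')
    try:
        '/var/www' + p              # mixing a plain string with a path object
    except TypeError, e:
        print 'caught at the point of the mistake:', e
    # A str-subclassing path would have silently concatenated here instead of
    # failing where the confusion was introduced.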

3: Safety.  Almost every web server ever written (yes, including twisted.web) has been bitten by the "/../../../" bug at least once.  The default child(name) method of Twisted's file path class will only let you go "down" (to go "up" you have to call the parent() method), and will trap obscure platform features like the "NUL" and "CON" files on Windows so that you can't trick a program into manipulating something that isn't actually a file.  You can take strings you've read from an untrusted source and pass them to FilePath.child and get something relatively safe out.  PEP 355 doesn't mention this at all.
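
The "only go down" rule amounts to roughly the following check; this is an
illustrative sketch, not Twisted's implementation, which additionally has to
trap the Windows device names mentioned above:

    import os

    def safe_child(base, name):
        """Join name onto base, refusing segments that could escape base."""
        if os.sep in name or (os.altsep and os.altsep in name):
            raise ValueError('path separator in segment: %r' % name)
        if name in (os.curdir, os.pardir) or not name:
            raise ValueError('insecure path segment: %r' % name)
        return os.path.join(base, name)

    print safe_child('/var/www/htdocs', 'index.html')
    try:
        safe_child('/var/www/htdocs', '../../../etc/passwd')
    except ValueError, e:
        print 'rejected:', e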

4: last, but certainly not least: filesystem polymorphism.  For an example of what I mean, take a look at this in-development module:

    http://twistedmatrix.com/trac/browser/trunk/twisted/python/zippath.py

It's currently far too informal, and incomplete, and there's no specified interface.  However, this module shows that by being objects and not module-methods, FilePath objects can also provide a sort of virtual filesystem for Python programs.  With FilePath plus ZipPath, you can write Python programs which can operate on a filesystem directory or a directory within a Zip archive, depending on what object they are passed.

On a more subjective note, I've been gradually moving over personal utility scripts from os.path manipulations to twisted.python.filepath for years.  I can't say that this will be everyone's experience, but in the same way that Python scripts avoid the class of errors present in most shell scripts (quoting), t.p.f scripts avoid the class of errors present in most Python scripts (off-by-one errors when looking at separators or extensions).

I hope that eventually Python will include some form of OO filesystem access, but I am equally hopeful that the current PEP 355 path.py is not it.

From ncoghlan at gmail.com  Sat Sep 30 07:17:16 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Sep 2006 15:17:16 +1000
Subject: [Python-Dev] PEP 355 status
In-Reply-To: <20060930045258.1717.223590987.divmod.quotient.63544@ohm>
References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm>
Message-ID: <451DFDDC.9020708@gmail.com>

glyph at divmod.com wrote:
> I hope that eventually Python will include some form of OO filesystem
> access, but I am equally hopeful that the current PEP 355 path.py is not
> it.

+1

Cheers,
Nick.



-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From steve at holdenweb.com  Sat Sep 30 09:41:38 2006
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 30 Sep 2006 08:41:38 +0100
Subject: [Python-Dev] Python Doc problems
In-Reply-To: <ca471dc20609291229i53234deatd105834ce464ebdf@mail.gmail.com>
References: <0D5869E1-E635-4DD8-889F-2E34F79DB647@xahlee.org>	<20060928095951.08BF.JCARLSON@uci.edu>	<17692.28063.224114.905464@uwakimon.sk.tsukuba.ac.jp>	<20060929121035.GA4884@localhost.localdomain>
	<ca471dc20609291229i53234deatd105834ce464ebdf@mail.gmail.com>
Message-ID: <451E1FB2.9050209@holdenweb.com>

Guido van Rossum wrote:
> On 9/29/06, A.M. Kuchling <amk at amk.ca> wrote:
> 
>>On Fri, Sep 29, 2006 at 09:49:35AM +0900, stephen at xemacs.org wrote:
>>
>>>What is lost according to him is information about how the elements of
>>>a module work together.  The docstrings tend to be narrowly focused on
>>>the particular function or variable, and too often discuss
>>>implementation details.
>>
>>I agree with this, and am not very interested in tools such as epydoc
>>for this reason.  In such autogenerated documentation, you wind up
>>with a list of every single class and function, and both trivial and
>>important classes are given exactly the same emphasis.  Such docs are
>>useful as a reference when you know what class you need to look at,
>>but then pydoc also works well for that purpose.
> 
> 
> Right.
> 
> BTW isn't xah a well-known troll? (There are exactly 666 Google hits
> for the query ``xah troll'' -- draw your own conclusions. :-)
> 
The calming influence of c.l.py appears to have worked its magic on xah 
to the extent that his most recent post didn't contain any expletives. 
Maybe there's hope for him yet.

668-and-counting-ly y'rs  - steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden

From steve at holdenweb.com  Sat Sep 30 09:45:03 2006
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 30 Sep 2006 08:45:03 +0100
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To: <bb8868b90609291047m2ed2ed0q1aa9708b4de82092@mail.gmail.com>
References: <20060929081402.GB19781@craig-wood.com>	<efjd24$jaf$1@sea.gmane.org>
	<bb8868b90609291047m2ed2ed0q1aa9708b4de82092@mail.gmail.com>
Message-ID: <efl78b$b0i$5@sea.gmane.org>

Jason Orendorff wrote:
> On 9/29/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> 
>>(I just checked the program I'm working on, and my analysis tells me
>>that the most common floating point value in that program is 121.216,
>>which occurs 32 times.  from what I can tell, 0.0 isn't used at all.)
> 
> 
> *bemused look*  Fredrik, can you share the reason why this number
> occurs 32 times in this program?  I don't mean to imply anything by
> that; it just sounds like it might be a fun story.  :)
> 
> Anyway, this kind of static analysis is probably more entertaining
> than relevant.  For your enjoyment, the most-used float literals in
> python25\Lib, omitting test directories, are:
> 
> 1e-006: 5 hits
> 4.0: 6 hits
> 0.05: 7 hits
> 6.0: 8 hits
> 0.5: 13 hits
> 2.0: 25 hits
> 0.0: 36 hits
> 1.0: 62 hits
> 
> There are two hits each for -1.0 and -0.5.
> 
> In my own Python code, I don't even have enough float literals to bother with.
> 
By these statistics I think the answer to the original question is 
clearly "no" in the general case.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From martin at v.loewis.de  Sat Sep 30 10:43:01 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 30 Sep 2006 10:43:01 +0200
Subject: [Python-Dev] Tix not included in 2.5 for Windows
In-Reply-To: <efka52$l43$1@sea.gmane.org>
References: <efka52$l43$1@sea.gmane.org>
Message-ID: <451E2E15.4040906@v.loewis.de>

Christos Georgiou schrieb:
> Does anyone know why this happens? I can't find any information pointing to 
> this being deliberate.

It may well be that Tix wasn't included on Windows. I don't test Tix
regularly, and nobody reported missing it during the beta test.

Please submit a bug report to sf.net/projects/python.

Notice that Python 2.5 ships with a different Tcl version than 2.4;
using the 2.4 Tix binaries in 2.5 may cause crashes.

Regards,
Martin

From martin at v.loewis.de  Sat Sep 30 10:47:46 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 30 Sep 2006 10:47:46 +0200
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To: <6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com>
References: <20060929081402.GB19781@craig-wood.com>	<451DC113.4040002@canterbury.ac.nz>
	<6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com>
Message-ID: <451E2F32.9070405@v.loewis.de>

Bob Ippolito schrieb:
> My guess is that people do have this problem, they just don't know
> where that memory has gone. I know I don't count objects unless I have
> a process that's leaking memory or it grows so big that I notice (by
> swapping or chance).

Right. Although I do wonder what kind of software people write to run
into this problem. As Guido points out, the numbers must be the result
from some computation, or created by an extension module by different
means. If people have many *simultaneous* copies of 0.0, I would expect
there is something else really wrong with the data structures or
algorithms they use.

Regards,
Martin

From ncoghlan at gmail.com  Sat Sep 30 10:59:25 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Sep 2006 18:59:25 +1000
Subject: [Python-Dev] Caching float(0.0)
In-Reply-To: <451E2F32.9070405@v.loewis.de>
References: <20060929081402.GB19781@craig-wood.com>	<451DC113.4040002@canterbury.ac.nz>	<6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com>
	<451E2F32.9070405@v.loewis.de>
Message-ID: <451E31ED.7030905@gmail.com>

Martin v. Löwis wrote:
> Bob Ippolito schrieb:
>> My guess is that people do have this problem, they just don't know
>> where that memory has gone. I know I don't count objects unless I have
>> a process that's leaking memory or it grows so big that I notice (by
>> swapping or chance).
> 
> Right. Although I do wonder what kind of software people write to run
> into this problem. As Guido points out, the numbers must be the result
> from some computation, or created by an extension module by different
> means. If people have many *simultaneous* copies of 0.0, I would expect
> there is something else really wrong with the data structures or
> algorithms they use.

I suspect the problem would typically stem from floating point values that are 
read in from a human-readable file rather than being the result of a 
'calculation' as such:

 >>> float('1') is float('1')
False
 >>> float('0') is float('0')
False

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From tzot at mediconsa.com  Sat Sep 30 11:23:28 2006
From: tzot at mediconsa.com (Christos Georgiou)
Date: Sat, 30 Sep 2006 12:23:28 +0300
Subject: [Python-Dev] Tix not included in 2.5 for Windows
References: <efka52$l43$1@sea.gmane.org> <451E2E15.4040906@v.loewis.de>
Message-ID: <efld2l$sh7$1@sea.gmane.org>

""Martin v. L?wis"" <martin at v.loewis.de> wrote in message 
news:451E2E15.4040906 at v.loewis.de...

> Please submit a bug report to sf.net/projects/python.

Done: www.python.org/sf/1568240



From kristjan at ccpgames.com  Sat Sep 30 13:20:07 2006
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?=)
Date: Sat, 30 Sep 2006 11:20:07 -0000
Subject: [Python-Dev] Caching float(0.0)
Message-ID: <129CEF95A523704B9D46959C922A28000451FED3@nemesis.central.ccp.cc>

Well, a lot of extension code, like ours, uses PyFloat_FromDouble(foo); this can be from vectors and such, and very often these are values from a database.  Integral float values are very common in such cases, and it didn't occur to me that they weren't being reused, at least for small values.

Also, a lot of arithmetic involving floats is expected to end in integers, like computing some index from a float value.  Integers get promoted to floats when touched by them, as you know.

Anyway, I now precreate integral values from -10 to 10 with great effect.  The cost is minimal, the benefit great.
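
A Python-level sketch of that kind of cache (in extension code the same lookup
would sit in front of the PyFloat_FromDouble call; the names are illustrative):

    # Shared objects for the small integral values that dominate the data.
    _float_cache = dict((float(i), float(i)) for i in range(-10, 11))

    def shared_float(x):
        """Return a cached float for common integral values, else x itself.

        Note: because -0.0 == 0.0, this naive lookup also folds negative
        zero into the cached 0.0; a real cache has to decide if that matters.
        """
        return _float_cache.get(x, x)

    values = [shared_float(float(s)) for s in '4 0 2.5 4 10'.split()]
    print values
    print values[0] is values[3]    # True: both 4.0 entries share one object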

Cheers,
Kristján

-----Original Message-----
From: python-dev-bounces+kristjan=ccpgames.com at python.org [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf Of "Martin v. Löwis"
Sent: 30. september 2006 08:48
To: Bob Ippolito
Cc: python-dev at python.org
Subject: Re: [Python-Dev] Caching float(0.0)

Bob Ippolito schrieb:
> My guess is that people do have this problem, they just don't know
> where that memory has gone. I know I don't count objects unless I have
> a process that's leaking memory or it grows so big that I notice (by
> swapping or chance).

Right. Although I do wonder what kind of software people write to run
into this problem. As Guido points out, the numbers must be the result
from some computation, or created by an extension module by different
means. If people have many *simultaneous* copies of 0.0, I would expect
there is something else really wrong with the data structures or
algorithms they use.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev at python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.com

From mwh at python.net  Sat Sep 30 13:52:20 2006
From: mwh at python.net (Michael Hudson)
Date: Sat, 30 Sep 2006 12:52:20 +0100
Subject: [Python-Dev] PEP 355 status
In-Reply-To: <20060930045258.1717.223590987.divmod.quotient.63544@ohm>
	(glyph@divmod.com's message of "Sat, 30 Sep 2006 00:52:58 -0400")
References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm>
Message-ID: <2mk63lfu6j.fsf@starship.python.net>

glyph at divmod.com writes:

> I hope that eventually Python will include some form of OO
> filesystem access, but I am equally hopeful that the current PEP 355
> path.py is not it.

I think I agree with this too.  For another source of ideas there is
the 'py.path' bit of the py lib, which, um, doesn't seem to be
documented terribly well, but allows access to remote svn repositories
as well as local filesystems (at least).

Cheers,
mwh

-- 
3. Syntactic sugar causes cancer of the semicolon.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html

From guido at python.org  Sat Sep 30 17:09:58 2006
From: guido at python.org (Guido van Rossum)
Date: Sat, 30 Sep 2006 08:09:58 -0700
Subject: [Python-Dev] PEP 355 status
In-Reply-To: <2mk63lfu6j.fsf@starship.python.net>
References: <20060930045258.1717.223590987.divmod.quotient.63544@ohm>
	<2mk63lfu6j.fsf@starship.python.net>
Message-ID: <ca471dc20609300809g6f21a8e7k3b097677ade1b8a9@mail.gmail.com>

OK. Pronouncement: PEP 355 is dead. The authors (or the PEP editor)
can update the PEP.

I'm looking forward to a new PEP.

--Guido

On 9/30/06, Michael Hudson <mwh at python.net> wrote:
> glyph at divmod.com writes:
>
> > I hope that eventually Python will include some form of OO
> > filesystem access, but I am equally hopeful that the current PEP 355
> > path.py is not it.
>
> I think I agree with this too.  For another source of ideas there is
> the 'py.path' bit of the py lib, which, um, doesn't seem to be
> documented terribly well, but allows access to remote svn repositories
> as well as local filesystems (at least).
>
> Cheers,
> mwh
>
> --
> 3. Syntactic sugar causes cancer of the semicolon.
>   -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From Hans.Polak at capgemini.com  Fri Sep 29 12:46:43 2006
From: Hans.Polak at capgemini.com (Hans Polak)
Date: Fri, 29 Sep 2006 12:46:43 +0200
Subject: [Python-Dev]  PEP 351 - do while
Message-ID: <001301c6e3b4$8de5bda0$1d2c440a@spain.capgemini.com>


Hi,

 

Just an opinion, but many uses of the 'while true loop' are instances of a
'do loop'. I appreciate the language layout question, so I'll give you an
alternative:

 

do:
    <body>
    <setup code>
    while <condition>
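
For comparison, a minimal sketch of how that pattern is spelled today, with a
while True loop and a break on the negated condition (the sample data is just
for illustration):

    values = iter([3, 1, 4, 0, 9])
    total = 0

    # Equivalent of the proposed  do: <body> / <setup code> / while <condition>
    while True:
        v = values.next()   # <body> always runs at least once
        total += v          # <setup code>
        if not v:           # keep looping "while v"; stop once it is false
            break

    print total             # 8: the loop stops after consuming the 0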

 

Cheers,

Hans Polak.

 




From tjreedy at udel.edu  Sat Sep 30 21:53:33 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 30 Sep 2006 15:53:33 -0400
Subject: [Python-Dev] Caching float(0.0)
References: <20060929081402.GB19781@craig-wood.com>	<451DC113.4040002@canterbury.ac.nz>	<6a36e7290609291815i115b41b3o4ab6d196f404557f@mail.gmail.com><451E2F32.9070405@v.loewis.de>
	<451E31ED.7030905@gmail.com>
Message-ID: <efmhvu$rp6$1@sea.gmane.org>


"Nick Coghlan" <ncoghlan at gmail.com> wrote in message 
news:451E31ED.7030905 at gmail.com...
>I suspect the problem would typically stem from floating point values that 
>are
>read in from a human-readable file rather than being the result of a
>'calculation' as such:

For such situations, one could create a translation dict for both common 
float values and for non-numeric missing value indicators.  For instance,
flotran = {'*': None, '1.0':1.0, '2.0':2.0, '4.0':4.0}
The details, of course, depend on the specific case.
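
A short sketch of how such a dict might be applied while reading a
whitespace-separated file (the layout and the '*' missing-value marker are
only an example):

    flotran = {'*': None, '0.0': 0.0, '1.0': 1.0, '2.0': 2.0, '4.0': 4.0}

    def parse_row(line):
        """Convert one row of text, sharing float objects for common values."""
        values = []
        for tok in line.split():
            if tok in flotran:
                values.append(flotran[tok])    # shared object (or None for '*')
            else:
                values.append(float(tok))      # uncommon value: a fresh float
        return values

    rows = [parse_row(line) for line in ['1.0 2.0 *', '0.0 3.75 1.0']]
    print rows
    print rows[0][0] is rows[1][2]     # True: both '1.0' cells share one object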

tjr





From Scott.Daniels at Acm.Org  Sat Sep 30 23:13:42 2006
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sat, 30 Sep 2006 14:13:42 -0700
Subject: [Python-Dev] Tix not included in 2.5 for Windows
In-Reply-To: <efka52$l43$1@sea.gmane.org>
References: <efka52$l43$1@sea.gmane.org>
Message-ID: <efmmka$7vc$1@sea.gmane.org>

Christos Georgiou wrote:
> Does anyone know why this happens? I can't find any information pointing to 
> this being deliberate.
> 
> I just upgraded to 2.5 on Windows (after making sure I can build extensions 
> with the freeware VC++ Toolkit 2003) and some of my programs stopped 
> operating. I saw in a French forum that someone else had the same problem, 
> and what they did was to copy the relevant files from a 2.4.3 installation. 
> I did the same, and it seems it works, with only a console message appearing 
> as soon as a root window is created:

Also note: the OS X universal build seems to include a Tix runtime for
            the non-Intel processor, but not for the Intel processor.
            This makes me think there is a build problem.

-- Scott David Daniels
Scott.Daniels at Acm.Org


From brett at python.org  Sat Sep 30 23:26:57 2006
From: brett at python.org (Brett Cannon)
Date: Sat, 30 Sep 2006 14:26:57 -0700
Subject: [Python-Dev] Possible semantic changes for PEP 352 in 2.6
Message-ID: <bbaeab100609301426k6208a69cq84efdea32bdbbfb3@mail.gmail.com>

I am working on PEP 352 stuff for 2.6 and there are two changes that I think
should be made that are not explicitly laid out in the PEP.

The first, and most dramatic, involves what is legal to list in an 'except'
clause.  Right now you can list *anything*.  This means ``except 42`` is
totally legal even though raising a number is not.  Since I am deprecating
catching string exceptions, I can go ahead and deprecate catching *any*
object that is not a legitimate object to be raised.
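
A quick illustration of the current behaviour: the bogus clause is accepted
without complaint and simply never matches, so the real exception sails
straight past it:

    try:
        try:
            raise ValueError('boom')
        except 42:                     # legal today, but can never match
            print 'caught by 42?!'
    except ValueError:
        print 'the ValueError fell straight through the bogus except clause'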

The second thing is changing PyErr_GivenExceptionMatches() to return 0 on
false, 1 on true, and -1 on error.  As of right now there is no defined
error return value.  While it could be suggested to check PyErr_Occurred()
after every call, there is a way to have the return value reflect all
possible outcomes, so I think this change should be made.

Anybody have objections to any of the changes I am proposing?

-Brett