From draghuram at gmail.com  Fri Jun  1 00:10:08 2007
From: draghuram at gmail.com (Raghuram Devarakonda)
Date: Thu, 31 May 2007 18:10:08 -0400
Subject: [Python-Dev] removing use of mimetools, multifile, and rfc822
In-Reply-To: <C4955E3D-3F4D-4829-BCF0-F94F64D3DD72@python.org>
References: <bbaeab100705311309o4da267e3n861cd21df4912bce@mail.gmail.com>
	<C4955E3D-3F4D-4829-BCF0-F94F64D3DD72@python.org>
Message-ID: <2c51ecee0705311510j6f7afaf9q2f2e917bb30d8fc4@mail.gmail.com>

On 5/31/07, Barry Warsaw <barry at python.org> wrote:
> > In other words this email is to hopefully inspire someone to remove
> > the uses
> > of rfc822, mimetools, and multifile from the stdlib so the
> > DeprecationWarnings can finally go in.
>
> +1 for deprecating these.  I don't have time to slog through the
> stdlib and do the work, but I would be happy to help answer questions
> about alternatives.

I will give it a shot and will try to come up with a patch.

Thanks,
Raghu

From brett at python.org  Fri Jun  1 04:10:30 2007
From: brett at python.org (Brett Cannon)
Date: Thu, 31 May 2007 19:10:30 -0700
Subject: [Python-Dev] failures in test_sqlite when entire test suite run
Message-ID: <bbaeab100705311910x5c40cb95mcbcb482e2643082c@mail.gmail.com>

I have been getting failures from test_sqlite off the trunk when I run the
entire test suite (as ``./python.exe Lib/test/regrtest.py``) with this error
on OS X 10.4.9 and sqlite3 3.3.16:

Traceback (most recent call last):
  File
"/Users/drifty/Dev/python/2.x/pristine/Lib/sqlite3/test/regression.py", line
29, in setUp
    self.con = sqlite.connect(":memory:")
ProgrammingError: library routine called out of sequence


When run in isolation it is fine.  Anyone have a guess as to what is going
on?

-Brett

From bjourne at gmail.com  Fri Jun  1 17:54:40 2007
From: bjourne at gmail.com (BJörn Lindqvist)
Date: Fri, 1 Jun 2007 17:54:40 +0200
Subject: [Python-Dev] Minor ConfigParser Change
In-Reply-To: <200705310045.58802.fdrake@acm.org>
References: <46585729.2030305@gmail.com>
	<4E9372E6B2234D4F859320D896059A9508DE8B40B5@exchis.ccp.ad.local>
	<46588B22.3090808@gmail.com> <200705310045.58802.fdrake@acm.org>
Message-ID: <740c3aec0706010854m426efa53s36a175923edda136@mail.gmail.com>

Patches are applied once, but thousands of people read the code in the
standard library each month. The standard library should be as
readable as possible to make it as easy as possible to maintain. It is
just good software development methodology.


Many parts of the standard library are arcane and almost impossible to
understand (see httplib for example) because refactoring changes are
not done. So if someone wants to improve the code, why not let them?


-- 
mvh Björn

From draghuram at gmail.com  Fri Jun  1 18:45:39 2007
From: draghuram at gmail.com (Raghuram Devarakonda)
Date: Fri, 1 Jun 2007 12:45:39 -0400
Subject: [Python-Dev] error in Misc/NEWS
Message-ID: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com>

There is an entry in the "Core and builtins" section of Misc/NEWS:

"Bug #1722484: remove docstrings again when running with -OO.".

The actual bug is 1722485. Incidentally, 1722484 appears to be spam.

From tjreedy at udel.edu  Fri Jun  1 20:19:46 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 1 Jun 2007 14:19:46 -0400
Subject: [Python-Dev] error in Misc/NEWS
References: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com>
Message-ID: <f3po02$jta$1@sea.gmane.org>


"Raghuram Devarakonda" <draghuram at gmail.com> wrote in message 
news:2c51ecee0706010945n7f144a0fn6c49b03216c54570 at mail.gmail.com...
| There is an entry in "Core and builtins" section of Misc/NEWS:
|
| "Bug #1722484: remove docstrings again when running with -OO.".
|
| The actual bug is 1722485. Incidentally, 1722484 appears to be spam.

Sure enough.  But it is another project -- and submitted anonymously. 




From g.brandl at gmx.net  Fri Jun  1 21:20:38 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 01 Jun 2007 21:20:38 +0200
Subject: [Python-Dev] error in Misc/NEWS
In-Reply-To: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com>
References: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com>
Message-ID: <f3pri0$9c$3@sea.gmane.org>

Raghuram Devarakonda wrote:
> There is an entry in "Core and builtins" section of Misc/NEWS:
> 
> "Bug #1722484: remove docstrings again when running with -OO.".
> 
> The actual bug is 1722485. Incidentally, 1722484 appears to be spam.

Fixed, thanks for spotting (you really read the commit logs thoroughly,
don't you? ;)

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From fdrake at acm.org  Fri Jun  1 23:08:45 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 1 Jun 2007 17:08:45 -0400
Subject: [Python-Dev] Minor ConfigParser Change
In-Reply-To: <740c3aec0706010854m426efa53s36a175923edda136@mail.gmail.com>
References: <46585729.2030305@gmail.com> <200705310045.58802.fdrake@acm.org>
	<740c3aec0706010854m426efa53s36a175923edda136@mail.gmail.com>
Message-ID: <200706011708.45883.fdrake@acm.org>

On Friday 01 June 2007, BJörn Lindqvist wrote:
 > Patches are applied once, but thousands of people read the code in the
 > standard library each month. The standard library should be as
 > readable as possible to make it as easy as possible to maintain. It is
 > just good software development methodology.

Rest assured, I understand your sentiment here, and am not personally against 
an occasional clean-up.  ConfigParser in particular is old and highly 
idiosyncratic.

 > Many parts of the standard library are arcane and almost impossible to
 > understand (see httplib for example) because refactoring changes are
 > Not done. So if someone wants to improve the code why not let them?

Changes in general are a source of risk; they have to be considered carefully.  
We've seen too many cases in which a change was thought to be safe, but broke 
something for someone.  Avoiding style-only changes helps avoid introducing 
problems without being able to predict them; there are tests for 
ConfigParser, but it's hard to be sure every corner case has been covered.

This is a general policy in the Python project, not simply my preference.  I'd 
love to be able to say "yes, the code is painful to read, let's make it 
nicer", but it's hard to say that without being able to say "I'm sure it 
won't break anything for anybody."  Python's too flexible for that to be 
easy.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From draghuram at gmail.com  Sat Jun  2 01:00:31 2007
From: draghuram at gmail.com (Raghuram Devarakonda)
Date: Fri, 1 Jun 2007 19:00:31 -0400
Subject: [Python-Dev] error in Misc/NEWS
In-Reply-To: <f3pri0$9c$3@sea.gmane.org>
References: <2c51ecee0706010945n7f144a0fn6c49b03216c54570@mail.gmail.com>
	<f3pri0$9c$3@sea.gmane.org>
Message-ID: <2c51ecee0706011600r5920375ftadd9a3bf161167a2@mail.gmail.com>

On 6/1/07, Georg Brandl <g.brandl at gmx.net> wrote:
> Raghuram Devarakonda schrieb:
> > There is an entry in "Core and builtins" section of Misc/NEWS:
> >
> > "Bug #1722484: remove docstrings again when running with -OO.".
> >
> > The actual bug is 1722485. Incidentally, 1722484 appears to be spam.
>
> Fixed, thanks for spotting (you really read the commit logs thoroughly,
> don't you? ;)

I was just scanning the file for the comment related to my patch (my
first one, btw) when I spotted this.

From brett at python.org  Sat Jun  2 05:08:02 2007
From: brett at python.org (Brett Cannon)
Date: Fri, 1 Jun 2007 20:08:02 -0700
Subject: [Python-Dev] failures in test_sqlite when entire test suite run
In-Reply-To: <bbaeab100705311910x5c40cb95mcbcb482e2643082c@mail.gmail.com>
References: <bbaeab100705311910x5c40cb95mcbcb482e2643082c@mail.gmail.com>
Message-ID: <bbaeab100706012008v47bafbc3m44a5b1fe340aca64@mail.gmail.com>

On 5/31/07, Brett Cannon <brett at python.org> wrote:
>
> I have been getting failures from test_sqlite off the trunk when I run the
> entire test suite (as ``./python.exe Lib/test/regrtest.py``) with this error
> on OS X 10.4.9 and sqlite3 3.3.16:
>
> Traceback (most recent call last):
>   File
> "/Users/drifty/Dev/python/2.x/pristine/Lib/sqlite3/test/regression.py", line
> 29, in setUp
>     self.con = sqlite.connect(":memory:")
> ProgrammingError: library routine called out of sequence
>
>
> When run in isolation it is fine.  Anyone have a guess as to what is going
> on?



Never mind.  It has started to pass again for me.

-Brett

From status at bugs.python.org  Sun Jun  3 02:00:56 2007
From: status at bugs.python.org (Tracker)
Date: Sun,  3 Jun 2007 00:00:56 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070603000056.126EE780B3@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (05/27/07 - 06/03/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1649 open ( +0) /  8584 closed ( +0) / 10233 total ( +0)

Average duration of open issues: 813 days.
Median duration of open issues: 764 days.

Open Issues Breakdown
   open  1649 ( +0)
pending     0 ( +0)


From josepharmbruster at gmail.com  Sun Jun  3 02:34:22 2007
From: josepharmbruster at gmail.com (Joseph Armbruster)
Date: Sat, 02 Jun 2007 20:34:22 -0400
Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8
Message-ID: <46620C8E.1@gmail.com>

All,

I wanted to pass this one around before opening an issue on it.
When running the unit test for popen via rt.bat (in PCBuild8),
I received the following error:

=== BEGIN ERROR ===

C:\Documents and 
Settings\joe\Desktop\Development\Python\trunk\PCbuild8>rt test_popen
Deleting .pyc/.pyo files ...
43 .pyc deleted, 0 .pyo deleted

C:\Documents and 
Settings\joe\Desktop\Development\Python\trunk\PCbuild8>win32Release\python.exe 
  -E -tt ../lib/test/regrtest.py test_popen
test_popen
test test_popen failed -- Traceback (most recent call last):
   File "C:\Documents and 
Settings\joe\Desktop\Development\Python\trunk\lib\test\test_popen.py", 
line 31, in test_popen
     ["foo", "bar"]
   File "C:\Documents and 
Settings\joe\Desktop\Development\Python\trunk\lib\test\test_popen.py", 
line 24, in _do_test_commandline
     got = eval(data)[1:] # strip off argv[0]
   File "<string>", line 0

    ^
SyntaxError: unexpected EOF while parsing

1 test failed:
     test_popen

=== END ERROR ===


Naturally, I looked into what was causing it and noticed the 
following: line 23 of test_popen.py appears to be returning ''
and assigning it to data.

data = os.popen(cmd).read()

The problem is that the next line (24) assumes the previous line
worked and goes on to perform the following strip and assert:

got = eval(data)[1:] # strip off argv[0]
self.assertEqual(got, expected)

So, in a perfect world, ['-c','foo','bar']\n is what data should be.
I put some quick debug statements after line 23 in test_popen.py to 
verify this and I observed the following:

data=
cmd= "C:\Documents and 
Settings\joe\Desktop\Development\Python\trunk\PCbuild8\win32Release\python.exe" 
-c "import sys;print sys.argv" foo bar

Now, on to the 'interesting' part.  From the command line, observe the
following:

C:\Documents and 
Settings\joe\Desktop\Development\Python\trunk\PCbuild8\win32release>python 
-c "import sys; print sys.argv" foo bar
['-c', 'foo', 'bar']


Apart from the popen call failing, I am wondering whether an appropriate 
assert should be performed on the data object prior to line 24.
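
As a rough illustration of the kind of check being asked about, here is a minimal sketch in present-day Python. It is not the code in test_popen.py; the helper name parse_child_argv is hypothetical, and ast.literal_eval is swapped in for eval() purely as a safer way to parse a printed list.

```python
import ast


def parse_child_argv(data):
    """Parse the argv list printed by the child process, dropping argv[0].

    Hypothetical helper for illustration; not part of test_popen.py.
    """
    text = data.strip()
    # Fail with a readable message instead of letting the parser raise
    # "SyntaxError: unexpected EOF while parsing" on an empty string.
    if not text:
        raise AssertionError("child process produced no output: %r" % (data,))
    # literal_eval accepts only Python literals, unlike eval().
    return ast.literal_eval(text)[1:]


if __name__ == "__main__":
    print(parse_child_argv("['-c', 'foo', 'bar']\n"))
```

With a guard like this, an empty read from the pipe would be reported as "no output" rather than as a syntax error inside the test.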


In addition, if you debug into the posixmodule, this is the scoop:

1. breakpoint set in posixmodule at the start of posix_popen
2. I run in debug
3. run the following:

import os
tmp = os.popen('"C:/Documents and Settings/joe/Desktop/Development/Python/trunk/PCbuild8/win32Release/python.exe" -c "import sys;print sys.argv" foo bar')

4. call enters posixmodule posix_popen and follows path:
	f = _PyPopen(cmdstring, tm | _O_TEXT, POPEN_1);

5. enters posixmodule:  _PyPopen

6. enters posixmodule:  _PyPopenCreateProcess

7. enters posixmodule line 4920 where the CreateProcess is...
s2 checks out as:
"C:\WINDOWS\system32\cmd.exe /c "C:/Documents and 
Settings/joe/Desktop/Development/Python/trunk/PCbuild8/win32Release/python.exe" 
-c "import sys;print sys.argv" foo bar"

This call returns nonzero, which means it "succeeded". See:
[ http://msdn2.microsoft.com/en-us/library/ms682425.aspx ]

On another note, I ran across CreateProcessW and wonder whether it 
has a place in posixmodule.


And on yet another note, when I ran test_popen.py straight from /lib 
(using my std::Python25 install), I obtained the following debug output 
at the same statement of interest:

data=['-c', 'foo', 'bar']
cmd=c:\python25\python.exe -c "import sys;print sys.argv" foo bar


Your thoughts ?

Joseph Armbruster


From talin at acm.org  Sun Jun  3 21:07:24 2007
From: talin at acm.org (Talin)
Date: Sun, 03 Jun 2007 12:07:24 -0700
Subject: [Python-Dev] Substantial rewrite of PEP 3101
Message-ID: <4663116C.8020201@acm.org>

I've rewritten large portions of PEP 3101, incorporating some material 
from Patrick Maupin and Eric Smith, as well as rethinking the whole 
custom formatter design. Although it isn't showing up on the web site 
yet, you can view the copy in subversion (and the diffs) here:

    http://svn.python.org/view/peps/trunk/pep-3101.txt

Please let me know of any errors you find, either by mailing me 
directly, or replying to the topic in Python-3000. (I.e. let's not start 
a thread here.)

-- Talin

From mhammond at skippinet.com.au  Mon Jun  4 14:38:32 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Mon, 4 Jun 2007 22:38:32 +1000
Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8
In-Reply-To: <46620C8E.1@gmail.com>
Message-ID: <082d01c7a6a5$436a19b0$1f0a0a0a@enfoldsystems.local>

> All,
>
> I wanted to pass this one around before opening an issue on it.
> When running the unit test for popen via rt.bat (in PCBuild8),
> I received the following error:
>
> === BEGIN ERROR ===
>
> C:\Documents and
> Settings\joe\Desktop\Development\Python\trunk\PCbuild8>rt test_popen
> Deleting .pyc/.pyo files ...
> 43 .pyc deleted, 0 .pyo deleted
>
> C:\Documents and
> Settings\joe\Desktop\Development\Python\trunk\PCbuild8>win32Re
> lease\python.exe -E -tt ../lib/test/regrtest.py test_popen
> test_popen
> test test_popen failed -- Traceback (most recent call last):
>    File "C:\Documents and Settings\joe\Desktop\Development\Python\...

I can't reproduce this.  I expect you will find it is due to the space in
the filename of your Python directory, via cmd.exe's documented behaviour
with quote characters.  A patch that allows the test suite to work in such
an environment would be welcome, but I think you might end up needing access
to GetShortPathName() rather than CreateProcess().
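
To make the quoting issue concrete, the following is a simplified sketch of cmd.exe's documented fallback rule for /c: when the command line starts with a quote and does not qualify for quote preservation, the leading quote and the last quote are removed. The function name and the shortened path are hypothetical, and the preservation test is reduced to "exactly two quotes", which is narrower than the real rule.

```python
def cmd_c_view(command):
    """Approximate how cmd.exe /c rewrites `command` (simplified model)."""
    # Simplification: treat "exactly two quotes" as the preserved case; the
    # real rule also looks at special characters and the executable name.
    if command.count('"') == 2 or not command.startswith('"'):
        return command
    # Fallback rule: strip the leading quote and the last quote on the line.
    body = command[1:]
    last = body.rfind('"')
    if last == -1:
        return body
    return body[:last] + body[last + 1:]


if __name__ == "__main__":
    # Hypothetical, shortened path with a space in it.
    cmd = '"C:/Documents and Settings/joe/python.exe" -c "import sys;print sys.argv" foo bar'
    print(cmd_c_view(cmd))
    # Wrapping the whole command in one more pair of quotes survives the rule.
    print(cmd_c_view('"' + cmd + '"'))
```

Under this model the first whitespace-delimited token of the rewritten command is C:/Documents, which is consistent with the child producing no output. Wrapping the whole command in an extra pair of quotes is one commonly cited workaround; it is mentioned here as an assumption, not as something proposed in this thread.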

Cheers,

Mark


From josepharmbruster at gmail.com  Mon Jun  4 15:09:44 2007
From: josepharmbruster at gmail.com (Joseph Armbruster)
Date: Mon, 4 Jun 2007 09:09:44 -0400
Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8
In-Reply-To: <082d01c7a6a5$436a19b0$1f0a0a0a@enfoldsystems.local>
References: <46620C8E.1@gmail.com>
	<082d01c7a6a5$436a19b0$1f0a0a0a@enfoldsystems.local>
Message-ID: <938f42d70706040609s5c35268sfd64fe0df167e241@mail.gmail.com>

Mark,

Sounds good, I will get patching tonight.  Any thoughts on CreateProcessW ?

Joseph Armbruster

On 6/4/07, Mark Hammond <mhammond at skippinet.com.au> wrote:
>
> > All,
> >
> > I wanted to pass this one around before opening an issue on it.
> > When running the unit test for popen via rt.bat (in PCBuild8),
> > I received the following error:
> >
> > === BEGIN ERROR ===
> >
> > C:\Documents and
> > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>rt test_popen
> > Deleting .pyc/.pyo files ...
> > 43 .pyc deleted, 0 .pyo deleted
> >
> > C:\Documents and
> > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>win32Re
> > lease\python.exe -E -tt ../lib/test/regrtest.py test_popen
> > test_popen
> > test test_popen failed -- Traceback (most recent call last):
> >    File "C:\Documents and Settings\joe\Desktop\Development\Python\...
>
> I can't reproduce this.  I expect you will find it is due to the space in
> the filename of your Python directory, via cmd.exe's documented behaviour
> with quote characters.  A patch that allows the test suite to work in such
> an environment would be welcome, but I think you might end up needing
> access
> to GetShortPathName() rather than CreateProcess().
>
> Cheers,
>
> Mark
>
>

From bjourne at gmail.com  Mon Jun  4 21:32:14 2007
From: bjourne at gmail.com (BJörn Lindqvist)
Date: Mon, 4 Jun 2007 21:32:14 +0200
Subject: [Python-Dev] What exception should Thread.start() raise?
Message-ID: <740c3aec0706041232s1ab331ees1592ca1204d6c47a@mail.gmail.com>

The threading module contains buggy code:

class Thread(_Verbose):
    ...
    def start(self):
        assert self.__initialized, "Thread.__init__() not called"
        assert not self.__started, "thread already started"
    ...

If you run such code with python -O, weird stuff may happen when you
call mythread.start() multiple times. -O removes assert statements so
the code won't fail with an AssertionError which would be expected.
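
The effect of -O on assert can be checked with a small sketch like the one below (present-day Python, not code from the threading module). It runs the same one-line assert in a child interpreter with and without -O and compares exit statuses; the helper name exit_status is hypothetical.

```python
import subprocess
import sys

SNIPPET = "assert False, 'thread already started'"


def exit_status(*flags):
    """Run SNIPPET in a child interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # Without -O the assert fires and the child exits with a nonzero status;
    # with -O the assert statement is compiled away and the child exits cleanly.
    print("plain:", exit_status())
    print("with -O:", exit_status("-O"))
```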

So what real exception should Thread.start() raise? I have suggested
adding an IllegalStateError modelled after Java's
IllegalStateException, but that idea was rejected. So what exception
should be raised here? Is it a RuntimeError?

-- 
mvh Björn

From steven.bethard at gmail.com  Mon Jun  4 21:50:39 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 4 Jun 2007 13:50:39 -0600
Subject: [Python-Dev] What exception should Thread.start() raise?
In-Reply-To: <740c3aec0706041232s1ab331ees1592ca1204d6c47a@mail.gmail.com>
References: <740c3aec0706041232s1ab331ees1592ca1204d6c47a@mail.gmail.com>
Message-ID: <d11dcfba0706041250q21f3a7djf2cb6bb09464c851@mail.gmail.com>

On 6/4/07, BJörn Lindqvist <bjourne at gmail.com> wrote:
> The threading module contains buggy code:
>
> class Thread(_Verbose):
>     ...
>     def start(self):
>         assert self.__initialized, "Thread.__init__() not called"
>         assert not self.__started, "thread already started"
>     ...
>
> If you run such code with python -O, weird stuff may happen when you
> call mythread.start() multiple times. -O removes assert statements so
> the code won't fail with an AssertionError which would be expected.
>
> So what real exception should Thread.start() raise? I have suggested
> adding an IllegalStateError modelled after java's
> IllegalStateException, but that idea was rejected. So what exception
> should be raised here, is it a RuntimeError?

If you want to be fully backwards compatible, you could just write this like::

    def start(self):
        if not self.__initialized:
            raise AssertionError("Thread.__init__() not called")
        if self.__started:
            raise AssertionError("thread already started")

But I doubt anyone is actually catching the AssertionError, so
changing the error type would probably be okay.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From guido at python.org  Mon Jun  4 22:33:11 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 4 Jun 2007 13:33:11 -0700
Subject: [Python-Dev] What exception should Thread.start() raise?
In-Reply-To: <d11dcfba0706041250q21f3a7djf2cb6bb09464c851@mail.gmail.com>
References: <740c3aec0706041232s1ab331ees1592ca1204d6c47a@mail.gmail.com>
	<d11dcfba0706041250q21f3a7djf2cb6bb09464c851@mail.gmail.com>
Message-ID: <ca471dc20706041333l363f202aw835d11532f5e8761@mail.gmail.com>

On 6/4/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 6/4/07, BJ?rn Lindqvist <bjourne at gmail.com> wrote:
> > The threading module contains buggy code:
> >
> > class Thread(_Verbose):
> >     ...
> >     def start(self):
> >         assert self.__initialized, "Thread.__init__() not called"
> >         assert not self.__started, "thread already started"
> >     ...
> >
> > If you run such code with python -O, weird stuff may happen when you
> > call mythread.start() multiple times. -O removes assert statements so
> > the code won't fail with an AssertionError which would be expected.
> >
> > So what real exception should Thread.start() raise? I have suggested
> > adding an IllegalStateError modelled after java's
> > IllegalStateException, but that idea was rejected. So what exception
> > should be raised here, is it a RuntimeError?
>
> If you want to be fully backwards compatible, you could just write this like::
>
>     def start(self):
>         if not self.__initialized:
>             raise AssertionError("Thread.__init__() not called")
>         if self.__started:
>             raise AssertionError("thread already started")
>
> But I doubt anyone is actually catching the AssertionError, so
> changing the error type would probably be okay.

Anything that causes an "assert" to fail is technically using
"undefined" behavior. I am in favor of changing this case to
RuntimeError, which is the error Python usually uses for state
problems.
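
A minimal sketch of what that could look like, using a toy class rather than the real threading.Thread (the class name OneShot and its attribute names are hypothetical):

```python
class OneShot:
    """Toy stand-in for start-once bookkeeping; not the real Thread class."""

    def __init__(self):
        self._initialized = True
        self._started = False

    def start(self):
        # Explicit checks are not stripped by -O, unlike assert statements.
        if not getattr(self, "_initialized", False):
            raise RuntimeError("__init__() not called")
        if self._started:
            raise RuntimeError("already started")
        self._started = True


if __name__ == "__main__":
    t = OneShot()
    t.start()
    try:
        t.start()
    except RuntimeError as exc:
        print("second start() rejected:", exc)
```

Because the checks are ordinary if statements, the behaviour is the same with and without -O.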

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jimjjewett at gmail.com  Tue Jun  5 00:55:01 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 4 Jun 2007 18:55:01 -0400
Subject: [Python-Dev] svn viewer confused
Message-ID: <fb6fbf560706041555q6d32afbfg5103de10da578c2a@mail.gmail.com>

Choosing a revision, such as

http://svn.python.org/view/python/trunk/Objects/?rev=55606&sortby=date&view=log

does not lead to the correct generated page; it either times out or
generates a much older changelog.

From tcdelaney at optusnet.com.au  Tue Jun  5 14:03:23 2007
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Tue, 5 Jun 2007 22:03:23 +1000
Subject: [Python-Dev] Patch #1731330 - pysqlite_cache_display - missing
	Py_DECREF
Message-ID: <008601c7a769$84a7c2a0$0201a8c0@mshome.net>

I've added patch #1731330 to fix a missing Py_DECREF in 
pysqlite_cache_display. I've attached the diff to this email.

I haven't actually been able to test this - haven't been able to get 
pysqlite compiled here on cygwin yet. I just noticed it when taking an 
example of using PyObject_Print ...

Cheers,

Tim Delaney 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sqlite_cache.diff
Type: application/octet-stream
Size: 426 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070605/bb5dc4fe/attachment.obj 

From bjourne at gmail.com  Tue Jun  5 18:34:49 2007
From: bjourne at gmail.com (BJörn Lindqvist)
Date: Tue, 5 Jun 2007 18:34:49 +0200
Subject: [Python-Dev] Minor ConfigParser Change
In-Reply-To: <200706011708.45883.fdrake@acm.org>
References: <46585729.2030305@gmail.com> <200705310045.58802.fdrake@acm.org>
	<740c3aec0706010854m426efa53s36a175923edda136@mail.gmail.com>
	<200706011708.45883.fdrake@acm.org>
Message-ID: <740c3aec0706050934u6c53f4a5k3f6fda82cd1e6f72@mail.gmail.com>

On 6/1/07, Fred L. Drake, Jr. <fdrake at acm.org> wrote:
> Changes in general are a source of risk; they have to be considered carefully.
> We've seen too many cases in which a change was thought to be safe, but broke
> something for someone.  Avoiding style-only changes helps avoid introducing
> problems without being able to predict them; there are tests for
> ConfigParser, but it's hard to be sure every corner case has been covered.

I understand what you mean: all changes carry a certain risk,
especially in code as widely relied upon as the standard library.
But the alternative, letting the code rot while one-line fixes are
piled onto it, is much worse.

It is true that unit tests do not cover all corner cases and that
you can't be 100% sure that a change won't break something for
someone. But on the other hand, the whole point of unit tests is to
facilitate exactly this kind of change. If something breaks, then
that is a great opportunity to introduce more tests.

> This is a general policy in the Python project, not simply my preference.  I'd
> love to be able to say "yes, the code is painful to read, let's make it
> nicer", but it's hard to say that without being able to say "I'm sure it
> won't break anything for anybody."  Python's too flexible for that to be
> easy.

While what you have stated is the policy, I can't help but think that
it is totally misguided (no offense intended). Maybe the policy can be
reevaluated?


-- 
mvh Björn

From guido at python.org  Tue Jun  5 23:25:55 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 5 Jun 2007 14:25:55 -0700
Subject: [Python-Dev] Patch #1731330 - pysqlite_cache_display - missing
	Py_DECREF
In-Reply-To: <008601c7a769$84a7c2a0$0201a8c0@mshome.net>
References: <008601c7a769$84a7c2a0$0201a8c0@mshome.net>
Message-ID: <ca471dc20706051425w753567a8u75d483472356c8ba@mail.gmail.com>

On 6/5/07, Tim Delaney <tcdelaney at optusnet.com.au> wrote:
> I've added patch #1731330 to fix a missing Py_DECREF in
> pysqlite_cache_display. I've attached the diff to this email.
>
> I haven't actually been able to test this - haven't been able to get
> pysqlite compiled here on cygwin yet. I just noticed it when taking an
> example of using PyObject_Print ...

Committed revision 55783.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tcdelaney at optusnet.com.au  Wed Jun  6 01:31:26 2007
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Wed, 6 Jun 2007 09:31:26 +1000
Subject: [Python-Dev] Patch #1731330 - pysqlite_cache_display - missing
	Py_DECREF
In-Reply-To: <ca471dc20706051425w753567a8u75d483472356c8ba@mail.gmail.com>
References: <008601c7a769$84a7c2a0$0201a8c0@mshome.net>
	<ca471dc20706051425w753567a8u75d483472356c8ba@mail.gmail.com>
Message-ID: <98985ab20706051631m7128b02apc4fb9daa810a6985@mail.gmail.com>

On 06/06/07, Guido van Rossum <guido at python.org> wrote:
>
> On 6/5/07, Tim Delaney <tcdelaney at optusnet.com.au> wrote:
> > I've added patch #1731330 to fix a missing Py_DECREF in
> > pysqlite_cache_display. I've attached the diff to this email.
> >
> > I haven't actually been able to test this - haven't been able to get
> > pysqlite compiled here on cygwin yet. I just noticed it when taking an
> > example of using PyObject_Print ...
>
> Committed revision 55783.


Thanks. I've added a comment that it also needs to be applied to p3yk.

I've done a quick search for other places with the same code, on the
off-chance that it was copied from elsewhere. Didn't turn up any other
cases.

Tim Delaney

From josepharmbruster at gmail.com  Wed Jun  6 03:09:56 2007
From: josepharmbruster at gmail.com (Joseph Armbruster)
Date: Tue, 05 Jun 2007 21:09:56 -0400
Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8
In-Reply-To: <938f42d70706040609s5c35268sfd64fe0df167e241@mail.gmail.com>
References: <46620C8E.1@gmail.com>	
	<082d01c7a6a5$436a19b0$1f0a0a0a@enfoldsystems.local>
	<938f42d70706040609s5c35268sfd64fe0df167e241@mail.gmail.com>
Message-ID: <46660964.6090600@gmail.com>

Mark,

My apologies for being a day late, got working on some other things.
So here's the scoop as it relates to the issue at hand:

- If you run rt.bat from the trunk as-is, in a path that 
contains a space, you receive the error outlined in 
resultwithspace.txt.

- If you run rt.bat from the trunk as-is, in a path that 
does not contain a space, you receive no errors, as outlined in 
resultwithoutspace.txt.

- If you run rt.bat with the patch, on Windows XP, you receive no errors, 
as outlined in resultafterpatch.txt.

The patch is attached.  Probably my biggest question now is the use of 
GetVersion as opposed to GetVersionEx. According to the MSDN, it doesn't 
appear to be all that undesirable:

http://msdn2.microsoft.com/en-us/library/ms724451.aspx

Your thoughts?

Joseph Armbruster


Joseph Armbruster wrote:
> Mark,
> 
> Sounds good, I will get patching tonight.  Any thoughts on CreateProcessW ?
> 
> Joseph Armbruster
> 
> On 6/4/07, *Mark Hammond* < mhammond at skippinet.com.au 
> <mailto:mhammond at skippinet.com.au>> wrote:
> 
>      > All,
>      >
>      > I wanted to pass this one around before opening an issue on it.
>      > When running the unit test for popen via rt.bat (in PCBuild8),
>      > I received the following error:
>      >
>      > === BEGIN ERROR ===
>      >
>      > C:\Documents and
>      > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>rt test_popen
>      > Deleting .pyc/.pyo files ...
>      > 43 .pyc deleted, 0 .pyo deleted
>      >
>      > C:\Documents and
>      > Settings\joe\Desktop\Development\Python\trunk\PCbuild8>win32Re
>      > lease\python.exe -E -tt ../lib/test/regrtest.py test_popen
>      > test_popen
>      > test test_popen failed -- Traceback (most recent call last):
>      >    File "C:\Documents and Settings\joe\Desktop\Development\Python\...
> 
>     I can't reproduce this.  I expect you will find it is due to the
>     space in
>     the filename of your Python directory, via cmd.exe's documented
>     behaviour
>     with quote characters.  A patch that allows the test suite to work
>     in such
>     an environment would be welcome, but I think you might end up
>     needing access
>     to GetShortPathName() rather than CreateProcess().
> 
>     Cheers,
> 
>     Mark
> 
> 

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: popen.patch
Url: http://mail.python.org/pipermail/python-dev/attachments/20070605/c95b09df/attachment.pot 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: resultafterpatch.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20070605/c95b09df/attachment.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: resultwithoutspace.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20070605/c95b09df/attachment-0001.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: resultwithspace.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20070605/c95b09df/attachment-0002.txt 

From mhammond at skippinet.com.au  Wed Jun  6 05:26:33 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Wed, 6 Jun 2007 13:26:33 +1000
Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8
In-Reply-To: <46660964.6090600@gmail.com>
Message-ID: <09d201c7a7ea$7bc23720$1f0a0a0a@enfoldsystems.local>

> My apologies for being a day late, got working on some other things.
> So here's the scoop as it relates to the issue at hand:
>
> - If you run rt.bat from the trunk as-is and place it in a path that
> contains an empty space, you receive the error outlined in
> resultwithspace.txt.
>
> - If you run rt.bat from the trunk as-is and place it in a path that
> does not contain an empty space, you receive no errors as outlined in
> resultwithoutspace.txt.
>
> - If you run rt.bat with the patch, on Windows XP, you
> receive no errors as outlined in resultafterpatch.txt.

In that last step, you failed to indicate if the path had a space or not.
ie, on Windows XP I get that behaviour now without needing to apply the
patch.

> The patch is attached.

The vast majority of the patch is insignificant - it is either adding braces
where they are not necessary, or changing whitespace inappropriately (the
spaces you replaced are so the lines all line up regardless of the tab
width.)  It seems there is only one significant block in your patch, and it's
not clear to me what the intent of the patch is - I admit I didn't apply it
and look at it in-place, but a couple of comments indicating exactly what
you are trying to do would be good, especially as I'm not aware of this
behaviour change from Win2K -> WinXP.

> Probably my biggest question now is
> the use of GetVersion as opposed to GetVersionEx.

The existing code explicitly checks if it is the 9x or NT family, which your
patch no longer does.  It seems to me that Windows ME will also qualify -
although in general the strcmp for command.com will succeed, if an
alternative shell is installed on a ME box it will do the wrong thing.  If
you need to check anything more than the high-bit of GetVersion(), IMO it
should be replaced with GetVersionEx().

Cheers,

Mark


From mhammond at skippinet.com.au  Wed Jun  6 05:45:08 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Wed, 6 Jun 2007 13:45:08 +1000
Subject: [Python-Dev] popen test case inquiry in r55735 using PCBuild8
In-Reply-To: <46660964.6090600@gmail.com>
Message-ID: <09d901c7a7ed$162d3e20$1f0a0a0a@enfoldsystems.local>

> My apologies for being a day late, got working on some other things.
> So here's the scoop as it relates to the issue at hand:

Something else I meant to mention:  your problem is that the test suite
fails in some circumstances, but these circumstances are not met for most
core developers or when running the python test suite from the directory it
is built in, but your proposed fix is a patch to os.popen().  There would
also need to be new test cases added to demonstrate this bug in a "normal"
test run.

Cheers,

Mark


From bjourne at gmail.com  Wed Jun  6 17:29:26 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Wed, 6 Jun 2007 17:29:26 +0200
Subject: [Python-Dev] Patch reviews
Message-ID: <740c3aec0706060829l7cfa5d87o14916a93f87bbe44@mail.gmail.com>

Here is a review of some patches:

* [ 1673759 ] '%G' string formatting doesn't catch same errors as '%g'

This patch is done, has been reviewed and works as advertised. Just
needs someone to commit it I think.

* [ 1100942 ] datetime.strptime constructor added

Doesn't apply cleanly, emits compiler warnings, but works as
expected. Lacks tests.

* [  968063 ] Add fileinput.islastline()

Good and useful patch (see the pup.py program) but lacks unit tests
and documentation.

* [ 1501979 ] syntax errors on continuation lines

Doesn't apply cleanly, but that is easy to fix. Needs someone to fix a
few minor flaws in it, but the patch works really well.

* [ 1375011 ] Improper handling of duplicate cookies

Fixes a fairly exotic bug in which Python's cookie handling deviates in
an obscure way from Netscape's cookie specification. See the bug about
it at 1372650. As far as I can understand, the patch fixes the problem.

If someone still does the 5 for 1 deal, my patch is [ 1676135 ].

-- 
mvh Björn

From facundo at taniquetil.com.ar  Wed Jun  6 21:19:36 2007
From: facundo at taniquetil.com.ar (Facundo Batista)
Date: Wed, 6 Jun 2007 19:19:36 +0000 (UTC)
Subject: [Python-Dev] Timeout in urllib2.urlopen
Message-ID: <f471c7$amr$1@sea.gmane.org>


facundo at expiron:~/devel/reps/python/trunk$ ./python 
Python 2.6a0 (trunk, Jun  6 2007, 12:32:23) 
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> u = urllib2.urlopen("http://www.taniquetil.com.ar/plog")
>>> u.headers.items()
[..., ('content-type', 'text/html;charset=iso-8859-1'), ...]
>>> 
>>> u = urllib2.urlopen("http://www.taniquetil.com.ar/plog", timeout=3)
Traceback (most recent call last):
  ...
urllib2.URLError: <urlopen error timed out>
>>> 

Ok, my blog is a bit slow, but the timeout is working ok, :D

Regards,

-- 
.   Facundo
.
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/



From theller at ctypes.org  Wed Jun  6 21:34:42 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 06 Jun 2007 21:34:42 +0200
Subject: [Python-Dev] Windows buildbot (Was: buildbot failure in x86 W2k
	trunk)
In-Reply-To: <f3kkif$si7$1@sea.gmane.org>
References: <20070520071645.BA1C01E4004@bag.python.org>	<bbaeab100705200023l7f58ff1fw9f54160d3ee2544a@mail.gmail.com>	<464FFFDC.4020600@v.loewis.de>	<46547C7F.7040908@activestate.com>	<4654C280.1080802@v.loewis.de>
	<f36m5l$p9t$1@sea.gmane.org> <f3kkif$si7$1@sea.gmane.org>
Message-ID: <f4728l$dlb$1@sea.gmane.org>

Thomas Heller wrote:
> Thomas Heller wrote:
>>>> Are there others that can provide a Windows buildbot? It would probably
>>>> be good to have two -- and a WinXP one would be good.
>>> 
>>> It certainly would be good. Unfortunately, Windows users are not that
>>> much engaged in the open source culture, so few of them volunteer
>>> (plus it's more painful, with Windows not being a true multi-user
>>> system).
>> 
>> I'll try to setup a buildbot under WinXP.
> 
> The buildbot is now working.

Should I try to setup another buildbot client for win32/AMD64?

Thomas


From martin at v.loewis.de  Wed Jun  6 22:55:40 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 06 Jun 2007 22:55:40 +0200
Subject: [Python-Dev] Windows buildbot (Was: buildbot failure in x86 W2k
 trunk)
In-Reply-To: <f4728l$dlb$1@sea.gmane.org>
References: <20070520071645.BA1C01E4004@bag.python.org>	<bbaeab100705200023l7f58ff1fw9f54160d3ee2544a@mail.gmail.com>	<464FFFDC.4020600@v.loewis.de>	<46547C7F.7040908@activestate.com>	<4654C280.1080802@v.loewis.de>	<f36m5l$p9t$1@sea.gmane.org>
	<f3kkif$si7$1@sea.gmane.org> <f4728l$dlb$1@sea.gmane.org>
Message-ID: <46671F4C.1000601@v.loewis.de>

> Should I try to setup another buildbot client for win32/AMD64?

We don't have a Win64 buildbot yet. Depending on whether you plan
to use PCbuild or PCbuild8, this might be a challenge to get working
(I think it's a challenge either way, but different).
If it could actually work, it might be useful.

Regards,
Martin

From armin.ronacher at active-4.com  Thu Jun  7 15:18:43 2007
From: armin.ronacher at active-4.com (Armin Ronacher)
Date: Thu, 7 Jun 2007 13:18:43 +0000 (UTC)
Subject: [Python-Dev] NaN / Infinity in Python
Message-ID: <loom.20070607T150706-820@post.gmane.org>

Hi,

It's one of those "non issues" but there are still some situations where you
have to deal with Infinity and NaN, even in Python. Basically one of the problems
is that the repr of floating point numbers is platform dependent and sometimes
yields "nan" which is not evaluable. It's true that eval() is probably a bad
thing but there are some libraries that use repr/%r for code generation and it
could happen that they produce erroneous code because of that. Also there is no
way to get the Infinity and NaN values and also no way to test if they exist.

Maybe changing the repr of `nan` to `math.NaN` and `inf` to `math.Infinity` as
well as `-inf` to `(-math.Infinity)` and add that code to the math module (of
course as a C implementation, there are even macros for testing for NaN values)::

    Infinity = 1e10000
    NaN = Infinity / Infinity

    def is_nan(x):
        return type(x) is float and x != x

    def is_finite(x):
        return x != Infinity

Bugs related to this issue:

 - http://bugs.python.org/1732212 [repr of 'nan' floats not parseable]
 - http://bugs.python.org/1481296 [long(float('nan'))!=0L]

Regards,
Armin


From jcarlson at uci.edu  Thu Jun  7 20:46:57 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 07 Jun 2007 11:46:57 -0700
Subject: [Python-Dev] NaN / Infinity in Python
In-Reply-To: <loom.20070607T150706-820@post.gmane.org>
References: <loom.20070607T150706-820@post.gmane.org>
Message-ID: <20070607113552.6F4D.JCARLSON@uci.edu>


Armin Ronacher <armin.ronacher at active-4.com> wrote:
> It's one of those "non issues" but there are still some situations where you
> have to deal with Infinity and NaN, even in Python. Basically one of the problems
> is that the repr of floating point numbers is platform dependent and sometimes
> yields "nan" which is not evaluable. It's true that eval() is probably a bad
> thing but there are some libraries that use repr/%r for code generation and it
> could happen that they produce erroneous code because of that. Also there is no
> way to get the Infinity and NaN values and also no way to test if they exist.
> 
> Maybe changing the repr of `nan` to `math.NaN` and `inf` to `math.Infinity` as
> well as `-inf` to `(-math.Infinity)` and add that code to the math module (of
> course as a C implementation, there are even macros for testing for NaN values)::

That would work for eval(repr(x)), but it fails for float(repr(x)).  I
believe this particular issue has been brought up before, as well as the
particular solution, but I can't remember the final outcome.


>     Infinity = 1e10000

Has the storage of infinity in .pyc files been fixed?  For a while it
was broken.

>     NaN = Infinity / Infinity
> 
>     def is_nan(x):
>         return type(x) is float and x != x
> 
>     def is_finite(x):
>         return x != Infinity

you mean

    def is_finite(x):
        return x not in (Infinity, -Infinity)


 - Josiah
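
The helpers discussed in this thread can be sketched in pure Python as
follows. This is only an illustration, not the proposed C implementation:
the names INFINITY, NAN, is_nan and is_finite are placeholders rather than
an existing math module API, and treating NaN as non-finite is an
assumption added here that neither version quoted above makes.

```python
# Illustrative sketch; the names below are placeholders, not a real math API.
INFINITY = float("inf")
NAN = float("nan")


def is_nan(x):
    # NaN is the only float value that compares unequal to itself.
    return isinstance(x, float) and x != x


def is_finite(x):
    # Check both signs of infinity, as suggested in the reply above.
    # Rejecting NaN as well is an assumption made for this sketch.
    return not is_nan(x) and x not in (INFINITY, -INFINITY)
```

Without the explicit NaN check, is_finite(NAN) would return True, because
NaN compares unequal to both infinities.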


From theller at ctypes.org  Fri Jun  8 15:45:47 2007
From: theller at ctypes.org (Thomas Heller)
Date: Fri, 08 Jun 2007 15:45:47 +0200
Subject: [Python-Dev] Windows buildbot (Was: buildbot failure in x86 W2k
	trunk)
In-Reply-To: <46671F4C.1000601@v.loewis.de>
References: <20070520071645.BA1C01E4004@bag.python.org>	<bbaeab100705200023l7f58ff1fw9f54160d3ee2544a@mail.gmail.com>	<464FFFDC.4020600@v.loewis.de>	<46547C7F.7040908@activestate.com>	<4654C280.1080802@v.loewis.de>	<f36m5l$p9t$1@sea.gmane.org>	<f3kkif$si7$1@sea.gmane.org>
	<f4728l$dlb$1@sea.gmane.org> <46671F4C.1000601@v.loewis.de>
Message-ID: <f4bmib$hj3$1@sea.gmane.org>

Martin v. Löwis wrote:
>> Should I try to setup another buildbot client for win32/AMD64?
> 
> We don't have a Win64 buildbot yet. Depending on whether you plan
> to use PCbuild or PCbuild8, this might be a challenge to get working
> (I think it's a challenge either way, but different).
> If it could actually work, it might be useful.

For release25-maint, probably PCBuild should be used since that is used to create the installer.
For trunk/Python 2.6 I don't know what you will use for the release version.


Where do you think is the challenge to get it working?  For the buildbot client itself
I would use the same stuff as in the WinXP buildbot (32-bit python2.4, twisted, buildbot,
pywin32).

For the build scripts in Tools\buildbot I made some small changes to the batch files (appended).
They look for a PROCESSOR_ARCHITECTURE env var, and if this is equal to AMD64 the build target
and one or two small other things are changed from the default.  So these scripts currently build
the PCBuild process.

I can run Tools\buildbot\build.bat, then Tools\buildbot\test.bat, and Tools\buildbot\clean.bat
successfully.  Of course this does not mean that *everything* is built correctly - currently
_sqlite3, bz2, _tkinter, and _ssl are not built for several reasons.

If you want me to go online with the buildbot please send me a HOST:PORT and PASSWORD.

Thomas

Index: Tools/buildbot/build.bat
===================================================================

--- Tools/buildbot/build.bat	(revision 55792)
+++ Tools/buildbot/build.bat	(working copy)
@@ -2,4 +2,6 @@
 cmd /c Tools\buildbot\external.bat
 call "%VS71COMNTOOLS%vsvars32.bat"
 cmd /q/c Tools\buildbot\kill_python.bat
-devenv.com /useenv /build Debug PCbuild\pcbuild.sln
+set _TARGET=Debug
+if "%PROCESSOR_ARCHITECTURE%" == "AMD64" set _TARGET=ReleaseAMD64
+devenv.com /useenv /build %_TARGET% PCbuild\pcbuild.sln
Index: Tools/buildbot/test.bat
===================================================================
--- Tools/buildbot/test.bat	(revision 55792)
+++ Tools/buildbot/test.bat	(working copy)
@@ -1,3 +1,5 @@
 @rem Used by the buildbot "test" step.
 cd PCbuild
-call rt.bat -d -q -uall -rw
+set _DEBUG=-d
+if "%PROCESSOR_ARCHITECTURE%"=="AMD64" set _DEBUG=
+call rt.bat %_DEBUG% -q -uall -rw
Index: Tools/buildbot/clean.bat
===================================================================
--- Tools/buildbot/clean.bat	(revision 55792)
+++ Tools/buildbot/clean.bat	(working copy)
@@ -2,5 +2,9 @@
 call "%VS71COMNTOOLS%vsvars32.bat"
 cd PCbuild
 @echo Deleting .pyc/.pyo files ...
-python_d.exe rmpyc.py
-devenv.com /clean Debug pcbuild.sln
+set _PYTHON=python_d.exe
+if "%PROCESSOR_ARCHITECTURE%"=="AMD64" set _PYTHON=python.exe
+%_PYTHON% rmpyc.py
+set TARGET=Debug
+if "%PROCESSOR_ARCHITECTURE%" == "AMD64" set TARGET=ReleaseAMD64
+devenv.com /clean %TARGET% pcbuild.sln
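
The selection logic in the three batch files above can be restated as a
small Python sketch. It is not part of the patch: the function name
buildbot_settings and the dictionary keys are made up for illustration,
and only the values (Debug, ReleaseAMD64, python_d.exe, python.exe and the
rt.bat flags) are taken from the diff.

```python
import os


def buildbot_settings(environ=None):
    """Return the build target, interpreter and rt.bat flags for this host.

    Hypothetical helper mirroring the PROCESSOR_ARCHITECTURE checks in the
    batch files; pass a dict to override the real environment.
    """
    env = os.environ if environ is None else environ
    is_amd64 = env.get("PROCESSOR_ARCHITECTURE") == "AMD64"
    return {
        # build.bat / clean.bat: solution configuration handed to devenv.com
        "target": "ReleaseAMD64" if is_amd64 else "Debug",
        # clean.bat: interpreter used to run rmpyc.py
        "python": "python.exe" if is_amd64 else "python_d.exe",
        # test.bat: -d (debug build) is dropped on AMD64
        "rt_flags": (["-q", "-uall", "-rw"] if is_amd64
                     else ["-d", "-q", "-uall", "-rw"]),
    }
```

Keeping all three decisions in one place makes it easier to see that each
script needs a default value as well as the AMD64 override.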


From martin at v.loewis.de  Fri Jun  8 21:26:28 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 08 Jun 2007 21:26:28 +0200
Subject: [Python-Dev] Windows buildbot (Was: buildbot failure in x86 W2k
 trunk)
In-Reply-To: <f4bmib$hj3$1@sea.gmane.org>
References: <20070520071645.BA1C01E4004@bag.python.org>	<bbaeab100705200023l7f58ff1fw9f54160d3ee2544a@mail.gmail.com>	<464FFFDC.4020600@v.loewis.de>	<46547C7F.7040908@activestate.com>	<4654C280.1080802@v.loewis.de>	<f36m5l$p9t$1@sea.gmane.org>	<f3kkif$si7$1@sea.gmane.org>	<f4728l$dlb$1@sea.gmane.org>
	<46671F4C.1000601@v.loewis.de> <f4bmib$hj3$1@sea.gmane.org>
Message-ID: <4669AD64.3010302@v.loewis.de>

> For release25-maint, probably PCBuild should be used since that is used to create the installer.
> For trunk/Python 2.6 I don't know what you will use for the release version.

I actually don't know either, yet. I would like to use Orcas, but it's
not clear when this will be released; neither is clear when 2.6 will
be released.

I notice that Kristjan would like to see a PCbuild8 buildbot.

> Where do you think is the challenge to get it working?  For the buildbot client itself
> I would use the same stuff as in the WinXP buildbot (32-bit python2.4, twisted, buildbot,
> pywin32).

I think the scripts in Tools\buildbot might be tricky, along with
possible changes to the master.

> For the build scripts in Tools\buildbot I made some small changes to the batch files (appended).
> They look for a PROCESSOR_ARCHITECTURE env var, and if this is equal to AMD64 the build target
> and one or two small other things are changed from the default.  So these scripts currently build
> the PCBuild process.

That's an interesting option. I had myself arranged for the master to
issue a different build command to an AMD64 build slave.

> If you want me to go online with the buildbot please send me a HOST:PORT and PASSWORD.

Doing so in a separate message.

Regards,
Martin

From lance.ellinghaus at eds.com  Sat Jun  9 00:43:16 2007
From: lance.ellinghaus at eds.com (Ellinghaus, Lance)
Date: Fri, 8 Jun 2007 18:43:16 -0400
Subject: [Python-Dev] Compiling 2.5.1 under Studio 11
Message-ID: <752A61D5C34D41478E638FC92AF9051B0101B61E@usahm207.amer.corp.eds.com>

Hello,

I am having a couple of issues compiling Python 2.5.1 under Sun Solaris
Studio 11 on Solaris 8.

Everything compiles correctly except the _ctypes module because it
cannot use the libffi that comes with Python and it does not exist on
the system.
Has anyone gotten it to compile correctly using Studio 11? I know it
will compile under GCC but I am not allowed to use GCC.

Also, during the pyexpat tests, Python generates a segfault. 

Are there any patches to fix these?

Thank you,

Lance



From eyal.lotem at gmail.com  Sat Jun  9 05:49:57 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sat, 9 Jun 2007 06:49:57 +0300
Subject: [Python-Dev] cProfile with generator throwing
Message-ID: <b64f365b0706082049s3892017o429142715f6cb9ee@mail.gmail.com>

Hi. It seems that cProfile does not support throwing exceptions into
generators properly, when an external timer routine is used.

The problem is that _lsprof.c: ptrace_enter_call assumes that there
are no exceptions set when it is called, which is not true when the
generator frame is being gen_send_ex'd to send an exception into it
(Maybe you could say that only CallExternalTimer assumes this, but I
am not sure).  This assumption causes its eventual call to
CallExternalTimer to discover that an error is set and assume that it
was caused by its own work (which it wasn't).

I am not sure what the right way to fix this is, so I cannot send a patch.
Here is a minimalist example to reproduce the bug:

>>> import cProfile
>>> import time
>>> p=cProfile.Profile(time.clock)
>>> def f():
...   yield 1
...
>>> p.run("f().throw(Exception())")
Exception exceptions.Exception: Exception() in <cProfile.Profile
object at 0xb7f5a304> ignored
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/cProfile.py", line 135, in run
    return self.runctx(cmd, dict, dict)
  File "/usr/lib/python2.5/cProfile.py", line 140, in runctx
    exec cmd in globals, locals
  File "<string>", line 1, in <module>
  File "<stdin>", line 1, in f
SystemError: error return without exception set
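
The same reproduction can be written as a standalone script, which may be
easier to turn into a regression test. This is a sketch under a few
assumptions: time.clock no longer exists in current Python 3 releases, so
time.perf_counter is swapped in as the external timer; runctx() with an
explicit namespace replaces run() so the script does not depend on
__main__; and the helper name profile_throw is made up. On an interpreter
without the bug, the thrown exception should simply propagate to the
caller instead of turning into a SystemError.

```python
import cProfile
import time


def f():
    yield 1


def profile_throw():
    """Throw into a generator while profiling with an external timer.

    Returns (exception type name, message) for whatever escapes the
    profiler, or None if nothing was raised.
    """
    # time.perf_counter stands in for time.clock (assumption, see above).
    profiler = cProfile.Profile(time.perf_counter)
    try:
        profiler.runctx("f().throw(ValueError('boom'))", {"f": f}, {})
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None
```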

From g.brandl at gmx.net  Sat Jun  9 09:42:08 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 09 Jun 2007 09:42:08 +0200
Subject: [Python-Dev] cProfile with generator throwing
In-Reply-To: <b64f365b0706082049s3892017o429142715f6cb9ee@mail.gmail.com>
References: <b64f365b0706082049s3892017o429142715f6cb9ee@mail.gmail.com>
Message-ID: <f4dlkf$3fn$2@sea.gmane.org>

Eyal Lotem wrote:
> Hi. It seems that cProfile does not support throwing exceptions into
> generators properly, when an external timer routine is used.
> 
> The problem is that _lsprof.c: ptrace_enter_call assumes that there
> are no exceptions set when it is called, which is not true when the
> generator frame is being gen_send_ex'd to send an exception into it
> (Maybe you could say that only CallExternalTimer assumes this, but I
> am not sure).  This assumption causes its eventual call to
> CallExternalTimer to discover that an error is set and assume that it
> was caused by its own work (which it wasn't).
> 
> I am not sure what the right way to fix this is, so I cannot send a patch.
> Here is a minimalist example to reproduce the bug:
> 
>>>> import cProfile
>>>> import time
>>>> p=cProfile.Profile(time.clock)
>>>> def f():
> ...   yield 1
> ...
>>>> p.run("f().throw(Exception())")
> Exception exceptions.Exception: Exception() in <cProfile.Profile
> object at 0xb7f5a304> ignored
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.5/cProfile.py", line 135, in run
>     return self.runctx(cmd, dict, dict)
>   File "/usr/lib/python2.5/cProfile.py", line 140, in runctx
>     exec cmd in globals, locals
>   File "<string>", line 1, in <module>
>   File "<stdin>", line 1, in f
> SystemError: error return without exception set

There might be a similar problem with trace functions, see bug #1733757 which
is quite obscure too.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From martin at v.loewis.de  Sat Jun  9 10:41:22 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 09 Jun 2007 10:41:22 +0200
Subject: [Python-Dev] Compiling 2.5.1 under Studio 11
In-Reply-To: <752A61D5C34D41478E638FC92AF9051B0101B61E@usahm207.amer.corp.eds.com>
References: <752A61D5C34D41478E638FC92AF9051B0101B61E@usahm207.amer.corp.eds.com>
Message-ID: <466A67B2.6030505@v.loewis.de>

> I am having a couple of issues compiling Python 2.5.1 under Sun Solaris
> Studio 11 on Solaris 8.
> 
> Everything compiles correctly except the _ctypes module because it
> cannot use the libffi that comes with Python and it does not exist on
> the system.
> 
> Has anyone gotten it to compile correctly using Studio 11? 

This is not a question for python-dev; please ask it on comp.lang.python.

In any case, what processor are you using? I have compiled
Python successfully with Sun C 5.8.

> Also, during the pyexpat tests, Python generates a segfault.
> 
> Are there any patches to fix these?

Without knowing what precisely the problem is, it is difficult
to say whether it has been fixed.

Regards,
Martin

From snaury at gmail.com  Sat Jun  9 22:23:20 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Sun, 10 Jun 2007 00:23:20 +0400
Subject: [Python-Dev] zipfile and unicode filenames
Message-ID: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>

Hi everyone,

Today I've stumbled upon a bug in my program that wasn't very
straightforward to understand. The problem is that I was passing
unicode filenames to zipfile.ZipFile.write and I had
sys.setdefaultencoding() in effect, which resulted in a situation
where most of the bytes generated in zipfile.ZipInfo.FileHeader would
pass through, except for a few, which caused a codec error on another
machine (where filenames got infectiously upgraded to unicode). The
problem here is that it was absolutely unclear at first that I get
unicode filenames passed to write, and it incorrectly accepted them
silently. Is it worth submitting a bug report on this? The desired
behavior here would be to either a) disallow unicode strings as
arcname and raise an exception (since it is used in concatenation with
raw data it is likely to cause problems because of auto upgrading raw
data to unicode), or b) silently encode unicode strings to raw strings
(something like if isinstance(filename, unicode): filename =
filename.encode() in zipfile.ZipInfo constructor).

So, should I submit a bug report, and which behavior would be actually correct?
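
The two options can be sketched as one small helper. This is not zipfile
code: coerce_arcname and its strict flag are hypothetical names, the
sketch uses Python 3 spellings (bytes and str where the message above
says raw string and unicode), and UTF-8 as the default encoding is an
assumption, since the proposal above would use the interpreter's default
encoding.

```python
def coerce_arcname(name, encoding="utf-8", strict=False):
    """Return an archive member name as raw bytes.

    strict=True  -> option (a): refuse text names outright.
    strict=False -> option (b): silently encode text names.
    Hypothetical helper for illustration only.
    """
    if isinstance(name, bytes):
        return name
    if strict:
        raise TypeError("arcname must be a byte string, got %r" % (name,))
    # Encode once, up front, so later concatenation with raw header
    # data never mixes text and bytes.
    return name.encode(encoding)
```

Either way the decision is made at one well-defined point, rather than
surfacing later as a codec error on a different machine.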

From eyal.lotem at gmail.com  Sat Jun  9 23:23:41 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sun, 10 Jun 2007 00:23:41 +0300
Subject: [Python-Dev] Instance variable access and descriptors
Message-ID: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>

Hi.

I was surprised to find in my profiling that instance variable access
was pretty slow.

I looked through the CPython code involved, and discovered something
that really surprises me.

Python, probably through the valid assumption that most attribute
lookups go to the class, tries to look for the attribute in the class
first, and in the instance, second.

What Python currently does is quite peculiar!
Here's a short description of PyObject_GenericGetAttr:

A. Python looks for a descriptor in the _entire_ mro hierarchy
(len(mro) class/type check and dict lookups).
B. If Python found a descriptor and it has both get and set functions
- it uses it to get the value and returns, skipping the next stage.
C. If Python either did not find a descriptor, or found one that has
no setter, it will try a lookup in the instance dict.
D. If Python failed to find it in the instance, it will use the
descriptor's getter, and if it has no getter it will use the
descriptor itself.
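
Steps A to D can be sketched in pure Python roughly as follows. This is
an approximation for illustration, not the C code: generic_getattr is a
made-up name, the sketch follows the description above in treating "has
both __get__ and __set__" as the test in step B, and it ignores details
such as __slots__ storage and __getattr__ fallbacks.

```python
def generic_getattr(obj, name):
    """Approximate the lookup order described in steps A-D above."""
    tp = type(obj)

    # A. Walk the entire MRO looking for a class-level attribute.
    descr = None
    found = False
    for klass in tp.__mro__:
        if name in vars(klass):
            descr = vars(klass)[name]
            found = True
            break

    # B. A descriptor with both a getter and a setter wins outright.
    if found:
        dtype = type(descr)
        if hasattr(dtype, "__get__") and hasattr(dtype, "__set__"):
            return dtype.__get__(descr, obj, tp)

    # C. Otherwise try the instance dict.
    try:
        inst_dict = object.__getattribute__(obj, "__dict__")
    except AttributeError:
        inst_dict = None
    if inst_dict is not None and name in inst_dict:
        return inst_dict[name]

    # D. Fall back to the descriptor's getter, or the class attribute itself.
    if found:
        dtype = type(descr)
        if hasattr(dtype, "__get__"):
            return dtype.__get__(descr, obj, tp)
        return descr

    raise AttributeError(name)
```

Note that step A always runs first, even when the name is sitting in the
instance dict, which is the cost being questioned here.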


I believe the average costs of A are much higher than of C. Because
there is just 1 instance dict to look through, and it is also
typically smaller than the class dicts (in rare cases of worst-case
timings of hash lookups), while there are len(mro) dicts to look for a
descriptor in.

This means that for simple instance variable lookups, Python is paying
the full mro lookup price!

I believe that this should be changed, so that Python first looks for
the attribute in the instance's dict and only then through the class's
mro.

This will have the following effects:

A. It will break code that uses instance.__dict__['var'] directly,
when 'var' exists as a property with a __set__ in the class. I believe
this is not significant.
B. It will simplify getattr's semantics. Python should _always_ give
precedence to instance attributes over class ones, rather than have
very weird special-cases (such as a property with a __set__).
C. It will greatly speed up instance variable access, especially when
the class has a large mro.

I think obviously the code breakage is the worst problem. This could
probably be addressed by a transition version in which Python warns
about any instance attributes that existed in the mro as descriptors
as well.

What do you think?

From steven.bethard at gmail.com  Sat Jun  9 23:51:57 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 9 Jun 2007 15:51:57 -0600
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
Message-ID: <d11dcfba0706091451o1303a1daub427abb4148074a9@mail.gmail.com>

On 6/9/07, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> I believe that this should be changed, so that Python first looks for
> the attribute in the instance's dict and only then through the dict's
> mro.
[snip]
> What do you think?

Are you suggesting that the following code should print "43" instead of "42"?
::

    >>> class C(object):
    ...     x = property(lambda self: self._x)
    ...     def __init__(self):
    ...         self._x = 42
    ...
    >>> c = C()
    >>> c.__dict__['x'] = 43
    >>> c.x
    42

If so, this is a pretty substantial backwards incompatibility, and you
should probably post this to python-ideas first to hash things out. If
people like it there, the right target is probably Python 3000, not
Python 2.x.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From steven.bethard at gmail.com  Sun Jun 10 00:18:37 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 9 Jun 2007 16:18:37 -0600
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <b64f365b0706091500i30698638vbc1414bd2973087e@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<d11dcfba0706091451o1303a1daub427abb4148074a9@mail.gmail.com>
	<b64f365b0706091500i30698638vbc1414bd2973087e@mail.gmail.com>
Message-ID: <d11dcfba0706091518j300b40cfg7263ac426b43f326@mail.gmail.com>

> On 6/10/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > On 6/9/07, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> > > I believe that this should be changed, so that Python first looks for
> > > the attribute in the instance's dict and only then through the dict's
> > > mro.
> >
> > Are you suggesting that the following code should print "43" instead of "42"?
> > ::
> >
> >     >>> class C(object):
> >     ...     x = property(lambda self: self._x)
> >     ...     def __init__(self):
> >     ...         self._x = 42
> >     ...
> >     >>> c = C()
> >     >>> c.__dict__['x'] = 43
> >     >>> c.x
> >     42

On 6/9/07, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> Yes, I do suggest that.
> But it's important to notice that this is not a suggestion in order to
> improve Python, but one that makes it possible to get reasonable
> performance out of CPython. As such, I don't believe it should be done
> in Py3K.
>
> Firstly, like everything that breaks backwards compatibility, it is
> possible to have a transitional version that spits warnings for all
> problems (detect name clashes between properties and instance dict).

Sure, but then you're talking about really introducing this in Python
2.7, with 2.6 as a transitional version. So take a minute to look at
the release timelines:

http://www.python.org/dev/peps/pep-0361/
The initial 2.6 target is for April 2008.

http://www.python.org/dev/peps/pep-3000/
I hope to have a first alpha release (3.0a1) out in the first half of
2007; it should take no more than a year from then before the first
proper release, named Python 3.0

So I'm expecting Python 3.0 to come out *before* 2.7. Thus if you're
proposing a backwards-incompatible change that would have to wait
until 2.7 anyway, why not propose it for 3.0 where
backwards-incompatible changes are more acceptable?

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From pje at telecommunity.com  Sun Jun 10 01:30:41 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 09 Jun 2007 19:30:41 -0400
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.co
 m>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
Message-ID: <20070609232842.005993A4060@sparrow.telecommunity.com>

At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
>A. It will break code that uses instance.__dict__['var'] directly,
>when 'var' exists as a property with a __set__ in the class. I believe
>this is not significant.
>B. It will simplify getattr's semantics. Python should _always_ give
>precedence to instance attributes over class ones, rather than have
>very weird special-cases (such as a property with a __set__).

Actually, these are features that are both used and desirable; I've 
been using them both since Python 2.2 (i.e., for many years 
now).  I'm -1 on removing these features from any version of Python, even 3.0.


>C. It will greatly speed up instance variable access, especially when
>the class has a large mro.

...at the cost of slowing down access to properties and __slots__, by 
adding an *extra* dictionary lookup there.

Note, by the way, that if you want to change attribute lookup 
semantics, you can always override __getattribute__ and make it work 
whatever way you like, without forcing everybody else to change *their* code.
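
A minimal sketch of such an override, assuming the goal is "instance dict
first, then the normal lookup". The class names InstanceFirst and Example
are made up for illustration, and the override applies only to classes
that opt in by inheriting from the base class.

```python
class InstanceFirst(object):
    """Opt-in base class: instance attributes shadow class descriptors."""

    def __getattribute__(self, name):
        # Fetch the instance dict through the default machinery to avoid
        # recursing into this method.
        inst_dict = object.__getattribute__(self, "__dict__")
        if name in inst_dict:
            return inst_dict[name]
        # Everything else (properties, methods, __dict__ itself) uses the
        # standard lookup order.
        return object.__getattribute__(self, name)


class Example(InstanceFirst):
    x = property(lambda self: self._x)

    def __init__(self):
        self._x = 42
```

With this base class, storing 43 under 'x' in an Example instance's
__dict__ makes the instance value win over the property, which is the
behaviour proposed earlier in the thread, but confined to one class
hierarchy.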


From status at bugs.python.org  Sun Jun 10 02:00:44 2007
From: status at bugs.python.org (Tracker)
Date: Sun, 10 Jun 2007 00:00:44 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070610000044.83913780EA@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (06/03/07 - 06/10/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1645 open ( +0) /  8584 closed ( +0) / 10229 total ( +0)

Average duration of open issues: 822 days.
Median duration of open issues: 770 days.

Open Issues Breakdown
   open  1645 ( +0)
pending     0 ( +0)

Issues Now Closed (1)
_____________________

New issue test for email                                          87 days
       http://bugs.python.org/issue1001    admin             




From bioinformed at gmail.com  Sun Jun 10 02:28:31 2007
From: bioinformed at gmail.com (Kevin Jacobs <jacobs@bioinformed.com>)
Date: Sat, 9 Jun 2007 20:28:31 -0400
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <20070609232842.005993A4060@sparrow.telecommunity.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
Message-ID: <2e1434c10706091728w593f4016p5f09ee49be97e80d@mail.gmail.com>

I agree with Phillip with regard to the semantics.  They are semantically
desirable.  However, there is a patch to add a mro cache to speed up these
sorts of cases on the Python tracker, originally submitted by Armin Rigo.
He saw ~20% speedups, others see less.  It is currently just sitting there
with no apparent activity.  So if the overhead of mro lookups is that
bothersome, it may be well worth your time to review the patch.

URL:
http://sourceforge.net/tracker/index.php?func=detail&aid=1700288&group_id=5470&atid=305470

-Kevin


On 6/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> >A. It will break code that uses instance.__dict__['var'] directly,
> >when 'var' exists as a property with a __set__ in the class. I believe
> >this is not significant.
> >B. It will simplify getattr's semantics. Python should _always_ give
> >precedence to instance attributes over class ones, rather than have
> >very weird special-cases (such as a property with a __set__).
>
> Actually, these are features that are both used and desirable; I've
> been using them both since Python 2.2 (i.e., for many years
> now).  I'm -1 on removing these features from any version of Python, even
> 3.0.
>
>
> >C. It will greatly speed up instance variable access, especially when
> >the class has a large mro.
>
> ...at the cost of slowing down access to properties and __slots__, by
> adding an *extra* dictionary lookup there.
>
> Note, by the way, that if you want to change attribute lookup
> semantics, you can always override __getattribute__ and make it work
> whatever way you like, without forcing everybody else to change *their*
> code.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/jacobs%40bioinformed.com
>

From eyal.lotem at gmail.com  Sun Jun 10 02:59:04 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sun, 10 Jun 2007 03:59:04 +0300
Subject: [Python-Dev] Frame zombies
Message-ID: <b64f365b0706091759m76e030dj46f3861a2afa8979@mail.gmail.com>

I was just looking through the code that handles frames (as part of my
current effort to determine how to improve on CPython's performance),
when I noticed the freelist/zombie mechanism for frame allocation
handling.

While the zombie mechanism seems like a nice optimization, I believe
there can be a couple of improvements.

Currently, each code object has a zombie frame that last executed it.
This zombie is reused when that code object is re-executed in a frame.
When a frame is released, it is reassigned as the zombie of the code,
and iff the code object already has a zombie assigned to it, it places
the frame in the freelist.

If I understand correctly, this means that the "freelist" is actually
only ever used for recursive-call frames that were released.  It also
means that these released frames will be reassigned to other code
objects in the future - in which case they will be reallocated,
perhaps unnecessarily.  "freelist" is just temporary storage for
released recursive calls. A program with no recursions will always
have an empty freelist.

What is bounding the memory consumption of this mechanism is a limit
on the number of frames in the freelist (and the fact that there is a
limited number of code objects, each of which may have an additional
zombie frame).
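
The mechanism described above can be modelled in a few lines of Python. This is a toy model of the description in this message, not the C implementation: the class names, the frame sizes and the MAX_FREELIST value are all assumptions made for illustration.

```python
MAX_FREELIST = 200  # assumed cap on the global freelist, for illustration only


class Frame:
    def __init__(self, size):
        self.size = size


class Code:
    def __init__(self, name, size):
        self.name = name
        self.size = size
        self.zombie = None  # the one frame kept around for this code object


class Allocator:
    def __init__(self):
        self.freelist = []
        self.stats = {"zombie": 0, "freelist": 0, "realloc": 0, "malloc": 0}

    def acquire(self, code):
        if code.zombie is not None:
            # Reuse the frame that last executed this code object.
            frame, code.zombie = code.zombie, None
            self.stats["zombie"] += 1
        elif self.freelist:
            # Take any released frame; grow it if it is too small.
            frame = self.freelist.pop()
            self.stats["freelist"] += 1
            if frame.size < code.size:
                frame.size = code.size
                self.stats["realloc"] += 1
        else:
            frame = Frame(code.size)
            self.stats["malloc"] += 1
        return frame

    def release(self, code, frame):
        if code.zombie is None:
            code.zombie = frame
        elif len(self.freelist) < MAX_FREELIST:
            self.freelist.append(frame)
        # Otherwise the frame is simply dropped (freed).


def recurse(alloc, code, depth):
    frame = alloc.acquire(code)
    if depth > 0:
        recurse(alloc, code, depth - 1)
    alloc.release(code, frame)


alloc = Allocator()

# Non-recursive calls only ever cycle through the zombie slot.
flat = Code("flat", 4)
for _ in range(3):
    alloc.release(flat, alloc.acquire(flat))
freelist_after_flat_calls = len(alloc.freelist)

# A recursion six frames deep leaves one zombie and five freelist entries.
deep = Code("deep", 8)
recurse(alloc, deep, 5)
freelist_after_recursion = len(alloc.freelist)
```

In this model the freelist stays empty until a code object has more than one live frame at once, which is the observation made above about recursive calls.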

I believe a better way to implement this mechanism:

A. Replace the co_zombie frame with a co_zombie_freelist.
B. Remove the global freelist.

In other words, have a free list for each code object, rather than
one zombie frame per code object plus a global freelist.
Memory use can be bounded by limiting the freelist size in each code object.
This can use a bit more memory if a recursion is called just once
and then discarded (wasting, for example, 10 frames instead of 1), but can
save a lot of realloc calls when there is more than one recursion used
in the same program.
It is also possible to substantially increase the number of frames
stored per code-object, and then use some kind of more sophisticated
aging mechanism on the zombie freelists to periodically get rid of
unused freelists.  That kind of mechanism would mean that even in the
case of recursive calls, virtually all frame allocs are available from
the freelist.

I also believe this to be somewhat simpler, as there is only one
concept (a zombie freelist) rather than 2 (a zombie frame per code
object and a freelist for recursive calls), and frames are never
realloc'd, but only allocated.
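
A rough model of the proposal, again in Python rather than C. The attribute name zombie_freelist stands in for the proposed co_zombie_freelist, and MAX_PER_CODE is a hypothetical bound; neither exists in CPython.

```python
MAX_PER_CODE = 10  # hypothetical bound on frames kept per code object


class Frame:
    pass


class Code:
    def __init__(self, name):
        self.name = name
        self.zombie_freelist = []  # stands in for the proposed co_zombie_freelist


class Allocator:
    def __init__(self):
        self.mallocs = 0

    def acquire(self, code):
        if code.zombie_freelist:
            # Frames on this list were sized for this code object: no realloc.
            return code.zombie_freelist.pop()
        self.mallocs += 1
        return Frame()

    def release(self, code, frame):
        if len(code.zombie_freelist) < MAX_PER_CODE:
            code.zombie_freelist.append(frame)
        # Otherwise the frame is dropped, which bounds memory per code object.


def recurse(alloc, code, depth):
    frame = alloc.acquire(code)
    if depth > 0:
        recurse(alloc, code, depth - 1)
    alloc.release(code, frame)


alloc = Allocator()
f_code, g_code = Code("f"), Code("g")

# Alternate two different recursions, each six frames deep.
for _ in range(3):
    recurse(alloc, f_code, 5)
    recurse(alloc, g_code, 5)

total_mallocs = alloc.mallocs
```

In this model only the first round allocates (six frames per code object); later rounds are served entirely from the per-code lists. Whether that saving matters in the real interpreter is a separate question that only measurement can answer.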

Should I make a patch?

From eyal.lotem at gmail.com  Sun Jun 10 03:13:38 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sun, 10 Jun 2007 04:13:38 +0300
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <2e1434c10706091728w593f4016p5f09ee49be97e80d@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
	<2e1434c10706091728w593f4016p5f09ee49be97e80d@mail.gmail.com>
Message-ID: <b64f365b0706091813h7900fac9q71f656380f3ab5d1@mail.gmail.com>

I must be missing something, as I really see no reason to keep the
existing semantics other than backwards compatibility (which can be
achieved by introducing a __fastattr__ or such).

Can you explain under which situations or find any example situation
where the existing semantics are desirable?

As for the mro cache - thanks for pointing it out - I think it can
serve as a platform for another idea that in conjunction with psyco,
can possibly speed up CPython very significantly (will create a thread
about this soon).

Please note that speeding up the mro-lookup solves only half of the
problem (if it was solved - which it seems not to have been). The more
important half of the problem remains; allow me to emphasize:

ALL instance attribute accesses look up in both instance and class
dicts, when they could look just in the instance dict. This is made
worse by the fact that the class dict lookup is more expensive (with
or without the mro cache).
Some code that accesses a lot of instance attributes in an inner loop
can easily be sped up by a factor of 2 or more (depending on the size
of the mro).
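
For reference, the lookup order under discussion can be shown with a small Python 3 example. The class names are made up for illustration. A descriptor that defines __set__ (a data descriptor) found on the class takes precedence over the instance dict, while one that only defines __get__ does not, which is why the class has to be consulted on every access.

```python
class DataDescriptor:
    """Defines both __get__ and __set__, so it is a data descriptor."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return "class-level data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDescriptor:
    """Defines only __get__, so the instance dict may shadow it."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return "class-level non-data descriptor"


class Example:
    data = DataDescriptor()
    nondata = NonDataDescriptor()


e = Example()
# Write straight into the instance dict, bypassing __set__.
e.__dict__["data"] = "instance value"
e.__dict__["nondata"] = "instance value"

data_result = e.data        # the data descriptor on the class wins
nondata_result = e.nondata  # the instance dict wins
```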

Eyal

On 6/10/07, Kevin Jacobs <jacobs at bioinformed.com> <bioinformed at gmail.com> wrote:
> I agree with Phillip with regard to the semantics.  They are semantically
> desirable.  However, there is a patch to add a mro cache to speed up these
> sorts of cases on the Python tracker, originally submitted by Armin Rigo.
> He saw ~20% speedups, others see less.  It is currently just sitting there
> with no apparent activity.  So if the overhead of mro lookups is that
> bothersome, it may be well worth your time to review the patch.
>
> URL:
> http://sourceforge.net/tracker/index.php?func=detail&aid=1700288&group_id=5470&atid=305470
>
> -Kevin
>
>
>
> On 6/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> >
> > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> > >A. It will break code that uses instance.__dict__['var'] directly,
> > >when 'var' exists as a property with a __set__ in the class. I believe
> > >this is not significant.
> > >B. It will simplify getattr's semantics. Python should _always_ give
> > >precedence to instance attributes over class ones, rather than have
> > >very weird special-cases (such as a property with a __set__).
> >
> > Actually, these are features that are both used and desirable; I've
> > been using them both since Python 2.2 (i.e., for many years
> > now).  I'm -1 on removing these features from any version of Python, even
> 3.0.
> >
> >
> > >C. It will greatly speed up instance variable access, especially when
> > >the class has a large mro.
> >
> > ...at the cost of slowing down access to properties and __slots__, by
> > adding an *extra* dictionary lookup there.
> >
> > Note, by the way, that if you want to change attribute lookup
> > semantics, you can always override __getattribute__ and make it work
> > whatever way you like, without forcing everybody else to change *their*
> code.
> >
>
>

From eyal.lotem at gmail.com  Sun Jun 10 03:14:18 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sun, 10 Jun 2007 04:14:18 +0300
Subject: [Python-Dev] Fwd:  Instance variable access and descriptors
In-Reply-To: <b64f365b0706091715w3bafad17n18a8505edd0aba97@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
	<b64f365b0706091715w3bafad17n18a8505edd0aba97@mail.gmail.com>
Message-ID: <b64f365b0706091814x4935c73fo17ae3fc5c00a59f7@mail.gmail.com>

On 6/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> >A. It will break code that uses instance.__dict__['var'] directly,
> >when 'var' exists as a property with a __set__ in the class. I believe
> >this is not significant.
> >B. It will simplify getattr's semantics. Python should _always_ give
> >precedence to instance attributes over class ones, rather than have
> >very weird special-cases (such as a property with a __set__).
>
> Actually, these are features that are both used and desirable; I've
> been using them both since Python 2.2 (i.e., for many years
> now).  I'm -1 on removing these features from any version of Python, even 3.0.

It is the same feature, actually, two sides of the same coin.
Why do you use self.__dict__['propertyname'] when you can use
self._propertyname?
Why even call the first form, which is both longer and causes
performance problems "a feature"?


> >C. It will greatly speed up instance variable access, especially when
> >the class has a large mro.
>
> ...at the cost of slowing down access to properties and __slots__, by
> adding an *extra* dictionary lookup there.
It will slow down access to properties - but that slowdown is insignificant:
A. The vast majority of lookups are *NOT* of properties. They are the
rare case and should not be the focus of optimization.
B. Property access involves calling Python functions - which is
heavier than a single dict lookup.
C. The dict lookup to find the property in the __mro__ can involve
many dicts (so in those cases adding a single dict lookup is not
heavy).

> Note, by the way, that if you want to change attribute lookup
> semantics, you can always override __getattribute__ and make it work
> whatever way you like, without forcing everybody else to change *their* code.
If I write my own __getattribute__ I lose the performance benefit that
I am after.
I do agree that code shouldn't be broken, which is why a transitional
mechanism that requires opting in via __fastlookup__ can be used
(unfortunately, from __future__ cannot be used, as the change is not
local to a module but to a class hierarchy - unless one imports a
feature from __future__ into a class).

From martin at v.loewis.de  Sun Jun 10 05:36:39 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 10 Jun 2007 05:36:39 +0200
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>
Message-ID: <466B71C7.3040607@v.loewis.de>

> Today I've stumbled upon a bug in my program that wasn't very
> straightforward to understand. 

Unfortunately, it isn't straight-forward to understand your
description of it, either.

> The problem is that I was passing
> unicode filenames to zipfile.ZipFile.write and I had
> sys.setdefaultencoding() in effect

What do you mean here? How can sys.setdefaultencoding()
be "in effect"? There is always a default encoding; did
you mean you changed the default?

> which resulted in a situation
> where most of the bytes generated in zipfile.ZipInfo.FileHeader would
> pass thru, except for a few, which caused codec error on another
> machine (where filenames got infectiously upgraded to unicode).

Was the problem that most of the bytes would pass thru, or was
the problem that a few did not pass thru? Why were filenames in
the FileHeader infectiously upgraded to unicode on the other
machine, but not on the first machine?

> The
> problem here is that it was absolutely unclear at first that I get
> unicode filenames passed to write, and it incorrectly accepted them
> silently. Is it worth to submit a bug report on this?

Let me try to rephrase what I understood so far:

"I changed the default system encoding from ASCII to some other
value, and that caused zipfile.py to generate an incorrect
zipfile. Is that a bug in zipfile?"

To that, the answer is a clear "no". If you change the default
encoding, you are on your own. Don't do that.

> So, should I submit a bug report, and which behavior would be actually correct?

The issue of non-ASCII file names in zipfiles is fairly well understood.
The ZIP format historically did not support them well. I believe this
has recently been improved, but that format change has not propagated
into the zipfile module, yet. However, everybody is aware of the
situation, so there is no need to report a bug.

Regards,
Martin

From martin at v.loewis.de  Sun Jun 10 06:17:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 10 Jun 2007 06:17:57 +0200
Subject: [Python-Dev] Frame zombies
In-Reply-To: <b64f365b0706091759m76e030dj46f3861a2afa8979@mail.gmail.com>
References: <b64f365b0706091759m76e030dj46f3861a2afa8979@mail.gmail.com>
Message-ID: <466B7B75.2080804@v.loewis.de>

> Should I make a patch?

-1. This could consume quite a lot of memory, and I doubt
that the speed improvement would be significant. Instead,
I would check whether performance could be improved by
just dropping the freelist. Looking at the code, I see
that it performs a realloc (!) of the frame object if
the one it found is too small. That surely must be
expensive, and should be replaced with a free/malloc pair
instead.

I'd be curious to see whether malloc on today's systems
is still so slow as to justify a free list. If it is,
I would propose to create multiple free lists per size
classes, e.g. for frames with 10, 20, 30, etc. variables,
rather than having free lists per code object.

Regards,
Martin

From eyal.lotem at gmail.com  Sun Jun 10 06:38:03 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sun, 10 Jun 2007 07:38:03 +0300
Subject: [Python-Dev] Frame zombies
In-Reply-To: <466B7B75.2080804@v.loewis.de>
References: <b64f365b0706091759m76e030dj46f3861a2afa8979@mail.gmail.com>
	<466B7B75.2080804@v.loewis.de>
Message-ID: <b64f365b0706092138y1369cd4do14013b0ea85d39ea@mail.gmail.com>

The freelist currently serves as a good optimization for the special case
of a recurring recursion.  If the same code object (or one of the same
size) is used for recursive calls repeatedly, the freelist will
realloc-to-same-size (which probably has no serious cost) and thus the
cost of allocating/deallocating frames is avoided.

I think that in general, the complexity of a sophisticated and
efficient aging mechanism is not worth it just to optimize recursive
calls. The only question is whether using, say, up to 50 frames per
code object is truly a memory problem.

Note that _only_ recursions will have more than 1 frame attached.

How many recursions are used and then discarded?
How slow is it to constantly malloc/free frames in a recursion?

My proposal will accelerate the following example:

def f(x, y):
  if 0 == x: return
  f(x-1, y)
def g(x):
  if 0 == x: return
  g(x-1)

while True:
  f(100, 100)
  g(100)

The current implementation will work well with the following:

while True:
  f(100, 100)

But removing freelist altogether will not work well with any type of recursion.

Eyal

On 6/10/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > Should I make a patch?
>
> -1. This could consume quite a lot of memory, and I doubt
> that the speed improvement would be significant. Instead,
> I would check whether performance could be improved by
> just dropping the freelist. Looking at the code, I see
> that it performs a realloc (!) of the frame object if
> the one it found is too small. That surely must be
> expensive, and should be replaced with a free/malloc pair
> instead.
>
> I'd be curious to see whether malloc on today's systems
> is still so slow as to justify a free list. If it is,
> I would propose to create multiple free lists per size
> classes, e.g. for frames with 10, 20, 30, etc. variables,
> rather than having free lists per code object.
>
> Regards,
> Martin

From martin at v.loewis.de  Sun Jun 10 08:16:11 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 10 Jun 2007 08:16:11 +0200
Subject: [Python-Dev] Frame zombies
In-Reply-To: <b64f365b0706092138y1369cd4do14013b0ea85d39ea@mail.gmail.com>
References: <b64f365b0706091759m76e030dj46f3861a2afa8979@mail.gmail.com>	
	<466B7B75.2080804@v.loewis.de>
	<b64f365b0706092138y1369cd4do14013b0ea85d39ea@mail.gmail.com>
Message-ID: <466B972B.1090802@v.loewis.de>

> Note that _only_ recursions will have more than 1 frame attached.

That's not true; in the presence of threads, the same method
may also be invoked more than one time simultaneously.

> But removing freelist altogether will not work well with any type of
> recursion.

How do you know that? Did you measure the time? On what system?
What were the results?

Performance optimization without measuring is just unacceptable.
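
As a starting point for such a measurement, the two call patterns from the quoted example can be timed with the standard timeit module. This is only a sketch: it measures whole calls on whatever interpreter runs it, so it does not isolate frame allocation, and comparing allocation strategies would require running it against differently patched builds. The helper name best_time and the repeat counts are arbitrary choices.

```python
import timeit


def f(x, y):
    if x == 0:
        return
    f(x - 1, y)


def g(x):
    if x == 0:
        return
    g(x - 1)


def single_recursion():
    f(100, 100)


def alternating_recursions():
    f(100, 100)
    g(100)


def best_time(func, number=200, repeat=3):
    # The minimum over several repeats is the least noisy figure.
    return min(timeit.repeat(func, number=number, repeat=repeat))


single = best_time(single_recursion)
alternating = best_time(alternating_recursions)
print("single:      %.6f s for 200 calls" % single)
print("alternating: %.6f s for 200 calls" % alternating)
```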

Regards,
Martin

From martin at v.loewis.de  Sun Jun 10 10:38:15 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 10 Jun 2007 10:38:15 +0200
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>	
	<466B71C7.3040607@v.loewis.de>
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>
Message-ID: <466BB877.8000404@v.loewis.de>

> sys.setdefaultencoding()
> exists for a reason, wouldn't it be better if stdlib could cope with
> that at least with zipfile?

sys.setdefaultencoding just does not work. Many more things break when
you call it. It only exists because people like you insisted that it
exists.

> Also note that I'm trying to ask if zipfile should be improved, how it
> should be improved, and this possible improvement is not even for me
> (because now I know how zipfile behaves and I will work correctly with
> it, but someone else might stumble upon this very unexpectedly).

If you want to come up with a patch: sure. The zipfile module should
handle Unicode strings, encoding them in the encoding that the ZIP
specification defines (both the formal one, and the
informal-defined-by-pkwares-implementation).

The tricky question is what to do when reading in zipfiles with
non-ASCII characters (and yes, I understand that in your case
there were only ASCII characters in the file names).

> The problem was that sourcedir was unicode, and on my machine
> everything went ok multiple times. zipfile.ZipInfo.FileHeader would
> return unicode, but then when it writes it to a file it gets back to
> str (because mappings back and forth were identical). The problem
> happened when on a different machine header suddenly got byte 0x98 in
> position 10 (seems to be compress_size), which cp1251 codec couldn't
> decode. You see, arcname didn't even have unicode characters, but the
> mere fact that it was unicode made header upgrade to unicode in
> "return header + self.filename + self.extra".

Ok, now I understand. If filename is a Unicode string, header is
converted using the system encoding; depending on the exact value
of header and depending on the system encoding, this may cause
a decoding error.
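
The failure mode can be reproduced in isolation. In Python 2, concatenating a byte string with a unicode string first decodes the byte string using the default encoding; the sketch below, written for Python 3, performs that decoding step explicitly through a hypothetical helper named implicit_upgrade. The packed value 0x98 is taken from the report quoted above, and latin-1 is used only as an example of an encoding that maps every byte.

```python
import struct

# A packed header field may contain any byte value; here it holds 0x98.
header = struct.pack("<L", 0x98)  # b'\x98\x00\x00\x00'


def implicit_upgrade(raw, default_encoding):
    """Mimic what Python 2 did for raw + u"name": decode raw first."""
    return raw.decode(default_encoding)


try:
    implicit_upgrade(header, "cp1251")
    cp1251_failed = False
except UnicodeDecodeError:
    # Byte 0x98 has no mapping in cp1251.
    cp1251_failed = True

try:
    implicit_upgrade(header, "latin-1")
    latin1_failed = False
except UnicodeDecodeError:
    latin1_failed = True
```

Whether the header survives therefore depends on both the byte values it happens to contain and the default encoding in effect, which matches the "works on one machine, fails on another" symptom.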

This bug has been reported as http://bugs.python.org/1170311

> Because that's not supposed to work sanely when self.filename is
> unicode I'm asking if the right behavior would be to a) disallow
> unicode filenames in zipfile.ZipInfo, b) automatically convert
> filename to str in zipfile.ZipInfo, c) leave everything as it is.

The correct behavior would be b); the difficult details are what
encoding to use.

Regards,
Martin

From gjcarneiro at gmail.com  Sun Jun 10 12:27:16 2007
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Sun, 10 Jun 2007 11:27:16 +0100
Subject: [Python-Dev] Fwd: Instance variable access and descriptors
In-Reply-To: <b64f365b0706091814x4935c73fo17ae3fc5c00a59f7@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
	<b64f365b0706091715w3bafad17n18a8505edd0aba97@mail.gmail.com>
	<b64f365b0706091814x4935c73fo17ae3fc5c00a59f7@mail.gmail.com>
Message-ID: <a467ca4f0706100327i39c4c891g2bac7d988b1b9456@mail.gmail.com>

  I have to agree with you.  If removing support for
self.__dict__['propertyname'] (where propertyname is also the name of a
descriptor) is the price to pay for significant speedup, so be it.  People
doing that are asking for trouble anyway!

On 10/06/07, Eyal Lotem <eyal.lotem at gmail.com> wrote:
>
> On 6/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> > >A. It will break code that uses instance.__dict__['var'] directly,
> > >when 'var' exists as a property with a __set__ in the class. I believe
> > >this is not significant.
> > >B. It will simplify getattr's semantics. Python should _always_ give
> > >precedence to instance attributes over class ones, rather than have
> > >very weird special-cases (such as a property with a __set__).
> >
> > Actually, these are features that are both used and desirable; I've
> > been using them both since Python 2.2 (i.e., for many years
> > now).  I'm -1 on removing these features from any version of Python,
> even 3.0.
>
> It is the same feature, actually, two sides of the same coin.
> Why do you use self.__dict__['propertyname'] when you can use
> self._propertyname?
> Why even call the first form, which is both longer and causes
> performance problems "a feature"?
>
>
> > >C. It will greatly speed up instance variable access, especially when
> > >the class has a large mro.
> >
> > ...at the cost of slowing down access to properties and __slots__, by
> > adding an *extra* dictionary lookup there.
> It will slow down access to properties - but that slowdown is
> insignificant:
> A. The vast majority of lookups are *NOT* of properties. They are the
> rare case and should not be the focus of optimization.
> B. Property access involves calling Python functions - which is
> heavier than a single dict lookup.
> C. The dict lookup to find the property in the __mro__ can involve
> many dicts (so in those cases adding a single dict lookup is not
> heavy).
>
> > Note, by the way, that if you want to change attribute lookup
> > semantics, you can always override __getattribute__ and make it work
> > whatever way you like, without forcing everybody else to change *their*
> code.
> If I write my own __getattribute__ I lose the performance benefit that
> I am after.
> I do agree that code shouldn't be broken, that's why a transitional
> that requires using __fastlookup__ can be used (Unfortunately, from
> __future__ cannot be used as it is not local to a module, but to a
> class hierarchy - unless one imports a feature from __future__ into a
> class).



-- 
Gustavo J. A. M. Carneiro
INESC Porto
"The universe is always one step beyond logic." -- Frank Herbert

From snaury at gmail.com  Sun Jun 10 12:40:19 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Sun, 10 Jun 2007 14:40:19 +0400
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <466BB877.8000404@v.loewis.de>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>
	<466B71C7.3040607@v.loewis.de>
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>
	<466BB877.8000404@v.loewis.de>
Message-ID: <e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>

> > Also note that I'm trying to ask if zipfile should be improved, how it
> > should be improved, and this possible improvement is not even for me
> > (because now I know how zipfile behaves and I will work correctly with
> > it, but someone else might stumble upon this very unexpectedly).
> If you want to come up with a patch: sure. The zipfile module should
> handle Unicode strings, encoding them in the encoding that the ZIP
> specification defines (both the formal one, and the
> informal-defined-by-pkwares-implementation).

I don't think always encoding them to utf-8 (and using bit 11 of
flag_bits) is a good idea, since there's a chance to create archives
that won't be correctly readable by programs not supporting this bit
(it's no secret that currently some programs just assume that
filenames are encoded using one of system encodings). This is too
complex and hazy to implement. Even if I know what is the situation on
Windows (i.e. using OEM, also called DOS encoding, but I'm not sure
how to determine its codec name from within python apart from calling
GetConsoleCP), I'm totally unaware of the situation on other operating
systems.

> The tricky question is what to do when reading in zipfiles with
> non-ASCII characters (and yes, I understand that in your case
> there were only ASCII characters in the file names).

I don't think it should be changed.

> Ok, now I understand. If filename is a Unicode string, header is
> converted using the system encoding; depending on the exact value
> of header and depending on the system encoding, this may cause
> a decoding error.
>
> This bug has been reported as http://bugs.python.org/1170311

I see. Well, that's all easier now then, as I can just create a patch
for an already existing bug.

> > Because that's not supposed to work sanely when self.filename is
> > unicode I'm asking if the right behavior would be to a) disallow
> > unicode filenames in zipfile.ZipInfo, b) automatically convert
> > filename to str in zipfile.ZipInfo, c) leave everything as it is.
> The correct behavior would be b); the difficult details are what
> encoding to use.

Current zipfile seems to officially support ascii filenames only
anyway, so the patch can be as simple as this:

Index: Lib/zipfile.py
===================================================================
--- Lib/zipfile.py	(revision 55850)
+++ Lib/zipfile.py	(working copy)
@@ -252,12 +252,13 @@
             self.extract_version = max(45, self.extract_version)
             self.create_version = max(45, self.extract_version)

+        filename = str(self.filename)
         header = struct.pack(structFileHeader, stringFileHeader,
                  self.extract_version, self.reserved, self.flag_bits,
                  self.compress_type, dostime, dosdate, CRC,
                  compress_size, file_size,
-                 len(self.filename), len(extra))
-        return header + self.filename + extra
+                 len(filename), len(extra))
+        return header + filename + extra

     def _decodeExtra(self):
         # Try to decode the extra field.

This doesn't introduce new features; it just requires filenames to be
encodable as ascii (or whatever the default encoding is).

From snaury at gmail.com  Sun Jun 10 12:56:45 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Sun, 10 Jun 2007 14:56:45 +0400
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>
	<466B71C7.3040607@v.loewis.de>
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>
	<466BB877.8000404@v.loewis.de>
	<e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>
Message-ID: <e2480c70706100356m4972add8w128ab0d19b6dd978@mail.gmail.com>

> Current zipfile seems to officially support ascii filenames only
> anyway, so the patch can be as simple as this:

Submitted patch and test case as http://python.org/sf/1734346

From martin at v.loewis.de  Sun Jun 10 18:45:51 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 10 Jun 2007 18:45:51 +0200
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>	
	<466B71C7.3040607@v.loewis.de>	
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>	
	<466BB877.8000404@v.loewis.de>
	<e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>
Message-ID: <466C2ABF.2090500@v.loewis.de>

> I don't think always encoding them to utf-8 (and using bit 11 of
> flag_bits) is a good idea, since there's a chance to create archives
> that won't be correctly readable by programs not supporting this bit
> (it's no secret that currently some programs just assume that
> filenames are encoded using one of system encodings).

I think it is also fairly uniformly agreed that these programs are
incorrect; the official encoding of file names in a zip file is
Windows/DOS code page 437.

> This is too
> complex and hazy to implement. Even if I know what is the situation on
> Windows (i.e. using OEM, also called DOS encoding, but I'm not sure
> how to determine its codec name from within python apart from calling
> GetConsoleCP), I'm totally unaware of the situation on other operating
> systems.

I don't think that the situation on Windows is that the OEM code page
should be used. Instead, CP 437 should be used, independent of the OEM
code page.

>> The tricky question is what to do when reading in zipfiles with
>> non-ASCII characters (and yes, I understand that in your case
>> there were only ASCII characters in the file names).
> 
> I don't think it should be changed.

In Python 3, it will certainly change, since the string type
will be unicode-based. It probably should not change for the
rest of 2.x.

> Current zipfile seems to officially support ascii filenames only
> anyway

That's not true. You can use any byte string as the file name
that you want, including non-ASCII strings encoded in CP437.

> +        filename = str(self.filename)

That would be incorrect, as it relies on the system encoding,
which shouldn't be relied upon. Plus, it would allow arbitrary
non-string things as filenames. What it should do instead
(IMO) is to encode in CP437. Bonus points if it falls back
to the UTF-8 feature of zip files if encoding as CP437 fails.
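
The suggestion above could look roughly like this in Python 3. It is a sketch of the idea only, not the zipfile module's code: the helper name encode_zip_filename and the constant name UTF8_FLAG are hypothetical, and the value 0x800 is simply bit 11 of the flag bits mentioned earlier in the thread.

```python
UTF8_FLAG = 0x800  # bit 11 of the general purpose flag bits


def encode_zip_filename(name, flag_bits=0):
    """Return (encoded_name, flag_bits) for a text filename.

    Hypothetical helper: try CP437 first, fall back to UTF-8 and set bit 11.
    """
    try:
        return name.encode("cp437"), flag_bits
    except UnicodeEncodeError:
        return name.encode("utf-8"), flag_bits | UTF8_FLAG


ascii_name = encode_zip_filename("readme.txt")
cp437_name = encode_zip_filename("f\u00fcr.txt")  # u-umlaut exists in CP437
utf8_name = encode_zip_filename("\u0444\u0430\u0439\u043b.txt")  # Cyrillic does not
```

Because the helper only accepts text and calls encode on it, it also avoids the problem of arbitrary non-string objects being accepted as filenames.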

Regards,
Martin

From pje at telecommunity.com  Sun Jun 10 20:08:48 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 10 Jun 2007 14:08:48 -0400
Subject: [Python-Dev] Fwd:  Instance variable access and descriptors
In-Reply-To: <b64f365b0706091814x4935c73fo17ae3fc5c00a59f7@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
	<b64f365b0706091715w3bafad17n18a8505edd0aba97@mail.gmail.com>
	<b64f365b0706091814x4935c73fo17ae3fc5c00a59f7@mail.gmail.com>
Message-ID: <20070610180649.D510C3A407F@sparrow.telecommunity.com>

At 04:14 AM 6/10/2007 +0300, Eyal Lotem wrote:
>On 6/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> > >A. It will break code that uses instance.__dict__['var'] directly,
> > >when 'var' exists as a property with a __set__ in the class. I believe
> > >this is not significant.
> > >B. It will simplify getattr's semantics. Python should _always_ give
> > >precedence to instance attributes over class ones, rather than have
> > >very weird special-cases (such as a property with a __set__).
> >
> > Actually, these are features that are both used and desirable; I've
> > been using them both since Python 2.2 (i.e., for many years
> > now).  I'm -1 on removing these features from any version of
> > Python, even 3.0.
>
>It is the same feature, actually, two sides of the same coin.
>Why do you use self.__dict__['propertyname'] when you can use
>self._propertyname?

Because I'm *not writing this by hand*.  I'm using descriptors that 
know what attribute name they're responsible for, and do the access directly.
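
A rough illustration of that kind of descriptor, in Python 3 syntax.
The class names Typed and Point are invented for this sketch; the
point is only that the descriptor is told its attribute name and
stores the value in the instance __dict__ under that same name.

```python
class Typed:
    """Data descriptor that keeps its value in obj.__dict__[name]."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not an instance
        try:
            return obj.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name)

    def __set__(self, obj, value):
        if not isinstance(value, self.kind):
            raise TypeError("%s must be %s" % (self.name, self.kind.__name__))
        # Direct __dict__ access; works because a data descriptor
        # takes precedence over the instance entry of the same name.
        obj.__dict__[self.name] = value


class Point:
    x = Typed("x", int)


if __name__ == "__main__":
    p = Point()
    p.x = 3
    print(p.x, p.__dict__)
```

No second, underscore-prefixed attribute name is needed, which is why
generated descriptors tend to be written this way.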


>Why even call the first form, which is both longer and causes
>performance problems "a feature"?

If you don't understand that, IMO you don't yet understand enough 
about the descriptor architecture to be proposing changes to it.


> > Note, by the way, that if you want to change attribute lookup
> > semantics, you can always override __getattribute__ and make it work
> > whatever way you like, without forcing everybody else to change
> > *their* code.
>If I write my own __getattribute__ I lose the performance benefit that
>I am after.

Not if you write it in C.
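
For reference, a pure-Python version of such an override might look
like the sketch below (Python 3 syntax; InstanceFirst and Account are
hypothetical names). It shows the semantics only; a Python-level
__getattribute__ is slower than the built-in lookup, which is the
performance concern raised above.

```python
class InstanceFirst:
    """Mixin: an entry in the instance __dict__ always wins."""

    def __getattribute__(self, name):
        inst_dict = object.__getattribute__(self, "__dict__")
        if name in inst_dict:
            return inst_dict[name]
        # Otherwise fall back to the normal lookup rules.
        return object.__getattribute__(self, name)


class Account(InstanceFirst):
    @property
    def balance(self):
        return 0


if __name__ == "__main__":
    a = Account()
    print(a.balance)              # property is used
    a.__dict__["balance"] = 42
    print(a.balance)              # instance entry now shadows the property
```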


>I do agree that code shouldn't be broken, that's why a transitional
>that requires using __fastlookup__ can be used (Unfortunately, from
>__future__ cannot be used as it is not local to a module, but to a
>class hierarchy - unless one imports a feature from __future__ into a
>class).

I have no idea what you're talking about here.


From snaury at gmail.com  Sun Jun 10 20:17:16 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Sun, 10 Jun 2007 22:17:16 +0400
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <466C2ABF.2090500@v.loewis.de>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>
	<466B71C7.3040607@v.loewis.de>
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>
	<466BB877.8000404@v.loewis.de>
	<e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>
	<466C2ABF.2090500@v.loewis.de>
Message-ID: <e2480c70706101117r70645e95s1080f563c0cbc66c@mail.gmail.com>

On 6/10/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > I don't think always encoding them to utf-8 (and using bit 11 of
> > flag_bits) is a good idea, since there's a chance to create archives
> > that won't be correctly readable by programs not supporting this bit
> > (it's no secret that currently some programs just assume that
> > filenames are encoded using one of system encodings).
> I think it is also fairly uniformly agreed that these programs are
> incorrect; the official encoding of file names in a zip file is
> Windows/DOS code page 437.

Before replying to you I actually did some quick tests. I packed a
file with localized filename and then opened it using explorer and
also viewed it using the hexeditor:

   7-Zip: directory cp866, header cp866: explorer sees correct filename.
   zipfile: directory cp1251, header cp1251: explorer sees incorrect filename.
   pkzip25.exe: directory cp866, header cp1251: explorer sees correct
      filenames, zipfile complains that filenames differ.
   zip.exe: directory cp1251, header cp1251: explorer sees incorrect filenames.

Also note that modifying the filename in the directory with a hex editor to
cp866 made explorer see correct filenames. Another experiment with
pkzip25 showed that modifying the filename in the directory makes it extract
files with that filename, i.e. it ignores the header filename. The same
behavior is shown by 7-Zip.

So the general idea is that at least directory filename has some sort
of convention of using oem (dos, console) encoding on Windows, cp866
in my case. Header filenames have different encodings, and seem to be
ignored.
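
To make the cp866/cp1251 difference concrete, here is a small Python 3
sketch. The sample name is arbitrary; the codecs are the two named in
the tests above.

```python
name = "\u043f\u0440\u0438\u0432\u0435\u0442.txt"  # a Russian word plus ".txt"

as_oem = name.encode("cp866")    # what the tools above put in the directory
as_ansi = name.encode("cp1251")  # what a cp1251-based writer would store

# Same length, different bytes for the non-ASCII part.
print(as_oem)
print(as_ansi)

# A reader that assumes the OEM code page mis-decodes the cp1251 bytes.
wrong = as_ansi.decode("cp866")
right = as_oem.decode("cp866")
print(right == name, wrong == name)
```

The ASCII part (".txt") is identical in both encodings; only the
localized characters come out wrong when the reader guesses the other
code page.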

> I don't think that the situation on Windows is that the OEM code page
> should be used. Instead, CP 437 should be used, independent of the OEM
> code page.

And on the contrary, pkzip25 made by PKWARE Inc. themselves behaves otherwise.

> > +        filename = str(self.filename)
> That would be incorrect, as it relies on the system encoding,
> which shouldn't be relied upon.

Well, as I've seen in numerous examples above, system (or actually
dos) encoding is actually what is used by at least three major
programs: 7-zip, pkzip25 and explorer, at least on windows.

> Plus, it would allow arbitrary
> non-string things as filenames.

Hmm... why is that bad?

> What it should do instead
> (IMO) is to encode in CP437. Bonus points if it falls back
> to the UTF-8 feature of zip files if encoding as CP437 fails.

And encoding to cp437 would be incorrect, as no currently existing
program would correctly work on non-English Windows OSes. I think that
letting the user decide on the encoding is the right way to go here,
as you can't know what the user actually wants these days; it's all too
hazy to me. And in case unicode is passed it just converts it using
the ascii (or default) codec. One can specify the ascii codec there
explicitly, if using system encoding is really an issue.

From pje at telecommunity.com  Sun Jun 10 20:28:23 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 10 Jun 2007 14:28:23 -0400
Subject: [Python-Dev] Fwd: Instance variable access and descriptors
In-Reply-To: <a467ca4f0706100327i39c4c891g2bac7d988b1b9456@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
	<b64f365b0706091715w3bafad17n18a8505edd0aba97@mail.gmail.com>
	<b64f365b0706091814x4935c73fo17ae3fc5c00a59f7@mail.gmail.com>
	<a467ca4f0706100327i39c4c891g2bac7d988b1b9456@mail.gmail.com>
Message-ID: <20070610182622.921653A407F@sparrow.telecommunity.com>

At 11:27 AM 6/10/2007 +0100, Gustavo Carneiro wrote:
>   I have to agree with you.  If removing support for 
> self.__dict__['propertyname'] (where propertyname is also the name 
> of a descriptor) is the price to pay for significant speedup, so be 
> it.  People doing that are asking for trouble anyway!

How so?  This order of lookup is explicitly defined by the precedence 
rules of PEP 252:

"""When a dynamic attribute (one defined in a regular object's
__dict__) has the same name as a static attribute (one defined
by a meta-object in the inheritance graph rooted at the regular
object's __class__), the static attribute has precedence if it
is a descriptor that defines a __set__ method (see below);
otherwise (if there is no __set__ method) the dynamic attribute
has precedence.  In other words, for data attributes (those
with a __set__ method), the static definition overrides the
dynamic definition, but for other attributes, dynamic overrides
static."""

I fail to see how relying on explicitly-documented language behavior 
is "asking for trouble". 
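
The rule quoted above can be checked directly. A short Python 3
sketch; DataDescr, NonDataDescr and Demo are illustrative names only.

```python
class DataDescr:
    """Has __set__, so it is a data descriptor."""

    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDescr:
    """Only __get__, so the instance __dict__ may override it."""

    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class Demo:
    data = DataDescr()
    nondata = NonDataDescr()


d = Demo()
# Write to the instance __dict__ directly, bypassing __set__.
d.__dict__["data"] = "from instance"
d.__dict__["nondata"] = "from instance"

print(d.data)     # static (class) definition wins
print(d.nondata)  # dynamic (instance) definition wins
```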


From martin at v.loewis.de  Sun Jun 10 21:43:19 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 10 Jun 2007 21:43:19 +0200
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <e2480c70706101117r70645e95s1080f563c0cbc66c@mail.gmail.com>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>	
	<466B71C7.3040607@v.loewis.de>	
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>	
	<466BB877.8000404@v.loewis.de>	
	<e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>	
	<466C2ABF.2090500@v.loewis.de>
	<e2480c70706101117r70645e95s1080f563c0cbc66c@mail.gmail.com>
Message-ID: <466C5457.1020904@v.loewis.de>

> So the general idea is that at least directory filename has some sort
> of convention of using oem (dos, console) encoding on Windows, cp866
> in my case. Header filenames have different encodings, and seem to be
> ignored.

Ok, then this is what the zipfile module should implement.

>> That would be incorrect, as it relies on the system encoding,
>> which shouldn't be relied upon.
> 
> Well, as I've seen in numerous examples above, system (or actually
> dos) encoding is actually what is used by at least three major
> programs: 7-zip, pkzip25 and explorer, at least on windows.

Please don't confuse Python's "system encoding" with the system's
(or user's) standard encoding - they are not related at all. Using
the OEM code page if everybody else does it is fine. Using the
encoding that somebody hand-coded into the Python installation
is not.
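
A small sketch of the distinction, in Python 3. The function name
encodings_report is made up; the ctypes call is the Win32 GetOEMCP
function and is only attempted on Windows.

```python
import locale
import sys


def encodings_report():
    """Collect the different 'encodings' that are easy to confuse."""
    report = {
        # Python's own default for implicit str/bytes conversions.
        "python_default": sys.getdefaultencoding(),
        # The user's/system's preferred encoding from the locale.
        "locale_preferred": locale.getpreferredencoding(False),
    }
    if sys.platform == "win32":
        import ctypes
        # OEM (DOS/console) code page, e.g. cp437 or cp866.
        report["oem_code_page"] = "cp%d" % ctypes.windll.kernel32.GetOEMCP()
    return report


if __name__ == "__main__":
    for key, value in sorted(encodings_report().items()):
        print(key, value)
```

On the Python 2 releases discussed here the first value was normally
'ascii', regardless of what the other two reported.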

>> Plus, it would allow arbitrary
>> non-string things as filenames.
> 
> Hmm... why is that bad?

Errors should never pass silently.

>> What it should do instead
>> (IMO) is to encode in CP437. Bonus points if it falls back
>> to the UTF-8 feature of zip files if encoding as CP437 fails.
> 
> And encoding to cp437 would be incorrect, as no currently existing
> program would correctly work on non-English Windows OSes. I think that
> letting the user decide on the encoding is the right way to go here,
> as you can't know what user actually wants these days, it's all too
> hazy to me. 

Asking "the user" is not practical. If "the user" was aware of the
problem, you would not have run into the problem in the first place -
you would have known to encode all file names before passing them
into the zipfile module.

The automatic mode should follow the standard or the conventions;
"the user" (in quotes, because the end user is rarely bothered
with that detail) can still override that explicitly.

Regards,
Martin

From snaury at gmail.com  Sun Jun 10 22:26:33 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Mon, 11 Jun 2007 00:26:33 +0400
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <466C5457.1020904@v.loewis.de>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>
	<466B71C7.3040607@v.loewis.de>
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>
	<466BB877.8000404@v.loewis.de>
	<e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>
	<466C2ABF.2090500@v.loewis.de>
	<e2480c70706101117r70645e95s1080f563c0cbc66c@mail.gmail.com>
	<466C5457.1020904@v.loewis.de>
Message-ID: <e2480c70706101326k5ec40b63x6ed521173ef180a2@mail.gmail.com>

On 6/10/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > So the general idea is that at least directory filename has some sort
> > of convention of using oem (dos, console) encoding on Windows, cp866
> > in my case. Header filenames have different encodings, and seem to be
> > ignored.
> Ok, then this is what the zipfile module should implement.

But this is only on Windows! I have no clue what the common
situation is on other OSes, and I don't even know how to sanely get the
OEM codepage on Windows (the obvious way, with
ctypes.windll.kernel32.GetOEMCP(), doesn't seem good to me).

So I guess that's a bad idea anyway; maybe conforming to the language
encoding bit (bit 11) is better (ascii will stay ascii anyway).

What about this?

Index: Lib/zipfile.py
===================================================================
--- Lib/zipfile.py	(revision 55850)
+++ Lib/zipfile.py	(working copy)
@@ -252,6 +252,7 @@
             self.extract_version = max(45, self.extract_version)
             self.create_version = max(45, self.extract_version)

+        self._encodeFilename()
         header = struct.pack(structFileHeader, stringFileHeader,
                  self.extract_version, self.reserved, self.flag_bits,
                  self.compress_type, dostime, dosdate, CRC,
@@ -259,6 +260,16 @@
                  len(self.filename), len(extra))
         return header + self.filename + extra

+    def _encodeFilename(self):
+        if isinstance(self.filename, unicode):
+            self.filename = self.filename.encode('utf-8')
+            self.flag_bits = self.flag_bits | 0x800
+
+    def _decodeFilename(self):
+        if self.flag_bits & 0x800:
+            self.filename = self.filename.decode('utf-8')
+            self.flag_bits = self.flag_bits & ~0x800
+
     def _decodeExtra(self):
         # Try to decode the extra field.
         extra = self.extra
@@ -683,6 +694,7 @@
                                      t>>11, (t>>5)&0x3F, (t&0x1F) * 2 )

             x._decodeExtra()
+            x._decodeFilename()
             x.header_offset = x.header_offset + concat
             self.filelist.append(x)
             self.NameToInfo[x.filename] = x
@@ -967,6 +979,7 @@
                     extract_version = zinfo.extract_version
                     create_version = zinfo.create_version

+                zinfo._encodeFilename()
                 centdir = struct.pack(structCentralDir,
                   stringCentralDir, create_version,
                   zinfo.create_system, extract_version, zinfo.reserved,
Index: Lib/test/test_zipfile.py
===================================================================
--- Lib/test/test_zipfile.py	(revision 55850)
+++ Lib/test/test_zipfile.py	(working copy)
@@ -515,6 +515,11 @@
         # and report that the first file in the archive was corrupt.
         self.assertRaises(RuntimeError, zipf.testzip)

+    def testUnicodeFilenames(self):
+        zf = zipfile.ZipFile(TESTFN, "w")
+        zf.writestr(u"foo.txt", "Test for unicode filename")
+        zf.close()
+
     def tearDown(self):
         support.unlink(TESTFN)
         support.unlink(TESTFN2)

The problem is that I don't know if anything actually supports bit 11
at the time and can't even tell if I did this correctly or not. :(
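
One way to sanity-check the flag is a round trip through an
implementation that is known to honor bit 11. The sketch below uses
the zipfile module as shipped with Python 3, which, as far as I know,
sets 0x800 for non-ASCII names when writing and keeps flag_bits as
read from the central directory. The helper name name_flags is
invented for this example.

```python
import io
import zipfile


def name_flags(names):
    """Write the names to an in-memory zip, reopen it, and report
    for each entry whether general purpose bit 11 is set."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for n in names:
            zf.writestr(n, b"payload")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return {info.filename: bool(info.flag_bits & 0x800)
                for info in zf.infolist()}


if __name__ == "__main__":
    print(name_flags(["foo.txt", "\u043f\u0440\u0438\u0432\u0435\u0442.txt"]))
```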

From martin at v.loewis.de  Sun Jun 10 22:47:54 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 10 Jun 2007 22:47:54 +0200
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <e2480c70706101326k5ec40b63x6ed521173ef180a2@mail.gmail.com>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>	
	<466B71C7.3040607@v.loewis.de>	
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>	
	<466BB877.8000404@v.loewis.de>	
	<e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>	
	<466C2ABF.2090500@v.loewis.de>	
	<e2480c70706101117r70645e95s1080f563c0cbc66c@mail.gmail.com>	
	<466C5457.1020904@v.loewis.de>
	<e2480c70706101326k5ec40b63x6ed521173ef180a2@mail.gmail.com>
Message-ID: <466C637A.2020404@v.loewis.de>

> But this is only on Windows! I have no clue what the common
> situation is on other OSes, and I don't even know how to sanely get the
> OEM codepage on Windows (the obvious way, with
> ctypes.windll.kernel32.GetOEMCP(), doesn't seem good to me).
> 
> So I guess that's a bad idea anyway; maybe conforming to the language
> encoding bit (bit 11) is better (ascii will stay ascii anyway).
> 
> What about this?

I haven't checked (*) whether you got the right value for flag_bits;
assuming you do, this looks good.

For compatibility, I would propose to use UTF-8 only if the file
name is not ASCII. Even though the OEM code pages vary, they
are (mostly) ASCII supersets. So if the string can be encoded
in ASCII, there is no need to set the UTF-8 flag bit.

OTOH, I now wonder whether it would *hurt* to have the flag bit:
if old zip software does not choke if the flag is set, then
it can just as well be set, as ASCII strings automatically
get encoded as ASCII in UTF-8.
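
The ASCII-superset point is easy to verify for the code pages that
came up in this thread. A Python 3 sketch; the helper name
same_bytes_everywhere is hypothetical.

```python
def same_bytes_everywhere(name, codecs=("utf-8", "cp437", "cp866", "cp1251")):
    """True if an ASCII-only name encodes to identical bytes in every codec."""
    reference = name.encode("ascii")
    return all(name.encode(codec) == reference for codec in codecs)


if __name__ == "__main__":
    print(same_bytes_everywhere("foo.txt"))
    print(same_bytes_everywhere("dir/sub-dir/README_1.txt"))
```

So for pure-ASCII names the stored bytes are the same whether or not
the UTF-8 flag is set; the flag only changes how a reader interprets
bytes above 0x7F.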

Regards,
Martin

(*) I just now read

http://www.pkware.com/documents/casestudies/APPNOTE.TXT

and 0x800 seems to be the right value indeed. Notice, in
appendix D, that the specification says that the historical
encoding of file names is code page 437.

From aahz at pythoncraft.com  Mon Jun 11 00:37:12 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 10 Jun 2007 15:37:12 -0700
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
Message-ID: <20070610223712.GA15827@panix.com>

On Sun, Jun 10, 2007, Eyal Lotem wrote:
> 
> Python, probably through the valid assumption that most attribute
> lookups go to the class, tries to look for the attribute in the class
> first, and in the instance, second.
> 
> What Python currently does is quite peculiar!
> Here's a short description of PyObject_GenericGetAttr:
> 
> A. Python looks for a descriptor in the _entire_ mro hierarchy
> (len(mro) class/type check and dict lookups).
> B. If Python found a descriptor and it has both get and set functions
> - it uses it to get the value and returns, skipping the next stage.
> C. If Python either did not find a descriptor, or found one that has
> no setter, it will try a lookup in the instance dict.
> D. If Python failed to find it in the instance, it will use the
> descriptor's getter, and if it has no getter it will use the
> descriptor itself.
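
Steps A-D can be modelled in a few lines of Python 3. This is a
simplified sketch, not the C code: generic_getattr and find_in_mro are
made-up names, only __set__ is used to detect a data descriptor, and
__getattr__ hooks and metaclass lookup are ignored.

```python
_MISSING = object()


def find_in_mro(cls, name):
    """Step A: first class in the MRO whose __dict__ defines name."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return _MISSING


def generic_getattr(inst, name):
    cls = type(inst)
    descr = find_in_mro(cls, name)
    getter = None
    is_data = False
    if descr is not _MISSING:
        getter = getattr(type(descr), "__get__", None)
        is_data = hasattr(type(descr), "__set__")
    if getter is not None and is_data:
        return getter(descr, inst, cls)      # step B: data descriptor wins
    inst_dict = getattr(inst, "__dict__", {})
    if name in inst_dict:
        return inst_dict[name]               # step C: instance dict
    if getter is not None:
        return getter(descr, inst, cls)      # step D: non-data descriptor
    if descr is not _MISSING:
        return descr                         # step D: plain class attribute
    raise AttributeError(name)


class C:
    plain = 1

    @property
    def prop(self):
        return "from property"

    def meth(self):
        return "class method"


if __name__ == "__main__":
    c = C()
    c.__dict__["prop"] = "shadow"   # ignored: property is a data descriptor
    c.__dict__["meth"] = "shadow"   # used: functions are non-data descriptors
    for attr in ("plain", "prop", "meth"):
        print(attr, generic_getattr(c, attr) == getattr(c, attr))
```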

Guido, Ping, and I tried working on this at the sprint for PyCon 2003.
We were unable to find any solution that did not affect critical-path
timing.  As other people have noted, the current semantics cannot be
changed.  I'll also echo other people and suggest that this discussion be
moved to python-ideas if you want to continue pushing for a change in
semantics.

I just did a Google for my notes from PyCon 2003 and it appears that I
never sent them out (probably because they aren't particularly
comprehensible).  Here they are for the record (from 3/25/2003):

'''
CACHE_ATTR is the name used to describe a speedup (for new-style classes
only) in attribute lookup by caching the location of attributes in the
MRO.  Some of the non-obvious bits of code:

* If a new-style class has any classic classes in its bases, we
can't do attribute caching (we need weakrefs to the derived
classes).

* If searching the MRO for an attribute discovers a data descriptor (has
tp_descr_set), that overrides any attribute that might be in the instance;
however, the existence of tp_descr_get still permits the instance to
override its bases (but tp_descr_get is called if there is no instance
attribute).

* We need to invalidate the cache for the updated attribute in all derived
classes in the following cases:

    * an attribute is added or deleted to the class or its base classes

    * an attribute has its status changed to or from being a data
    descriptor

This file uses Python pseudocode to describe changes necessary to
implement CACHE_ATTR at the C level.  Except for class Meta, these are
all exact descriptions of the work being done.  Except for class Meta the
changes go into object.c (Meta goes into typeobject.c).  The pseudocode
looks somewhat C-like to ease the transformation.
'''

NULL = object()

def getattr(inst, name):
    isdata, where = lookup(inst.__class__, name)
    if isdata:
        descr = where[name]
        if hasattr(descr, "__get__"):
            return descr.__get__(inst)
        else:
            return descr
    value = inst.__dict__.get(name, NULL)
    if value is not NULL:
        return value
    if where is NULL:
        raise AttributeError(name)
    descr = where[name]
    if hasattr(descr, "__get__"):
        value = descr.__get__(inst)
    else:
        value = descr
    return value

def setattr(inst, name, value):
    isdata, where = lookup(inst.__class__, name)
    if isdata:
        descr = where[name]
        descr.__set__(inst, value)
        return
    inst.__dict__[name] = value

def lookup(cls, name):
    if cls.__cache__ != NULL:
        pair = cls.__cache__.get(name)
    else:
        # Caching is disabled for this class; NULL itself would test true.
        pair = None
    if pair:
        return pair
    else:
        for c in cls.__mro__:
            where = c.__dict__
            if name in where:
                descr = where[name]
                isdata = hasattr(descr, "__set__")
                pair = isdata, where
                break
        else:
            pair = False, NULL
    if cls.__cache__ != NULL:
        cls.__cache__[name] = pair
    return pair


'''
These changes go into typeobject.c; they are not a complete
description of what happens during creation/updates, only the
changes necessary to implement CACHE_ATTR.
'''

from types import ClassType

class Meta(type):
    def _invalidate(cls, name):
        # Drop the cached entry, then recurse into subclasses that inherit it.
        if name in cls.__cache__:
            del cls.__cache__[name]
        for c in cls.__subclasses__():
            if name not in c.__dict__:
                Meta._invalidate(c, name)
    def _build_cache(cls, bases):
        # A classic class among the bases disables caching for this class.
        for base in bases:
            if type(base.__class__) is ClassType:
                cls.__cache__ = NULL
                break
        else:
            cls.__cache__ = {}
    def __new__ (cls, bases):
        Meta._build_cache(cls, bases)
    def __setbases__(cls, bases):
        Meta._build_cache(cls, bases)
    def __setattr__(cls, name, value):
        if cls.__cache__ != NULL:
            old = cls.__dict__.get(name, NULL)
            wasdata = old != NULL and hasattr(old, "__set__")
            isdata = value != NULL and hasattr(value, "__set__")
            # Invalidate on add/delete or on a change of data-descriptor status.
            if wasdata != isdata or (old == NULL) != (value == NULL):
                Meta._invalidate(cls, name)
        type.__setattr__(cls, name, value)
    def __delattr__(cls, name):
        Meta.__setattr__(cls, name, NULL)
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha

From eyal.lotem at gmail.com  Mon Jun 11 03:54:48 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Mon, 11 Jun 2007 04:54:48 +0300
Subject: [Python-Dev] Frame zombies
In-Reply-To: <466B972B.1090802@v.loewis.de>
References: <b64f365b0706091759m76e030dj46f3861a2afa8979@mail.gmail.com>
	<466B7B75.2080804@v.loewis.de>
	<b64f365b0706092138y1369cd4do14013b0ea85d39ea@mail.gmail.com>
	<466B972B.1090802@v.loewis.de>
Message-ID: <b64f365b0706101854g22ab91e8ha9f654aa3e989233@mail.gmail.com>

On 6/10/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > Note that _only_ recursions will have more than 1 frame attached.
>
> That's not true; in the presence of threads, the same method
> may also be invoked more than one time simultaneously.

Yes, I have missed that, and realized that I missed that myself a bit later.
I guess I can rationalize that with the fact that I myself tend to
avoid threads.

> > But removing freelist altogether will not work well with any type of
> > recursion.
>
> How do you know that? Did you measure the time? On what system?
> What were the results?
>
> Performance optimization without measuring is just unacceptable.

I agree, I may have used the wrong tone above.
Removing the freelist will probably either not have a significant
effect (at worst, it's adding very little work of maintaining it), or
improve recursions and functions that tend to be running
simultaneously in multiple threads (as in those cases the realloc will
not actually resize the frame, and mallocs/free will indeed be saved).

But do note my corrected tone (I said "probably" :-) - and anyone is
welcome to try removing it and see if they get a performance benefit.

The fact that threading also causes the same code object to be used in
multiple frames makes everything a little less predictable and may
mean that having a larger-than-1 number of frames associated with each
code object may indeed yield a performance benefit.

I am not sure how to benchmark such modifications. Is there any
benchmark that includes threaded use of the same functions in typical
use cases?


From martin at v.loewis.de  Mon Jun 11 04:58:02 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 11 Jun 2007 04:58:02 +0200
Subject: [Python-Dev] Frame zombies
In-Reply-To: <b64f365b0706101854g22ab91e8ha9f654aa3e989233@mail.gmail.com>
References: <b64f365b0706091759m76e030dj46f3861a2afa8979@mail.gmail.com>	<466B7B75.2080804@v.loewis.de>	<b64f365b0706092138y1369cd4do14013b0ea85d39ea@mail.gmail.com>	<466B972B.1090802@v.loewis.de>
	<b64f365b0706101854g22ab91e8ha9f654aa3e989233@mail.gmail.com>
Message-ID: <466CBA3A.8050307@v.loewis.de>

> I am not sure how to benchmark such modifications. Is there any
> benchmark that includes threaded use of the same functions in typical
> use cases?

I don't think it's necessary to benchmark that specific case -
*any* kind of micro-benchmark would be better than none.
If you want to introduce free lists per code object, you
need to benchmark such code, and compare it to the status
quo. While doing so, I'd ask to also measure the case
that the free list is dropped without a replacement.
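
As a starting point, a micro-benchmark along these lines could be run
against each interpreter build (status quo, per-code-object free
lists, no free list). It is only a sketch in Python 3 using the
standard timeit module; the function names and the depth/repeat
numbers are arbitrary choices, and it measures call overhead in
general rather than frame allocation alone.

```python
import timeit


def recurse(depth):
    """depth + 1 nested calls of the same code object."""
    if depth:
        recurse(depth - 1)


def loop_calls(count):
    """count sequential calls, never more than one live frame of recurse."""
    for _ in range(count):
        recurse(0)


def bench(repeat=3, number=200):
    """Best-of-repeat timings in seconds for both call patterns."""
    return {
        "recursive": min(timeit.repeat(lambda: recurse(50),
                                       repeat=repeat, number=number)),
        "flat": min(timeit.repeat(lambda: loop_calls(51),
                                  repeat=repeat, number=number)),
    }


if __name__ == "__main__":
    for label, seconds in sorted(bench().items()):
        print("%-10s %.6f s" % (label, seconds))
```

Both patterns make the same number of calls to recurse, so a change
that only helps (or hurts) deep recursion should show up as a shift
between the two numbers.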

Regards,
Martin

From snaury at gmail.com  Mon Jun 11 06:30:29 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Mon, 11 Jun 2007 08:30:29 +0400
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <466C637A.2020404@v.loewis.de>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>
	<466B71C7.3040607@v.loewis.de>
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>
	<466BB877.8000404@v.loewis.de>
	<e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>
	<466C2ABF.2090500@v.loewis.de>
	<e2480c70706101117r70645e95s1080f563c0cbc66c@mail.gmail.com>
	<466C5457.1020904@v.loewis.de>
	<e2480c70706101326k5ec40b63x6ed521173ef180a2@mail.gmail.com>
	<466C637A.2020404@v.loewis.de>
Message-ID: <e2480c70706102130j5fe76864sae7a866ccf85c5d4@mail.gmail.com>

On 6/11/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> For compatibility, I would propose to use UTF-8 only if the file
> name is not ASCII. Even though the OEM code pages vary, they
> are (mostly) ASCII supersets. So if the string can be encoded
> in ASCII, there is no need to set the UTF-8 flag bit.

Done:

Index: Lib/zipfile.py
===================================================================
--- Lib/zipfile.py	(revision 55850)
+++ Lib/zipfile.py	(working copy)
@@ -252,13 +252,29 @@
             self.extract_version = max(45, self.extract_version)
             self.create_version = max(45, self.extract_version)

+        filename, flag_bits = self._encodeFilenameFlags()
         header = struct.pack(structFileHeader, stringFileHeader,
-                 self.extract_version, self.reserved, self.flag_bits,
+                 self.extract_version, self.reserved, flag_bits,
                  self.compress_type, dostime, dosdate, CRC,
                  compress_size, file_size,
-                 len(self.filename), len(extra))
-        return header + self.filename + extra
+                 len(filename), len(extra))
+        return header + filename + extra

+    def _encodeFilenameFlags(self):
+        if isinstance(self.filename, unicode):
+            try:
+                return self.filename.encode('ascii'), self.flag_bits
+            except UnicodeEncodeError:
+                return self.filename.encode('utf-8'), self.flag_bits | 0x800
+        else:
+            return self.filename, self.flag_bits
+
+    def _decodeFilenameFlags(self):
+        if self.flag_bits & 0x800:
+            return self.filename.decode('utf-8'), self.flag_bits & ~0x800
+        else:
+            return self.filename, self.flag_bits
+
     def _decodeExtra(self):
         # Try to decode the extra field.
         extra = self.extra
@@ -684,6 +700,7 @@

             x._decodeExtra()
             x.header_offset = x.header_offset + concat
+            x.filename, x.flag_bits = x._decodeFilenameFlags()
             self.filelist.append(x)
             self.NameToInfo[x.filename] = x
             if self.debug > 2:
@@ -967,16 +984,17 @@
                     extract_version = zinfo.extract_version
                     create_version = zinfo.create_version

+                filename, flag_bits = zinfo._encodeFilenameFlags()
                 centdir = struct.pack(structCentralDir,
                   stringCentralDir, create_version,
                   zinfo.create_system, extract_version, zinfo.reserved,
-                  zinfo.flag_bits, zinfo.compress_type, dostime, dosdate,
+                  flag_bits, zinfo.compress_type, dostime, dosdate,
                   zinfo.CRC, compress_size, file_size,
-                  len(zinfo.filename), len(extra_data), len(zinfo.comment),
+                  len(filename), len(extra_data), len(zinfo.comment),
                   0, zinfo.internal_attr, zinfo.external_attr,
                   header_offset)
                 self.fp.write(centdir)
-                self.fp.write(zinfo.filename)
+                self.fp.write(filename)
                 self.fp.write(extra_data)
                 self.fp.write(zinfo.comment)

Index: Lib/test/test_zipfile.py
===================================================================
--- Lib/test/test_zipfile.py	(revision 55850)
+++ Lib/test/test_zipfile.py	(working copy)
@@ -515,6 +515,12 @@
         # and report that the first file in the archive was corrupt.
         self.assertRaises(RuntimeError, zipf.testzip)

+    def testUnicodeFilenames(self):
+        zf = zipfile.ZipFile(TESTFN, "w")
+        zf.writestr(u"foo.txt", "Test for unicode filename")
+        assert isinstance(zf.infolist()[0].filename, unicode)
+        zf.close()
+
     def tearDown(self):
         support.unlink(TESTFN)
         support.unlink(TESTFN2)

What I also changed is to encode filenames only for writing to the
target file, without damaging ZipInfo. The reason for this is that if
the user decides to enumerate infolist after she wrote files to ZipFile,
she would expect ZipInfo.filename to be what she passed to
ZipFile.write/ZipFile.writestr.

From eyal.lotem at gmail.com  Mon Jun 11 06:55:50 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Mon, 11 Jun 2007 07:55:50 +0300
Subject: [Python-Dev] Question about dictobject.c:lookdict_string
Message-ID: <b64f365b0706102155y21d8c245jc22dca6d3510d982@mail.gmail.com>

My question is specifically regarding the transition back from
lookdict_string (the initial value) to the general lookdict.

Currently, when a string-only dict is trying to look up any
non-string, it reverts back to a general lookdict.

Wouldn't it be better (especially in the more important case of a
string-key-only dict), to revert to the generic lookdict when a
non-string is inserted to the dict, rather than when one is being
searched?

It seems to me that this would shift the (admittedly very slight)
performance cost of a type ptr comparison from read access to
write access on all dicts (which means insertions of new keys in
non-string-only dicts may pay for another check, or that the lookdict
funcptr will be replaced by two funcptrs so that a different insertion
func on string-only dicts is used too [was tempted to say vtable
here, but that would add another dereference to lookups]).

It would also have the slight benefit of speeding up non-string
lookups in string-only dicts.

This does not seem like a significant issue, but as I know a lot of
effort went into optimizing dicts, I was wondering if I am missing
something here.

From snaury at gmail.com  Mon Jun 11 07:22:16 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Mon, 11 Jun 2007 09:22:16 +0400
Subject: [Python-Dev] zipfile and unicode filenames
In-Reply-To: <e2480c70706101326k5ec40b63x6ed521173ef180a2@mail.gmail.com>
References: <e2480c70706091323y3df942e0n2776a19bb21e7a94@mail.gmail.com>
	<466B71C7.3040607@v.loewis.de>
	<e2480c70706100039w513e9055g7672c248c2843bb8@mail.gmail.com>
	<466BB877.8000404@v.loewis.de>
	<e2480c70706100340q21045114j2549aa7390d56640@mail.gmail.com>
	<466C2ABF.2090500@v.loewis.de>
	<e2480c70706101117r70645e95s1080f563c0cbc66c@mail.gmail.com>
	<466C5457.1020904@v.loewis.de>
	<e2480c70706101326k5ec40b63x6ed521173ef180a2@mail.gmail.com>
Message-ID: <e2480c70706102222o48bb17ay1c4399279c47b002@mail.gmail.com>

On 6/11/07, Alexey Borzenkov <snaury at gmail.com> wrote:
> The problem is that I don't know if anything actually supports bit 11
> at the time and can't even tell if I did this correctly or not. :(

I downloaded the latest WinZip and can confirm that it parses utf-8
filenames correctly (although it seems to treat presence of bit 11
more like enabling autodetection mode, not strict utf-8, but it must
be because it has to cope with lots of incorrect zip files), i.e. in
the presence of bit 11 it understands filename to be utf-8, without
presence of bit 11 it treats it just like oem. :)
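
As a rough illustration of the bit 11 convention discussed here, the sketch
below writes two members to an in-memory archive and checks general-purpose
flag bit 11 (0x800) after reading them back. It assumes the zipfile module as
shipped with current Python 3, not the 2.x code under discussion; the names
UTF8_FLAG and utf8_flags are made up for the example.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11; the constant name is illustrative only


def utf8_flags(names):
    """Write empty members with the given names, then report bit 11 for each."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    # Re-open the same buffer for reading and inspect the stored flags.
    with zipfile.ZipFile(buf) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}


flags = utf8_flags(["plain.txt", "na\u00efve.txt"])
```

Under that assumption the ASCII name is stored without the flag and the
non-ASCII name with it, which matches the "bit 11 means utf-8" reading above.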

From gjcarneiro at gmail.com  Mon Jun 11 12:43:16 2007
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Mon, 11 Jun 2007 11:43:16 +0100
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <20070610223712.GA15827@panix.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070610223712.GA15827@panix.com>
Message-ID: <a467ca4f0706110343n4c93f6f1yadd92774349b86fd@mail.gmail.com>

  While you're at it, it would be nice to fix this ugly asymmetry I found in
descriptors.  It seems that descriptor's __get__ is called even when
accessed from a class rather than instance, but __set__ is only invoked from
instances, never from classes:

class Descr(object):
    def __get__(self, obj, objtype):
        print "__get__ from instance %s, type %s" % (obj, objtype)
        return "foo"

    def __set__(self, obj, value):
        print "__set__ on instance %s, value %s" % (obj, value)

class Foo(object):
    foo = Descr()

print Foo.foo # works

## doesn't work, goes directly to the class dict, not calling __set__
Foo.foo = 123

  Because of this problem, I may have to install properties into a class's
metaclass to achieve the same effect that I expected to achieve with a simple
descriptor :-(
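
The metaclass workaround mentioned above could look roughly like this. It is
a minimal sketch written for Python 3 (an assumption; the thread itself is
about 2.x), and the names Plain, Meta and Hooked are hypothetical. The
descriptor records calls in a list instead of printing so the effect is easy
to inspect.

```python
class Descr:
    """Data descriptor that logs every __get__/__set__ call it receives."""

    def __init__(self):
        self.log = []

    def __get__(self, obj, objtype=None):
        self.log.append(("get", obj))
        return "foo"

    def __set__(self, obj, value):
        self.log.append(("set", obj, value))


class Plain:
    # Descriptor lives on the class itself: Plain.foo = x rebinds the name.
    foo = Descr()


class Meta(type):
    # Descriptor lives on the metaclass: Hooked.foo = x goes through __set__.
    foo = Descr()


class Hooked(metaclass=Meta):
    pass
```

With these definitions, `Plain.foo = 123` simply replaces the descriptor in
`Plain.__dict__`, while `Hooked.foo = 123` is routed to `Descr.__set__` with
`Hooked` as the instance, because the class is an instance of `Meta`.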


On 10/06/07, Aahz <aahz at pythoncraft.com> wrote:
>
> On Sun, Jun 10, 2007, Eyal Lotem wrote:
> >
> > Python, probably through the valid assumption that most attribute
> > lookups go to the class, tries to look for the attribute in the class
> > first, and in the instance, second.
> >
> > What Python currently does is quite peculiar!
> > Here's a short description of PyObject_GenericGetAttr:
> >
> > A. Python looks for a descriptor in the _entire_ mro hierarchy
> > (len(mro) class/type check and dict lookups).
> > B. If Python found a descriptor and it has both get and set functions
> > - it uses it to get the value and returns, skipping the next stage.
> > C. If Python either did not find a descriptor, or found one that has
> > no setter, it will try a lookup in the instance dict.
> > D. If Python failed to find it in the instance, it will use the
> > descriptor's getter, and if it has no getter it will use the
> > descriptor itself.
>
> Guido, Ping, and I tried working on this at the sprint for PyCon 2003.
> We were unable to find any solution that did not affect critical-path
> timing.  As other people have noted, the current semantics cannot be
> changed.  I'll also echo other people and suggest that this discussion be
> moved to python-ideas if you want to continue pushing for a change in
> semantics.
>
> I just did a Google for my notes from PyCon 2003 and it appears that I
> never sent them out (probably because they aren't particularly
> comprehensible).  Here they are for the record (from 3/25/2003):
>
> '''
> CACHE_ATTR is the name used to describe a speedup (for new-style classes
> only) in attribute lookup by caching the location of attributes in the
> MRO.  Some of the non-obvious bits of code:
>
> * If a new-style class has any classic classes in its bases, we
> can't do attribute caching (we need weakrefs to the derived
> classes).
>
> * If searching the MRO for an attribute discovers a data descriptor (has
> tp_descr_set), that overrides any attribute that might be in the instance;
> however, the existence of tp_descr_get still permits the instance to
> override its bases (but tp_descr_get is called if there is no instance
> attribute).
>
> * We need to invalidate the cache for the updated attribute in all derived
> classes in the following cases:
>
>     * an attribute is added or deleted to the class or its base classes
>
>     * an attribute has its status changed to or from being a data
>     descriptor
>
> This file uses Python pseudocode to describe changes necessary to
> implement CACHE_ATTR at the C level.  Except for class Meta, these are
> all exact descriptions of the work being done.  Except for class Meta the
> changes go into object.c (Meta goes into typeobject.c).  The pseudocode
> looks somewhat C-like to ease the transformation.
> '''
>
> NULL = object()
>
> def getattr(inst, name):
>     isdata, where = lookup(inst.__class__, name)
>     if isdata:
>         descr = where[name]
>         if hasattr(descr, "__get__"):
>             return descr.__get__(inst)
>         else:
>             return descr
>     value = inst.__dict__.get(name, NULL)
>     if value != NULL:
>         return value
>     if where == NULL:
>         raise AttributeError
>     descr = where[name]
>     if hasattr(descr, "__get__"):
>         value = descr.__get__(inst)
>     else:
>         value = descr
>     return value
>
> def setattr(inst, name, value):
>     isdata, where = lookup(inst.__class__, name)
>     if isdata:
>         descr = where[name]
>         descr.__set__(inst, value)
>         return
>     inst.__dict__[name] = value
>
> def lookup(cls, name):
>     if cls.__cache__ != NULL:
>         pair = cls.__cache__.get(name)
>     else:
>         pair = NULL
>     if pair:
>         return pair
>     else:
>         for c in cls.__mro__:
>             where = c.__dict__
>             if name in where:
>                 descr = where[name]
>                 isdata = hasattr(descr, "__set__")
>                 pair = isdata, where
>                 break
>         else:
>             pair = False, NULL
>     if cls.__cache__ != NULL:
>         cls.__cache__[name] = pair
>     return pair
>
>
> '''
> These changes go into typeobject.c; they are not a complete
> description of what happens during creation/updates, only the
> changes necessary to implement CACHE_ATTR.
> '''
>
> from types import ClassType
>
> class Meta(type):
>     def _invalidate(cls, name):
>         if name in cls.__cache__:
>             del cls.__cache__[name]
>         for c in cls.__subclasses__():
>             if name not in c.__dict__:
>                 self._invalidate(c, name)
>     def _build_cache(cls, bases):
>         for base in bases:
>             if type(base.__class__) is ClassType:
>                 cls.__cache__ = NULL
>                 break
>         else:
>             cls.__cache__ = {}
>     def __new__ (cls, bases):
>         self._build_cache(cls, bases)
>     def __setbases__(cls, bases):
>         self._build_cache(cls, bases)
>     def __setattr__(cls, name, value):
>         if cls.__cache__ != NULL:
>             old = cls.__dict__.get(name, NULL)
>             wasdata = old != NULL and hasattr(old, "__set__")
>             isdata = value != NULL and hasattr(value, "__set__")
>             if wasdata != isdata or (old == NULL) != (value == NULL):
>                 self._invalidate(cls, name)
>         type.__setattr__(cls, name, value)
>     def __delattr__(cls, name):
>         self.__setattr__(cls, name, NULL)
> --
> Aahz (aahz at pythoncraft.com)           <*>
> http://www.pythoncraft.com/
>
> "as long as we like the same operating system, things are cool." --piranha
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/gjcarneiro%40gmail.com
>



-- 
Gustavo J. A. M. Carneiro
INESC Porto, Telecommunications and Multimedia Unit
"The universe is always one step beyond logic." -- Frank Herbert
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20070611/11b461a5/attachment.htm 

From arigo at tunes.org  Mon Jun 11 13:06:13 2007
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 11 Jun 2007 13:06:13 +0200
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <b64f365b0706091813h7900fac9q71f656380f3ab5d1@mail.gmail.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
	<2e1434c10706091728w593f4016p5f09ee49be97e80d@mail.gmail.com>
	<b64f365b0706091813h7900fac9q71f656380f3ab5d1@mail.gmail.com>
Message-ID: <20070611110613.GA28880@code0.codespeak.net>

Hi Eyal,

On Sun, Jun 10, 2007 at 04:13:38AM +0300, Eyal Lotem wrote:
> I must be missing something, as I really see no reason to keep the
> existing semantics other than backwards compatibility (which can be
> achieved by introducing a __fastattr__ or such).
> 
> Can you explain under which situations or find any example situation
> where the existing semantics are desirable?

The existing semantics are essential when dealing with metaclasses.
Many of the descriptors of the 'type' class would stop working without
it.  For example, the fact that 'x.__class__' normally gives the type of
'x' for any object x relies on this.  Reading the '__dict__' attribute
of types is also based on this.  Before proposing changes, be sure you
understand exactly how the following works:

    >>> object.__class__
    <type 'type'>
    >>> object.__dict__['__class__']
    <attribute '__class__' of 'object' objects>

    >>> class A(object):
    ...     pass
    >>> A.__dict__
    <dictproxy object at 0xb7c98e6c>
    >>> A.__dict__['__dict__']
    <attribute '__dict__' of 'A' objects>
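
The precedence rule this depends on can be seen in a small sketch: a data
descriptor found on the type wins over an entry in the instance dict, while a
non-data descriptor does not. The class names are hypothetical and the code
assumes Python 3 syntax.

```python
class DataDescr:
    """Defines __get__ and __set__, so it is a data descriptor."""

    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDescr:
    """Defines only __get__, so the instance dict may shadow it."""

    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class A:
    d = DataDescr()
    n = NonDataDescr()


a = A()
# Write to the instance dict directly, bypassing DataDescr.__set__.
a.__dict__["d"] = "instance d"
a.__dict__["n"] = "instance n"

d_value = a.d  # the data descriptor still wins
n_value = a.n  # the instance dict entry wins
```

The `__class__` and `__dict__` entries shown in the session above are data
descriptors of this kind, which is why they keep working when the object
being inspected is itself a class.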


A bientot,

Armin.

From facundo at taniquetil.com.ar  Mon Jun 11 15:33:01 2007
From: facundo at taniquetil.com.ar (Facundo Batista)
Date: Mon, 11 Jun 2007 13:33:01 +0000 (UTC)
Subject: [Python-Dev] Santa Fe Python Day report
Message-ID: <f4jiuc$js2$1@sea.gmane.org>

It was very successful: more than 300 people attended, and there were a lot of interesting talks (two introductory talks, TurboGears, PyWeek, Zope 3, security, creating 3D games, Plone, automated security testing, concurrency, and programming the OLPC).

I want to thank the PSF for its support. Python is developing in interesting ways in Argentina, and these Python Days are both proof of that and a way to attract more Python developers.

Some links:

  Santa Fe Python Day: http://www.python-santafe.com.ar/
  Python Argentina: http://www.python.com.ar/moin

Regards,

-- 
.   Facundo
.
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/



From cfbolz at gmx.de  Mon Jun 11 17:23:10 2007
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Mon, 11 Jun 2007 17:23:10 +0200
Subject: [Python-Dev] Question about dictobject.c:lookdict_string
In-Reply-To: <b64f365b0706102155y21d8c245jc22dca6d3510d982@mail.gmail.com>
References: <b64f365b0706102155y21d8c245jc22dca6d3510d982@mail.gmail.com>
Message-ID: <466D68DE.6020907@gmx.de>

Eyal Lotem wrote:
> My question is specifically regarding the transition back from
> lookdict_string (the initial value) to the general lookdict.
> 
> Currently, when a string-only dict is trying to look up any
> non-string, it reverts back to a general lookdict.
> 
> Wouldn't it be better (especially in the more important case of a
> string-key-only dict), to revert to the generic lookdict when a
> non-string is inserted to the dict, rather than when one is being
> searched?
[...]
> This does not seem like a significant issue, but as I know a lot of
> effort went into optimizing dicts, I was wondering if I am missing
> something here.

Yes, you are: when doing a lookup with a non-string-key, that key could 
be an instance of a class that has __hash__ and __eq__ implementations 
that make the key compare equal to some string that is in the 
dictionary. So you need to change to lookdict, otherwise that lookup 
might fail.
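
A small sketch of such a key, assuming Python 3 and a hypothetical class
named StrLike: it is not a string, yet it hashes like one and compares equal
to one, so a lookup in a dict holding only string keys must still find the
entry.

```python
class StrLike:
    """Non-string key that hashes and compares like the string it wraps."""

    def __init__(self, text):
        self.text = text

    def __hash__(self):
        return hash(self.text)

    def __eq__(self, other):
        if isinstance(other, str):
            return self.text == other
        if isinstance(other, StrLike):
            return self.text == other.text
        return NotImplemented


d = {"spam": 1, "eggs": 2}         # string-only dict
found = d.get(StrLike("spam"))     # must go through the generic comparison
missing = d.get(StrLike("ham"))
```

A lookup routine that only ever compared against string keys would never
call `StrLike.__eq__` and would report "spam" as absent.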

Cheers,

Carl Friedrich Bolz


From greg.ewing at canterbury.ac.nz  Tue Jun 12 10:01:09 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Jun 2007 20:01:09 +1200
Subject: [Python-Dev] 2.5 slower than 2.4 for some things?
Message-ID: <466E52C5.5040906@canterbury.ac.nz>

I've had a report from a user that Plex runs about half
as fast in 2.5 as it did in 2.4. In particular, the
NFA-to-DFA conversion phase, which does a lot of
messing about with dicts representing mappings between
sets of states.

Does anyone in the Ministry for Making Python Blazingly
fast happen to know of some change that might have
pessimised things in this area?

--
Greg
-------------- next part --------------
An embedded message was scrubbed...
From: Christian Kristukat <kristukat at physik.tu-berlin.de>
Subject: plex performance
Date: Sat, 09 Jun 2007 21:53:03 +0900
Size: 29661
Url: http://mail.python.org/pipermail/python-dev/attachments/20070612/ba3a7ac6/attachment-0001.mht 

From greg.ewing at canterbury.ac.nz  Tue Jun 12 10:10:26 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Jun 2007 20:10:26 +1200
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <20070609232842.005993A4060@sparrow.telecommunity.com>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
Message-ID: <466E54F2.20002@canterbury.ac.nz>

Phillip J. Eby wrote:
> ...at the cost of slowing down access to properties and __slots__, by 
> adding an *extra* dictionary lookup there.

Rather than spend time tinkering with the lookup order,
it might be more productive to look into implementing
a cache for attribute lookups. That would help with
method lookups as well, which are probably more
frequent than instance var accesses.
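
A very rough sketch of what such a cache involves, in Python rather than C
and with made-up names (cached_lookup, invalidate, MISSING): results of the
MRO walk are remembered per (class, name), and a change to a class has to be
pushed down to its subclasses explicitly.

```python
MISSING = object()   # sentinel for "no class in the MRO defines this name"
_cache = {}          # maps (class, attribute name) to the raw class-dict entry


def cached_lookup(cls, name):
    """Return the entry for `name` found along cls.__mro__, using the cache."""
    key = (cls, name)
    if key in _cache:
        return _cache[key]
    for klass in cls.__mro__:
        if name in vars(klass):
            result = vars(klass)[name]
            break
    else:
        result = MISSING
    _cache[key] = result
    return result


def invalidate(cls, name):
    """Drop cached entries for `name` on cls and, recursively, its subclasses."""
    _cache.pop((cls, name), None)
    for sub in cls.__subclasses__():
        invalidate(sub, name)
```

The sketch makes no attempt to call invalidate() automatically; in a real
implementation that hook would have to sit in the type's attribute setter,
and a missed invalidation leaves a stale entry behind.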

--
Greg

From ferringb at gmail.com  Tue Jun 12 13:13:01 2007
From: ferringb at gmail.com (Brian Harring)
Date: Tue, 12 Jun 2007 04:13:01 -0700
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <466E54F2.20002@canterbury.ac.nz>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
	<466E54F2.20002@canterbury.ac.nz>
Message-ID: <20070612111301.GF5778@seldon>

On Tue, Jun 12, 2007 at 08:10:26PM +1200, Greg Ewing wrote:
> Phillip J. Eby wrote:
> > ...at the cost of slowing down access to properties and __slots__, by 
> > adding an *extra* dictionary lookup there.
> 
> Rather than spend time tinkering with the lookup order,
> it might be more productive to look into implementing
> a cache for attribute lookups. That would help with
> method lookups as well, which are probably more
> frequent than instance var accesses.

Was wondering the same; specifically, hijacking pep280 celldict 
approach for this.

Downside, this would break code that tries to do PyDict_* calls on a 
class tp_dict; haven't dug extensively, but I'm sure there are a few 
out there.

Main thing I like about that approach is that it avoids the staleness 
verification crap, single lookup- it's there or it isn't.  It would 
also be reusable for 280.

If folks don't much like the hit from tracing back to a cell holding 
an actual value, could always implement it such that upon change, the 
change propagates out to instances registered (iow, change a.__dict__, 
it notifies b.__dict__ of the change, etc, till it hits a point where 
the change doesn't need to go further).

~harring
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070612/435ee92d/attachment.pgp 

From ocean at m2.ccsnet.ne.jp  Tue Jun 12 14:15:49 2007
From: ocean at m2.ccsnet.ne.jp (ocean)
Date: Tue, 12 Jun 2007 21:15:49 +0900
Subject: [Python-Dev] 2.5 slower than 2.4 for some things?
References: <466E52C5.5040906@canterbury.ac.nz>
Message-ID: <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>

> I've had a report from a user that Plex runs about half
> as fast in 2.5 as it did in 2.4. In particular, the
> NFA-to-DFA conversion phase, which does a lot of
> messing about with dicts representing mappings between
> sets of states.
>
> Does anyone in the Ministry for Making Python Blazingly
> fast happen to know of some change that might have
> pessimised things in this area?

Hello, I investigated. In my environment, the elapsed time is

E:\Plex-1.1.5>py24 plex_test2.py
0.710999965668

E:\Plex-1.1.5>py25 plex_test2.py
0.921999931335

And after I applied this patch to Plex/Machines, (make `Node' new style
class)

62c62
< class Node:
---
> class Node(object):

E:\Plex-1.1.5>py24 plex_test2.py
0.401000022888

E:\Plex-1.1.5>py25 plex_test2.py
0.350999832153

So, probably the hash and comparison mechanism of old/new style classes has changed.
# improved for new style class, worse for old style class. Maybe optimized
for new style class?

Try this for minimum test.

import timeit

init = """
class Class:
 pass
c1 = Class()
c2 = Class()
"""

t1 = timeit.Timer("""
c1 < c2
""", init)

t2 = timeit.Timer("""
hash(c1)
hash(c2)
""", init)

print t1.timeit(1000)
print t2.timeit(1000)


From eyal.lotem at gmail.com  Tue Jun 12 15:22:15 2007
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Tue, 12 Jun 2007 16:22:15 +0300
Subject: [Python-Dev] Question about dictobject.c:lookdict_string
In-Reply-To: <466D68DE.6020907@gmx.de>
References: <b64f365b0706102155y21d8c245jc22dca6d3510d982@mail.gmail.com>
	<466D68DE.6020907@gmx.de>
Message-ID: <b64f365b0706120622t768e392eodc6202e3b2ec1a18@mail.gmail.com>

On 6/11/07, Carl Friedrich Bolz <cfbolz at gmx.de> wrote:
> Eyal Lotem wrote:
> > My question is specifically regarding the transition back from
> > lookdict_string (the initial value) to the general lookdict.
> >
> > Currently, when a string-only dict is trying to look up any
> > non-string, it reverts back to a general lookdict.
> >
> > Wouldn't it be better (especially in the more important case of a
> > string-key-only dict), to revert to the generic lookdict when a
> > non-string is inserted to the dict, rather than when one is being
> > searched?
> [...]
> > This does not seem like a significant issue, but as I know a lot of
> > effort went into optimizing dicts, I was wondering if I am missing
> > something here.
>
> Yes, you are: when doing a lookup with a non-string-key, that key could
> be an instance of a class that has __hash__ and __eq__ implementations
> that make the key compare equal to some string that is in the
> dictionary. So you need to change to lookdict, otherwise that lookup
> might fail.
Ah, thanks for clarification.

But doesn't it make sense to only revert that single lookup, and not
modify the function ptr until the dict contains a non-string?

Eyal

From orsenthil at gmail.com  Tue Jun 12 22:39:10 2007
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Wed, 13 Jun 2007 02:09:10 +0530
Subject: [Python-Dev] [RFC] urlparse - parse query facility
Message-ID: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com>

Hi all,
This mail is a request for comments on changes to urlparse module. We understand
that urlparse returns the 'complete query' value as the query
component and does not
provide the facilities to separate the query components. User will have to use
the cgi module (cgi.parse_qs) to get the query parsed.
There has been a discussion in the past, on having a method of parse query
string available from urlparse module itself. [1]

To implement the query parse feature in urlparse module, we can:
a) import cgi and call cgi module's parse_qs.
This approach will have problems as it
	i) imports cgi for urlparse module.
	ii) cgi module in turn imports urllib and urlparse.

b) Implement a stand alone query parsing facility in urlparse *AS IN*
cgi module.

The method below implements urlparse_qs(url, keep_blank_values, strict_parsing),
which parses the query component of the URL. It behaves the same as
cgi.parse_qs.

Please let me know your comments on the below code.

----------------------------------------------------------------------

def unquote(s):
    """unquote('abc%20def') -> 'abc def'."""
    res = s.split('%')
    for i in xrange(1, len(res)):
        item = res[i]
        try:
            res[i] = _hextochr[item[:2]] + item[2:]
        except KeyError:
            res[i] = '%' + item
        except UnicodeDecodeError:
            res[i] = unichr(int(item[:2], 16)) + item[2:]
    return "".join(res)

def urlparse_qs(url, keep_blank_values=0, strict_parsing=0):
    """Parse a URL query string and return the components as a dictionary.

        Based on the cgi.parse_qs method. This is a utility function provided
        with urlparse so that users need not use cgi module for
        parsing the url query string.

        Arguments:

        url: URL with query string to be parsed

        keep_blank_values: flag indicating whether blank values in
            URL encoded queries should be treated as blank strings.
            A true value indicates that blanks should be retained as
            blank strings.  The default false value indicates that
            blank values are to be ignored and treated as if they were
            not included.

        strict_parsing: flag indicating what to do with parsing errors.
            If false (the default), errors are silently ignored.
            If true, errors raise a ValueError exception.
    """

    scheme, netloc, url, params, querystring, fragment = urlparse(url)

    pairs = [s2 for s1 in querystring.split('&') for s2 in s1.split(';')]
    query = []
    for name_value in pairs:
        if not name_value and not strict_parsing:
            continue
        nv = name_value.split('=', 1)
        if len(nv) != 2:
            if strict_parsing:
                raise ValueError, "bad query field: %r" % (name_value,)
            # Handle case of a control-name with no equal sign
            if keep_blank_values:
                nv.append('')
            else:
                continue
        if len(nv[1]) or keep_blank_values:
            name = unquote(nv[0].replace('+', ' '))
            value = unquote(nv[1].replace('+', ' '))
            query.append((name, value))

    dict = {}
    for name, value in query:
        if name in dict:
            dict[name].append(value)
        else:
            dict[name] = [value]
    return dict

----------------------------------------------------------------------

Testing:

$ python
Python 2.6a0 (trunk, Jun 10 2007, 12:04:03)
[GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urlparse
>>> dir(urlparse)
['BaseResult', 'MAX_CACHE_SIZE', 'ParseResult', 'SplitResult', '__all__',
'__builtins__', '__doc__', '__file__', '__name__', '_parse_cache',
'_splitnetloc', '_splitparams', 'clear_cache', 'non_hierarchical',
'scheme_chars', 'test', 'test_input', 'unquote', 'urldefrag', 'urljoin',
'urlparse', 'urlparse_qs', 'urlsplit', 'urlunparse', 'urlunsplit',
'uses_fragment', 'uses_netloc', 'uses_params', 'uses_query', 'uses_relative']
>>> URL = 'http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=utf-8&q=south+africa+travel+cape+town'
>>> print urlparse.urlparse_qs(URL)
{'q': ['south africa travel cape town'], 'oe': ['utf-8'], 'ie': ['UTF-8'],
'hl': ['en']}
>>> print urlparse.urlparse_qs(URL,keep_blank_values=1)
{'q': ['south africa travel cape town'], 'ie': ['UTF-8'], 'oe': ['utf-8'],
'lr': [''], 'hl': ['en']}
>>>
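
For readers on Python 3, roughly the same behaviour can be sketched with the
standard urllib.parse module. This is not the proposed patch; the wrapper
name url_query_dict is hypothetical, and recent Python 3 releases split the
query on '&' only by default, unlike the code above, which also splits on ';'.

```python
from urllib.parse import parse_qs, urlsplit


def url_query_dict(url, keep_blank_values=False, strict_parsing=False):
    """Parse the query component of `url` into a dict of value lists."""
    query = urlsplit(url).query
    return parse_qs(query,
                    keep_blank_values=keep_blank_values,
                    strict_parsing=strict_parsing)


URL = ("http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=utf-8"
       "&q=south+africa+travel+cape+town")

default = url_query_dict(URL)                          # blank 'lr' is dropped
with_blanks = url_query_dict(URL, keep_blank_values=True)  # 'lr' kept as ['']
```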


Thanks,
Senthil

[1] http://mail.python.org/pipermail/tutor/2002-August/016823.html



-- 
O.R.Senthil Kumaran
http://phoe6.livejournal.com

From orsenthil at gmail.com  Tue Jun 12 22:58:42 2007
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Wed, 13 Jun 2007 02:28:42 +0530
Subject: [Python-Dev] Requesting commit access to python sandbox. Cleanup
	urllib2 - Summer of Code 2007 Project
Message-ID: <7c42eba10706121358g3040faaeu1682e5a460cff557@mail.gmail.com>

Hi,
I am a student participant of Google Summer of Code 2007 and I am
working on the cleanup task of urllib2, with Skip as my mentor.
I would like to request commit access to the Python Sandbox for
implementing the changes as part of the project. I have attached my
SSH public key.
preferred name : senthil.kumaran

I am following up and adding comments to the urllib related bugs at
sf.net page. I would also like to request addition of my sourceforge
id : orsenthil to the python project, so I can close the defects
raised against urllib modules.

Summer of Code Project:
http://code.google.com/soc/psf/appinfo.html?csaid=E73A6612F80229B6

The project actually commenced on May 28th itself. But, there was a
delay from my side to get started. Ivan Sutherland's  essay on
Technology and Courage [1] did some good thing to me. :-)

Thanks,
Senthil

[1] http://research.sun.com/techrep/Perspectives/smli_ps-1.pdf#search=%22sutherland%20courage%22

-- 
O.R.Senthil Kumaran
http://phoe6.livejournal.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: id_rsa.pub
Type: application/octet-stream
Size: 228 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070613/e0d9d5e5/attachment.obj 

From greg.ewing at canterbury.ac.nz  Wed Jun 13 02:37:23 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 13 Jun 2007 12:37:23 +1200
Subject: [Python-Dev] 2.5 slower than 2.4 for some things?
In-Reply-To: <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>
References: <466E52C5.5040906@canterbury.ac.nz>
	<005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>
Message-ID: <466F3C43.5090608@canterbury.ac.nz>

ocean wrote:

> So, probably the hash and comparison mechanism of old/new style classes has changed.
> # improved for new style class, worse for old style class. Maybe optimized
> for new style class?

Thanks -- it looks like there's a simple solution that
will make Plex even faster! I'll pass this on to the
OP.

--
Greg

From ckkart at hoc.net  Wed Jun 13 03:15:38 2007
From: ckkart at hoc.net (Christian K)
Date: Wed, 13 Jun 2007 10:15:38 +0900
Subject: [Python-Dev] 2.5 slower than 2.4 for some things?
In-Reply-To: <005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>
References: <466E52C5.5040906@canterbury.ac.nz>
	<005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>
Message-ID: <f4ngi9$oo$1@sea.gmane.org>

ocean wrote:
>> I've had a report from a user that Plex runs about half
>> as fast in 2.5 as it did in 2.4. In particular, the
>> NFA-to-DFA conversion phase, which does a lot of
>> messing about with dicts representing mappings between
>> sets of states.

That was me.

>> Does anyone in the Ministry for Making Python Blazingly
>> fast happen to know of some change that might have
>> pessimised things in this area?
> 
> Hello, I investigated. On my environment, consumed time is
> 
> E:\Plex-1.1.5>py24 plex_test2.py
> 0.710999965668
> 
> E:\Plex-1.1.5>py25 plex_test2.py
> 0.921999931335
> 
> And after I applied this patch to Plex/Machines, (make `Node' new style
> class)
> 
> 62c62
> < class Node:
> ---
>> class Node(object):
> 
> E:\Plex-1.1.5>py24 plex_test2.py
> 0.401000022888
> 
> E:\Plex-1.1.5>py25 plex_test2.py
> 0.350999832153
> 

Nice!

Meanwhile I tried to replace the parsing I did with Plex by re.Scanner. And
again there is a remarkable speed difference. Again python2.5 is slower:

try:
    from re import Scanner
except ImportError:
    from sre import Scanner

pars = {}
order = []
count = 0

def par(scanner,name):
    global count, order, pars

    if name in ['caller','e','pi']:
        return name
    if name not in pars.keys():
        pars[name] = ('ns', count)
        order.append(name)
        ret = 'a[%d]'%count
        count += 1
    else:
        ret = 'a[%d]'%(order.index(name))
    return ret

scanner = Scanner([
    (r"x", lambda y,x: x),
    (r"[a-zA-Z]+\.", lambda y,x: x),
    (r"[a-z]+\(", lambda y,x: x),
    (r"[a-zA-Z_]\w*", par),
    (r"\d+\.\d*", lambda y,x: x),
    (r"\d+", lambda y,x: x),
    (r"\+|-|\*|/", lambda y,x: x),
    (r"\s+", None),
    (r"\)+", lambda y,x: x),
    (r"\(+", lambda y,x: x),
    (r",", lambda y,x: x),
    ])

import profile
import pstats

def run():
    arg = '+amp*exp(-(x-pos)/fwhm)'
    for i in range(100):
        scanner.scan(arg)

profile.run('run()','profscanner')
p = pstats.Stats('profscanner')
p.strip_dirs()
p.sort_stats('cumulative')
p.print_stats()


Christian


From thopfin at umich.edu  Tue Jun  5 18:55:07 2007
From: thopfin at umich.edu (Todd Hopfinger)
Date: Tue, 5 Jun 2007 12:55:07 -0400
Subject: [Python-Dev] TLSAbruptCloseError
Message-ID: <000801c7a792$45e15e90$d1a41bb0$@edu>

I am using TLS Lite and J2ME SecureConnection for the purposes of encrypting
traffic to/from a Java Midlet client and a multithreaded Python server.
However, I encounter a TLSAbruptCloseError. I have tried to determine the
cause of the exception to no avail. I understand that it has to do with
close_notify alerts. My abbreviated code follows.

 

// Server

def sslSockRecv(conn, num):
    data = ''
    while len(data) < num:
        data = conn.recv(num - len(data))  # TLSAbruptCloseError thrown here
        if len(data) == 0:
            raise NotEnoughBytes('Too few bytes from client. Expected '
                                 + str(num) + '; got ' + str(len(data)),
                                 num, len(data))
    return data

 

sslSockRecv() throws NotEnoughBytes exception to indicate that the client
has closed the connection. The NotEnoughBytes exception handler subsequently
closes the SSL connection and then the underlying socket.
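
One thing visible in the loop as posted is that each recv() result replaces
`data` instead of being appended to it. The sketch below shows an
accumulating variant of the same idea on a plain socket; it is an
illustration only, it does not use TLS Lite, and it is not offered as an
explanation of the TLSAbruptCloseError. The name recv_exactly is
hypothetical, and NotEnoughBytes is redefined here so the block is
self-contained.

```python
import socket


class NotEnoughBytes(Exception):
    """Raised when the peer closes before the requested bytes have arrived."""


def recv_exactly(conn, num):
    """Read exactly `num` bytes from `conn`, accumulating partial reads."""
    chunks = []
    received = 0
    while received < num:
        chunk = conn.recv(num - received)
        if not chunk:
            # An empty read means the peer closed the connection.
            raise NotEnoughBytes(
                "Too few bytes from client. Expected %d; got %d"
                % (num, received), num, received)
        chunks.append(chunk)
        received += len(chunk)
    return b"".join(chunks)
```

The exception carries the expected and received counts in the same positions
as the original NotEnoughBytes call, so an existing handler could keep
reading them from `args`.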

 

// Client

import javax.microedition.io.SecureConnection;

sc = (SecureConnection)Connector.open("ssl://host:port");
inStream = sc.openInputStream();
outStream = sc.openOutputStream();

// read/write some data using streams

if (inStream != null)
    inStream.close();
if (outStream != null)
    outStream.close();
if (sc != null)
    sc.close();

 

When using the Java phone emulator, SSLDump shows the following after the
application data portions:

 

3 13 0.3227 (0.0479)  C>SV3.0(22)  Alert
    level           warning
    value           close_notify
3    0.3228 (0.0000)  C>S  TCP FIN
3 14 0.3233 (0.0005)  S>CV3.0(22)  Alert
    level           warning
    value           close_notify

 

However, the server doesn't throw a TLSAbruptCloseError when using the
emulator. Using the actual phone does cause a TLSAbruptCloseError on the
server but SSLDump reports no errors, just:

 

4    1.6258 (0.7012)  C>S  TCP FIN
4    1.6266 (0.0008)  S>C  TCP FIN

 

Any thoughts?

 


Todd Hopfinger

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20070605/3d14619b/attachment.htm 

From rolandgeibel at yahoo.de  Wed Jun  6 18:59:54 2007
From: rolandgeibel at yahoo.de (Roland Geibel)
Date: Wed, 6 Jun 2007 18:59:54 +0200 (CEST)
Subject: [Python-Dev] minimal configuration for python on a DSP (C64xx
	family of TI)
Message-ID: <433272.47901.qm@web27412.mail.ukl.yahoo.com>


Dear all.

We want to make python run on DSP processors (C64xx
family of TI).

Which would be a minimal configuration (of modules,
C-files, ... ) to make
it start running (without all of the things useful to
add, once it runs).


Any hints welcome


Roland Geibel

Geibel at vision-comp.com






From shredwheat at gmail.com  Fri Jun  8 05:31:23 2007
From: shredwheat at gmail.com (Pete Shinners)
Date: Thu, 7 Jun 2007 20:31:23 -0700
Subject: [Python-Dev] Representation of 'nan'
Message-ID: <cfd22a7c0706072031u594b7500kfa2c7b4a861afea0@mail.gmail.com>

The repr() for a float of 'inf' or 'nan' is generated as a string (not a
string literal). Perhaps this is only important in how one defines repr().
I've filed a bug, but am not sure if there is a clear solution.

https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1732212&group_id=5470

# Repr with a tuple of floats
>>> repr((1.0, 2.0, 3.0))
'(1.0, 2.0, 3.0)'
>>> eval(_)
(1.0, 2.0, 3.0)

# Repr with a tuple of floats, plus nan
>>> repr((1.0, float('nan'), 3.0))
'(1.0, nan, 3.0)'
>>> eval(_)
NameError: name 'nan' is not defined

There are a few alternatives I can think of that are fairly clean. I think I'd
prefer any of these over the current 'nan' implementation. I don't think it
is worth adding a nan literal to the language. But something could be
changed so that the repr of nan meant something.

The best option, in my opinion, would be adding attributes to float, so that
float.nan, float.inf, and float.ninf are accessible. This could also help
with the odd situations of checking for these out-of-range values. With that
in place, repr could return 'float.nan' instead of 'nan'. This would make
the repr string evaluable again (in contexts where __builtins__ has not
been molested).

Another option could be for repr to return 'float("nan")' for these, which
would also evaluate correctly. But this doesn't seem a clean use for repr.

Is this worth even changing? It's just an irregularity that has come up and
surprised a few of us developers.
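
The round-trip problem and the float("nan") alternative can be sketched as
follows. This is only an illustration written for Python 3; evaluable_repr is
a hypothetical helper, not an existing function and not a proposed patch.

```python
import math


def evaluable_repr(value):
    """Return a repr-like string that eval() can turn back into the value.

    Hypothetical helper: non-finite floats are spelled as float('...')
    calls, tuples are handled recursively, everything else uses repr().
    """
    if isinstance(value, float) and not math.isfinite(value):
        return "float(%r)" % repr(value)   # e.g. float('nan')
    if isinstance(value, tuple):
        inner = ", ".join(evaluable_repr(v) for v in value)
        # A one-element tuple needs its trailing comma.
        return "(%s,)" % inner if len(value) == 1 else "(%s)" % inner
    return repr(value)


data = (1.0, float("nan"), 3.0)

# The plain repr contains the bare name nan, which eval() cannot resolve.
try:
    eval(repr(data), {})
    plain_round_trips = True
except NameError:
    plain_round_trips = False

# The float('nan') spelling evaluates back to a tuple holding a NaN.
restored = eval(evaluable_repr(data), {})
```

Note that the restored tuple still does not compare equal to the original,
because a NaN never compares equal to anything; math.isnan() is the reliable
check.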

From cfbolz at gmx.de  Mon Jun 11 17:23:10 2007
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Mon, 11 Jun 2007 17:23:10 +0200
Subject: [Python-Dev] Question about dictobject.c:lookdict_string
In-Reply-To: <b64f365b0706102155y21d8c245jc22dca6d3510d982@mail.gmail.com>
References: <b64f365b0706102155y21d8c245jc22dca6d3510d982@mail.gmail.com>
Message-ID: <466D68DE.6020907@gmx.de>

Eyal Lotem wrote:
> My question is specifically regarding the transition back from
> lookdict_string (the initial value) to the general lookdict.
> 
> Currently, when a string-only dict is trying to look up any
> non-string, it reverts back to a general lookdict.
> 
> Wouldn't it be better (especially in the more important case of a
> string-key-only dict), to revert to the generic lookdict when a
> non-string is inserted to the dict, rather than when one is being
> searched?
[...]
> This does not seem like a significant issue, but as I know a lot of
> effort went into optimizing dicts, I was wondering if I am missing
> something here.

Yes, you are: when doing a lookup with a non-string-key, that key could 
be an instance of a class that has __hash__ and __eq__ implementations 
that make the key compare equal to some string that is in the 
dictionary. So you need to change to lookdict, otherwise that lookup 
might fail.
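
A small sketch of the situation described above, with a hypothetical class
name (StringLike). It only demonstrates the observable behaviour and says
nothing about which C-level lookup function is in use:

```python
class StringLike(object):
    """A non-string key that hashes and compares like the string it wraps."""

    def __init__(self, text):
        self.text = text

    def __hash__(self):
        return hash(self.text)

    def __eq__(self, other):
        return self.text == other


d = {"spam": 1}            # every key stored in the dict is a string
key = StringLike("spam")   # the lookup key is not a string

found = d[key]             # located through __hash__ and __eq__
```

If the lookup compared only real strings, d[key] would raise KeyError even
though key compares equal to "spam".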

Cheers,

Carl Friedrich Bolz

From martin at v.loewis.de  Wed Jun 13 05:46:54 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 13 Jun 2007 05:46:54 +0200
Subject: [Python-Dev] TLSAbruptCloseError
In-Reply-To: <000801c7a792$45e15e90$d1a41bb0$@edu>
References: <000801c7a792$45e15e90$d1a41bb0$@edu>
Message-ID: <466F68AE.7020603@v.loewis.de>

> Any thoughts?

My main thought: this posting is off-topic for python-dev.
This list is for the development of Python itself; use
comp.lang.python for discussing development *with* Python.
However, this may still be the wrong place - perhaps
you had better ask in a Java forum?

Regards,
Martin

From arigo at tunes.org  Wed Jun 13 09:46:41 2007
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 13 Jun 2007 09:46:41 +0200
Subject: [Python-Dev] Instance variable access and descriptors
In-Reply-To: <466E54F2.20002@canterbury.ac.nz>
References: <b64f365b0706091423u260232cft35383ea95f06a32c@mail.gmail.com>
	<20070609232842.005993A4060@sparrow.telecommunity.com>
	<466E54F2.20002@canterbury.ac.nz>
Message-ID: <20070613074641.GA8702@code0.codespeak.net>

Hi,

On Tue, Jun 12, 2007 at 08:10:26PM +1200, Greg Ewing wrote:
> Rather than spend time tinkering with the lookup order,
> it might be more productive to look into implementing
> a cache for attribute lookups.

See patch #1700288.


Armin

From jon+python-dev at unequivocal.co.uk  Wed Jun 13 11:53:26 2007
From: jon+python-dev at unequivocal.co.uk (Jon Ribbens)
Date: Wed, 13 Jun 2007 10:53:26 +0100
Subject: [Python-Dev] TLSAbruptCloseError
In-Reply-To: <000801c7a792$45e15e90$d1a41bb0$@edu>
References: <000801c7a792$45e15e90$d1a41bb0$@edu>
Message-ID: <20070613095326.GY2531@snowy.squish.net>

On Tue, Jun 05, 2007 at 12:55:07PM -0400, Todd Hopfinger wrote:
>    I am using TLS Lite and J2ME SecureConnection for the purposes of
>    encrypting traffic to/from a Java Midlet client and a multithreaded Python
>    server. However, I encounter a TLSAbruptCloseError. I have tried to
>    determine the cause of the exception to no avail. I understand that it has
>    to do with close_notify alerts. My abbreviated code follows.

It may or may not be your specific problem, but Microsoft SSL servers
tend to just drop the TCP connection when they're done, rather than
do a proper SSL shutdown. This tends to cause errors such as the above,
which you must then ignore.
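
A minimal sketch of what "ignoring" such an error can look like on the
reading side. AbruptCloseError and make_fake_reader are stand-ins invented
for this example so that it runs without any TLS library; in real code the
except clause would name the library's own exception instead.

```python
class AbruptCloseError(Exception):
    """Stand-in for a 'peer closed without close_notify' error."""


def read_until_closed(read_chunk):
    """Collect chunks from read_chunk() until it returns b'' or the peer
    drops the connection without a proper shutdown."""
    chunks = []
    while True:
        try:
            chunk = read_chunk()
        except AbruptCloseError:
            # Treat a missing close_notify as an ordinary end of stream.
            break
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def make_fake_reader(chunks):
    """Hypothetical test double: returns the chunks, then fails abruptly."""
    pending = list(chunks)

    def read_chunk():
        if pending:
            return pending.pop(0)
        raise AbruptCloseError("connection closed without close_notify")

    return read_chunk


received = read_until_closed(make_fake_reader([b"hello ", b"world"]))
```

The trade-off is that without close_notify the receiver cannot tell a
deliberate close from a truncated stream, so this is only reasonable when the
application protocol carries its own length or end marker.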

From ocean at m2.ccsnet.ne.jp  Wed Jun 13 19:17:25 2007
From: ocean at m2.ccsnet.ne.jp (ocean)
Date: Thu, 14 Jun 2007 02:17:25 +0900
Subject: [Python-Dev] 2.5 slower than 2.4 for some things?
References: <466E52C5.5040906@canterbury.ac.nz><005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>
	<f4ngi9$oo$1@sea.gmane.org>
Message-ID: <001701c7adde$b6dfe560$0300a8c0@whiterabc2znlh>

> Meanwhile I tried to replace the parsing I did with Plex by re.Scanner. And
> again there is a remarkable speed difference. Again python2.5 is slower:
>
> try:
>     from re import Scanner
> except:
>     from sre import Scanner
>
> pars = {}
> order = []
> count = 0
>
> def par(scanner,name):
>     global count, order, pars
>
>     if name in ['caller','e','pi']:
>         return name
>     if name not in pars.keys():
>         pars[name] = ('ns', count)
>         order.append(name)
>         ret = 'a[%d]'%count
>         count += 1
>     else:
>         ret = 'a[%d]'%(order.index(name))
>     return ret
>
> scanner = Scanner([
>     (r"x", lambda y,x: x),
>     (r"[a-zA-Z]+\.", lambda y,x: x),
>     (r"[a-z]+\(", lambda y,x: x),
>     (r"[a-zA-Z_]\w*", par),
>     (r"\d+\.\d*", lambda y,x: x),
>     (r"\d+", lambda y,x: x),
>     (r"\+|-|\*|/", lambda y,x: x),
>     (r"\s+", None),
>     (r"\)+", lambda y,x: x),
>     (r"\(+", lambda y,x: x),
>     (r",", lambda y,x: x),
>     ])
>
> import profile
> import pstats
>
> def run():
>     arg = '+amp*exp(-(x-pos)/fwhm)'
>     for i in range(100):
>         scanner.scan(arg)
>
> profile.run('run()','profscanner')
> p = pstats.Stats('profscanner')
> p.strip_dirs()
> p.sort_stats('cumulative')
> p.print_stats()

Well, I tried this script, and there was no big difference.
Python2.4 0.772sec
Python2.5 0.816sec

I have probably found one reason: comparison of classic-style class instances
is slower on Python2.5. The comparison function instance_compare() calls
PyErr_GivenExceptionMatches(), which was just a flag operation on 2.4. But on
2.5, probably related to the introduction of BaseException, it checks the
inherited type tuple (i.e. PyExceptionInstance_Check).


From kumar.mcmillan at gmail.com  Wed Jun 13 22:30:57 2007
From: kumar.mcmillan at gmail.com (Kumar McMillan)
Date: Wed, 13 Jun 2007 15:30:57 -0500
Subject: [Python-Dev] sys.setdefaultencoding() vs. csv module + unicode
Message-ID: <b571660a0706131330g74c119en5653b5ea05eda319@mail.gmail.com>

I'm seeing conflicting opinions on whether to put
sys.setdefaultencoding('utf-8') in sitecustomize.py or not ([1] vs.
[2]) and frankly I'm confused.

The csv module says it's not unicode safe but the 2.5 docs [3] have a
workaround for this.  While the workaround says nothing about
sys.setdefaultencoding() it simply does not work with the default
encoding, "ascii."  Is this _the_ problem with the csv module?  Should
I give up and use XML?  Below is code that works vs. code that
doesn't.  Am I interpreting the workaround from the docs wrong?  If
so, can someone please give me a hint ;)  I should also point out that
I've tried this with the StringIO queued approach (from the
workaround) but that doesn't solve anything.

1) with the default encoding :

kumar$ python2.5
Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
>>> import sys, csv, codecs
>>> f = codecs.open('unicsv.csv','wb','utf-8')
>>> w = csv.writer(f)
>>> w.writerow([u'lang', u'espa\xa4ol'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa4' in
position 4: ordinal not in range(128)
>>>

2) with custom encoding :

kumar$ python2.5 -S
Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
>>> import sys, csv, codecs
>>> sys.setdefaultencoding('utf-8')
>>> f = codecs.open('unicsv.csv','wb','utf-8')
>>> w = csv.writer(f)
>>> w.writerow([u'lang', u'espa\xa4ol'])
>>> f.close()

thanks, Kumar

[1] http://mail.python.org/pipermail/python-dev/2007-June/073593.html
[2] http://diveintopython.org/xml_processing/unicode.html
[3] http://docs.python.org/lib/csv-examples.html#csv-examples

From nnorwitz at gmail.com  Thu Jun 14 00:55:49 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 13 Jun 2007 15:55:49 -0700
Subject: [Python-Dev] 2.5 slower than 2.4 for some things?
In-Reply-To: <001701c7adde$b6dfe560$0300a8c0@whiterabc2znlh>
References: <466E52C5.5040906@canterbury.ac.nz>
	<005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>
	<f4ngi9$oo$1@sea.gmane.org>
	<001701c7adde$b6dfe560$0300a8c0@whiterabc2znlh>
Message-ID: <ee2a432c0706131555q78f5d766n340f084f600e91d8@mail.gmail.com>

On 6/13/07, ocean <ocean at m2.ccsnet.ne.jp> wrote:
> > Meanwhile I tried to replace the parsing I did with Plex by re.Scanner. And
> > again there is a remarkable speed difference. Again python2.5 is slower:
> >
> > try:
> >     from re import Scanner
> > except:
> >     from sre import Scanner
> >
> > pars = {}
> > order = []
> > count = 0
> >
> > def par(scanner,name):
> >     global count, order, pars
> >
> >     if name in ['caller','e','pi']:
> >         return name
> >     if name not in pars.keys():
> >         pars[name] = ('ns', count)
> >         order.append(name)
> >         ret = 'a[%d]'%count
> >         count += 1
> >     else:
> >         ret = 'a[%d]'%(order.index(name))
> >     return ret
> >
> > scanner = Scanner([
> >     (r"x", lambda y,x: x),
> >     (r"[a-zA-Z]+\.", lambda y,x: x),
> >     (r"[a-z]+\(", lambda y,x: x),
> >     (r"[a-zA-Z_]\w*", par),
> >     (r"\d+\.\d*", lambda y,x: x),
> >     (r"\d+", lambda y,x: x),
> >     (r"\+|-|\*|/", lambda y,x: x),
> >     (r"\s+", None),
> >     (r"\)+", lambda y,x: x),
> >     (r"\(+", lambda y,x: x),
> >     (r",", lambda y,x: x),
> >     ])
> >
> > import profile
> > import pstats
> >
> > def run():
> >     arg = '+amp*exp(-(x-pos)/fwhm)'
> >     for i in range(100):
> >         scanner.scan(arg)
> >
> > profile.run('run()','profscanner')
> > p = pstats.Stats('profscanner')
> > p.strip_dirs()
> > p.sort_stats('cumulative')
> > p.print_stats()
>
> Well, I tried this script, and there was no big difference.
> Python2.4 0.772sec
> Python2.5 0.816sec
>
> I have probably found one reason: comparison of classic-style class
> instances is slower on Python2.5. The comparison function
> instance_compare() calls PyErr_GivenExceptionMatches(), which was just a
> flag operation on 2.4. But on 2.5, probably related to the introduction
> of BaseException, it checks the inherited type tuple
> (i.e. PyExceptionInstance_Check).

I'm curious about the speed of 2.6 (trunk).  I think this should have
become faster due to the introduction of fast subtype checks (he says
without looking at the code).

n

From jimjjewett at gmail.com  Thu Jun 14 01:27:24 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 13 Jun 2007 19:27:24 -0400
Subject: [Python-Dev] [RFC] urlparse - parse query facility
Message-ID: <fb6fbf560706131627o1a074b7fta39e14cbc6374845@mail.gmail.com>

> a) import cgi and call cgi module's query_ps.  [circular imports]

or

> b) Implement a stand alone query parsing facility in urlparse *AS IN*
> cgi module.

Assuming (b), please remove the (code for the) parsing from the cgi
module, and just import it back from urlparse (or urllib).  Since cgi
already imports urllib (which imports urlparse), this isn't adding any
dependencies -- but it keeps the code in a single location.

-jJ

From ocean at m2.ccsnet.ne.jp  Thu Jun 14 04:37:08 2007
From: ocean at m2.ccsnet.ne.jp (ocean)
Date: Thu, 14 Jun 2007 11:37:08 +0900
Subject: [Python-Dev] 2.5 slower than 2.4 for some things?
References: <466E52C5.5040906@canterbury.ac.nz>
	<005901c7aceb$6a285a70$0300a8c0@whiterabc2znlh>
	<f4ngi9$oo$1@sea.gmane.org>
	<001701c7adde$b6dfe560$0300a8c0@whiterabc2znlh>
	<ee2a432c0706131555q78f5d766n340f084f600e91d8@mail.gmail.com>
Message-ID: <002201c7ae2c$e7920910$0300a8c0@whiterabc2znlh>

> > I have probably found one reason: comparison of classic-style class
> > instances is slower on Python2.5. The comparison function
> > instance_compare() calls PyErr_GivenExceptionMatches(), which was just a
> > flag operation on 2.4. But on 2.5, probably related to the introduction
> > of BaseException, it checks the inherited type tuple
> > (i.e. PyExceptionInstance_Check).
>
> I'm curious about the speed of 2.6 (trunk).  I think this should have
> become faster due to the introduction of fast subtype checks (he says
> without looking at the code).
>
> n
>

Yes, I confirmed trunk is faster than 2.5.

///////////////////////////////////////
// Code

import timeit

t = timeit.Timer("""
f1 < f2
""", """
class Foo:
 pass
f1 = Foo()
f2 = Foo()
""")

print t.timeit(10000)

///////////////////////////////////////
// Result

release-maint24 0.337sec
release-maint25 0.625sec
trunk 0.494sec

//////////////////////////////////////
// Result of plex_test2.py

release-maint24 2.944sec
release-maint25 4.026sec
trunk 3.625sec


From orsenthil at users.sourceforge.net  Thu Jun 14 04:43:44 2007
From: orsenthil at users.sourceforge.net (O.R.Senthil Kumaran)
Date: Thu, 14 Jun 2007 08:13:44 +0530
Subject: [Python-Dev] [RFC] urlparse - parse query facility
In-Reply-To: <fb6fbf560706131627o1a074b7fta39e14cbc6374845@mail.gmail.com>
References: <fb6fbf560706131627o1a074b7fta39e14cbc6374845@mail.gmail.com>
Message-ID: <20070614024344.GA3321@gmail.com>

* Jim Jewett <jimjjewett at gmail.com> [2007-06-13 19:27:24]:

> > a) import cgi and call cgi module's query_ps.  [circular imports]
> 
>  or
> 
> > b) Implement a stand alone query parsing facility in urlparse *AS IN*
> > cgi module.
> 
>  Assuming (b), please remove the (code for the) parsing from the cgi
>  module, and just import it back from urlparse (or urllib).  Since cgi
>  already imports urllib (which imports urlparse), this isn't adding any
>  dependencies -- but it keeps the code in a single location.

Sure, that's a good idea as I see it. It won't break anything either.

Thanks,

-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From fdrake at acm.org  Thu Jun 14 04:42:21 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Jun 2007 22:42:21 -0400
Subject: [Python-Dev] [RFC] urlparse - parse query facility
In-Reply-To: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com>
References: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com>
Message-ID: <200706132242.21436.fdrake@acm.org>

On Tuesday 12 June 2007, Senthil Kumaran wrote:
 > This mail is a request for comments on changes to urlparse module. We
 > understand that urlparse returns the 'complete query' value as the query
 > component and does not
 > provide the facilities to separate the query components. User will have to
 > use the cgi module (cgi.parse_qs) to get the query parsed.

I agree with the comments Jim provided.

 > Below method implements the urlparse_qs(url,
 > keep_blank_values,strict_parsing) that will help in parsing the query
 > component of the url. It behaves same as the cgi.parse_qs.

Except that it takes a URL, not only a query string.

 > def urlparse_qs(url, keep_blank_values=0, strict_parsing=0):
...
 >     scheme, netloc, url, params, querystring, fragment = urlparse(url)

I see no reason to incorporate the URL splitting into the function; the 
existing function signatures for cgi.parse_qs and cgi.parse_qsl are 
sufficient.
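
The separation argued for above (split the URL first, then hand only the
query string to the parser) can be sketched like this. The sketch assumes
present-day Python 3, where these functions live in urllib.parse; the example
URL is made up.

```python
from urllib.parse import parse_qs, parse_qsl, urlsplit

url = "http://example.com/search?lang=py&lang=c&empty="

# Step 1: split the URL; the query component is a plain string.
query = urlsplit(url).query

# Step 2: parse only the query string.
as_dict = parse_qs(query)          # values grouped per key
as_list = parse_qsl(query)         # (key, value) pairs in original order
with_blanks = parse_qs(query, keep_blank_values=True)
```

Keeping the two steps apart means the same parser also works on query strings
that never were part of a URL, such as a form-encoded request body.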

It may be convenient to add methods to the urlparse.BaseResult class providing 
access to the parsed version of the query on the instance.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From martin at v.loewis.de  Thu Jun 14 08:47:43 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 14 Jun 2007 08:47:43 +0200
Subject: [Python-Dev] sys.setdefaultencoding() vs. csv module + unicode
In-Reply-To: <b571660a0706131330g74c119en5653b5ea05eda319@mail.gmail.com>
References: <b571660a0706131330g74c119en5653b5ea05eda319@mail.gmail.com>
Message-ID: <4670E48F.3090903@v.loewis.de>

> The csv module says it's not unicode safe but the 2.5 docs [3] have a
> workaround for this.  While the workaround says nothing about
> sys.setdefaultencoding() it simply does not work with the default
> encoding, "ascii."  Is this _the_ problem with the csv module?  Should
> I give up and use XML?  Below is code that works vs. code that
> doesn't.  Am I interpreting the workaround from the docs wrong?

These questions are off-topic for python-dev; please ask them on
comp.lang.python instead. python-dev is for the development *of*
Python, not for the development *with* Python.

> kumar$ python2.5
> Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
> [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
>>>> import sys, csv, codecs
>>>> f = codecs.open('unicsv.csv','wb','utf-8')
>>>> w = csv.writer(f)
>>>> w.writerow([u'lang', u'espa\xa4ol'])

What you should do here is

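def encoderow(r):
  # Encode every unicode string to UTF-8 bytes before csv sees it.
  return [s.encode("utf-8") for s in r]

# Plain binary file; the encoding is done by encoderow, not by the file.
f = open('unicsv.csv', 'wb')
w = csv.writer(f)
w.writerow(encoderow([u'lang', u'espa\xa4ol']))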

IOW, you need to encode *before* passing the strings
to the CSV module, not afterwards.

If it is too tedious for you to put in the encoderow
calls all the time, you can write a wrapper for CSV
writers which transparently encodes all Unicode
strings.
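
Such a wrapper might look roughly like the sketch below. EncodingWriter and
RecordingWriter are hypothetical names; RecordingWriter merely stands in for
csv.writer(f) so that the example runs on its own. On Python 2 the wrapped
object would be a real csv.writer over a file opened in 'wb' mode; on
Python 3 the csv module accepts text directly, so no such wrapper is needed
there.

```python
class EncodingWriter(object):
    """Hypothetical wrapper: encodes text cells, then delegates to a writer."""

    def __init__(self, writer, encoding="utf-8"):
        self.writer = writer
        self.encoding = encoding

    def _encode(self, cell):
        # type(u"") is `unicode` on Python 2 and `str` on Python 3.
        if isinstance(cell, type(u"")):
            return cell.encode(self.encoding)
        return cell

    def writerow(self, row):
        self.writer.writerow([self._encode(cell) for cell in row])

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)


class RecordingWriter(object):
    """Stand-in for csv.writer(f) that just remembers the rows it is given."""

    def __init__(self):
        self.rows = []

    def writerow(self, row):
        self.rows.append(row)


recorder = RecordingWriter()
w = EncodingWriter(recorder)
w.writerow([u"lang", u"espa\xa4ol", 42])
```

Non-text cells such as the integer are passed through untouched, which leaves
their formatting to the underlying writer.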

Regards,
Martin

From amk at amk.ca  Thu Jun 14 21:50:34 2007
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 14 Jun 2007 15:50:34 -0400
Subject: [Python-Dev] Outcome of Georg's documentation work?
Message-ID: <20070614195034.GA18011@localhost.localdomain>

What was the outcome of the discussion of Georg Brandl's reworked
documentation ("The docs, reloaded")?  Was any decision made on
whether to go with reST, or on what changes need to be made before that's
possible?  Did Fred Drake say what he thought?

Georg, do you want access to python.org to host a version of the docs
there?

--amk

From orsenthil at users.sourceforge.net  Sat Jun 16 06:04:59 2007
From: orsenthil at users.sourceforge.net (O.R.Senthil Kumaran)
Date: Sat, 16 Jun 2007 09:34:59 +0530
Subject: [Python-Dev] [RFC] urlparse - parse query facility
In-Reply-To: <200706132242.21436.fdrake@acm.org>
References: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com>
	<200706132242.21436.fdrake@acm.org>
Message-ID: <20070616040459.GA3598@gmail.com>

* Fred L. Drake, Jr. <fdrake at acm.org> [2007-06-13 22:42:21]:

> I see no reason to incorporate the URL splitting into the function; the 
> existing function signatures for cgi.parse_qs and cgi.parse_qsl are 
> sufficient.

Thanks for the comments, Fred. I understand that keeping the signatures of
parse_qs and parse_qsl in the urlparse module, and invoking them from the
cgi module, will be sufficient and correct.

urlparse will contain parse_qs and parse_qsl, which take the query string (not
the URL) and the optional arguments keep_blank_values and strict_parsing (same
as cgi).

http://deadbeefbabe.org/paste/5154

> It may be convenient to add methods to the urlparse.BaseResult class providing 
> access to the parsed version of the query on the instance.
> 

This is where I spent a little time, and I have been unable to come to a
conclusion as to how it can be done.

Could someone on the list please help me?

* parse_qs or parse_qsl will be invoked on the query component separately by
the user.
* If the parsed query needs to be available on the instance as a convenience
function, then we will have to assume values for keep_blank_values and
strict_parsing.
* Coding question: without retyping the bunch of code again in BaseResult,
would it be possible to call the parse_qs/parse_qsl functions on self.query
and provide the result? Basically, what would be a good way of doing it?


Thanks,
Senthil

-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From fdrake at acm.org  Sat Jun 16 07:06:59 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 16 Jun 2007 01:06:59 -0400
Subject: [Python-Dev] [RFC] urlparse - parse query facility
In-Reply-To: <20070616040459.GA3598@gmail.com>
References: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com>
	<200706132242.21436.fdrake@acm.org>
	<20070616040459.GA3598@gmail.com>
Message-ID: <200706160107.00288.fdrake@acm.org>

On Saturday 16 June 2007, O.R.Senthil Kumaran wrote:
 > urlparse will contain parse_qs and parse_qsl, which take the query string
 > (not the URL) and the optional arguments keep_blank_values and
 > strict_parsing (same as cgi).
 >
 > http://deadbeefbabe.org/paste/5154

Looks good.

 > > It may be convenient to add methods to the urlparse.BaseResult class
 > > providing access to the parsed version of the query on the instance.
...
 > * parse_qs or parse_qsl will be invoked on the query component separately
 > by the user.

Yes; this doesn't change, really.  Methods would still need to be invoked 
separately, but the query string doesn't need to be passed in; it's part of 
the data object.

 > * If parsed query needs to be available at the instance as a convenience
 > function, then we will have to assume the keep_blank_values and
 > strict_parsing values.

If it were a property, yes, but I think a method on the result object makes 
more sense because we don't want to assume values for these arguments.

 > * Coding question: without retyping the bunch of code again in
 > BaseResult, would it be possible to call the parse_qs/parse_qsl functions
 > on self.query and provide the result? Basically, what would be a good way
 > of doing it?

That's what I was thinking.  Just add something like this to BaseResult 
(untested):

    def parsedQuery(self, keep_blank_values=False, strict_parsing=False):
        return parse_qs(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)

    def parsedQueryList(self, keep_blank_values=False, strict_parsing=False):
        return parse_qsl(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)

Whether there's a real win with this is unclear.  I generally prefer having an 
object that represents the URL and lets me get what I want from it, rather 
than having to pass the bits around to separate parsing functions.  The 
result objects were added in 2.5, though, and I've no real idea how widely 
they've been adopted.
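
For illustration only, here is the same idea as a self-contained sketch for
present-day Python 3, where parse_qs and parse_qsl live in urllib.parse.
QueryView and its method names are hypothetical and are not part of urlparse;
the class merely stands in for a result object that has a query attribute.

```python
from urllib.parse import parse_qs, parse_qsl, urlsplit


class QueryView(object):
    """Hypothetical stand-in for a result object with a .query attribute."""

    def __init__(self, url):
        self.query = urlsplit(url).query

    def parsed_query(self, keep_blank_values=False, strict_parsing=False):
        # Delegate to the module-level function instead of duplicating it.
        return parse_qs(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)

    def parsed_query_list(self, keep_blank_values=False,
                          strict_parsing=False):
        return parse_qsl(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)


result = QueryView("http://example.com/?a=1&b=&a=2")
default_view = result.parsed_query()
blank_view = result.parsed_query(keep_blank_values=True)
pair_view = result.parsed_query_list()
```

Because these are methods rather than properties, the caller still chooses
keep_blank_values and strict_parsing on every call.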


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>
"Chaos is the score upon which reality is written." --Henry Miller

From orsenthil at users.sourceforge.net  Sat Jun 16 10:41:01 2007
From: orsenthil at users.sourceforge.net (O.R.Senthil Kumaran)
Date: Sat, 16 Jun 2007 14:11:01 +0530
Subject: [Python-Dev] [RFC] urlparse - parse query facility
In-Reply-To: <200706160107.00288.fdrake@acm.org>
References: <7c42eba10706121339n2d04cadapee69443c5636d9@mail.gmail.com>
	<200706132242.21436.fdrake@acm.org>
	<20070616040459.GA3598@gmail.com>
	<200706160107.00288.fdrake@acm.org>
Message-ID: <20070616084101.GA4115@gmail.com>

* Fred L. Drake, Jr. <fdrake at acm.org> [2007-06-16 01:06:59]:

>  > * Coding question: without retyping the bunch of code again in
>  > BaseResult, would it be possible to call the parse_qs/parse_qsl functions
>  > on self.query and provide the result? Basically, what would be a good way
>  > of doing it?
> 
> That's what I was thinking.  Just add something like this to BaseResult 
> (untested):
> 
>     def parsedQuery(self, keep_blank_values=False, strict_parsing=False):
>         return parse_qs(
>             self.query,
>             keep_blank_values=keep_blank_values,
>             strict_parsing=strict_parsing)
> 
>     def parsedQueryList(self, keep_blank_values=False, strict_parsing=False):
>         return parse_qsl(
>             self.query,
>             keep_blank_values=keep_blank_values,
>             strict_parsing=strict_parsing)

Thanks Fred. That really helped. :-)

I have updated the urlparse.py and cgi.py modules, and also included tests
in test_urlparse.py to cover this new functionality.
The test run passed for all the valid queries, except for these:

#            ("=", {}),
#            ("=&=", {}),
#            ("=;=", {}),

The test cases are basically from the test_cgi.py module, and there is a
comment on the validity of these 3 tests for query values.

Updating the documentation is still pending.

I maintained all the files temporarily at:

http://cvs.sarovar.org/cgi-bin/cvsweb.cgi/python/?cvsroot=uthcode

I had requested commit access to the Summer of Code branch in my previous mail,
but I guess it has not been noticed yet. I shall update the files later or
send them in as patches.


> Whether there's a real win with this is unclear.  I generally prefer having an 
> object that represents the URL and lets me get what I want from it, rather 
> than having to pass the bits around to separate parsing functions.  The 

I agree. This is really convenient when one comes to know about it.

Thanks,
Senthil

-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From g.brandl at gmx.net  Sat Jun 16 11:31:56 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 16 Jun 2007 11:31:56 +0200
Subject: [Python-Dev] Outcome of Georg's documentation work?
In-Reply-To: <20070614195034.GA18011@localhost.localdomain>
References: <20070614195034.GA18011@localhost.localdomain>
Message-ID: <f50am2$eg2$1@sea.gmane.org>

A.M. Kuchling schrieb:
> What was the outcome of the discussion of Georg Brandl's reworked
> documentation ("The docs, reloaded")?  Was any decision made on
> whether to go with reST, or on what changes need to be made before that's
> possible?  Did Fred Drake say what he thought?

For my part, I'm still working on it and want to integrate a few of the
planned interactive features now.

> Georg, do you want access to python.org to host a version of the docs
> there?

That would be really nice. Should I subscribe to the pydotorg list?

Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From martin at v.loewis.de  Sat Jun 16 12:10:45 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 16 Jun 2007 12:10:45 +0200
Subject: [Python-Dev] Requesting commit access to python sandbox.
 Cleanup urllib2 - Summer of Code 2007 Project
In-Reply-To: <7c42eba10706121358g3040faaeu1682e5a460cff557@mail.gmail.com>
References: <7c42eba10706121358g3040faaeu1682e5a460cff557@mail.gmail.com>
Message-ID: <4673B725.1000309@v.loewis.de>

> I am a student participant of Google Summer of Code 2007 and I am
> working on the cleanup task of urllib2, with Skip as my mentor.
> I would like to request commit access to the Python Sandbox for
> implementing the changes as part of the project. I have attached my
> SSH Public keys.
> preferred name : senthil.kumaran

I have now given you write access. Please constrain all checkins
to the sandbox; checkins elsewhere should be approved by your
mentor.

Regards,
Martin

From orsenthil at users.sourceforge.net  Sat Jun 16 15:25:05 2007
From: orsenthil at users.sourceforge.net (O.R.Senthil Kumaran)
Date: Sat, 16 Jun 2007 18:55:05 +0530
Subject: [Python-Dev] Requesting commit access to python
	sandbox.	Cleanup urllib2 - Summer of Code 2007 Project
In-Reply-To: <4673B725.1000309@v.loewis.de>
References: <7c42eba10706121358g3040faaeu1682e5a460cff557@mail.gmail.com>
	<4673B725.1000309@v.loewis.de>
Message-ID: <20070616132505.GB3558@gmail.com>

* "Martin v. Löwis" <martin at v.loewis.de> [2007-06-16 12:10:45]:

> > I am a student participant of Google Summer of Code 2007 and I am
> > working on the cleanup task of urllib2, with Skip as my mentor.
> 
> I have now given you write access. Please constrain all checkins
> to the sandbox; checkins elsewhere should be approved by your
> mentor.

Thanks Martin. I shall abide by the guidelines.

-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From amk at amk.ca  Sat Jun 16 20:10:31 2007
From: amk at amk.ca (A.M. Kuchling)
Date: Sat, 16 Jun 2007 14:10:31 -0400
Subject: [Python-Dev] Outcome of Georg's documentation work?
In-Reply-To: <f50am2$eg2$1@sea.gmane.org>
References: <20070614195034.GA18011@localhost.localdomain>
	<f50am2$eg2$1@sea.gmane.org>
Message-ID: <20070616181031.GA9465@andrew-kuchlings-computer.local>

On Sat, Jun 16, 2007 at 11:31:56AM +0200, Georg Brandl wrote:
> That would be really nice. Should I subscribe to the pydotorg list?

Yes, please, and e-mail me an SSH key.

Such work should be done on ximinez for security reasons, I think, 
even though the machine is fairly heavily loaded.

--amk

From status at bugs.python.org  Sun Jun 17 02:00:49 2007
From: status at bugs.python.org (Tracker)
Date: Sun, 17 Jun 2007 00:00:49 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070617000049.02C96780DD@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (06/10/07 - 06/17/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1645 open ( +0) /  8584 closed ( +0) / 10229 total ( +0)

Average duration of open issues: 829 days.
Median duration of open issues: 777 days.

Open Issues Breakdown
   open  1645 ( +0)
pending     0 ( +0)


From martin at v.loewis.de  Sun Jun 17 19:26:02 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 17 Jun 2007 19:26:02 +0200
Subject: [Python-Dev] Upgrade of {www,svn}.python.org
Message-ID: <46756EAA.7000905@v.loewis.de>

I'd like to upgrade www.python.org this coming Thursday (June 21),
between 6:00 and 12:00 UTC. During that time, neither www
nor subversion access will be available (although I hope that
I need much less than 6 hours).

mail.python.org, and all other services running on other machines,
will continue to work.

I will send another message when I start.

Regards,
Martin

From jeff at taupro.com  Sun Jun 17 22:24:54 2007
From: jeff at taupro.com (Jeff Rush)
Date: Sun, 17 Jun 2007 15:24:54 -0500
Subject: [Python-Dev] [Pydotorg] Upgrade of {www,svn}.python.org
In-Reply-To: <46756EAA.7000905@v.loewis.de>
References: <46756EAA.7000905@v.loewis.de>
Message-ID: <46759896.9080609@taupro.com>

Martin v. Löwis wrote:
> I'd like to upgrade www.python.org this coming Thursday (June 21),
> between 6:00am and 12:00pm UTC. During that time, neither www
> nor subversion access will be available (although I hope that
> I need much less than 6 hours).
> 
> mail.python.org, and all other services running on other machines,
> will continue to work.

Is this a software version upgrade or a hardware upgrade re the increase in
hard disk space recently mentioned by Sean?  If you're already physically at
the machine, it'd be nice to get an additional drive added at the same time.

-Jeff


From martin at v.loewis.de  Sun Jun 17 22:28:08 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 17 Jun 2007 22:28:08 +0200
Subject: [Python-Dev] [Pydotorg] Upgrade of {www,svn}.python.org
In-Reply-To: <46759896.9080609@taupro.com>
References: <46756EAA.7000905@v.loewis.de> <46759896.9080609@taupro.com>
Message-ID: <46759958.9010307@v.loewis.de>

>> mail.python.org, and all other services running on other machines,
>> will continue to work.
> 
> Is this a software version upgrade or a hardware upgrade re the increase in
> hard disk space recently mentioned by Sean?  If you're already physically at
> the machine, it'd be nice to get an additional drive added at the same time.

Just a software upgrade, and I will not be physically at the machine -
the machine is in Amsterdam, and I am in Berlin.

I don't think a hard disk space upgrade is planned, and I don't think it
is necessary.

Regards,
Martin

From ocean at m2.ccsnet.ne.jp  Mon Jun 18 15:02:56 2007
From: ocean at m2.ccsnet.ne.jp (ocean)
Date: Mon, 18 Jun 2007 22:02:56 +0900
Subject: [Python-Dev] Investigated ref leak report related to thread
	(regrtest.py -R ::)
Message-ID: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh>

Hello. I investigated the ref leak report related to threads.
Please run python regrtest.py -R :: test_leak.py (attached file).
A ref leak is sometimes reported.
# I saw this as a regression failure on python-checkins.

# total ref count 92578 -> 92669
  _Condition 2
  Thread 6
  _Event 1
  bool 10
  instancemethod 1
  code 2
  dict 9
  file 1
  frame 3
  function 2
  int 1
  list 2
  builtin_function_or_method 5
  NoneType 2
  str 27
  thread.lock 7
  tuple 5
  type 5

Probably this happens because threading.Thread is implemented in Python
code (especially threading.Thread.join), so this code in regrtest.py

        if i >= nwarmup:
            deltas.append(sys.gettotalrefcount() - rc - 2)

can run before the thread has really quit (before Modules/threadmodule.c
t_bootstrap()'s

 Py_DECREF(boot->func);
 Py_DECREF(boot->args);
 Py_XDECREF(boot->keyw);

runs)

So I experimentally inserted code to wait for thread termination
(attached file experimental.patch), and I confirmed that the error was gone.
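
For readers without the attachments, the measurement being described can be
sketched roughly as follows. This is not the regrtest.py code:
refcount_deltas, spawn_and_join and the settle hook are hypothetical names,
and sys.gettotalrefcount() exists only in debug builds of CPython, so the
sketch falls back to a constant elsewhere.

```python
import sys
import threading


def refcount_deltas(test, nwarmup=2, ntracked=3, settle=None):
    """Run `test` repeatedly and record the total refcount change per run."""
    # sys.gettotalrefcount() is only present in debug builds; use a constant
    # on release builds so the sketch still runs (all deltas are then 0).
    total = getattr(sys, "gettotalrefcount", lambda: 0)
    deltas = []
    for i in range(nwarmup + ntracked):
        before = total()
        test()
        if settle is not None:
            # Hypothetical hook: wait here until worker threads have fully
            # exited, which is the step the experimental patch adds.
            settle()
        if i >= nwarmup:
            deltas.append(total() - before)
    return deltas


def spawn_and_join():
    # join() returning does not guarantee that the C-level bootstrap code
    # has already dropped its references to the thread's function and args.
    t = threading.Thread(target=lambda: None)
    t.start()
    t.join()


print(refcount_deltas(spawn_and_join))
```

If the count is sampled between join() returning and the thread's final
Py_DECREF calls, a nonzero delta would show up even though nothing leaks.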

# Sorry for the hackish patch, which only runs on Windows. It should run
# on other platforms if you replace Sleep() in Python/sysmodule.c
# sys_debug_ref_leak_leave() with an appropriate function.


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_leak.py
Url: http://mail.python.org/pipermail/python-dev/attachments/20070618/b26e2d48/attachment.asc 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: experimental.patch
Type: application/octet-stream
Size: 2690 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070618/b26e2d48/attachment.obj 

From ocean at m2.ccsnet.ne.jp  Mon Jun 18 15:08:05 2007
From: ocean at m2.ccsnet.ne.jp (ocean)
Date: Mon, 18 Jun 2007 22:08:05 +0900
Subject: [Python-Dev] Investigated ref leak report related to
	thread(regrtest.py -R ::)
References: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh>
Message-ID: <007c01c7b1a9$b5f43ba0$0300a8c0@whiterabc2znlh>

Sorry, the mailer stripped spaces... I'll try attaching the files again.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: archive.zip
Type: application/x-zip-compressed
Size: 1511 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20070618/d0fae54b/attachment.bin 

From aahz at pythoncraft.com  Mon Jun 18 16:33:59 2007
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 18 Jun 2007 07:33:59 -0700
Subject: [Python-Dev] Investigated ref leak report related to thread
	(regrtest.py -R ::)
In-Reply-To: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh>
References: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh>
Message-ID: <20070618143358.GA24375@panix.com>

On Mon, Jun 18, 2007, ocean wrote:
>
> Hello. I investigated ref leak report related to thread.
> Please run python regrtest.py -R :: test_leak.py (attached file)
> Sometimes ref leak is reported.

Please post a bug report to SF and report the bug number here.  When you
post bugs only to the list they get lost.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha

From ocean at m2.ccsnet.ne.jp  Mon Jun 18 16:46:20 2007
From: ocean at m2.ccsnet.ne.jp (ocean)
Date: Mon, 18 Jun 2007 23:46:20 +0900
Subject: [Python-Dev] Investigated ref leak report related to thread
	(regrtest.py -R ::)
References: <005b01c7b1a8$fde788a0$0300a8c0@whiterabc2znlh>
	<20070618143358.GA24375@panix.com>
Message-ID: <002901c7b1b7$70749ad0$0300a8c0@whiterabc2znlh>

> Please post a bug report to SF and report the bug number here.  When you
> post bugs only to the list they get lost.
> -- 
> Aahz (aahz at pythoncraft.com)           <*>
http://www.pythoncraft.com/
>
> "as long as we like the same operating system, things are cool." --piranha

Thank you for pointing it out. Done. http://www.python.org/sf/1739118


From guido at python.org  Tue Jun 19 08:32:59 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Jun 2007 23:32:59 -0700
Subject: [Python-Dev] Python 3000 Status Update (Long!)
Message-ID: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>

I've written up a comprehensive status report on Python 3000. Please read:

http://www.artima.com/weblogs/viewpost.jsp?thread=208549

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Tue Jun 19 10:47:20 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 19 Jun 2007 10:47:20 +0200
Subject: [Python-Dev] Python 3000 Status Update (Long!)
In-Reply-To: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>
Message-ID: <f5856m$h2q$1@sea.gmane.org>

Guido van Rossum wrote:
> I've written up a comprehensive status report on Python 3000. Please read:
> 
> http://www.artima.com/weblogs/viewpost.jsp?thread=208549

Thank you! Now I have something to show to interested people except "read
the PEPs".

A minuscule nit: the rot13 codec has no library equivalent, so it won't be
supported anymore :)

Georg


From ncoghlan at gmail.com  Tue Jun 19 13:57:44 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 19 Jun 2007 21:57:44 +1000
Subject: [Python-Dev] Python 3000 Status Update (Long!)
In-Reply-To: <f5856m$h2q$1@sea.gmane.org>
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>
	<f5856m$h2q$1@sea.gmane.org>
Message-ID: <4677C4B8.8010508@gmail.com>

Georg Brandl wrote:
> Guido van Rossum schrieb:
>> I've written up a comprehensive status report on Python 3000. Please read:
>>
>> http://www.artima.com/weblogs/viewpost.jsp?thread=208549
> 
> Thank you! Now I have something to show to interested people except "read
> the PEPs".
> 
> A minuscule nit: the rot13 codec has no library equivalent, so it won't be
> supported anymore :)

Given that there are valid use cases for bytes-to-bytes translations, 
and a common API for them would be nice, does it make sense to have an 
additional category of codec that is invoked via specific recoding 
methods on bytes objects? For example:

   encoded = data.encode_bytes('bz2')
   decoded = encoded.decode_bytes('bz2')
   assert data == decoded
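
A minimal sketch of how such recoding could behave, written as plain
functions because encode_bytes and decode_bytes are only proposed here and
do not exist on bytes objects. The table _BYTE_CODECS and both function
names are hypothetical; bz2 and zlib are the standard library modules.

```python
import bz2
import zlib

# Hypothetical table of bytes-to-bytes codecs: name -> (encoder, decoder).
_BYTE_CODECS = {
    "bz2": (bz2.compress, bz2.decompress),
    "zlib": (zlib.compress, zlib.decompress),
}


def encode_bytes(data, name):
    """Apply the named bytes-to-bytes encoder to `data`."""
    encoder, _ = _BYTE_CODECS[name]
    return encoder(data)


def decode_bytes(data, name):
    """Apply the named bytes-to-bytes decoder to `data`."""
    _, decoder = _BYTE_CODECS[name]
    return decoder(data)


data = b"spam and eggs " * 10
encoded = encode_bytes(data, "bz2")
decoded = decode_bytes(encoded, "bz2")
assert data == decoded
```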

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From fuzzyman at voidspace.org.uk  Tue Jun 19 14:31:38 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 19 Jun 2007 13:31:38 +0100
Subject: [Python-Dev] Inspect Patch for IronPython (and Jython?)
	Compatibility
Message-ID: <4677CCAA.1050608@voidspace.org.uk>

Hello all,

I've just submitted a patch on sourceforge to make inspect compatible 
with IronPython (and Jython I think). This patch originally comes from 
the IPCE ( http://fepy.sf.net ) project by Seo Sanghyeon. It is a 
trivial change really.

The patch is number 1739696
http://sourceforge.net/tracker/index.php?func=detail&aid=1739696&group_id=5470&atid=305470

It moves getting a reference to 'code.co_code' into the body of the loop 
responsible for inspecting anonymous (tuple) arguments.

In IronPython, accessing 'co_code' raises a NotImplementedError - 
meaning that inspect.getargspec is broken.

This patch means that *except* for functions with anonymous tuple 
arguments, it will work again on IronPython - whilst maintaining full 
compatibility with the previous behaviour.
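
The idea can be sketched as follows. This is an illustration only, not the
code from the patch: CodeWithoutBytecode and argument_names are hypothetical
names, and the check for names starting with '' or '.' assumes that this is
how anonymous tuple arguments are marked in Python 2 code objects.

```python
class CodeWithoutBytecode:
    """Stand-in for a code object on an implementation without bytecode."""
    co_argcount = 2
    co_varnames = ("a", "b", "tmp")

    @property
    def co_code(self):
        raise NotImplementedError("bytecode is not available")


def argument_names(code):
    """Return argument names, touching co_code only for tuple arguments."""
    names = list(code.co_varnames[:code.co_argcount])
    for i, name in enumerate(names):
        # Assumption: anonymous tuple arguments have names like '' or '.0'.
        if name[:1] in ("", "."):
            # Only now is the bytecode needed, so only now is it accessed.
            bytecode = code.co_code
            names[i] = "<tuple argument, %d bytes to decode>" % len(bytecode)
    return names


print(argument_names(CodeWithoutBytecode()))
```

Because co_code is read inside the loop, a function with only ordinary
arguments never triggers the NotImplementedError.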

Jython has a similar patch to overcome the same issue by the way. See 
http://jython.svn.sourceforge.net/viewvc/jython?view=rev&revision=3200

As it is a bugfix - backporting to 2.5 would be great. Should I generate 
a separate patch?

All the best,

Michael Foord

From g.brandl at gmx.net  Tue Jun 19 14:25:03 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 19 Jun 2007 14:25:03 +0200
Subject: [Python-Dev] Multi-line comments - a case for PEP 3099?
Message-ID: <f58hus$ub0$1@sea.gmane.org>

Hi,

we got another feature request for multi-line comments.

While it is nice to comment out multiple lines at once, every editor
that deserves that name can add a '#' to multiple lines.

And there's always "if 0" and triple-quoted strings...
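
For reference, a small sketch of the three existing ways to disable several
lines at once (the function area is just a made-up example):

```python
def area(width, height):
    # 1. One '#' per line, which is what an editor adds for a block:
    # print("width", width)
    # print("height", height)

    # 2. "if 0": never executed, but the body must still be valid Python.
    if 0:
        print("debugging", width, height)

    # 3. A bare triple-quoted string is evaluated and then discarded.
    """
    print("debugging", width, height)
    """
    return width * height


print(area(3, 4))
```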

Georg


From walter at livinglogic.de  Tue Jun 19 14:40:57 2007
From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=)
Date: Tue, 19 Jun 2007 14:40:57 +0200
Subject: [Python-Dev] [Python-3000]  Python 3000 Status Update (Long!)
In-Reply-To: <f58hlj$sri$1@sea.gmane.org>
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>	<f5856m$h2q$1@sea.gmane.org>	<4677C4B8.8010508@gmail.com>
	<f58hlj$sri$1@sea.gmane.org>
Message-ID: <4677CED9.1060800@livinglogic.de>

Georg Brandl wrote:
> Nick Coghlan schrieb:
>> Georg Brandl wrote:
>>> Guido van Rossum schrieb:
>>>> I've written up a comprehensive status report on Python 3000. Please read:
>>>>
>>>> http://www.artima.com/weblogs/viewpost.jsp?thread=208549
>>> Thank you! Now I have something to show to interested people except "read
>>> the PEPs".
>>>
>>> A minuscule nit: the rot13 codec has no library equivalent, so it won't be
>>> supported anymore :)
>> Given that there are valid use cases for bytes-to-bytes translations, 
>> and a common API for them would be nice, does it make sense to have an 
>> additional category of codec that is invoked via specific recoding 
>> methods on bytes objects? For example:
>>
>>    encoded = data.encode_bytes('bz2')
>>    decoded = encoded.decode_bytes('bz2')
>>    assert data == decoded
> 
> This is exactly what I proposed a while before under the name
> bytes.transform().
> 
> IMO it would make a common use pattern much more convenient and
> should be given thought.
> 
> If a PEP is called for, I'd be happy to at least co-author it.

Codecs are a major exception to Guido's law: Never have a parameter
whose value switches between completely unrelated algorithms.

Why don't we put all string transformation functions into a common
module (the string module might be a good place):

>>> import string
>>> string.rot13('abc')
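
A minimal sketch of what such a function could look like. string.rot13 does
not exist; rot13 below is a hypothetical standalone function built on
str.maketrans and str.translate, using Python 3 spelling.

```python
import string

_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase

# Map each ASCII letter to the letter 13 places further on, wrapping around.
_ROT13 = str.maketrans(
    _LOWER + _UPPER,
    _LOWER[13:] + _LOWER[:13] + _UPPER[13:] + _UPPER[:13],
)


def rot13(text):
    """Rotate ASCII letters by 13 places; leave everything else alone."""
    return text.translate(_ROT13)


print(rot13("abc"))  # nop
```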

Servus,
   Walter

From mal at egenix.com  Tue Jun 19 15:19:50 2007
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 19 Jun 2007 15:19:50 +0200
Subject: [Python-Dev] [Python-3000]  Python 3000 Status Update (Long!)
In-Reply-To: <4677CED9.1060800@livinglogic.de>
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>	<f5856m$h2q$1@sea.gmane.org>	<4677C4B8.8010508@gmail.com>	<f58hlj$sri$1@sea.gmane.org>
	<4677CED9.1060800@livinglogic.de>
Message-ID: <4677D7F6.3040304@egenix.com>

On 2007-06-19 14:40, Walter Dörwald wrote:
> Georg Brandl wrote:
>>>> A minuscule nit: the rot13 codec has no library equivalent, so it won't be
>>>> supported anymore :)
>>> Given that there are valid use cases for bytes-to-bytes translations, 
>>> and a common API for them would be nice, does it make sense to have an 
>>> additional category of codec that is invoked via specific recoding 
>>> methods on bytes objects? For example:
>>>
>>>    encoded = data.encode_bytes('bz2')
>>>    decoded = encoded.decode_bytes('bz2')
>>>    assert data == decoded
>> This is exactly what I proposed a while before under the name
>> bytes.transform().
>>
>> IMO it would make a common use pattern much more convenient and
>> should be given thought.
>>
>> If a PEP is called for, I'd be happy to at least co-author it.
> 
> Codecs are a major exception to Guido's law: Never have a parameter
> whose value switches between completely unrelated algorithms.

I don't see much of a problem with that. Parameters are
per-se intended to change the behavior of a function or
method.

Note that you are referring to the .encode() and .decode()
methods - these are just easy to use interfaces to the codecs
registered in the system.

The codec design allows for different input and output
types as it doesn't impose restrictions on these. Codecs
are more general in that respect: they don't just deal
with Unicode encodings, it's a more general approach
that also works with other kinds of data types.

The access methods, OTOH, can impose restrictions and probably
should, to restrict the return types to a predictable set.

> Why don't we put all string transformation functions into a common
> module (the string module might be a good place):
> 
>>>> import string
>>>> string.rot13('abc')

I think the string module will have to go away. It doesn't
really separate between text and bytes data.

Adding more confusion will not really help with making
this distinction clear, either, I'm afraid.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 19 2007)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2007-07-09: EuroPython 2007, Vilnius, Lithuania            19 days to go

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From cbarton at metavr.com  Tue Jun 19 16:21:32 2007
From: cbarton at metavr.com (Campbell Barton)
Date: Wed, 20 Jun 2007 00:21:32 +1000
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <mailman.0.1182261543.9641.python-dev@python.org>
References: <mailman.0.1182261543.9641.python-dev@python.org>
Message-ID: <4677E66C.8000403@metavr.com>

Hey Guys,

This is my first post on this list, so I hope this is the right place to
post and that it is relevant.
I'm rewriting parts of the Blender3D Python API, which has gotten a bit old
and needs an update.

I'm making a PyList subtype with the C/Python API. This involves
intercepting calls to the standard list methods to make sure Blender's array
data is in sync with the list's data.

I've got it working for tp_as_sequence, tp_as_mapping, iter, dealloc,
etc., but methods are a problem.

I want to add my own calls before and after PyList's standard functions,
but I have a problem with functions that use keywords and have no API
equivalent.
For example, I can't use the API's PyList_Sort because that doesn't support
keywords like...

ls.sort(key=lambda a: a.foo)

And the problem with PyObject_CallMethod is that it doesn't accept keywords.

  PyObject_CallMethod((PyObject *)mylist, "sort", "O", args);


Looking at abstract.c, PyObject_CallMethod uses call_function_tail,
which calls "PyObject_Call(callable, args, NULL);", so it's not
currently possible with PyObject_CallMethod.

But I can't find any way to do this in a few lines.

I could use PyEval_CallObjectWithKeywords, but that would mean I'd need to
get the method from the list manually, which I'll look into. Unless I'm
missing something here, it seems PyObject_CallMethodWithKeywords would
be a nice addition to the Python API, since this can't be done in a
straightforward way at the moment.
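
In Python terms, the pattern being asked for looks roughly like the sketch
below: fetch the method, then call it with both positional and keyword
arguments. The C counterpart would presumably be PyObject_GetAttrString
followed by PyObject_Call(method, args, kwargs). SyncedList and its events
attribute are hypothetical names used only for illustration.

```python
class SyncedList(list):
    """List subclass that runs extra code around list.sort()."""

    def __init__(self, *args):
        super().__init__(*args)
        self.events = []

    def sort(self, *args, **kwargs):
        self.events.append("before sort")  # e.g. pull data from Blender
        # Look the method up explicitly, then call it with keywords.
        method = getattr(list, "sort")
        method(self, *args, **kwargs)
        self.events.append("after sort")   # e.g. push data back to Blender


ls = SyncedList([3, 1, 2])
ls.sort(key=lambda a: -a)
print(ls, ls.events)
```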

- Thanks



From chrism at plope.com  Tue Jun 19 16:24:05 2007
From: chrism at plope.com (Chris McDonough)
Date: Tue, 19 Jun 2007 10:24:05 -0400
Subject: [Python-Dev] Issues with PEP 3101 (string formatting)
Message-ID: <A51EAB52-FA02-47DE-8A82-DF706F4ECD67@plope.com>

Wrt http://www.python.org/dev/peps/pep-3101/

PEP 3101 says Py3K should allow item and attribute access syntax  
within string templating expressions but "to limit potential security  
issues", access to underscore prefixed names within attribute/item  
access expressions will be disallowed.

I am a person who has lived with the aftermath of a framework  
designed to prevent data access by restricting access to underscore- 
prefixed names (Zope 2, ahem), and I've found it's very hard to  
explain and justify.  As a result, I feel that this is a poor default  
policy choice for a framework.

In some cases, underscore names must become part of an object's  
external interface.  Consider a URL with one or more underscore- 
prefixed path segment elements (because prefixing a filename with an  
underscore is a perfectly reasonable thing to do on a filesystem, and  
path elements are often named after file names) fed to a traversal  
algorithm that attempts to resolve each path element into an object  
by calling __getitem__ against the parent found by the last path  
element's traversal result.  Perhaps this is poor design and  
__getitem__ should not be consulted here, but I doubt that highly  
because there's nothing particularly special about calling a method  
named __getitem__ as opposed to some method named "traverse".

The only precedent within Python 2 for this sort of behavior is  
limiting access to variables that begin with __ and which do not end  
with __ to the scope defined by a class and its instances.  I  
personally don't believe this is a very useful feature, but it's  
still only an advisory policy and you can worm around it with enough  
gyrations.

Given that security is a concern at all, the only truly reasonable  
way to "limit security issues" is to disallow item and attribute  
access completely within the string templating expression syntax.  It  
seems gratuitous to me to encourage string templating expressions  
with item/attribute access, given that you could do it within the  
format arguments just as easily in the 99% case, and we've (well...  
I've) happily been living with that restriction for years now.

But if this syntax is preserved, there really should be no *default*  
restrictions on the traversable names within an expression because  
this will almost certainly become a hard-to-explain, hard-to-justify  
bug magnet as it has become in Zope.
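
To make the two styles concrete, here is a small sketch using the
str.format() spelling from the PEP. The Page class and its attributes are
made up for illustration. Whether the last line is rejected depends on the
rule under discussion; str.format() as released in CPython does not enforce
an underscore restriction.

```python
class Page:
    def __init__(self):
        self.title = "Home"
        self._draft = True


page = Page()

# Attribute access inside the template expression:
in_template = "{0.title}".format(page)

# The same result with the access done in the format arguments:
in_arguments = "{0}".format(page.title)

# The contested case: an underscore-prefixed name inside the template.
underscore = "{0._draft}".format(page)

print(in_template, in_arguments, underscore)
```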

- C


From walter at livinglogic.de  Tue Jun 19 16:45:46 2007
From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=)
Date: Tue, 19 Jun 2007 16:45:46 +0200
Subject: [Python-Dev] [Python-3000]  Python 3000 Status Update (Long!)
In-Reply-To: <f58k6r$6fv$1@sea.gmane.org>
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>	<f5856m$h2q$1@sea.gmane.org>	<4677C4B8.8010508@gmail.com>	<f58hlj$sri$1@sea.gmane.org>	<4677CED9.1060800@livinglogic.de>
	<f58k6r$6fv$1@sea.gmane.org>
Message-ID: <4677EC1A.10306@livinglogic.de>

Georg Brandl wrote:
> Walter Dörwald wrote:
>> Georg Brandl wrote:
>>> Nick Coghlan schrieb:
>>>> Georg Brandl wrote:
>>>>> Guido van Rossum schrieb:
>>>>>> I've written up a comprehensive status report on Python 3000. Please read:
>>>>>>
>>>>>> http://www.artima.com/weblogs/viewpost.jsp?thread=208549
>>>>> Thank you! Now I have something to show to interested people except "read
>>>>> the PEPs".
>>>>>
>>>>> A minuscule nit: the rot13 codec has no library equivalent, so it won't be
>>>>> supported anymore :)
>>>> Given that there are valid use cases for bytes-to-bytes translations, 
>>>> and a common API for them would be nice, does it make sense to have an 
>>>> additional category of codec that is invoked via specific recoding 
>>>> methods on bytes objects? For example:
>>>>
>>>>    encoded = data.encode_bytes('bz2')
>>>>    decoded = encoded.decode_bytes('bz2')
>>>>    assert data == decoded
>>> This is exactly what I proposed a while before under the name
>>> bytes.transform().
>>>
>>> IMO it would make a common use pattern much more convenient and
>>> should be given thought.
>>>
>>> If a PEP is called for, I'd be happy to at least co-author it.
>> Codecs are a major exception to Guido's law: Never have a parameter
>> whose value switches between completely unrelated algorithms.
> 
> I don't think that applies here. This is more like __import__():
> depending on the first parameter, completely different things can happen.
> Yes, the same import algorithm is used, but in the case of
> bytes.encode_bytes, the same algorithm is used to find and execute the
> codec.

What would a registry of transformation algorithms buy us compared to a
module with transformation functions?

The function version is shorter:

   transform.rot13('foo')

compared to:

   'foo'.transform('rot13')

If each transformation has its own function, these functions can have
their own arguments, e.g.
   transform.bz2encode(data: bytes, level: int=6) -> bytes

Of course str.transform() could pass along all arguments to the
registered function, but that's worse from a documentation viewpoint,
because the real signature is hidden deep in the registry.
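
A rough sketch of the two alternatives, for comparison. Neither a transform
module nor a transform() method exists; bz2encode, bz2decode, _REGISTRY and
transform are hypothetical names, and only bz2.compress and bz2.decompress
are real standard library calls.

```python
import bz2


def bz2encode(data: bytes, level: int = 6) -> bytes:
    """Function style: the signature documents the available options."""
    return bz2.compress(data, level)


def bz2decode(data: bytes) -> bytes:
    return bz2.decompress(data)


# Registry style: the options are forwarded blindly, so the real signature
# is only visible by looking up what is registered under the name.
_REGISTRY = {"bz2": bz2encode}


def transform(data, name, *args, **kwargs):
    return _REGISTRY[name](data, *args, **kwargs)


payload = b"foo" * 100
assert bz2decode(bz2encode(payload, level=9)) == payload
assert bz2decode(transform(payload, "bz2", level=9)) == payload
```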

Servus,
   Walter

From guido at python.org  Tue Jun 19 17:18:15 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Jun 2007 08:18:15 -0700
Subject: [Python-Dev] Multi-line comments - a case for PEP 3099?
In-Reply-To: <f58hus$ub0$1@sea.gmane.org>
References: <f58hus$ub0$1@sea.gmane.org>
Message-ID: <ca471dc20706190818j46ea0064n32ded8621e310bd1@mail.gmail.com>

On 6/19/07, Georg Brandl <g.brandl at gmx.net> wrote:

> we got another feature request for multi-line comments.
>
> While it is nice to comment out multiple lines at once, every editor
> that deserves that name can add a '#' to multiple lines.
>
> And there's always "if 0" and triple-quoted strings...
I'd also say that the case for TOOWTDI is pretty clear on that.

But perhaps we can keep the Py3k discussions on the python-3000 at python.org list?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Jun 19 17:20:25 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Jun 2007 08:20:25 -0700
Subject: [Python-Dev] Issues with PEP 3101 (string formatting)
In-Reply-To: <A51EAB52-FA02-47DE-8A82-DF706F4ECD67@plope.com>
References: <A51EAB52-FA02-47DE-8A82-DF706F4ECD67@plope.com>
Message-ID: <ca471dc20706190820n7715fc30jeafcffd14c6b5623@mail.gmail.com>

Those are valid concerns. I'm cross-posting this to the python-3000
list in the hope that the PEP's author and defenders can respond. I'm
sure we can work something out.

Please keep further discussion on the python-3000 at python.org list.

--Guido

On 6/19/07, Chris McDonough <chrism at plope.com> wrote:
> Wrt http://www.python.org/dev/peps/pep-3101/
>
> PEP 3101 says Py3K should allow item and attribute access syntax
> within string templating expressions but "to limit potential security
> issues", access to underscore prefixed names within attribute/item
> access expressions will be disallowed.
>
> I am a person who has lived with the aftermath of a framework
> designed to prevent data access by restricting access to underscore-
> prefixed names (Zope 2, ahem), and I've found it's very hard to
> explain and justify.  As a result, I feel that this is a poor default
> policy choice for a framework.
>
> In some cases, underscore names must become part of an object's
> external interface.  Consider a URL with one or more underscore-
> prefixed path segment elements (because prefixing a filename with an
> underscore is a perfectly reasonable thing to do on a filesystem, and
> path elements are often named after file names) fed to a traversal
> algorithm that attempts to resolve each path element into an object
> by calling __getitem__ against the parent found by the last path
> element's traversal result.  Perhaps this is poor design and
> __getitem__ should not be consulted here, but I doubt that highly
> because there's nothing particularly special about calling a method
> named __getitem__ as opposed to some method named "traverse".
>
> The only precedent within Python 2 for this sort of behavior is
> limiting access to variables that begin with __ and which do not end
> with __ to the scope defined by a class and its instances.  I
> personally don't believe this is a very useful feature, but it's
> still only an advisory policy and you can worm around it with enough
> gyrations.
>
> Given that security is a concern at all, the only truly reasonable
> way to "limit security issues" is to disallow item and attribute
> access completely within the string templating expression syntax.  It
> seems gratuitous to me to encourage string templating expressions
> with item/attribute access, given that you could do it within the
> format arguments just as easily in the 99% case, and we've (well...
> I've) happily been living with that restriction for years now.
>
> But if this syntax is preserved, there really should be no *default*
> restrictions on the traversable names within an expression because
> this will almost certainly become a hard-to-explain, hard-to-justify
> bug magnet as it has become in Zope.
>
> - C
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Jun 19 17:22:20 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Jun 2007 08:22:20 -0700
Subject: [Python-Dev] Inspect Patch for IronPython (and Jython?)
	Compatibility
In-Reply-To: <4677CCAA.1050608@voidspace.org.uk>
References: <4677CCAA.1050608@voidspace.org.uk>
Message-ID: <ca471dc20706190822x8ce277dp1487366d71a9d193@mail.gmail.com>

Let's definitely add this to the trunk (2.6). It sounds fine to me as
a bugfix too, since (from your description) it doesn't change the
behavior at all in CPython.

I won't have the time to submit this, but I'm sure there are others here who do.

--Guido

On 6/19/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> Hello all,
>
> I've just submitted a patch on sourceforge to make inspect compatible
> with IronPython (and Jython I think). This patch originally comes from
> the IPCE ( http://fepy.sf.net ) project by Seo Sanghyeon. It is a
> trivial change really.
>
> The patch is number 1739696
> http://sourceforge.net/tracker/index.php?func=detail&aid=1739696&group_id=5470&atid=305470
>
> It moves getting a reference to 'code.co_code' into the body of the loop
> responsible for inspecting anonymous (tuple) arguments.
>
> In IronPython, accessing 'co_code' raises a NotImplementedError -
> meaning that inspect.getargspec is broken.
>
> This patch means that *except* for functions with anonymous tuple
> arguments, it will work again on IronPython - whilst maintaining full
> compatibility with the previous behaviour.
>
> Jython has a similar patch to overcome the same issue by the way. See
> http://jython.svn.sourceforge.net/viewvc/jython?view=rev&revision=3200
>
> As it is a bugfix - backporting to 2.5 would be great. Should I generate
> a separate patch?
>
> All the best,
>
> Michael Foord


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Tue Jun 19 17:27:58 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 19 Jun 2007 17:27:58 +0200
Subject: [Python-Dev] [Python-3000]  Python 3000 Status Update (Long!)
In-Reply-To: <4677EC1A.10306@livinglogic.de>
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>	<f5856m$h2q$1@sea.gmane.org>	<4677C4B8.8010508@gmail.com>	<f58hlj$sri$1@sea.gmane.org>	<4677CED9.1060800@livinglogic.de>	<f58k6r$6fv$1@sea.gmane.org>
	<4677EC1A.10306@livinglogic.de>
Message-ID: <f58sls$983$1@sea.gmane.org>

Walter Dörwald wrote:

>>>> If a PEP is called for, I'd be happy to at least co-author it.
>>> Codecs are a major exception to Guido's law: Never have a parameter
>>> whose value switches between completely unrelated algorithms.
>> 
>> I don't think that applies here. This is more like __import__():
>> depending on the first parameter, completely different things can happen.
>> Yes, the same import algorithm is used, but in the case of
>> bytes.encode_bytes, the same algorithm is used to find and execute the
>> codec.
> 
> What would a registry of transformation algorithms buy us compared to a
> module with transformation functions?

Easier registering of custom transformations. Without a registry, you'd have
to monkey-patch a module.
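
What registering a custom transformation might look like, as a sketch under
the assumption of a simple name-to-function registry. register_transform,
transform and the "reverse" transformation are hypothetical; nothing like
this exists in the standard library under these names.

```python
_TRANSFORMS = {}


def register_transform(name):
    """Decorator that files a function under `name` in the registry."""
    def decorator(func):
        _TRANSFORMS[name] = func
        return func
    return decorator


def transform(data, name):
    """Look up the named transformation and apply it to `data`."""
    try:
        func = _TRANSFORMS[name]
    except KeyError:
        raise LookupError("unknown transformation: %s" % name)
    return func(data)


# Third-party code adds its own transformation without touching any module.
@register_transform("reverse")
def _reverse(data):
    return data[::-1]


print(transform(b"abc", "reverse"))
```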

> The function version is shorter:
> 
>    transform.rot13('foo')
> 
> compared to:
> 
>    'foo'.transform('rot13')

Yes, that's a very convincing argument :)

> If each transformation has its own function, these functions can have
> their own arguments, e.g.
>    transform.bz2encode(data: bytes, level: int=6) -> bytes
> 
> Of course str.transform() could pass along all arguments to the
> registered function, but that's worse from a documentation viewpoint,
> because the real signature is hidden deep in the registry.

I don't think transformation functions need arguments.

Georg


From g.brandl at gmx.net  Tue Jun 19 17:30:23 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 19 Jun 2007 17:30:23 +0200
Subject: [Python-Dev] Multi-line comments - a case for PEP 3099?
In-Reply-To: <ca471dc20706190818j46ea0064n32ded8621e310bd1@mail.gmail.com>
References: <f58hus$ub0$1@sea.gmane.org>
	<ca471dc20706190818j46ea0064n32ded8621e310bd1@mail.gmail.com>
Message-ID: <f58sqc$983$2@sea.gmane.org>

Guido van Rossum wrote:
> On 6/19/07, Georg Brandl <g.brandl at gmx.net> wrote:
> 
>> we got another feature request for multi-line comments.
>>
>> While it is nice to comment out multiple lines at once, every editor
>> that deserves that name can add a '#' to multiple lines.
>>
>> And there's always "if 0" and triple-quoted strings...
> I'd also say that the case for TOOWTDI is pretty clear on that.
> 
> But perhaps we can keep the Py3k discussions on the python-3000 at python.org list?

I haven't really seen this as a python-3000 specific issue. Or are you
referring to the other cross-posting thread?

Georg


From guido at python.org  Tue Jun 19 18:07:20 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Jun 2007 09:07:20 -0700
Subject: [Python-Dev] Multi-line comments - a case for PEP 3099?
In-Reply-To: <f58sqc$983$2@sea.gmane.org>
References: <f58hus$ub0$1@sea.gmane.org>
	<ca471dc20706190818j46ea0064n32ded8621e310bd1@mail.gmail.com>
	<f58sqc$983$2@sea.gmane.org>
Message-ID: <ca471dc20706190907n271c65e2va5c37ec7a34147df@mail.gmail.com>

On 6/19/07, Georg Brandl <g.brandl at gmx.net> wrote:
> Guido van Rossum schrieb:
> > On 6/19/07, Georg Brandl <g.brandl at gmx.net> wrote:
> >
> >> we got another feature request for multi-line comments.
> >>
> >> While it is nice to comment out multiple lines at once, every editor
> >> that deserves that name can add a '#' to multiple lines.
> >>
> >> And there's always "if 0" and triple-quoted strings...
> > I'd also say that the case for TOOWTDI is pretty clear on that.
> >
> > But perhaps we can keep the Py3k discussions on the python-3000 at python.org list?
>
> I haven't really seen this as a python-3000 specific issue. Or are you
> referring to the other cross-posting thread?

That too, but at this point *any* feature request is a Py3k request.
If it's not good for Py3k there's no point in having it in 2.6. And
I'd like new functionality in 2.6 to be restricted to backported Py3k
features.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fuzzyman at voidspace.org.uk  Tue Jun 19 21:50:46 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 19 Jun 2007 20:50:46 +0100
Subject: [Python-Dev] Inspect Patch for IronPython (and Jython?)
	Compatibility
In-Reply-To: <ca471dc20706190822x8ce277dp1487366d71a9d193@mail.gmail.com>
References: <4677CCAA.1050608@voidspace.org.uk>
	<ca471dc20706190822x8ce277dp1487366d71a9d193@mail.gmail.com>
Message-ID: <46783396.6090302@voidspace.org.uk>

Guido van Rossum wrote:
> Let's definitely add this to the trunk (2.6). It sounds fine to me as
> a bugfix too, since (from your description) it doesn't change the
> behavior at all in CPython.
Great.

It looks to me like the patch will apply fine against release25-maint.

No behaviour change.

Thanks

Michael Foord


>
> I won't have the time to submit this, but I'm sure there are others 
> here who do.
>
> --Guido
>
> On 6/19/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>> Hello all,
>>
>> I've just submitted a patch on sourceforge to make inspect compatible
>> with IronPython (and Jython I think). This patch originally comes from
>> the IPCE ( http://fepy.sf.net ) project by Seo Sanghyeon. It is a
>> trivial change really.
>>
>> The patch is number 1739696
>> http://sourceforge.net/tracker/index.php?func=detail&aid=1739696&group_id=5470&atid=305470 
>>
>>
>> It moves getting a reference to 'code.co_code' into the body of the loop
>> responsible for inspecting anonymous (tuple) arguments.
>>
>> In IronPython, accessing 'co_code' raises a NotImplementedError -
>> meaning that inspect.get_argspec is broken.
>>
>> This patch means that *except* for functions with anonymous tuple
>> arguments, it will work again on IronPython - whilst maintaining full
>> compatibility with the previous behaviour.
>>
>> Jython has a similar patch to overcome the same issue by the way. See
>> http://jython.svn.sourceforge.net/viewvc/jython?view=rev&revision=3200
>>
>> As it is a bugfix - backporting to 2.5 would be great. Should I generate
>> a separate patch?
>>
>> All the best,
>>
>> Michael Foord


From martin at v.loewis.de  Tue Jun 19 22:53:09 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 19 Jun 2007 22:53:09 +0200
Subject: [Python-Dev] [Python-3000]  Python 3000 Status Update (Long!)
In-Reply-To: <f58sls$983$1@sea.gmane.org>
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>	<f5856m$h2q$1@sea.gmane.org>	<4677C4B8.8010508@gmail.com>	<f58hlj$sri$1@sea.gmane.org>	<4677CED9.1060800@livinglogic.de>	<f58k6r$6fv$1@sea.gmane.org>	<4677EC1A.10306@livinglogic.de>
	<f58sls$983$1@sea.gmane.org>
Message-ID: <46784235.5050102@v.loewis.de>

>> What would a registry of transformation algorithms buy us compared to a
>> module with transformation functions?
> 
> Easier registering of custom transformations. Without a registry, you'd have
> to monkey-patch a module.

Or users would have to invoke the module directly.

I think a convention would be enough:

rot13.encode(foo)
rot13.decode(bar)

Then, "registration" would require to put the module on sys.path,
which it would for any other kind of registry as well.

My main objection to using an encoding is that for these,
the algorithm name will *always* be a string literal,
completely unlike "real" codecs, where the encoding name
often comes from the environment (either from the process
environment, or from some kind of input).
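
The two options discussed above can be contrasted in a short sketch.
This is only an illustration under assumptions: the register, transform
and untransform names are hypothetical, and the module convention is
imitated with a types.SimpleNamespace rather than a real rot13 module
on sys.path.

```python
import codecs
import types

# Option 1: a registry keyed by a string name (hypothetical API).
_registry = {}

def register(name, encode, decode):
    """Record an (encode, decode) pair under a string name."""
    _registry[name] = (encode, decode)

def transform(data, name):
    """Look the name up at call time, as 'foo'.transform('rot13') would."""
    return _registry[name][0](data)

def untransform(data, name):
    return _registry[name][1](data)

def _rot13(text):
    # rot13 is its own inverse, so one function serves both directions.
    return codecs.encode(text, "rot_13")

register("rot13", _rot13, _rot13)

# Option 2: the convention of one namespace per algorithm exposing
# encode() and decode(); a real implementation would be an importable module.
rot13 = types.SimpleNamespace(encode=_rot13, decode=_rot13)

print(transform("foo", "rot13"))            # sbb
print(rot13.decode(rot13.encode("foo")))    # foo
```

With the registry the algorithm name is a runtime string; with the
convention it is an ordinary attribute lookup that fails at import time
if the module is missing.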

Regards,
Martin


From hrvoje.niksic at avl.com  Wed Jun 20 09:34:49 2007
From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=)
Date: Wed, 20 Jun 2007 09:34:49 +0200
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <4677E66C.8000403@metavr.com>
References: <mailman.0.1182261543.9641.python-dev@python.org>
	<4677E66C.8000403@metavr.com>
Message-ID: <1182324889.6077.111.camel@localhost>

On Wed, 2007-06-20 at 00:21 +1000, Campbell Barton wrote:
> I want to add my own calls before and after PyList's standard functions
> but have a problem with functions that use keywords and have no API
> equivalent.
> For example, I can't use the API's PyList_Sort because that doesn't support
> keywords like...
> 
> ls.sort(key=lambda a: a.foo)
> 
> And the problem with PyObject_CallMethod is that it doesn't accept keywords.

Note that you can always simply call PyObject_Call on the bound method
object retrieved using PyObject_GetAttrString.  The hardest part is
usually constructing the keywords dictionary, a job best left to
Py_BuildValue and friends.  When I need that kind of thing in more than
one place, I end up with a utility function like this one:

/* Equivalent to PyObject_CallMethod but accepts keyword args.  The
   format... arguments should produce a dictionary that will be passed
   as keyword arguments to obj.method.

   Usage example:
     PyObject *res = call_method(lst, "sort", "{s:O}", "key", keyfun);
*/

PyObject *
call_method(PyObject *obj, const char *methname, char *format, ...)
{
  va_list va;
  PyObject *meth = NULL, *args = NULL, *kwds = NULL, *ret = NULL;

  args = PyTuple_New(0);
  if (!args)
    goto out;
  meth = PyObject_GetAttrString(obj, methname);
  if (!meth)
    goto out;

  va_start(va, format);
  kwds = Py_VaBuildValue(format, va);
  va_end(va);
  if (!kwds)
    goto out;

  ret = PyObject_Call(meth, args, kwds);
 out:
  Py_XDECREF(meth);
  Py_XDECREF(args);
  Py_XDECREF(kwds);
  return ret;
}

It would be nice for the Python C API to support a more convenient way
of calling objects and methods with keyword arguments.
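
For comparison, the same steps can be written in Python itself. This is
an illustrative sketch only, not part of the C API; the Python-level
call_method name simply mirrors the C helper above.

```python
def call_method(obj, methname, **kwds):
    """Python-level picture of the C helper: fetch the bound method,
    then call it with an empty positional tuple and a keyword dict."""
    meth = getattr(obj, methname)   # corresponds to PyObject_GetAttrString
    args = ()                       # corresponds to PyTuple_New(0)
    return meth(*args, **kwds)      # corresponds to PyObject_Call

lst = ["pear", "Fig", "apple"]
call_method(lst, "sort", key=str.lower)
print(lst)  # ['apple', 'Fig', 'pear']
```

The keyword dictionary that Py_VaBuildValue builds from the "{s:O}"
format plays the role of **kwds here.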



From cbarton at metavr.com  Wed Jun 20 12:17:22 2007
From: cbarton at metavr.com (Campbell Barton)
Date: Wed, 20 Jun 2007 20:17:22 +1000
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <1182324889.6077.111.camel@localhost>
References: <mailman.0.1182261543.9641.python-dev@python.org>	
	<4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost>
Message-ID: <4678FEB2.9050506@metavr.com>

Hrvoje Nikšić wrote:
> On Wed, 2007-06-20 at 00:21 +1000, Campbell Barton wrote:
>> I want to add my own calls before and after PyList's standard functions
>> but have a problem with functions that use keywords and have no API
>> equivalent.
>> For example, I can't use the API's PyList_Sort because that doesn't support
>> keywords like...
>>
>> ls.sort(key=lambda a: a.foo)
>>
>> And the problem with PyObject_CallMethod is that it doesn't accept keywords.
> 
> Note that you can always simply call PyObject_Call on the bound method
> object retrieved using PyObject_GetAttrString.  The hardest part is
> usually constructing the keywords dictionary, a job best left to
> Py_BuildValue and friends.  When I need that kind of thing in more than
> one place, I end up with a utility function like this one:
> 
> /* Equivalent to PyObject_CallMethod but accepts keyword args.  The
>    format... arguments should produce a dictionary that will be passed
>    as keyword arguments to obj.method.
> 
>    Usage example:
>      PyObject *res = call_method(lst, "sort", "{s:O}", "key", keyfun);
> */
> 
> PyObject *
> call_method(PyObject *obj, const char *methname, char *format, ...)
> {
>   va_list va;
>   PyObject *meth = NULL, *args = NULL, *kwds = NULL, *ret = NULL;
> 
>   args = PyTuple_New(0);
>   if (!args)
>     goto out;
>   meth = PyObject_GetAttrString(obj, methname);
>   if (!meth)
>     goto out;
> 
>   va_start(va, format);
>   kwds = Py_VaBuildValue(format, va);
>   va_end(va);
>   if (!kwds)
>     goto out;
> 
>   ret = PyObject_Call(meth, args, kwds);
>  out:
>   Py_XDECREF(meth);
>   Py_XDECREF(args);
>   Py_XDECREF(kwds);
>   return ret;
> }
> 
> It would be nice for the Python C API to support a more convenient way
> of calling objects and methods with keyword arguments.


Thanks for the hint, I ended up using PyObject_Call.
This seems to work. EXPP_PyTuple_New_Prepend is a utility function 
that returns a new tuple with self at the start (needed so that args 
starts with self).

I don't think I can use PyObject_GetAttrString, because the subtype would 
return a reference to this function rather than the list's original 
function. I'd need an instance of a list and don't have one at that point.
______________________

static PyObject *
MaterialList_sort(BPy_MaterialList *self, PyObject *args, PyObject *keywds)
{
	PyObject *ret;
	PyObject *newargs = EXPP_PyTuple_New_Prepend(args, (PyObject *)self);

	/* assumes the helper returns NULL with an exception set on failure */
	if (!newargs)
		return NULL;

	/* make sure the list matches Blender's materials */
	sync_list_from_materials__internal(self);

	/* PyDict_GetItemString returns a borrowed reference, so no DECREF */
	ret = PyObject_Call(PyDict_GetItemString(PyList_Type.tp_dict, "sort"),
	                    newargs, keywds);
	Py_DECREF(newargs);

	/* on success, make Blender's materials match the list */
	if (ret)
		sync_materials_from_list__internal(self);

	return ret;
}

_____________________

Later on I'll probably avoid calling PyDict_GetItemString on 
PyList_Type.tp_dict every time, since the methods of list do not 
change while Python is running. The result can probably be cached in a constant.


-- 
Campbell J Barton (ideasman42)

From hrvoje.niksic at avl.com  Wed Jun 20 13:38:49 2007
From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=)
Date: Wed, 20 Jun 2007 13:38:49 +0200
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <4678FEB2.9050506@metavr.com>
References: <mailman.0.1182261543.9641.python-dev@python.org>
	<4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost>
	<4678FEB2.9050506@metavr.com>
Message-ID: <1182339529.6077.120.camel@localhost>

[ Note that this discussion, except maybe for the suggestion to add a
simpler way to call a method with keyword args, is off-topic to
python-dev. ]

On Wed, 2007-06-20 at 20:17 +1000, Campbell Barton wrote:
> I don't think I can use PyObject_GetAttrString, because the subtype would 
> return a reference to this function rather than the list's original 
> function. I'd need an instance of a list and don't have one at that point.

Note that PyList_Type is a full-fledged PyObject, so
PyObject_GetAttrString works on it just fine.  Of course, you would also
need to add the "self" argument before the keywords, but that's a
trivial change to the function.
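
The same idea can be shown at the Python level. This is a minimal sketch
under assumptions: MaterialList and its log attribute are hypothetical
stand-ins for the C subtype and its sync functions.

```python
class MaterialList(list):
    """Hypothetical list subtype that wraps list.sort with extra steps."""

    def __init__(self, *args):
        super().__init__(*args)
        self.log = []

    def sort(self, *args, **kwds):
        self.log.append("sync list from materials")
        # Looking "sort" up on the list *type* returns list's own method,
        # not this override, so no list instance is needed for the lookup.
        base_sort = getattr(list, "sort")
        # self is passed explicitly, ahead of the keyword arguments.
        ret = base_sort(self, *args, **kwds)
        self.log.append("sync materials from list")
        return ret

m = MaterialList([3, 1, 2])
m.sort(reverse=True)
print(m)      # [3, 2, 1]
print(m.log)  # ['sync list from materials', 'sync materials from list']
```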

Calling PyObject_GetAttrString feels cleaner than accessing tp_dict
directly, and most importantly call_method as written delegates creation
of the dictionary to Py_BuildValue.



From cbarton at metavr.com  Wed Jun 20 14:12:42 2007
From: cbarton at metavr.com (Campbell Barton)
Date: Wed, 20 Jun 2007 22:12:42 +1000
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <1182339529.6077.120.camel@localhost>
References: <mailman.0.1182261543.9641.python-dev@python.org>	
	<4677E66C.8000403@metavr.com>
	<1182324889.6077.111.camel@localhost>	
	<4678FEB2.9050506@metavr.com> <1182339529.6077.120.camel@localhost>
Message-ID: <467919BA.2090708@metavr.com>

Hrvoje Nikšić wrote:
> [ Note that this discussion, except maybe for the suggestion to add a
> simpler way to call a method with keyword args, is off-topic to
> python-dev. ]

Is there a list for this kind of discussion?
I've tried asking questions in the freenode Python chat room, but 
very few people there do C/Python API development.

-- 
Campbell J Barton (ideasman42)

From facundo at taniquetil.com.ar  Wed Jun 20 14:36:33 2007
From: facundo at taniquetil.com.ar (Facundo Batista)
Date: Wed, 20 Jun 2007 12:36:33 +0000 (UTC)
Subject: [Python-Dev] Python 3000 Status Update (Long!)
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>
Message-ID: <f5b70h$et7$1@sea.gmane.org>

Guido van Rossum wrote:

> I've written up a comprehensive status report on Python 3000. Please read:
>
> http://www.artima.com/weblogs/viewpost.jsp?thread=208549

One doubt: in Miscellaneous you say:

  Ordering comparisons (<, <=, >, >=) will raise TypeError by default
  instead of returning arbitrary results. Equality comparisons (==, !=)
  will compare for object identity (is, is not) by default.

I *guess* that you're talking about comparisons between different
datatypes... but you didn't make that explicit in your blog.

Am I right?

-- 
.   Facundo
.
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/



From hrvoje.niksic at avl.com  Wed Jun 20 15:00:58 2007
From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=)
Date: Wed, 20 Jun 2007 15:00:58 +0200
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <467919BA.2090708@metavr.com>
References: <mailman.0.1182261543.9641.python-dev@python.org>
	<4677E66C.8000403@metavr.com> <1182324889.6077.111.camel@localhost>
	<4678FEB2.9050506@metavr.com> <1182339529.6077.120.camel@localhost>
	<467919BA.2090708@metavr.com>
Message-ID: <1182344458.6077.125.camel@localhost>

On Wed, 2007-06-20 at 22:12 +1000, Campbell Barton wrote:
> Hrvoje Nikšić wrote:
> > [ Note that this discussion, except maybe for the suggestion to add a
> > simpler way to call a method with keyword args, is off-topic to
> > python-dev. ]
> 
> Is there a list for this kind of discussion?

I believe the appropriate list would be the general Python
list/newsgroup.  I agree that response about the Python/C API tends to
be sparse on general-purpose lists, though.

If there is a forum dedicated to discussing the *use* of Python at the C
level, I'd like to know about it as well.



From guido at python.org  Wed Jun 20 15:30:48 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 20 Jun 2007 06:30:48 -0700
Subject: [Python-Dev] Python 3000 Status Update (Long!)
In-Reply-To: <f5b70h$et7$1@sea.gmane.org>
References: <ca471dc20706182332q18df52eaw77c3e544a65aa196@mail.gmail.com>
	<f5b70h$et7$1@sea.gmane.org>
Message-ID: <ca471dc20706200630xf2380c3j1ed59eaf523a609a@mail.gmail.com>

On 6/20/07, Facundo Batista <facundo at taniquetil.com.ar> wrote:
> Guido van Rossum wrote:
>
> > I've written up a comprehensive status report on Python 3000. Please read:
> >
> > http://www.artima.com/weblogs/viewpost.jsp?thread=208549
>
> One doubt: in Miscellaneous you say:
>
>   Ordering comparisons (<, <=, >, >=) will raise TypeError by default
>   instead of returning arbitrary results. Equality comparisons (==, !=)
>   will compare for object identity (is, is not) by default.
>
> I *guess* that you're talking about comparisons between different
> datatypes... but you didn't make that explicit in your blog.
>
> Am I right?

No. The *default* comparison always raises an exception. Of course,
most types have a comparison that does the right thing for objects of
the same type -- but they still raise an exception when compared (for
ordering) to objects of different types (except subtypes or related
types).
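
The rule can be checked in a Python 3 interpreter. A minimal sketch; the
Plain class and the ordering_fails helper are made up for illustration.

```python
class Plain:
    """A class that defines no comparison methods at all."""

def ordering_fails(x, y):
    """Return True if x < y raises TypeError."""
    try:
        x < y
    except TypeError:
        return True
    return False

a, b = Plain(), Plain()

# Default equality falls back to object identity.
print(a == a, a == b)          # True False

# Default ordering raises, even between objects of the same type.
print(ordering_fails(a, b))    # True

# Types with their own ordering still refuse unrelated types...
print(ordering_fails(1, "one"))  # True

# ...but related numeric types compare fine.
print(ordering_fails(1, 2.5))    # False
```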

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From brett at python.org  Wed Jun 20 17:43:15 2007
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Jun 2007 08:43:15 -0700
Subject: [Python-Dev] cleaning up the email addresses in the PEPs
Message-ID: <bbaeab100706200843q7634f110ycaddd834f2bf7bec@mail.gmail.com>

I am working on some code in the sandbox to automatically generate PEP
0.  This is also leading to code that checks all the PEPs follow some
basic guidelines.

One of those guidelines is an author having a single email address.
The Owners index at the bottom of PEP 0 is going to be created from
the names and email addresses found in the PEPs themselves.  But that
doesn't work too well when an author has multiple addresses listed.
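
A check of this kind could look roughly like the following. This is a
sketch, not the sandbox code: the function names are hypothetical, the
sample headers use made-up people and example.org addresses, and it
assumes Author headers use either "Name <addr>" or "addr (Name)".

```python
import re
from collections import defaultdict

# "Name <addr>" and the older "addr (Name)" spelling.
_NAME_ADDR = re.compile(r"\s*(?P<name>[^<]+?)\s*<(?P<addr>[^>]+)>")
_ADDR_NAME = re.compile(r"\s*(?P<addr>[^\s(]+)\s*\((?P<name>[^)]+)\)")

def parse_authors(header):
    """Split one Author header into (name, lowercased address) pairs."""
    authors = []
    for part in header.split(","):
        match = _NAME_ADDR.match(part) or _ADDR_NAME.match(part)
        if match:
            name = match.group("name").strip()
            addr = match.group("addr").strip().lower()
            authors.append((name, addr))
    return authors

def multiple_addresses(headers):
    """Map each author name to its addresses, keeping only names
    that appear with more than one distinct address."""
    seen = defaultdict(set)
    for header in headers:
        for name, addr in parse_authors(header):
            seen[name].add(addr)
    return {name: sorted(addrs)
            for name, addrs in seen.items() if len(addrs) > 1}

sample = [
    "Jane Doe <jane@example.org>, John Roe <john@example.org>",
    "jane@example.com (Jane Doe)",
    "John Roe <John@Example.org>",
]
print(multiple_addresses(sample))
```

Addresses are compared case-insensitively here, which is a design choice
of the sketch; names are compared exactly, so variant spellings of one
name would still need separate handling.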

If you are listed below, please choose a single address to use.  You
can either change the PEPs yourself or just reply with the email you
prefer.  I can tell you the multiple spellings if you want.  If I
don't hear from people I will just use my best judgement.

And even better, if you spell your name multiple ways in the PEPs
(e.g., Martin v. Loewis, Martin v. Löwis, Martin von Löwis) also let
it be known which spelling you prefer (unifying name spelling comes
after unifying the email addresses).

   Aahz
   Ka-Ping Yee:
   Neil Schemenauer
   David Goodger:
   Tim Peters:
   Martin v. Löwis:
   Paul Prescod:
   Jeremy Hylton:
   Clark C. Evans:
   Richard Jones:
   Alex Martelli:
   Moshe Zadka

-Brett

From martin at v.loewis.de  Wed Jun 20 19:26:32 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 20 Jun 2007 19:26:32 +0200
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <467919BA.2090708@metavr.com>
References: <mailman.0.1182261543.9641.python-dev@python.org>		<4677E66C.8000403@metavr.com>	<1182324889.6077.111.camel@localhost>		<4678FEB2.9050506@metavr.com>
	<1182339529.6077.120.camel@localhost> <467919BA.2090708@metavr.com>
Message-ID: <46796348.2050902@v.loewis.de>

Campbell Barton schrieb:
> Hrvoje Nikšić wrote:
>> [ Note that this discussion, except maybe for the suggestion to add a
>> simpler way to call a method with keyword args, is off-topic to
>> python-dev. ]
> 
> Is there a list for this kind of discussion?

Hrvoje wasn't explicit on *why* this discussion is inappropriate here,
so I just add that for better understanding:

python-dev is for the development *of* Python, not for the development
*with* Python. So you post here if you propose an enhancement or discuss
the resolution of a bug. Questions of the "how do I" kind are off-topic -
posters are expected to know and understand the options, and then
discuss the flaws of these options, rather than asking what they are.

As Hrvoje says: try python-list (aka comp.lang.python). If you don't
get an answer, you didn't phrase your question interestingly enough,
or nobody knows the answer, or nobody has the time to tell you.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 21 08:10:48 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 21 Jun 2007 08:10:48 +0200
Subject: [Python-Dev] www.python.org outage
Message-ID: <467A1668.1020600@v.loewis.de>

The scheduled outage starts now.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 21 10:41:47 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 21 Jun 2007 10:41:47 +0200
Subject: [Python-Dev] www.python.org back up
Message-ID: <467A39CB.4080705@v.loewis.de>

I completed the update of dinsdale. Please let me know if you
find any new problems with that machine.

Regards,
Martin

From hrvoje.niksic at avl.com  Thu Jun 21 13:33:56 2007
From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=)
Date: Thu, 21 Jun 2007 13:33:56 +0200
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <46796348.2050902@v.loewis.de>
References: <mailman.0.1182261543.9641.python-dev@python.org>
	<4677E66C.8000403@metavr.com>	<1182324889.6077.111.camel@localhost>
	<4678FEB2.9050506@metavr.com> <1182339529.6077.120.camel@localhost>
	<467919BA.2090708@metavr.com>  <46796348.2050902@v.loewis.de>
Message-ID: <1182425636.6077.141.camel@localhost>

On Wed, 2007-06-20 at 19:26 +0200, "Martin v. Löwis" wrote:
> As Hrvoje says: try python-list (aka comp.lang.python). If you don't
> get an answer, you didn't phrase your question interestingly enough,
> or nobody knows the answer, or nobody has the time to tell you.

The thing with comp.lang.python is that it is followed by a large number
of Python users, but a much smaller number of the C API users -- which
is only natural, since the group is about Python, not about C.  For most
users the Python/C API is an implementation detail which they never have
to worry about.

Furthermore, questions about the C API often concern CPython
implementation details and so they don't feel like they would belong in
comp.lang.python.  As an experiment, it might make sense to open a
mailing list dedicated to the Python C API.  It could become a useful
support forum for extension writers (a group very useful to Python) and
maybe even a melting pot for new ideas regarding CPython, much like
comp.lang.python historically provided ideas for Python the language.



From cbarton at metavr.com  Thu Jun 21 13:59:30 2007
From: cbarton at metavr.com (Campbell Barton)
Date: Thu, 21 Jun 2007 21:59:30 +1000
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <1182425636.6077.141.camel@localhost>
References: <mailman.0.1182261543.9641.python-dev@python.org>	
	<4677E66C.8000403@metavr.com>	<1182324889.6077.111.camel@localhost>	
	<4678FEB2.9050506@metavr.com>
	<1182339529.6077.120.camel@localhost>	
	<467919BA.2090708@metavr.com> <46796348.2050902@v.loewis.de>
	<1182425636.6077.141.camel@localhost>
Message-ID: <467A6822.3050103@metavr.com>

The reason I asked about this in the first place is that I had looked through 
the Python source to make sure PyObject_Call had no equivalent that 
supported keywords, and since I needed to do this I figured it might be 
worth considering for Python's API.

I'm sure everyone could write their own PyObject_Call if they had to, but 
that's what the API is for.

Hrvoje Nikšić wrote:
> On Wed, 2007-06-20 at 19:26 +0200, "Martin v. Löwis" wrote:
>> As Hrvoje says: try python-list (aka comp.lang.python). If you don't
>> get an answer, you didn't phrase your question interestingly enough,
>> or nobody knows the answer, or nobody has the time to tell you.
> 
> The thing with comp.lang.python is that it is followed by a large number
> of Python users, but a much smaller number of the C API users -- which
> is only natural, since the group is about Python, not about C.  For most
> users the Python/C API is an implementation detail which they never have
> to worry about.
> 
> Furthermore, questions about the C API often concern CPython
> implementation details and so they don't feel like they would belong in
> comp.lang.python.  As an experiment, it might make sense to open a
> mailing list dedicated to the Python C API.  It could become a useful
> support forum for extension writers (a group very useful to Python) and
> maybe even a melting pot for new ideas regarding CPython, much like
> comp.lang.python historically provided ideas for Python the language.

Agreed, a Python/C API list would be great; in fact I can't see any reason 
not to have it. Pure-Python users likely don't want to know about 
refcounting problems etc. anyway.

http://mail.python.org/mailman/listinfo
http://www.python.org/community/sigs/
There are lists/newsgroups for py2exe, Pyrex, Python-ObjectiveC, etc.

The Python/C API seems much more generic, and it's also fairly tricky to use 
at times when doing more advanced stuff (subtyping has been tricky for 
me, anyway). I expect the devs of Pyrex, pygame, etc. might also need to 
discuss C API specific issues.

I've had roughly this conversation on IRC...

Q. Hi, I'd like to know how to wrap Python subtype methods in the C API
A. C doesn't have classes, use C++
Q. No, I want to use Python's C API
A. Subtypes are easy to do in Python...
...... you get the idea...

Quite a few "Python only" users don't understand where the Python/C API 
fits in, and it's annoying to have to explain the question each time (yes, 
I've had these conversations more than once).


- Cam

From amk at amk.ca  Thu Jun 21 17:23:52 2007
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 21 Jun 2007 11:23:52 -0400
Subject: [Python-Dev] Wanted: readers for a mailbox.py article
Message-ID: <20070621152352.GA10988@localhost.localdomain>

I'm writing an article about the mailbox module for an online
publication, and would like to get comments on the current draft from
people familiar with the module.

If you'd like to take a look, please e-mail me and I'll tell you the
draft's URL.

--amk

From martin at v.loewis.de  Thu Jun 21 19:25:06 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 21 Jun 2007 19:25:06 +0200
Subject: [Python-Dev] Calling Methods from Pythons C API with Keywords
In-Reply-To: <1182425636.6077.141.camel@localhost>
References: <mailman.0.1182261543.9641.python-dev@python.org>	
	<4677E66C.8000403@metavr.com>	<1182324889.6077.111.camel@localhost>	
	<4678FEB2.9050506@metavr.com>
	<1182339529.6077.120.camel@localhost>	
	<467919BA.2090708@metavr.com> <46796348.2050902@v.loewis.de>
	<1182425636.6077.141.camel@localhost>
Message-ID: <467AB472.6070509@v.loewis.de>

> Furthermore, questions about the C API often concern CPython
> implementation details and so they don't feel like they would belong in
> comp.lang.python.  As an experiment, it might make sense to open a
> mailing list dedicated to the Python C API.  It could become a useful
> support forum for extension writers (a group very useful to Python) and
> maybe even a melting pot for new ideas regarding CPython, much like
> comp.lang.python historically provided ideas for Python the language.

In the past, we created special-interest groups for such discussion.
Would you like to coordinate a C sig? See

http://www.python.org/community/sigs/

Regards,
Martin

From arigo at tunes.org  Fri Jun 22 11:06:22 2007
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 22 Jun 2007 11:06:22 +0200
Subject: [Python-Dev] Vilnius/Post EuroPython PyPy Sprint 12-14th of July
Message-ID: <20070622090621.GA19639@code0.codespeak.net>

========================================================
Vilnius/Post EuroPython PyPy Sprint 12-14th of July
========================================================

The PyPy team is sprinting at EuroPython again and we invite
you to participate in our 3 day long sprint at the conference hotel
- Reval Hotel Lietuva.

If you plan to attend the sprint we recommend that you listen to the PyPy
technical talks (`EuroPython schedule`_) during the
conference, since they will give you a good overview of the status of development.

On the morning of the first sprint day (12th) we will also have a
tutorial session for those new to PyPy development.  As 3 days is relatively
short for a PyPy sprint we suggest travelling back home on the 15th if
possible (but it is OK to attend for fewer than 3 days too).

------------------------------
Goals and topics of the sprint
------------------------------

There are many possible and interesting sprint topics to work on - here
we list some possible task areas:

* completing the missing python 2.5 features and support
* write or port more extension modules (e.g. zlib is missing)
* identify slow areas of PyPy through benchmarking and work on improvements,
  possibly moving app-level parts of the Python interpreter to interp-level
  if useful.
* there are some parts of PyPy in need of refactoring, we may spend some time
  on those, for example:

    - rctypes and the extension compiler need some rethinking
    - support for LLVM 2.0 for the llvm backend
    - ...

* some JIT improvement work
* port the stackless transform to ootypesystem
* other interesting stuff that you would like to work on ...;-)

------------
Registration
------------

If you'd like to come, please subscribe to the `pypy-sprint mailing list`_
and drop a note about your interests and post any questions.  More
organisational information will be sent to that list.

Please register by adding yourself on the following list (via svn):

  http://codespeak.net/svn/pypy/extradoc/sprintinfo/post-ep2007/people.txt

or on the pypy-sprint mailing list if you do not yet have check-in rights:

  http://codespeak.net/mailman/listinfo/pypy-sprint

---------------------------------------
Preparation (if you feel it is needed):
---------------------------------------

* read the `getting-started`_ pages on http://codespeak.net/pypy

* for inspiration, overview and technical status you are welcome to
  read `the technical reports available and other relevant documentation`_

* please direct any technical and/or development oriented questions to
  pypy-dev at codespeak.net and any sprint organizing/logistical
  questions to pypy-sprint at codespeak.net

* if you need information about the conference, potential hotels,
  directions etc. we recommend looking at http://www.europython.org.


We are looking forward to meeting you at the Vilnius Post EuroPython
PyPy sprint!

The PyPy team 


.. See also ..

.. _getting-started: http://codespeak.net/pypy/dist/pypy/doc/getting-started.html
.. _`pypy-sprint mailing list`: http://codespeak.net/mailman/listinfo/pypy-sprint
.. _`the technical reports available and other relevant documentation`: http://codespeak.net/pypy/dist/pypy/doc/index.html
.. _`EuroPython schedule`: http://indico.cern.ch/conferenceTimeTable.py?confId=13919&showDate=all&showSession=all&detailLevel=contribution&viewMode=room

From henning.vonbargen at arcor.de  Fri Jun 22 23:40:04 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Fri, 22 Jun 2007 23:40:04 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to avoid
	problems with subprocesses and security risks
Message-ID: <003a01c7b515$e56a7b50$6401a8c0@max>

I'd like to propose a new function "open_noinherit"
or maybe even a new mode flag "n" for the builtin "open"
(see footnote for the names).

The new function should work exactly like the builtin "open", with one
difference:
The open file is not inherited by any child processes
(whereas files opened with "open" will be inherited).

The new function can be implemented (basically) using
os.O_NOINHERIT on MS Windows
resp. fcntl / FD_CLOEXEC on Posix.
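
The approach can be sketched in Python as follows. This is not the
announced implementation, only a minimal illustration under assumptions:
it handles just the "r", "w" and "a" modes, and the mode-to-flags table
is a simplification. Note that on Python 3.4 and later, descriptors
created by Python are already non-inheritable by default (PEP 446), so
there the explicit steps are redundant but harmless.

```python
import os

try:
    import fcntl          # available on POSIX only
except ImportError:
    fcntl = None          # e.g. on Windows

# Simplified mapping from the first mode character to os.open flags.
_BASE_FLAGS = {
    "r": os.O_RDONLY,
    "w": os.O_WRONLY | os.O_CREAT | os.O_TRUNC,
    "a": os.O_WRONLY | os.O_CREAT | os.O_APPEND,
}

def open_noinherit(path, mode="r"):
    """Open path like open(), but keep the file out of child processes."""
    flags = _BASE_FLAGS[mode[0]]
    # These constants exist only on Windows; elsewhere they contribute 0.
    flags |= getattr(os, "O_NOINHERIT", 0)
    flags |= getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o666)
    if fcntl is not None:
        try:
            # POSIX: mark the descriptor close-on-exec.
            old = fcntl.fcntl(fd, fcntl.F_GETFD)
            fcntl.fcntl(fd, fcntl.F_SETFD, old | fcntl.FD_CLOEXEC)
        except Exception:
            os.close(fd)
            raise
    return os.fdopen(fd, mode)
```

On POSIX there is still a window between os.open and the F_SETFD call in
which another thread could start a subprocess; the O_CLOEXEC open flag,
where the platform offers it, closes that window.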

I will post a working Python implementation next week.

There are five reasons for the proposal:
1) The builtin "open" causes unexpected problems in conjunction with
   subprocesses, in particular in multi-threaded programs.
   It can cause file permission errors in the subprocess or in the current
   process.
   On Microsoft Windows, some of the possible file permission errors are
   not documented by Microsoft (thus very few programs written for Windows
   will react properly).
2) Letting subprocesses inherit open file handles is a security risk.
3) For the developer, finding "cause and effect" is *very* hard, in
   particular in multi-threaded programs, when the errors occur only in
   race conditions.
4) The problems arise in some of the standard library modules as well,
   e.g. shutil.copyfile.
5) Very few developers are aware of the possible problems.

As a work-around, one can replace open with
os.fdopen (os.open (..., ... | os.O_NOINHERIT), ... )
on Windows, but that's really ugly, hard to read,
may raise a different exception than open (OSError instead of IOError),
and needs careful work to take platform-specific code into account.

Here is a single-threaded example to demonstrate the effect:

import os
import subprocess
outf = open ("blah.tmp", "wt")
subprocess.Popen("notepad.exe")  # or whatever program you like, but
# It must be a program that does not exit immediately!
# Now the subprocess has inherited the open file handle
# We can still write:
outf.write ("Hello world!\n")
outf.close()
# But we can not rename the file (at least on Windows)
os.rename ("blah.tmp", "blah.txt")
# this fails with OSError: [Errno 13] Permission denied
# Similar problems with other file operations on non-Windows platforms.

Ok, in this little program one can see what is going wrong easily.

But what if the subprocess exits very quickly?
Then perhaps you see the OSError, perhaps not - depending on the process
scheduler of your operating system.

In a commercial multi-threaded daemon application, the error only occurred
under heavy load and was hard to reproduce - and it was even harder to find
the cause.
That's because cause and effect were in two different threads in two
completely different parts of the program:

- Thread A opens a file and starts to write data.
- Thread B starts a subprocess (which inherits the file handle from
  thread A!).
- Thread A continues writing to the file and closes it.
- And now it's a race condition:
  a) Thread A wants to rename the file - b) the subprocess exits.
  If a) is first: Error, if b) is first: no error.

To make things more complicated, even two subprocesses can disturb each
other.

Ideally, the new function should be implemented in C, because then the GIL
could prevent a thread switch between the os.open and the fcntl F_SETFD call.

Note that the problem described here arises not only for files, but for
sockets as well.
See bug 1222790: SimpleXMLRPCServer does not set FD_CLOEXEC
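
For sockets on Posix, the same flag can be set on the socket's descriptor.
This is a small sketch only; set_cloexec is a hypothetical helper name, not
something the socket module or SimpleXMLRPCServer provides, and the fcntl
module is not available on MS Windows.

```python
import fcntl      # Posix only
import socket


def set_cloexec(fd):
    """Hypothetical helper: mark a descriptor close-on-exec."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


# A listening socket that a child process should not inherit.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
set_cloexec(listener.fileno())

# Read the descriptor flags back to confirm the bit is set.
cloexec_is_set = bool(
    fcntl.fcntl(listener.fileno(), fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
listener.close()
```

A server would call such a helper right after creating the listening socket
and again for each accepted connection.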

Once there is an easy-to-use, platform-independent, documented builtin
"open_noinherit" (or a new mode flag for "open"), the standard library
should be reviewed. For each occurrence of "open" or "file", it should be
considered whether it is necessary for subprocesses to inherit the file.
If not, the call should be replaced with open_noinherit.
One example is shutil.copyfile, where open_noinherit should be used
instead of open.
The socket module is another candidate, I think - but I'm not sure about
that.

A nice effect of using "open_noinherit" is that - in many cases - one no
longer needs to specify close_fds = True when calling subprocess.Popen.
[Note that close_fds is *terribly* slow if MAX_OPEN_FILES is "big", e.g.
800, see bug 1663329]

Footnote:
While writing this mail, I typed "nonherit" instead of "noinherit" at
least 3 times.
So maybe someone can propose a better name?
Or a new mode flag character could be "p" (like "private" or "protected").

Henning


From kbk at shore.net  Sat Jun 23 04:17:15 2007
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri, 22 Jun 2007 22:17:15 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200706230217.l5N2HFld023393@hampton.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  385 open (+21) /  3790 closed (+21) /  4175 total (+42)
Bugs    : 1029 open (+43) /  6744 closed (+43) /  7773 total (+86)
RFE     :  262 open ( +4) /   291 closed ( +4) /   553 total ( +8)

New / Reopened Patches
______________________

syslog syscall support for SysLogLogger  (2007-05-02)
       http://python.org/sf/1711603  reopened by  luke-jr

dict size changes during iter  (2007-05-24)
       http://python.org/sf/1724999  opened by  Ali Gholami Rudi

Line ending bug SimpleXMLRPCServer  (2007-05-24)
       http://python.org/sf/1725295  opened by  bgrubbs

IDLE - cursor color configuration bug  (2007-05-25)
       http://python.org/sf/1725576  opened by  Tal Einat

Distutils default exclude doesn't match top level .svn  (2007-05-25)
       http://python.org/sf/1725737  opened by  Petteri Räty

ftplib.py: IndexError in voidresp occasionally  (2007-05-26)
       http://python.org/sf/1726172  opened by  kxroberto

Patch to vs 2005 build  (2007-05-26)
       http://python.org/sf/1726195  opened by  Joseph Armbruster

Windows Build Warnings  (2007-05-26)
       http://python.org/sf/1726196  opened by  Joseph Armbruster

Line iteration readability  (2007-05-26)
       http://python.org/sf/1726198  opened by  Joseph Armbruster

SimpleHTTPServer extensions_map  (2007-05-26)
       http://python.org/sf/1726208  opened by  Joseph Armbruster

ftplib and ProFTPD NLST 226 without 1xx response  (2007-05-27)
       http://python.org/sf/1726451  opened by  Kenneth Loafman

First steps towards new super (PEP 367)  (2007-05-28)
CLOSED http://python.org/sf/1727209  opened by  Guido van Rossum

move intern to sys, make intern() optionally warn  (2007-05-31)
       http://python.org/sf/1728741  opened by  Anthony Baxter

IDLE - configDialog layout cleanup  (2007-06-03)
       http://python.org/sf/1730217  opened by  Tal Einat

telnetlib: A callback for monitoring the telnet session  (2007-06-04)
       http://python.org/sf/1730959  opened by  Samuel Abels

BufReader, TextReader for PEP 3116 "New I/O"  (2007-06-04)
       http://python.org/sf/1731036  opened by  Ilguiz Latypov

Pruning threading.py from asserts  (2007-06-05)
CLOSED http://python.org/sf/1731049  opened by  Björn Lindqvist

Expect skips by platform  (2007-06-04)
       http://python.org/sf/1731169  opened by  Matt Kraai

Missing Py_DECREF in pysqlite_cache_display  (2007-06-05)
CLOSED http://python.org/sf/1731330  opened by  Tim Delaney

Improve doc for time.strptime  (2007-06-05)
       http://python.org/sf/1731659  opened by  Björn Lindqvist

urllib.urlretrieve/URLopener.retrieve - 'buff' argument  (2007-06-05)
       http://python.org/sf/1731720  opened by  Dariusz Suchojad

Document the constants in the socket module  (2007-06-06)
       http://python.org/sf/1732367  opened by  Björn Lindqvist

Allow T_LONGLONG to accepts ints  (2007-06-09)
CLOSED http://python.org/sf/1733960  opened by  Roger Upole

_lsprof.c:ptrace_enter_call assumes PyErr_* is clean  (2007-06-09)
       http://python.org/sf/1733973  opened by  Eyal Lotem

PY_LLONG_MAX and so on  (2007-06-09)
CLOSED http://python.org/sf/1734014  opened by  Hirokazu Yamamoto

Fast path for unicodedata.normalize()  (2007-06-10)
       http://python.org/sf/1734234  opened by  Rauli Ruohonen

patch for bug 1170311 "zipfile UnicodeDecodeError"  (2007-06-10)
       http://python.org/sf/1734346  opened by  Alexey Borzenkov

platform.py patch to support turbolinux  (2007-06-11)
CLOSED http://python.org/sf/1734945  opened by  Yayati_Turbolinux

Fix selectmodule.c compilation on GNU/Hurd  (2007-06-11)
       http://python.org/sf/1735030  opened by  Michael Banck

Kill StandardError  (2007-06-12)
CLOSED http://python.org/sf/1735485  opened by  Collin Winter

asyncore should handle also ECONNABORTED in recv  (2007-06-12)
       http://python.org/sf/1736101  opened by  billiejoex

asyncore/asynchat patches  (2007-06-12)
       http://python.org/sf/1736190  opened by  Josiah Carlson

EasyDialogs patch to remove aepack dependency  (2007-06-15)
       http://python.org/sf/1737832  opened by  has

help() can't find right source file  (2007-06-15)
       http://python.org/sf/1738179  opened by  Greg Couch

Add a -z interpreter flag to execute a zip file  (2007-06-19)
       http://python.org/sf/1739468  opened by  andy-chu

zipfile.testzip() using progressive file reads  (2007-06-19)
       http://python.org/sf/1739648  opened by  Grzegorz Adam Hankiewicz

Patch inspect.py for IronPython / Jython Compatibility  (2007-06-19)
       http://python.org/sf/1739696  opened by  Mike Foord

Accelerate attr dict lookups  (2007-06-19)
       http://python.org/sf/1739789  opened by  Eyal Lotem

Add reduce to functools in 2.6  (2007-06-19)
       http://python.org/sf/1739906  opened by  Christian Heimes

Fix Decimal.sqrt bugs described in #1725899  (2007-06-22)
       http://python.org/sf/1741308  opened by  Mark Dickinson

Patches Closed
______________

Make isinstance/issubclass overloadable for PEP 3119  (2007-04-26)
       http://python.org/sf/1708353  closed by  gvanrossum

subprocess: Support close_fds on Win32  (2007-02-26)
       http://python.org/sf/1669481  closed by  astrand

First steps towards new super (PEP 3135)  (2007-05-28)
       http://python.org/sf/1727209  closed by  gvanrossum

platform.system() returns incorrect value in Vista  (2007-05-28)
       http://python.org/sf/1726668  closed by  lemburg

Fix warnings related to PyLong_FromVoidPtr  (2007-05-05)
       http://python.org/sf/1713234  closed by  theller

fix 1668596: copy datafiles properly when package_dir is ' '  (2007-05-17)
       http://python.org/sf/1720897  closed by  nnorwitz

Hide iteration variable in list comprehensions  (2007-02-15)
       http://python.org/sf/1660500  closed by  gbrandl

urllib2 raises an UnboundLocalError if "auth-int" is the qop  (2007-02-24)
       http://python.org/sf/1667860  closed by  gbrandl

Pruning threading.py from asserts  (2007-06-04)
       http://python.org/sf/1731049  closed by  collinwinter

Missing Py_DECREF in pysqlite_cache_display  (2007-06-05)
       http://python.org/sf/1731330  closed by  gbrandl

Fix tests that assume they can write to Lib/test  (2006-07-12)
       http://python.org/sf/1520904  closed by  dgreiman

Allow specifying headers for MIME parts  (2007-02-22)
       http://python.org/sf/1666625  closed by  nnorwitz

x64 clean compile patch for _ctypes  (2007-05-09)
       http://python.org/sf/1715718  closed by  theller

bug fix: ctypes truncates 64-bit pointers  (2007-04-19)
       http://python.org/sf/1703286  closed by  theller

fixes non ansi c declarations in libffi  (2007-04-19)
       http://python.org/sf/1703300  closed by  theller

Allow T_LONGLONG to accepts ints  (2007-06-09)
       http://python.org/sf/1733960  closed by  loewis

PY_LLONG_MAX and so on  (2007-06-09)
       http://python.org/sf/1734014  closed by  loewis

bdist_deb - Debian packager  (2004-10-27)
       http://python.org/sf/1054967  closed by  jafo

platform.py patch to support turbolinux  (2007-06-11)
       http://python.org/sf/1734945  closed by  lemburg

Kill StandardError  (2007-06-12)
       http://python.org/sf/1735485  closed by  collinwinter

locale.getdefaultlocale() bug when _locale is missing  (2006-09-06)
       http://python.org/sf/1553427  closed by  gbrandl

New / Reopened Bugs
___________________

inspect.formatargspec last argument ignored  (2007-05-23)
CLOSED http://python.org/sf/1723875  opened by  Patrick Dobbs

Grammar error in Python Tutorial 2.5 section 8.3  (2007-05-23)
CLOSED http://python.org/sf/1724099  opened by  sampson

cPickle module doesn't work with universal line endings  (2007-05-23)
       http://python.org/sf/1724366  opened by  Geoffrey Bache

shlex.split problems on Windows  (2007-05-24)
       http://python.org/sf/1724822  opened by  Geoffrey Bache

bsddb.btopen . del of record doesn't update index  (2007-05-25)
       http://python.org/sf/1725856  opened by  Charles Hixson

bsddb.btopen . del of record doesn't update index  (2007-05-25)
CLOSED http://python.org/sf/1725862  opened by  Charles Hixson

decimal sqrt method doesn't use round-half-even  (2007-05-25)
       http://python.org/sf/1725899  opened by  Mark Dickinson

Typo in ctypes.wintypes definition of WIN32_FIND_DATA field  (2007-05-26)
CLOSED http://python.org/sf/1726026  opened by  Koby Kahane

bsddb.btopen . del of record doesn't update index  (2007-05-27)
CLOSED http://python.org/sf/1726299  opened by  Charles Hixson

platform.system() returns incorrect value in Vista  (2007-05-27)
CLOSED http://python.org/sf/1726668  opened by  Benjamin Leppard

Bug found in datetime for Epoch time = -1  (2007-05-28)
       http://python.org/sf/1726687  opened by  Martin Blais

subprocess: unreliability of returncode not clear from docs  (2007-05-28)
       http://python.org/sf/1727024  opened by  Dan O'Huiginn

'assert statement' in doc index links to AssertionError  (2007-05-29)
CLOSED http://python.org/sf/1727417  opened by  Åsmund Skjæveland

xmlrpclib waits indefinately  (2007-05-29)
       http://python.org/sf/1727418  opened by  Arno Stienen

64/32-bit issue when unpickling random.Random  (2007-05-29)
       http://python.org/sf/1727780  opened by  Charles

reading from malformed big5 document hangs cpython  (2007-05-30)
CLOSED http://python.org/sf/1728403  opened by  tsuraan

0.0 and -0.0 end up referring to the same object  (2007-05-31)
       http://python.org/sf/1729014  opened by  Johnnyg

os.stat producing incorrect / invalid results  (2007-05-31)
CLOSED http://python.org/sf/1729170  opened by  Joe

SVNVERSION redefined during compilation  (2007-05-31)
CLOSED http://python.org/sf/1729277  opened by  Brett Cannon

Error in example  (2007-05-31)
CLOSED http://python.org/sf/1729280  opened by  accdak

test_doctest fails when run in verbose mode  (2007-05-31)
       http://python.org/sf/1729305  opened by  Neal Norwitz

missing int->Py_ssize_t in documentation  (2007-06-01)
       http://python.org/sf/1729742  opened by  Brian Wellington

test_bsddb3 malloc corruption bug #1721309 broken in 2.6  (2007-06-02)
CLOSED http://python.org/sf/1729929  opened by  David Favor

2.5.1 latest svn fails test_curses and test_timeout  (2007-06-02)
       http://python.org/sf/1729930  opened by  David Favor

cStringIO no loonger accepts array.array objects  (2007-06-02)
       http://python.org/sf/1730114  opened by  reedobrien

tkFont.__eq__ gives type error  (2007-06-02)
       http://python.org/sf/1730136  opened by  L. Peter Deutsch

getattr([], '__eq__')(some-object) is NotImplemented  (2007-06-03)
CLOSED http://python.org/sf/1730322  opened by  L. Peter Deutsch

When Mesa is built with NPTL support, Python extensions link  (2007-06-03)
       http://python.org/sf/1730372  opened by  Gazi Alankus

strptime bug in time module  (2007-06-03)
CLOSED http://python.org/sf/1730389  opened by  Emma

__cmp__ present in type but not instance??  (2007-06-03)
CLOSED http://python.org/sf/1730401  opened by  L. Peter Deutsch

os._execvpe raises assignment error in python 3000 svn  (2007-06-04)
CLOSED http://python.org/sf/1730441  opened by  nifan

dict constructor accesses internal items of dict derivative  (2007-06-03)
       http://python.org/sf/1730480  opened by  Blake Ross

Importing a submodule after unloading its parent  (2007-06-04)
       http://python.org/sf/1731068  opened by  Blake Ross

tkinter memory leak problem  (2007-06-05)
       http://python.org/sf/1731706  opened by  Robert Hancock

race condition in subprocess module  (2007-06-05)
       http://python.org/sf/1731717  opened by  dsagal

python 2.6 latest fails test_socketserver.py  (2007-06-06)
       http://python.org/sf/1732145  opened by  David Favor

Unable to Start IDLE  (2007-06-06)
CLOSED http://python.org/sf/1732160  opened by  Kishore

Destructor behavior faulty  (2007-05-12)
       http://python.org/sf/1717900  reopened by  gbrandl

repr of 'nan' floats not parseable  (2007-06-06)
       http://python.org/sf/1732212  opened by  Pete Shinners

T_LONGLONG chokes on ints  (2007-06-06)
CLOSED http://python.org/sf/1732557  opened by  Roger Upole

Built-in open function fail. Too many file open  (2007-06-07)
CLOSED http://python.org/sf/1732629  opened by  Alex

socket makefile objects are not independent  (2007-06-07)
       http://python.org/sf/1732662  opened by  Jan Ondrej

Built-in open function fail. Too many file open  (2007-06-07)
       http://python.org/sf/1732686  reopened by  alexteo21

Built-in open function fail. Too many file open  (2007-06-07)
       http://python.org/sf/1732686  opened by  Alex

sqlite3 module trigger problem  (2007-06-07)
       http://python.org/sf/1733085  opened by  Oinopion

sqlite3.dll cannot be relocated  (2007-06-08)
       http://python.org/sf/1733134  opened by  Tim Delaney

slice type is unhashable  (2007-06-07)
       http://python.org/sf/1733184  opened by  L. Peter Deutsch

Solaris 64 bit LD_LIBRARY_PATH_64 needs to be set  (2007-06-08)
       http://python.org/sf/1733484  opened by  Brad Hochstetler

AIX Objects/buffereobject.c does not build on AIX  (2007-06-08)
CLOSED http://python.org/sf/1733488  opened by  Brad Hochstetler

AIX  Modules/unicodedata.c does not build  (2007-06-08)
CLOSED http://python.org/sf/1733493  opened by  Brad Hochstetler

Modules/ld_so_aix needs to strip path off of whichcc call  (2007-06-08)
       http://python.org/sf/1733509  opened by  Brad Hochstetler

zlib configure behaves differently than main configure  (2007-06-08)
       http://python.org/sf/1733513  opened by  Brad Hochstetler

setup.py incorrect for HP  (2007-06-08)
CLOSED http://python.org/sf/1733518  opened by  Brad Hochstetler

HP shared object option  (2007-06-08)
       http://python.org/sf/1733523  opened by  Brad Hochstetler

HP automatic build of zlib  (2007-06-08)
       http://python.org/sf/1733532  opened by  Brad Hochstetler

windows 64 bit builds  (2007-06-08)
CLOSED http://python.org/sf/1733536  opened by  Brad Hochstetler

HP 64 bit does not run  (2007-06-08)
       http://python.org/sf/1733544  opened by  Brad Hochstetler

AIX shared object build of python 2.5 does not work  (2007-06-08)
       http://python.org/sf/1733546  opened by  Brad Hochstetler

RuntimeWarning: tp_compare didn't return -1 or -2   (2007-06-08)
       http://python.org/sf/1733757  opened by  Fabio Zadrozny

Tkinter is not working on trunk (2.6)  (2007-06-09)
       http://python.org/sf/1733943  opened by  Hirokazu Yamamoto

mmap.mmap can overrun buffer  (2007-06-09)
       http://python.org/sf/1733986  opened by  Roger Upole

struct.Struct.size is not documented  (2007-06-09)
       http://python.org/sf/1734111  opened by  Yang Yang

sqlite3 causes memory read error  (2007-06-10)
       http://python.org/sf/1734164  opened by  atsuo ishimoto

Repr class from repr module ignores maxtuple attribute  (2007-06-11)
CLOSED http://python.org/sf/1734723  opened by  Jason Roberts

Tutorial Section 6.4  (2007-06-10)
CLOSED http://python.org/sf/1734732  opened by  Eric Naeseth

sitecustomize.py not found  (2007-06-11)
       http://python.org/sf/1734860  opened by  www.spirito.de

file.read() truncating strings under Windows  (2007-06-12)
       http://python.org/sf/1735418  opened by  cgkanchi

Add O_NOATIME to os module  (2007-06-12)
       http://python.org/sf/1735632  opened by  sam morris

Mac build fails if not building universal due to libtool  (2007-06-12)
       http://python.org/sf/1736103  opened by  Jack Jansen

os.popen('yes | echo hello') stuck  (2007-06-13)
       http://python.org/sf/1736483  opened by  Eric

dict reentrant/threading bug  (2007-06-13)
       http://python.org/sf/1736792  opened by  Adam Olsen

re.findall hangs python completely  (2007-06-14)
       http://python.org/sf/1737127  reopened by  abakker

re.findall hangs python completely  (2007-06-14)
       http://python.org/sf/1737127  opened by  Arno Bakker

Add/Remove programs shows Martin v Löwis  (2007-06-14)
       http://python.org/sf/1737210  opened by  Simon Dahlbacka

telnetlib.Telnet does not process DATA MARK (DM)  (2007-06-15)
       http://python.org/sf/1737737  opened by  Norbert Buchmüller

logging.exception() does not allow empty string  (2007-06-15)
CLOSED http://python.org/sf/1737864  opened by  Dmitrii Tisnek

parser error : out of memory error  (2007-06-15)
CLOSED http://python.org/sf/1738193  opened by  paul beard

Universal MacPython 2.5.1 installation fails  (2007-06-16)
       http://python.org/sf/1738250  opened by  Shinichiro Wachi

shutil.move doesn't work when only case changes  (2007-06-16)
       http://python.org/sf/1738441  opened by  Gabriel Gambetta

Python-2.5.1.tar.bz2 build failed at Centos-4.5 server  (2007-06-17)
       http://python.org/sf/1738559  opened by  shuvo

sqlite3 doc fix  (2007-06-17)
CLOSED http://python.org/sf/1738670  opened by  Mark Carter

Tutorial error in 3.1.2 Strings  (2007-06-17)
CLOSED http://python.org/sf/1738754  opened by  otan

Bug assigning list comprehension to __slots__ in python 2.5  (2007-06-18)
CLOSED http://python.org/sf/1739107  opened by  François Desloges

shutil.rmtree's error message is confusing  (2007-06-18)
CLOSED http://python.org/sf/1739115  opened by  Björn Lindqvist

Investigated ref leak report related to thread(regrtest.py -  (2007-06-18)
       http://python.org/sf/1739118  opened by  Hirokazu Yamamoto

Interactive help raise exception while listing modules  (2007-06-19)
CLOSED http://python.org/sf/1739659  opened by  Dmitry Vasiliev

xmlrpclib can no longer marshal Fault objects  (2007-06-19)
       http://python.org/sf/1739842  opened by  Mike Bonnet

asynchat should call "handle_close"  (2007-06-20)
       http://python.org/sf/1740572  opened by  billiejoex

python: Modules/gcmodule.c:240: update_refs: Assertion `gc->  (2007-06-20)
       http://python.org/sf/1740599  opened by  Sean

struct.pack("I", "foo"); struct.pack("L", "foo") should fail  (2007-06-21)
       http://python.org/sf/1741130  opened by  Thomas Heller

string formatter %x problem with indirectly given long   (2007-06-21)
       http://python.org/sf/1741218  opened by  Kenji Noguchi

defined format returns error  (2007-06-22)
CLOSED http://python.org/sf/1741524  opened by  Ted Bell

Odd UDP problems in socket library  (2007-06-22)
       http://python.org/sf/1741898  opened by  Jay Sherby

Bugs Closed
___________

inspect.formatargspec last argument ignored  (2007-05-23)
       http://python.org/sf/1723875  closed by  patrickcd

Crash in ctypes callproc function with unicode string arg  (2007-05-22)
       http://python.org/sf/1723338  closed by  theller

Grammar error in Python Tutorial 2.5 section 8.3  (2007-05-23)
       http://python.org/sf/1724099  closed by  gbrandl

Option -OO doesn't remove docstrings  (2007-05-21)
       http://python.org/sf/1722485  closed by  gbrandl

shlex.split problems on Windows  (2007-05-24)
       http://python.org/sf/1724822  closed by  gbrandl

docu enhancement for logging.handlers.SysLogHandler  (2007-05-17)
       http://python.org/sf/1720726  closed by  vsajip

tarfile stops expanding with long filenames  (2007-05-16)
       http://python.org/sf/1719898  closed by  gustaebel

bsddb.btopen . del of record doesn't update index  (2007-05-25)
       http://python.org/sf/1725862  closed by  nnorwitz

Typo in ctypes.wintypes definition of WIN32_FIND_DATA field  (2007-05-26)
       http://python.org/sf/1726026  closed by  theller

bsddb.btopen . del of record doesn't update index  (2007-05-26)
       http://python.org/sf/1726299  closed by  nnorwitz

'assert statement' in doc index links to AssertionError  (2007-05-29)
       http://python.org/sf/1727417  closed by  gbrandl

reading from malformed big5 document hangs cpython  (2007-05-31)
       http://python.org/sf/1728403  closed by  perky

os.stat producing incorrect / invalid results  (2007-05-31)
       http://python.org/sf/1729170  closed by  loewis

SVNVERSION redefined during compilation  (2007-06-01)
       http://python.org/sf/1729277  closed by  loewis

Error in example  (2007-05-31)
       http://python.org/sf/1729280  closed by  nnorwitz

distutils chops the first character of filenames  (2007-02-25)
       http://python.org/sf/1668596  closed by  nnorwitz

test_bsddb3 malloc corruption bug #1721309 broken in 2.6  (2007-06-02)
       http://python.org/sf/1729929  closed by  nnorwitz

Compiler is not thread safe?  (2007-05-16)
       http://python.org/sf/1720241  closed by  loewis

getattr([], '__eq__')(some-object) is NotImplemented  (2007-06-03)
       http://python.org/sf/1730322  closed by  collinwinter

make testall shows many glibc detected malloc corruptions  (2007-05-18)
       http://python.org/sf/1721309  closed by  nnorwitz

strptime bug in time module  (2007-06-03)
       http://python.org/sf/1730389  closed by  bcannon

__cmp__ present in type but not instance??  (2007-06-03)
       http://python.org/sf/1730401  closed by  bcannon

os._execvpe raises assignment error in python 3000 svn  (2007-06-03)
       http://python.org/sf/1730441  closed by  nnorwitz

Const(None) in compiler.ast.Return.value  (2007-05-09)
       http://python.org/sf/1715581  closed by  collinwinter

CGIHttpServer fails if python exe has spaces  (2007-05-02)
       http://python.org/sf/1711608  closed by  collinwinter

Unable to Start IDLE  (2007-06-06)
       http://python.org/sf/1732160  closed by  nnorwitz

T_LONGLONG chokes on ints  (2007-06-07)
       http://python.org/sf/1732557  closed by  loewis

Built-in open function fail. Too many file open  (2007-06-07)
       http://python.org/sf/1732629  closed by  gbrandl

Built-in open function fail. Too many file open  (2007-06-07)
       http://python.org/sf/1732686  closed by  loewis

Built-in open function fail. Too many file open  (2007-06-07)
       http://python.org/sf/1732686  closed by  gbrandl

urllib2 has memory leaks  (2006-02-13)
       http://python.org/sf/1430435  closed by  gbrandl

AIX Objects/buffereobject.c does not build on AIX  (2007-06-08)
       http://python.org/sf/1733488  closed by  loewis

AIX  Modules/unicodedata.c does not build  (2007-06-08)
       http://python.org/sf/1733493  closed by  perky

setup.py incorrect for HP  (2007-06-08)
       http://python.org/sf/1733518  closed by  loewis

windows 64 bit builds  (2007-06-08)
       http://python.org/sf/1733536  closed by  loewis

ctypes Fundamental data types  (2007-04-14)
       http://python.org/sf/1700455  closed by  theller

Repr class from repr module ignores maxtuple attribute  (2007-06-10)
       http://python.org/sf/1734723  closed by  nnorwitz

Tutorial Section 6.4  (2007-06-10)
       http://python.org/sf/1734732  closed by  nnorwitz

logging.exception() does not allow empty string  (2007-06-15)
       http://python.org/sf/1737864  closed by  gbrandl

parser error : out of memory error  (2007-06-15)
       http://python.org/sf/1738193  closed by  nnorwitz

sqlite3 doc fix  (2007-06-17)
       http://python.org/sf/1738670  closed by  nnorwitz

Tutorial error in 3.1.2 Strings  (2007-06-17)
       http://python.org/sf/1738754  closed by  nnorwitz

Bug assigning list comprehension to __slots__ in python 2.5  (2007-06-18)
       http://python.org/sf/1739107  closed by  gbrandl

shutil.rmtree's error message is confusing  (2007-06-18)
       http://python.org/sf/1739115  closed by  gbrandl

Interactive help raise exception while listing modules  (2007-06-19)
       http://python.org/sf/1739659  closed by  gbrandl

defined format returns error  (2007-06-22)
       http://python.org/sf/1741524  closed by  gbrandl

New / Reopened RFE
__________________

provide a shlex.split alternative for Windows shell syntax  (2007-05-24)
       http://python.org/sf/1724822  reopened by  gbrandl

add operator.fst and snd functions  (2007-05-28)
       http://python.org/sf/1726697  opened by  paul rubin

add itertools.ichain function and count.getvalue  (2007-05-28)
CLOSED http://python.org/sf/1726707  opened by  paul rubin

-q (quiet) option for python interpreter  (2007-05-30)
       http://python.org/sf/1728488  opened by  Marcin Wojdyr

ZipFile CallBack Needed...  (2007-06-08)
       http://python.org/sf/1733259  opened by  durumdara

Newer reply format for imap commands in imaplib.py  (2007-06-12)
       http://python.org/sf/1735509  opened by  Naoyuki Tai

make colon optional  (2007-06-19)
CLOSED http://python.org/sf/1739678  opened by  Chris

add multi-line comments  (2007-06-19)
CLOSED http://python.org/sf/1739679  opened by  Chris

RFE Closed
__________

add itertools.ichain function and count.getvalue  (2007-05-27)
       http://python.org/sf/1726707  closed by  rhettinger

new functool: "defaults" decorator  (2007-05-15)
       http://python.org/sf/1719222  closed by  rhettinger

make colon optional  (2007-06-19)
       http://python.org/sf/1739678  closed by  gbrandl

add multi-line comments  (2007-06-19)
       http://python.org/sf/1739679  closed by  gbrandl


From martin at v.loewis.de  Sat Jun 23 08:41:54 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 08:41:54 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid	problems with subprocesses and security risks
In-Reply-To: <003a01c7b515$e56a7b50$6401a8c0@max>
References: <003a01c7b515$e56a7b50$6401a8c0@max>
Message-ID: <467CC0B2.2010700@v.loewis.de>

Henning von Bargen wrote:
> I'd like to propose a new function "open_noinherit"
> or maybe even a new mode flag "n" for the builtin "open"
> (see footnote for the names).

Do you have a patch implementing that feature? I believe
it's unimplementable in Python 2.x: open() is mapped
to fopen(), which does not support O_NOINHERIT.

If you don't want the subprocess to inherit handles,
why don't you just specify close_fds=True when creating
the subprocess?

Regards,
Martin

From martin at v.loewis.de  Sat Jun 23 09:32:42 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 09:32:42 +0200
Subject: [Python-Dev] bzr on dinsdale
Message-ID: <467CCC9A.7050502@v.loewis.de>

If I do "bzr status" in dinsdale:/etc/apache2, I get

bzr: ERROR: bzrlib.errors.BzrCheckError: Internal check failed: file
u'/etc/init.d/stop-bootlogd' entered as kind 'symlink' id
'stopbootlogd-20070303140018-fe340b888f6e9c69', now of kind 'file'

Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line 611,
in run_bzr_catch_errors
    return run_bzr(argv)
......
BzrCheckError: Internal check failed: file u'/etc/init.d/stop-bootlogd'
entered as kind 'symlink' id
'stopbootlogd-20070303140018-fe340b888f6e9c69', now of kind 'file'

bzr 0.11.0 on python 2.4.4.final.0 (linux2)
arguments: ['/usr/bin/bzr', 'status']

** please send this report to bazaar-ng at lists.ubuntu.com

Can somebody experienced with bzr please help?

Regards,
Martin

From henning.vonbargen at arcor.de  Sat Jun 23 10:04:33 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Sat, 23 Jun 2007 10:04:33 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
Message-ID: <001901c7b56d$22c83f80$6401a8c0@max>

"""
OT: Argh - my email address is visible in the posting - I am doomed!
"""

----- Original Message ----- 
> Martin v. Löwis wrote:
>
> Do you have a patch implementing that feature? I believe
> it's unimplementable in Python 2.x: open() is mapped
> to fopen(), which does not support O_NOINHERIT.

Yes, I have a patch implemented in pure Python.

I have the code on my workplace PC (I am writing from home now,
that's why I said I'll post the code later).

The patch uses os.fdopen ( os.open (..., ...), ...).
It then translates OSError into IOError in order to raise the same class
of exception as open().

Unfortunately, the patch ignores the bufsize argument,
so it is only a prototype at this time.

I know that open() is mapped to fopen() and fopen does not
support O_NOINHERIT.
Thus a correct patch has to be implemented at the C level.
It should use open and fdopen instead of fopen - just like the
Python prototype.
AFAIK in the C stdlib implementation, fopen is implemented
based on open anyway.
BTW to find out what happens, I had to look at the source distribution
for the first time after 3 years of using Python.

> If you don't want the subprocess to inherit handles,
> why don't you just specify close_fds=True when creating
> the subprocess?

The subprocess module is a great piece of code,
but it has its weaknesses. "close_fds" is one of them.
subprocess.py fails on MS Windows if I specify close_fds.

And it *cannot* be fixed for MS Windows in the subprocess module.
This is due to the different way MS Windows handles handles :-)
in child process creation:

In Posix, you can just work through the file numbers range
and close the ones you don't want/need in the subprocess.
This is how close_fds works internally.
It closes the fds starting from 3 to MAX_FDs-1, thus only stdin,
stdout and stderr are inherited.
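
The POSIX behaviour described above can be sketched roughly like this. It is an illustration only, not the actual subprocess code; the function name close_fds_from and its parameters are made up for the example. In a real program the loop would run in the child between fork() and exec().

```python
import os


def close_fds_from(low=3, high=None):
    """Close every descriptor in [low, high), skipping ones that are not open."""
    if high is None:
        # POSIX only: the per-process limit on open descriptors.
        high = os.sysconf("SC_OPEN_MAX")
    for fd in range(low, high):
        try:
            os.close(fd)
        except OSError:
            pass  # this descriptor number was not in use
```

Starting at 3 leaves stdin, stdout and stderr (descriptors 0, 1 and 2) untouched. The loop visits every possible descriptor number, so its cost grows with the limit rather than with the number of files actually open.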

On MS Windows, AFAIK (correct me if I am wrong), you can
only choose either to inherit handles or not *as a whole*
 - including stdin, stdout and stderr -, when calling CreateProcess.
Each handle has a security attribute that specifies whether the
handle should be inherited or not - but this has to be specified
when creating the handle (in the Windows CreateFile API internally).
Thus, on MS Windows, you can either choose to inherit all
files opened with "open" + [stdin, stdout, stderr],
or to not inherit any files (meaning even stdin, stdout and stderr
will not be inherited).

In a platform-independent program, close_fds is therefore not an option.

Henning



From martin at v.loewis.de  Sat Jun 23 11:17:20 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 11:17:20 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid problems with subprocesses and security risks
In-Reply-To: <001901c7b56d$22c83f80$6401a8c0@max>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max>
Message-ID: <467CE520.3000405@v.loewis.de>

> Yes, I have a patch implemented in pure Python.
> 
> I got the code on my workplace PC (now I am writing from home,
> that's why I said I'll post the code later).
> 
> The patch uses os.fdopen ( os.open (..., ...), ...).
> It then translates OSError into IOError in order to raise the same class
> of exception as open().

Hmm. I don't think I could accept such an implementation
(whether in Python or in C). That's very hackish.

> AFAIK in the C stdlib implementation, fopen is implemented
> based on open anyway.

Sure - and in turn, open is implemented on CreateFile.
However, I don't think I would like to see an fopen
implementation in Python. Python 3 will drop stdio entirely;
for 2.x, I'd be cautious to change things because that
may break other things in an unexpected manner.

> On MS Windows, AFAIK (correct me if I am wrong), you can
> only choose either to inherit handles or not *as a whole*
> - including stdin, stdout and stderr -, when calling CreateProcess.

I'm not sure. In general, that seems to be true. However,
according to the ReactOS sources at

http://www.reactos.org/generated/doxygen/dd/dda/dll_2win32_2kernel32_2process_2create_8c-source.html#l00624

Windows will duplicate stdin,stdout,stderr from the parent
process even if bInheritHandles is false, provided that
no handles are specified in the startupinfo, and provided
that the program to be started is a console (CUI) program.

> Each handle has a security attribute that specifies whether the
> handle should be inherited or not - but this has to be specified
> when creating the handle (in the Windows CreateFile API internally).

Not necessarily. You can turn on the flag later, through
SetHandleInformation.

> Thus, on MS Windows, you can either choose to inherit all
> files opened with "open" + [stdin, stdout, stderr],
> or to not inherit any files (meaning even stdin, stdout and stderr
> will not be inherited).
> 
> In a platform-independent program, close_fds is therefore not an option.

... assuming you care about whether stdin,stdout,stderr are inherited
to GUI programs. If the child process makes no use of stdin/stdout, you
can safely set close_fds to true.

Regards,
Martin

From dima at hlabs.spb.ru  Sat Jun 23 11:38:58 2007
From: dima at hlabs.spb.ru (Dmitry Vasiliev)
Date: Sat, 23 Jun 2007 13:38:58 +0400
Subject: [Python-Dev] bzr on dinsdale
In-Reply-To: <467CCC9A.7050502@v.loewis.de>
References: <467CCC9A.7050502@v.loewis.de>
Message-ID: <467CEA32.9040605@hlabs.spb.ru>

Martin v. Löwis wrote:
> If I do "bzr status" in dinsdale:/etc/apache2, I get
> 
> BzrCheckError: Internal check failed: file u'/etc/init.d/stop-bootlogd'
> entered as kind 'symlink' id
> 'stopbootlogd-20070303140018-fe340b888f6e9c69', now of kind 'file'
> 
> bzr 0.11.0 on python 2.4.4.final.0 (linux2)
> arguments: ['/usr/bin/bzr', 'status']
> 
> ** please send this report to bazaar-ng at lists.ubuntu.com
> 
> Can somebody experienced with bzr please help?

Bzr allows kind changes only starting from version 0.15; for older versions
you should first remove the file from version control with 'bzr rm' and then
add it again with 'bzr add'.

-- 
Dmitry Vasiliev <dima at hlabs.spb.ru>
http://hlabs.spb.ru

From martin at v.loewis.de  Sat Jun 23 13:18:48 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 13:18:48 +0200
Subject: [Python-Dev] bzr on dinsdale
In-Reply-To: <467CEA32.9040605@hlabs.spb.ru>
References: <467CCC9A.7050502@v.loewis.de> <467CEA32.9040605@hlabs.spb.ru>
Message-ID: <467D0198.8010103@v.loewis.de>

> Bzr allows kind changes only starting from version 0.15; for older versions
> you should first remove the file from version control with 'bzr rm' and then
> add it again with 'bzr add'.

Thanks! that worked fine.

Regards,
Martin


From henning.vonbargen at arcor.de  Sat Jun 23 14:32:24 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Sat, 23 Jun 2007 14:32:24 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
Message-ID: <000801c7b592$8e13e670$6401a8c0@max>


> "Martin v. Löwis" wrote:

>> Yes, I have a patch implemented in pure Python.
>>
>> I got the code on my workplace PC (now I am writing from home,
>> that's why I said I'll post the code later).
>>
>> The patch uses os.fdopen ( os.open (..., ...), ...).
>> It then translates OSError into IOError in order to raise the same class
>> of exception as open().
>
> Hmm. I don't think I could accept such an implementation
> (whether in Python or in C). That's very hackish.

Well, if this is your opinion...
Take a look at the fopen implementation in stdio's fopen.c:
# I found it via Google Code Search in the directory src/libc/ansi/stdio/fopen.c
# of http://clio.rice.edu/djgpp/win2k/djsr204_alpha.zip

FILE *
fopen(const char *file, const char *mode)
{
  FILE *f;
  int fd, rw, oflags = 0;
  ...
  fd = open(file, oflags, 0666);
  if (fd < 0)
    return NULL;

  f->_cnt = 0;
  f->_file = fd;
  f->_bufsiz = 0;
  ...
  }
  ...
  return f;
}

As you can see, at the C level, basically "fopen" is "open" with a little
code around it to parse flags etc. It's the same kind of hackish code.
And (apart from the ignored bufsize argument) the prototype is working fine.
I have to admit, though, that I am only using it on regular files.

Anyway, I don't want to argue about the implementation of a patch.
The fact is that until now the python programmer does not have an
easy, platform-independent option to open files non-inheritable.
As you mentioned yourself, the only way to work around it
in a platform-independent manner, IS VERY HACKISH.
So, shouldn't this hackish-ness better be hidden in the library
instead of leaving it as an exercise to the common programmer?

The kind of errors I mentioned ("permission denied" errors that
seem to occur without an obvious reason) have cost me at least
two weeks of debugging the hard way (with ProcessExplorer etc)
and caused my manager to lose his trust in Python altogether...
I think it is well worth the effort to keep this trouble away from
the Python programmers if possible.

And throughout the standard library modules, "open" is used,
causing these problems as soon as sub-processes come into play.

Apart from shutil.copyfile, other examples of using open that can cause
trouble are in socket.py (tell me any good reason why socket handles
should be inherited to child processes) and even in logging.py.

For example, I used RotatingFileHandler for logging my daemon
program activity. Sometimes, the logging  itself caused errors,
when a still-running child process had inherited the log file handle
and log rotation occurred.

>
>> AFAIK in the C stdlib implementation, fopen is implemented
>> based on open anyway.
>
> Sure - and in turn, open is implemented on CreateFile.
> However, I don't think I would like to see an fopen
> implementation in Python. Python 3 will drop stdio entirely;
> for 2.x, I'd be cautious to change things because that
> may break other things in an unexpected manner.

Yeah, if you think it should not be included in 2.x,
then the handle inheritance problem should at least be considered
in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard
Modules")]

>
>> On MS Windows, AFAIK (correct me if I am wrong), you can
>> only choose either to inherit handles or not *as a whole*
>> - including stdin, stdout and stderr -, when calling CreateProcess.
>
> I'm not sure. In general, that seems to be true. However,
> according to the ReactOS sources at
>
> http://www.reactos.org/generated/doxygen/dd/dda/dll_2win32_2kernel32_2process_2create_8c-source.html#l00624
>
> Windows will duplicate stdin,stdout,stderr from the parent
> process even if bInheritHandles is false, provided that
> no handles are specified in the startupinfo, and provided
> that the program to be started is a console (CUI) program.
>
>> Each handle has a security attribute that specifies whether the
>> handle should be inherited or not - but this has to be specified
>> when creating the handle (in the Windows CreateFile API internally).
>
> Not necessarily. You can turn on the flag later, through
> SetHandleInformation.

So do you think that a working "close_fds" could be implemented
for Windows as well?
Explicitly turning off the inheritance flag for all child handles except
stdin, stdout and stderr in subprocess / popen (the equivalent to
what close_fds does for Posix) - that's what I call hackish.
And I doubt that it is possible at all, for three reasons:
- you have to KNOW all the handles.
- due to the different process creation in Windows (there's no fork),
  you had to set the inheritance flags afterwards
- all this is not thread-safe.

>
>> Thus, on MS Windows, you can either choose to inherit all
>> files opened with "open" + [stdin, stdout, stderr],
>> or to not inherit any files (meaning even stdin, stdout and stderr
>> will not be inherited).
>>
>> In a platform-independent program, close_fds is therefore not an option.
>
> ... assuming you care about whether stdin,stdout,stderr are inherited
> to GUI programs. If the child process makes no use of stdin/stdout, you
> can safely set close_fds to true.

Hmm...
In the bug 1663329 I posted ("subprocess/popen close_fds perform poor if
SC_OPEN_MAX is hi"),
you suggested:
"""
- you should set the FD_CLOEXEC flag on all file descriptors you don't
  want to be inherited, using fcntl(fd, F_SETFD, 1)
"""

Apart from the fact that this is not possible on MS Windows, it won't solve
the problem!
(Because then I couldn't use all those standard modules that use open
*without* FD_CLOEXEC).

The fact is that the combination ("multi-threading", "subprocess creation",
"standard modules")
simply *does not work* flawlessly and produces errors that are hard to
understand.
And probably most programmers are not even aware of the problem.
That's the main reason why I posted here.

And, in my experience, programs tend to get more complex and in the future
I expect to see more multi-threaded Python-programs.
So the problem will not vanish - we will see it more often than we like...

Regards,
Henning



From apt.shansen at gmail.com  Sat Jun 23 17:39:38 2007
From: apt.shansen at gmail.com (Stephen Hansen)
Date: Sat, 23 Jun 2007 08:39:38 -0700
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
In-Reply-To: <000801c7b592$8e13e670$6401a8c0@max>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max>
Message-ID: <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>

> The kind of errors I mentioned ("permission denied" errors that
> seem to occur without an obvious reason) have cost me at least
> two weeks of debugging the hard way (with ProcessExplorer etc)
> and caused my manager to lose his trust in Python altogether...
> I think it is well worth the effort to keep this trouble away from
> the Python programmers if possible.
>
> And throughout the standard library modules, "open" is used,
> causing these problems as soon as sub-processes come into play.
>
> Apart from shutil.copyfile, other examples of using open that can cause
> trouble are in socket.py (tell me any good reason why socket handles
> should be inherited to child processes) and even in logging.py.
>
> For example, I used RotatingFileHandler for logging my daemon
> program activity. Sometimes, the logging  itself caused errors,
> when a still-running child process had inherited the log file handle
> and log rotation occurred.


I just wanted to express to the group at large that these experiences aren't
just Henning's; we spent a *tremendous* amount of time and effort debugging
serious problems that arose from file handles getting shared to subprocesses
where it wasn't really expected. Specifically, the RotatingFileHandler
example above. It blatantly just breaks when subprocesses are used and it's
an extremely obtuse process to discover why.

It was very costly to the company because it came up at a bad time and was
*so* obtuse of an error. At first it looked like some sort of thread-safety
problem, so a lot of prying went into that before we got stumped... after
all, we *knew* no other process touched that file, and the logging module
(and RotatingFileHandler) claimed and looked thread-safe, so.. how could it
be having a Permission Denied error when it very clearly is closing the file
before rotating it? Eventually the culprit was found, but it was very
painful.

A couple similar issues have arisen since, and they're only slightly easier
to debug once you are expecting it. But the fact that the simple and obvious
features provided in the stdlib break as a result of you launching a
subprocess at some point sorta sucks :)

So, yeah. Anything even remotely or vaguely approaching Henning's patch
would be really, really appreciated.

--SH

From henning.vonbargen at arcor.de  Sat Jun 23 19:01:45 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Sat, 23 Jun 2007 19:01:45 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max>
	<7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>
Message-ID: <001201c7b5b8$2ea03650$6401a8c0@max>

Stephen,

thank you for speaking it out loud on python-dev.
And you know better English words like "tremendous"
and "obtuse" (whatever that means:-)
that express what a PITA this really is.

When I said it took me two weeks, that's actually not the truth.
It was even more.
The first problem was with RotatingFileHandler, and just like you,
I first thought it was a threading problem.
Thus I wrote my own version of RotatingFileHandler, which
builds a new log file name with a timestamp instead of
renaming the old log files.

Actually, the point when I found out that indeed subprocesses
were causing problems was when I had the program running on
about 50 computers (for different clients) and for some clients
the program would run very well, while for other clients there
were often errors - suddenly it came to my mind that the clients
with the errors were those who used a subprocess for sending
e-mail via MAPI, whereas the clients who didn't experience
problems were those who used smtplib for sending e-mail
(no subprocesses).
And then it took me a few days to write my replacement open
function and to replace each occurrence of "open" with the
replacement function.
And then, another few days later, a client told me that the errors
*still* occurred (although in rare cases).
At first I built a lot of tracing and debugging into the MAPI subprocess
"sendmail.exe".
Finally I found out that it was actually shutil.copyfile that caused the
error.
Of course, I hadn't searched for "open" in the whole bunch of standard
modules...

Henning



From martin at v.loewis.de  Sat Jun 23 20:09:17 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 20:09:17 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid problems with subprocesses and security risks
In-Reply-To: <000801c7b592$8e13e670$6401a8c0@max>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max>
Message-ID: <467D61CD.9000403@v.loewis.de>

> As you can see, at the C level, basically "fopen" is "open" with a
> little code around it to parse flags etc. It's the same kind of hackish code.

"little code" is quite an understatement. In Microsoft's C library
(which we would have to emulate), the argument parsing of fopen is
120 lines of code. In addition, that code changes across compiler
versions (where VS 2005 adds additional error checking).

> Anyway, I don't want to argue about the implementation of a patch.
> The fact is that until now the python programmer does not have an
> easy, platform-independent option to open files non-inheritable.
> As you mentioned yourself, the only way to work around it
> in a platform-independent manner, IS VERY HACKISH.
> So, shouldn't this hackish-ness better be hidden in the library
> instead of leaving it as an exercise to the common programmer?

Putting it into the library is fine. However, we need to find
an implementation strategy that meets the user's needs, and
is still maintainable.

Python 3 will offer a clean solution, deviating entirely from
stdio. For 2.x, we need to find a better solution than the
one you proposed.

> I think it is well worth the effort to keep this trouble away from
> the Python programmers if possible.

I don't argue about efforts - I argue about your proposed solution.

> Apart from shutil.copyfile, other examples of using open that can cause
> trouble are in socket.py (tell me any good reason why socket handles
> should be inherited to child processes) and even in logging.py.

On Unix, it is *very* common to inherit socket handles to child
processes. The parent process opens the socket, and the child
processes perform accept(3). This allows many processes to
serve requests on the same port. In Python,
SocketServer.Forking*Server rely on this precise capability.
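
A minimal sketch of that pattern, assuming a POSIX system with fork(); the function name serve_one_in_child and the reply text are made up for illustration, and this is not how SocketServer itself is written:

```python
import os
import socket


def serve_one_in_child(listener):
    """Fork; the child accepts one connection on the inherited socket."""
    pid = os.fork()
    if pid == 0:
        # Child: the listening socket was inherited across fork().
        try:
            conn, _ = listener.accept()
            conn.sendall(b"hello from child %d" % os.getpid())
            conn.close()
        finally:
            os._exit(0)  # never fall back into the parent's code
    return pid
```

The parent creates and binds the listening socket once; every child forked afterwards can call accept() on the same socket, which is why inheriting socket handles is often exactly what is wanted on Unix.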

>> Sure - and in turn, open is implemented on CreateFile.
>> However, I don't think I would like to see an fopen
>> implementation in Python. Python 3 will drop stdio entirely;
>> for 2.x, I'd be cautious to change things because that
>> may break other things in an unexpected manner.
> 
> Yeah, if you think it should not be included in 2.x,
> then the handle inheritance problem should at least be considered
> in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard
> Modules")]

I didn't say that a solution shouldn't be included in 2.x.
I said *your* solution shouldn't be. In 3.x, your solution
won't apply, since Python won't be using stdio (so
fdopen becomes irrelevant).

>>> Each handle has a security attribute that specifies whether the
>>> handle should be inherited or not - but this has to be specified
>>> when creating the handle (in the Windows CreateFile API internally).
>>
>> Not necessarily. You can turn on the flag later, through
>> SetHandleInformation.
> 
> So do you think that a working "close_fds" could be implemented
> for Windows as well?

No. close_fds should have the semantics of only closing the handles
for that subprocess. SetHandleInformation applies to the parent
process, and *all* subprocesses. So this is different from close_fds.

> Explicitly turning off the inheritance flag for all child handles except
> stdin, stdout and stderr in subprocess / popen (the equivalent to
> what close_fds does for Posix) - that's what I call hackish.

I didn't propose that, and it wouldn't be the equivalent. In POSIX,
the closing occurs in the child process. This is not possible on
Windows, as there is no fork().

> And I doubt that it is possible at all, for three reasons:
> - you have to KNOW all the handles.
> - due to the different process creation in Windows (there's no fork),
>  you had to set the inheritance flags afterwards
> - all this is not thread-safe.

All true, and I did not suggest to integrate SetHandleInformation
into subprocess. I *ONLY* claimed that you can change the flag
after the file was opened.

With that API, it would be possible to provide cross-platform
access to the close-on-exec flag. Applications interested in setting
it could then set it right after opening the file.
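
As an illustration of what such an API can look like: Python 3.4 and later provide os.set_inheritable() and os.get_inheritable(), which did not exist when this thread was written. The wrapper below is a hypothetical sketch on top of them, not something proposed in the thread.

```python
import os


def set_close_on_exec(file_or_fd, close_on_exec=True):
    """Mark an open file (or raw descriptor) as not inherited by children."""
    fd = file_or_fd if isinstance(file_or_fd, int) else file_or_fd.fileno()
    # "Inheritable" is the inverse of "close-on-exec".
    os.set_inheritable(fd, not close_on_exec)
    return fd
```

An application would call it right after opening the file, for example set_close_on_exec(open(path)). The flag still applies to every later subprocess, so it is not a per-subprocess switch the way close_fds is.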

> Apart from the fact that this is not possible on MS Windows, it won't
> solve the problem!
> (Because then I couldn't use all those standard modules that use open
> *without* FD_CLOEXEC).
> 
> The fact is that the combination ("multi-threading", "subprocess
> creation", "standard modules")
> simply *does not work* flawlessly and produces errors that are hard to
> understand.
> And probably most programmers are not even aware of the problem.
> That's the main reason why I posted here.

I don't see how your proposed change solves that. If there was
an "n" flag, then the modules in the standard library that open
files still won't use it.

Regards,
Martin

From amk at amk.ca  Sat Jun 23 20:36:45 2007
From: amk at amk.ca (A.M. Kuchling)
Date: Sat, 23 Jun 2007 14:36:45 -0400
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
In-Reply-To: <7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max>
	<7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>
Message-ID: <20070623183645.GA10808@andrew-kuchlings-computer.local>

On Sat, Jun 23, 2007 at 08:39:38AM -0700, Stephen Hansen wrote:
> I just wanted to express to the group at large that these experiences aren't
> just Henning's; we spent a *tremendous* amount of time and effort debugging
> serious problems that arose from file handles getting shared to subprocesses
> where it wasn't really expected.

I've also encountered this when writing programs that are SCGI servers
that do a fork.  SCGI is like FastCGI; the HTTP server passes requests
to a local server using a custom protocol.  If the fork doesn't close
the SCGI server port, then Apache does nothing until the forked
subprocess exits, because the subprocess is keeping the request socket
open and alive.  

One fix is to always use subprocess.Popen and specify that
close_fds=True, which wasn't difficult for me, but I can imagine that
an easy way to set close-on-exec would be simpler in other cases.
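
A small sketch of the effect of close_fds on a POSIX system, written for present-day Python 3; the helper names and the child script are made up for the example. A pipe descriptor is deliberately made inheritable, and a child process reports whether it can still see it.

```python
import os
import subprocess
import sys

# Child script: report whether the descriptor number given in argv is open.
CHILD = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "    print('open')\n"
    "except OSError:\n"
    "    print('closed')\n"
)


def fd_state_in_child(fd, close_fds):
    """Run the child script and return 'open' or 'closed' for fd."""
    out = subprocess.check_output(
        [sys.executable, "-c", CHILD, str(fd)], close_fds=close_fds
    )
    return out.decode().strip()


def demo():
    r, w = os.pipe()
    # Python 3 creates descriptors non-inheritable; undo that for the demo.
    os.set_inheritable(w, True)
    try:
        return fd_state_in_child(w, False), fd_state_in_child(w, True)
    finally:
        os.close(r)
        os.close(w)
```

The first call lets the child inherit the descriptor; the second passes close_fds=True, so the child starts without it.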

--amk

From martin at v.loewis.de  Sat Jun 23 21:34:55 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 21:34:55 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid problems with subprocesses and security risks
In-Reply-To: <20070623183645.GA10808@andrew-kuchlings-computer.local>
References: <003a01c7b515$e56a7b50$6401a8c0@max>
	<467CC0B2.2010700@v.loewis.de>	<001901c7b56d$22c83f80$6401a8c0@max>
	<467CE520.3000405@v.loewis.de>	<000801c7b592$8e13e670$6401a8c0@max>	<7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>
	<20070623183645.GA10808@andrew-kuchlings-computer.local>
Message-ID: <467D75DF.6070509@v.loewis.de>

> One fix is to always use subprocess.Popen and specify that
> close_fds=True, which wasn't difficult for me, but I can imagine that
> an easy way to set close-on-exec would be simpler in other cases.

I think the complaint is not so much about simplicity, but correctness.
close_fds also closes stdin/stdout/stderr, which might be undesirable
and differs from POSIX.

In any case, providing a uniform set-close-on-exec looks fine to me,
provided it is implementable on all interesting platforms.

I'm -0 on adding "n" to open, and -1 for adding if it means to
reimplement fopen.

Regards,
Martin

From matthieu.brucher at gmail.com  Sat Jun 23 22:03:41 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 23 Jun 2007 22:03:41 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
In-Reply-To: <467D75DF.6070509@v.loewis.de>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max>
	<7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>
	<20070623183645.GA10808@andrew-kuchlings-computer.local>
	<467D75DF.6070509@v.loewis.de>
Message-ID: <e76aa17f0706231303t14f6fa85i6192dbdc063791b5@mail.gmail.com>

Hi,


> I think the complaint is not so much about simplicity, but correctness.
> close_fds also closes stdin/stdout/stderr, which might be undesirable
> and differs from POSIX.
>

According to the docs, stdin/stdout and stderr are not closed (
http://docs.python.org/lib/node529.html)


Matthieu

From martin at v.loewis.de  Sat Jun 23 23:12:32 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 23 Jun 2007 23:12:32 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid problems with subprocesses and security risks
In-Reply-To: <e76aa17f0706231303t14f6fa85i6192dbdc063791b5@mail.gmail.com>
References: <003a01c7b515$e56a7b50$6401a8c0@max>
	<467CC0B2.2010700@v.loewis.de>	<001901c7b56d$22c83f80$6401a8c0@max>
	<467CE520.3000405@v.loewis.de>	<000801c7b592$8e13e670$6401a8c0@max>	<7a9c25c20706230839y5b594ee6i917fbd0924e26db1@mail.gmail.com>	<20070623183645.GA10808@andrew-kuchlings-computer.local>	<467D75DF.6070509@v.loewis.de>
	<e76aa17f0706231303t14f6fa85i6192dbdc063791b5@mail.gmail.com>
Message-ID: <467D8CC0.8030808@v.loewis.de>

>     I think the complaint is not so much about simplicity, but correctness.
>     close_fds also closes stdin/stdout/stderr, which might be undesirable
>     and differs from POSIX.
> 
> 
> According to the docs, stdin/stdout and stderr are not closed (
> http://docs.python.org/lib/node529.html)

I don't get your point: The docs say explicitly "Unix only".

Regards,
Martin

From status at bugs.python.org  Sun Jun 24 02:00:49 2007
From: status at bugs.python.org (Tracker)
Date: Sun, 24 Jun 2007 00:00:49 +0000 (UTC)
Subject: [Python-Dev] Summary of Tracker Issues
Message-ID: <20070624000049.1CFEB781E0@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (06/17/07 - 06/24/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1645 open ( +0) /  8584 closed ( +0) / 10229 total ( +0)

Average duration of open issues: 836 days.
Median duration of open issues: 784 days.

Open Issues Breakdown
   open  1645 ( +0)
pending     0 ( +0)


From talin at acm.org  Sun Jun 24 04:28:58 2007
From: talin at acm.org (Talin)
Date: Sat, 23 Jun 2007 19:28:58 -0700
Subject: [Python-Dev] [Python-3000] Issues with PEP 3101
	(string	formatting)
In-Reply-To: <20070620085701.GA31968@crater.logilab.fr>
References: <A51EAB52-FA02-47DE-8A82-DF706F4ECD67@plope.com>	<ca471dc20706190820n7715fc30jeafcffd14c6b5623@mail.gmail.com>
	<20070620085701.GA31968@crater.logilab.fr>
Message-ID: <467DD6EA.6010303@acm.org>

I haven't responded to this thread because I was hoping some of the 
original proponents of the feature would come out to defend it. 
(Remember, 3101 is a synthesis of a lot of people's ideas gleaned from 
many forum postings - In some cases I am willing to defend particular 
aspects of the PEP, and in others I just write down what I think the 
general consensus is.)

That being said - from what I've read so far, the evidence on both sides 
of the argument seems anecdotal to me. I'd rather wait and see what more 
people have to say on the topic.

-- Talin

Aurélien Campéas wrote:
> On Tue, Jun 19, 2007 at 08:20:25AM -0700, Guido van Rossum wrote:
>> Those are valid concerns. I'm cross-posting this to the python-3000
>> list in the hope that the PEP's author and defenders can respond. I'm
>> sure we can work something out.
> 
> Thanks for raising this. It is horrible enough that I feel obliged to
> de-lurk.
> 
> -10 on this part of PEP3101.
> 
> 
>> Please keep further discussion on the python-3000 at python.org list.
>>
>> --Guido
>>
>> On 6/19/07, Chris McDonough <chrism at plope.com> wrote:
>>> Wrt http://www.python.org/dev/peps/pep-3101/
>>>
>>> PEP 3101 says Py3K should allow item and attribute access syntax
>>> within string templating expressions but "to limit potential security
>>> issues", access to underscore prefixed names within attribute/item
>>> access expressions will be disallowed.
> 
> People talking about potential security issues should have an
> obligation to show how their proposals *really* improve security (in
> general); this is of course, a hard thing to do; mere hand-waving is
> not sufficient.
> 
>>> I am a person who has lived with the aftermath of a framework
>>> designed to prevent data access by restricting access to underscore-
>>> prefixed names (Zope 2, ahem), and I've found it's very hard to
>>> explain and justify.  As a result, I feel that this is a poor default
>>> policy choice for a framework.
> 
> And it's even poorer in the context of a language (for it's probably
> harder to escape language-level restrictions than framework
> obscurities ...).
> 
>>> In some cases, underscore names must become part of an object's
>>> external interface.  Consider a URL with one or more underscore-
>>> prefixed path segment elements (because prefixing a filename with an
>>> underscore is a perfectly reasonable thing to do on a filesystem, and
>>> path elements are often named after file names) fed to a traversal
>>> algorithm that attempts to resolve each path element into an object
>>> by calling __getitem__ against the parent found by the last path
>>> element's traversal result.  Perhaps this is poor design and
>>> __getitem__ should not be consulted here, but I doubt that highly
>>> because there's nothing particularly special about calling a method
>>> named __getitem__ as opposed to some method named "traverse".
> 
> This is trying to make a technical argument, but the 'consenting
> adults' policy might be enough. In my experience, zope forbidding
> access to _ prefixed attributes just led people to work around the
> limitation, thus adding more useless indirection to an already crufty
> code base. The result is more obfuscation and probably even less
> security (as in auditability of the code).
> 
>>> The only precedent within Python 2 for this sort of behavior is
>>> limiting access to variables that begin with __ and which do not end
>>> with __ to the scope defined by a class and its instances.  I
>>> personally don't believe this is a very useful feature, but it's
>>> still only an advisory policy and you can worm around it with enough
>>> gyrations.
> 
> FWIW I've come to never use __attrs. The obfuscation feature seems to
> bring nothing but pain (the few times I fell into that trap as a
> beginner Python programmer).
> 
>>> Given that security is a concern at all, the only truly reasonable
>>> way to "limit security issues" is to disallow item and attribute
>>> access completely within the string templating expression syntax.  It
>>> seems gratuitous to me to encourage string templating expressions
>>> with item/attribute access, given that you could do it within the
>>> format arguments just as easily in the 99% case, and we've (well...
>>> I've) happily been living with that restriction for years now.
>>>
>>> But if this syntax is preserved, there really should be no *default*
>>> restrictions on the traversable names within an expression because
>>> this will almost certainly become a hard-to-explain, hard-to-justify
>>> bug magnet as it has become in Zope.
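
To make the comparison in the quoted paragraph concrete, here is a minimal sketch using the str.format() syntax that PEP 3101 proposes. It produces the same output once with attribute access inside the template and once with the access done in the format arguments. The User class and its fields are hypothetical and exist only for illustration.

```python
class User(object):
    """Hypothetical object used only for illustration."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


user = User("guido", "guido@example.org")

# Attribute access inside the template expression.
inside = "{0.name} <{0.email}>".format(user)

# The same result with the access done in the format arguments,
# so the template itself never traverses the object.
outside = "{0} <{1}>".format(user.name, user.email)

print(inside)
print(outside)
```

In the second form the template author can only reach the values the caller chose to pass in, which is the restriction the quoted text describes living with.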
> 
> I'd add that Zope in general looks to me like a giant collection of
> python anti-patterns and as such can be used as a clue source about
> what not to do, especially what not to include in Py3k.
> 
> I don't want to offend people, well no more than necessary (imho Zope
> *is* an offense to common sense in many ways), but that's the opinion
> of someone who earns his living mostly from Zope/Plone product
> development and maintenance (these days, anyway).
> 
> Regards,
> Aurélien.

From henning.vonbargen at arcor.de  Sun Jun 24 11:05:54 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Sun, 24 Jun 2007 11:05:54 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de>
Message-ID: <000f01c7b63e$df834c60$6401a8c0@max>

"""
My very personal opinion:
After a sleepless night, it seems to me that this is not a Python problem
(or a problem of any other programming language).
It looks more like an OS design problem (on MS Windows as well as
on Linux etc). In an ideal world, when a program asks the OS to start
a child process, it should have to explicitly list the handles that should
be inherited.
"""

> Martin v. Löwis wrote:

>> As you can see, at the C level, basically "fopen" is "open" with a
>> little code around it to parse flags etc. It's the same kind of hackish 
>> code.
>
> "little code" is quite an understatement. In Microsoft's C library
> (which we would have to emulate), the argument parsing of fopen is
> 120 lines of code. In addition, that code changes across compiler
> versions (where VS 2005 adds additional error checking).

Hmm. Wow!

>
>> Anyway, I don't want to argue about the implementation of a patch.
>> The fact is that until now the python programmer does not have an
>> easy, platform-independent option to open files non-inheritable.
>> As you mentioned yourself, the only way to work around it
>> in a platform-independent manner, IS VERY HACKISH.
>> So, shouldn't this hackish-ness better be hidden in the library
>> instead of leaving it as an exercise to the common programmer?
>
> Putting it into the library is fine. However, we need to find
> an implementation strategy that meets the user's needs, and
> is still maintainable.
>
> Python 3 will offer a clean solution, deviating entirely from
> stdio.

Let me point out that stdio is not the problem.
The problem is handle inheritance.
So I don't see how this is going to be solved in Python 3 just by
not using stdio.
Inheritance has to be taken into account regardless of how it is
implemented on the C level.
And to open a file non-inheritable should be possible in an easy
and platform-independent way for the average python programmer.

> For 2.x, we need to find a better solution than the one you proposed.

Stephen, perhaps you can describe the workaround you used?
Maybe it is better than mine.
Or anyone else?

>
>> I think it is well worth the effort to keep this trouble away from
>> the Python programmers if possible.
>
> I don't argue about efforts - I argue about your proposed solution.
>
>> Apart from shutil.copyfile, other examples of using open that can cause
>> trouble are in socket.py (tell me any good reason why socket handles
>> should be inherited to child processes) and even in logging.py.
>
> On Unix, it is *very* common to inherit socket handles to child
> processes. The parent process opens the socket, and the child
> processes perform accept(3). This allows many processes to
> serve requests on the same port. In Python,
> SocketServer.Forking*Server rely on this precise capability.

Ahh, I see.
Maybe this is why my HTTP Server sometimes seems to not
react when a subprocess is running...
If more than one process has a handle for the same socket,
how does the OS know which process should react?

>
>>> Sure - and in turn, open is implemented on CreateFile.
>>> However, I don't think I would like to see an fopen
>>> implementation in Python. Python 3 will drop stdio entirely;
>>> for 2.x, I'd be cautious to change things because that
>>> may break other things in an unexpected manner.
>>
>> Yeah, if you think it should not be included in 2.x,
>> then the handle inheritance problem should at least be considered
>> in the PEPs [(3116, "New I/O"), (337, "Logging Usage in the Standard
>> Modules")]
>
> I didn't say that a solution shouldn't be included in 2.x.
> I said *your* solution shouldn't be. In 3.x, your solution
> won't apply, since Python won't be using stdio (so
> fdopen becomes irrelevant).

See above - please take it into account for Python 3 then.

>
>>>> Each handle has a security attribute that specifies whether the
>>>> handle should be inherited or not - but this has to be specified
>>>> when creating the handle (in the Windows CreateFile API internally).
>>>
>>> Not necessarily. You can turn on the flag later, through
>>> SetHandleInformation.
>>
>> So do you think that a working "close_fds" could be implemented
>> for Windows as well?
>
> No. close_fds should have the semantics of only closing the handles
> for that subprocess. SetHandleInformation applies to the parent
> process, and *all* subprocesses. So this is different from close_fds.

Yes - that's why I doubt that could work.
And according to http://support.microsoft.com/kb/190351/en-us
in order to capture stdout and stderr of the child process,
one has to specify bInheritHandle=TRUE in CreateProcess,
with the net effect that you can only choose if
either ALL handles (if not explicitly specified otherwise during
handle creation) should be inherited or none of them.

>
>> Explicitly turning off the inheritance flag for all child handles except
>> stdin, stdout and stderr in subprocess / popen (the equivalent to
>> what close_fds does for Posix) - that's what I call hackish.
>
> I didn't propose that, and it wouldn't be the equivalent. In POSIX,
> the closing occurs in the child process. This is not possible on
> Windows, as there is no fork().

OK - I agree it is not possible. But "avoiding handle inheritance"
is what one wants to achieve when specifying close_fds, I think.

>> And I doubt that it is possible at all, for three reasons:
>> - you have to KNOW all the handles.
>> - due to the different process creation in Windows (there's no fork),
>>  you would have to set the inheritance flags afterwards
>> - all this is not thread-safe.
>
> All true, and I did not suggest to integrate SetHandleInformation
> into subprocess. I *ONLY* claimed that you can change the flag
> after the file was opened.
>
> With that API, it would be possible to provide cross-platform
> access to the close-on-exec flag. Applications interested in setting
> it could then set it right after opening the file.

YES - that's exactly why I proposed an open_noinherit function.
It is a simple solution for a common problem - such a function,
documented in the library, would tell developers that they have to
be aware of the problem and that a solution exists (though the
implementation is more or less hackish due to platform-specific
code).

>
>> Apart from the fact that this is not possible on MS Windows, it won't
>> solve the problem!
>> (Because then I couldn't use all those standard modules that use open
>> *without* FD_CLOEXEC).
>>
>> The fact is that the combination ("multi-threading", "subprocess
>> creation", "standard modules")
>> simply *does not work* flawlessly and produces errors that are hard to
>> understand.
>> And probably most programmers are not even aware of the problem.
>> That's the main reason why I posted here.
>
> I don't see how your proposed change solves that. If there was
> an "n" flag, then the modules in the standard library that open
> files still won't use it.

That's why I said the standard library should be reviewed
for unintentional handle inheritance through the use of open.
Note this is a security risk as well, see
http://msdn.microsoft.com/msdnmag/issues/0300/security/

Regards,
Henning



From g.brandl at gmx.net  Sun Jun 24 11:09:40 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 24 Jun 2007 11:09:40 +0200
Subject: [Python-Dev] Issues with PEP 3101 (string formatting)
In-Reply-To: <ca471dc20706190820n7715fc30jeafcffd14c6b5623@mail.gmail.com>
References: <A51EAB52-FA02-47DE-8A82-DF706F4ECD67@plope.com>
	<ca471dc20706190820n7715fc30jeafcffd14c6b5623@mail.gmail.com>
Message-ID: <f5lcc5$up4$1@sea.gmane.org>

Guido van Rossum schrieb:
> Those are valid concerns. I'm cross-posting this to the python-3000
> list in the hope that the PEP's author and defenders can respond. I'm
> sure we can work something out.

Another question w.r.t. new string formatting:

Assuming the %-operator for strings goes away as you said in the recent blog
post, how are we going to convert string formatting (which I daresay is a very
common operation in Python modules) in the 2to3 tool?

Of course, "abc" % anything can be converted easily.

name % tuple_or_dict can only be converted to name.format(tuple_or_dict),
without correcting the format string.

name % name can not be converted at all without type inference.

Though probably the first type of application is the most frequent one,
pre-building (or just loading from elsewhere) of format strings is not so
uncommon when it comes to localization, where the format string likely
has a _() wrapped around it.
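
As a rough illustration of what correcting the format string would involve, here is a naive sketch of a translator from %-style to {}-style templates. It is not how 2to3 works, and the name percent_to_braces is made up for this example. It only understands %s, %d, %(key)s, %(key)d and %%, and ignores widths, precisions and every other conversion.

```python
import re

# Matches %s, %d, %(key)s, %(key)d and %% only; anything else is left alone.
_SPEC = re.compile(r"%(?:\((?P<key>[^)]*)\))?(?P<conv>[sd%])")


def percent_to_braces(fmt):
    """Naively translate a %-style format string to str.format() syntax."""
    counter = [0]

    def repl(match):
        if match.group("conv") == "%":
            return "%"                      # "%%" becomes a literal percent sign
        if match.group("key") is not None:
            return "{" + match.group("key") + "}"
        index = counter[0]                  # positional fields get explicit indices
        counter[0] += 1
        return "{" + str(index) + "}"

    # Literal braces must be doubled before the new fields are introduced.
    escaped = fmt.replace("{", "{{").replace("}", "}}")
    return _SPEC.sub(repl, escaped)


positional = percent_to_braces("%s has %d items (100%%)")
keyed = percent_to_braces("%(a)s-%(b)d")
print(positional)
print(keyed)
```

Even with the template translated, the call site still differs: a tuple has to be unpacked as fmt.format(*args) and a dict as fmt.format(**mapping), which is exactly the information that is missing when only a name is visible on the right-hand side of the % operator.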

Of course, converting format strings manually is a PITA, mainly because it's
so common.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From martin at v.loewis.de  Sun Jun 24 20:19:40 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 24 Jun 2007 20:19:40 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid problems with subprocesses and security risks
In-Reply-To: <000f01c7b63e$df834c60$6401a8c0@max>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de>
	<000f01c7b63e$df834c60$6401a8c0@max>
Message-ID: <467EB5BC.5060404@v.loewis.de>

>> Putting it into the library is fine. However, we need to find
>> an implementation strategy that meets the user's needs, and
>> is still maintainable.
>>
>> Python 3 will offer a clean solution, deviating entirely from
>> stdio.
> 
> Let me point out that stdio is not the problem.
> The problem is handle inheritance.
> So I don't see how this is going to be solved in Python 3 just by
> not using stdio.

In Python 3, it would be possible to implement the "n" flag
for open(), as we call CreateFile directly.

> And to open a file non-inheritable should be possible in an easy
> and platform-independent way for the average python programmer.

I don't see why it is a requirement to *open* the file in
non-inheritable mode. Why is it not sufficient to *modify*
an open file to have its handle non-inheritable in
an easy and platform-independent way?

> Maybe this is why my HTTP Server sometimes seems to not
> react when a subprocess is running...
> If more than one process has a handle for the same socket,
> how does the OS know which process should react?

The processes which don't perform accept(), recv(), or
select() operations are not considered by the operating
system. So if only one process does recv() (say), then
this process will read the data.

If multiple processes perform accept() (which is a common
case), the system selects a process at random. This is
desirable, as the system will then automatically split
the load across processes, and the listen backlog cannot
pile up: if multiple connection requests arrive at the
same time, one process will do accept, and then start
to process the connection. Then the second process will
take the second request, and so on.
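
A minimal sketch of that pre-fork pattern, for POSIX systems only (it relies on os.fork). The function name prefork_demo and the choice of loopback with an OS-assigned port are assumptions made for the example. Each worker accepts exactly one connection on the inherited listening socket and answers with its own pid.

```python
import os
import socket


def prefork_demo(num_workers=2):
    """Fork workers that all accept() on one inherited listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    listener.listen(num_workers)
    address = listener.getsockname()

    children = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: the listening socket was inherited across fork().
            status = 1
            try:
                conn, _peer = listener.accept()
                conn.sendall(str(os.getpid()).encode("ascii"))
                conn.close()
                status = 0
            finally:
                os._exit(status)           # never fall back into the parent's code
        children.append(pid)

    # Parent: it keeps its copy of the socket but never calls accept(),
    # so the OS hands each connection to one of the waiting workers.
    served_by = []
    for _ in range(num_workers):
        client = socket.create_connection(address)
        data = b""
        while True:
            chunk = client.recv(64)
            if not chunk:
                break
            data += chunk
        client.close()
        served_by.append(int(data.decode("ascii")))

    listener.close()
    for pid in children:
        os.waitpid(pid, 0)
    return children, served_by


children, served_by = prefork_demo()
print(sorted(children) == sorted(served_by))
```

Which worker gets which connection is up to the operating system; the sketch only relies on each worker accepting once, so every worker ends up serving one of the requests.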

If multiple processes perform recv(), the system will
again choose randomly. This is mostly undesirable, and
should be avoided.

>> With that API, it would be possible to provide cross-platform
>> access to the close-on-exec flag. Applications interested in setting
>> it could then set it right after opening the file.
> 
> YES - that's exactly why I proposed an open_noinherit function.

I think I missed that proposal. What would that function do?

If you propose it to be similar to the open() function, I'd
be skeptical. It's not possible to implement that in a thread-safe
way if you use SetHandleInformation/ioctl.

Regards,
Martin


From foom at fuhm.net  Sun Jun 24 21:47:03 2007
From: foom at fuhm.net (James Y Knight)
Date: Sun, 24 Jun 2007 15:47:03 -0400
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
In-Reply-To: <467EB5BC.5060404@v.loewis.de>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de>
	<000f01c7b63e$df834c60$6401a8c0@max> <467EB5BC.5060404@v.loewis.de>
Message-ID: <2CDFA7A4-0A3D-4D04-AB6C-D8A3B1C51CB8@fuhm.net>


On Jun 24, 2007, at 2:19 PM, Martin v. Löwis wrote:
> I don't see why it is a requirement to *open* the file in
> non-inheritable mode. Why is it not sufficient to *modify*
> an open file to have its handle non-inheritable in
> an easy and platform-independent way?

Threads. Consider that you may fork a process on one thread right  
between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another  
thread. The only way to be safe is to open the file non-inheritable  
to start with.

Now, it is currently impossible under linux to open a file descriptor  
noninheritable, but they're considering adding that feature (I don't  
have the thread-refs on me, but it's actually from the last month).  
The issue is that there's a *bunch* of syscalls that open FDs: this  
feature would need to be added to all of them, not only "open".

It's possible that it makes sense for python to provide "as good as
possible" an implementation. At the least, putting the fcntl call in
the same C function as open would fix programs that don't open
files/spawn processes outside of the GIL protection.

But, like the kernel, this feature then ought to be provided for all  
APIs that create file descriptors.

>>> With that API, it would be possible to provide cross-platform
>>> access to the close-on-exec flag. Applications interested in setting
>>> it could then set it right after opening the file.
>>
>> YES - that's exactly why I proposed an open_noinherit function.
>
> I think I missed that proposal. What would that function do?
>
> If you propose it to be similar to the open() function, I'd
> be skeptical. It's not possible to implement that in a thread-safe
> way if you use SetHandleInformation/ioctl.

Now I'm confused: are you talking about the same thread-safety  
situation as I described above? If so, why did you ask why it's not  
sufficient to modify a handle to be non-inheritable?

James

From martin at v.loewis.de  Sun Jun 24 22:48:30 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 24 Jun 2007 22:48:30 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid problems with subprocesses and security risks
In-Reply-To: <2CDFA7A4-0A3D-4D04-AB6C-D8A3B1C51CB8@fuhm.net>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de>
	<000f01c7b63e$df834c60$6401a8c0@max> <467EB5BC.5060404@v.loewis.de>
	<2CDFA7A4-0A3D-4D04-AB6C-D8A3B1C51CB8@fuhm.net>
Message-ID: <467ED89E.7050405@v.loewis.de>

>> I don't see why it is a requirement to *open* the file in
>> non-inheritable mode. Why is it not sufficient to *modify*
>> an open file to have its handle non-inheritable in
>> an easy and platform-independent way?
> 
> Threads. Consider that you may fork a process on one thread right
> between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another
> thread. The only way to be safe is to open the file non-inheritable to
> start with.

No, that's not the only safe way. The application can synchronize the
threads, and prevent starting subprocesses during critical regions.
Just define a subprocess_lock, and make all of your threads follow
the protocol of locking that lock when either opening a new file,
or creating a subprocess.
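
A minimal sketch of that protocol on a POSIX system, assuming every thread in the application goes through these two helpers. The names open_private and spawn are hypothetical; only subprocess_lock comes from the description above.

```python
import fcntl
import os
import subprocess
import sys
import tempfile
import threading

# Application-wide lock; the protocol only works if every thread honours it.
subprocess_lock = threading.Lock()


def open_private(path, flags, mode=0o600):
    """Open a file and mark it close-on-exec while holding the lock."""
    with subprocess_lock:
        fd = os.open(path, flags, mode)
        old = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, old | fcntl.FD_CLOEXEC)
    return fd


def spawn(args):
    """Start a child only while no thread is between open() and fcntl()."""
    with subprocess_lock:
        return subprocess.Popen(args)


path = os.path.join(tempfile.mkdtemp(), "private.txt")
fd = open_private(path, os.O_WRONLY | os.O_CREAT)
child = spawn([sys.executable, "-c", "pass"])
exit_code = child.wait()
cloexec_set = bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
os.close(fd)
print(cloexec_set, exit_code)
```

The lock is held only around the open/fcntl pair and around process creation, not while the file is in use. Any code that opens descriptors without taking the lock, including library code, falls outside the protection.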

> Now, it is currently impossible under linux to open a file descriptor
> noninheritable, but they're considering adding that feature (I don't
> have the thread-refs on me, but it's actually from the last month). The
> issue is that there's a *bunch* of syscalls that open FDs: this feature
> would need to be added to all of them, not only "open".

Right. That is what makes it difficult inherently on the API level.

> It's possible that it makes sense for python to provide "as good as
> possible" an implementation. At the least, putting the fcntl call in the
> same C function as open would fix programs that don't open files/spawn
> processes outside of the GIL protection.

No, that would not work. Python releases the GIL when opening a file
(and needs to do so because that may be a long-running operation).

> Now I'm confused: are you talking about the same thread-safety situation
> as I described above? 

Yes.

> If so, why did you ask why it's not sufficient to
> modify a handle to be non-inheritable?

Because I wanted to hear what the reasons are to consider that
insufficient. I would have expected "ease of use" and such things
(perhaps Henning will still bring up other reasons). If
thread-safety is a primary concern, then that flag should *not*
be added to open(), since it cannot be implemented in a
thread-safe manner in a generic way - only the application can
perform the proper synchronization. As discussed, there are
other handles subject to inheritance, too, and the application
would have to use the modify-handle function, anyway, which
means it needs to make it thread-safe through explicit locking.

Regards,
Martin

From rcohen at snurgle.org  Sun Jun 24 23:13:03 2007
From: rcohen at snurgle.org (Ross Cohen)
Date: Sun, 24 Jun 2007 17:13:03 -0400
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
In-Reply-To: <467ED89E.7050405@v.loewis.de>
References: <003a01c7b515$e56a7b50$6401a8c0@max> <467CC0B2.2010700@v.loewis.de>
	<001901c7b56d$22c83f80$6401a8c0@max> <467CE520.3000405@v.loewis.de>
	<000801c7b592$8e13e670$6401a8c0@max> <467D61CD.9000403@v.loewis.de>
	<000f01c7b63e$df834c60$6401a8c0@max> <467EB5BC.5060404@v.loewis.de>
	<2CDFA7A4-0A3D-4D04-AB6C-D8A3B1C51CB8@fuhm.net>
	<467ED89E.7050405@v.loewis.de>
Message-ID: <20070624211303.GG22573@snurgle.org>

On Sun, Jun 24, 2007 at 10:48:30PM +0200, "Martin v. Löwis" wrote:
> >> I don't see why it is a requirement to *open* the file in
> >> non-inheritable mode. Why is it not sufficient to *modify*
> >> an open file to have its handle non-inheritable in
> >> an easy and platform-independent way?
> > 
> > Threads. Consider that you may fork a process on one thread right
> > between the calls to open() and fcntl(F_SETFD, FD_CLOEXEC) on another
> > thread. The only way to be safe is to open the file non-inheritable to
> > start with.
> 
> No, that's not the only safe way. The application can synchronize the
> threads, and prevent starting subprocesses during critical regions.
> Just define a subprocess_lock, and make all of your threads follow
> the protocol of locking that lock when either opening a new file,
> or creating a subprocess.

The problem here is that sitting in accept() becomes a critical section.
While a thread is sitting in that call, no other thread could start a
subprocess. A multithreaded server which uses a 1-thread-per-request
model wouldn't be possible, at least not in a reasonable amount of
comprehensible code.

> > Now, it is currently impossible under linux to open a file descriptor
> > noninheritable, but they're considering adding that feature (I don't
> > have the thread-refs on me, but it's actually from the last month). The
> > issue is that there's a *bunch* of syscalls that open FDs: this feature
> > would need to be added to all of them, not only "open".
> 
> Right. That is what makes it difficult inherently on the API level.

LWN has had good coverage of the discussion:
http://lwn.net/Articles/237722/

Ross

From henning.vonbargen at arcor.de  Mon Jun 25 22:51:28 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Mon, 25 Jun 2007 22:51:28 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
Message-ID: <003601c7b76a$9a2a0550$6401a8c0@max>

Hi,

# I'm not sure about netiquette here:
# I decided to continue posting to the python-list without CCing to everyone.

First of all, here's the prototype.
It's a prototype and I know it's far from perfect, but it works for me
(in production code) - however, I did not yet test it on Non-Windows.
---------------------------------
#!/bin/env python
# -*- coding: iso-8859-1 -*-

"""
File ftools.py: Useful tools for working with files.
"""

import os
import os.path
import time
import shutil

rmtree = shutil.rmtree
move = shutil.move

builtin_open = open

if os.name != "nt":
    import fcntl

def open(fname, mode="r", bufsize=None):
    """
    Like the "open" built-in, but does not inherit to child processes.
    The code is using os.open and os.fdopen.
    On Windows, to avoid inheritance, os.O_NOINHERIT is used
    directly in the open call, thus it should be thread-safe.
    On other operating systems, fcntl with FD_CLOEXEC is used
    right after opening the file; however in a multi-threaded program
    it may happen that another thread starts a child process in the
    fraction of a second between os.open and fcntl.

    Note: The bufsize argument is ignored (not yet implemented).
    """
    flags = 0
    if "r" in mode:
        # "r+" needs a read/write descriptor; plain "r" is read-only.
        if "+" in mode:
            flags += os.O_RDWR
        else:
            flags += os.O_RDONLY
    elif "w" in mode:
        flags += os.O_RDWR + os.O_CREAT + os.O_TRUNC
    elif "a" in mode:
        flags += os.O_RDWR + os.O_CREAT + os.O_APPEND
    else:
        raise NotImplementedError ("mode=" + mode)
    if os.name == "nt":
        if "b" in mode:
            flags += os.O_BINARY
        else:
            flags += os.O_TEXT
        flags += os.O_NOINHERIT
    try:
        fd = os.open (fname, flags)
        if os.name != "nt":
            old = fcntl.fcntl(fd, fcntl.F_GETFD)
            fcntl.fcntl(fd, fcntl.F_SETFD, old | fcntl.FD_CLOEXEC)
        return os.fdopen (fd, mode)
    except OSError, x:
        raise IOError(x.errno, x.strerror, x.filename)

def copyfile(src, dst):
    """
    Copies a file - like shutil.copyfile, but the files are opened
    non-inheritable.
    Note: This prototype does not test _samefile like shutil.copyfile.
    """
    fsrc = None
    fdst = None
    try:
        fsrc = open(src, "rb")
        fdst = open(dst, "wb")
        shutil.copyfileobj(fsrc, fdst)
    finally:
        if fdst:
            fdst.close()
        if fsrc:
            fsrc.close()

------------------------------------------------
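
For comparison, here is a sketch of the variant without the race window on the non-Windows side. It assumes a platform whose open() call accepts a close-on-exec flag and a Python that exposes it as os.O_CLOEXEC; where neither that nor os.O_NOINHERIT is available, it simply refuses. The name open_noinherit_fd is hypothetical, and os.get_inheritable (Python 3.4 or later) is used only to inspect the result.

```python
import os
import tempfile


def open_noinherit_fd(path, flags, mode=0o666):
    """Open path with the non-inheritable flag passed to the open call itself."""
    if hasattr(os, "O_NOINHERIT"):          # Windows C runtime
        flags |= os.O_NOINHERIT
    elif hasattr(os, "O_CLOEXEC"):          # POSIX systems that provide the flag
        flags |= os.O_CLOEXEC
    else:
        raise NotImplementedError("no atomic non-inheritable open here")
    return os.open(path, flags, mode)


path = os.path.join(tempfile.mkdtemp(), "demo.txt")
fd = open_noinherit_fd(path, os.O_WRONLY | os.O_CREAT)
inheritable = os.get_inheritable(fd)        # inspection only, Python 3.4+
os.close(fd)
print(inheritable)
```

Because the flag is part of the open call, there is no moment at which another thread could start a child process and leak the descriptor.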

"""
blah blah:
I googled around a bit, and it more and more seems to me
that the Posix system has a serious design flaw, since it seems
to be SO hard to write multi-threaded programs that also
start child-processes.
It's funny that right now Linus Torvalds himself seems to be
aware of this problem and that the Linux kernel developers
are discussing ways to solve it.
Let's hope they find a way to get around it on the OS level.
To me, the design of MS Windows looks better in the aspect
of process-creation, handle inheritance and multi-threading...
Anyway, it has its drawbacks, too.
For example, I still cannot specify in a thread-safe way that
a handle should be inherited to one child process but not
to another - I would have to use a lock to synchronize it,
which has its own problems, as Ross Cohen noted.
The best solution at the OS level would be to explicitly
specify the handles/fds I want to be inherited in a
"create child process" system call.

BTW the Linux kernel developers face the same situation
as we do: They could somehow implement a new system
function like "open_noinherit", but there's a whole bunch
of existing "standard code" that uses open and similar
functions like socket(), accept() etc., and they don't want
to change all these calls.

So perhaps, for Python development, we just have to accept
that the problem persists and that at this time a 100% solution
just does not exist - and we should watch the discussion
on http://lwn.net/Articles/237722/ to see how they solve it for
Linux.
"""

That being said, I still think it's necessary for Python to provide
a solution as good as possible.

For example, in my production application, by consistently
using the ftools module above, I could reduce the error rate dramatically:
* With built-in open and shutil.copyfile:
    Several "Permission denied" and other errors a day
* With ftools.open and ftools.copyfile:
  program running for a week or more without errors.

There are still errors sometimes, and I suspect it has to do with the
unintentional inheritance of socket handles (I did not dig into
SocketServer.py, socket.py and socket.c to solve it).
(However, the errors are so rare now that our clients think
 it's just errors in their network :-).

Martin, you mentioned that for sockets, inheritance is not a problem
unless accept(), recv() or select() is called in the child process
(as far as I understood it).
Though I am not an expert in socket programming at the C level,
I doubt that you are right here. Apart from my own experience,
I've found some evidence on the web
(searching for "child process socket inherit respond",
 and for "socket error 10054 process").
* http://mail.python.org/pipermail/python-list/2003-November/236043.html
   "socket's strange behavior with subprocesses"
   Funny: Even notepad.exe is used there as an example child process...
* http://mail.python.org/pipermail/python-bugs-list/2006-April/032974.html
   python-Bugs-1469163 SimpleXMLRPCServer doesn't work anymore on Windows
   (see also Bug 1222790).
* http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4202949
  Java has switched to non-inheritable sockets as well.
* http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5069989
  "(process) Runtime.exec unnecessarily inherits open files (win)"
* http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4197666
   "Socket descriptors leak into processes spawned by Java programs on windows"
  ...

Any Windows Guru around who can explain what's going on with socket
handles and CreateProcess? I mean - is the explanation Martin gave for
accept(), recv(), select() correct for Windows, too? And if so - how can
the errors be explained that are mentioned in the URLs above?

Martin v. Löwis wrote:
> In Python 3, it would be possible to implement the "n" flag
> for open(), as we call CreateFile directly.
BTW, if it will be an additional flag to "open", let me correct myself:
"n" is not a good character. Now I prefer "p" like personal, private,
protected, process.

> > And to open a file non-inheritable should be possible in an easy
> > and platform-independent way for the average python programmer.
> I don't see why it is a requirement to *open* the file in
> non-inheritable mode. Why is it not sufficient to *modify*
> an open file to have its handle non-inheritable in
> an easy and platform-independent way?
Because it wouldn't be thread-safe, unless a lock is used for synchronizing
subprocess and open calls, which would cause other issues.

Are you still reading?

Here's a pragmatic proposal:
- Add the functions in ftools.py (in a more complete version) to the
  standard library. Perhaps even add them to the subprocess.py module?
- Add a note about handle inheritance to the documentation for the
  subprocess module, saying that for the parent process, one should avoid
  using open and prefer ftools.open instead.
- Add a global switch to the socket module to choose between new and old
  behaviour:
- New behaviour: In the C level socket implementation,
  use os.O_NOINHERIT or fcntl FD_CLOEXEC, respectively.
  Remember: In case you write a forking socket server in Python, you have
  to use the old behaviour (so, in the ForkingMixIn, explicitly choose the
  old behaviour).
- Change the logging file handler classes to use ftools.open, so that at
  least the logging module does not produce errors in a multi-threaded
  program with child processes.
- For Python 3000, search the standard library for unintentional
  inheritance.

What do you think?

Henning

Footnote:
I bet that about 50% of all unexplainable, seemingly random,
hard-to-reproduce errors in any given program (written in any programming
language) that uses child processes are caused by unintentional handle
inheritance.


From martin at v.loewis.de  Mon Jun 25 23:53:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 25 Jun 2007 23:53:19 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid problems with subprocesses and security risks
In-Reply-To: <003601c7b76a$9a2a0550$6401a8c0@max>
References: <003601c7b76a$9a2a0550$6401a8c0@max>
Message-ID: <4680394F.6060001@v.loewis.de>

> # I'm not sure about netiquette here:
> # I decided to continue posting to the python-list without CCing to 
> everyone.

[I assume you mean python-dev]

Discussing this issue on the list is fine. Posting code is on the
borderline, and will have no effect, i.e. no action will come out
of it (at least *I* will ignore the code entirely, unless it is an
actual patch, and submitted to the bug tracker).

> So perhaps, for Python development, we just have to accept
> that the problem persists and that at this time a 100% solution
> just does not exist - and we should watch the discussion
> on http://lwn.net/Articles/237722/ to see how they solve it for
> Linux.

Exactly. My proposal is still to provide an API to toggle the
flag after the handle was created.
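
[A minimal sketch of what such a toggle could look like on POSIX
systems, using only the standard fcntl module. The helper names
set_close_on_exec, is_close_on_exec and demo are hypothetical and are
not part of any patch discussed in this thread; the Windows side is
deliberately left out.]

```python
import fcntl
import os


def set_close_on_exec(fd, enabled=True):
    """Toggle FD_CLOEXEC on an already-open descriptor (POSIX only)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if enabled:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def is_close_on_exec(fd):
    """Report whether the descriptor will be closed across exec()."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def demo():
    """Flip the flag off and on again for one end of a pipe."""
    r, w = os.pipe()
    try:
        set_close_on_exec(r, False)
        before = is_close_on_exec(r)
        set_close_on_exec(r, True)
        after = is_close_on_exec(r)
        return before, after
    finally:
        os.close(r)
        os.close(w)
```

[Toggling after creation leaves a window in which another thread can
start a child process before the flag is set, which is the race the
LWN discussion is about. Later Python 3 releases provide
os.set_inheritable() for the same purpose.]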

> Martin, you mentioned that for sockets, inheritance is not a problem
> unless accept(), recv() or select() is called in the child process
> (as far as I understood it).

I did not say "no problems". I said "there is no ambiguity about where
to direct the data if the child processes don't perform accept/recv".

> * http://mail.python.org/pipermail/python-list/2003-November/236043.html
>    "socket's strange behavior with subprocesses"
>    Funny: Even notepad.exe is used there as an example child process...

Sure: the system will not shut down the connection as long as the handle
is still open in the subprocess (as the subprocess *might* send more
data - which it won't).

I think the problem could be avoided by the parent process explicitly
performing shutdown(2), but I'm uncertain as I have never actively used
shutdown().
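
[A small experiment along these lines, as a sketch only: it uses a
Unix socketpair and a dup()ed descriptor to stand in for the handle
that a child process would have inherited. The function name
peer_sees_eof_after_shutdown is hypothetical, and the behaviour on
Windows with real inherited handles is not verified here.]

```python
import socket


def peer_sees_eof_after_shutdown():
    """Check that shutdown() signals EOF even if a duplicate stays open."""
    parent, peer = socket.socketpair()
    # Stand-in for the copy of the descriptor held by a child process.
    lingering = parent.dup()
    try:
        # shutdown() acts on the connection itself, not on one descriptor,
        # so the peer is notified although `lingering` is still open.
        parent.shutdown(socket.SHUT_WR)
        parent.close()
        peer.settimeout(2.0)
        return peer.recv(1) == b""
    finally:
        lingering.close()
        peer.close()
```

[Without the shutdown() call, closing only `parent` would leave the
connection open for as long as the duplicate exists, which matches
the symptom described in the linked python-list thread.]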

> * http://mail.python.org/pipermail/python-bugs-list/2006-April/032974.html
>    python-Bugs-1469163 SimpleXMLRPCServer doesn't work anymore on Windows
>    (see also Bug 1222790).

I don't understand how this is relevant. This is about FD_CLOEXEC not
being available on Windows, and has nothing to do with socket
inheritance.

> * http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4202949
>   Java has switched to non-inheritable sockets as well.

Not surprisingly - they don't support fork(). If they did,
they could not have made that change. The bug report is the
same issue: clients will be able to connect until the
listen backlog fills. Then they will be turned down, as notepad
will never perform accept.

[I'm getting bored trying to explain the other cases as well]

> Any Windows Guru around who can explain what's going on with socket
> handles and CreateProcess? I mean - is the explanation Martin gave for
> accept(), recv(), select() correct for Windows, too? And if so - how can
> the errors be explained that are mentioned in the URLs above?

See my explanation above.

Martin

From ckkart at hoc.net  Tue Jun 26 04:47:21 2007
From: ckkart at hoc.net (Christian K)
Date: Tue, 26 Jun 2007 11:47:21 +0900
Subject: [Python-Dev] csv changed from python 2.4 to 2.5
Message-ID: <f5punr$8q2$1@sea.gmane.org>

Hi,

I could not find documentation of the following change in python2.5. What is the
reason for that?

Python 2.4.4 (#2, Apr 12 2007, 21:03:11)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> d=csv.get_dialect('excel')
>>> d.delimiter = ','
>>>

ck at kiste:/media/hda6/home/ck/prog/peak-o-mat/trunk$ python2.5
Python 2.5.1 (r251:54863, May  2 2007, 16:56:35)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> d=csv.get_dialect('excel')
>>> d.delimiter = ','
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: readonly attribute
>>>

the following however works:

Python 2.5.1 (r251:54863, May  2 2007, 16:56:35)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> d = csv.excel
>>> d.delimiter = ','
>>>


Christian


From brett at python.org  Tue Jun 26 22:46:41 2007
From: brett at python.org (Brett Cannon)
Date: Tue, 26 Jun 2007 13:46:41 -0700
Subject: [Python-Dev] handling the granularity of possible Py3K warnings in
	2.6
Message-ID: <bbaeab100706261346u5c7913b1m19c33911d7b27621@mail.gmail.com>

My rewrite of import is written for 2.6.  But I am going to try to
bootstrap it into 3.0.  But I want to bootstrap into 2.6 if it works
out well in 3.0.  That means coding in 2.6 but constantly
forward-porting to 3.0.  In other words I am going to be a guinea pig
of our transition plan.

And being the guinea pig means I want to know where we want to take
the Py3K warnings in 2.6.  As of right now the -3 option causes some
DeprecationWarnings to be raised (mostly about callable and
dict.has_key).  But I was thinking that a certain level of granularity
existed amongst what should be warned against.

There is syntactic vs. semantic.  For syntactic there is stuff you can
do in versions of Python older than 2.6 (e.g., backtick removal),
stuff you can only do in 2.6 (e.g., new exception syntax), or stuff
that you can't do at all without a __future__ statement (e.g.,
'print').

For semantics, there is removal (e.g., callable), and then there is
new semantics (dict.items).

What I was thinking was having these various types of changes be
represented by proper warnings.  That allows for using the -W option
to control what warnings you care about.

So I envision something like:

+ Py3KWarning
   + Py3KSyntaxWarning
      + some reasonable name for stuff that can be done in 2.6.
          + some name for stuff that can be done in older than 2.6.
      + something for stuff like the 'print' removal that requires a
        __future__ statement.
   + Py3KSemanticWarning
       + Py3KDeprecationWarning
       + Py3KChangedSemanticsWarning (or whatever name you prefer)
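
A rough sketch of how part of such a hierarchy could be expressed and
filtered with the existing warnings machinery. The class names are
taken from the outline above where it gives them; the two example
messages and the demo function are made up for illustration, and
nothing here reflects what 2.6 actually ships.

```python
import warnings


class Py3KWarning(Warning):
    """Root of the hypothetical Py3K warning hierarchy."""


class Py3KSyntaxWarning(Py3KWarning):
    """Syntax that changes or disappears in 3.0."""


class Py3KSemanticWarning(Py3KWarning):
    """Behaviour that changes or disappears in 3.0."""


class Py3KDeprecationWarning(Py3KSemanticWarning):
    """Something that is removed outright (e.g. callable)."""


class Py3KChangedSemanticsWarning(Py3KSemanticWarning):
    """Something that stays but behaves differently (e.g. dict.items)."""


def demo():
    """Silence all Py3K warnings except removals; return what got through."""
    with warnings.catch_warnings(record=True) as caught:
        # Filters added later are consulted first, so the more specific
        # "always" rule wins over the blanket "ignore" rule.
        warnings.simplefilter("ignore", Py3KWarning)
        warnings.simplefilter("always", Py3KDeprecationWarning)
        warnings.warn("dict.has_key() is removed in 3.0",
                      Py3KDeprecationWarning)
        warnings.warn("dict.items() changes semantics in 3.0",
                      Py3KChangedSemanticsWarning)
    return [w.category.__name__ for w in caught]
```

Because filters match on subclass relationships, one rule per level of
the tree is enough to work through a port in steps.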


The key point is that when I am forward-porting I want to easily tell
what syntax changes I can deal with now in older, e.g. 2.5 code, stuff
that I can change directly in 2.6, and stuff that requires 2to3 or a
__future__ statement.  Similar idea for semantic changes.  That way I
can do this all in steps.

Does this sound reasonable to people at all?

-Brett

From nick at craig-wood.com  Wed Jun 27 12:50:41 2007
From: nick at craig-wood.com (Nick Craig-Wood)
Date: Wed, 27 Jun 2007 11:50:41 +0100
Subject: [Python-Dev] csv changed from python 2.4 to 2.5
In-Reply-To: <f5punr$8q2$1@sea.gmane.org>
References: <f5punr$8q2$1@sea.gmane.org>
Message-ID: <20070627105041.869BA14C24B@irishsea.home.craig-wood.com>

Christian K <ckkart at hoc.net> wrote:
>  I could not find documentation of the following change in python2.5. What is the
>  reason for that?
> 
>  Python 2.4.4 (#2, Apr 12 2007, 21:03:11)
>  [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
>  Type "help", "copyright", "credits" or "license" for more information.
> >>> import csv
> >>> d=csv.get_dialect('excel')
> >>> d.delimiter = ','
> >>>
> 
>  ck at kiste:/media/hda6/home/ck/prog/peak-o-mat/trunk$ python2.5
>  Python 2.5.1 (r251:54863, May  2 2007, 16:56:35)
>  [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
>  Type "help", "copyright", "credits" or "license" for more information.
> >>> import csv
> >>> d=csv.get_dialect('excel')
> >>> d.delimiter = ','
>  Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>  TypeError: readonly attribute
> >>>

Looks like this is the reason - the get_dialect call (which is
implemented in C) is now returning a read only Dialect object rather
than an instance of the original class :-

2.5
    >>> import csv
    >>> d = csv.get_dialect('excel')
    >>> d.__class__ 
    <type '_csv.Dialect'>
    >>> 

2.4
    >>> import csv
    >>> d = csv.get_dialect('excel')
    >>> d.__class__
    <class csv.excel at 0xb7d1f74c>
    >>> 


>  Python 2.5.1 (r251:54863, May  2 2007, 16:56:35)
>  [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
>  Type "help", "copyright", "credits" or "license" for more information.
> >>> import csv
> >>> d = csv.excel
> >>> d.delimiter = ','
> >>>

Don't you want to do this anyway?

  import csv
  class my_dialect(csv.excel):
      delimiter = ","
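
Spelled out a bit further, as a sketch in current Python 3 syntax:
subclass a stock dialect, register it under a name, and pass that name
to the writer. The names SemicolonDialect, "semicolon" and write_rows
are hypothetical and chosen only for this example.

```python
import csv
import io


class SemicolonDialect(csv.excel):
    """Excel conventions, but with ';' as the field delimiter."""
    delimiter = ";"


# Registering makes the dialect available by name to readers and writers.
csv.register_dialect("semicolon", SemicolonDialect)


def write_rows(rows, dialect="semicolon"):
    """Serialise rows to a string using the named dialect."""
    buf = io.StringIO()
    csv.writer(buf, dialect=dialect).writerows(rows)
    return buf.getvalue()
```

This avoids mutating the object returned by get_dialect(), which is
the operation that stopped working in 2.5.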

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick

From python-dev at xhaus.com  Wed Jun 27 20:26:08 2007
From: python-dev at xhaus.com (Alan Kennedy)
Date: Wed, 27 Jun 2007 19:26:08 +0100
Subject: [Python-Dev] Return error codes from getaddrinfo.
Message-ID: <4682ABC0.4060407@xhaus.com>

Dear all,

I'm seeking enlightenment on the error codes returned by the 
socket.getaddrinfo() function.

Consider the following on python 2.5 on Windows

 >>> import urllib
 >>> urllib.urlopen("http://nonexistent")
  [snip traceback]
IOError: [Errno socket error] (11001, 'getaddrinfo failed')

So the error number is 11001.

But when I try to find a symbolic constant in the errno module 
corresponding to this error number, I can't find one.

 >>> import errno
 >>> errno.errorcode[11]
'EAGAIN'
 >>> errno.errorcode[11001]
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
KeyError: 11001

Looking through the C source for the socket module doesn't provide any 
clarity (although my C is a little rusty). That module has a special 
function, set_gaierror(), for handling error returns from getaddrinfo. 
But I can't see if or how the resulting error codes relate to the errno 
module.

Are there supposed to be symbolic constants in the errno module 
corresponding to getaddrinfo errors?

I want jython to use the same errno symbolic constants as cpython, to 
ease portability of code.

Regards,

Alan.


From alexandre at peadrop.com  Wed Jun 27 22:15:46 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 27 Jun 2007 16:15:46 -0400
Subject: [Python-Dev] What's going on with the check-in emails?
Message-ID: <acd65fa20706271315i25bc96f6k61ea2ca1a805f627@mail.gmail.com>

Hi,

It seems there is a problem with check-in emails -- i.e., none have
been sent since r56057 (and the svn tree is at r56098 right now).
Does anyone have a hint as to what's going on?

Thanks,
-- Alexandre

From skip at pobox.com  Wed Jun 27 22:41:53 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 27 Jun 2007 15:41:53 -0500
Subject: [Python-Dev] What's going on with the check-in emails?
In-Reply-To: <acd65fa20706271315i25bc96f6k61ea2ca1a805f627@mail.gmail.com>
References: <acd65fa20706271315i25bc96f6k61ea2ca1a805f627@mail.gmail.com>
Message-ID: <18050.52113.127601.141848@montanaro.dyndns.org>


    Alexandre> It seems there is a problem with check-in emails -- i.e.,
    Alexandre> none have been sent since r56057 (and the svn tree is at
    Alexandre> r56098 right now).  Does anyone have a hint as to what's
    Alexandre> going on?

I'm not aware of a problem, though I noticed the slowdown in checkin emails
recently.  I forwarded your note to the python.org mailman/postfix gurus.

Skip

From skip at pobox.com  Wed Jun 27 22:59:22 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 27 Jun 2007 15:59:22 -0500
Subject: [Python-Dev] csv changed from python 2.4 to 2.5
In-Reply-To: <f5punr$8q2$1@sea.gmane.org>
References: <f5punr$8q2$1@sea.gmane.org>
Message-ID: <18050.53162.537852.407639@montanaro.dyndns.org>


    Christian> I could not find documentation of the following change in
    Christian> python2.5. What is the reason for that?

Without looking through the change history for the module it's unclear to me
why that would have changed.  The thing that changed is that the get_dialect
call now returns a _csv.Dialect object instead of an instance of the
csv.excel class:

    % python2.4
    Python 2.4.1 (#3, Jul 28 2005, 22:08:40) 
    [GCC 3.3 20030304 (Apple Computer, Inc. build 1671)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import csv
    >>> d = csv.get_dialect("excel")
    >>> d
    <csv.excel instance at 0x3ae058>

    % python
    Python 2.6a0 (trunk:54264M, Mar 10 2007, 15:19:48) 
    [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import csv
    >>> d = csv.get_dialect("excel")
    >>> d
    <_csv.Dialect object at 0x137fac0>

Please submit a bug report on SourceForge.

Thx,

Skip

From martin at v.loewis.de  Wed Jun 27 23:47:49 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 27 Jun 2007 23:47:49 +0200
Subject: [Python-Dev] Return error codes from getaddrinfo.
In-Reply-To: <4682ABC0.4060407@xhaus.com>
References: <4682ABC0.4060407@xhaus.com>
Message-ID: <4682DB05.9090108@v.loewis.de>

> Are there supposed to be symbolic constants in the errno module 
> corresponding to getaddrinfo errors?

No. On Windows, there is a separate set of error codes, defined in
winerror.h.

If you google for "winerror 11001", you find quickly that it is
"host not found".

> I want jython to use the same errno symbolic constants as cpython, to 
> ease portability of code.

That will be very difficult to achieve, as Python is (deliberately)
not even consistent across systems. Instead, it reports what the
platform reports, so you should do the same in Java.

Regards,
Martin

From brett at python.org  Thu Jun 28 02:19:36 2007
From: brett at python.org (Brett Cannon)
Date: Wed, 27 Jun 2007 17:19:36 -0700
Subject: [Python-Dev] What's going on with the check-in emails?
In-Reply-To: <acd65fa20706271315i25bc96f6k61ea2ca1a805f627@mail.gmail.com>
References: <acd65fa20706271315i25bc96f6k61ea2ca1a805f627@mail.gmail.com>
Message-ID: <bbaeab100706271719r9f5a612le37469fc3a7d7f03@mail.gmail.com>

On 6/27/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> Hi,
>
> It seems there is a problem with check-in emails -- i.e., none have
> been sent since r56057 (and the svn tree is at r56098 right now).
> Does anyone have a hint as to what's going on?
>

I am having issues as well.  I just did a slew of PEP checkins and I
have not gotten a single email on them.

-Brett

From thomas at python.org  Thu Jun 28 02:46:11 2007
From: thomas at python.org (Thomas Wouters)
Date: Wed, 27 Jun 2007 17:46:11 -0700
Subject: [Python-Dev] What's going on with the check-in emails?
In-Reply-To: <acd65fa20706271315i25bc96f6k61ea2ca1a805f627@mail.gmail.com>
References: <acd65fa20706271315i25bc96f6k61ea2ca1a805f627@mail.gmail.com>
Message-ID: <9e804ac0706271746t60d6ed90le06b854218c09b72@mail.gmail.com>

The mail-checkins script broke because of the upgrade of the machine that
hosts the subversion repository -- Python 2.3 went away, but two scripts
were still using '#!/usr/bin/env python2.3'. They should be fixed now.

On 6/27/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
>
> Hi,
>
> It seems there is a problem with check-in emails -- i.e., none have
> been sent since r56057 (and the svn tree is at r56098 right now).
> Does anyone have a hint as to what's going on?
>
> Thanks,
> -- Alexandre



-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From tav at espians.com  Thu Jun 28 03:04:42 2007
From: tav at espians.com (tav)
Date: Thu, 28 Jun 2007 02:04:42 +0100
Subject: [Python-Dev] object capability; func_closure; __subclasses__
Message-ID: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>

rehi all,

I have been looking at the object capability + Python discussions for
a while now, and was wondering what the use cases for
``object.__subclasses__`` and ``FunctionType.func_closure`` were?

I don't see __subclasses__ used anywhere in the standard library. And
whilst I can see exposing func_closure as being useful in terms of
"cloning/modifying" an existing function, isn't it possible to do that
without making it introspectable?

Years ago, Ka-Ping Yee pointed out:

  http://mail.python.org/pipermail/python-dev/2003-March/034284.html

Derived from this we get:

# capability.py functions

def Namespace(*args, **kwargs):

    for arg in args:
        kwargs[arg.__name__] = arg

    def get(key):
        return kwargs.get(key)

    return get

class Getter(object):

    def __init__(self, getter):
        self.getter = getter

    def __repr__(self):
        return self.getter('__repr__') or object.__repr__(self)

    def __getattr__(self, attr):
        return self.getter(attr)

# io.py module

def FileReader(name):

    file = open(name, 'r')

    def __repr__():
        return '<FileReader: %r>' % name

    def read(bufsize=-1):
        return file.read(bufsize)

    def close():
        return file.close()

    return Getter(Namespace(__repr__, read, close))

----

Now, a process A -- which has full access to all objects -- can do:

  >>> motd = FileReader('/etc/motd')

And pass it to "process B" operating in a limited scope, which can then call:

  >>> motd.read()
  >>> motd.close()

But not:

  >>> motd = type(motd)(motd.name, 'w')

which would have been possible *had* motd been created as a ``file``
type by calling: ``open('/etc/motd', 'r')``.

Now, there are probably a million holes in this approach, but as long
as process B's __import__ is sanitised and it operates in a "limited"
scope with regards to references to other functionality, this seems to
be relatively secure.

However, this is where __subclasses__ and func_closure get in the way.

With object.__subclasses__ (as Brett points out), all defined
classes/types are available -- including the ``file`` type we were
trying to deny process B access to! Is it necessary to expose this
attribute publicly?

And, likewise with func_closure, one can do
motd.read.func_closure[0].cell_contents and get hold of the original
``file`` object. Is it absolutely necessary to expose func_closure in
this way?
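
Both leaks can be reproduced in a few lines. This is a sketch in
Python 3 spelling, where func_closure is named __closure__; the names
make_reader, leak_stream and reachable_classes are hypothetical, and an
in-memory io.StringIO stands in for the real file object.

```python
import io


def make_reader(stream):
    """Return a read-only capability that closes over `stream`."""
    def read(bufsize=-1):
        return stream.read(bufsize)
    return read


def leak_stream(read_func):
    """Recover the closed-over object through the function's closure cells."""
    return read_func.__closure__[0].cell_contents


def reachable_classes():
    """Names of the classes listed by object.__subclasses__()."""
    return {cls.__name__ for cls in object.__subclasses__()}
```

Any code that holds `read` can call leak_stream(read) and get the
underlying stream back, and any code that can name `object` can
enumerate the classes derived from it.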

Now, whilst probably wrong, I can see myself being able to create a
minimal object capability system in pure python if those 2 "features"
disappeared. Am I missing something obvious that prevents me from
doing that?

Can we get rid of them for Python 2.6? Or even 2.5.2? Is anyone
besides PJE actually using them? ;p

Thanks in advance for your thoughts.

-- 
love, tav
founder and ceo, esp metanational llp

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369

From ckkart at hoc.net  Thu Jun 28 05:36:48 2007
From: ckkart at hoc.net (Christian K)
Date: Thu, 28 Jun 2007 12:36:48 +0900
Subject: [Python-Dev] csv changed from python 2.4 to 2.5
In-Reply-To: <18050.53162.537852.407639@montanaro.dyndns.org>
References: <f5punr$8q2$1@sea.gmane.org>
	<18050.53162.537852.407639@montanaro.dyndns.org>
Message-ID: <f5vacl$pmu$1@sea.gmane.org>

skip at pobox.com wrote:
>     Christian> I could not find documentation of the following change in
>     Christian> python2.5. What is the reason for that?
> 
> Without looking through the change history for the module it's unclear to me
> why that would have changed.  The thing that changed is that the get_dialect
> call now returns a _csv.Dialect object instead of an instance of the
> csv.excel class:
> 
>     % python2.4
>     Python 2.4.1 (#3, Jul 28 2005, 22:08:40) 
>     [GCC 3.3 20030304 (Apple Computer, Inc. build 1671)] on darwin
>     Type "help", "copyright", "credits" or "license" for more information.
>     >>> import csv
>     >>> d = csv.get_dialect("excel")
>     >>> d
>     <csv.excel instance at 0x3ae058>
> 
>     % python
>     Python 2.6a0 (trunk:54264M, Mar 10 2007, 15:19:48) 
>     [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
>     Type "help", "copyright", "credits" or "license" for more information.
>     >>> import csv
>     >>> d = csv.get_dialect("excel")
>     >>> d
>     <_csv.Dialect object at 0x137fac0>
> 
> Please submit a bug report on SourceForge.
> 

Ok. Done.

Christian


From pje at telecommunity.com  Thu Jun 28 06:41:35 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Jun 2007 00:41:35 -0400
Subject: [Python-Dev] object capability; func_closure;  __subclasses__
In-Reply-To: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
Message-ID: <20070628044018.076DA3A40AA@sparrow.telecommunity.com>

At 02:04 AM 6/28/2007 +0100, tav wrote:
>rehi all,
>
>I have been looking at the object capability + Python discussions for
>a while now, and was wondering what the use cases for
>``object.__subclasses__`` and ``FunctionType.func_closure`` were?
>
>I don't see __subclasses__ used anywhere in the standard library. And
>whilst I can see exposing func_closure as being useful in terms of
>"cloning/modifying" an existing function, isn't it possible to do that
>without making it introspectable?

You know, I find it particularly interesting that, as far as I can 
tell, nobody who proposes making changes to the Python language to 
add security, ever seems to offer any comparison or contrast of their 
approaches to Zope's -- which doesn't require any changes to the language.  :)


>Now, whilst probably wrong, I can see myself being able to create a
>minimal object capability system in pure python if those 2 "features"
>disappeared. Am I missing something obvious that prevents me from
>doing that?

Well, you're missing a simpler approach to protecting functions, 
anyway.  The '__call__' attribute of functions is still callable, but 
doesn't provide any access to func_closure, func_code, etc.  I 
believe this trick also works for bound method objects.

I suspect that you could also use ctypes to remove or alter the 
type.__subclasses__ member, though I suppose you might not consider 
that to be "pure" Python any more.  However, if you use a definition 
of pure that allows for stdlib modules, then perhaps it works.  :)


>Can we get rid of them for Python 2.6? Or even 2.5.2? Is anyone
>besides PJE actually using them? ;p

I wouldn't object (no pun intended) to moving the type.__subclasses__ 
method to say, the 'new' or 'gc' modules, since you wouldn't want to 
make those available to restricted code, but then they'd still be 
available for folks who need/want them.  'gc' has similar 
capabilities (again no pun intended) anyway.

However, ISTM that this should be a 3.0 change rather than a 2.x one, 
even so.  With regard to the func_closure thing, I'd actually like to 
make it *writable* as well as readable, and I don't mean just to 
change the contents of the cells.  But, since you can use .__call__ 
to make a capability without access to func_closure, it doesn't seem 
like you really need to remove func_closure. 


From python-dev at xhaus.com  Thu Jun 28 09:59:37 2007
From: python-dev at xhaus.com (Alan Kennedy)
Date: Thu, 28 Jun 2007 08:59:37 +0100
Subject: [Python-Dev] Return error codes from getaddrinfo.
In-Reply-To: <4682DB05.9090108@v.loewis.de>
References: <4682ABC0.4060407@xhaus.com> <4682DB05.9090108@v.loewis.de>
Message-ID: <46836A69.6040701@xhaus.com>

[Alan]
>>I want jython to use the same errno symbolic constants as cpython, to 
>>ease portability of code.

[Martin]
> That will be very difficult to achieve, as Python is (deliberately)
> not even consistent across systems. Instead, it reports what the
> platform reports, so you should do the same in Java.

I think I need to explain myself more clearly; I'm looking for the 
errno.SYMBOLIC_CONSTANT for errors returned by the getaddrinfo function.

Take the following lines from the cpython 2.5 httplib.

Line 998 - 1014
# -=-=-=-=-=-=
while True:
     try:
         buf = self._ssl.read(self._bufsize)
     except socket.sslerror, err:
         if (err[0] == socket.SSL_ERROR_WANT_READ
             or err[0] == socket.SSL_ERROR_WANT_WRITE):
             continue
         if (err[0] == socket.SSL_ERROR_ZERO_RETURN
             or err[0] == socket.SSL_ERROR_EOF):
             break
         raise
     except socket.error, err:
         if err[0] == errno.EINTR:
             continue
         if err[0] == errno.EBADF:
             # XXX socket was closed?
             break
         raise
# -=-=-=-=-=-=-=

How can that code work on jython, other than if

A: The jython errno module contains definitions for EINTR and EBADF
B: The socket module raises the exceptions with the correct 
errno.SYMBOLIC_CONSTANTS, in the same circumstances as the cpython module.

(The actual integers don't matter, but thanks anyway to the three 
separate people who informed me that googling "11001" was the solution 
to my problem).

And then there are the non-portable uses of error numbers, like this 
snippet from the 2.5 httplib:

Lines 706-711
#-=-=-=-=-=-=
     try:
         self.sock.sendall(str)
     except socket.error, v:
         if v[0] == 32:      # Broken pipe
             self.close()
         raise
#-=-=-=-=-=-=

Do these examples make it clearer why and in what way I want the jython 
errno symbolic constants to be the same as cpython?
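
A sketch of the portable style being asked for, written against
Python 3, where socket.error is an alias of OSError. The helper names
is_broken_pipe and symbolic_name are hypothetical. Comparing against
errno.EPIPE instead of the literal 32 lets each platform (or Jython)
pick its own integer.

```python
import errno
import socket


def is_broken_pipe(exc):
    """True if exc is an OS-level 'broken pipe' error, whatever its number."""
    return isinstance(exc, OSError) and exc.errno == errno.EPIPE


def symbolic_name(exc):
    """Map an exception's error number back to a symbolic constant name."""
    if isinstance(exc, socket.gaierror):
        # getaddrinfo() failures use a separate EAI_* code space, which the
        # socket module exposes; they do not appear in the errno module.
        for attr in dir(socket):
            if attr.startswith("EAI_") and getattr(socket, attr) == exc.errno:
                return attr
        return None
    return errno.errorcode.get(exc.errno)
```

Some platforms give two EAI_* names the same value, so symbolic_name
returns the first match rather than a guaranteed unique name.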

Thanks,

Alan.

From mithun_rn at yahoo.co.in  Thu Jun 28 10:41:06 2007
From: mithun_rn at yahoo.co.in (Mithun R N)
Date: Thu, 28 Jun 2007 09:41:06 +0100 (BST)
Subject: [Python-Dev] Decoding libpython frame information on the stack
Message-ID: <242398.97019.qm@web8510.mail.in.yahoo.com>

Hi All,

I am a new subscriber to this list.
I am facing an issue in deciphering core files of
applications with mixed C and libpython frames in them.

I would like to know of any work that has been done
on recovering the actual Python line
(file-name.py:<line number>) from the libpython frames
on the stack while debugging such core files. If
anybody has information on this, please let me
know. I could not find any link on the web that talks
about this feature.

Looking forward to your reply.
Thanks and regards,
Mithun




From tav at espians.com  Thu Jun 28 14:09:01 2007
From: tav at espians.com (tav)
Date: Thu, 28 Jun 2007 13:09:01 +0100
Subject: [Python-Dev] object capability; func_closure; __subclasses__
In-Reply-To: <20070628044018.076DA3A40AA@sparrow.telecommunity.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
	<20070628044018.076DA3A40AA@sparrow.telecommunity.com>
Message-ID: <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>

> You know, I find it particularly interesting that, as far as I can
> tell, nobody who proposes making changes to the Python language to
> add security, ever seems to offer any comparison or contrast of their
> approaches to Zope's -- which doesn't require any changes to the language.  :)

Whilst it is definitely possible to build up an object capability
system with a pruned down version of Zope 3's proxy + checker
mechanism, it ends up in a system which is quite performance
intensive. All those proxies being wrapped/unwrapped/checked...

In contrast, the mechanism I am looking at here simply requires
*moving* just 2 attributes *out* of the core builtins...

Quite cheap, simple and effective, no?

> Well, you're missing a simpler approach to protecting functions,
> anyway.  The '__call__' attribute of functions is still callable, but
> doesn't provide any access to func_closure, func_code, etc.  I
> believe this trick also works for bound method objects.

Whilst that would have been a nice trick, what about __call__.__self__ ?

Or, setting __call__.__doc__ ?
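
A quick sketch of the __self__ objection, in Python 3 spelling. The
names make_counter, restricted_view and recover_function are
hypothetical.

```python
def make_counter():
    """Return a function whose state lives only in its closure."""
    count = [0]

    def increment():
        count[0] += 1
        return count[0]

    return increment


def restricted_view(func):
    """Hand out only the __call__ wrapper, hiding the function attributes."""
    return func.__call__


def recover_function(view):
    """Undo restricted_view: the wrapper still points at the function."""
    return view.__self__
```

The wrapper itself has no closure attribute, but one attribute lookup
on it returns the original function, closure cells included.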

> I suspect that you could also use ctypes to remove or alter the
> type.__subclasses__ member, though I suppose you might not consider
> that to be "pure" Python any more.  However, if you use a definition
> of pure that allows for stdlib modules, then perhaps it works.  :)

Ah, thanks! Will look into that.

> I wouldn't object (no pun intended) to moving the type.__subclasses__
> method to say, the 'new' or 'gc' modules, since you wouldn't want to
> make those available to restricted code, but then they'd still be
> available for folks who need/want them.  'gc' has similar
> capabilities (again no pun intended) anyway.

Ah, that's a great idea!

> However, ISTM that this should be a 3.0 change rather than a 2.x one,
> even so.  With regard to the func_closure thing, I'd actually like to
> make it *writable* as well as readable, and I don't mean just to
> change the contents of the cells.  But, since you can use .__call__
> to make a capability without access to func_closure, it doesn't seem
> like you really need to remove func_closure.

I don't object to making func_closure writable either. In fact, as
someone who has been following your work on generic functions from way
before RuleDispatch, I really want to see PEP 3124 in 3.0.

But, all I am asking for is to not expose func_closure (and perhaps
some of the other func_*) as members of FunctionType -- isn't it
possible to add functionality to the ``new`` module which would allow
one to read/write func_closure?

-- 
love, tav
founder and ceo, esp metanational llp

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369

From varmaa at gmail.com  Thu Jun 28 15:54:52 2007
From: varmaa at gmail.com (Atul Varma)
Date: Thu, 28 Jun 2007 08:54:52 -0500
Subject: [Python-Dev] Decoding libpython frame information on the stack
In-Reply-To: <242398.97019.qm@web8510.mail.in.yahoo.com>
References: <242398.97019.qm@web8510.mail.in.yahoo.com>
Message-ID: <361b27370706280654y199a71e8l6a77a74b07cf0eef@mail.gmail.com>

Hi Mithun,

Because python-dev is a mailing list for the development *of* Python
rather than development *with* Python, I believe you may not have
posted to the best list.  Further information about this distinction,
and some discussion about potentially setting up a special-interest
list exclusively for Python/C interactions, can be found in this
recent thread:

  http://mail.python.org/pipermail/python-dev/2007-June/073680.html

Regarding your question, I'll try to answer it as best I can: on our
Windows application, we use Microsoft minidumps [1] instead of core
dumps.  At the time that a crash occurs and a minidump is written, we
have some code that digs into the Python interpreter state to get a
text traceback for every Python thread currently in execution at the
time of the crash, which is appended to the log file that is sent with
the minidump in the automated bug report.  Doing this is a bit risky
because it assumes that the relevant parts of the Python interpreter
state aren't corrupt at the time of the crash, but precautions can be
made to deal with this edge case.  So while I can't help you get a
bead on debugging core files, you may want to consider a similar
solution on the Unix platform.
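
As a rough illustration of the Python-level half of that approach
(this is a sketch, not our actual crash handler, and the function name
format_all_thread_stacks is made up for the example), the interpreter
can list the current frame of every thread via sys._current_frames():

```python
import sys
import threading
import traceback


def format_all_thread_stacks():
    """Return a text traceback for every Python thread in this process."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        label = names.get(thread_id, "unknown")
        chunks.append("Thread %s (%s):\n" % (thread_id, label))
        # format_stack() walks outward from the thread's current frame.
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)
```

This only works inside a live process, so it has to run from a crash
or signal handler before the process dies; it does not read a core
file. Python 3 also ships the faulthandler module for this purpose.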

- Atul

[1] http://msdn2.microsoft.com/en-us/library/ms680369.aspx

On 6/28/07,  Mithun R N <mithun_rn at yahoo.co.in> wrote:
>  Hi All,
>
> Am a new subscriber to this list.
> Am facing an issue in deciphering core-files of
> applications with mixed C and libpython frames in it.
>
> I was thinking of knowing any work that has been done
>  with respect to getting into the actual python line
> (file-name.py:<line number>) from the libpython frames
> on the stack while debugging such core-files. If
> anybody knows some information on this, please let me
> know. I could not get any link on the web that talks
> about this feature.
>
> Looking forward for your reply.
> Thanks and regards,
> Mithun
>
>
>
>

From dustin at v.igoro.us  Thu Jun 28 16:57:18 2007
From: dustin at v.igoro.us (Dustin J. Mitchell)
Date: Thu, 28 Jun 2007 09:57:18 -0500
Subject: [Python-Dev] Decoding libpython frame information on the stack
In-Reply-To: <242398.97019.qm@web8510.mail.in.yahoo.com>
References: <242398.97019.qm@web8510.mail.in.yahoo.com>
Message-ID: <20070628145718.GA20378@v.igoro.us>

On Thu, Jun 28, 2007 at 09:41:06AM +0100, Mithun R N wrote:
> Am a new subscriber to this list.
> Am facing an issue in deciphering core-files of
> applications with mixed C and libpython frames in it.
> 
> I was thinking of knowing any work that has been done
> with respect to getting into the actual python line
> (file-name.py:<line number>) from the libpython frames
> on the stack while debugging such core-files. If
> anybody knows some information on this, please let me
> know. I could not get any link on the web that talks
> about this feature.

Dave Beazley once worked on this subject:

  http://www.usenix.org/events/usenix01/full_papers/beazley/beazley_html/index.html

Dustin

From pje at telecommunity.com  Thu Jun 28 17:03:32 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Jun 2007 11:03:32 -0400
Subject: [Python-Dev] object capability; func_closure;  __subclasses__
In-Reply-To: <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
	<20070628044018.076DA3A40AA@sparrow.telecommunity.com>
	<95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
Message-ID: <20070628150129.BBADF3A40B1@sparrow.telecommunity.com>

At 01:09 PM 6/28/2007 +0100, tav wrote:
>>You know, I find it particularly interesting that, as far as I can
>>tell, nobody who proposes making changes to the Python language to
>>add security, ever seems to offer any comparison or contrast of their
>>approaches to Zope's -- which doesn't require any changes to the 
>>language.  :)
>
>Whilst it is definitely possible to build up a object capability
>system with a pruned down version of Zope 3's proxy + checker
>mechanism, it ends up in a system which is quite performance
>intensive. All those proxies being wrapped/unwrapped/checked...
>
>In contrast, the mechanism I am looking at here simply requires
>*moving* just 2 attributes *out* of the core builtins...
>
>Quite cheap, simple and effective, no?
>
>>Well, you're missing a simpler approach to protecting functions,
>>anyway.  The '__call__' attribute of functions is still callable, but
>>doesn't provide any access to func_closure, func_code, etc.  I
>>believe this trick also works for bound method objects.
>
>Whilst that would have been a nice trick, what about __call__.__self__ ?

Well, there's no __self__ in 2.3 or 2.4; I guess that was added in 2.5.  Darn.
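
To make the __call__.__self__ concern concrete, here is a small sketch.  It
assumes a current Python 3 interpreter, where func_closure is spelled
`__closure__`; `make_counter` and the secret string are made up for the
example.  Handing out only `f.__call__` does not hide the function, because
`__self__` leads straight back to it:

```python
def make_counter(secret):
    # 'secret' ends up in the closure of 'counter'.
    def counter():
        return len(secret)
    return counter

f = make_counter("hunter2")

handle = f.__call__        # what restricted code would be handed
leaked = handle.__self__   # ...but this is the original function object

print(leaked is f)
print(leaked.__closure__[0].cell_contents)
```

So the __call__ trick only protects the closure as long as the wrapper has
no __self__ attribute, which is why it worked in 2.3/2.4.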


>Or, setting __call__.__doc__ ?

What does that do?


>>I suspect that you could also use ctypes to remove or alter the
>>type.__subclasses__ member, though I suppose you might not consider
>>that to be "pure" Python any more.  However, if you use a definition
>>of pure that allows for stdlib modules, then perhaps it works.  :)
>
>Ah, thanks! Will look into that.

If it works, you could probably do the same thing to remove __call__.__self__.


>I don't object to making func_closure writable either. In fact, as
>someone who has been following your work on generic functions from way
>before RuleDispatch, I really want to see PEP 3124 in 3.0
>
>But, all I am asking for is to not expose func_closure (and perhaps
>some of the other func_*) as members of FunctionType -- isn't it
>possible to add functionality to the ``new`` module which would allow
>one to read/write func_closure?

In 3.0, I don't mind if the access method moves, I just want to keep 
the access.  OTOH, I don't really care about __call__.__self__, since 
I got along fine without it in 2.3/2.4 and didn't know it had been 
added in 2.5.  :)


From tav at espians.com  Thu Jun 28 17:14:05 2007
From: tav at espians.com (tav)
Date: Thu, 28 Jun 2007 16:14:05 +0100
Subject: [Python-Dev] object capability; func_closure; __subclasses__
In-Reply-To: <20070628150129.BBADF3A40B1@sparrow.telecommunity.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
	<20070628044018.076DA3A40AA@sparrow.telecommunity.com>
	<95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
	<20070628150129.BBADF3A40B1@sparrow.telecommunity.com>
Message-ID: <95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com>

> Well, there's no __self__ in 2.3 or 2.4; I guess that was added in 2.5.  Darn.

anyone know *why* it was added?

> >Or, setting __call__.__doc__ ?
>
> What does that do?

ah, i just wanted a way of providing documentation, and __call__'s
__doc__ isn't writable...

> If it works, you could probably do the same thing to remove __call__.__self__.

will look into that too...

> In 3.0, I don't mind if the access method moves, I just want to keep
> the access.  OTOH, I don't really care about __call__.__self__, since
> I got along fine without it in 2.3/2.4 and didn't know it had been
> added in 2.5.  :)

w00p!

so, suggestions as to how one can go about getting those 2 access methods moved?

-- 
thanks, tav
founder and ceo, esp metanational llp

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369

From pje at telecommunity.com  Thu Jun 28 17:44:16 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Jun 2007 11:44:16 -0400
Subject: [Python-Dev] object capability; func_closure;  __subclasses__
In-Reply-To: <95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
	<20070628044018.076DA3A40AA@sparrow.telecommunity.com>
	<95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
	<20070628150129.BBADF3A40B1@sparrow.telecommunity.com>
	<95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com>
Message-ID: <20070628154208.DA6673A40B1@sparrow.telecommunity.com>

At 04:14 PM 6/28/2007 +0100, tav wrote:
> > Well, there's no __self__ in 2.3 or 2.4; I guess that was added
> > in 2.5.  Darn.
>
>anyone know *why* it was added?
>
> > >Or, setting __call__.__doc__ ?
> >
> > What does that do?
>
>ah, i just wanted a way of providing documentation, and __call__'s
>__doc__ isn't writable...
>
> > If it works, you could probably do the same thing to remove
> > __call__.__self__.
>
>will look into that too...
>
> > In 3.0, I don't mind if the access method moves, I just want to keep
> > the access.  OTOH, I don't really care about __call__.__self__, since
> > I got along fine without it in 2.3/2.4 and didn't know it had been
> > added in 2.5.  :)
>
>w00p!
>
>so, suggestions as to how one can go about getting those 2 access 
>methods moved?

Post a proposal on the Python-3000 list and supply patches to do the moves.


From skip at pobox.com  Thu Jun 28 18:00:14 2007
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 28 Jun 2007 11:00:14 -0500
Subject: [Python-Dev] Decoding libpython frame information on the stack
In-Reply-To: <20070628145718.GA20378@v.igoro.us>
References: <242398.97019.qm@web8510.mail.in.yahoo.com>
	<20070628145718.GA20378@v.igoro.us>
Message-ID: <18051.56078.34614.249653@montanaro.dyndns.org>


    >> Am a new subscriber to this list.  Am facing an issue in deciphering
    >> core-files of applications with mixed C and libpython frames in it.

    >> I was thinking of knowing any work that has been done with respect to
    >> getting into the actual python line (file-name.py:<line number>) from
    >> the libpython frames on the stack while debugging such core-files. If
    >> anybody knows some information on this, please let me know. I could
    >> not get any link on the web that talks about this feature.

Sorry, I missed this the first time round and just saw Dustin's reply.  The
Python distribution comes with a gdbinit file in the Misc directory.  I use
it frequently to display Python stack traces from within GDB.  Here's the
most recent copy online:

    http://svn.python.org/view/python/trunk/Misc/gdbinit?view=markup

The following commands are implemented:

    pystack - display the full stack trace
    pystackv - as above, but also display local variables
    pyframe - display just the current frame
    pyframev - as above, but also display local variables
    up, down - move up or down one C stack frame, but display Python
               frame if you move into PyEval_EvalFrame

This should all work within active sessions and sessions debugging core
files (e.g., no active process).

It needs some rework.  For instance, it assumes you're running within Emacs
and puts out lines gud can use to display source lines.  These look a little
funky when debugging from a terminal window.

Skip

From pje at telecommunity.com  Thu Jun 28 19:26:28 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Jun 2007 13:26:28 -0400
Subject: [Python-Dev] object capability; func_closure;  __subclasses__
In-Reply-To: <95d8c0810706280923v6a72cb72w6d5474c41c748c29@mail.gmail.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
	<20070628044018.076DA3A40AA@sparrow.telecommunity.com>
	<95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
	<20070628150129.BBADF3A40B1@sparrow.telecommunity.com>
	<95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com>
	<20070628154208.DA6673A40B1@sparrow.telecommunity.com>
	<95d8c0810706280923v6a72cb72w6d5474c41c748c29@mail.gmail.com>
Message-ID: <20070628172433.382DC3A40AF@sparrow.telecommunity.com>

At 05:23 PM 6/28/2007 +0100, tav wrote:
>Any pointers on removing members via ctypes front?
>
>Whilst I can understand even the most obscure aspects of your python
>code fine, I'm not familiar with C/ctypes...

What you want is to get access to the type's real dictionary, not the 
proxy.  Then you can just delete '__subclasses__' from the dictionary 
using Python code.  Here's some code that does the trick:

     from ctypes import pythonapi, POINTER, py_object

     getdict = pythonapi._PyObject_GetDictPtr
     getdict.restype = POINTER(py_object)
     getdict.argtypes = [py_object]

     def dictionary_of(ob):
         dptr = getdict(ob)
         if dptr and dptr.contents:
             return dptr.contents.value

'dictionary_of' returns either a dictionary object, or None if the 
object has no dictionary.  You can then simply delete any unwanted 
contents.  However, you should *never use this* to assign __special__ 
methods, as Python will not change the type slots correctly.  Heck, 
you should probably never use this, period.  :)  Usage example:

   print "before", type.__subclasses__
   del dictionary_of(type)['__subclasses__']
   print "after", type.__subclasses__

This will print something like:

   before <method '__subclasses__' of 'type' objects>
   after
   Traceback (most recent call last):
     File "ctypes_dicto.py", line 14, in <module>
       print "after", type.__subclasses__
   AttributeError: type object 'type' has no attribute '__subclasses__'

et voila.

You should also be able to delete unwanted function type attributes like this::

   from types import FunctionType
   del dictionary_of(FunctionType)['func_closure']
   del dictionary_of(FunctionType)['func_code']

Of course, don't blame me if any of this code fries your computer and 
gives you a disease, doesn't work with new versions of Python, etc. 
etc.  It works for me on Windows and Linux with Python 2.3, 2.4 and 
2.5.  It may also work with 3.0, but remember that func_* attributes 
have different names there.
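
For readers wondering why ctypes is needed at all: the mapping you get from
`type.__dict__` is only a read-only proxy, so plain Python cannot delete
entries through it.  A minimal, non-destructive sketch, assuming a current
Python 3 interpreter (the variable names are illustrative only):

```python
import types

# What Python code sees as type.__dict__ is a read-only proxy, not the
# real dictionary stored in the type object.
proxy = type.__dict__
print(isinstance(proxy, types.MappingProxyType))

try:
    del proxy["__subclasses__"]
    deleted = True
except TypeError:
    # The proxy supports neither item assignment nor item deletion.
    deleted = False

# So without reaching the real dictionary, __subclasses__ stays available
# and can be used to walk from 'object' to other classes.
still_there = hasattr(type, "__subclasses__")
reachable = type in object.__subclasses__()
print(deleted, still_there, reachable)
```

This only shows the obstacle; it does not attempt the deletion itself, which
remains an at-your-own-risk hack.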


From pje at telecommunity.com  Thu Jun 28 19:34:05 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Jun 2007 13:34:05 -0400
Subject: [Python-Dev] object capability; func_closure;  __subclasses__
In-Reply-To: <435DF58A933BA74397B42CDEB8145A860D502D6B@ex9.hostedexchange.local>
References: <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
	<435DF58A933BA74397B42CDEB8145A860D502D6B@ex9.hostedexchange.local>
Message-ID: <20070628173156.9D8773A40AF@sparrow.telecommunity.com>

At 10:20 AM 6/28/2007 -0700, Robert Brewer wrote:
>tav wrote:
> > But, all I am asking for is to not expose func_closure (and perhaps
> > some of the other func_*) as members of FunctionType -- isn't it
> > possible to add functionality to the ``new`` module which would allow
> > one to read/write func_closure?
>
>Would func_closure then also be removed from the FunctionType
>constructor arg list?

That wouldn't be necessary, since restricted code is probably not 
going to be allowed access to new in the first place.  We're talking 
about removing read access to the closure *attribute* only.  (And 
write access to func_code would also have to be removed, else you 
could replace the bytecode in order to grant yourself read access to 
the closure contents...)


>If so, would I be expected to create a function
>object and then use the "new" module to supply its func_closure?

Nope.  The idea here is that the new module would grow utility 
functions like get_closure, get_code, set_code, get_subclasses, 
etc.  The 'inspect' module would then use these functions to do its 
job, and I would use them for generic function stuff.
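
A rough sketch of what such utility functions could look like.  These are
hypothetical: no such functions exist in the stdlib, the names are simply
the ones floated above, and the sketch uses the Python 3 attribute
spellings (`__closure__`, `__code__`) instead of func_closure/func_code:

```python
def get_closure(func):
    """Return the tuple of closure cells of func (or None)."""
    return func.__closure__

def get_code(func):
    """Return the code object of func."""
    return func.__code__

def set_code(func, code):
    """Replace the code object of func."""
    func.__code__ = code

def get_subclasses(cls):
    """Return the direct subclasses of cls."""
    return type.__subclasses__(cls)

# Usage: trusted code goes through the helpers instead of the attributes.
def outer(x):
    def inner():
        return x
    return inner

fn = outer(42)
closure_value = get_closure(fn)[0].cell_contents

def one():
    return 1

def two():
    return 2

set_code(one, get_code(two))   # 'one' now runs the body of 'two'
swapped_result = one()

class Base(object):
    pass

class Child(Base):
    pass

children = get_subclasses(Base)
print(closure_value, swapped_result, children)
```

The point of the design is that only the module holding these helpers would
need to be withheld from restricted code, rather than every function object.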


From martin at v.loewis.de  Thu Jun 28 19:32:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 28 Jun 2007 19:32:57 +0200
Subject: [Python-Dev] Return error codes from getaddrinfo.
In-Reply-To: <46836A69.6040701@xhaus.com>
References: <4682ABC0.4060407@xhaus.com> <4682DB05.9090108@v.loewis.de>
	<46836A69.6040701@xhaus.com>
Message-ID: <4683F0C9.7040708@v.loewis.de>

> [Martin]
>> That will be very difficult to achieve, as Python is (deliberately)
>> not even consistent across systems. Instead, it reports what the
>> platform reports, so you should do the same in Java.
> 
> Do these examples make it clearer why and in what way I want the jython
> errno symbolic constants to be the same as cpython?

I fully understood that, already in your original message. All I was
saying is that this will be very difficult to achieve.

It would be much easier if you don't take the code of the standard
library and the application as given, but instead accept that people
may have to change the error conditions somewhat when porting to
Jython. Ideally, such porting would allow to still run the same code
on CPython, and ideally, you would then provide patches for the
Python library to make it run unmodified on Jython (rather than
trying to arrange to make the *current* library run unmodified).
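
As a small illustration of what "changing the error conditions somewhat"
can mean in practice: code that compares against the symbolic names in the
errno module, rather than hard-coded numbers, has a better chance of
running unchanged on several implementations.  A minimal sketch, assuming a
current Python 3 interpreter; the path is made up and assumed not to exist:

```python
import errno
import os

try:
    open(os.path.join("no-such-directory", "no-such-file"))
except OSError as exc:
    code = exc.errno
    # Map the platform's number back to its symbolic name.
    symbol = errno.errorcode[code]
    # Compare symbolically; the numeric value may differ per platform.
    is_missing = (code == errno.ENOENT)

print(symbol, is_missing)
```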

Regards,
Martin

From tav at espians.com  Thu Jun 28 19:35:23 2007
From: tav at espians.com (tav)
Date: Thu, 28 Jun 2007 18:35:23 +0100
Subject: [Python-Dev] object capability; func_closure; __subclasses__
In-Reply-To: <20070628172433.382DC3A40AF@sparrow.telecommunity.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
	<20070628044018.076DA3A40AA@sparrow.telecommunity.com>
	<95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
	<20070628150129.BBADF3A40B1@sparrow.telecommunity.com>
	<95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com>
	<20070628154208.DA6673A40B1@sparrow.telecommunity.com>
	<95d8c0810706280923v6a72cb72w6d5474c41c748c29@mail.gmail.com>
	<20070628172433.382DC3A40AF@sparrow.telecommunity.com>
Message-ID: <95d8c0810706281035o198ac9f9yba0000278b8b9ba7@mail.gmail.com>

I love you PJE! Thank you! =)

On 6/28/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 05:23 PM 6/28/2007 +0100, tav wrote:
> >Any pointers on removing members via ctypes front?
> >
> >Whilst I can understand even the most obscure aspects of your python
> >code fine, I'm not familiar with C/ctypes...
>
> What you want is to get access to the type's real dictionary, not the
> proxy.  Then you can just delete '__subclasses__' from the dictionary
> using Python code.  Here's some code that does the trick:
>
>      from ctypes import pythonapi, POINTER, py_object
>
>      getdict = pythonapi._PyObject_GetDictPtr
>      getdict.restype = POINTER(py_object)
>      getdict.argtypes = [py_object]
>
>      def dictionary_of(ob):
>          dptr = getdict(ob)
>          if dptr and dptr.contents:
>              return dptr.contents.value
>
> 'dictionary_of' returns either a dictionary object, or None if the
> object has no dictionary.  You can then simply delete any unwanted
> contents.  However, you should *never use this* to assign __special__
> methods, as Python will not change the type slots correctly.  Heck,
> you should probably never use this, period.  :)  Usage example:
>
>    print "before", type.__subclasses__
>    del dictionary_of(type)['__subclasses__']
>    print "after", type.__subclasses__
>
> This will print something like:
>
>    before <method '__subclasses__' of 'type' objects>
>    after
>    Traceback (most recent call last):
>      File "ctypes_dicto.py", line 14, in <module>
>        print "after", type.__subclasses__
>    AttributeError: type object 'type' has no attribute '__subclasses__'
>
> et voila.
>
> You should also be able to delete unwanted function type attributes like this::
>
>    from types import FunctionType
>    del dictionary_of(FunctionType)['func_closure']
>    del dictionary_of(FunctionType)['func_code']
>
> Of course, don't blame me if any of this code fries your computer and
> gives you a disease, doesn't work with new versions of Python, etc.
> etc.  It works for me on Windows and Linux with Python 2.3, 2.4 and
> 2.5.  It may also work with 3.0, but remember that func_* attributes
> have different names there.
>
>


-- 
love, tav
founder and ceo, esp metanational llp

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369

From fumanchu at amor.org  Thu Jun 28 19:20:31 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Thu, 28 Jun 2007 10:20:31 -0700
Subject: [Python-Dev] object capability; func_closure; __subclasses__
In-Reply-To: <95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
Message-ID: <435DF58A933BA74397B42CDEB8145A860D502D6B@ex9.hostedexchange.local>

tav wrote:
> But, all I am asking for is to not expose func_closure (and perhaps
> some of the other func_*) as members of FunctionType -- isn't it
> possible to add functionality to the ``new`` module which would allow
> one to read/write func_closure?

Would func_closure then also be removed from the FunctionType
constructor arg list? If so, would I be expected to create a function
object and then use the "new" module to supply its func_closure?


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From pje at telecommunity.com  Thu Jun 28 19:44:03 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 28 Jun 2007 13:44:03 -0400
Subject: [Python-Dev] object capability; func_closure;   __subclasses__
In-Reply-To: <20070628172433.382DC3A40AF@sparrow.telecommunity.com>
References: <95d8c0810706271804v2f6a8996hc1173481267ea4d2@mail.gmail.com>
	<20070628044018.076DA3A40AA@sparrow.telecommunity.com>
	<95d8c0810706280509n43d1a22cm443f4cbc564e344f@mail.gmail.com>
	<20070628150129.BBADF3A40B1@sparrow.telecommunity.com>
	<95d8c0810706280814q2a8a4e56u674d5e89d9d7aea@mail.gmail.com>
	<20070628154208.DA6673A40B1@sparrow.telecommunity.com>
	<95d8c0810706280923v6a72cb72w6d5474c41c748c29@mail.gmail.com>
	<20070628172433.382DC3A40AF@sparrow.telecommunity.com>
Message-ID: <20070628174155.617AE3A40AF@sparrow.telecommunity.com>

At 01:26 PM 6/28/2007 -0400, Phillip J. Eby wrote:
>You should also be able to delete unwanted function type attributes 
>like this::
>
>    from types import FunctionType
>    del dictionary_of(FunctionType)['func_closure']
>    del dictionary_of(FunctionType)['func_code']

By the way, you probably want to also delete func_globals and 
func_defaults, as there are security ramifications to those 
attributes as well.  Probably not so much for func_dict/__dict__ though.

And of course, for Python<=2.4 you can just use the __call__ 
attribute and not bother with deleting anything but __subclasses__.


From alexandre at peadrop.com  Thu Jun 28 23:34:47 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Thu, 28 Jun 2007 17:34:47 -0400
Subject: [Python-Dev] What's going on with the check-in emails?
In-Reply-To: <9e804ac0706271746t60d6ed90le06b854218c09b72@mail.gmail.com>
References: <acd65fa20706271315i25bc96f6k61ea2ca1a805f627@mail.gmail.com>
	<9e804ac0706271746t60d6ed90le06b854218c09b72@mail.gmail.com>
Message-ID: <acd65fa20706281434s29dcc0ddm6fbd6608b0162853@mail.gmail.com>

Thanks! The check-in emails are working again.

-- Alexandre

On 6/27/07, Thomas Wouters <thomas at python.org> wrote:
>
> The mail-checkins script broke because of the upgrade of the machine that
> hosts the subversion repository -- Python 2.3 went away, but two scripts
> were still using '#!/usr/bin/env python2.3'. They should be fixed now.
>

From mithun_rn at yahoo.co.in  Fri Jun 29 06:36:39 2007
From: mithun_rn at yahoo.co.in (Mithun R N)
Date: Fri, 29 Jun 2007 05:36:39 +0100 (BST)
Subject: [Python-Dev] Decoding libpython frame information on the stack
In-Reply-To: <18051.56078.34614.249653@montanaro.dyndns.org>
Message-ID: <473842.23106.qm@web8502.mail.in.yahoo.com>

Hi All,

Thanks much for your suggestions and help.
Shall get back after reading through and trying some
stuff mentioned in the emails.

Thanks and regards,
Mithun

--- skip at pobox.com wrote:

>
>     >> Am a new subscriber to this list.  Am facing an issue in deciphering
>     >> core-files of applications with mixed C and libpython frames in it.
>
>     >> I was thinking of knowing any work that has been done with respect to
>     >> getting into the actual python line (file-name.py:<line number>) from
>     >> the libpython frames on the stack while debugging such core-files. If
>     >> anybody knows some information on this, please let me know. I could
>     >> not get any link on the web that talks about this feature.
>
> Sorry, I missed this the first time round and just saw Dustin's reply.  The
> Python distribution comes with a gdbinit file in the Misc directory.  I use
> it frequently to display Python stack traces from within GDB.  Here's the
> most recent copy online:
>
>     http://svn.python.org/view/python/trunk/Misc/gdbinit?view=markup
>
> The following commands are implemented:
>
>     pystack - display the full stack trace
>     pystackv - as above, but also display local variables
>     pyframe - display just the current frame
>     pyframev - as above, but also display local variables
>     up, down - move up or down one C stack frame, but display Python
>                frame if you move into PyEval_EvalFrame
>
> This should all work within active sessions and sessions debugging core
> files (e.g., no active process).
>
> It needs some rework.  For instance, it assumes you're running within Emacs
> and puts out lines gud can use to display source lines.  These look a little
> funky when debugging from a terminal window.
>
> Skip
>



From cbarton at metavr.com  Sat Jun 30 01:37:32 2007
From: cbarton at metavr.com (Campbell Barton)
Date: Sat, 30 Jun 2007 09:37:32 +1000
Subject: [Python-Dev] Py/C API sig is here! --- (Was "Calling Methods from
 Pythons C API with Keywords")
In-Reply-To: <1182847168.6077.156.camel@localhost>
References: <mailman.0.1182261543.9641.python-dev@python.org>	
	<4677E66C.8000403@metavr.com>	<1182324889.6077.111.camel@localhost>	
	<4678FEB2.9050506@metavr.com>
	<1182339529.6077.120.camel@localhost>	
	<467919BA.2090708@metavr.com> <46796348.2050902@v.loewis.de>	
	<1182425636.6077.141.camel@localhost>
	<467AB472.6070509@v.loewis.de>
	<1182847168.6077.156.camel@localhost>
Message-ID: <468597BC.5080703@metavr.com>

Hrvoje Nikšić wrote:
> On Thu, 2007-06-21 at 19:25 +0200, "Martin v. Löwis" wrote:
>> In the past, we created special-interest groups for such discussion.
>> Would you like to coordinate a C sig? See
>>
>> http://www.python.org/community/sigs/
> 
> A SIG sounds like an excellent idea.  If created, a newcomer with a C
> API question could then be redirected to the SIG's mailing list, where
> (hopefully, in time) there would be enough knowledgable people to answer
> his question.
> 
> As for me coordinating the SIG, I'm not sure if that would be a good
> idea.  For one, I don't know what a coordinator really does and how much
> time the job takes from one's daily activities.  But more importantly,
> my interest in Python's C API is related to my current needs at work.
> If the situation at work changes, I will probably have much less time
> (if any) to devote to the C API discussions.

This mailing list is now running; if you're interested in asking or
answering questions about the Py/C API, sign up here:

http://mail.python.org/mailman/listinfo/capi-sig


-- 
Campbell J Barton (ideasman42)

From henning.vonbargen at arcor.de  Sat Jun 30 14:03:35 2007
From: henning.vonbargen at arcor.de (Henning von Bargen)
Date: Sat, 30 Jun 2007 14:03:35 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
	avoid problems with subprocesses and security risks
References: <003601c7b76a$9a2a0550$6401a8c0@max> <4680394F.6060001@v.loewis.de>
Message-ID: <003a01c7bb0e$affc6b00$6401a8c0@max>

> Martin v. Löwis wrote:
> Exactly. My proposal is still to provide an API to toggle the
> flag after the handle was created.

OK, here is an API that I tested on Windows and for sockets only.
Perhaps someone can test it on Non-Windows (Linux, for example?)

I think the best place for it would be as a new method "set_noinherit"
for file and socket objects or as a new function in the os module
(thus the implementation should probably be rewritten at the C level).

Note that for file objects, the code would need an additional call to
win32file._get_osfhandle, because .fileno() returns the Windows handle
only for sockets, but not for files.

The code below uses Mark Hammond's win32all library.

import os

if os.name == "nt":

    import win32api, win32con
    def set_noinherit(socket, noinherit=True):
        """
        Mark a socket as non-inheritable to child processes.

        This should be called right after socket creation if you want
        to prevent the socket from being inherited to child processes.

        Notes:

        Unintentional socket or file inheritance is a security risk and
        can cause errors like "permission denied", "address already in use",
        etc. in programs that start subprocesses, particularly in
        multi-threaded programs. These errors tend to occur seemingly randomly
        and are hard to reproduce (race condition!) and even harder to debug.

        Thus it is good practice to call this function as soon as possible
        after opening a file or socket that you don't need to be inherited
        by subprocesses.
        Note that in a multi-threaded environment, it is still possible that
        another thread starts a subprocess after you created a file/socket,
        but before you call set_noinherit.

        Note that for sockets, the new socket returned from accept() will be
        inheritable even if the listener socket was not; so you should call
        set_noinherit for the new socket as well.

        Availability: Posix, Windows
        """

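        # HANDLE_FLAG_INHERIT set means the handle IS inheritable, so
        # clear it when noinherit is true and set it otherwise.
        flags = 0
        if not noinherit:
            flags = flags | win32con.HANDLE_FLAG_INHERIT
        win32api.SetHandleInformation(socket.fileno(),
                                      win32con.HANDLE_FLAG_INHERIT, flags)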

else:

    import fcntl
    def set_noinherit(socket, noinherit=True):
        """
        ... documentation copied from the nt case ...
        """

        fd = socket.fileno()
        flags = fcntl.fcntl(fd, fcntl.F_GETFD) & ~fcntl.FD_CLOEXEC
        if noinherit:
            flags = flags | fcntl.FD_CLOEXEC
        fcntl.fcntl(fd, fcntl.F_SETFD, flags)
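
For comparison, a sketch of the same idea on a current Python 3
interpreter.  It assumes Python 3.4 or later, where the standard library
provides `os.set_inheritable()` and `socket.socket.set_inheritable()` and
where new descriptors are non-inheritable by default; this is not the
2.x-era API discussed in this thread.  The name `set_noinherit` is kept
from the proposal above and the demo variables are illustrative:

```python
import os
import socket

def set_noinherit(obj, noinherit=True):
    """Mark a socket or file object as (non-)inheritable by children."""
    if hasattr(obj, "set_inheritable"):
        # Socket objects carry their own method.
        obj.set_inheritable(not noinherit)
    else:
        # File objects: go through the OS-level file descriptor.
        os.set_inheritable(obj.fileno(), not noinherit)

# Socket: make it inheritable, then switch inheritance off again.
sock = socket.socket()
set_noinherit(sock, False)
sock_before = sock.get_inheritable()
set_noinherit(sock)
sock_after = sock.get_inheritable()
sock.close()

# File object wrapping one end of a pipe.
read_fd, write_fd = os.pipe()
with os.fdopen(read_fd) as reader, os.fdopen(write_fd, "w") as writer:
    set_noinherit(writer, False)
    file_before = os.get_inheritable(writer.fileno())
    set_noinherit(writer)
    file_after = os.get_inheritable(writer.fileno())

print(sock_before, sock_after, file_before, file_after)
```

The race described in the docstring above still applies to the toggle
itself: another thread can spawn a child between creation and the call.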


>
>> Martin, you mentioned that for sockets, inheritance is not a problem
>> unless accept(), recv() or select() is called in the child process
>> (as far as I understood it).
>
> I did not say "no problems". I said "there is no ambiguity whereto
> direct the data if the child processes don't perform accept/recv".
>
>> * http://mail.python.org/pipermail/python-list/2003-November/236043.html
>>    "socket's strange behavior with subprocesses"
>>    Funny: Even notepad.exe is used there as an example child process...
>
> Sure: the system will not shutdown the connection as long as the handle
> is still open in the subprocess (as the subprocess *might* send more
> data - which it won't).
>
> I think the problem could be avoided by the parent process explicitly
> performing shutdown(2), but I'm uncertain as I have never actively used
> shutdown().
>
>> * http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4202949
>>   Java has switched to non-inheritable sockets as well.
>
> Not surprisingly - they don't support fork(). If they would,
> they could not have made that change. The bug report is the
> same issue: clients will be able to connect as long as the
> listen backlog fills. Then they will be turned down, as notepad
> will never perform accept.

I think I more or less understand what happens.

The fact remains that a subprocess influences the behaviour
of server and client programs in an unwanted (and for many people,
unexpected) manner.

Regarding the influence of socket inheritance for server apps,
I performed some tests on Windows. If a subprocess is started
after the parent called accept(), then a client can still
connect() and send() to the socket - even if the parent process
has closed it and exited meanwhile.
This works until the queue is full (whose size was specified in
listen()). THEN the client will get (10061, 'Connection refused');
as you already explained.

And client and server will have the socket in CLOSE_WAIT resp.
FIN_WAIT2 status. However I doubt that this is actually a problem
as long as the server continues accept()ing.
But it means the client and server OS have to manage the socket
until the subprocess exits - even though neither client nor server
need the socket anymore.

One might argue that it is not actually a big problem, since the
subprocess will exit sooner or later.
This is more or less true for Linux, etc.

However, sometimes a subprocess might crash or hang.
Now what happens if the server program is closed and then started
again? On Linux, no problem (more or less). When the server program
is closed, the subprocess will be killed by the OS (I think), and
the socket is released (perhaps with a few minutes delay).

On Windows the situation is worse.

Subprocess hanging:
When the server program is closed, the subprocess will NOT be killed
by the OS ("officially" there isn't a parent-child relationship for
processes). It will continue hanging.
When the server program is restarted, it will run into an
"address already in use" error.

Subprocess crashing:
Unfortunately, on a default Windows installation, this will be very
much the same as a hanging subprocess:
Dr. Watson or whichever debugger comes to debug the crashed program.
As long as nobody clicks on the messagebox popup, the crashed program
will not be freed and all handles will be kept open.
Of course, on a server computer, usually there is nobody watching the
desktop...
Note: It is possible to work around this by installing a different
debugger (which usually includes hacking the registry).

These problems can be avoided by calling set_noinherit for the listener
socket as well as for the new socket returned by accept() ASAP.

>
>> * 
>> http://mail.python.org/pipermail/python-bugs-list/2006-April/032974.html
>>    python-Bugs-1469163 SimpleXMLRPCServer doesn't work anymore on Windows
>>    (see also Bug 1222790).
>
> I don't understand how this is relevant. This is about CLO_EXEC not
> being available on Windows, and has nothing to do with socket
> inheritance.

The original bug was about problems due to unwanted handle inheritance
with SimpleXMLRPCServer. The bug fix was to set CLO_EXEC. The fix
didn't work for Windows of course. A correct bug fix would be to use
the "set_noinherit" function above.

>
> [I'm getting bored trying to explain the other cases as well]
>

OK, YOU do understand the issue and know what's going on under the hood.

I understand as well - at least now. It cost me several weeks of
frustrating debugging, changing code to avoid the built-in "open",
reading library source code, searching the internet and cursing...
(and I'm not a beginner, as well as others who mentioned they had
similar problems in this list).

Note that the solution REQUIRED avoiding and/or hacking the standard
library. As mentioned in previous posts, in a multi-threaded program,
the correct solution for files is to use ftools.open on Windows;
for sockets, and for anything on non-Windows systems, no fully correct
solution is possible because of race conditions.

Using set_noinherit will reduce the risk as far as possible.

For Python 2.6 I propose to add the set_noinherit method to file
and socket objects.

For Python 3000 I propose that by default files and sockets should
be created non-inheritable (though it will not work perfectly for
multi-threaded programs on non-Windows systems - see the doc in the code).

A server that needs handles to be inherited can then still call
set_noinherit(False).
Typical uses would include SocketServer.ForkingMixIn and the
3 standard handles for subprocess/os popen.
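
As a rough illustration of the opt-in case, here is a POSIX-only sketch
on a current Python 3 (an assumption; the proposal above targets 2.6
and 3000). There, os.set_inheritable(fd, True) plays the role that
set_noinherit(False) would play. The function name and the pipe-based
setup are made up for the example.

```python
import os
import subprocess
import sys


def child_writes_to_inherited_fd(message):
    """Hand one explicitly inheritable pipe end to a child process."""
    read_fd, write_fd = os.pipe()
    # Opt in: this is the counterpart of set_noinherit(False).
    os.set_inheritable(write_fd, True)
    child_code = (
        "import os, sys; "
        "os.write(int(sys.argv[1]), sys.argv[2].encode())"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code, str(write_fd), message],
        close_fds=False,  # keep inheritable descriptors open in the child
    )
    os.close(write_fd)  # parent drops its copy so the reader sees EOF
    proc.wait()
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read().decode()
```

Only the descriptor that was explicitly marked inheritable reaches the
child; everything else stays private to the server.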

If this seems reasonable and I can help in implementing this,
please let me know.

The change would spare other developers the same frustrating
experience I had.

Regards,
Henning


From martin at v.loewis.de  Sat Jun 30 19:24:41 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 30 Jun 2007 19:24:41 +0200
Subject: [Python-Dev] Proposal for a new function "open_noinherit" to
 avoid problems with subprocesses and security risks
In-Reply-To: <003a01c7bb0e$affc6b00$6401a8c0@max>
References: <003601c7b76a$9a2a0550$6401a8c0@max> <4680394F.6060001@v.loewis.de>
	<003a01c7bb0e$affc6b00$6401a8c0@max>
Message-ID: <468691D9.1000009@v.loewis.de>

> I think the best place for it would be as a new method "set_noinherit"
> for file and socket objects or as a new function in the os module
> (thus the implementation should probably be rewritten at the C level).

Indeed. Can you come up with a C implementation of it?
I think it should be a function in the posix/nt module, expecting
OS handles; the function in the os module could additionally support
sockets and file objects in a polymorphic way.

> This works until the queue is full (whose size was specified in
> listen()). THEN the client will get (10061, 'Connection refused');
> as you already explained.

That's for accept, yes. For send, you can continue sending until
the TCP window closes (plus some unspecified amount of local
buffering the OS might do).

> However, sometimes a subprocess might crash or hang.
> Now what happens if the server program is closed and then started
> again? On Linux, no problem (more or less). When the server program
> is closed, the subprocess will be killed by the OS (I think), and
> the socket is released (perhaps with a few minutes delay).

That's not true. The child process can run indefinitely even though
the parent process has terminated. You may be thinking of SIGHUP,
which is sent to all processes when the user logs out of
the terminal.
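
This can be checked with a small POSIX-only experiment on a current
Python 3 (a sketch under those assumptions; all names are invented for
the example). A "middle" process starts a grandchild and exits at once;
the grandchild sleeps briefly and then writes to a pipe it inherited,
which it could not do if it had been killed along with its parent.

```python
import os
import subprocess
import sys

# Grandchild: wait a moment, then report through the inherited descriptor.
GRANDCHILD = (
    "import os, sys, time; "
    "time.sleep(0.5); "
    "os.write(int(sys.argv[1]), b'still alive')"
)

# Middle process: start the grandchild and exit without waiting for it.
MIDDLE = (
    "import subprocess, sys; "
    "subprocess.Popen([sys.executable, '-c', sys.argv[1], sys.argv[2]], "
    "close_fds=False)"
)


def child_outlives_parent():
    read_fd, write_fd = os.pipe()
    os.set_inheritable(write_fd, True)
    middle = subprocess.Popen(
        [sys.executable, "-c", MIDDLE, GRANDCHILD, str(write_fd)],
        close_fds=False,
    )
    os.close(write_fd)
    middle.wait()  # the grandchild's parent has terminated at this point
    with os.fdopen(read_fd, "rb") as reader:
        # Blocks until every copy of the write end is closed, i.e. until
        # the orphaned grandchild has written its message and exited.
        return reader.read()
```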

Regards,
Martin


From ckkart at hoc.net  Wed Jun 27 13:43:49 2007
From: ckkart at hoc.net (Christian)
Date: Wed, 27 Jun 2007 20:43:49 +0900
Subject: [Python-Dev] csv changed from python 2.4 to 2.5
In-Reply-To: <20070627105041.869BA14C24B@irishsea.home.craig-wood.com>
References: <f5punr$8q2$1@sea.gmane.org>
	<20070627105041.869BA14C24B@irishsea.home.craig-wood.com>
Message-ID: <46824D75.9080808@hoc.net>

Nick Craig-Wood wrote:
> Christian K <ckkart at hoc.net> wrote:

[...]


>>  Python 2.5.1 (r251:54863, May  2 2007, 16:56:35)
>>  [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
>>  Type "help", "copyright", "credits" or "license" for more information.
>>>>> import csv
>>>>> d = csv.excel
>>>>> d.delimiter = ','
>>>>>
> 
> Don't you want to do this anyway?
> 
>   import csv
>   class my_dialect(csv.excel):
>       delimiter = ","
> 

I could probably do that, sure. I used to register my custom dialects and
retrieve and modify them elsewhere, thus probably misusing the registration
mechanism as a replacement for a global symbol.
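
For reference, a small sketch of the subclass-and-register pattern,
which leaves the shared csv.excel class untouched. The dialect name
"semicolon", the class name and the parse helper are made up for the
example.

```python
import csv
import io


class SemicolonDialect(csv.excel):
    """Custom dialect: like csv.excel, but with a different delimiter."""
    delimiter = ";"


# Register under a name so other modules can look it up later.
csv.register_dialect("semicolon", SemicolonDialect)


def parse(text, dialect="semicolon"):
    """Parse CSV text with a registered dialect and return the rows."""
    return list(csv.reader(io.StringIO(text), dialect=dialect))
```

csv.get_dialect("semicolon") retrieves the registered dialect again
elsewhere in the program.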

Thanks, Christian