From guido at python.org  Tue Nov  1 00:36:09 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 31 Oct 2005 16:36:09 -0700
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <43654858.9020108@v.loewis.de>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
	<17252.50390.256221.4882@montanaro.dyndns.org>
	<17252.59653.792906.582288@montanaro.dyndns.org>
	<43654858.9020108@v.loewis.de>
Message-ID: <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com>

Help!

What's the magic to get $Revision$ and $Date$ to be expanded upon
checkin? Comparing pep-0352.txt and pep-0343.txt, I noticed that the
latter has the svn revision and date in the headers, while the former
still has Brett's original revision 1.5 and a date somewhere in June.
I tried to fix this by rewriting the fields as $Revision$ and $Date$
but that doesn't seem to make a difference.

Googling for this is a bit tricky because Google collapses $Revision
and Revision, which makes any query for svn and $Revision rather
non-specific. :-(  It's also not yet in our Wiki.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Tue Nov  1 00:48:44 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 31 Oct 2005 18:48:44 -0500
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
	<17252.50390.256221.4882@montanaro.dyndns.org>
	<17252.59653.792906.582288@montanaro.dyndns.org>
	<43654858.9020108@v.loewis.de>
	<ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com>
Message-ID: <1f7befae0510311548v34da0695jc38e0a5c831256c8@mail.gmail.com>

[Guido]
> Help!
>
> What's the magic to get $Revision$ and $Date$ to be expanded upon
> checkin? Comparing pep-0352.txt and pep-0343.txt, I noticed that the
> latter has the svn revision and date in the headers, while the former
> still has Brett's original revision 1.5 and a date somewhere in June.
> I tried to fix this by rewriting the fields as $Revision$ and $Date$
> but that doesn't seem to make a difference.
>
> Googling for this is a bit tricky because Google collapses $Revision
> and Revision, which makes any query for svn and $Revision rather
> non-specific. :-(  It's also not yet in our Wiki.

You have to set the `svn:keywords` property on each file for which you
want these kinds of expansions:

    http://svnbook.red-bean.com/en/1.0/ch07s02.html#svn-ch-7-sect-2.3.4

Use

    svn propedit svn:keywords path/to/file

to set that property to what you want.

Looking at your examples,

C:\Code>svn proplist -v http://svn.python.org/projects/peps/trunk/pep-0343.txt
Properties on 'http://svn.python.org/projects/peps/trunk/pep-0343.txt':
  svn:keywords : Author Date Id Revision
  svn:eol-style : native

So that has svn:keywords set, and expansion occurs.  OTOH,

C:\Code>svn proplist -v http://svn.python.org/projects/peps/trunk/pep-0352.txt

Nada -- that one doesn't even have svn:eol-style set.

See

    http://wiki.python.org/moin/CvsToSvn

section "File Modes" for how to convince SVN to automatically set the
properties you want on new files you commit (unfortunately, each
developer has to do this in their own SVN config file).

From pinard at iro.umontreal.ca  Tue Nov  1 00:50:00 2005
From: pinard at iro.umontreal.ca (François Pinard)
Date: Mon, 31 Oct 2005 18:50:00 -0500
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
	<17252.50390.256221.4882@montanaro.dyndns.org>
	<17252.59653.792906.582288@montanaro.dyndns.org>
	<43654858.9020108@v.loewis.de>
	<ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com>
Message-ID: <20051031235000.GA14812@alcyon.progiciels-bpi.ca>

[Guido van Rossum]

>What's the magic to get $Revision$ and $Date$ to be expanded upon
>checkin?

Expansion does not occur on checkin, but on checkout, and even then, 
only in your copy -- that one you see (the internal Subversion copy is 
untouched).  You have to edit a property for the file where you want 
substitutions.  That property is named "svn:keywords" and its value 
decides which kind of substitution you want to allow.

This is all theory for me, I never used them.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From gherron at islandtraining.com  Tue Nov  1 00:54:46 2005
From: gherron at islandtraining.com (Gary Herron)
Date: Mon, 31 Oct 2005 15:54:46 -0800
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com>
References: <435BC27C.1010503@v.loewis.de>
	<2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
	<17252.50390.256221.4882@montanaro.dyndns.org>
	<17252.59653.792906.582288@montanaro.dyndns.org>
	<43654858.9020108@v.loewis.de>
	<ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com>
Message-ID: <4366AEC6.2000803@islandtraining.com>

Guido van Rossum wrote:

>Help!
>
>What's the magic to get $Revision$ and $Date$ to be expanded upon
>checkin? Comparing pep-0352.txt and pep-0343.txt, I noticed that the
>latter has the svn revision and date in the headers, while the former
>still has Brett's original revision 1.5 and a date somewhere in June.
>I tried to fix this by rewriting the fields as $Revision$ and $Date$
>but that doesn't seem to make a difference.
>
>Googling for this is a bit tricky because Google collapses $Revision
>and Revision, which makes any query for svn and $Revision rather
>non-specific. :-(  It's also not yet in our Wiki.
>  
>
It's an svn property associated with the file.  The property name is 
svn:keywords, and the value is a space separated list of keywords you'd 
like to have substituted.  Like this:

svn propset svn:keywords "Date Revision" ...file list...

The list of keywords it will handle is
  LastChangedDate (or Date)
  LastChangedRevision (or Revision or Rev)
  LastChangedBy (or Author)
  HeadURL (or URL)
  Id
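
To make the effect of the property concrete, here is a minimal Python
sketch of keyword expansion. It is not Subversion's implementation:
the function name expand_keywords, the values dictionary and the
revision/date strings are placeholders invented for illustration, and
it assumes that enabling any one spelling of a keyword enables all of
its aliases from the list above.

```python
import re

# Aliases as listed above, canonical name first.
KEYWORD_ALIASES = {
    "LastChangedDate": ("LastChangedDate", "Date"),
    "LastChangedRevision": ("LastChangedRevision", "Revision", "Rev"),
    "LastChangedBy": ("LastChangedBy", "Author"),
    "HeadURL": ("HeadURL", "URL"),
    "Id": ("Id",),
}


def expand_keywords(text, svn_keywords, values):
    """Expand $Keyword$ anchors for the keywords enabled in svn_keywords.

    svn_keywords is the space-separated property value, e.g. "Date Revision".
    values maps each enabled canonical keyword name to its replacement text.
    """
    enabled = set(svn_keywords.split())
    for canonical, aliases in KEYWORD_ALIASES.items():
        if not enabled.intersection(aliases):
            continue  # keyword not enabled: leave its anchors untouched
        for alias in aliases:
            # Matches both "$Date$" and an already expanded "$Date: ... $".
            pattern = re.compile(r"\$" + alias + r"(?::[^$\n]*)?\$")
            text = pattern.sub(
                lambda m, a=alias: "$%s: %s $" % (a, values[canonical]), text
            )
    return text


header = "Version: $Revision$\nLast-Modified: $Date$\nAuthor: $Author$\n"
placeholders = {"LastChangedRevision": "12345", "LastChangedDate": "2005-10-31"}
expanded = expand_keywords(header, "Date Revision", placeholders)
```

With the property set to "Date Revision", only those two anchors are
rewritten and $Author$ stays as typed. With an empty property value the
text comes back unchanged, which matches the pep-0352.txt symptom.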

Gary Herron


From greg.ewing at canterbury.ac.nz  Tue Nov  1 02:03:25 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 Nov 2005 14:03:25 +1300
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
 conversions).
In-Reply-To: <20051031022554.GA20255@alcyon.progiciels-bpi.ca>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	<4362A44F.9010506@v.loewis.de>
	<20051029110331.D5AA.ISHIMOTO@gembook.org>
	<4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol>
	<43638BC0.40108@v.loewis.de>
	<20051031022554.GA20255@alcyon.progiciels-bpi.ca>
Message-ID: <4366BEDD.9020100@canterbury.ac.nz>

François Pinard wrote:

> All development is done in house by French people.  All documentation, 
> external or internal, comments, identifier and function names, 
> everything is in French.

There's nothing stopping you from creating your own
Frenchified version of Python that lets you use all
the characters you want, for your own in-house use.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Nov  1 02:24:11 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 Nov 2005 14:24:11 +1300
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
References: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
Message-ID: <4366C3BB.3010407@canterbury.ac.nz>

Martin Blais wrote:

> I'm always--literally every time-- looking for a more functional form,
> something that would be like this:
> 
>    # apply dirname() 3 times on its results, initializing with p
>    ... = repapply(dirname, 3, p)

Maybe ** should be defined for functions so that you
could do things like

   up3levels = dirname ** 3
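
Both spellings can be tried out today without changing the language.
The following is only a sketch: repapply follows the signature used in
the quoted example, while the Composable wrapper class is a
hypothetical helper invented here to give a one-argument function a **
operator. posixpath is used so the result does not depend on the
platform's path separator.

```python
import posixpath


def repapply(func, n, arg):
    """Apply func to arg n times, feeding each result back in."""
    for _ in range(n):
        arg = func(arg)
    return arg


class Composable:
    """Hypothetical wrapper giving a one-argument function a ** operator."""

    def __init__(self, func):
        self.func = func

    def __call__(self, arg):
        return self.func(arg)

    def __pow__(self, n):
        # f ** n is the function that applies f n times.
        return Composable(lambda arg: repapply(self.func, n, arg))


dirname = Composable(posixpath.dirname)
up3levels = dirname ** 3
```

Here up3levels("/a/b/c/d/e.txt") and
repapply(posixpath.dirname, 3, "/a/b/c/d/e.txt") both give "/a/b".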

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From pinard at iro.umontreal.ca  Tue Nov  1 03:51:15 2005
From: pinard at iro.umontreal.ca (François Pinard)
Date: Mon, 31 Oct 2005 21:51:15 -0500
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <4366BEDD.9020100@canterbury.ac.nz>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	<4362A44F.9010506@v.loewis.de>
	<20051029110331.D5AA.ISHIMOTO@gembook.org>
	<4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol>
	<43638BC0.40108@v.loewis.de>
	<20051031022554.GA20255@alcyon.progiciels-bpi.ca>
	<4366BEDD.9020100@canterbury.ac.nz>
Message-ID: <20051101025115.GA18573@alcyon.progiciels-bpi.ca>

[Greg Ewing]

>> All development is done in house by French people.  All documentation, 
>> external or internal, comments, identifier and function names, 
>> everything is in French.

> There's nothing stopping you from creating your own Frenchified 
> version of Python that lets you use all the characters you want, for 
> your own in-house use.

No doubt that we, you and me and everybody, could all have our own 
little version of Python.  :-)

To tell the whole truth, the very topic of your suggestion has already 
been discussed in-house, and the decision has been to stick to 
Python mainstream.  We could not justify to our administration that we 
start modifying our sources, in such a way that we would have to invest 
in maintenance each time a new Python version appears, forever.

On the other hand, we may reasonably guess that many people in this 
world would love being as comfortable as possible using Python, while 
naming identifiers naturally.  It is not so unreasonable that we keep 
some _hope_ that Guido will soon choose to help us all, not only me.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From amk at amk.ca  Tue Nov  1 15:35:05 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Tue, 1 Nov 2005 09:35:05 -0500
Subject: [Python-Dev] python-dev sprint at PyCon
Message-ID: <20051101143505.GE14719@rogue.amk.ca>

Every PyCon has featured a python-dev sprint.  For the past few years,
hacking on the AST branch has been a tradition, but we'll have to come
up with something new for this year's conference (in Dallas Texas;
sprints will be Monday Feb. 27 through Thursday March 2).

According to Anthony's release plan, a first alpha of 2.5 would be
released in March, hence after PyCon and the sprints.  We should
discuss possible tasks for a python-dev sprint.  What could we do?

When the discussion is over, someone should update the wiki page with
whatever tasks are suggested:
<http://wiki.python.org/moin/PyCon2006/Sprints>.

--amk


From dave at boost-consulting.com  Tue Nov  1 17:25:23 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Tue, 01 Nov 2005 11:25:23 -0500
Subject: [Python-Dev] [C++-sig]  GCC version compatibility
References: <42CDA654.2080106@v.loewis.de> <uu0j6p7z1.fsf@boost-consulting.com>
	<20050708072807.GC3581@lap200.cdc.informatik.tu-darmstadt.de>
	<u8y0hl45u.fsf@boost-consulting.com> <42CEF948.3010908@v.loewis.de>
	<20050709102010.GA3836@lap200.cdc.informatik.tu-darmstadt.de>
	<42D0D215.9000708@v.loewis.de>
	<20050710125458.GA3587@lap200.cdc.informatik.tu-darmstadt.de>
	<42D15DB2.3020300@v.loewis.de>
	<20050716101357.GC3607@lap200.cdc.informatik.tu-darmstadt.de>
	<20051012120917.GA11058@lap200.cdc.informatik.tu-darmstadt.de>
Message-ID: <u64rc49os.fsf@boost-consulting.com>

Christoph Ludwig <cludwig at cdc.informatik.tu-darmstadt.de> writes:

> Hi,
>
> this is to continue a discussion started back in July by a posting by 
> Dave Abrahams <url:http://thread.gmane.org/gmane.comp.python.devel/69651>
> regarding the compiler (C vs. C++) used to compile python's main() and to link
> the executable.
>
>
> On Sat, Jul 16, 2005 at 12:13:58PM +0200, Christoph Ludwig wrote:
>> On Sun, Jul 10, 2005 at 07:41:06PM +0200, "Martin v. Löwis" wrote:
>> > Maybe. For Python 2.4, feel free to contribute a more complex test. For
>> > Python 2.5, I would prefer if the entire code around ccpython.cc was
>> > removed.
>> 
>> I submitted patch #1239112 that implements the test involving two TUs for
>> Python 2.4. I plan to work on a more comprehensive patch for Python 2.5 but
>> that will take some time.
>
>
> I finally had the spare time to look into this problem again and submitted
> patch #1324762. The proposed patch implements the following:

I just wanted to write to encourage some Python developers to look at
(and accept!) Christoph's patch.  This is really crucial for smooth
interoperability between C++ and Python.

Thank you,
Dave

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From pje at telecommunity.com  Tue Nov  1 18:16:52 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 Nov 2005 12:16:52 -0500
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <20051101143505.GE14719@rogue.amk.ca>
Message-ID: <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com>

At 09:35 AM 11/1/2005 -0500, A.M. Kuchling wrote:
>Every PyCon has featured a python-dev sprint.  For the past few years,
>hacking on the AST branch has been a tradition, but we'll have to come
>up with something new for this year's conference (in Dallas Texas;
>sprints will be Monday Feb. 27 through Thursday March 2).
>
>According to Anthony's release plan, a first alpha of 2.5 would be
>released in March, hence after PyCon and the sprints.  We should
>discuss possible tasks for a python-dev sprint.  What could we do?

* PEP 343 implementation ('with:')
* PEP 308 implementation ('x if y else z')
* A bytes type

Or perhaps some of the things that have been waiting for the AST branch to 
be finished, i.e.:

* One of the "global variable speedup" PEPs
* Guido's instance variable speedup idea (LOAD_SELF_IVAR and 
STORE_SELF_IVAR, see 
http://mail.python.org/pipermail/python-dev/2002-February/019854.html)


From guido at python.org  Tue Nov  1 18:22:16 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 Nov 2005 10:22:16 -0700
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com>
References: <20051101143505.GE14719@rogue.amk.ca>
	<5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com>
Message-ID: <ca471dc20511010922g2f463d7en5d9bc8dbc5a26c92@mail.gmail.com>

On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:35 AM 11/1/2005 -0500, A.M. Kuchling wrote:
> >Every PyCon has featured a python-dev sprint.  For the past few years,
> >hacking on the AST branch has been a tradition, but we'll have to come
> >up with something new for this year's conference (in Dallas Texas;
> >sprints will be Monday Feb. 27 through Thursday March 2).
> >
> >According to Anthony's release plan, a first alpha of 2.5 would be
> >released in March, hence after PyCon and the sprints.  We should
> >discuss possible tasks for a python-dev sprint.  What could we do?
>
> * PEP 343 implementation ('with:')
> * PEP 308 implementation ('x if y else z')
> * A bytes type

* PEP 328 - absolute/relative import
* PEP 341 - unifying try/except and try/finally (I believe this was
accepted; it's still marked Open in PEP 0)

> Or perhaps some of the things that have been waiting for the AST branch to
> be finished, i.e.:
>
> * One of the "global variable speedup" PEPs
> * Guido's instance variable speedup idea (LOAD_SELF_IVAR and
> STORE_SELF_IVAR, see
> http://mail.python.org/pipermail/python-dev/2002-February/019854.html)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Tue Nov  1 18:59:26 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 1 Nov 2005 09:59:26 -0800
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <ca471dc20511010922g2f463d7en5d9bc8dbc5a26c92@mail.gmail.com>
References: <20051101143505.GE14719@rogue.amk.ca>
	<5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com>
	<ca471dc20511010922g2f463d7en5d9bc8dbc5a26c92@mail.gmail.com>
Message-ID: <ee2a432c0511010959h7348679endf7ecc4bdf12d7a9@mail.gmail.com>

On 11/1/05, Guido van Rossum <guido at python.org> wrote:
> On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 09:35 AM 11/1/2005 -0500, A.M. Kuchling wrote:
> > >Every PyCon has featured a python-dev sprint.  For the past few years,
> > >hacking on the AST branch has been a tradition, but we'll have to come
> > >up with something new for this year's conference (in Dallas Texas;
> > >sprints will be Monday Feb. 27 through Thursday March 2).
> > >
> > >According to Anthony's release plan, a first alpha of 2.5 would be
> > >released in March, hence after PyCon and the sprints.  We should
> > >discuss possible tasks for a python-dev sprint.  What could we do?

I added the 4 PEPs mentioned and a few more ideas here:

  http://wiki.python.org/moin/PyCon2006/Sprints/PythonCore

n

From pje at telecommunity.com  Tue Nov  1 19:02:09 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 Nov 2005 13:02:09 -0500
Subject: [Python-Dev] python-dev sprint at PyCon
Message-ID: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>

At 10:22 AM 11/1/2005 -0700, Guido van Rossum wrote:
>* PEP 328 - absolute/relative import

I assume that references to 2.4 in that PEP should be changed to 2.5, and 
so on.

It also appears to me that the PEP doesn't record the issue brought up by 
some people about the current absolute/relative ambiguity being useful for 
packaging purposes.  i.e., being able to nest third-party packages such 
that they end up seeing their dependencies, even though they're not 
installed at the "root" package level.

For example, I have a package that needs Python 2.4's version of pyexpat, 
and I need it to run in 2.3, but I can't really overwrite the 2.3 pyexpat, 
so I just build a backported pyexpat and drop it in the package, so that 
the code importing it just ends up with the right thing.

Of course, that specific example is okay since 2.3 isn't going to somehow 
grow absolute importing.  :)  But I think people brought up other examples 
besides that, it's just the one that I personally know I've done.


From guido at python.org  Tue Nov  1 19:14:46 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 Nov 2005 11:14:46 -0700
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>
References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>
Message-ID: <ca471dc20511011014o721c0d88w9244915e368a1a6c@mail.gmail.com>

On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:22 AM 11/1/2005 -0700, Guido van Rossum wrote:
> >* PEP 328 - absolute/relative import
>
> I assume that references to 2.4 in that PEP should be changed to 2.5, and
> so on.

For the part that hasn't been implemented yet, yes.

> It also appears to me that the PEP doesn't record the issue brought up by
> some people about the current absolute/relative ambiguity being useful for
> packaging purposes.  i.e., being able to nest third-party packages such
> that they end up seeing their dependencies, even though they're not
> installed at the "root" package level.
>
> For example, I have a package that needs Python 2.4's version of pyexpat,
> and I need it to run in 2.3, but I can't really overwrite the 2.3 pyexpat,
> so I just build a backported pyexpat and drop it in the package, so that
> the code importing it just ends up with the right thing.
>
> Of course, that specific example is okay since 2.3 isn't going to somehow
> grow absolute importing.  :)  But I think people brought up other examples
> besides that, it's just the one that I personally know I've done.

I guess this ought to be recorded. :-(

The issue has been beaten to death and my position remains firm:
rather than playing namespace games, consistent renaming is the right
thing to do here. This becomes a trivial source edit, which beats the
problems of debugging things when it doesn't work out as expected
(which is very common due to the endless subtleties of loading
multiple versions of the same code).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue Nov  1 19:28:12 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 Nov 2005 13:28:12 -0500
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <ca471dc20511011014o721c0d88w9244915e368a1a6c@mail.gmail.com>
References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>
	<5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com>

At 11:14 AM 11/1/2005 -0700, Guido van Rossum wrote:
>I guess this ought to be recorded. :-(
>
>The issue has been beaten to death and my position remains firm:
>rather than playing namespace games, consistent renaming is the right
>thing to do here. This becomes a trivial source edit,

Well, it's not trivial if you're (in my case) trying to support 2.3 and 2.4 
with the same code base.

It'd be nice to have some other advice to offer people besides, "go edit 
your code".  Of course, if the feature hadn't already existed, I suppose a 
PEP to add it would have been shot down, so it's a reasonable decision.


>which beats the
>problems of debugging things when it doesn't work out as expected
>(which is very common due to the endless subtleties of loading
>multiple versions of the same code).

Yeah, Bob Ippolito and I batted around a few ideas about how to implement 
simultaneous multi-version imports for Python Eggs, some of which relied on 
the relative/absolute ambiguity, but I think the main subtleties have to do 
with dynamic imports (including pickling) and the use of __name__.

Of course, since we never actually implemented it, I don't know what other 
subtleties could potentially exist.  Python Eggs currently allow you to 
install multiple versions of a package, but at runtime you can only import 
one of them, and you get a runtime VersionConflict exception if two eggs' 
version criteria are incompatible.


From nnorwitz at gmail.com  Tue Nov  1 19:34:29 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 1 Nov 2005 10:34:29 -0800
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com>
References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>
	<5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com>
Message-ID: <ee2a432c0511011034g678f93dbvca06cc44c0c643b7@mail.gmail.com>

On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 11:14 AM 11/1/2005 -0700, Guido van Rossum wrote:
> >I guess this ought to be recorded. :-(
> >
> >The issue has been beaten to death and my position remains firm:
> >rather than playing namespace games, consistent renaming is the right
> >thing to do here. This becomes a trivial source edit,
>
> Well, it's not trivial if you're (in my case) trying to support 2.3 and 2.4
> with the same code base.
>
> It'd be nice to have some other advice to offer people besides, "go edit
> your code".  Of course, if the feature hadn't already existed, I suppose a
> PEP to add it would have been shot down, so it's a reasonable decision.

Why can't you add your version's directory to sys.path before importing pyexpat?
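
A minimal sketch of that idea, assuming the replacement module lives in
a directory of its own. The module name demo_backport and the temporary
directory are stand-ins invented so the example runs anywhere; they are
not part of any real package.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for a directory that ships a backported module.
bundle_dir = tempfile.mkdtemp()
with open(os.path.join(bundle_dir, "demo_backport.py"), "w") as f:
    f.write("VERSION = 'backported'\n")

# Put the directory first so it is searched before the standard locations.
sys.path.insert(0, bundle_dir)
importlib.invalidate_caches()  # the directory was created after startup

import demo_backport

print(demo_backport.VERSION)
```

Two caveats apply: the change affects every later import in the
process, and a module already present in sys.modules under the same
name is returned as-is without sys.path being searched.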

n

From jcarlson at uci.edu  Tue Nov  1 19:48:46 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 01 Nov 2005 10:48:46 -0800
Subject: [Python-Dev] apparent ruminations on mutable immutables (was:
	PEP 351, the freeze protocol)
In-Reply-To: <b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com>
References: <20051031120205.3A0C.JCARLSON@uci.edu>
	<b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com>
Message-ID: <20051101104731.0389.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> On 10/31/05, Josiah Carlson <jcarlson at uci.edu> wrote:

> > > About the users-changing-my-internal-data issue:
> ...
> > You can have a printout before it dies:
> > "I'm crashing your program because something attempted to modify a data
> > structure (here's the traceback), and you were told not to."
> >
> > Then again, you can even raise an exception when people try to change
> > the object, as imdict does, as tuples do, etc.
> 
> Both solutions would solve the problem, but would require me to wrap
> the built-in set with something which doesn't allow changes. This is a
> lot of work - but it's quite similar to what my solution would
> actually do, in a single built-in function.

I am an advocate for PEP 351.  However, I am against your proposed
implementation/variant of PEP 351 because I don't believe it adds enough
to warrant the additional complication and overhead necessary for every
object (even tuples would need to get a .frozen_cache member).

Give me a recursive freeze from PEP 351 (which handles objects that are
duplicated, but errors out on circular references), and I'll be happy.
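
For concreteness, a sketch of what such a recursive freeze might look
like. It is not the PEP's reference code: the _memo and _active
bookkeeping is an assumption of this sketch, and a dict is turned into
a frozenset of (key, frozen value) pairs as a stand-in for imdict.

```python
def freeze(obj, _memo=None, _active=None):
    """Return a deeply immutable copy of obj.

    An object reachable twice is frozen once and shared; a structure
    that contains itself raises ValueError.
    """
    if _memo is None:
        _memo, _active = {}, set()
    if isinstance(obj, (int, float, complex, str, bytes, type(None))):
        return obj  # already immutable
    key = id(obj)
    if key in _memo:
        return _memo[key]  # duplicated object: reuse its frozen copy
    if key in _active:
        raise ValueError("cannot freeze a circular reference")
    _active.add(key)
    try:
        if isinstance(obj, (list, tuple)):
            frozen = tuple(freeze(x, _memo, _active) for x in obj)
        elif isinstance(obj, (set, frozenset)):
            frozen = frozenset(freeze(x, _memo, _active) for x in obj)
        elif isinstance(obj, dict):
            # Stand-in for imdict: an immutable set of (key, value) pairs.
            frozen = frozenset(
                (k, freeze(v, _memo, _active)) for k, v in obj.items()
            )
        else:
            raise TypeError("cannot freeze %s" % type(obj).__name__)
    finally:
        _active.discard(key)
    _memo[key] = frozen
    return frozen


shared = [1, 2]
nested = freeze([shared, shared, {"k": [3]}])

loop = []
loop.append(loop)  # freeze(loop) raises ValueError
```

Each call starts with a fresh memo, so nothing is cached between calls
and no extra attribute is needed on the objects being frozen.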


> > > You suggest two ways for solving the problem. The first is by copying
> > > my mutable objects to immutable copies:
> >
> > And by caching those results, then invalidating them when they are
> > updated by your application.  This is the same as what you would like to
> > do, except that I do not rely on copy-on-write semantics, which aren't
> > any faster than freeze+cache by your application.
> 
> This isn't correct - freezing a set won't require a single copy to be
> performed, as long as the frozen copy isn't saved after the original
> is changed. Copy+cache always requires one copy.

You are wrong, and you even say you are wrong..."freezing a set doesn't
require a COPY, IF the frozen COPY isn't saved after the original is
CHANGED". Creating an immutable set IS CREATING A COPY, so it ALSO
copies, and you admit as much, but then say the equivalent of "copying
isn't copying because I say so".


> > In any case, whether you choose to use freeze, or use a different API,
> > this particular problem is solvable without copy-on-write semantics.
> 
> Right. But I think that a significant simplification of the API is a
> nice bonus for my solution. And about those copy-on-write semantics -
> it should be proven how complex they are. Remember that we are talking
> about frozen-copy-on-write, which I think would simplify matters
> considerably - for example, there are at most two instances sharing
> the same data, since the frozen copy can be returned again and again.

I think that adding an additional attribute to literally every single
object to handle the caching of 'frozen' objects, as well as a list to
every object to handle callbacks which should be called on object
mutation, along with a _call_stuff_when_mutated() method that handles
these callback calls, IN ADDITION TO the __freeze__ method which is
necessary to support this, is a little much, AND IS CERTAINLY NOT A
SIMPLIFICATION!

Let us pause for a second and consider:
Original PEP proposed 1 new method: __freeze__, which could be
implemented as a subclass of the original object (now), and integrated
into the original classes as time goes on.  One could /register/
__freeze__ functions/methods a'la Pickle, at which point objects
wouldn't even need a native freeze method.

Your suggestion offers 2 new methods along with 2 new instance variables. 
Let's see, a callback handler, __freeze__, the cache, and the callback
list.  Doesn't that seem a little excessive to you to support freezing?
It does to me.  If Guido were to offer your implementation of freeze, or
no freeze at all, I would opt for no freeze, as implementing your freeze
on user-defined classes would be a pain in the ass, not to mention
implementing them in C code would be more than I would care to do, and
more than I would ask any of the core developers to work on.
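
To show the bookkeeping being counted here, a rough sketch of a single
user-defined container carrying all four pieces. The class name
CachingList and the attribute names _frozen_cache and _callbacks are
invented for illustration; only __freeze__ and
_call_stuff_when_mutated() are names used above, and this is my reading
of the proposal rather than anyone's actual implementation.

```python
class CachingList:
    """Hypothetical list-like class with a cached frozen copy."""

    def __init__(self, items=()):
        self._items = list(items)
        self._frozen_cache = None  # instance variable 1: the cache
        self._callbacks = []       # instance variable 2: the callback list

    def _call_stuff_when_mutated(self):
        """Method 1: drop the cache and notify whoever registered."""
        self._frozen_cache = None
        for callback in self._callbacks:
            callback()

    def __freeze__(self):
        """Method 2: build the frozen copy once, then hand it out again."""
        if self._frozen_cache is None:
            self._frozen_cache = tuple(self._items)
        return self._frozen_cache

    def append(self, item):
        # Every mutating method has to remember to do this.
        self._items.append(item)
        self._call_stuff_when_mutated()


c = CachingList([1, 2])
first = c.__freeze__()
again = c.__freeze__()   # same object as first, served from the cache
c.append(3)
after = c.__freeze__()   # rebuilt, because append() invalidated the cache
```

A nested container would additionally register a callback on each
child so that a change deep inside invalidates the parent's cache,
which is where the O(k) callbacks mentioned further down come from.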


> > Even without validation, there are examples that force a high number of
> > calls, which are not O(1), amortized or otherwise.
> >
> [Snap - a very interesting example]
> >
> > Now, the actual time analysis on repeated freezings and such gets ugly.
> > There are actually O(k) objects, which take up O(k**2) space.  When you
> > modify object b[i][j] (which has just been frozen), you get O(k)
> > callbacks, and when you call freeze(b), it actually results in O(k**2)
> > time to re-copy the O(k**2) pointers to the O(k) objects.  It should be
> > obvious that this IS NOT AMORTIZABLE to original object creation time.
> >
> That's absolutely right. My amortized analysis is correct only if you
> limit yourself to cases in which the original object doesn't change
> after a frozen() call was made. In that case, it's ok to count the
> O(k**2) copy with the O(k**2) object creation, because it's made only
> once.

But here's the crucial observation which you are missing.  You yourself
have stated that in both your table and graph examples you want your
application to continue to modify values while the user can't manipulate
them.  So even in your own use-cases, you are going to be modifying
objects after they have been frozen, and even then it won't be fast!

I believe that in general, people who are freezing things are going to
want to be changing the original objects - hence the use of mutables to
begin with - maybe for situations like yours where you don't want users
mutating returns, whatever.  If after they have frozen the object, they
don't want to be changing the original objects, then they are probably
going to be tossing out the original mutable and using the immutable
created with freeze anyways (mutate your object until you get it right,
then freeze it and use that so that no one can alter your data, not even
yourself), so I think that caching is something that the /user/ should
be doing, NOT Python.

The simple implementation (not copy-on-write) leads us to a simple
matter of documenting, "Freeze is 'stateless'; every call to freeze
returns a new object, regardless of modifications (or lack thereof)
between freeze calls."

Remember: "Simple is better than complex."


> Why it's ok to analyze only that limited case? I am suggesting a
> change in Python: that every object you would like be mutable, and
> would support the frozen() protocol. When you evaluate my suggestion,
> you need to take a program, and measure its performance in the current
> Python and in a Python which implements my suggestion. This means that
> the program should work also on the current Python. In that case, my
> assumption is true - you won't change objects after you have frozen
> them, simply because these objects (strings which are used as dict
> keys, for example) can't be changed at all in the current Python
> implementation!

Not everything can/should become mutable.  Integers should never become
mutable, as tuples should never become mutable, as strings/unicode
should never become mutable...wait, aren't we getting to the point that
everything which is currently immutable shouldn't become mutable? 
Indeed.  I don't believe that any currently immutable object should be
able to become mutable in order to satisfy /anyone's/ desire for mutable
/anything/.


In starting to bring up benchmarks you are falling into the trap of
needing to /have/ a benchmark (I have too), for which there are very few,
if any, current use-cases.

Without having or needing a benchmark, I'll state quite clearly where
your proposed copy-on-write would beat out the naive 'create a new copy
on every call to freeze':
1. If objects after they are frozen are never modified, copy on write
will be faster.
2. If original objects are modified after they are frozen, then the
naive implementation will be as fast if not faster in general, due to
far lower overhead, but may be slower in corner cases where some nested
structure is unchanged, and some shallow bit has changed:

    x = [[], NEVER_CHANGED_MUTABLE_NESTED_STRUCTURE]
    y = freeze(x)
    x[0].append(1)
    z = freeze(x)

Further, discussing benchmarks on use-cases, for which there are few (if
any) previously existing uses, is like saying "let's race cars" back in
1850; it's a bit premature.
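
To make that corner case concrete, here is a minimal sketch of the
naive, stateless approach.  The freeze() helper is hypothetical (it is
not the PEP 351 reference implementation) and only handles lists,
tuples and sets; it rebuilds the whole structure on every call,
including the nested part that never changed.

```python
def freeze(obj):
    """Hypothetical recursive freeze: returns a new immutable object each call."""
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    hash(obj)  # anything else must already be hashable, or TypeError is raised
    return obj


NEVER_CHANGED = [[1, 2], [3, 4]]   # stand-in for a large nested structure
x = [[], NEVER_CHANGED]
y = freeze(x)
x[0].append(1)
z = freeze(x)   # the unchanged nested part is walked and copied again
```

The second call does the same work on NEVER_CHANGED as the first, which
is the only place a cached or copy-on-write scheme could win.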


Then there is this other example:

    x = [1,2,3]
    y = freeze(x)

The flat version of freeze in the PEP right now handles this case.  I
can change x all I want, yet I have a frozen y which stays unchanged. 
This is what I would want, and I would imagine it is what others would
want too.  In fact, this is precisely the use-case you offered for your
table and graph examples, so your expression of a sentiment of "users
aren't going to be changing the object after it has been frozen" is, by
definition, wrong: you do it yourself!
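
Spelled out, the behaviour I mean looks roughly like this.  It is only
a sketch: flat_freeze() is a made-up name standing in for the flat
freeze described in the PEP, and it covers just lists and sets.

```python
def flat_freeze(obj):
    """Hypothetical one-level freeze: copies the container, not its contents."""
    if isinstance(obj, list):
        return tuple(obj)
    if isinstance(obj, set):
        return frozenset(obj)
    raise TypeError("don't know how to freeze %r" % type(obj).__name__)


x = [1, 2, 3]
y = flat_freeze(x)

# Keep changing the original; the frozen result is unaffected.
x.append(4)
x[0] = 99
```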


> I will write it in another way: I am proposing a change that will make
> Python objects, including strings, mutable, and gives you other
> advantages as well. I claim that it won't make existing Python
> programs run slower in O() terms. It would allow you to do many things
> that you can't do today; some of them would be fast, like editing a
> string, and some of them would be less fast - for example, repeatedly
> changing an object and freezing it.

Your claim on running time only works if the original isn't changed
after it is frozen.

And I don't like making everything mutable; it's a "solution looking for
a problem", or a "tail wagging the dog" idea.  There is no good reason
to make everything mutable, and I challenge you to come up with a valid
one that isn't already covered by the existing standard library or
extension modules.

There is no need to bring strings into this conversation as there are
string variants which are already mutable: array.array('c', ...),
StringIO, mmap, take your pick!  And some future Python (perhaps 2.5)
will support a 'bytes' object, which is essentially an mmap which
doesn't need to be backed by a file.


> I think that the performance penalty may be rather small - remember
> that in programs which do not change strings, there would never be a
> need to copy the string data at all. And since I think that usually
> most of the dict lookups are for method or function names, there would
> almost never be a need to construct a new object on dict lookup,
> because you search for the same names again and again, and a new
> object is created only on the first frozen() call. You might even gain
> performance, because s += x would be faster.

You really don't know how Python internals work.

The slow part of s += x on strings in Python 2.4 is the memory
reallocation and occasional data copy (it has been tuned like crazy by
Raymond in 2.4, see _PyString_Resize in stringobject.c). Unless you
severely over-allocated your strings, this WOULD NOT BE SPED UP BY
MUTABLE STRINGS.

Further, identifiers/names (obj, obj.attr, obj.attr1.attr2, ...) are
already created at compile time, and are 'interned'.  That is, if
you have an object that you call 'foo', there is a single, never-changing
"foo" string, and any code in that module which references the 'foo'
object holds a pointer to that one string.  And because the string has
already been hashed, it has a cached hash value, and lookups in
dictionaries are already fast due to a check for pointer equality
before comparing contents.  Mutable strings CANNOT be faster than this
method.
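
A small illustration of the interning behaviour, as a sketch only.  It
uses the Python 3 spelling sys.intern() (in 2.4 the same thing is the
builtin intern()), and the strings are built at run time so that the
compiler does not merge them on its own.

```python
import sys

# Build the two strings at run time so the compiler cannot merge them.
a = "".join(["fo", "o_name"])
b = "".join(["foo_", "name"])

a = sys.intern(a)
b = sys.intern(b)
same_after = a is b              # interning maps both to one shared object

# With one shared object, a dict lookup can succeed on the pointer check
# alone, and the string's hash is computed once and then cached.
namespace = {a: "value"}
found = namespace[b]
```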


> > You have clarified it, but it is still wrong.  I stand by 'it is not
> > easy to get right', and would further claim, "I doubt it is possible to
> > make it fast."
> 
> It would not be very easy to implement, of course, but I hope that it
> won't be very hard either, since the basic idea is quite simple. Do
> you still doubt the possibility of making it fast, given my (correct)
> definition of fast?

I would claim that your definition is limited.  Yours would be fast if
objects never changed after they are frozen, which is counter to your
own use-cases.  This suggests that your definition is in fact incorrect,
and you fail to see your own inconsistency.


> And if it's possible (which I think it is), it would allow us to get
> rid of inconvenient immutable objects, and it would let us put
> everything into a set. Isn't that nice?

No, it sounds like a solution looking for a problem.  I see no need to
make strings, floats, ints, tuples, etc. mutable, and I think that you
will have very little luck in getting core Python developer support for
any attempt to make them mutable.

If you make such a suggestion, I would offer that you create a new PEP,
because this discussion has gone beyond PEP 351, and has wandered into
the realm of "What other kinds of objects would be interesting to have
in a Python-like system?"



I'll summarize your claims:
1. copy-on-write is a simplification
2. everything being mutable would add to Python
3. copy-on-write is fast
4. people won't be mutating objects after they are frozen

I'll counter your claims:
1. 2 methods and 2 instance variables on ALL OBJECTS is not a
simplification.
2. a = b = 1; a += 1;  If all objects were to become mutable, then a ==
b, despite what Python and every other sane language would tell you, and
dct[a] would stop working (you would have to use c = freeze(a);dct[c],
or dct[x] would need to automatically call freeze and only ever
reference the result, significantly slowing down ALL dictionary
references).
3. only if you NEVER MUTATE an object after it has been frozen
4. /you/ mutate original objects after they are frozen

ALSO:
5. You fail to realize that if all objects were to become mutable, then
one COULDN'T implement frozen, because the frozen objects THEMSELVES
would be mutable.
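
Point 2 above can be seen today by comparing an int with a list.  This
is just an illustration using current semantics; the list stands in for
what a hypothetical mutable int would do.

```python
# Immutable int: += rebinds the name, so the alias is unaffected.
a = b = 1
a += 1
ints_equal = (a == b)            # a is 2, b is still 1

# Mutable list: += mutates in place, so the alias sees the change.
p = q = [1]
p += [2]
lists_equal = (p == q)           # both names refer to the same [1, 2]

# A mutable object cannot be used as a dict key today.
try:
    {}[p]
    key_error = None
except TypeError as exc:
    key_error = exc
```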


I'm going to bow out of this discussion for a few reasons, not the least
of which being that I've spent too much time on this subject, and that I
think it is quite clear that your proposal is dead, whether I had
anything to do with it or not.

 - Josiah


From pje at telecommunity.com  Tue Nov  1 19:50:00 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 Nov 2005 13:50:00 -0500
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <ee2a432c0511011034g678f93dbvca06cc44c0c643b7@mail.gmail.com>
References: <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com>
	<5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>
	<5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051101134754.0380cf68@mail.telecommunity.com>

At 10:34 AM 11/1/2005 -0800, Neal Norwitz wrote:
>Why can't you add your version's directory to sys.path before importing 
>pyexpat?

With library code that can be imported in any order, there is no such thing 
as "before".  Anyway, Guido has pronounced on this already, so it's moot.


From guido at python.org  Tue Nov  1 20:39:39 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 Nov 2005 12:39:39 -0700
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com>
References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>
	<5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com>
Message-ID: <ca471dc20511011139h2076b250mfa144d60f57a0fcb@mail.gmail.com>

On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 11:14 AM 11/1/2005 -0700, Guido van Rossum wrote:
> >I guess this ought to be recorded. :-(
> >
> >The issue has been beaten to death and my position remains firm:
> >rather than playing namespace games, consistent renaming is the right
> >thing to do here. This becomes a trivial source edit,
>
> Well, it's not trivial if you're (in my case) trying to support 2.3 and 2.4
> with the same code base.

You should just bite the bullet and make a privatized copy of the
package(s) on which you depend part of your own distributions.

> It'd be nice to have some other advice to offer people besides, "go edit
> your code".  Of course, if the feature hadn't already existed, I suppose a
> PEP to add it would have been shot down, so it's a reasonable decision.

I agree it would be nice if we could do something about deep version
issues. But it's hard, and using the absolute/relative ambiguity isn't
a solution but a nasty hack. I don't have a solution either except
copying code (which IMO is a *fine* solution in most cases as long as
copyright issues don't prevent you).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Tue Nov  1 21:14:32 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 01 Nov 2005 15:14:32 -0500
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <4366C3BB.3010407@canterbury.ac.nz>
Message-ID: <001301c5df20$df865f00$153dc797@oemcomputer>

[Martin Blais]
> > I'm always--literally every time-- looking for a more functional form,
> > something that would be like this:
> >
> >    # apply dirname() 3 times on its results, initializing with p
> >    ... = repapply(dirname, 3, p)

[Greg Ewing]
> Maybe ** should be defined for functions so that you
> could do things like
> 
>    up3levels = dirname ** 3

Hmm, using the function's own namespace is an interesting idea.  It
might also be a good place to put other functionals:

   results = f.map(data)
   newf = f.partial(somearg)
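
For comparison, a plain-function sketch of the same idea.  repapply()
is the hypothetical helper named earlier in the thread, not an existing
library function; functools.partial is what later Python versions
provide for pre-binding arguments.

```python
import functools
import posixpath


def repapply(func, n, value):
    """Apply func to value n times (hypothetical helper from the thread)."""
    for _ in range(n):
        value = func(value)
    return value


p = "/usr/lib/python2.4/site-packages/foo.py"
up3 = repapply(posixpath.dirname, 3, p)

# functools.partial can pre-bind the function and the repeat count.
up3levels = functools.partial(repapply, posixpath.dirname, 3)
```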
   

Raymond


From dberlin at dberlin.org  Tue Nov  1 21:15:08 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Tue, 01 Nov 2005 15:15:08 -0500
Subject: [Python-Dev] svn checksum error
In-Reply-To: <17253.28294.538932.570903@montanaro.dyndns.org>
References: <17252.59531.252751.768301@montanaro.dyndns.org>
	<43654CA7.8030200@v.loewis.de>
	<17253.28294.538932.570903@montanaro.dyndns.org>
Message-ID: <1130876108.7280.35.camel@IBM-82ZWS052TEN.watson.ibm.com>

On Sun, 2005-10-30 at 19:08 -0600, skip at pobox.com wrote:
>     Martin> The natural question then is: what operating system, what
>     Martin> subversion version are you using?
> 
> Sorry, wasn't thinking in terms of svn bugs.  I was anticipating some sort
> of obvious pilot error.  I am on Mac OSX 10.3.9, running svn 1.1.3 I built
> from source back in the May timeframe.  Should I upgrade to 1.2.3 as a
> matter of course?
> 
>     Fredrik> "welcome to the wonderful world of subversion error messages"
>     ...
>     Fredrik> deleting the offending directory and doing "svn up" is the
>     Fredrik> easiest way to fix this.
> 
> Thanks.  I zapped Objects.  The next svn up complained about Misc.  The next
> about Lib.  After that, the next svn up ran to completion.
> 
> Skip

You didn't happen to try to update a working copy that was checked out
from an older cvs2svn conversion of the repo, rather than from the one
produced by the final conversion, did you?

Because that will cause these errors too.
--Dan


From jeremy at alum.mit.edu  Tue Nov  1 21:23:05 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 1 Nov 2005 15:23:05 -0500
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com>
References: <20051101143505.GE14719@rogue.amk.ca>
	<5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com>
Message-ID: <e8bf7a530511011223x996d960oc029a5e18590c94b@mail.gmail.com>

On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:35 AM 11/1/2005 -0500, A.M. Kuchling wrote:
> >Every PyCon has featured a python-dev sprint.  For the past few years,
> >hacking on the AST branch has been a tradition, but we'll have to come
> >up with something new for this year's conference (in Dallas Texas;
> >sprints will be Monday Feb. 27 through Thursday March 2).
> >
> >According to Anthony's release plan, a first alpha of 2.5 would be
> >released in March, hence after PyCon and the sprints.  We should
> >discuss possible tasks for a python-dev sprint.  What could we do?
>
> * PEP 343 implementation ('with:')
> * PEP 308 implementation ('x if y else z')
> * A bytes type
>
> Or perhaps some of the things that have been waiting for the AST branch to
> be finished, i.e.:
>
> * One of the "global variable speedup" PEPs
> * Guido's instance variable speedup idea (LOAD_SELF_IVAR and
> STORE_SELF_IVAR, see
> http://mail.python.org/pipermail/python-dev/2002-February/019854.html)

I hope to attend the sprints this year, so I'd be around to help
people get started and answer questions.  With luck, I'll also be
giving a technical presentation on the work at the main conference.

Jeremy

From reinhold-birkenfeld-nospam at wolke7.net  Tue Nov  1 21:27:23 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Tue, 01 Nov 2005 21:27:23 +0100
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <001301c5df20$df865f00$153dc797@oemcomputer>
References: <4366C3BB.3010407@canterbury.ac.nz>
	<001301c5df20$df865f00$153dc797@oemcomputer>
Message-ID: <dk8j3b$rd0$1@sea.gmane.org>

Raymond Hettinger wrote:
> [Martin Blais]
>> > I'm always--literally every time-- looking for a more functional form,
>> > something that would be like this:
>> >
>> >    # apply dirname() 3 times on its results, initializing with p
>> >    ... = repapply(dirname, 3, p)
> 
> [Greg Ewing]
>> Maybe ** should be defined for functions so that you
>> could do things like
>> 
>>    up3levels = dirname ** 3
> 
> Hmm, using the function's own namespace is an interesting idea.  It
> might also be a good place to put other functionals:
> 
>    results = f.map(data)
>    newf = f.partial(somearg)

And we have solved the "map, filter and reduce are going away! Let's all
weep together" problem with one strike!

Reinhold

-- 
Mail address is perfectly valid!


From noamraph at gmail.com  Tue Nov  1 21:49:59 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 22:49:59 +0200
Subject: [Python-Dev] apparent ruminations on mutable immutables (was:
	PEP 351, the freeze protocol)
In-Reply-To: <20051101104731.0389.JCARLSON@uci.edu>
References: <20051031120205.3A0C.JCARLSON@uci.edu>
	<b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com>
	<20051101104731.0389.JCARLSON@uci.edu>
Message-ID: <b348a0850511011249j2d7a0645v8489746be4986d84@mail.gmail.com>

On 11/1/05, Josiah Carlson <jcarlson at uci.edu> wrote:
...
>
> I am an advocate for PEP 351.  However, I am against your proposed
> implementation/variant of PEP 351 because I don't believe it adds enough
> to warrant the additional complication and overhead necessary for every
> object (even tuples would need to get a .frozen_cache member).
>
> Give me a recursive freeze from PEP 351 (which handles objects that are
> duplicated, but errors out on circular references), and I'll be happy.
>
That's fine - but it doesn't mean that I must be happy with it.
>
...
> >
> > This isn't correct - freezing a set won't require a single copy to be
> > performed, as long as the frozen copy isn't saved after the original
> > is changed. Copy+cache always requires one copy.
>
> You are wrong, and you even say you are wrong..."freezing a set doesn't
> require a COPY, IF the frozen COPY isn't saved after the original is
> CHANGED". Creating an immutable set IS CREATING A COPY, so it ALSO
> copies, and you admit as much, but then say the equivalent of "copying
> isn't copying because I say so".

No, I am not wrong. I am just using misleading terms. I will call a
"frozen copy" a "frozen image". Here it goes: "freezing a set doesn't
require a COPY, IF the frozen IMAGE isn't saved after the original is
CHANGED". I suggest that there would be a way to create a frozenset
without COPYING an O(n) amount of MEMORY. When a frozen set is created
by a call frozen(x), it would not copy all the data, but would rather
reference the existing data, which was created by the non-frozen set.
Only if the original set changes while a frozen set is still
referencing the data would the MEMORY actually be copied.

I call it a "frozen copy" because it behaves as a frozen copy, even
though not all the memory is being copied. When you call the COPY
function in the COPY module with a string, it doesn't really copy
memory - the same string is returned. When you copy a file inside
subversion, it doesn't actually copy all the data associated with it,
but does something smarter, which takes O(1). The point is, for the
user, it's a copy. Whether or not memory is actually being copied, is
an implementation detail.
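
The copy module behaviour mentioned above is easy to check.  A short
sketch:

```python
import copy

s = "some immutable string"
t = copy.copy(s)
string_shared = t is s           # no memory is copied for an immutable str

tup = (1, 2, 3)
tuple_shared = copy.copy(tup) is tup
```

For the caller both results behave as copies; that the same object
comes back is invisible unless you compare identities.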
>
...
>
> I think that adding an additional attribute to literally every single
> object to handle the caching of 'frozen' objects, as well as a list to
> every object to handle callbacks which should be called on object
> mutation, along with a _call_stuff_when_mutated() method that handles
> these callback calls, IN ADDITION TO the __freeze__ method which is
> necessary to support this, is a little much, AND IS CERTAINLY NOT A
> SIMPLIFICATION!

I don't agree. You don't need to add a list to every object, since you
can store all those relations in one place, with a standard function
for registering them. Anyway, code written in Python (which is the
language we are discussing) WON'T BE COMPLICATED! The frozen
mechanism, along with two new protocols (__frozen__ and __changed__),
would be added automatically! The internal state of a class written in
Python can be automatically frozen, since it's basically a dict. Now
let's see if it's a simplification:

1. No Python code would have to be made more complicated because of the change.
2. There would be no need to find workarounds, like cStringIO, for the
fact that strings and tuples are immutable.
3. You would be able to put any kind of object into a set, or use it
as a dict key.
4. Classes (like the graph example) would be able to give users things
without having to choose between exposing their users to strange bugs,
making a complicated interface, making very inefficient methods, and
writing complicated wrapper classes.

I will ask you: Is this a complication?
The answer is: it requires a significant change to the CPython
implementation. But about the Python language: it's definitely a
simplification.
>
> Let us pause for a second and consider:
> Original PEP proposed 1 new method: __freeze__, which could be
> implemented as a subclass of the original object (now), and integrated
> into the original classes as time goes on.  One could /register/
> __freeze__ functions/methods a'la Pickle, at which point objects
> wouldn't even need a native freeze method.
>
> Your suggestion offers 2 new methods along with 2 new instance variables.
> Let's see, a callback handler, __freeze__, the cache, and the callback
> list.  Doesn't that seem a little excessive to you to support freezing?
> It does to me.  If Guido were to offer your implementation of freeze, or
> no freeze at all, I would opt for no freeze, as implementing your freeze
> on user-defined classes would be a pain in the ass, not to mention
> implementing them in C code would be more than I would care to do, and
> more than I would ask any of the core developers to work on.
>
As I said above: this suggestion would certainly require more change
in the Python implementation than your suggestion. But the Python
language would gain a lot more. Implementing my frozen on user-defined
classes would not be a pain in the ass, because it will require no
work at all - the Python implementation would provide it
automatically. The fact that it can be done automatically for
user-defined classes raises a hope in me that it can be made not too
complicated for classes written in C.
>
...
>
> But here's the crucial observation which you are missing.  You yourself
> have stated that in both your table and graph examples you want your
> application to continue to modify values while the user can't manipulate
> them.  So even in your own use-cases, you are going to be modifying
> objects after they have been frozen, and even then it won't be fast!

No. In the table example, the table would never change the objects
themselves - it may only calculate new values, and drop the references
to the old ones. This is definitely a case of not changing the value
after it has been frozen.

In the graph example, it is true that the set would be changed after
it's frozen, but it is expected that the frozen copy would not exist
by the time the change happens - think about the x is
graph.neighbours(y) example. There is actually no reason for keeping
them, besides for tracking the history of the graph - which would
require a copy anyway. The frozen() implementation of objects which do
not reference non-frozen objects, such as sets, really doesn't copy
any memory when it's called, and will never cause a memory copy if
there are no living frozen copies of the object while the object
changes.
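
Here is a rough pure-Python sketch of what I mean, for a set only.
The names CowSet and FrozenImage are made up for this illustration,
and a weak reference stands in for whatever bookkeeping a real
implementation would use to know whether a frozen image is still
alive. It is not a proposal for the actual implementation.

```python
import weakref


class FrozenImage:
    """Read-only view over storage that the owner promises not to mutate."""

    def __init__(self, data):
        self._data = data

    def __contains__(self, item):
        return item in self._data

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)

    def __eq__(self, other):
        return isinstance(other, FrozenImage) and self._data == other._data

    def __hash__(self):
        return hash(frozenset(self._data))


class CowSet:
    """Mutable set whose frozen() image shares storage until a mutation."""

    def __init__(self, items=()):
        self._data = set(items)
        self._image = None       # weak reference to the live image, if any
        self.copies = 0          # counts the O(n) copies actually made

    def frozen(self):
        image = self._image() if self._image is not None else None
        if image is None:
            image = FrozenImage(self._data)   # O(1): shares the storage
            self._image = weakref.ref(image)
        return image

    def add(self, item):
        image = self._image() if self._image is not None else None
        if image is not None:
            # A frozen image is still alive: leave the old storage to it
            # and continue on a private copy.
            self._data = set(self._data)
            self.copies += 1
        self._image = None
        self._data.add(item)


s = CowSet([1, 2])
img = s.frozen()                 # no copy yet
s.add(3)                         # img is alive, so this pays for one copy
```

If img had been dropped before the add(), the weak reference would be
dead and no copy would be made at all.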
>
> I believe that in general, people who are freezing things are going to
> want to be changing the original objects - hence the use of mutables to
> begin with - maybe for situations like yours where you don't want users
> mutating returns, whatever.  If after they have frozen the object, they
> don't want to be changing the original objects, then they are probably
> going to be tossing out the original mutable and using the immutable
> created with freeze anyways (mutate your object until you get it right,
> then freeze it and use that so that no one can alter your data, not even
> yourself), so I think that caching is something that the /user/ should
> be doing, NOT Python.

I don't agree. The table and the graph are examples. The common use
patterns I see regarding frozen() are:
1. Don't use frozen() at all. Think about strings becoming mutable.
Most strings which are changed would never be frozen. When you are
using a list, how many times do you make a frozen copy of it? (The
answer is zero, of course, you can't. You can't use it as a dict key,
or as a member of a set. This is just to show you that not freezing
mutable objects is a common thing.)
2. Create the object using more operations than just calling its
constructor, and then don't change it, possibly making a frozen copy of
it. The table is an example: functions given by the user create
objects, in whatever way they choose, and then the table doesn't need
to change them, and needs to create a frozen copy.
It's a very reasonable use case: I would say that the less common case
is that you can create an object using only the constructor. Many
times you make a tuple out of a list that you've created just for that
purpose. It's not intuitive!
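
The pattern I am talking about looks like this in today's Python (a
small made-up example):

```python
# Build incrementally in a list, because a tuple has no append().
parts = []
for n in range(5):
    if n % 2 == 0:
        parts.append(n * n)

key = tuple(parts)               # second, immutable copy just to get a key
cache = {key: "result"}
```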
>
> The simple implementation (not copy-on-write) leads us to a simple
> matter of documenting, "Freeze is 'stateless'; every call to freeze
> returns a new object, regardless of modifications (or lack thereof)
> between freeze calls."
>
> Remember: "Simple is better than complex."

Firstly, you are talking about implementation. Secondly, sometimes
things are too simple, and lead to complex workarounds.
>
...
>
> Not everything can/should become mutable.  Integers should never become
> mutable, as tuples should never become mutable, as strings/unicode
> should never become mutable...wait, aren't we getting to the point that
> everything which is currently immutable shouldn't become mutable?
> Indeed.  I don't believe that any currently immutable object should be
> able to become mutable in order to satisfy /anyone's/ desire for mutable
> /anything/.

Integers should never become mutable - right. There should be no
mutable ints in Python. Tuples and strings should never become mutable
- wrong. Strings created by the user should be mutable - those
immutable strings are a Python anomaly. All I was saying was that
sometimes, the Python implementation would want to use immutable
strings. So will users, sometimes. There is a need for a mutable
string, and a need for an immutable string, and a need for an
efficient conversion between the two. That's all.
>
>
> In starting to bring up benchmarks you are falling into the trap of
> needing to /have/ a benchmark (I have too), for which there are very few,
> if any, current use-cases.
>
No, I don't. There are a lot of use cases. As I said, I suggest a
change to the Python language, which would give you many benefits.
When suggesting such a change, it should be verified that the
performance of existing Python programs won't be harmed, which I did.

What might be done as well, is to compare my suggestion to yours:

> Without having or needing a benchmark, I'll state quite clearly where
> your proposed copy-on-write would beat out the naive 'create a new copy
> on every call to freeze':
> 1. If objects after they are frozen are never modified, copy on write
> will be faster.
> 2. If original objects are modified after they are frozen, then the
> naive implementation will be as fast if not faster in general, due to
> far lower overhead, but may be slower in corner cases where some nested
> structure is unchanged, and some shallow bit has changed:

As I said, many objects are never modified after they are frozen.
***This includes all the strings which are used in current Python
programs as dict keys*** - I suggest that strings would become mutable
by default. This means that whenever you use a string as a dict key, a
call to frozen() is done by the dict. It's obvious that the string
won't change after it is frozen.

Now, my suggestion is faster in its order of complexity than yours. In
some cases, yours is faster by a constant, which I claim would be
quite small in real use cases.
>
>    x = [[], NEVER_CHANGED_MUTABLE_NESTED_STRUCTURE]
>    y = freeze(x)
>    x[0].append(1)
>    z = freeze(x)
>
This is one of the cases in which the change in order of complexity is
significant.

> Further, discussing benchmarks on use-cases, for which there are few (if
> any) previously existing uses, is like saying "let's race cars" back in
> 1850; it's a bit premature.

I don't agree. That's why we're discussing it.
>
>
> Then there is this other example:
>
>    x = [1,2,3]
>    y = freeze(x)
>
> The flat version of freeze in the PEP right now handles this case.  I
> can change x all I want, yet I have a frozen y which stays unchanged.
> This is what I would want, and I would imagine it is what others would
> want too.  In fact, this is precisely the use-case you offered for your
> table and graph examples, so your expression of a sentiment of "users
> aren't going to be changing the object after it has been frozen" is, by
> definition, wrong: you do it yourself!

Okay, so may I add another use case in which frozen() is fast: If an
object which only holds references to frozen objects is changed after a
frozen copy of it has been made, and the frozen copy is discarded
before the change is made, frozen() would still take O(1). This is the
case with the graph.
>
>
> > I will write it in another way: I am proposing a change that will make
> > Python objects, including strings, mutable, and gives you other
> > advantages as well. I claim that it won't make existing Python
> > programs run slower in O() terms. It would allow you to do many things
> > that you can't do today; some of them would be fast, like editing a
> > string, and some of them would be less fast - for example, repeatedly
> > changing an object and freezing it.
>
> Your claim on running time only works if the original isn't changed
> after it is frozen

But it won't be changed, in existing Python programs, so my claim: "it
won't make existing Python programs run slower in O() terms" is
absolutely correct!
>
> And I don't like making everything mutable, it's a "solution looking for
> a problem", or a "tail wagging the dog" idea.  There is no good reason
> to make everything mutable, and I challenge you to come up with a valid
> one that isn't already covered by the existing standard library or
> extension modules.
>
> There is no need to bring strings into this conversation as there are
> string variants which are already mutable: array.array('c', ...),
> StringIO, mmap, take your pick!  And some future Python (perhaps 2.5)
> will support a 'bytes' object, which is essentially an mmap which
> doesn't need to be backed by a file.

My two examples don't have a satisfactory solution currently.
All this variety actually proves my point: There is "more than one way
to do it" because these are all workarounds! If strings were mutable,
I wouldn't have to learn about all these nice modules. And if that's not
enough, here's a simple use case which isn't covered by all those
modules. Say I store my byte arrays using array.array, or mmap. What
if I want to make a set of those, in order to check if a certain byte
sequence was already encountered? I CAN'T. I have to do another
workaround, which will probably be complicated and inefficient, to
convert my byte array into a string.
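
A sketch of that workaround, written with Python 3's bytearray and
bytes as stand-ins for array.array('c', ...) and str. The helper name
already_seen() is made up for the example.

```python
seen = set()


def already_seen(buf):
    """Return True if this byte sequence was encountered before."""
    key = bytes(buf)             # O(n) copy into an immutable, hashable object
    if key in seen:
        return True
    seen.add(key)
    return False


data = bytearray(b"\x00\x01\x02")
try:
    hash(data)                   # mutable buffers are unhashable
    direct_ok = True
except TypeError:
    direct_ok = False

first = already_seen(data)
second = already_seen(bytearray(b"\x00\x01\x02"))
```

Every membership test pays for a full copy of the buffer, which is the
cost I would like frozen() to avoid.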

Everything is possible, if you are willing to work hard enough. I am
suggesting that we simplify things.

...
>
> You really don't know how Python internals work.
>
> The slow part of s += x on strings in Python 2.4 is the memory
> reallocation and occasional data copy (it has been tuned like crazy by
> Raymond in 2.4, see _PyString_Resize in stringobject.c). Unless you
> severely over-allocated your strings, this WOULD NOT BE SPED UP BY
> MUTABLE STRINGS.

The fact that it has been tuned like crazy, and that it had to wait
for Python 2.4, just shows that we are talking about *another*
workaround. And please remember that this optimization was announced
as something not to be counted on, in case you want your code to work
efficiently on other Python implementations. In that case (which will
only grow more common in the future), you would have to go back to the
other workarounds, like cStringIO.
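
For reference, these are the two portable idioms I have in mind,
sketched with io.StringIO, which is the Python 3 counterpart of
cStringIO:

```python
import io

pieces = ["spam", "eggs", "ham"]

# Idiom 1: collect the pieces and join once at the end.
joined = "".join(pieces)

# Idiom 2: write into an in-memory buffer and read it back.
buf = io.StringIO()
for piece in pieces:
    buf.write(piece)
buffered = buf.getvalue()
```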
>
> Further, identifiers/names (obj, obj.attr, obj.attr1.attr2, ...) are
> already created during compile-time, and are 'interned'.  That is, if
> you have an object that you call 'foo', there gets to be a single "foo"
> string, which is referenced by pointer by any code in that module which
> references the 'foo' object to the single, always unchanging "foo"
> string.  And because the string has already been hashed, it has a cached
> hash value, and lookups in dictionaries are already fast due to a check
> for pointer equivalency before comparing contents.  Mutable strings
> CANNOT be faster than this method.

Right - they can't be faster than this method. But they can be
virtually AS FAST. Store frozen strings as identifiers/names, and you
can continue to use exactly the same method you described.
>
>
...
>
> I would claim that your definition is limited.  Yours would be fast if
> objects never changed after they are frozen, which is counter to your
> own use-cases.  This suggests that your definition is in fact incorrect,
> and you fail to see your own inconsistency.
>
I have answered this above: It is not counter to my use cases, and
it's a very good assumption, as it is true in many examples, including
all current Python programs.
>
> > And if it's possible (which I think it is), it would allow us to get
> > rid of inconvenient immutable objects, and it would let us put
> > everything into a set. Isn't that nice?
>
> No, it sounds like a solution looking for a problem.  I see no need to
> make strings, floats, ints, tuples, etc. mutable, and I think that you
> will have very little luck in getting core Python developer support for
> any attempt to make them mutable.

Concerning ints, floats, complexes, and any other object with a
constant memory use, I agree. Concerning other objects, I disagree. I
think that it would simplify things considerably, and that many things
that we are used to are actually workarounds.
>
> If you make such a suggestion, I would offer that you create a new PEP,
> because this discussion has gone beyond PEP 351, and has wandered into
> the realm of "What other kinds of objects would be interesting to have
> in a Python-like system?"
>
That is a good suggestion, and I have already started to write one. It
takes me a long time, but I hope I will manage.
>
>
> I'll summarize your claims:
> 1. copy-on-write is a simplification
> 2. everything being mutable would add to Python
> 3. copy-on-write is fast
> 4. people won't be mutating objects after they are frozen
>
> I'll counter your claims:

I'll counter-counter them:

> 1. 2 methods and 2 instance variables on ALL OBJECTS is not a
> simplification.

It is. This is basically an implementation detail; Python code would
not become any more complicated.

> 2. a = b = 1; a += 1;  If all objects were to become mutable, then a ==
> b, despite what Python and every other sane language would tell you, and
> dct[a] would stop working (you would have to use c = freeze(a);dct[c],
> or dct[x] would need to automatically call freeze and only ever
> reference the result, significantly slowing down ALL dictionary
> references).

This might be the point that I didn't stress enough. Dict *would* call
freeze, and this is why more work is needed to make sure it is a quick
operation. I have proven that it is quick in O() terms, and I claimed
that it can be made quick in actual terms.

> 3. only if you NEVER MUTATE an object after it has been frozen

...or if the frozen copy is killed before the change, for many types of objects.

> 4. /you/ mutate original objects after they are frozen

Yes I do, but see 3.

>
> ALSO:
> 5. You fail to realize that if all objects were to become mutable, then
> one COULDN'T implement frozen, because the frozen objects THEMSELVES
> would be mutable.

Really, you are taking me too literally. All objects COULD become
mutable, if we supply a frozen version of them. This doesn't include
any object for which you don't want that, including ints, and
including frozen objects.
>
> I'm going to bow out of this discussion for a few reasons, not the least
> of which being that I've spent too much time on this subject, and that I
> think it is quite clear that your proposal is dead, whether I had
> anything to do with it or not.
>
>  - Josiah

That's fine. I wish that you read my answer, think about it a little,
and just tell me in a yes or a no if you still consider it dead. I
think that I have answered all your questions, and I hope that at
least others would be convinced by them, and that at the end my
suggestion would be accepted.

Others who read this - please respond if you think there's something
to my suggestion!

Thanks for your involvement. I hope it would at least help me better
explain my idea.
Noam

From noamraph at gmail.com  Tue Nov  1 21:55:14 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 22:55:14 +0200
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <dk8j3b$rd0$1@sea.gmane.org>
References: <4366C3BB.3010407@canterbury.ac.nz>
	<001301c5df20$df865f00$153dc797@oemcomputer>
	<dk8j3b$rd0$1@sea.gmane.org>
Message-ID: <b348a0850511011255t7683b34bk9fd90cf4a99c4fb6@mail.gmail.com>

On 11/1/05, Reinhold Birkenfeld <reinhold-birkenfeld-nospam at wolke7.net> wrote:
> > Hmm, using the function's own namespace is an interesting idea.  It
> > might also be a good place to put other functionals:
> >
> >    results = f.map(data)
> >    newf = f.partial(somearg)
>
> And we have solved the "map, filter and reduce are going away! Let's all
> weep together" problem with one strike!
>
> Reinhold

I have no problems with map and filter going away. About reduce -
please remember that you need to add this method to any callable,
including every type (I mean the constructor). I am not sure it is a
good trade for throwing away one builtin, which is a perfectly
reasonable function.

Noam

From pedronis at strakt.com  Tue Nov  1 22:00:20 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Tue, 01 Nov 2005 22:00:20 +0100
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <dk8j3b$rd0$1@sea.gmane.org>
References: <4366C3BB.3010407@canterbury.ac.nz>	<001301c5df20$df865f00$153dc797@oemcomputer>
	<dk8j3b$rd0$1@sea.gmane.org>
Message-ID: <4367D764.2090609@strakt.com>

Reinhold Birkenfeld wrote:
> Raymond Hettinger wrote:
> 
>>[Martin Blais]
>>
>>>>I'm always--literally every time-- looking for a more functional
>>
>>form,
>>
>>>>something that would be like this:
>>>>
>>>>   # apply dirname() 3 times on its results, initializing with p
>>>>   ... = repapply(dirname, 3, p)
>>
>>[Greg Ewing]
>>
>>>Maybe ** should be defined for functions so that you
>>>could do things like
>>>
>>>   up3levels = dirname ** 3
>>
>>Hmm, using the function's own namespace is an interesting idea.  It
>>might also be a good place to put other functionals:
>>
>>   results = f.map(data)
>>   newf = f.partial(somearg)
> 
> 
> And we have solved the "map, filter and reduce are going away! Let's all
> weep together" problem with one strike!

not really, those right now work with any callable,

 >>> class C:
...   def __call__(self, x):
...     return 2*x
...
 >>> map(C(), [1,2,3])
[2, 4, 6]


that's why attaching functionality as methods is not always the best
solution.

regards.

From skip at pobox.com  Tue Nov  1 21:58:37 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 1 Nov 2005 14:58:37 -0600
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <20051101143505.GE14719@rogue.amk.ca>
References: <20051101143505.GE14719@rogue.amk.ca>
Message-ID: <17255.55037.609312.773649@montanaro.dyndns.org>


    amk> Every PyCon has featured a python-dev sprint.  For the past few
    amk> years, hacking on the AST branch has been a tradition, but we'll
    amk> have to come up with something new for this year's conference...

This is just a comment from the peanut gallery, as it's highly unlikely I'll
be in attendance, but why not continue with the AST theme?  Instead of
working on the AST branch, you could start to propagate the AST
representation around.  For example, you could use the new AST code to
improve/extend/rewrite the optimization steps the compiler currently
performs.  Another alternative would be to rewrite Pychecker (or Pychecker
2) to operate from the AST representation.

Skip

From tdelaney at avaya.com  Tue Nov  1 21:59:07 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 2 Nov 2005 07:59:07 +1100
Subject: [Python-Dev] apparent ruminations on mutable immutables
	(was:PEP 351, the freeze protocol)
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB75B@au3010avexu1.global.avaya.com>

Noam,

There's a simple solution to all this - write a competing PEP. One of
the two competing PEPs may be accepted.

FWIW, I'm +1 on PEP 351 in general, and -1 on what you've proposed.

PEP 351 is simple to explain, simple to implement and leaves things
under the control of the developer. I think there are still some issues
to be resolved, but the basic premise is exactly what I would want of a
freeze protocol.

Tim Delaney

From jcarlson at uci.edu  Tue Nov  1 22:03:56 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 01 Nov 2005 13:03:56 -0800
Subject: [Python-Dev] apparent ruminations on mutable immutables (was:
	PEP 351, the freeze protocol)
In-Reply-To: <b348a0850511011249j2d7a0645v8489746be4986d84@mail.gmail.com>
References: <20051101104731.0389.JCARLSON@uci.edu>
	<b348a0850511011249j2d7a0645v8489746be4986d84@mail.gmail.com>
Message-ID: <20051101125918.0396.JCARLSON@uci.edu>

> That's fine. I wish that you read my answer, think about it a little,
> and just tell me in a yes or a no if you still consider it dead. I
> think that I have answered all your questions, and I hope that at
> least others would be convinced by them, and that at the end my
> suggestion would be accepted.

I still consider it dead.
    "If the implementation is hard to explain, it's a bad idea."

Also, not all user-defined classes have a __dict__, and not all
user-defined classes can have arbitrary attributes added to them.

>>> class foo(object):
...     __slots__ = ['lst']
...     def __init__(self):
...         self.lst = []
...
>>> a = foo()
>>> a.bar = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'foo' object has no attribute 'bar'
>>> 

 - Josiah


From tdelaney at avaya.com  Tue Nov  1 22:02:04 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 2 Nov 2005 08:02:04 +1100
Subject: [Python-Dev] a different kind of reduce...
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB75C@au3010avexu1.global.avaya.com>

Reinhold Birkenfeld wrote:

> And we have solved the "map, filter and reduce are going away! Let's
> all weep together" problem with one strike!

I'm not sure if you're wildly enthusiastic, or very sarcastic.

I'm not sure which I should be either ...

The thought does appeal to me - especially func.partial(args). I don't
see any advantage to func.map(args) over func(*args), and it loses
functionality in comparison with map(func, args) (passing the function
as a separate reference).

Tim Delaney

From mal at egenix.com  Tue Nov  1 22:11:52 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 01 Nov 2005 22:11:52 +0100
Subject: [Python-Dev] PEP 328 - absolute imports (python-dev sprint at
 PyCon)
In-Reply-To: <ca471dc20511011014o721c0d88w9244915e368a1a6c@mail.gmail.com>
References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com>
	<ca471dc20511011014o721c0d88w9244915e368a1a6c@mail.gmail.com>
Message-ID: <4367DA18.6070502@egenix.com>

Guido van Rossum wrote:
> On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> 
>>At 10:22 AM 11/1/2005 -0700, Guido van Rossum wrote:
>>
>>>* PEP 328 - absolute/relative import
>>
>>I assume that references to 2.4 in that PEP should be changed to 2.5, and
>>so on.
> 
> 
> For the part that hasn't been implemented yet, yes.
> 
> 
>>It also appears to me that the PEP doesn't record the issue brought up by
>>some people about the current absolute/relative ambiguity being useful for
>>packaging purposes.  i.e., being able to nest third-party packages such
>>that they end up seeing their dependencies, even though they're not
>>installed at the "root" package level.
>>
>>For example, I have a package that needs Python 2.4's version of pyexpat,
>>and I need it to run in 2.3, but I can't really overwrite the 2.3 pyexpat,
>>so I just build a backported pyexpat and drop it in the package, so that
>>the code importing it just ends up with the right thing.
>>
>>Of course, that specific example is okay since 2.3 isn't going to somehow
>>grow absolute importing.  :)  But I think people brought up other examples
>>besides that, it's just the one that I personally know I've done.
> 
> 
> I guess this ought to be recorded. :-(
> 
> The issue has been beaten to death and my position remains firm:
> rather than playing namespace games, consistent renaming is the right
> thing to do here. This becomes a trivial source edit, which beats the
> problems of debugging things when it doesn't work out as expected
> (which is very common due to the endless subtleties of loading
> multiple versions of the same code).

Just for reference, may I remind you of this thread last year:

http://mail.python.org/pipermail/python-dev/2004-September/048695.html

The PEP's timeline should be updated accordingly.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


From noamraph at gmail.com  Tue Nov  1 22:20:34 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 23:20:34 +0200
Subject: [Python-Dev] apparent ruminations on mutable immutables
	(was:PEP 351, the freeze protocol)
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB75B@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB75B@au3010avexu1.global.avaya.com>
Message-ID: <b348a0850511011320h5e799ad6q43aef3d1da88508c@mail.gmail.com>

On 11/1/05, Delaney, Timothy (Tim) <tdelaney at avaya.com> wrote:
> Noam,
>
> There's a simple solution to all this - write a competing PEP. One of
> the two competing PEPs may be accepted.

I will. It may take some time, though.
>
> FWIW, I'm +1 on PEP 351 in general, and -1 on what you've proposed.
>
> PEP 351 is simple to explain, simple to implement and leaves things
> under the control of the developer. I think there are still some issues
> to be resolved, but the basic premise is exactly what I would want of a
> freeze protocol.
>
> Tim Delaney

It is true that PEP 351 is simpler. The problem is, that thanks to PEP
351 I have found a fundamental place in which the current Python
design is not optimal. It is not easy to fix it, because 1) it would
require a significant change to the current implementation, and 2)
people are so used to the current design that it is hard to convince
them that it's flawed.

The fact that discussing the design takes long doesn't mean that the
result, for the Python programmer, would be complicated. It won't be -
my suggestion will cause almost no backward-compatibility problems.
Think about it - it clearly means that my suggestion simply can't make
Python programming *more* complicated.

Please consider new-style classes. I'm sure they required a great deal
of discussion, but they are simple to use -- and they are a good
thing. And I think that my suggestion would make things easier, more
than the new-style-classes change did. Features of new-style classes
are an advanced topic. The questions, "why can't I change my strings?"
"why do you need both a tuple and a list?" and maybe "why can't I add
my list to a set", are fundamental ones, which would all not be asked
at all if my suggestion is accepted.

Noam

From jcarlson at uci.edu  Tue Nov  1 22:29:29 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 01 Nov 2005 13:29:29 -0800
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB75C@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB75C@au3010avexu1.global.avaya.com>
Message-ID: <20051101131830.0399.JCARLSON@uci.edu>


"Delaney, Timothy (Tim)" <tdelaney at avaya.com> wrote:
> 
> Reinhold Birkenfeld wrote:
> 
> > And we have solved the "map, filter and reduce are going away! Let's
> > all weep together" problem with one strike!
> 
> I'm not sure if you're wildly enthusiastic, or very sarcastic.
> 
> I'm not sure which I should be either ...
> 
> The thought does appeal to me - especially func.partial(args). I don't
> see any advantage to func.map(args) over func(*args), and it loses
> functionality in comparison with map(func, args) (passing the function
> as a separate reference).

I was under the impression that:
    fcn.<old builtin name>(...)
would perform equivalently as
    <old builtin name>(fcn, ...)
does now.

So all the following would be equivalent...
    func.map(args)
    map(func, args)
    [func(i) for i in args]


Me, I still use map, so seeing it as fcn.map(...) instead of map(fcn,...)
sounds good to me...though it does have the ugly rub of suggesting that
None.map/filter should exist, which I'm not really happy about.

In regards to the instance __call__ method, it seems reasonable to
require users to implement their own map/filter/reduce call.

 - Josiah


From noamraph at gmail.com  Tue Nov  1 22:30:48 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 23:30:48 +0200
Subject: [Python-Dev] apparent ruminations on mutable immutables (was:
	PEP 351, the freeze protocol)
In-Reply-To: <20051101125918.0396.JCARLSON@uci.edu>
References: <20051101104731.0389.JCARLSON@uci.edu>
	<b348a0850511011249j2d7a0645v8489746be4986d84@mail.gmail.com>
	<20051101125918.0396.JCARLSON@uci.edu>
Message-ID: <b348a0850511011330g3b4c4edr9469940650d88b9c@mail.gmail.com>

On 11/1/05, Josiah Carlson <jcarlson at uci.edu> wrote:
...
>
> I still consider it dead.
>    "If the implementation is hard to explain, it's a bad idea."

It is sometimes true, but not always. It may mean two other things:
1. The one trying to explain is not talented enough.
2. The implementation is really not very simple. A hash table, used so
widely in Python, is really not a simple idea, and it's not that easy
to explain.
>
> Also, not all user-defined classes have a __dict__, and not all
> user-defined classes can have arbitrary attributes added to them.
>
> >>> class foo(object):
> ...     __slots__ = ['lst']
> ...     def __init__(self):
> ...         self.lst = []
> ...
> >>> a = foo()
> >>> a.bar = 1
> Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
> AttributeError: 'foo' object has no attribute 'bar'
> >>>
It doesn't matter. It only means that the implementation would have to
make frozen copies also of __slots__ items, when freezing a
user-defined class.

I am afraid that this question proves that I didn't convey my idea to
you. If you like, please forgive my inability to explain it clearly,
and try again to understand my idea, by going over what I wrote again,
and thinking on it. You can also wait for the PEP that I intend to
write. And you can also forget about it, if you don't want to bother
with it - you've already helped a lot.

Noam

From guido at python.org  Tue Nov  1 22:40:40 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 Nov 2005 14:40:40 -0700
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <001301c5df20$df865f00$153dc797@oemcomputer>
References: <4366C3BB.3010407@canterbury.ac.nz>
	<001301c5df20$df865f00$153dc797@oemcomputer>
Message-ID: <ca471dc20511011340y7db2a86dn18ea361ecc7fafbd@mail.gmail.com>

> [Greg Ewing]
> > Maybe ** should be defined for functions so that you
> > could do things like
> >
> >    up3levels = dirname ** 3

[Raymond Hettinger]
> Hmm, using the function's own namespace is an interesting idea.  It
> might also be a good place to put other functionals:
>
>    results = f.map(data)
>    newf = f.partial(somearg)

Sorry to rain on everybody's parade, but I don't think so. There are
many different types of callables. This stuff would only work if they
all implemented the same API. That's unlikely to happen. A module with
functions to implement the various functional operations has much more
potential.
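
A minimal sketch of what such a module of plain functions might look
like, written for modern Python 3 rather than the 2.4 of this thread.
The name repapply comes from Martin Blais's message quoted earlier; its
body here is an assumption, as is the Doubler class. functools.partial
is the standard-library form of the partial idea. Because each function
takes the callable as an ordinary argument, it works with any callable,
including instances that define __call__:

```python
from functools import partial
from posixpath import dirname  # POSIX flavour, so results are platform-independent


def repapply(func, n, value):
    """Apply func to value n times: func(func(...func(value)...))."""
    for _ in range(n):
        value = func(value)
    return value


class Doubler:
    """A callable instance; it needs nothing beyond __call__."""

    def __call__(self, x):
        return 2 * x


# "dirname ** 3" spelled with ordinary functions.
up3levels = partial(repapply, dirname, 3)

print(up3levels("/a/b/c/d/e.txt"))      # /a/b
print(repapply(Doubler(), 3, 1))        # 8
print(list(map(Doubler(), [1, 2, 3])))  # [2, 4, 6]
```

Nothing has to be added to the callables themselves, which seems to be
the point of preferring a module over methods.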

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From s.percivall at chello.se  Wed Nov  2 00:14:20 2005
From: s.percivall at chello.se (Simon Percivall)
Date: Wed, 2 Nov 2005 00:14:20 +0100
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <ca471dc20511011340y7db2a86dn18ea361ecc7fafbd@mail.gmail.com>
References: <4366C3BB.3010407@canterbury.ac.nz>
	<001301c5df20$df865f00$153dc797@oemcomputer>
	<ca471dc20511011340y7db2a86dn18ea361ecc7fafbd@mail.gmail.com>
Message-ID: <6FF0C116-C55B-43CC-AA21-2A7D72E09545@chello.se>

On 1 nov 2005, at 22.40, Guido van Rossum wrote:
>> [Greg Ewing]
>>> Maybe ** should be defined for functions so that you
>>> could do things like
>>>
>>>    up3levels = dirname ** 3
>
> [Raymond Hettinger]
>> Hmm, using the function's own namespace is an interesting idea.  It
>> might also be a good place to put other functionals:
>>
>>    results = f.map(data)
>>    newf = f.partial(somearg)
>
> Sorry to rain on everybody's parade, but I don't think so. There are
> many different types of callables. This stuff would only work if they
> all implemented the same API. That's unlikely to happen. A module with
> functions to implement the various functional operations has much more
> potential.

Perhaps then a decorator that uses these functions?

//Simon


From noamraph at gmail.com  Wed Nov  2 02:21:38 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 2 Nov 2005 03:21:38 +0200
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
Message-ID: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>

Hello,

While writing my PEP about unifying mutable and immutable, I came upon this:

Is there a reason why the default __hash__ method returns the id of the objects?

It is consistent with the default __eq__ behaviour, which is the same
as "is", but:

1. It can easily become inconsistent, if someone implements __eq__ and
doesn't implement __hash__.
2. It is confusing: even if someone doesn't implement __eq__, he may
see that it is suitable as a key to a dict, and expect it to be found
by other objects with the same "value".
3. If someone does want to associate values with objects, he can
explicitly use id:
dct[id(x)] = 3. This seems to better explain what he wants.
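
Point 1 can be illustrated with a short sketch. Note the assumption:
this shows the behaviour of current Python 3, which postdates this
thread. There, a class that defines __eq__ without __hash__ gets
__hash__ set to None, so its instances cannot be used as dict keys at
all. Money is a hypothetical class used only for illustration:

```python
class Money:
    """Defines __eq__ but, deliberately, no __hash__."""

    def __init__(self, cents):
        self.cents = cents

    def __eq__(self, other):
        return isinstance(other, Money) and self.cents == other.cents


print(Money(5) == Money(5))  # True
print(Money.__hash__)        # None

try:
    prices = {Money(5): "coffee"}
except TypeError as exc:
    # Python 3 refuses the key instead of hashing by id.
    print("rejected as a key:", exc)
```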


Now, I just thought of a possible answer: "because he wants to store
in his dict both normal objects and objects of his user-defined type,
which turn out to be not equal to any other object."

This leads me to another question: why should the default __eq__
method be the same as "is"? If someone wants to check if two objects
are the same object, that's what the "is" operator is for. Why not
make the default __eq__ really compare the objects, that is, their
dicts and their slot-members?

I would be happy to get answers.

Noam

From nnorwitz at gmail.com  Wed Nov  2 03:23:23 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 1 Nov 2005 18:23:23 -0800
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <17255.55037.609312.773649@montanaro.dyndns.org>
References: <20051101143505.GE14719@rogue.amk.ca>
	<17255.55037.609312.773649@montanaro.dyndns.org>
Message-ID: <ee2a432c0511011823h4d45f1d1pb703c284f331c52e@mail.gmail.com>

On 11/1/05, skip at pobox.com <skip at pobox.com> wrote:
>
> This is just a comment from the peanut gallery, as it's highly unlikely I'll
> be in attendance, but why not continue with the AST theme?  Instead of
> working on the AST branch, you could start to propagate the AST
> representation around.  For example, you could use the new AST code to
> improve/extend/rewrite the optimization steps the compiler currently
> performs.  Another alternative would be to rewrite Pychecker (or Pychecker
> 2) to operate from the AST representation.

That's an excellent suggestion.  I think I will borrow the time
machine and add it to the wiki. :-)  It's up on the wiki.  Brett also
added an item for the peephole optimizer.  Everyone should add
whatever they think are good ideas, even if they don't plan to attend
the sprints.

n

From mcherm at mcherm.com  Wed Nov  2 14:48:55 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 02 Nov 2005 05:48:55 -0800
Subject: [Python-Dev] apparent ruminations on mutable
	immutables	(was:PEP 351, the freeze protocol)
Message-ID: <20051102054855.st4vtcvrorogggc8@login.werra.lunarpages.com>

Josiah Carlson writes:
> If you make such a suggestion, I would offer that you create a new PEP,
> because this discussion has gone beyond PEP 351, and has wandered into
> the realm of "What other kinds of objects would be interesting to have
> in a Python-like system?"

Noam Raphael replies:
> That is a good suggestion, and I have already started to write one. It
> takes me a long time, but I hope I will manage.

My thanks to both of you... following this conversation has been an
educational experience. Just for the record, I wanted to chime in with
my own opinion formed after following the full interchange.

I think Noam's proposal is very interesting. I like the idea of allowing
both "frozen" (ie, immutable) and mutable treatments for the same
object. I think that C++'s version of this concept (the "const" modifier)
has, on balance, been only a very limited success. I find myself convinced
by Noam's claims that many common use patterns either (1) only use mutables,
or (2) only use immutables, or (3) only use immutable copies temporarily
and avoid mutating while doing so. Any such use patterns (particularly
use (3)) would benefit from the presence of an efficient method for
creating an immutable copy of a mutable object which avoids the copy where
possible.

However... it seems to me that what is being described here is not Python.
Python is a wonderful language, but it has certain characteristics, like
extremely dynamic behavior and close integration with underlying system
methods (C in CPython, Java in Jython, etc) that seem to me to make this
particular feature a poor fit. That's OK... not all languages need to be
Python!

I would encourage you (Noam) to go ahead and explore this idea of yours.
You might wind up building a new language from scratch (in which case I
strongly encourage you to borrow _syntax_ from Python -- its syntax is
more usable than that of any other language I know of). Or perhaps you
will prefer to take CPython and make minor modifications. This kind of
experimentation is allowed (open source) and even encouraged... consider
Christian Tismer's Stackless -- a widely admired variant of CPython which
is unlikely to ever become part of the core, but is nevertheless an
important part of the vibrant Python community. You might even be
interested in starting, instead, with PyPy -- a large project which has
as its main goal producing an implementation of Python which is easy
to modify so as to support just this kind of experimentation.

You are also welcome to submit a PEP for modifying Python (presumably
CPython, Jython, Iron Python, and all other implementations). However,
I think such a PEP would be rejected. Building your own thing that
works well with Python would NOT be rejected. The idea is interesting,
and it _may_ be sound; only an actual implementation could prove this
either way.

-- Michael Chermside


From mcherm at mcherm.com  Wed Nov  2 18:39:44 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 02 Nov 2005 09:39:44 -0800
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
Message-ID: <20051102093944.8jhktwq4e98g4444@login.werra.lunarpages.com>

Noam Raphael writes:
> Is there a reason why the default __hash__ method returns the id of the
> objects?
>
> It is consistent with the default __eq__ behaviour, which is the same
> as "is", but:
>
> 1. It can easily become inconsistent, if someone implements __eq__ and
> doesn't implement __hash__.
> 2. It is confusing: even if someone doesn't implement __eq__, he may
> see that it is suitable as a key to a dict, and expect it to be found
> by other objects with the same "value".
> 3. If someone does want to associate values with objects, he can
> explicitly use id:
> dct[id(x)] = 3. This seems to better explain what he wants.

Your first criticism is valid... it's too bad that there isn't a magical
__hash__ function that automatically derived its behavior from __eq__.
To your second point, I would tell this user to read the requirements.
And your third point isn't a criticism, just an alternative.

But to answer your question, the reason that the default __hash__ returns
the ID in CPython is just that this works. In Jython, I believe that the
VM provides a native hash method, and __hash__ uses that instead of
returning ID. Actually, it not only works, it's also FAST (which is
important... many algorithms prefer that __hash__ be O(1)).

I can't imagine what you would propose instead. Keep in mind that the
requirements are that __hash__ must return a value which distinguishes
the object. So, for instance, two mutable objects with identical values
MUST (probably) return different __hash__ values as they are distinct
objects.

> This leads me to another question: why should the default __eq__
> method be the same as "is"?

Another excellent question. The answer is that this is the desired
behavior of the language. Two user-defined object references are
considered equal if and only if (1) they are two references to the
same object, or (2) the user who designed it has specified a way
to compare objects (implemented __eq__) and it returns a True value.

> Why not make the default __eq__ really compare the objects, that is,
> their dicts and their slot-members?

Short answer: not the desired behavior. Longer answer: there are
three common patterns in object design. There are "value" objects,
which should be considered equal if all fields are equal. There are
"identity" objects which are considered equal only when they are
the same object. And then there are (somewhat less common) "value"
objects in which a few fields don't count -- they may be used for
caching a pre-computed result for example. The default __eq__
behavior has to cater to one of these -- clearly either "value"
objects or "identity" objects. Guido chose to cater to "identity"
objects believing that they are actually more common in most
situations. A beneficial side-effect is that the default behavior
of __eq__ is QUITE simple to explain, and if the implementation is
easy to explain then it may be a good idea.
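
The "identity" and "value" patterns can be sketched as follows. The
class names Session and Point are hypothetical, chosen only for
illustration. Session relies on the default __eq__ and __hash__, while
Point opts in to value comparison and keeps its hash consistent with
it, since objects that compare equal must have equal hashes:

```python
class Session:
    """An "identity" object: uses the default __eq__ and __hash__."""

    def __init__(self, user):
        self.user = user


class Point:
    """A "value" object: equal when all fields are equal."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal points must hash equally, so hash the same fields.
        return hash((self.x, self.y))


a, b = Session("noam"), Session("noam")
print(a == b, a == a)                       # False True
print(Point(1, 2) == Point(1, 2))           # True
print({Point(1, 2): "found"}[Point(1, 2)])  # found
```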

-- Michael Chermside


From jcarlson at uci.edu  Wed Nov  2 18:46:09 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 02 Nov 2005 09:46:09 -0800
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
Message-ID: <20051102092422.F283.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> 
> Hello,
> 
> While writing my PEP about unifying mutable and immutable, I came upon this:
> 
> Is there a reason why the default __hash__ method returns the id of the objects?

A quick search in the list archives via google search
    "site:mail.python.org object __hash__"
Says that Guido wanted to remove the default __hash__ method for object
in Python 2.4, but that never actually happened.

http://www.python.org/sf/660098
http://mail.python.org/pipermail/python-dev/2003-December/041375.html

There may be more code which relies on the default behavior now, but
fixing such things is easy.


> Now, I just thought of a possible answer: "because he wants to store
> in his dict both normal objects and objects of his user-defined type,
> which turn out to be not equal to any other object."

Which is a use-case, but a use-case which isn't always useful.  Great
for singleton/default arguments that no one should ever pass, not quite
so good when you need the /original key/ (no copies) in order to get at
a value in a dictionary - but that could be something that someone wants.
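
Both uses can be sketched briefly. The names _MISSING, lookup and
registry are hypothetical. A bare object() is equal only to itself and
hashes by identity, so it works as a default no caller can pass by
accident, and as a dictionary key that only the original reference can
find again:

```python
_MISSING = object()  # unique sentinel; equal only to itself


def lookup(mapping, key, default=_MISSING):
    """Return mapping[key], or default if one was given, else raise KeyError."""
    if key in mapping:
        return mapping[key]
    if default is _MISSING:
        raise KeyError(key)
    return default


registry = {}
token = object()             # hashed by identity
registry[token] = "payload"

print(registry[token])               # payload
print(object() in registry)          # False: a new object is a different key
print(lookup({"a": 1}, "b", None))   # None, which is a legitimate default here
```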


> This leads me to another question: why should the default __eq__
> method be the same as "is"? If someone wants to check if two objects
> are the same object, that's what the "is" operator is for. Why not
> make the default __eq__ really compare the objects, that is, their
> dicts and their slot-members?

Using 'is' makes sense when the default hash is id (and actually in
certain other cases as well). Actually comparing the contents of an
object is certainly not desirable with the default hash, and probably
not desirable in the general case because equality doesn't always
depend on /all/ attributes of extension objects.

    Explicit is better than implicit.
    In the face of ambiguity, refuse the temptation to guess.

I believe the current behavior of __eq__ is more desirable than
comparing contents, as the latter may result in undesirable behavior
(compares on large nested objects would become recursive and slow,
whereas they are fast today because the default methods never cause a
recursive comparison at all).
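
To make the cost concrete, here is a minimal sketch. Node and chain are
made-up names, and the __eq__ shown is only one guess at what a
value-based default might do. Comparing two equal chains needs one
__eq__ call per level, where the identity-based default answers in a
single step:

```python
class Node(object):
    """Hypothetical value-compared node; `calls` counts __eq__ invocations."""
    calls = 0

    def __init__(self, value, child=None):
        self.value = value
        self.child = child

    def __eq__(self, other):
        Node.calls += 1
        if self is other:
            return True
        if type(self) is not type(other):
            return False
        # Comparing the instance dicts recurses into the child nodes.
        return self.__dict__ == other.__dict__


def chain(depth):
    """Build a singly linked chain of `depth` nodes."""
    node = None
    for i in range(depth):
        node = Node(i, node)
    return node


a, b = chain(50), chain(50)
Node.calls = 0
print(a == b, Node.calls)  # one __eq__ call per level of nesting
```

A deep enough chain would also run into the interpreter's recursion
limit, which an identity comparison never can.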

As for removing the default __hash__ for objects, I'm actually hovering
around a -0, if only because it is sometimes useful to generate unique
keys for dictionaries (which can be done right now with object() ), and
I acknowledge that it would be easy to subclass and use that instead.

 - Josiah


From noamraph at gmail.com  Wed Nov  2 20:04:50 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 2 Nov 2005 21:04:50 +0200
Subject: [Python-Dev] apparent ruminations on mutable immutables
	(was:PEP 351, the freeze protocol)
In-Reply-To: <20051102054855.st4vtcvrorogggc8@login.werra.lunarpages.com>
References: <20051102054855.st4vtcvrorogggc8@login.werra.lunarpages.com>
Message-ID: <b348a0850511021104s13298755nbe71fd877388ae26@mail.gmail.com>

Thank you for your encouraging words!

I am currently working on a PEP. I am sure that writing it is a good
idea, and that it would help with explaining this idea both to others
and to myself.

What I already wrote makes me think that it can be accomplished with
no really large changes to the language - only six built-in types are
affected, and there is no reason why existing code, both in C and in
Python, would stop working.

I hope others would be interested in the idea too, when I finish
writing the PEP draft, so it would be discussed. Trying the idea with
PyPy is a really nice idea - it seems that it would be much simpler to
implement, and I'm sure that learning PyPy would be interesting.

Thanks again, and I would really like to hear your comments when I
post the PEP draft,
Noam

From runehol at ping.uio.no  Wed Nov  2 20:18:52 2005
From: runehol at ping.uio.no (Rune Holm)
Date: Wed, 02 Nov 2005 20:18:52 +0100
Subject: [Python-Dev] Optimizations on the AST representation
Message-ID: <4369111C.5060508@ping.uio.no>

Hi,

I'm a Norwegian applied mathematics student with an interest in
compilers, and I've been a long-time Python user and a python-dev lurker
for some time. I'm very happy that you've integrated the AST branch into
mainline, but I noticed that the AST compiler does not perform much
optimization yet, so I thought I'd take a crack at it.

I just submitted the following patches:
http://www.python.org/sf/1346214
http://www.python.org/sf/1346238

which add better dead code elimination and constant folding of the AST
representation to Python.
The constant folding patch adds two new files, Include/optimize.h and
Python/optimize.c, which include a general visitor interface abstracted
from existing visitor code for easy optimization pass creation. The
code is in new files in order to make room for more AST optimization
passes, and since Python/compile.c is already quite crowded with
bytecode generation and bytecode optimization. If desired, this patch
could be changed to add code to Python/compile.c instead.


Further work:

A limited form of type inference (e.g. as a synthesized attribute)
could be very useful for further optimizations. Since python allows 
operator overloading, it isn't safe to perform strength reductions on 
expressions with operands of unknown type, as there is no way to know if 
algebraic identities will hold. However, if we can infer from the 
context that expressions have the type of int, float or long, many 
optimizations become possible, for instance:

x**2 => x*x
x*2 => x+x
x*0 => 0
x*1 => x
4*x + 5*x => 9*x (this optimization actually requires common 
subexpression elimination for the general case, but simple cases can be 
performed without this)

and so forth.
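
As a rough check of why the operand types matter, the sketch below
applies each rewrite to a few sample values and reports whether both
forms agree. The table and the helper name same_result are made up for
this illustration. The float("inf") row suggests that even x*0 => 0
needs some care for floats, since inf*0 is nan:

```python
# Each entry pairs an expression with its proposed "reduced" form.
rewrites = {
    "x**2 => x*x": (lambda x: x ** 2, lambda x: x * x),
    "x*2 => x+x": (lambda x: x * 2, lambda x: x + x),
    "x*0 => 0": (lambda x: x * 0, lambda x: 0),
    "x*1 => x": (lambda x: x * 1, lambda x: x),
}


def same_result(before, after, x):
    """True if both forms give equal results; a TypeError counts as a mismatch."""
    try:
        return before(x) == after(x)
    except TypeError:
        return False


for sample in (7, float("inf"), [1, 2], "ab"):
    for name, (before, after) in rewrites.items():
        print(repr(sample), name, same_result(before, after, sample))
```

Equality of results is a weaker test than full equivalence (for a list,
x*1 is equal to x but is a new object), so this only shows where a
rewrite is clearly wrong, not where it is safe.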

Another interesting optimization that can potentially bring a lot of
additional speed is hoisting of loop invariants, since calling python 
methods involves allocating and creating a method-wrapper object. An 
informal test shows that optimizing

lst = []
for i in range(10):
    lst.append(i+1)

into

lst = []
tmp = lst.append
for i in range(10):
    tmp(i+1)

will yield a 10% speed increase. This operation is of course not safe 
with arbitrary types, but with the above type inference, we could
perform this operation if the object is of a type that disallows 
attribute assignment, for instance lists, tuples, strings and unicode 
strings.
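
A minimal sketch for trying this out, using the standard timeit module.
The function names and the Sneaky class are made up for illustration,
and the timings will vary by machine and interpreter version, so no
particular speedup is implied. The last function shows the unsafe case:
an instance that permits attribute assignment can rebind "append" in
the middle of the loop, which a hoisted lookup would miss.

```python
import timeit


def plain(n):
    lst = []
    for i in range(n):
        lst.append(i + 1)  # attribute lookup on every iteration
    return lst


def hoisted(n):
    lst = []
    tmp = lst.append  # bound method looked up once, outside the loop
    for i in range(n):
        tmp(i + 1)
    return lst


class Sneaky(list):
    """A list subclass; its instances allow attribute assignment."""


def unsafe_case():
    lst = Sneaky()
    seen = []
    for i in range(3):
        lst.append(i)
        # Rebinding the attribute mid-loop changes what the next call does.
        lst.append = seen.append
    return list(lst), seen


print("plain:  ", timeit.timeit(lambda: plain(1000), number=200))
print("hoisted:", timeit.timeit(lambda: hoisted(1000), number=200))
print(unsafe_case())
```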


Regards,

Rune Holm



From noamraph at gmail.com  Wed Nov  2 20:26:52 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 2 Nov 2005 21:26:52 +0200
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <20051102092422.F283.JCARLSON@uci.edu>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
	<20051102092422.F283.JCARLSON@uci.edu>
Message-ID: <b348a0850511021126p31a12e15n7a3c22eb1b69026b@mail.gmail.com>

On 11/2/05, Josiah Carlson <jcarlson at uci.edu> wrote:
...
>
> A quick search in the list archives via google search
>    "site:mail.python.org object __hash__"
> Says that Guido wanted to remove the default __hash__ method for object
> in Python 2.4, but that never actually happened.
>
> http://www.python.org/sf/660098
> http://mail.python.org/pipermail/python-dev/2003-December/041375.html
>
> There may be more code which relies on the default behavior now, but
> fixing such things is easy.
>
Cool! If Guido also thinks that it should be gone, who am I to argue...

(Seriously, I am in favor of removing it. I really think that it is confusing.)

And if backwards-compatibility is a problem: You can, in Python 2.5,
show a warning when the default __hash__ method is being called,
saying that it is going to disappear in Python 2.6.

[Snip - I will open a new thread about the equality operator]

> As for removing the default __hash__ for objects, I'm actually hovering
> around a -0, if only because it is sometimes useful to generate unique
> keys for dictionaries (which can be done right now with object() ), and
> I acknowledge that it would be easy to subclass and use that instead.
>
I can suggest a new class, that will help you in the cases that you do
want a dict of identities:

class ref(object):
    def __init__(self, obj):
        self._obj = obj
    def __call__(self):
        return self._obj
    def __eq__(self, other):
        # Only another ref wrapping the very same object compares equal.
        return isinstance(other, ref) and self._obj is other._obj
    def __hash__(self):
        return hash(id(self._obj))

It has the advantage over using ids as keys that it keeps a reference
to the object, so the object won't be garbage-collected.

It lets you make a dict of object identities just as easily as before,
in a more explicit and less error-prone way. Perhaps it should become a
builtin?
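
A short usage sketch, with the class restated (plus an isinstance guard
in __eq__) so that it runs on its own. The dict name and the sample
lists are made up. Lists are unhashable, yet two equal lists can be
kept apart as keys once they are wrapped:

```python
class ref(object):
    """Hashable wrapper that compares by the identity of the wrapped object."""

    def __init__(self, obj):
        self._obj = obj  # strong reference keeps obj alive

    def __call__(self):
        return self._obj

    def __eq__(self, other):
        return isinstance(other, ref) and self._obj is other._obj

    def __hash__(self):
        return hash(id(self._obj))


a = [1, 2]
b = [1, 2]  # equal to a by value, but a different object
notes = {ref(a): "first", ref(b): "second"}
print(notes[ref(a)], notes[ref(b)], len(notes))
```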

Noam

From rmunn at pobox.com  Wed Nov  2 20:46:05 2005
From: rmunn at pobox.com (Robin Munn)
Date: Wed, 02 Nov 2005 13:46:05 -0600
Subject: [Python-Dev] Problems with revision 4077 of new SVN repository
Message-ID: <4369177D.3020000@pobox.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm trying to mirror the brand-new Python SVN repository with SVK, to
better be able to track both the trunk and the various branches. Since
I'm not a Python developer and don't have svn+ssh access, I'm doing so
over http. The process fails when trying to fetch revision 4077, with
the following error message:

"RA layer request failed: REPORT request failed on
'projects/!svn/bc/41373/python': The REPORT request returned invalid XML
in the response: XML parse error at line 7: not well-formed (invalid
token) (/projects/!svn/bc/41373/python)"

The thread at http://svn.haxx.se/dev/archive-2004-07/0793.shtml suggests
that the problem may lie in the commit message for revision 4077: if it
has a control character in the 0x01-0x1f range (invalid in XML, apart
from tab, LF and CR), then
Subversion methods like http: will fail to retrieve it, while methods
like file: will succeed. I haven't tried svn+ssh: since I don't have an
SSH key on the server.
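
For checking a log message once you have its text, a small sketch along
these lines should do (the function name and the sample string are made
up; the set of allowed control characters follows the XML 1.0 "Char"
production, which permits tab, LF and CR):

```python
def invalid_xml_chars(text):
    """Return (index, code point) pairs for control chars XML 1.0 forbids."""
    allowed = {0x09, 0x0A, 0x0D}  # tab, LF, CR are legal in XML 1.0
    return [
        (i, ord(ch))
        for i, ch in enumerate(text)
        if ord(ch) < 0x20 and ord(ch) not in allowed
    ]


message = "Added \x1bhMoved"  # made-up stand-in for a damaged log message
print(invalid_xml_chars(message))
```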

Trying "svn log -r 4077 http://svn.python.org/projects/python/" also fails:

subversion/libsvn_ra_dav/util.c:780: (apr_err=175002)
svn: REPORT request failed on '/projects/!svn/bc/4077/python'
subversion/libsvn_ra_dav/util.c:760: (apr_err=175002)
svn: The REPORT request returned invalid XML in the response: XML parse
error at line 7: not well-formed (invalid token)
(/projects/!svn/bc/4077/python)

When I visit http://svn.python.org/view/python/?rev=4077, I can see the
offending log message. Sure enough, there's a 0x1b character in it,
between the space after "Added" and the "h" immediately before the word
"Moved".

This problem can be fixed by someone with root permissions on the SVN
server logging in and running the following:

echo "New commit message goes here" > new-message.txt
svnadmin setlog --bypass-hooks -r 4077 /path/to/repos new-message.txt

If there are other, similar problems later in the SVN repository, I was
unable to find them because the SVK mirror process consistently halts at
revision 4077. If revision 4077 is fixed and I turn up other log
problems, I'll report them as well.


- --
Robin Munn
rmunn at pobox.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDaRd46OLMk9ZJcBQRApjAAJ9K3Y5z1q4TulqwVjmZTZb9ZgY31ACcD8RI
fNFmGL2U4XaIKa2n6UUyxEA=
=tEbq
-----END PGP SIGNATURE-----

From noamraph at gmail.com  Wed Nov  2 21:36:54 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 2 Nov 2005 22:36:54 +0200
Subject: [Python-Dev] Should the default equality operator compare values
	instead of identities?
Message-ID: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com>

I think it should.

(I copy here messages from the thread about the default hash method.)

On 11/2/05, Michael Chermside <mcherm at mcherm.com> wrote:
> > Why not make the default __eq__ really compare the objects, that is,
> > their dicts and their slot-members?
>
> Short answer: not the desired behavior. Longer answer: there are
> three common patterns in object design. There are "value" objects,
> which should be considered equal if all fields are equal. There are
> "identity" objects which are considered equal only when they are
> the same object. And then there are (somewhat less common) "value"
> objects in which a few fields don't count -- they may be used for
> caching a pre-computed result for example. The default __eq__
> behavior has to cater to one of these -- clearly either "value"
> objects or "identity" objects. Guido chose to cater to "identity"
> objects believing that they are actually more common in most
> situations. A beneficial side-effect is that the default behavior
> of __eq__ is QUITE simple to explain, and if the implementation is
> easy to explain then it may be a good idea.
>
This is a very nice observation. I wish to explain why I think that
the default __eq__ should compare values, not identities.

1. If you want to compare identities, you can always use "is". There
is currently no easy way to compare your user-defined classes by
value, in case they happen to be "value objects", in Michael's
terminology - you have to compare every single member. (Comparing the
__dict__ attributes is ugly, and will not always work). If the default
were to compare the objects by value, and they happen to be "identity
objects", you can always do:
    def __eq__(self, other):
        return self is other

2. I believe that counter to what Michael said, "value objects" are
more common than "identity objects", at least when talking about
user-defined classes, and especially when talking about simple
user-defined classes, where the defaults are most important, since the
writer wouldn't care to define all the appropriate protocols. (this
was a long sentence) Can you give examples of common "identity
objects"? I believe that they are usually dealing with some
input/output, that is, with things that interact with the environment
(files, for example). I believe almost all "algorithmic" classes are
"value objects". And I think that usually, comparison based on value
will give the correct result for "identity objects" too, since if they
do I/O, they will usually hold a reference to an I/O object, like
file, which is an "identity object" by itself. This means that the
comparison will compare those objects, and return false, since the I/O
objects they hold are not the same one.

3. I think that value-based comparison is also quite easy to explain:
user-defined classes combine functions with a data structure. In
Python, the "data structure" is simply member names which reference
other objects. The default, value-based, comparison, checks if two
objects have the same member names, and that they are referencing
equal (by value) objects, and if so, returns True. I think that
explaining this is not harder than explaining the current dict
comparison.


Now, for Josiah's reply:

On 11/2/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > This leads me to another question: why should the default __eq__
> > method be the same as "is"? If someone wants to check if two objects
> > are the same object, that's what the "is" operator is for. Why not
> > make the default __eq__ really compare the objects, that is, their
> > dicts and their slot-members?
>
> Using 'is' makes sense when the default hash is id (and actually in
> certain other cases as well). Actually comparing the contents of an
> object is certainly not desirable with the default hash, and probably
> not desirable in the general case because equality doesn't always
> depend on /all/ attributes of extension objects.
>
>    Explicit is better than implicit.
>    In the face of ambiguity, refuse the temptation to guess.
>
I hope that the default hash will stop being id, as Josiah showed
that Guido decided, so let's not discuss it.

Now, about the good point that sometimes the state doesn't depend on
all the attributes. Right. But the current default doesn't compare
them well either - you have no escape from writing an equality operator
yourself. And I think this is not the common case.

I think that the meaning of "in the face of ambiguity, refuse the
temptation to guess" is that you should not write code that changes
its behaviour according to what the user will do, based on your guess
as to what he meant. This is not the case - the value-based comparison
is strictly defined. It may just not be what the user would want - and
in most cases, I think it will.

"Explicit is better than implicit" says only "better". identity-based
comparison is just as implicit as value-based comparison.

(I want to add that there is a simple way to support value-based
comparison when some members don't count, by writing a metaclass that
will check if your class has a member like
__non_state_members__ = ["_calculated_hash", "debug_member"]
and if so, would not compare them in the default equality-testing
method. I would say that this can even be made the behavior of the
default type.)
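
A minimal sketch of that idea, written for current Python 3 and using a
plain base class rather than a metaclass to keep it short. ValueObject,
Vector and the _state helper are made-up names; only
__non_state_members__ comes from the text above:

```python
class ValueObject(object):
    """Hypothetical base class: equality by instance state, minus exclusions."""
    __non_state_members__ = ()

    def _state(self):
        skip = set(self.__non_state_members__)
        return {k: v for k, v in vars(self).items() if k not in skip}

    def __eq__(self, other):
        if self is other:
            return True
        if type(self) is not type(other):
            return NotImplemented
        return self._state() == other._state()


class Vector(ValueObject):
    __non_state_members__ = ["_cached_length"]

    def __init__(self, x, y):
        self.x, self.y = x, y

    def length(self):
        # The cache is bookkeeping, not part of the vector's value.
        if not hasattr(self, "_cached_length"):
            self._cached_length = (self.x ** 2 + self.y ** 2) ** 0.5
        return self._cached_length


v1, v2 = Vector(3, 4), Vector(3, 4)
v1.length()  # only v1 now carries the cached member
print(v1 == v2, v1 == Vector(3, 5))
```

Note that in Python 3 a class defining __eq__ without __hash__ becomes
unhashable, so instances of this sketch cannot be used as dict keys.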

> I believe the current behavior of __eq__ is more desirable than
> comparing contents, as this may result in undesirable behavior
> (recursive compares on large nested objects are now slow, which used to
> be fast because default methods wouldn't cause a recursive comparison at
> all).

But if the default method doesn't do what you want, it doesn't matter
how fast it is. Remember that it's very easy to make recursive
comparisons, by comparing lists for example, and it hasn't disturbed
anyone.

To summarize, I think that value-based equality testing would usually
be what you want, and currently implementing it is a bit of a pain.

Concerning backwards-compatibility: show a warning in Python 2.5 when
the default equality test is being made, and change it in Python 2.6.

Comments, please!

Thanks,
Noam

From noamraph at gmail.com  Wed Nov  2 22:11:25 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 2 Nov 2005 23:11:25 +0200
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com>
References: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com>
Message-ID: <b348a0850511021311t456b8f6bg61763a5ea20497af@mail.gmail.com>

I've looked for classes in my /usr/lib/python2.4 directory. I won't go
over all the 7346 classes that were found there, but let's see:

"identity objects" that will continue to work because they contain
other "identity objects"
========================
SocketServer, and everything which inherits from it (like HTTPServer)
Queue
csv (contains _csv objects)

"value objects" that would probably gain a meaningful equality operator
============================================
StringIO
ConfigParser
markupbase, HTMLParser
HexBin, BinHex
cgi.FieldStorage
AST Nodes

others
======
Cookie - inherits from dict its __eq__ method.

I'll stop here. I was not strictly scientific, because I chose classes
that I thought that I might guess what they do easily, and perhaps
discarded classes that didn't look interesting to me. But I didn't
have any bad intention when choosing the classes. I have seen no class
whose equality operator the change would damage. I have seen quite
a lot of classes which don't define an equality operator, and for which
a value-based comparison would be the right way to compare them.

I'm becoming more convinced of my opinion.

Noam

From noamraph at gmail.com  Wed Nov  2 22:18:58 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 2 Nov 2005 23:18:58 +0200
Subject: [Python-Dev] Should the default equality operator compare
	valuesinstead of identities?
In-Reply-To: <001101c5dff0$462fa5c0$153dc797@oemcomputer>
References: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com>
	<001101c5dff0$462fa5c0$153dc797@oemcomputer>
Message-ID: <b348a0850511021318s1fbdd0a5u4605deb4464b0dc8@mail.gmail.com>

On 11/2/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > Should the default equality operator compare valuesinstead of
> identities?
>
> No.  Look back into last year's python-dev postings where we agreed that
> identity would always imply equality.  There were a number of practical
> reasons.  Also, there are a number of places in CPython where that
> assumption is implicit.
>
Perhaps you meant something else, or I didn't understand? Identity
implying equality also holds for value-based comparison. If the
default __eq__ operator compares by value, I would say that it would
do something like:

def __eq__(self, other):
    if self is other:
        return True
    if type(self) is not type(other):
        return False
    (compare the __dict__ and any __slots__, and if they are all ==,
     return True.)
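
Spelled out as runnable code, one possible reading of that last step
looks like the sketch below. ByValue, _slot_names and _UNSET are made-up
names, and private (name-mangled) slots are not handled:

```python
_UNSET = object()  # stands in for a slot that was never assigned


def _slot_names(cls):
    """Collect slot names declared anywhere in the class hierarchy."""
    names = []
    for klass in cls.__mro__:
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        names.extend(s for s in slots if s not in ("__dict__", "__weakref__"))
    return names


class ByValue(object):
    """Hypothetical stand-in for a value-comparing default __eq__."""
    __slots__ = ()

    def __eq__(self, other):
        if self is other:
            return True
        if type(self) is not type(other):
            return False
        # Instances without a __dict__ yield None on both sides here.
        if getattr(self, "__dict__", None) != getattr(other, "__dict__", None):
            return False
        return all(
            getattr(self, n, _UNSET) == getattr(other, n, _UNSET)
            for n in _slot_names(type(self))
        )


class Point(ByValue):
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x, self.y = x, y


class Tagged(ByValue):  # no __slots__, so instances get a __dict__
    def __init__(self, **kw):
        self.__dict__.update(kw)


print(Point(1, 2) == Point(1, 2), Tagged(a=1) == Tagged(a=2))
```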

Noam

From martin at v.loewis.de  Wed Nov  2 23:10:12 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 02 Nov 2005 23:10:12 +0100
Subject: [Python-Dev] Problems with revision 4077 of new SVN repository
In-Reply-To: <4369177D.3020000@pobox.com>
References: <4369177D.3020000@pobox.com>
Message-ID: <43693944.3090803@v.loewis.de>

Robin Munn wrote:
> echo "New commit message goes here" > new-message.txt
> svnadmin setlog --bypass-hooks -r 4077 /path/to/repos new-message.txt

Thanks for pointing that out, and for giving those instructions.
I now corrected the log message.

Regards,
Martin

From rmunn at pobox.com  Thu Nov  3 00:14:50 2005
From: rmunn at pobox.com (Robin Munn)
Date: Wed, 02 Nov 2005 17:14:50 -0600
Subject: [Python-Dev] Problems with revision 4077 of new SVN repository
In-Reply-To: <43693944.3090803@v.loewis.de>
References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de>
Message-ID: <4369486A.8090107@pobox.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin v. Löwis wrote:
> Robin Munn wrote:
> 
>> echo "New commit message goes here" > new-message.txt
>> svnadmin setlog --bypass-hooks -r 4077 /path/to/repos new-message.txt
> 
> 
> Thanks for pointing that out, and for giving those instructions.
> I now corrected the log message.

Revision 4077 is fine now. However, the same problem exists in revision
4284, which has a 0x01 character before the word "add". Same solution:

echo "New commit message goes here" > new-message.txt
svnadmin setlog --bypass-hooks -r 4284 /path/to/repos new-message.txt

If there are two errors of the same type within about 200 revisions,
there may be more. I'm currently running "svn log" on every revision in
the Python SVN repository to see if I find any more errors of this type,
so that I don't have to hunt them down one-by-one by rerunning SVK. I'll
post my findings when I'm done.


- --
Robin Munn
rmunn at pobox.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDaUho6OLMk9ZJcBQRAg5eAJ9cJTPKX69DhXJyoT/cDV5GmZlC3QCfRj/E
wCix8IYU8xbh5/Ibnpa+kg4=
=+jLR
-----END PGP SIGNATURE-----

From greg.ewing at canterbury.ac.nz  Thu Nov  3 01:39:44 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 03 Nov 2005 13:39:44 +1300
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
Message-ID: <43695C50.5070600@canterbury.ac.nz>

Noam Raphael wrote:

> 3. If someone does want to associate values with objects, he can
> explicitly use id:
> dct[id(x)] = 3.

This is fragile. Once all references to x are dropped,
it is possible for another object to be created having
the same id that x used to have. The dict now
unintentionally references the new object.
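
A small sketch of the problem (Thing and the dict names are made up).
Whether the new object really gets the old id depends on the
implementation and on what else is allocated in between, so the second
print may show either result:

```python
class Thing(object):
    pass


by_id = {}
x = Thing()
key = id(x)
by_id[key] = 3

del x        # last reference gone; CPython frees the object right away
y = Thing()  # may or may not be given the id that x used to have

print(key in by_id)  # the stale entry is still in the dict
print(id(y) == key)  # implementation-dependent: the id can be reused

# Using the object itself as the key keeps it alive, so its id stays taken.
by_obj = {}
z = Thing()
by_obj[z] = 3
print(by_obj[z])
```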

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From jcarlson at uci.edu  Thu Nov  3 02:16:40 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 02 Nov 2005 17:16:40 -0800
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com>
References: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com>
Message-ID: <20051102125437.F290.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> On 11/2/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > I believe the current behavior of __eq__ is more desirable than
> > comparing contents, as this may result in undesirable behavior
> > (recursive compares on large nested objects are now slow, which used to
> > be fast because default methods wouldn't cause a recursive comparison at
> > all).
> 
> But if the default method doesn't do what you want, it doesn't matter
> how fast it is. Remember that it's very easy to make recursive
> comparisons, by comparing lists for example, and it hasn't disturbed
> anyone.

Right, but lists (dicts, tuples, etc.) are defined as containers, and
their comparison operation is defined on their contents.  Objects are
not defined as containers in the general case, so defining comparisons
based on their contents (as opposed to identity) is just one of the two
assumptions to be made.

I personally like the current behavior, and I see no /compelling/ reason
to change it.  You obviously feel so compelled for the behavior to
change that you are willing to express your desires.  How about you do
something more productive and produce a patch which implements the
changes you want, verify that it passes tests in the standard library,
then post it on SourceForge.  If someone is similarly compelled and
agrees with you (so far I've not seen any public support for your
proposal by any of the core developers), the discussion will restart,
and it will be decided (not by you or me).


> To summarize, I think that value-based equality testing would usually
> be what you want, and currently implementing it is a bit of a pain.

Actually, implementing value-based equality testing, when you have a
finite set of values you want to test, is quite easy.

def __eq__(self, other):
    for i in self.__cmp_eq__:
        if getattr(self, i) != getattr(other, i):
            return False
    return True

Add a simple metaclass that discovers all of those values automatically,
and/or your own protocol for exclusion, and you are done.  Remember, not
all 5-line functions should become builtin/default behavior, and this
implementation shows that it is not a significant burden for you (or
anyone else) to implement this in your own custom library.
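
One way such a metaclass might look, as a sketch in current Python 3
syntax. AutoCmp, Comparable, Point and Cached are made-up names, and
discovering the values from __slots__ is an assumption of this example;
a class can still list __cmp_eq__ by hand to exclude members:

```python
class AutoCmp(type):
    """Hypothetical metaclass: fill __cmp_eq__ from __slots__ unless given."""

    def __new__(mcls, name, bases, namespace):
        if "__cmp_eq__" not in namespace:
            inherited = []
            for base in bases:
                inherited.extend(getattr(base, "__cmp_eq__", ()))
            slots = namespace.get("__slots__", ())
            if isinstance(slots, str):
                slots = (slots,)
            namespace["__cmp_eq__"] = tuple(inherited) + tuple(slots)
        return super().__new__(mcls, name, bases, namespace)


class Comparable(metaclass=AutoCmp):
    __slots__ = ()

    def __eq__(self, other):
        if type(self) is not type(other):
            return NotImplemented
        for name in self.__cmp_eq__:
            if getattr(self, name) != getattr(other, name):
                return False
        return True


class Point(Comparable):
    __slots__ = ("x", "y")  # discovered automatically

    def __init__(self, x, y):
        self.x, self.y = x, y


class Cached(Comparable):
    __slots__ = ("key", "_memo")
    __cmp_eq__ = ("key",)  # explicit list: _memo does not count

    def __init__(self, key, memo):
        self.key, self._memo = key, memo


print(Point.__cmp_eq__, Point(1, 2) == Point(1, 2))
print(Cached("k", 1) == Cached("k", 2))
```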

 - Josiah


P.S. One thing that you should remember is that even if your patch is
accepted, and even if this is desirable, Python 2.5 is supposed to be
released sometime next year (spring/summer?), and because it is a
backwards incompatible change, would need at least 2.6-2.7 before it
becomes the default behavior without a __future__ import, which is
another 3-4 years down the line.

I understand you are passionate, really I do (you should see some of my
proposals), but by the time these things get around to getting into
mainline Python, there are high odds that you probably won't care about
them much anymore (I've come to feel that way myself about many of my
proposals), and I think it is a good idea to attempt to balance - when
it comes to Python - "Now is better than never." and "Although never is
often better than *right* now."

Removing __hash__, changing __eq__, and trying to get in copy-on-write
freezing (which is really copy-and-cache freezing), all read to me like
"We gotta do this now!", which certainly isn't helping the proposal.


From rmunn at pobox.com  Thu Nov  3 03:13:33 2005
From: rmunn at pobox.com (Robin Munn)
Date: Wed, 02 Nov 2005 20:13:33 -0600
Subject: [Python-Dev] Problems with revision 4077 of new SVN repository
In-Reply-To: <4369486A.8090107@pobox.com>
References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de>
	<4369486A.8090107@pobox.com>
Message-ID: <4369724D.8060001@pobox.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robin Munn wrote:
> Revision 4077 is fine now. However, the same problem exists in revision
> 4284, which has a 0x01 character before the word "add". Same solution:
> 
> echo "New commit message goes here" > new-message.txt
> svnadmin setlog --bypass-hooks -r 4284 /path/to/repos new-message.txt
> 
> If there are two errors of the same type within about 200 revisions,
> there may be more. I'm currently running "svn log" on every revision in
> the Python SVN repository to see if I find any more errors of this type,
> so that I don't have to hunt them down one-by-one by rerunning SVK. I'll
> post my findings when I'm done.

My script is up to revision 17500 with no further problems found; I now
believe that 4077 and 4284 were isolated cases. Once 4284 is fixed, it
should now be possible to SVK-mirror the entire repository.


- --
Robin Munn
rmunn at pobox.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDaXJF6OLMk9ZJcBQRAtZpAJ9iE1SlRJiQQOdIuBFuvjmQG3gshACgl9/A
vbsGD0bX3NCirQC5qtxdLYo=
=sgk/
-----END PGP SIGNATURE-----

From martin at v.loewis.de  Thu Nov  3 08:57:30 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 03 Nov 2005 08:57:30 +0100
Subject: [Python-Dev] Problems with revision 4077 of new SVN repository
In-Reply-To: <4369724D.8060001@pobox.com>
References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de>
	<4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com>
Message-ID: <4369C2EA.6030407@v.loewis.de>

Robin Munn wrote:
>>Revision 4077 is fine now. However, the same problem exists in revision
>>4284, which has a 0x01 character before the word "add". Same solution:

I now have fixed that as well.

Regards,
Martin

From rmunn at pobox.com  Thu Nov  3 09:07:43 2005
From: rmunn at pobox.com (Robin Munn)
Date: Thu, 03 Nov 2005 02:07:43 -0600
Subject: [Python-Dev] Problems with revision 4077 of new SVN repository
In-Reply-To: <4369C2EA.6030407@v.loewis.de>
References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de>
	<4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com>
	<4369C2EA.6030407@v.loewis.de>
Message-ID: <4369C54F.3050803@pobox.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin v. Löwis wrote:
> Robin Munn wrote:
> 
>>> Revision 4077 is fine now. However, the same problem exists in revision
>>> 4284, which has a 0x01 character before the word "add". Same solution:
> 
> 
> I now have fixed that as well.
> 
> Regards,
> Martin

And my script just finished running, with no further errors of this type
found. So doing an SVK mirror of the repository should work now, barring
any further surprises. I'm starting the SVK sync now; we'll see what
happens.

Thanks for fixing these!


- --
Robin Munn
rmunn at pobox.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDacVN6OLMk9ZJcBQRApUbAJ9+Ly5vPr8HRmoRbwJ3po4IWe8PBwCePTdm
XNx8HGqPvs7fwahHuJSogMw=
=a6Nc
-----END PGP SIGNATURE-----

From mwh at python.net  Thu Nov  3 13:48:06 2005
From: mwh at python.net (Michael Hudson)
Date: Thu, 03 Nov 2005 12:48:06 +0000
Subject: [Python-Dev] PyPy 0.8.0 is released!
Message-ID: <2mbr113njt.fsf@starship.python.net>

pypy-0.8.0: Translatable compiler/parser and some more speed 
==============================================================

The PyPy development team has been busy working and we've now packaged 
our latest improvements, completed work and new experiments as 
version 0.8.0, our third public release.

The highlights of this third release of PyPy are:

- Translatable parser and AST compiler. PyPy now integrates its own
  compiler based on Python's own 'compiler' package but with a number
  of fixes and code simplifications in order to get it translated 
  with the rest of PyPy.  This makes using the translated pypy 
  interactively much more pleasant, as compilation is considerably 
  faster than in 0.7.0.

- Some speed enhancements. Translated PyPy is now about 10 times
  faster than 0.7 but still 10-20 times slower than
  CPython on pystones and other benchmarks.  At the same time,
  language compliance has been slightly increased compared to 0.7
  which had already reached major CPython compliance goals.

- Some experimental features are now translatable.  Since 0.6.0, PyPy
  shipped with an experimental Object Space (the part of PyPy
  implementing Python object operations and manipulation) implementing
  lazily computed objects, the "Thunk" object space. With 0.8.0 this
  object space can also be translated preserving its feature
  additions.

What is PyPy (about)? 
------------------------------------------------

PyPy is an MIT-licensed research-oriented reimplementation of
Python written in Python itself, flexible and easy to
experiment with.  It translates itself to lower level
languages.  Our goals are to target a large variety of
platforms, small and large, by providing a compilation toolsuite
that can produce custom Python versions.  Platform, Memory and
Threading models are to become aspects of the translation
process - as opposed to encoding low level details into a
language implementation itself.  Eventually, dynamic
optimization techniques - implemented as another translation
aspect - should become robust against language changes.

Note that PyPy is mainly a research and development project
and does not by itself focus on getting a production-ready
Python implementation although we do hope and expect it to
become a viable contender in that area sometime next year. 

PyPy is partially funded as a research project under the 
European Union's IST programme. 

Where to start? 
-----------------------------

Getting started:    http://codespeak.net/pypy/dist/pypy/doc/getting-started.html

PyPy Documentation: http://codespeak.net/pypy/dist/pypy/doc/ 

PyPy Homepage:      http://codespeak.net/pypy/

The interpreter and object model implementations shipped with
the 0.8 version can run on their own and implement the core
language features of Python as of CPython 2.4.  However, we still
do not recommend using PyPy for anything other than education,
play or research purposes.

Ongoing work and near term goals
---------------------------------

At the last sprint in Paris we started exploring the new directions of
our work, in terms of extending and optimising PyPy further. We
started to scratch the surface of Just-In-Time compiler related work,
which we still expect will be the major source of our future speed
improvements, and a fair amount of work has been done successfully on
the support needed for stackless-like features.
  
This release also includes snapshots, in preliminary or embryonic
form, of the following interesting but not yet completed subprojects:

- The OOtyper, a RTyper variation for higher-level backends 
  (Squeak, ...)
- A JavaScript backend
- A limited (PPC) assembler backend (this is related to the JIT)
- some bits for a socket module

PyPy has been developed during approximately 16 coding sprints across
Europe and the US.  It continues to be a very dynamically and
incrementally evolving project with many of these one-week workshops
to follow.

PyPy has been a community effort from the start and it would
not have got that far without the coding and feedback support
from numerous people.   Please feel free to give feedback and 
raise questions. 

    contact points: http://codespeak.net/pypy/dist/pypy/doc/contact.html


have fun, 
    
    the pypy team, (Armin Rigo, Samuele Pedroni, 
    Holger Krekel, Christian Tismer, 
    Carl Friedrich Bolz, Michael Hudson, 
    and many others: http://codespeak.net/pypy/dist/pypy/doc/contributor.html)

PyPy development and activities happen as an open source project  
and with the support of a consortium partially funded by a two 
year European Union IST research grant. The full partners of that 
consortium are: 
        
    Heinrich-Heine University (Germany), AB Strakt (Sweden)
    merlinux GmbH (Germany), tismerysoft GmbH (Germany) 
    Logilab Paris (France), DFKI GmbH (Germany)
    ChangeMaker (Sweden), Impara (Germany)

From theller at python.net  Thu Nov  3 21:01:35 2005
From: theller at python.net (Thomas Heller)
Date: Thu, 03 Nov 2005 21:01:35 +0100
Subject: [Python-Dev] PYTHOPN_API_VERSION
Message-ID: <br11xzz4.fsf@python.net>

Shouldn't PYTHON_API_VERSION be different between 2.3 and 2.4?
It is 1012 in both versions.

I tried to detect whether PyTuple_Pack is supported, which was added in
2.4. Or is this only to detect changed apis, and not added apis?

Thomas


From Jack.Jansen at cwi.nl  Thu Nov  3 22:29:37 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Thu, 3 Nov 2005 22:29:37 +0100
Subject: [Python-Dev] Proposal: can we have a python-dev-announce mailing
	list?
Message-ID: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>

As people may have noticed (or possibly not:-) I've been rather  
inactive on python-dev the last year or so, due to being completely  
inundated with other work. Too bad that I've missed all the  
interesting discussions on Python 3000, but I'm bound to catch up  
some time later this year:-).

BUT: what I also missed are all the important announcements, such as  
new releases, the switch to svn, and a couple more (I think).

I know I would be much helped with a moderated python-dev-announce  
mailing list, which would be only low-volume, time-critical  
announcements for people developing Python. Even during times when I  
am actively following python-dev it would be handy to have important  
announcements coming in to a separate mailbox instead of buried
under design discussions and such...
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma  
Goldman



From phd at mail2.phd.pp.ru  Thu Nov  3 22:36:59 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Fri, 4 Nov 2005 00:36:59 +0300
Subject: [Python-Dev] Proposal: can we have a python-dev-announce
	mailing list?
In-Reply-To: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>
References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>
Message-ID: <20051103213659.GA26132@phd.pp.ru>

On Thu, Nov 03, 2005 at 10:29:37PM +0100, Jack Jansen wrote:
> I know I would be much helped with a moderated python-dev-announce  
> mailing list, which would be only low-volume

   http://www.google.com/search?q=python-dev+summary+site%3Amail.python.org

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From jcarlson at uci.edu  Thu Nov  3 22:52:25 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 03 Nov 2005 13:52:25 -0800
Subject: [Python-Dev] Proposal: can we have a python-dev-announce
	mailing list?
In-Reply-To: <20051103213659.GA26132@phd.pp.ru>
References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>
	<20051103213659.GA26132@phd.pp.ru>
Message-ID: <20051103134856.BFB2.JCARLSON@uci.edu>


Even when they are on the ball, the summaries generally appear one week
after the discussion/execution happens.  That doesn't do much for the
'time-critical' aspect which, I would imagine, is about as important as
the 'low-volume' aspect.

 - Josiah


Oleg Broytmann <phd at oper.phd.pp.ru> wrote:
> 
> On Thu, Nov 03, 2005 at 10:29:37PM +0100, Jack Jansen wrote:
> > I know I would be much helped with a moderated python-dev-announce  
> > mailing list, which would be only low-volume
> 
>    http://www.google.com/search?q=python-dev+summary+site%3Amail.python.org
> 
> Oleg.
> -- 
>      Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
>            Programmers don't die, they just GOSUB without RETURN.


From Jack.Jansen at cwi.nl  Thu Nov  3 22:51:14 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Thu, 3 Nov 2005 22:51:14 +0100
Subject: [Python-Dev] Proposal: can we have a python-dev-announce
	mailing list?
In-Reply-To: <20051103213659.GA26132@phd.pp.ru>
References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>
	<20051103213659.GA26132@phd.pp.ru>
Message-ID: <5630A610-FB3B-4359-8E86-39CBF074CF0D@cwi.nl>


On 3-nov-2005, at 22:36, Oleg Broytmann wrote:

> On Thu, Nov 03, 2005 at 10:29:37PM +0100, Jack Jansen wrote:
>
>> I know I would be much helped with a moderated python-dev-announce
>> mailing list, which would be only low-volume
>>
>
>    http://www.google.com/search?q=python-dev+summary+site%3Amail.python.org

Hmm. I wouldn't mind if it was push instead of pull, I wouldn't mind
if it was in the right order, and I wouldn't mind if it was more
concise:-)

But: I'll just wait to see whether more people chime in that they'd  
like this, or that I'm alone...
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma  
Goldman



From skip at pobox.com  Thu Nov  3 22:55:23 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 3 Nov 2005 15:55:23 -0600
Subject: [Python-Dev] Proposal: can we have a python-dev-announce
 mailing list?
In-Reply-To: <20051103213659.GA26132@phd.pp.ru>
References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>
	<20051103213659.GA26132@phd.pp.ru>
Message-ID: <17258.34635.582411.34526@montanaro.dyndns.org>

    >> I know I would be much helped with a moderated python-dev-announce
    >> mailing list, which would be only low-volume

    Oleg>    http://www.google.com/search?q=python-dev+summary+site%3Amail.python.org

That works up to a point; however, the python-dev summaries only come out
once every couple of weeks, so probably aren't going to catch important
stuff that comes and goes with less than a two-week lifespan.  Alerts that
machines are going down for maintenance fall into this category.  Also, I
think the cvs->svn switch probably didn't take more than a few days once the
ball got rolling.  I think Martin announced the demise of the SF repository
around 20 October, with a cutover date of 26 October.

Skip


From martin at v.loewis.de  Thu Nov  3 23:08:55 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 03 Nov 2005 23:08:55 +0100
Subject: [Python-Dev] PYTHOPN_API_VERSION
In-Reply-To: <br11xzz4.fsf@python.net>
References: <br11xzz4.fsf@python.net>
Message-ID: <436A8A77.4040306@v.loewis.de>

Thomas Heller wrote:
> Shouldn't PYTHON_API_VERSION be different between 2.3 and 2.4?
> It is 1012 in both versions.
> 
> I tried to detect whether PyTuple_Pack is supported, which was added in
> 2.4. Or is this only to detect changed apis, and not added apis?

It's meant to detect changes that can break existing binary modules.
In most cases, this would be changed structs.
Whether such changes happened between 2.3 and 2.4, I don't know.

If you want to ask whether a certain function is present, either use
autoconf, or check for the Python (hex) version where it was first
introduced.
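
A minimal sketch of the hex-version check, shown from the Python side
via sys.hexversion; in C the same packed number is available as the
PY_VERSION_HEX macro.  The helper name `at_least` and the constant
name are only illustrative.

```python
import sys

# PyTuple_Pack first appeared in Python 2.4.  The hex version packs
# major, minor, micro, release level and serial into one integer;
# 0x020400F0 is the 2.4.0 final release.
PY_2_4_FINAL = 0x020400F0


def at_least(hexversion, minimum):
    """Return True if a packed hex version is at or above `minimum`."""
    return hexversion >= minimum


# True on any interpreter new enough to provide PyTuple_Pack.
print(at_least(sys.hexversion, PY_2_4_FINAL))
```

An extension module would make the equivalent comparison in the
preprocessor, so that code using the newer function is only compiled
when the headers are recent enough.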

Regards,
Martin

From martin at v.loewis.de  Thu Nov  3 23:16:42 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 03 Nov 2005 23:16:42 +0100
Subject: [Python-Dev] Proposal: can we have a
 python-dev-announce	mailing list?
In-Reply-To: <5630A610-FB3B-4359-8E86-39CBF074CF0D@cwi.nl>
References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>	<20051103213659.GA26132@phd.pp.ru>
	<5630A610-FB3B-4359-8E86-39CBF074CF0D@cwi.nl>
Message-ID: <436A8C4A.8090908@v.loewis.de>

Jack Jansen wrote:
> Hmm. I wouldn't mind if it was push instead of pull, I wouldn't mind
> if it was in the right order, and I wouldn't mind if it was more
> concise:-)
> 
> But: I'll just wait to see whether more people chime in that they'd  
> like this, or that I'm alone...

I'm -1 on such a list. If it existed, people could complain "why wasn't
this announced properly". So the "blame" would be on people who
failed to give proper notice, instead of on the people who did not
care enough to follow the entire discussion.

More specifically, I'm sure I would have forgotten to post about the
svn switchover to python-dev-announce, just as I failed to post to
comp.lang.python.announce.

This is all volunteer work.

Regards,
Martin

From t-meyer at ihug.co.nz  Fri Nov  4 00:41:12 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Fri, 4 Nov 2005 12:41:12 +1300
Subject: [Python-Dev] Proposal: can we have a python-dev-announce
	mailing list?
In-Reply-To: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>
References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl>
Message-ID: <27F1C5EB-9459-4F1F-A43F-9898941D66CF@ihug.co.nz>

> I know I would be much helped with a moderated python-dev-announce
> mailing list, which would be only low-volume, time-critical
> announcements for people developing Python. Even during times when I
> am actively following python-dev it would be handy to have important
> announcements coming in to a separate mailbox instead of buried
> under design discussions and such...

Firstly, my apologies for the current delay in summaries, which  
exacerbates this problem (although others are right when they say  
that things sometimes happen too fast even for on-time summaries).

A while back there was talk about a mailing list for PEP changes and  
the solution was instead to use the "topic" feature of mailman,  
essentially creating a subset-mailing-list.  Would something like  
this be feasible for this?  (I don't really know enough about how the
topic feature can be used to know if it is workable or not).

I presume that this would still need some sort of action from the  
poster (e.g. including a tag somewhere), but it would probably be  
easier for people to remember to do that than cross-post to another  
list entirely.

=Tony.Meyer


From rmunn at pobox.com  Fri Nov  4 01:17:32 2005
From: rmunn at pobox.com (Robin Munn)
Date: Thu, 03 Nov 2005 18:17:32 -0600
Subject: [Python-Dev] No more problems with new SVN repository
In-Reply-To: <4369C54F.3050803@pobox.com>
References: <4369177D.3020000@pobox.com>
	<43693944.3090803@v.loewis.de>	<4369486A.8090107@pobox.com>
	<4369724D.8060001@pobox.com>	<4369C2EA.6030407@v.loewis.de>
	<4369C54F.3050803@pobox.com>
Message-ID: <436AA89C.6050401@pobox.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robin Munn wrote:
> So doing an SVK mirror of the repository should work now, barring
> any further surprises. I'm starting the SVK sync now; we'll see what
> happens.

Confirmed; the SVK mirror took about 18 hours, but it completed
successfully with no further problems.

Again, thanks for fixing the issues so quickly.


- --
Robin Munn
rmunn at pobox.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDaqiZ6OLMk9ZJcBQRAjGuAJwLmbrxBgrHYUb/7LOvjq89GfKrWACghGgn
pvuMT5edAfMw3OAoZf5mJiw=
=2i88
-----END PGP SIGNATURE-----

From guido at python.org  Fri Nov  4 01:21:15 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 3 Nov 2005 16:21:15 -0800
Subject: [Python-Dev] No more problems with new SVN repository
In-Reply-To: <436AA89C.6050401@pobox.com>
References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de>
	<4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com>
	<4369C2EA.6030407@v.loewis.de> <4369C54F.3050803@pobox.com>
	<436AA89C.6050401@pobox.com>
Message-ID: <ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com>

I have a question after this exhilarating exchange.

Is there a way to prevent this kind of thing in the future, e.g. by
removing or rejecting change log messages with characters that are
considered invalid in XML?

(Or should perhaps the fix be to suppress or quote these characters
somehow in XML?)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Fri Nov  4 07:31:03 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 04 Nov 2005 07:31:03 +0100
Subject: [Python-Dev] No more problems with new SVN repository
In-Reply-To: <ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com>
References: <4369177D.3020000@pobox.com>
	<43693944.3090803@v.loewis.de>	<4369486A.8090107@pobox.com>
	<4369724D.8060001@pobox.com>	<4369C2EA.6030407@v.loewis.de>
	<4369C54F.3050803@pobox.com>	<436AA89C.6050401@pobox.com>
	<ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com>
Message-ID: <436B0027.6010808@v.loewis.de>

Guido van Rossum wrote:
> I have a question after this exhilarating exchange.
> 
> Is there a way to prevent this kind of thing in the future, e.g. by
> removing or rejecting change log messages with characters that are
> considered invalid in XML?

I don't think it can happen again. Without testing, I would hope
subversion rejects log messages that contain "random" control
characters (if it doesn't, I should report that as a bug).

The characters are in there because of the CVS conversion (that
might be a bug in cvs2svn, which should have replaced them perhaps).
It only happened in very old log messages - so perhaps even CVS
doesn't allow them anymore.

In XML 1.0, there is a lot of confusion about including control
characters in text. XML 1.1 clarified that you can include them,
but only through character references. So in the future,
subversion might be able to transmit such log messages in
well-formed webdav.
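
To illustrate the failure mode, here is a small sketch using the
standard library's ElementTree parser, which treats its input as
XML 1.0.  The helper name `is_well_formed` and the sample log text
are made up for the example; this is not how Subversion or SVK parse
their data.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the parser accepts `document` as XML."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# An ordinary log message parses fine.
print(is_well_formed("<log>plain message</log>"))
# A raw control character inside the text makes the document ill-formed.
print(is_well_formed("<log>stray \x01 byte</log>"))
```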

Regards,
Martin

From fredrik at pythonware.com  Fri Nov  4 10:05:40 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 4 Nov 2005 10:05:40 +0100
Subject: [Python-Dev] Adding examples to PEP 263
Message-ID: <dkf895$4p9$1@sea.gmane.org>

the runtime warning you get when you use non-ascii characters in
python source code points the poor user to this page:

    http://www.python.org/peps/pep-0263.html

which tells the user to add a

    # -*- coding: <encoding name> -*-

to the source, and then provides a more detailed syntax description
as a RE pattern.  to help people that didn't grow up with emacs, and
don't speak fluent RE, and/or prefer to skim documentation, it would
be quite helpful if the page also contained a few examples; e.g.

# -*- coding: utf-8 -*-
# -*- coding: iso-8859-1 -*-

can anyone with SVN write access perhaps add this?
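
For anyone curious how such a line is recognised, here is a rough
sketch.  It assumes the pattern the PEP gives, coding[:=]\s*([-\w.]+),
applied to a comment on the first or second line; the function name
`declared_encoding` is made up, and the real tokenizer is more careful
than this.

```python
import re

# Pattern as described in PEP 263 (assumed here, see the PEP itself).
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(source_lines):
    """Return the encoding named in the first two lines, or None."""
    for line in source_lines[:2]:
        # Only comment lines can carry the declaration.
        if line.lstrip().startswith("#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1)
    return None


print(declared_encoding(["# -*- coding: utf-8 -*-", "print('hi')"]))
print(declared_encoding(["#!/usr/bin/env python",
                         "# -*- coding: iso-8859-1 -*-"]))
print(declared_encoding(["x = 1"]))
```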

(I'd probably add a note to the top of the page for anyone who arrives
there via a Python error message, which summarizes the pep and provides
an example or two; abstracts and rationales are nice, but if you're just a
plain user, a "do this; here's how it works; further discussion below" style
is a bit more practical...)

</F>




From mal at egenix.com  Fri Nov  4 10:27:43 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 04 Nov 2005 10:27:43 +0100
Subject: [Python-Dev] Adding examples to PEP 263
In-Reply-To: <dkf895$4p9$1@sea.gmane.org>
References: <dkf895$4p9$1@sea.gmane.org>
Message-ID: <436B298F.3050803@egenix.com>

Fredrik Lundh wrote:
> the runtime warning you get when you use non-ascii characters in
> python source code points the poor user to this page:
> 
>     http://www.python.org/peps/pep-0263.html
> 
> which tells the user to add a
> 
>     # -*- coding: <encoding name> -*-
> 
> to the source, and then provides a more detailed syntax description
> as a RE pattern.  to help people that didn't grow up with emacs, and
> don't speak fluent RE, and/or prefer to skim documentation, it would
> be quite helpful if the page also contained a few examples; e.g.
> 
> # -*- coding: utf-8 -*-
> # -*- coding: iso-8859-1 -*-
> 
> can anyone with SVN write access perhaps add this?

Good point. I'll add some examples.

> (I'd probably add a note to the top of the page for anyone who arrives
> there via a Python error message, which summarizes the pep and provides
> an example or two; abstracts and rationales are nice, but if you're just a
> plain user, a "do this; here's how it works; further discussion below" style
> is a bit more practical...)

The PEP isn't all that long, so I don't think a summary would
help. However, we might want to point the user to a different
URL in the error message, e.g. a Wiki page with more user-friendly
content.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 04 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2005-10-17: Released mxODBC.Zope.DA 1.0.9        http://zope.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From noamraph at gmail.com  Fri Nov  4 13:02:31 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Fri, 4 Nov 2005 14:02:31 +0200
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <43695C50.5070600@canterbury.ac.nz>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
	<43695C50.5070600@canterbury.ac.nz>
Message-ID: <b348a0850511040402v1edadf2ehc962166c329538df@mail.gmail.com>

On 11/3/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > 3. If someone does want to associate values with objects, he can
> > explicitly use id:
> > dct[id(x)] = 3.
>
> This is fragile. Once all references to x are dropped,
> it is possible for another object to be created having
> the same id that x used to have. The dict now
> unintentionally references the new object.
>
You are right. Please see the simple "ref" class that I wrote in my
previous post, which solves this problem.
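
For readers without that post at hand, here is a hypothetical sketch
of one way to get identity-based keys without the id-reuse problem.
It is not the "ref" class from the earlier message, which is not
reproduced here; the name `IdentityKey` is invented for illustration.

```python
class IdentityKey(object):
    """Hypothetical wrapper: hashes and compares by object identity."""

    def __init__(self, obj):
        # Holding a strong reference keeps obj alive, so its id cannot
        # be reused by another object while this key exists.
        self.obj = obj

    def __hash__(self):
        return id(self.obj)

    def __eq__(self, other):
        return isinstance(other, IdentityKey) and self.obj is other.obj

    def __ne__(self, other):
        return not self.__eq__(other)


x = [1, 2]
y = [1, 2]                     # equal to x, but a different object
dct = {IdentityKey(x): 3}
print(dct[IdentityKey(x)])     # found: same underlying object
print(IdentityKey(y) in dct)   # not found: y is a different object
```

The price of this approach is that the dict keeps x alive for as long
as the entry exists.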

Noam

From steve at holdenweb.com  Fri Nov  4 13:03:44 2005
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 04 Nov 2005 12:03:44 +0000
Subject: [Python-Dev] Adding examples to PEP 263
In-Reply-To: <436B298F.3050803@egenix.com>
References: <dkf895$4p9$1@sea.gmane.org> <436B298F.3050803@egenix.com>
Message-ID: <dkfin0$1qa$2@sea.gmane.org>

M.-A. Lemburg wrote:
> Fredrik Lundh wrote:
> 
>>the runtime warning you get when you use non-ascii characters in
>>python source code points the poor user to this page:
>>
>>    http://www.python.org/peps/pep-0263.html
>>
>>which tells the user to add a
>>
>>    # -*- coding: <encoding name> -*-
>>
>>to the source, and then provides a more detailed syntax description
>>as a RE pattern.  to help people that didn't grow up with emacs, and
>>don't speak fluent RE, and/or prefer to skim documentation, it would
>>be quite helpful if the page also contained a few examples; e.g.
>>
>># -*- coding: utf-8 -*-
>># -*- coding: iso-8859-1 -*-
>>
>>can anyone with SVN write access perhaps add this?
> 
> 
> Good point. I'll add some examples.
> 
> 
>>(I'd probably add a note to the top of the page for anyone who arrives
>>there via a Python error message, which summarizes the pep and provides
>>an example or two; abstracts and rationales are nice, but if you're just a
>>plain user, a "do this; here's how it works; further discussion below" style
>>is a bit more practical...)
> 
> 
> The PEP isn't all that long, so I don't think a summary would
> help. However, we might want to point the user to a different
> URL in the error message, e.g. a Wiki page with more user-friendly
> content.
> 
Under NO circumstances should a Wiki page be used as the destination for 
a link in a runtime error message.

If the page happens to be spammed when the user follows the link they'll 
wonder why the error message is pointing to a page full of links to hot 
babes, or whatever.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From pinard at iro.umontreal.ca  Fri Nov  4 16:32:24 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Fri, 4 Nov 2005 10:32:24 -0500
Subject: [Python-Dev] No more problems with new SVN repository
In-Reply-To: <ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com>
References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de>
	<4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com>
	<4369C2EA.6030407@v.loewis.de> <4369C54F.3050803@pobox.com>
	<436AA89C.6050401@pobox.com>
	<ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com>
Message-ID: <20051104153224.GA22469@alcyon.progiciels-bpi.ca>

[Guido van Rossum]

> Is there a way to prevent this kind of thing in the future, e.g. by
> removing or rejecting change log messages with characters that are
> considered invalid in XML?

Suppose TOP is the top of the Subversion repository.  The easiest way is
providing a TOP/hooks/pre-commit script.  If the script exits with
non-zero status, usually with some clear diagnostic on stderr, the whole 
commit aborts, and the diagnostic is shown to the committing user.

The tricky part is getting the tentative log message from within the 
script.  This is done by popening "svnlook log -t ARG2 ARG1", where ARG1 
and ARG2 are arguments given to the pre-commit script.
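
A minimal sketch of such a hook, assuming Python 3 is available on
the server and svnlook is on the PATH.  The names `BAD_BYTES` and
`has_bad_bytes` and the wording of the diagnostic are made up; the
byte ranges are the control characters XML 1.0 does not allow.

```python
import re
import subprocess
import sys

# Control characters XML 1.0 does not allow (tab, LF and CR are fine).
BAD_BYTES = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def has_bad_bytes(log_message):
    """Return True if the raw log message contains a forbidden byte."""
    return BAD_BYTES.search(log_message) is not None


def main(argv):
    # Subversion calls the hook as: pre-commit REPOS TXN
    repos, txn = argv[1], argv[2]
    log_message = subprocess.check_output(
        ["svnlook", "log", "-t", txn, repos])
    if has_bad_bytes(log_message):
        sys.stderr.write("Log message contains control characters; "
                         "please remove them and commit again.\n")
        return 1    # non-zero exit status aborts the commit
    return 0


# Only act when invoked the way Subversion invokes a pre-commit hook.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

Saved as TOP/hooks/pre-commit and made executable, a script along
these lines would reject the commit before the log message is stored.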

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From dave at boost-consulting.com  Fri Nov  4 21:09:39 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 04 Nov 2005 15:09:39 -0500
Subject: [Python-Dev] Plea to distribute debugging lib
Message-ID: <uek5wdvjw.fsf@boost-consulting.com>


For years, Boost.Python has been doing some hacks to work around the
fact that a Windows Python distro doesn't include the debug build of
the library.  

  http://www.boost.org/libs/python/doc/building.html#variants

explains.  We wanted to make it reasonably convenient for Windows
developers (and our distributed testers) to work with a debug build of
the Boost.Python library and of their own code.  Having to download
the Python source and build the debug DLL was deemed unacceptable.

Well, those hacks have run out of road.  VC++8 now detects that some
of its headers have been #included with _DEBUG and some without, and
it will refuse to build anything when it does.  We have several new
hacks to work around that detection, and I think we _might_ be able to
get away with them for one more release.  But it's really time to do
it right.  MS is recommending that we (Boost) start distributing a
debug build of the Python DLL with Boost, but Boost really seems like
the wrong place to host such a thing.  Is there any way Python.org can
make a debug build more accessible?

Thanks,
Dave

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From python at discworld.dyndns.org  Fri Nov  4 21:28:25 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Fri, 4 Nov 2005 14:28:25 -0600
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <uek5wdvjw.fsf@boost-consulting.com>
References: <uek5wdvjw.fsf@boost-consulting.com>
Message-ID: <20051104202824.GA19678@discworld.dyndns.org>

David Abrahams <dave at boost-consulting.com> wrote:
> 
> For years, Boost.Python has been doing some hacks to work around the fact
> that a Windows Python distro doesn't include the debug build of the library.  
[...]
> Having to download the Python source and build the debug DLL was deemed
> unacceptable.

I'm curious: why was this "deemed unacceptable"?  Python's license is about as
liberal as it gets, and the code is almost startlingly easy to compile --
easier than any other similarly-sized codebase I've had to work with.

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python at discworld.dyndns.org>
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

From guido at python.org  Fri Nov  4 21:33:44 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 4 Nov 2005 12:33:44 -0800
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <20051104202824.GA19678@discworld.dyndns.org>
References: <uek5wdvjw.fsf@boost-consulting.com>
	<20051104202824.GA19678@discworld.dyndns.org>
Message-ID: <ca471dc20511041233i25b3c33wbfa3e3ad4a356702@mail.gmail.com>

I vaguely recall that there were problems with distributing the debug
version of the MS runtime.

Anyway, why can't you do this yourself for all Boost users? It's all
volunteer time, you know...

--Guido

On 11/4/05, Charles Cazabon <python at discworld.dyndns.org> wrote:
> David Abrahams <dave at boost-consulting.com> wrote:
> >
> > For years, Boost.Python has been doing some hacks to work around the fact
> > that a Windows Python distro doesn't include the debug build of the library.
> [...]
> > Having to download the Python source and build the debug DLL was deemed
> > unacceptable.
>
> I'm curious: why was this "deemed unacceptable"?  Python's license is about as
> liberal as it gets, and the code is almost startlingly easy to compile --
> easier than any other similarly-sized codebase I've had to work with.
>
> Charles
> --
> -----------------------------------------------------------------------
> Charles Cazabon                           <python at discworld.dyndns.org>
> GPL'ed software available at:               http://pyropus.ca/software/
> -----------------------------------------------------------------------


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Fri Nov  4 21:37:37 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 4 Nov 2005 15:37:37 -0500
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <uek5wdvjw.fsf@boost-consulting.com>
References: <uek5wdvjw.fsf@boost-consulting.com>
Message-ID: <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>

[David Abrahams]
> For years, Boost.Python has been doing some hacks to work around the
> fact that a Windows Python distro doesn't include the debug build of
> the library.
> ...
> MS is recommending that we (Boost) start distributing a debug build of the
> Python DLL with Boost, but Boost really seems like the wrong place to host
> such a thing.  Is there any way Python.org can make a debug build more
> accessible?

Possibly.  I don't do this anymore (this == build the Python Windows
installers), but I used to.  For some time I also made available a zip
file containing various debug-build bits, captured at the time the
official installer was built. We didn't (and I'm sure we still don't)
want to include them in the main installer, because they bloat its
size for something most users truly do not want.

I got sick of building the debug zip file, and stopped doing that too.
 No two users wanted the same set of stuff in it, so it grew to
contain the union of everything everyone wanted, and then people
complained that it was "too big".  This is one of the few times in
your Uncle Timmy's life that he said "so screw it -- do it yourself,
you whiny baby whiners with your incessant baby whining you " ;-)

Based on that sure-to-be universal reaction from anyone who signs up
for this, I'd say the best thing you could do to help it along is to
define precisely (a) what an acceptable distribution format is; and,
(b) what exactly it should contain.  That, and being nice to Martin,
would go a long way.

From theller at python.net  Fri Nov  4 21:47:40 2005
From: theller at python.net (Thomas Heller)
Date: Fri, 04 Nov 2005 21:47:40 +0100
Subject: [Python-Dev] Plea to distribute debugging lib
References: <uek5wdvjw.fsf@boost-consulting.com>
	<20051104202824.GA19678@discworld.dyndns.org>
	<ca471dc20511041233i25b3c33wbfa3e3ad4a356702@mail.gmail.com>
Message-ID: <irv86syb.fsf@python.net>

Guido van Rossum <guido at python.org> writes:

> I vaguely recall that there were problems with distributing the debug
> version of the MS runtime.

Right: the debug runtime dlls are not distributable.

> Anyway, why can't you do this yourself for all Boost users? It's all
> volunteer time, you know...

Doesn't any boost user need a C compiler anyway, so it should not really
be a problem to compile Python?

Anyway, AFAIK, the activestate distribution contains Python debug dlls.

Thomas


From dave at boost-consulting.com  Fri Nov  4 21:58:11 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 04 Nov 2005 15:58:11 -0500
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>
	(Tim Peters's message of "Fri, 4 Nov 2005 15:37:37 -0500")
References: <uek5wdvjw.fsf@boost-consulting.com>
	<1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>
Message-ID: <u8xw4dtb0.fsf@boost-consulting.com>

Tim Peters <tim.peters at gmail.com> writes:

> [David Abrahams]
>> For years, Boost.Python has been doing some hacks to work around the
>> fact that a Windows Python distro doesn't include the debug build of
>> the library.
>> ...
>> MS is recommending that we (Boost) start distributing a debug build of the
>> Python DLL with Boost, but Boost really seems like the wrong place to host
>> such a thing.  Is there any way Python.org can make a debug build more
>> accessible?
>
> Possibly.  I don't do this anymore (this == build the Python Windows
> installers), but I used to.  For some time I also made available a zip
> file containing various debug-build bits, captured at the time the
> official installer was built. We didn't (and I'm sure we still don't)
> want to include them in the main installer, because they bloat its
> size for something most users truly do not want.
>
> I got sick of building the debug zip file, and stopped doing that too.
>  No two users wanted the same set of stuff in it, so it grew to
> contain the union of everything everyone wanted, and then people
> complained that it was "too big".  This is one of the few times in
> your Uncle Timmy's life that he said "so screw it -- do it yourself,
> you whiny baby whiners with your incessant baby whining you " ;-)
>
> Based on that sure-to-be universal reaction from anyone who signs up
> for this, I'd say the best thing you could do to help it along is to
> define precisely (a) what an acceptable distribution format is; and,
> (b) what exactly it should contain.

Who knows what the whiny babies will accept?  That said, I think
people would be happy with a .zip file containing whatever is built by
selecting the debug build in the VS project and asking it to build
everything. (**)

> That, and being nice to Martin, 

I'm always as nice as Davidly possible to Martin!

> would go a long way.

My fingers and toes are crossed.

Thanks!

(**) If you could build the ability to download the debugging binaries
into the regular installer, that would be the shiznit, but I don't
dare ask for it. ;-)

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From dave at boost-consulting.com  Fri Nov  4 23:25:55 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 04 Nov 2005 17:25:55 -0500
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <436BD111.5080808@rubikon.pl> (Bronek Kozicki's message of "Fri, 
	04 Nov 2005 21:22:25 +0000")
References: <uek5wdvjw.fsf@boost-consulting.com>
	<1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>
	<u8xw4dtb0.fsf@boost-consulting.com> <436BD111.5080808@rubikon.pl>
Message-ID: <ufyqccaoc.fsf@boost-consulting.com>

Bronek Kozicki <brok at rubikon.pl> writes:

> David Abrahams wrote:
>> Who knows what the whiny babies will accept?  That said, I think
>> people would be happy with a .zip file containing whatever is built by
>> selecting the debug build in the VS project and asking it to build
>> everything. (**)
>
> Just to clarify - what we are asking for is library built with _DEBUG 
> and no BOOST_DEBUG_PYTHON, that is the one compatible with default 
> Python distribution. 

Bronek,

I know you're trying to help, but I'm sure that's not making anything
clearer for these people.  They don't know anything about
BOOST_DEBUG_PYTHON and would never have cause to define it.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From martin at v.loewis.de  Fri Nov  4 23:29:56 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 04 Nov 2005 23:29:56 +0100
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <u8xw4dtb0.fsf@boost-consulting.com>
References: <uek5wdvjw.fsf@boost-consulting.com>	<1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>
	<u8xw4dtb0.fsf@boost-consulting.com>
Message-ID: <436BE0E4.6@v.loewis.de>

David Abrahams wrote:
> Who knows what the whiny babies will accept?  That said, I think
> people would be happy with a .zip file containing whatever is built by
> selecting the debug build in the VS project and asking it to build
> everything. (**)

I would go a step further than Tim: Send me (*) a patch to msi.py (which
is used to build the distribution) that picks up the files and packages
them in the desired way, and I will include the files it outputs
in the official distribution. This is how the libpython24.a got in
(and this is also the way in which it will get out again).

In the patch, preferably state whom to contact for the specific feature,
as I won't be able to answer questions about it.

I don't have a personal need for the feature (I do have debug builds
myself, and it takes only 10 minutes or so to create them), so I won't
even have a way to test whether the feature works correctly.

Regards,
Martin

(*) that is, sf.net/projects/python


From eyal.lotem at gmail.com  Fri Nov  4 23:33:29 2005
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sat, 5 Nov 2005 00:33:29 +0200
Subject: [Python-Dev] Class decorators vs metaclasses
Message-ID: <b64f365b0511041433m773361d9x202d57ac83534aa8@mail.gmail.com>

I have a few claims, some unrelated, and some built on top of each
other.  I would like to hear your responses as to which are
convincing, which aren't, and why. I think that if these claims are
true, Python 3000 should change quite a bit.

A. Metaclass code is black magic and few understand how it works,
while decorator code is mostly understandable, even by non-gurus.

B. One of Decorators' most powerful features is that they can be
mixed and matched, which makes them very powerful for many purposes,
while metaclasses are exclusive, and only one can be used.  This is
especially problematic as some classes may assume their subclasses
must use their respective metaclasses.  This means classdecorators are
strictly more powerful than metaclasses, without cumbersome
conversions between metaclass mechanisms and decorator mechanisms.

C. Interesting uses of classdecorators are allowing super-calling
without redundantly specifying the name of your class, or your
superclass.

D. Python seems to be incrementally adding power to the core language
with these features, which is great, but it also causes significant
overlapping of language features, which I believe is something to
avoid when possible.  If metaclasses are replaced with class
decorators, then suddenly inheritance becomes a redundant feature.

E. If inheritance is a redundant feature, it can be removed and an
"inherit" class decorator can be used.  This could also reduce all the
__mro__ clutter from the language along with other complexities, into
alternate implementations of the inherit classdecorator.

From martin at v.loewis.de  Fri Nov  4 23:44:53 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 04 Nov 2005 23:44:53 +0100
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <ufyqccaoc.fsf@boost-consulting.com>
References: <uek5wdvjw.fsf@boost-consulting.com>	<1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>	<u8xw4dtb0.fsf@boost-consulting.com>
	<436BD111.5080808@rubikon.pl> <ufyqccaoc.fsf@boost-consulting.com>
Message-ID: <436BE465.4000100@v.loewis.de>

David Abrahams wrote:
>>Just to clarify - what we are asking for is library built with _DEBUG 
>>and no BOOST_DEBUG_PYTHON, that is the one compatible with default 
>>Python distribution. 
> 
> 
> I know you're trying to help, but I'm sure that's not making anything
> clearer for these people.  They don't know anything about
> BOOST_DEBUG_PYTHON and would never have cause to define it.
> 

Actually, I'm truly puzzled. Why would a library that has _DEBUG defined
be compatible with the standard distribution? Doesn't _DEBUG cause
linkage with msvcr71d.dll?

In addition (more correctly: for that reason), the debug build causes
python2x_d.dll to be built, instead of python2x.dll, which definitely
is incompatible with the standard DLL. It not only uses a different
C library; it also causes Py_DEBUG to be defined, which in turn creates
a different memory layout for PyObject.

So in the end, I would assume you are requesting what you call a
debug-python, i.e. one that (in your system) *has* BOOST_DEBUG_PYTHON
defined.

Regards,
Martin

From aleaxit at gmail.com  Sat Nov  5 00:02:42 2005
From: aleaxit at gmail.com (Alex Martelli)
Date: Fri, 4 Nov 2005 15:02:42 -0800
Subject: [Python-Dev] Class decorators vs metaclasses
In-Reply-To: <b64f365b0511041433m773361d9x202d57ac83534aa8@mail.gmail.com>
References: <b64f365b0511041433m773361d9x202d57ac83534aa8@mail.gmail.com>
Message-ID: <e8a0972d0511041502i5de78d12ua23b1eb79970e120@mail.gmail.com>

On 11/4/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> I have a few claims, some unrelated, and some built on top of each
> other.  I would like to hear your responses as to which are
> convincing, which aren't, and why. I think that if these claims are
> true, Python 3000 should change quite a bit.
>
> A. Metaclass code is black magic and few understand how it works,
> while decorator code is mostly understandable, even by non-gurus.

I disagree.  I've held many presentations and classes on both
subjects, and while people may INITIALLY feel like metaclasses are
black magic, as soon as I've explained it the fear dissipates.  It all
boils down to understanding that:

class Name(Ba,Ses): <<body>>

means

Name = suitable_metaclass('Name', (Ba,Ses), <<dict-built-by-body>>)

which isn't any harder than understanding that

@foo(bar)
def baz(args): ...

means

def baz(args): ...
baz = foo(bar)(baz)
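
Both equivalences can be checked directly. Here is a minimal sketch in
present-day Python 3 syntax; the names Base, ViaStatement, ViaCall and qux
are made up for illustration, and plain `type` stands in for
suitable_metaclass since no custom metaclass is involved:

```python
class Base:
    greeting = "hello"


# The class statement...
class ViaStatement(Base):
    x = 1


# ...is roughly this explicit call to the metaclass (here the default, type).
ViaCall = type("ViaCall", (Base,), {"x": 1})


def foo(bar):
    """Decorator factory: returns a decorator that tags a function with bar."""
    def decorate(func):
        func.tag = bar
        return func
    return decorate


# The decorator syntax...
@foo("with-syntax")
def baz(args):
    return args


# ...is the same as defining the function and rebinding the name by hand.
def qux(args):
    return args
qux = foo("by-hand")(qux)
```

Both classes end up with the same type, bases and attributes, and both
functions carry the tag set by foo.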


> B. One of Decorators' most powerful features is that they can be
> mixed and matched, which makes them very powerful for many purposes,
> while metaclasses are exclusive, and only one can be used.  This is

Wrong.  You can mix as many metaclasses as you wish, as long as
they're properly coded for multiple inheritance (using super, etc) --
just inherit from them all.  This is reasonably easy to automate (see
the last recipe in the 2nd ed of the Python Cookbook), too.
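
As a rough illustration (this is not the Cookbook recipe, and the names
MetaA, MetaB and Combined are invented for the example), two cooperative
metaclasses can be merged simply by inheriting from both. Python 3 syntax:

```python
class MetaA(type):
    def __new__(mcls, name, bases, namespace):
        # Cooperative: always delegate via super() so other metaclasses run too.
        cls = super().__new__(mcls, name, bases, namespace)
        cls.touched_by_a = True
        return cls


class MetaB(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.touched_by_b = True
        return cls


class Combined(MetaA, MetaB):
    """Mixes both metaclasses; the MRO is Combined, MetaA, MetaB, type."""


class Foo(metaclass=Combined):
    pass
```

Foo gets the effects of both metaclasses, provided each one was written
with super() in mind.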

> especially problematic as some classes may assume their subclasses
> must use their respective metaclasses.  This means classdecorators are
> strictly more powerful than metaclasses, without cumbersome
> conversions between metaclass mechanisms and decorator mechanisms.

The assertion that classdecorators are strictly more powerful than
custom metaclasses is simply false.  How would you design
classdecorator XXX so that

@XXX
class Foo: ...

allows 'print Foo' to emit 'this is beautiful class Foo', for example?
The str(Foo) implicit in print calls type(Foo).__str__(Foo), so you
do need a custom type(Foo) -- which is all that is meant by "a custom
metaclass"... a custom type whose instances are classes, that's all.
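
A small sketch of the point, in Python 3 syntax (where `print Foo` becomes
`print(Foo)`). Beautiful, Bar and the body of XXX are illustrative
assumptions; the decorator shown is just one naive attempt, not the only
thing a class decorator could do:

```python
class Beautiful(type):
    """Custom metaclass: its instances are classes, so it controls str(cls)."""
    def __str__(cls):
        return "this is beautiful class " + cls.__name__


class Foo(metaclass=Beautiful):
    pass


def XXX(cls):
    # Setting __str__ on the class changes str() of its *instances* only;
    # str(cls) is still looked up on type(cls).
    cls.__str__ = lambda self: "instance text"
    return cls


@XXX
class Bar:
    pass
```

str(Foo) uses Beautiful.__str__, while str(Bar) still falls back to the
default representation from type. A decorator could rebuild the class with
a new type(Bar), but then it is using a custom metaclass after all.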


> C. Interesting uses of classdecorators are allowing super-calling
> without redundantly specifying the name of your class, or your
> superclass.

Can you give an example?

>
> D. Python seems to be incrementally adding power to the core language
> with these features, which is great, but it also causes significant
> overlapping of language features, which I believe is something to
> avoid when possible.  If metaclasses are replaced with class
> decorators, then suddenly inheritance becomes a redundant feature.

And how do you customize what "print Foo" emits, as above?

> E. If inheritance is a redundant feature, it can be removed and an
> "inherit" class decorator can be used.  This could also reduce all the
> __mro__ clutter from the language along with other complexities, into
> alternate implementations of the inherit classdecorator.

How do you propose to get exactly the same effects as inheritance
(affect every attribute lookup on a class and its instances) without
inheritance?  Essentially, inheritance is automated delegation
obtained by having getattr(foo, 'bar') look through a chain of objects
(essentially the __mro__) until an attribute named 'bar' is found in
one of those objects, plus a few minor but useful side effects, e.g.
on isinstance and issubclass, and the catching of exceptions in
try/except statements.  How would any mechanism allowing all of these
uses NOT be inheritance?
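
To make the "automated delegation" concrete, here is a simplified sketch
in Python 3 syntax. lookup_class_attr is a hypothetical helper that
ignores descriptors, instance dictionaries and metaclass attributes; it
only shows the walk along __mro__:

```python
class A:
    bar = "from A"


class B(A):
    pass


class C(B):
    pass


def lookup_class_attr(cls, name):
    """Walk cls.__mro__ and return the first attribute called name."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    raise AttributeError(name)


class MyError(ValueError):
    pass


def catches_base():
    """The try/except side effect: a subclass is caught by its base class."""
    try:
        raise MyError("boom")
    except ValueError:
        return True
```

The helper agrees with ordinary attribute access on C, and the same chain
is what isinstance, issubclass and except clauses consult.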


Alex

From dave at boost-consulting.com  Sat Nov  5 00:04:29 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 04 Nov 2005 18:04:29 -0500
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <436BE0E4.6@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6wis's?=
	message of "Fri, 04 Nov 2005 23:29:56 +0100")
References: <uek5wdvjw.fsf@boost-consulting.com>
	<1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>
	<u8xw4dtb0.fsf@boost-consulting.com> <436BE0E4.6@v.loewis.de>
Message-ID: <uy844aubm.fsf@boost-consulting.com>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> David Abrahams wrote:
>> Who knows what the whiny babies will accept?  That said, I think
>> people would be happy with a .zip file containing whatever is built by
>> selecting the debug build in the VS project and asking it to build
>> everything. (**)
>
> I would go a step further than Tim: Send me (*) a patch to msi.py (which
> is used to build the distribution) that picks up the files and packages
> them in the desired way, and I will include the files it outputs
> in the official distribution. This is how the libpython24.a got in
> (and this is also the way in which it will get out again).

Not to look a gift horse in the mouth, but won't that cause the
problem that Tim was worried about, i.e. a bloated Python installer?

> In the patch, preferably state whom to contact for the specific feature,
> as I won't be able to answer questions about it.
>
> I don't have a personal need for the feature (I do have debug builds
> myself, and it takes only 10 minutes or so to create them), 

I know, me too.  It's easy enough once you get started building
Python.  I just think it's too big a hump for many people.

> so I won't even have a way to test whether the feature works
> correctly.
>
> Regards,
> Martin
>
> (*) that is, sf.net/projects/python

I s'pose that means, "put it in the patches tracker."

grateful-ly y'rs,

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From dave at boost-consulting.com  Sat Nov  5 00:13:39 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 04 Nov 2005 18:13:39 -0500
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <436BE465.4000100@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?=
	=?iso-8859-1?Q?wis's?= message of
	"Fri, 04 Nov 2005 23:44:53 +0100")
References: <uek5wdvjw.fsf@boost-consulting.com>
	<1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>
	<u8xw4dtb0.fsf@boost-consulting.com> <436BD111.5080808@rubikon.pl>
	<ufyqccaoc.fsf@boost-consulting.com> <436BE465.4000100@v.loewis.de>
Message-ID: <ur79watwc.fsf@boost-consulting.com>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> David Abrahams wrote:
>>> Just to clarify - what we are asking for is library built with
>>> _DEBUG and no BOOST_DEBUG_PYTHON, that is the one compatible with
>>> default Python distribution. 
>> I know you're trying to help, but I'm sure that's not making
>> anything
>> clearer for these people.  They don't know anything about
>> BOOST_DEBUG_PYTHON and would never have cause to define it.
>> 
>
> Actually, I'm truly puzzled. 

I was afraid this would happen.  Really, you're better off ignoring
Bronek's message.

> Why would a library that has _DEBUG defined
> be compatible with the standard distribution? Doesn't _DEBUG cause
> linkage with msvcr71d.dll?

Unless you do the hacks that I mentioned in my opening message.
Read http://www.boost.org/libs/python/doc/building.html#variants for
details.

> In addition (more correctly: for that reason), the debug build causes
> python2x_d.dll to be built, instead of python2x.dll, which definitely
> is incompatible with the standard DLL. It not only uses a different
> C library; it also causes Py_DEBUG to be defined, which in turn creates
> a different memory layout for PyObject.

Exactly.

> So in the end, I would assume you are requesting what you call a
> debug-python, i.e. one that (in your system) *has*
> BOOST_DEBUG_PYTHON defined.

What I am requesting is the good old python2x_d.dll and any associated
extension modules that get built as part of the Python distro, so I
can stop doing the hack, drop BOOST_DEBUG_PYTHON, and tell people to use
_DEBUG in the usual way.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From martin at v.loewis.de  Sat Nov  5 00:21:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 05 Nov 2005 00:21:05 +0100
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <uy844aubm.fsf@boost-consulting.com>
References: <uek5wdvjw.fsf@boost-consulting.com>	<1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com>	<u8xw4dtb0.fsf@boost-consulting.com>
	<436BE0E4.6@v.loewis.de> <uy844aubm.fsf@boost-consulting.com>
Message-ID: <436BECE1.3090202@v.loewis.de>

David Abrahams wrote:
>>I would go a step further than Tim: Send me (*) a patch to msi.py (which
>>is used to build the distribution) that picks up the files and packages
>>them in the desired way, and I will include the files it outputs
>>in the official distribution. This is how the libpython24.a got in
>>(and this is also the way in which it will get out again).
> 
> 
> Not to look a gift horse in the mouth, but won't that cause the
> problem that Tim was worried about, i.e. a bloated Python installer?

Not if done properly: it would, of course, *not* add the desired
files to the msi file, but create a separate file. It is pure
Python code, and called msi.py because that's its main function.
It does several other things, though (such as creating a .cab file
and a .a file); it could well create another zip file.

As to how it would work: preferably by invoking the Python zip
library, but invoking external programs to package up everything
might be acceptable as well (assuming I'm told what these tools
are, and assuming it falls back to doing nothing if the tools
are not available).
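
For whoever writes the patch, the zip-library route might look roughly
like the sketch below. Everything in it is an assumption made for
illustration: make_debug_zip is a hypothetical helper, not an existing
msi.py function, and both the "_d" suffix rule for picking files and the
"-debug.zip" naming are guesses at what would be wanted:

```python
import os
import zipfile


def make_debug_zip(build_dir, msi_name, suffix="_d"):
    """Zip the debug-build files in build_dir, named after the MSI file.

    Hypothetical helper: a file counts as a debug file if its stem ends in
    suffix (python24_d.dll, _socket_d.pyd, ...).
    """
    # Derive the archive name from the MSI name so the versions match.
    zip_name = os.path.splitext(msi_name)[0] + "-debug.zip"
    zip_path = os.path.join(build_dir, zip_name)

    # Pick the files before creating the archive in the same directory.
    picked = []
    for entry in sorted(os.listdir(build_dir)):
        stem, _ext = os.path.splitext(entry)
        if stem.endswith(suffix) and os.path.isfile(os.path.join(build_dir, entry)):
            picked.append(entry)

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for entry in picked:
            archive.write(os.path.join(build_dir, entry), arcname=entry)
    return zip_path
```

Since zipfile is in the standard library, no fallback for missing external
tools is needed in this variant.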

The separate file would have a name similar to the MSI file,
so that the debug file has the same version number as the MSI
file.

> I s'pose that means, "put it in the patches tracker."
> 

Exactly.

Regards,
Martin

From eyal.lotem at gmail.com  Sat Nov  5 12:27:55 2005
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sat, 5 Nov 2005 13:27:55 +0200
Subject: [Python-Dev] Class decorators vs metaclasses
In-Reply-To: <e8a0972d0511041502i5de78d12ua23b1eb79970e120@mail.gmail.com>
References: <b64f365b0511041433m773361d9x202d57ac83534aa8@mail.gmail.com>
	<e8a0972d0511041502i5de78d12ua23b1eb79970e120@mail.gmail.com>
Message-ID: <b64f365b0511050327g40eaebecp90f0a60c6440b660@mail.gmail.com>

On 11/5/05, Alex Martelli <aleaxit at gmail.com> wrote:
> On 11/4/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> > I have a few claims, some unrelated, and some built on top of each
> > other.  I would like to hear your responses as to which are
> > convincing, which aren't, and why. I think that if these claims are
> > true, Python 3000 should change quite a bit.
> >
> > A. Metaclass code is black magic and few understand how it works,
> > while decorator code is mostly understandable, even by non-gurus.
>
> I disagree.  I've held many presentations and classes on both
> subjects, and while people may INITIALLY feel like metaclasses are
> black magic, as soon as I've explained it the fear dissipates.  It all
> boils down to understanding that:
>
> class Name(Ba,Ses): <<body>>
>
> means
>
> Name = suitable_metaclass('Name', (Ba,Ses), <<dict-built-by-body>>)
>
> which isn't any harder than understanding that
>
> @foo(bar)
> def baz(args): ...
>
> means
>
> def baz(args): ...
> baz = foo(bar)(baz)

I disagree again. My experience is that metaclass code is very hard to
understand. Especially when it starts doing non-trivial things, such
as using a base metaclass class that is parametrized by metaclass
attributes in its subclasses.  Lookups of attributes in the base
metaclass methods are mind-boggling (is it searching them in the base
metaclass, the subclass, the instance [which is the class]?).  The
same code would be much easier to understand with class decorators.

> > B. One of Decorators' most powerful features is that they can be
> > mixed and matched, which makes them very powerful for many purposes,
> > while metaclasses are exclusive, and only one can be used.  This is
>
> Wrong.  You can mix as many metaclasses as you wish, as long as
> they're properly coded for multiple inheritance (using super, etc) --
> just inherit from them all.  This is reasonably easy to automate (see
> the last recipe in the 2nd ed of the Python Cookbook), too.

Multiple inheritance is an awful way to mix class functionalities
though. Let's take a simpler example.  Most UT frameworks use a
TestCase base class they inherit from to implement setup, tearDown,
and then inherit from it again to implement the test itself.  I argue
this is a weak approach, because then mixing/matching setups is
difficult.  You would argue this is not the case, because of the
ability to multiply-inherit from test cases, but how easy is the
equivalent of:

@with_setup('blah')
@with_other_setup('bleh')
def my_test():
  # the blah setup and bleh other setup are up and usable here,
  # and will be "torn down" at the end of this test

The equivalent of this requires a lot more work and violating DRY. 
Creating a specific function to multiply inherit from TestCases is a
possible solution, but it is much more conceptually complex, and needs
to be reimplemented in the next scenario (Metaclasses for example).
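
To make the comparison concrete, here is a minimal sketch of what such
setup decorators could look like. The implementation is only an
assumption for illustration: with_setup is hypothetical, it is applied
twice instead of defining a separate with_other_setup, and a shared log
list stands in for real setup and teardown work:

```python
import functools

log = []  # records what happened, in order


def with_setup(name):
    """Decorator factory: run a named setup before the test, teardown after."""
    def decorate(test):
        @functools.wraps(test)
        def wrapper(*args, **kwargs):
            log.append("setup " + name)
            try:
                return test(*args, **kwargs)
            finally:
                # Teardown runs even if the test raises.
                log.append("teardown " + name)
        return wrapper
    return decorate


@with_setup("blah")
@with_setup("bleh")
def my_test():
    log.append("test body")
```

Calling my_test() sets up blah, then bleh, runs the body, and tears down
in the reverse order, without either setup knowing about the other.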

> > especially problematic as some classes may assume their subclasses
> > must use their respective metaclasses.  This means classdecorators are
> > strictly more powerful than metaclasses, without cumbersome
> > conversions between metaclass mechanisms and decorator mechanisms.
>
> The assertion that classdecorators are strictly more powerful than
> custom metaclasses is simply false.  How would you design
> classdecorator XXX so that
>
> @XXX
> class Foo: ...
>
> allows 'print Foo' to emit 'this is beautiful class Foo', for example?
> The str(Foo) implicit in print calls type(Foo).__str__(Foo), so you
> do need a custom type(Foo) -- which is all that is meant by "a custom
> metaclass"... a custom type whose instances are classes, that's all.

I would argue that this is not such a useful feature, as in that case
you can simply use a factory object instead of a class.  If this
feature remains, that's fine, but the fact it allows for a weak form
of "decoration" of classes should not kill the concept of class
decorators.
The only reason for using metaclasses rather than factory objects, in
my experience, was that references to class objects are considered
different than references to factories (by pickle and deepcopy, and
maybe others) and that can be a useful feature. This feature can be
implemented in more readable means though.

> > C. Interesting uses of classdecorators are allowing super-calling
> > without redundantly specifying the name of your class, or your
> > superclass.
>
> Can you give an example?

@anotherclassdecorator
@supercallerclass
class MyClass(object):
     @supercaller
     def my_method(self, supcaller, x, y, z):
         ...
         result = supcaller.my_method(x, y, z)
         ...

Could be nice to remove the need for decorating the class, and only
decorating the methods, but the method decorators get a function
object, not a method object, so it's more difficult (perhaps portably
impossible?) to do this.

Note that "__metaclass__ = superclasscaller" could also work, but then
combining "anotherclassdecorator" would require a lot more code at
worst, or a complex mechanism to combine metaclasses via multiple
inheritance at best.

> > D. Python seems to be incrementally adding power to the core language
> > with these features, which is great, but it also causes significant
> > overlapping of language features, which I believe is something to
> > avoid when possible.  If metaclasses are replaced with class
> > decorators, then suddenly inheritance becomes a redundant feature.
>
> And how do you customize what "print Foo" emits, as above?

As I said, "Foo" can be a factory object rather than a class object.

> > E. If inheritance is a redundant feature, it can be removed and an
> > "inherit" class decorator can be used.  This could also reduce all the
> > __mro__ clutter from the language along with other complexities, into
> > alternate implementations of the inherit classdecorator.
>
> How do you propose to get exactly the same effects as inheritance
> (affect every attribute lookup on a class and its instances) without
> inheritance?  Essentially, inheritance is automated delegation
> obtained by having getattr(foo, 'bar') look through a chain of objects
> (essentially the __mro__) until an attribute named 'bar' is found in
> one of those objects, plus a few minor but useful side effects, e.g.
> on isinstance and issubclass, and the catching of exceptions in
> try/except statements.  How would any mechanism allowing all of these
> uses NOT be inheritance?

One possibility is to copy the superclass attributes into subclasses.
Another is to allow the class decorator to specify
getattr/setattr/delattr's implementation without modifying the
metaclass [admittedly this is a difficult/problematic solution].
In any case, the inheritance class decorator could specify special
attributes in the class (it can remain compatible with __bases__) for
isinstance/try to work.

I agree that implementing inheritance this way is problematic [I'm
convinced], but don't let that determine the fate of class decorators
in general, which are more useful than metaclasses in many (most?)
scenarios.

From pinard at iro.umontreal.ca  Sat Nov  5 17:29:49 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Sat, 5 Nov 2005 11:29:49 -0500
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <ca471dc20510311124s1c3aeffeya879056477ea515d@mail.gmail.com>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>
	<007d01c5dc00$738da2e0$b62dc797@oemcomputer>
	<bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com>
	<4362DD15.4080606@gmail.com>
	<bbaeab100510282037m5bad1f67kb4d5cb7171ac163b@mail.gmail.com>
	<ca471dc20510311124s1c3aeffeya879056477ea515d@mail.gmail.com>
Message-ID: <20051105162949.GA8992@phenix.sram.qc.ca>

[Guido van Rossum]

> I've made a final pass over PEP 352, mostly fixing the __str__, 
> __unicode__ and __repr__ methods to behave more reasonably. I'm all 
> for accepting it now. Does anybody see any last-minute show-stopping 
> problems with it?

I did not follow the thread, so maybe I'm out of order; be kind to me.

After having read PEP 352, it is not crystal clear whether in:

    try:
        ...
    except:
        ...

the "except:" will mean "except BaseException:" or "except Exception:".
I would expect the first, but the text beginning the section titled 
"Exception Hierarchy Changes" suggests it could mean the second, without
really stating it.

Let me argue that "except BaseException:" is preferable.  First, because 
there is no reason to load a bare "except:" by anything but a very 
simple and clean meaning, like the real base of the exception hierarchy.  
Second, as a bare "except:" is not considered good practice on average, 
it would be counter-productive trying to figure out ways to make it more 
frequently _usable_.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From guido at python.org  Sat Nov  5 18:46:54 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 5 Nov 2005 09:46:54 -0800
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <20051105162949.GA8992@phenix.sram.qc.ca>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>
	<007d01c5dc00$738da2e0$b62dc797@oemcomputer>
	<bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com>
	<4362DD15.4080606@gmail.com>
	<bbaeab100510282037m5bad1f67kb4d5cb7171ac163b@mail.gmail.com>
	<ca471dc20510311124s1c3aeffeya879056477ea515d@mail.gmail.com>
	<20051105162949.GA8992@phenix.sram.qc.ca>
Message-ID: <ca471dc20511050946k38da87d7o1e676df61a9b9a78@mail.gmail.com>

> [Guido van Rossum]
>
> > I've made a final pass over PEP 352, mostly fixing the __str__,
> > __unicode__ and __repr__ methods to behave more reasonably. I'm all
> > for accepting it now. Does anybody see any last-minute show-stopping
> > problems with it?

[François]
> I did not follow the thread, so maybe I'm out of order; be kind to me.
>
> After having read PEP 352, it is not crystal clear whether in:
>
>     try:
>         ...
>     except:
>         ...
>
> the "except:" will mean "except BaseException:" or "except Exception:".
> I would expect the first, but the text beginning the section titled
> "Exception Hierarchy Changes" suggests it could mean the second, without
> really stating it.

This is probably a leftover from PEP 348, which did have a change for
bare 'except:' in mind. PEP 352 doesn't propose to change its meaning,
and if there are words that suggest this, they should be removed.

Until Python 3.0, it will not change its meaning from what it is now;
this is because until then, it is still *possible* (though it will
become deprecated behavior) to raise string exceptions or classes that
don't inherit from BaseException.

> Let me argue that "except BaseException:" is preferable.  First, because
> there is no reason to load a bare "except:" by anything but a very
> simple and clean meaning, like the real base of the exception hierarchy.
> Second, as a bare "except:" is not considered good practice on average,
> it would be counter-productive trying to figure out ways to make it more
> frequently _usable_.

What bare 'except:' will mean in Python 3.0, and whether it is even
allowed at all, is up for discussion -- it will have to be a new PEP.

Personally, I think bare 'except:' should be removed from the language
in Python 3.0, so that all except clauses are explicit in what they
catch and there isn't any confusion over whether KeyboardInterrupt,
SystemExit etc. are included or not.
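
To illustrate the confusion an explicit clause avoids, here is a small
sketch written against the hierarchy PEP 352 describes, in which
KeyboardInterrupt and SystemExit derive from BaseException but not from
Exception. It uses Python 3 syntax, and classify is a made-up helper:

```python
def classify(exc):
    """Raise exc and report which explicit except clause catches it."""
    try:
        raise exc
    except Exception:
        return "caught by 'except Exception'"
    except BaseException:
        return "only caught by 'except BaseException'"
```

Ordinary errors such as ValueError land in the first clause, while
KeyboardInterrupt and SystemExit fall through to the second, so the
spelling of the clause says exactly what is included.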

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From noamraph at gmail.com  Sat Nov  5 20:05:28 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sat, 5 Nov 2005 21:05:28 +0200
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <20051102125437.F290.JCARLSON@uci.edu>
References: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com>
	<20051102125437.F290.JCARLSON@uci.edu>
Message-ID: <b348a0850511051105k44906fbepdd0258ba2435bd2e@mail.gmail.com>

On 11/3/05, Josiah Carlson <jcarlson at uci.edu> wrote:
...
>
> Right, but lists (dicts, tuples, etc.) are defined as containers, and
> their comparison operation is defined on their contents.  Objects are
> not defined as containers in the general case, so defining comparisons
> based on their contents (as opposed to identity) is just one of the two
> assumptions to be made.
>
> I personally like the current behavior, and I see no /compelling/ reason
> to change it.  You obviously feel so compelled for the behavior to
> change that you are willing to express your desires.  How about you do
> something more productive and produce a patch which implements the
> changes you want, verify that it passes tests in the standard library,
> then post it on sourceforge.  If someone is similarly compelled and
> agrees with you (so far I've not seen any public support for your
> proposal by any of the core developers), the discussion will restart,
> and it will be decided (not by you or I).

Thanks for the advice - I will try to do as you suggest.
>
>
> > To summarize, I think that value-based equality testing would usually
> > be what you want, and currently implementing it is a bit of a pain.
>
> Actually, implementing value-based equality testing, when you have a
> finite set of values you want to test, is quite easy.
>
> def __eq__(self, other):
>     for i in self.__cmp_eq__:
>         if getattr(self, i) != getattr(other, i):
>             return False
>     return True
>
> With a simple metaclass that discovers all of those values automatically,
> and/or your own protocol for exclusion, and you are done.  Remember, not
> all 5-line functions should become builtin/default behavior, and this
> implementation shows that it is not a significant burden for you (or
> anyone else) to implement this in your own custom library.
>
You are right that not all 5-line functions should become
builtin/default behaviour. However, I personally think that this one
should, since:
1. It doesn't add complexity, or a new builtin.
2. Those five lines don't include the metaclass code, which will
probably take more than five lines and won't be trivial.
3. It will make other objects behave better, not only mine - other
classes will get a meaningful comparison operator, for free.
>
> P.S. One thing that you should remember is that even if your patch is
> accepted, and even if this is desirable, Python 2.5 is supposed to be
> released sometime next year (spring/summer?), and because it is a
> backwards incompatible change, would need at least 2.6-2.7 before it
> becomes the default behavior without a __future__ import, which is
> another 3-4 years down the line.

I hope that the warning can go in by Python 2.5, so the change (which
I think will cause relatively few backwards incompatibility problems)
can go in by Python 2.6, which I think is less than 2 years down the
line.
>
> I understand you are passionate, really I do (you should see some of my
> proposals), but by the time these things get around to getting into
> mainline Python, there are high odds that you probably won't care about
> them much anymore (I've come to feel that way myself about many of my
> proposals), and I think it is a good idea to attempt to balance - when
> it comes to Python - "Now is better than never." and "Although never is
> often better than *right* now."
>
> Removing __hash__, changing __eq__, and trying to get in copy-on-write
> freezing (which is really copy-and-cache freezing), all read to me like
> "We gotta do this now!", which certainly isn't helping the proposal.
>
Thanks - I should really calm down a bit. I will try to go "safe and
slowly", and I hope that at the end I will succeed in making my own
small contribution to Python.

Noam

From jcarlson at uci.edu  Sat Nov  5 21:30:17 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 05 Nov 2005 12:30:17 -0800
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <b348a0850511051105k44906fbepdd0258ba2435bd2e@mail.gmail.com>
References: <20051102125437.F290.JCARLSON@uci.edu>
	<b348a0850511051105k44906fbepdd0258ba2435bd2e@mail.gmail.com>
Message-ID: <20051105115816.BFE3.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> On 11/3/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > > To summarize, I think that value-based equality testing would usually
> > > be what you want, and currently implementing it is a bit of a pain.
> >
> > Actually, implementing value-based equality testing, when you have a
> > finite set of values you want to test, is quite easy.
> >
> > def __eq__(self, other):
> >     for i in self.__cmp_eq__:
> >         if getattr(self, i) != getattr(other, i):
> >             return False
> >     return True
> >
> > With a simple metaclass that discovers all of those values automatically,
> > and/or your own protocol for exclusion, and you are done.  Remember, not
> > all 5-line functions should become builtin/default behavior, and this
> > implementation shows that it is not a significant burden for you (or
> > anyone else) to implement this in your own custom library.
> >
> You are right that not all 5-line functions should become
> builtin/default behaviour. However, I personally think that this one
> should, since:
> 1. It doesn't add complexity, or a new builtin.

It changes default behavior (which I specified as a portion of my
statement, which you quote).

And you are wrong: it adds complexity to the implementation of both
class instantiation and the default comparison mechanism.  The former, I
believe, you will find more difficult to patch than the comparison,
though if you have not yet had the adventure of writing C extension
modules, modifying the default class instantiation may be the deal
breaker for you (I personally would have no idea where to start).


> 2. Those five lines don't include the metaclass code, which will
> probably take more than five lines and won't be trivial.

class eqMetaclass(type):
    def __new__(cls, name, bases, dct):
        if '__cmp_include__' in dct:
            include = dict.fromkeys(dct['__cmp_include__'])
        else:
            # dct.keys has to be called; the bare method is not iterable
            include = dict.fromkeys(dct.keys())

        # default to an empty tuple so a missing __cmp_exclude__ is fine
        for i in dct.get('__cmp_exclude__', ()):
            _ = include.pop(i, None)

        dct['__cmp_eq__'] = include.keys()
        return type.__new__(cls, name, bases, dct)

It took 10 lines of code, and was trivial (except for not-included
multi-metaclass support code, which is being discussed in another thread).

Oh, I suppose I should modify that __eq__ definition to be smarter about
comparison...

def __eq__(self, other):
    if not hasattr(other, '__cmp_eq__'):
        return False
    if dict.fromkeys(self.__cmp_eq__) != \
       dict.fromkeys(other.__cmp_eq__):
        return False
    for i in self.__cmp_eq__:
        if getattr(self, i) != getattr(other, i):
            return False
    return True

Wow, 20 lines of support code, how could one ever expect users to write
that? ;)
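
For anyone who wants to try the idea, here is a minimal sketch that puts
the two pieces together.  It is only an illustration under a few
assumptions: it uses the Python 3 "metaclass=" spelling rather than
__metaclass__, the names EqMeta, ValueEq and Point are made up for the
example, and the compared names are listed explicitly in
__cmp_include__ because the class dict passed to the metaclass holds
only class-level names, not attributes assigned in __init__.

```python
class EqMeta(type):
    """Collect the attribute names that take part in equality."""

    def __new__(mcls, name, bases, dct):
        # dict.fromkeys keeps the declared order and drops duplicates
        include = dict.fromkeys(dct.get('__cmp_include__', ()))
        # exclude entries override include entries
        for attr in dct.get('__cmp_exclude__', ()):
            include.pop(attr, None)
        dct['__cmp_eq__'] = tuple(include)
        return super().__new__(mcls, name, bases, dct)


class ValueEq(metaclass=EqMeta):
    """Hypothetical base class supplying a value-based __eq__."""

    def __eq__(self, other):
        names = getattr(other, '__cmp_eq__', None)
        # objects that do not follow the protocol never compare equal
        if names is None or set(names) != set(self.__cmp_eq__):
            return False
        return all(getattr(self, n) == getattr(other, n)
                   for n in self.__cmp_eq__)


class Point(ValueEq):
    __cmp_include__ = ('x', 'y', 'label')
    __cmp_exclude__ = ('label',)

    def __init__(self, x, y, label=''):
        self.x, self.y, self.label = x, y, label
```

With these definitions Point(1, 2, 'a') and Point(1, 2, 'b') compare
equal because label is excluded, while Point(1, 2) and Point(1, 3) do
not.  Note that a subclass which omits __cmp_include__ gets an empty
__cmp_eq__ in this sketch; a fuller version would merge the names from
its bases.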


> 3. It will make other objects behave better, not only mine - other
> classes will get a meaningful comparison operator, for free.

You are assuming that the comparison previously wasn't "meaningful".  It has a
meaning, though it may not be exactly what you wanted it to be, which is
why Python allows users to define __eq__ operators to be exactly what
they want, and which is why I don't find your uses compelling.


> > P.S. One thing that you should remember is that even if your patch is
> > accepted, and even if this is desirable, Python 2.5 is supposed to be
> > released sometime next year (spring/summer?), and because it is a
> > backwards incompatible change, would need at least 2.6-2.7 before it
> > becomes the default behavior without a __future__ import, which is
> > another 3-4 years down the line.
> 
> I hope that the warning can go in by Python 2.5, so the change (which
> I think will cause relatively few backwards incompatibility problems)
> can go in by Python 2.6, which I think is less than 2 years down the
> line.

As per historical release schedules (available in PEP form at
www.python.org/peps), alpha 1 to final generally takes 6 months.  It
then takes at least a year before the alpha 1 of the following version
is to be released.

Being that 2.4 final was released November 2004, and we've not seen an
alpha for 2.5 yet, we are at least 6 months (according to history) from
2.5 final, and at least 2 years from 2.6 final.

From what I have previously learned from others in python-dev, the
warnings machinery is slow, so one is to be wary of using warnings
unless absolutely necessary. Regardless of it being absolutely necessary,
it would be 2 years at least before the feature would actually make it
into Python and become default behavior, IF it were desirable default
behavior.


> Thanks - I should really calm down a bit. I will try to go "safe and
> slowly", and I hope that at the end I will succeed in making my own
> small contribution to Python.

You should also realize that you can make contributions to Python
without changing the language or the implementation of the language.
Read and review patches, help with bug reports, hang out on python-list
and attempt to help the hundreds (if not thousands) of users who are
asking for help, try to help new users in python-tutor, etc.  If you
have an idea for a language change, offer it up on python-list first
(I've forgotten to do this more often than I would like to admit), and
if it generally has more "cool" than "ick", then bring it back here.

 - Josiah


From martin at v.loewis.de  Sat Nov  5 22:41:21 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 05 Nov 2005 22:41:21 +0100
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
Message-ID: <436D2701.6080400@v.loewis.de>

Noam Raphael wrote:
> Is there a reason why the default __hash__ method returns the id of the objects?

You are asking a "why" question of the kind that is best answered with
"why not".

IOW, you are saying that the current behaviour is bad, but you are not
proposing any alternative behaviour. There are many alternatives
possible, and they are presumably all worse than the current
implementation.

To give an example: "why does hash() return id()"?
Answer: The alternative would be that hash() returns always 0 unless
implemented otherwise. This would cause serious performance issues
for people using the objects as dictionary keys. If they don't do that,
it doesn't matter what hash() returns.
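
A rough sketch of the performance point, for illustration only.  The
class names and the helper eq_calls_for are invented for this example,
and counting __eq__ calls is just a stand-in for measuring time: every
key with the same hash forces the dictionary to fall back on
comparisons.

```python
class Key:
    """Key whose __eq__ counts how often the dict has to call it."""
    eq_calls = 0

    def __eq__(self, other):
        type(self).eq_calls += 1
        return self is other


class ZeroHashKey(Key):
    """The alternative: every instance hashes to the same value."""
    eq_calls = 0

    def __hash__(self):
        return 0


class IdHashKey(Key):
    """The current default: the hash is derived from the identity."""
    eq_calls = 0
    __hash__ = object.__hash__


def eq_calls_for(cls, n=200):
    """Build a dict with n distinct keys and report the __eq__ calls."""
    cls.eq_calls = 0
    keys = [cls() for _ in range(n)]
    table = dict.fromkeys(keys)
    assert len(table) == n
    return cls.eq_calls
```

Calling eq_calls_for(ZeroHashKey) reports a count that grows roughly
quadratically with n, because each insertion has to be compared against
the keys already stored, whereas eq_calls_for(IdHashKey) stays at or
near zero.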

> This leads me to another question: why should the default __eq__
> method be the same as "is"?

Because the alternative would be to always return "False". This
would be confusing, because it would cause "x == x" to give False.

More generally, I claim that the current behaviour is better than
*any* alternative. To refute this claim, you would have to come
up with an alternative first.

Regards,
Martin

From noamraph at gmail.com  Sun Nov  6 00:00:16 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 6 Nov 2005 01:00:16 +0200
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <20051105115816.BFE3.JCARLSON@uci.edu>
References: <20051102125437.F290.JCARLSON@uci.edu>
	<b348a0850511051105k44906fbepdd0258ba2435bd2e@mail.gmail.com>
	<20051105115816.BFE3.JCARLSON@uci.edu>
Message-ID: <b348a0850511051500w5205c608u13768da5156cd58b@mail.gmail.com>

On 11/5/05, Josiah Carlson <jcarlson at uci.edu> wrote:
...
> > 1. It doesn't add complexity, or a new builtin.
>
> It changes default behavior (which I specified as a portion of my
> statement, which you quote).
>
> And you are wrong, it adds complexity to the implementation of both
> class instantiation and the default comparison mechanism.  The former, I
> believe, you will find more difficult to patch than the comparison,
> though if you have not yet had adventures in that which is writing C
> extension modules, modifying the default class instantiation may be
> the deal breaker for you (I personally would have no idea where to start).

Sorry, I meant complexity to the Python user - it won't require him to
learn more in order to write programs in Python.
>
> class eqMetaclass(type):
>     def __new__(cls, name, bases, dct):
>         if '__cmp_include__' in dct:
>             include = dict.fromkeys(dct['__cmp_include__'])
>         else:
>             # dct.keys has to be called; the bare method is not iterable
>             include = dict.fromkeys(dct.keys())
>
>         # default to an empty tuple so a missing __cmp_exclude__ is fine
>         for i in dct.get('__cmp_exclude__', ()):
>             _ = include.pop(i, None)
>
>         dct['__cmp_eq__'] = include.keys()
>         return type.__new__(cls, name, bases, dct)
>
> It took 10 lines of code, and was trivial (except for not-included
> multi-metaclass support code, which is being discussed in another thread).
>
> Oh, I suppose I should modify that __eq__ definition to be smarter about
> comparison...
>
> def __eq__(self, other):
>     if not hasattr(other, '__cmp_eq__'):
>         return False
>     if dict.fromkeys(self.__cmp_eq__) != \
>        dict.fromkeys(other.__cmp_eq__):
>         return False
>     for i in self.__cmp_eq__:
>         if getattr(self, i) != getattr(other, i):
>             return False
>     return True

Thanks for the implementation. It would be very useful in order to
explain my suggestion.

It's nice that it compares only attributes, not types. It makes it
possible for two people to write classes that can be equal to one
another.

>
> Wow, 20 lines of support code, how could one ever expect users to write
> that? ;)

This might mean that implementing it in C, once I find the right
place, won't be too difficult.

And I think that for most users it will be harder than it was for you,
and there are some subtleties in those lines.
>
>
> > 3. It will make other objects behave better, not only mine - other
> > classes will get a meaningful comparison operator, for free.
>
> You are assuming that the comparison previously wasn't "meaningful".  It has a
> meaning, though it may not be exactly what you wanted it to be, which is
> why Python allows users to define __eq__ operators to be exactly what
> they want, and which is why I don't find your uses compelling.
>
I think that value-based equality testing is a better default, since
in more cases it does what you want it to, and since in those cases
they won't have to write those 20 lines, or download them from
somewhere.
>
...
>
> From what I have previously learned from others in python-dev, the
> warnings machinery is slow, so one is to be wary of using warnings
> unless absolutely necessary. Regardless of it being absolutely necessary,
> it would be 2 years at least before the feature would actually make it
> into Python and become default behavior, IF it were desirable default
> behavior.

All right. I hope that those warnings will be ok - it's yet to be
seen. And about those 2 years - better later than never.
...
>
> You should also realize that you can make contributions to Python
> without changing the language or the implementation of the language.
> Read and review patches, help with bug reports, hang out on python-list
> and attempt to help the hundreds (if not thousands) of users who are
> asking for help, try to help new users in python-tutor, etc.

I confess that I don't do these a lot. I can say that from time to
time I teach Python to beginners, and that where I work I help a lot of
other people with Python.

> If you
> have an idea for a language change, offer it up on python-list first
> (I've forgotten to do this more often than I would like to admit), and
> if it generally has more "cool" than "ick", then bring it back here.
>
I will. Thanks again.

Noam

From jcarlson at uci.edu  Sun Nov  6 00:40:34 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 05 Nov 2005 15:40:34 -0800
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <b348a0850511051500w5205c608u13768da5156cd58b@mail.gmail.com>
References: <20051105115816.BFE3.JCARLSON@uci.edu>
	<b348a0850511051500w5205c608u13768da5156cd58b@mail.gmail.com>
Message-ID: <20051105151436.BFF7.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> 
> On 11/5/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> ...
> > > 1. It doesn't add complexity, or a new builtin.
> >
> > It changes default behavior (which I specified as a portion of my
> > statement, which you quote).
> >
> > And you are wrong, it adds complexity to the implementation of both
> > class instantiation and the default comparison mechanism.  The former, I
> > believe, you will find more difficult to patch than the comparison,
> > though if you have not yet had adventures in that which is writing C
> > extension modules, modifying the default class instantiation may be
> > the deal breaker for you (I personally would have no idea where to start).
> 
> Sorry, I meant complexity to the Python user - it won't require him to
> learn more in order to write programs in Python.

Ahh, but it does add complexity.  Along with knowing __doc__, __slots__,
__metaclass__, __init__, __new__, __cmp__, __eq__, ..., __str__,
__repr__, __getitem__, __setitem__, __delitem__, __getattr__,
__setattr__, __delattr__, ...


The user must also know what __cmp_include__ and __cmp_exclude__ mean
in order to understand code which uses them, and they must understand
that exclude entries overwrite include entries.


> > Wow, 20 lines of support code, how could one ever expect users to write
> > that? ;)
> 
> This might mean that implementing it in C, once I find the right
> place, won't be too difficult.
> 
> And I think that for most users it will be harder than it was for you,
> and there are some subtleties in those lines.

So put it in the Python Cookbook:
http://aspn.activestate.com/ASPN/Cookbook/Python 


> > > 3. It will make other objects behave better, not only mine - other
> > > classes will get a meaningful comparison operator, for free.
> >
> > You are assuming that the comparison previously wasn't "meaningful".  It has a
> > meaning, though it may not be exactly what you wanted it to be, which is
> > why Python allows users to define __eq__ operators to be exactly what
> > they want, and which is why I don't find your uses compelling.
> >
> I think that value-based equality testing is a better default, since
> in more cases it does what you want it to, and since in those cases
> they won't have to write those 20 lines, or download them from
> somewhere.

You are making a value judgement on what people want to happen with
default Python. Until others state that they want such an operation as a
default, I'm going to consider this particular argument relatively
unfounded.


> > From what I have previously learned from others in python-dev, the
> > warnings machinery is slow, so one is to be wary of using warnings
> > unless absolutely necessary. Regardless of it being absolutely necessary,
> > it would be 2 years at least before the feature would actually make it
> > into Python and become default behavior, IF it were desirable default
> > behavior.
> 
> All right. I hope that those warnings will be ok - it's yet to be
> seen. And about those 2 years - better later than never.

It won't be OK.  Every comparison using the default operator will incur
a speed penalty while it checks the (pure Python) warning machinery to
determine if the warning has been issued yet.  This alone makes the
transition require a __future__ import.
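
A small sketch of what that penalty looks like, under stated
assumptions: Legacy is a made-up stand-in for a class relying on the
default comparison, the warning text is invented, and the counter only
shows how often the machinery is entered, not how long it takes.

```python
import warnings


class Legacy:
    """Hypothetical class whose comparison goes through warnings.warn."""
    warn_calls = 0

    def __eq__(self, other):
        # Every comparison enters the warnings machinery, even when the
        # filters decide that nothing needs to be shown.
        Legacy.warn_calls += 1
        warnings.warn("default == may compare values in a future release",
                      FutureWarning)
        return self is other


def compare_many(n=1000):
    """Compare two objects n times; return (warn calls, warnings shown)."""
    Legacy.warn_calls = 0
    a, b = Legacy(), Legacy()
    with warnings.catch_warnings(record=True) as shown:
        warnings.simplefilter("default")  # show once per location
        for _ in range(n):
            a == b
    return Legacy.warn_calls, len(shown)
```

The warning is displayed only once, but warnings.warn is still called
for every one of the n comparisons, and that per-call lookup is the
cost at issue.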


 - Josiah


From noamraph at gmail.com  Sun Nov  6 01:02:36 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 6 Nov 2005 02:02:36 +0200
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <20051105151436.BFF7.JCARLSON@uci.edu>
References: <20051105115816.BFE3.JCARLSON@uci.edu>
	<b348a0850511051500w5205c608u13768da5156cd58b@mail.gmail.com>
	<20051105151436.BFF7.JCARLSON@uci.edu>
Message-ID: <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com>

On 11/6/05, Josiah Carlson <jcarlson at uci.edu> wrote:
...
> >
> > Sorry, I meant complexity to the Python user - it won't require him to
> > learn more in order to write programs in Python.
>
> Ahh, but it does add complexity.  Along with knowing __doc__, __slots__,
> __metaclass__, __init__, __new__, __cmp__, __eq__, ..., __str__,
> __repr__, __getitem__, __setitem__, __delitem__, __getattr__,
> __setattr__, __delattr__, ...
>
>
> The user must also know what __cmp_include__ and __cmp_exclude__ mean
> in order to understand code which uses them, and they must understand
> that exclude entries overwrite include entries.
>
You are right. But that's Python - I think that nobody knows all the
exact details of what all these do. You look in the documentation. It
is a complication - but it's of the type that I can live with, if
there's a reason.
>
> > > Wow, 20 lines of support code, how could one ever expect users to write
> > > that? ;)
> >
> > This might mean that implementing it in C, once I find the right
> > place, won't be too difficult.
> >
> > And I think that for most users it will be harder than it was for you,
> > and there are some subtleties in those lines.
>
> So put it in the Python Cookbook:
> http://aspn.activestate.com/ASPN/Cookbook/Python
>
A good idea.
>
> > > > 3. It will make other objects behave better, not only mine - other
> > > > classes will get a meaningful comparison operator, for free.
> > >
> > > You are assuming that the comparison previously wasn't "meaningful".  It has a
> > > meaning, though it may not be exactly what you wanted it to be, which is
> > > why Python allows users to define __eq__ operators to be exactly what
> > > they want, and which is why I don't find your uses compelling.
> > >
> > I think that value-based equality testing is a better default, since
> > in more cases it does what you want it to, and since in those cases
> > they won't have to write those 20 lines, or download them from
> > somewhere.
>
> You are making a value judgement on what people want to happen with
> default Python. Until others state that they want such an operation as a
> default, I'm going to consider this particular argument relatively
> unfounded.
>
All right. I will try to collect more examples for my proposal.
>
> > > From what I have previously learned from others in python-dev, the
> > > warnings machinery is slow, so one is to be wary of using warnings
> > > unless absolutely necessary. Regardless of it being absolutely necessary,
> > > it would be 2 years at least before the feature would actually make it
> > > into Python and become default behavior, IF it were desirable default
> > > behavior.
> >
> > All right. I hope that those warnings will be ok - it's yet to be
> > seen. And about those 2 years - better later than never.
>
> It won't be OK.  Every comparison using the default operator will incur
> a speed penalty while it checks the (pure Python) warning machinery to
> determine if the warning has been issued yet.  This alone makes the
> transition require a __future__ import.
>
How will the __future__ statement help? I think that the warning is
still needed, so that people using code that may stop working will
know about it. I see that they can add a __future__ import and see if
it still works, but it will catch far fewer problems, because usually
code would be run without the __future__ import.

If it really slows down things, it seems to me that the only solution
is to optimize the warning module...

Noam

From noamraph at gmail.com  Sun Nov  6 01:03:22 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 6 Nov 2005 02:03:22 +0200
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <436D2701.6080400@v.loewis.de>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
	<436D2701.6080400@v.loewis.de>
Message-ID: <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com>

On 11/5/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> More generally, I claim that the current behaviour is better than
> *any* alternative. To refute this claim, you would have to come
> up with an alternative first.
>
The alternative is to drop the __hash__ method of user-defined classes
(as Guido already decided to do), and to make the default __eq__
method compare the two objects' __dict__ and slot members.

See the thread about default equality operator - Josiah Carlson posted
there a metaclass implementing this equality operator.

Noam

From jcarlson at uci.edu  Sun Nov  6 01:19:49 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 05 Nov 2005 16:19:49 -0800
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com>
References: <436D2701.6080400@v.loewis.de>
	<b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com>
Message-ID: <20051105161846.C004.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> 
> On 11/5/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > More generally, I claim that the current behaviour is better than
> > *any* alternative. To refute this claim, you would have to come
> > up with an alternative first.
> >
> The alternative is to drop the __hash__ method of user-defined classes
> (as Guido already decided to do), and to make the default __eq__
> method compare the two objects' __dict__ and slot members.
> 
> See the thread about default equality operator - Josiah Carlson posted
> there a metaclass implementing this equality operator.

The existence of a simple equality operator and metaclass is actually a
strike against changing the default behavior for equality.

 - Josiah


From pedronis at strakt.com  Sun Nov  6 01:29:18 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Sun, 06 Nov 2005 01:29:18 +0100
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>	<436D2701.6080400@v.loewis.de>
	<b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com>
Message-ID: <436D4E5E.7000301@strakt.com>

Noam Raphael wrote:
> On 11/5/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
>>More generally, I claim that the current behaviour is better than
>>*any* alternative. To refute this claim, you would have to come
>>up with an alternative first.
>>
> 
> The alternative is to drop the __hash__ method of user-defined classes
> (as Guido already decided to do), and to make the default __eq__
> method compare the two objects' __dict__ and slot members.
> 

No: whether an object has a __hash__ and what the default hashing is
are different issues. Also, all of this discussion should have started and
lived on comp.lang.python, and this is as good a point as any to rectify that.


> See the thread about default equality operator - Josiah Carlson posted
> there a metaclass implementing this equality operator.
> 
> Noam


From jcarlson at uci.edu  Sun Nov  6 01:30:52 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 05 Nov 2005 16:30:52 -0800
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com>
References: <20051105151436.BFF7.JCARLSON@uci.edu>
	<b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com>
Message-ID: <20051105162001.C007.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> 
> On 11/6/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> ...
> > >
> > > Sorry, I meant complexity to the Python user - it won't require him to
> > > learn more in order to write programs in Python.
> You are right. But that's Python - I think that nobody knows all the
> exact details of what all these do. You look in the documentation. It
> is a complication - but it's of the type that I can live with, if
> there's a reason.

Regardless of whether people check the documentation, it does add
complexity to Python.


> > > All right. I hope that those warnings will be ok - it's yet to be
> > > seen. And about those 2 years - better later than never.
> >
> > It won't be OK.  Every comparison using the default operator will incur
> > a speed penalty while it checks the (pure Python) warning machinery to
> > determine if the warning has been issued yet.  This alone makes the
> > transition require a __future__ import.
> >
> How will the __future__ statement help? I think that the warning is
> still needed, so that people using code that may stop working will
> know about it. I see that they can add a __future__ import and see if
> it still works, but it will catch much fewer problems, because usually
> code would be run without the __future__ import.

What has been common is to use __future__ along with a note in the
release notes specifying the changes between 2.x and 2.x-1.  The precise
mechanisms when using __future__ vary from import to import, though this
one could signal the change of a single variable as to which code path
to use.


> If it really slows down things, it seems to me that the only solution
> is to optimize the warning module...

Possible solutions to the possible problem of default __eq__ behavior:
1. It is not a problem, leave it alone.
2. Use __future__.
3. Use warnings, and deal with it being slow.
4. Make warnings a C module and expose it to CPython internals.


You are claiming that there is such a need to fix __eq__ that one
NEEDS to change the warnings module so that the __eq__ fix can be fast.
Again, implement this, post it to sourceforge, and someone will decide.

 - Josiah


From martin at v.loewis.de  Sun Nov  6 11:08:22 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 06 Nov 2005 11:08:22 +0100
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>	
	<436D2701.6080400@v.loewis.de>
	<b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com>
Message-ID: <436DD616.3080304@v.loewis.de>

Noam Raphael wrote:
> The alternative is to drop the __hash__ method of user-defined classes
> (as Guido already decided to do), and to make the default __eq__
> method compare the two objects' __dict__ and slot members.

The question then is what hash(x) would do. It seems that you expect
it then somehow not to return a value. However, under this patch,
the fallback implementation (use pointer as the hash) would be used,
which would preserve hash(x)==id(x).

> See the thread about default equality operator - Josiah Carlson posted
> there a metaclass implementing this equality operator.

This will likely cause a lot of breakage. Objects will compare equal
even though they conceptually are not, and even though they did not
compare equal in previous Python versions.
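
To make the kind of breakage concrete, here is a small sketch.  DictEq
is a hypothetical base class that emulates the proposed default by
comparing __dict__; Lock, Meters and Feet are invented examples.

```python
class DictEq:
    """Hypothetical stand-in for the proposed default __eq__."""

    def __eq__(self, other):
        # Compare instance dictionaries; the type is deliberately ignored.
        return vars(self) == getattr(other, '__dict__', None)


class Lock(DictEq):
    def __init__(self):
        self.held = False


class Meters(DictEq):
    def __init__(self, value):
        self.value = value


class Feet(DictEq):
    def __init__(self, value):
        self.value = value


a, b = Lock(), Lock()
locks = [a, b]
# list.remove() drops the first element that compares equal to b.  With
# value comparison that element is a, so the wrong lock is removed.
locks.remove(b)
```

After the remove() call the list still contains b, although b is the
object that was passed in.  In the same way Meters(3) == Feet(3) is
true under this scheme, even though the two values are conceptually
different.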

Regards,
Martin

From jim at zope.com  Sun Nov  6 17:15:58 2005
From: jim at zope.com (Jim Fulton)
Date: Sun, 06 Nov 2005 11:15:58 -0500
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
Message-ID: <436E2C3E.7060807@zope.com>


The recent discussion about what the default hash and equality comparisons
should do makes me want to chime in.

IMO, the provision of defaults for hash, eq and other comparisons
was a mistake.  I'm especially sensitive to this because I do a lot
of work with persistent data that outlives program execution. For such
objects, memory address is meaningless.  In particular, the default
ordering of objects based on address has caused a great deal of pain
to people who store data in persistent BTrees.

Oddly, what I've read in these threads seems to be arguing about
which implicit method is best.  The answer, IMO, is to not do this
implicitly at all.  If programmers want their objects to be
hashable, comparable, or orderable, then they should implement operators
explicitly.  There could even be a handy, but *optional*, base class that
provides these operators based on ids.
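
A minimal sketch of what such an optional base class might look like.
The name IdentityComparable is invented, and the sketch assumes a
Python in which plain objects have no default ordering (as in Python 3),
so that ordering is available only to classes that opt in.

```python
import functools


@functools.total_ordering
class IdentityComparable:
    """Hypothetical opt-in base: hash, equality and order by id()."""

    def __hash__(self):
        return id(self)

    def __eq__(self, other):
        return self is other

    def __lt__(self, other):
        if not isinstance(other, IdentityComparable):
            return NotImplemented
        return id(self) < id(other)


class Session(IdentityComparable):
    """Opts in explicitly: usable as a dict key and sortable."""


class Plain:
    """Does not opt in, so it gets no ordering at all."""
```

Two Session objects can be sorted and used as dictionary keys, with the
arbitrary but consistent order of their ids, while comparing two Plain
objects with < raises TypeError.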

This would be too big a change for Python 2 but, IMO, should definitely
be made for Python 3k.  I doubt any change in the default definition
of these operations is practical for Python 2.  Too many people rely on
them, usually without really realizing it.

Let's plan to stop guessing how to do hash and comparison.

Explicit is better than implicit. :)

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From guido at python.org  Sun Nov  6 20:47:20 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 6 Nov 2005 11:47:20 -0800
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <436E2C3E.7060807@zope.com>
References: <436E2C3E.7060807@zope.com>
Message-ID: <ca471dc20511061147p2e0ae9dbt83b6e52dbbd7e69b@mail.gmail.com>

On 11/6/05, Jim Fulton <jim at zope.com> wrote:
> IMO, the provision of defaults for hash, eq and other comparisons
> was a mistake.

I agree with you for 66%. Default hash and inequalities were a
mistake. But I wouldn't want to do without a default ==/!=
implementation (and of course it should be defined so that an object
is only equal to itself).

In fact, the original hash() was clever enough to complain when __eq__
(or __cmp__) was overridden but __hash__ wasn't; but this got lost by
accident for new-style classes when I added a default __hash__ to the
new universal base class (object). But I think the original default
hash() isn't particularly useful, so I think it's better to just not
be hashable unless __hash__ is defined explicitly.
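
For comparison, Python 3 as eventually released behaves much like the
original hash() described above: a class that overrides __eq__ without
defining __hash__ gets __hash__ set to None, so its instances are
unhashable. A minimal sketch of that behaviour; the class and function
names are made up for illustration:

```python
class Plain:
    """No __eq__ or __hash__: inherits the identity-based defaults."""


class ValueOnlyEq:
    """Hypothetical class that defines __eq__ but not __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, ValueOnlyEq):
            return NotImplemented
        return self.value == other.value


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True
```

Under Python 3, is_hashable(Plain()) is true and
is_hashable(ValueOnlyEq(1)) is false.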

> I'm especially sensitive to this because I do a lot
> of work with persistent data that outlives program execution. For such
> objects, memory address is meaningless.  In particular, the default
> ordering of objects based on address has caused a great deal of pain
> to people who store data in persistent BTrees.

This argues against the inequalities (<, <=, >, >=) and I agree.

> Oddly, what I've read in these threads seems to be arguing about
> which implicit method is best.  The answer, IMO, is to not do this
> implicitly at all.  If programmers want their objects to be
> hashable, comparable, or orderable, then they should implement operators
> explicitly.  There could even be a handy, but *optional*, base class that
> provides these operators based on ids.

I don't like that final suggestion. Before you know it, a meme
develops telling newbies that all classes should inherit from that
"optional" base class, and then later it's impossible to remove it
because you can't tell whether it's actually needed or not.

> This would be too big a change for Python 2 but, IMO, should definitely
> be made for Python 3k.  I doubt any change in the default definition
> of these operations is practical for Python 2.  Too many people rely on
> them, usually without really realizing it.

Agreed.

> Let's plan to stop guessing how to do hash and comparison.
>
> Explicit is better than implicit. :)

Except that I really don't think that there's anything wrong with a
default __eq__ that uses object identity. As Martin pointed out, it's
just too weird that an object wouldn't be considered equal to itself.
It's the default __hash__ and __cmp__ that mess things up.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim at zope.com  Sun Nov  6 21:13:23 2005
From: jim at zope.com (Jim Fulton)
Date: Sun, 06 Nov 2005 15:13:23 -0500
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <ca471dc20511061147p2e0ae9dbt83b6e52dbbd7e69b@mail.gmail.com>
References: <436E2C3E.7060807@zope.com>
	<ca471dc20511061147p2e0ae9dbt83b6e52dbbd7e69b@mail.gmail.com>
Message-ID: <436E63E3.7040307@zope.com>

Guido van Rossum wrote:
> On 11/6/05, Jim Fulton <jim at zope.com> wrote:
> 
...
> Except that I really don't think that there's anything wrong with a
> default __eq__ that uses object identity. As Martin pointed out, it's
> just too weird that an object wouldn't be considered equal to itself.
> It's the default __hash__ and __cmp__ that mess things up.

Good point.  I agree.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From jrw at pobox.com  Sun Nov  6 21:39:42 2005
From: jrw at pobox.com (John Williams)
Date: Sun, 06 Nov 2005 14:39:42 -0600
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <436E2C3E.7060807@zope.com>
References: <436E2C3E.7060807@zope.com>
Message-ID: <436E6A0E.4070508@pobox.com>

(This is kind of on a tangent to the original discussion, but I don't 
want to create yet another subject line about object comparisons.)

Lately I've found that virtually all my implementations of __cmp__, 
__hash__, etc. can be factored into this form inspired by the "key" 
parameter to the built-in sorting functions:

class MyClass:

   def __key(self):
     # Return a tuple of attributes to compare.
     return (self.foo, self.bar, ...)

   def __cmp__(self, that):
     return cmp(self.__key(), that.__key())

   def __hash__(self):
     return hash(self.__key())

I wonder if it wouldn't make sense to formalize this pattern with a 
magic __key__ method such that a class with a __key__ method would 
behave as if it had inherited the definitions of __cmp__ and __hash__ above.

This scheme would eliminate the tedium of keeping the __hash__ method in 
sync with the __cmp__/__eq__ method, and writing a __key__ method would 
involve writing less code than a naive __eq__ method, since each 
attribute name only needs to be mentioned once instead of appearing on 
either side of a "==" expression.
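
As a rough sketch of the same pattern in Python 3 terms, where __cmp__
and cmp() no longer exist, the key can feed __eq__, __lt__ and __hash__,
with functools.total_ordering filling in the remaining comparisons. The
Version class and its attributes are hypothetical, chosen only for
illustration:

```python
import functools


@functools.total_ordering
class Version:
    """Hypothetical value class whose comparisons all go through _key()."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Each compared attribute is named exactly once, here.
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Derived from the same key, so it stays in sync with __eq__.
        return hash(self._key())
```

Because equality, ordering and hashing all derive from _key(), adding an
attribute to the comparison means touching one line.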

On the other hand, this idea doesn't work in all situations (for 
instance, I don't think you could define the default __cmp__/__hash__ 
semantics in terms of __key__), it would only eliminate two one-line 
methods for each class, and it would further complicate the "==" 
operator (__key__, falling back to __eq__, falling back to __cmp__, 
falling back to object identity--ouch!)

If anyone thinks this is a good idea I'll investigate how many places in 
the standard library this pattern would apply.

--jw

From guido at python.org  Sun Nov  6 21:58:57 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 6 Nov 2005 12:58:57 -0800
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <436E6A0E.4070508@pobox.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
Message-ID: <ca471dc20511061258q636689c0se9e45b0f503e1299@mail.gmail.com>

On 11/6/05, John Williams <jrw at pobox.com> wrote:
> (This is kind of on a tangent to the original discussion, but I don't
> want to create yet another subject line about object comparisons.)
>
> Lately I've found that virtually all my implementations of __cmp__,
> __hash__, etc. can be factored into this form inspired by the "key"
> parameter to the built-in sorting functions:
>
> class MyClass:
>
>    def __key(self):
>      # Return a tuple of attributes to compare.
>      return (self.foo, self.bar, ...)
>
>    def __cmp__(self, that):
>      return cmp(self.__key(), that.__key())
>
>    def __hash__(self):
>      return hash(self.__key())

The main way this breaks down is when comparing objects of different
types. While most comparisons typically are defined in terms of
comparisons on simpler or contained objects, two objects of different
types that happen to have the same "key" shouldn't necessarily be
considered equal.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Sun Nov  6 22:22:31 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 06 Nov 2005 16:22:31 -0500
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
 and comparison
In-Reply-To: <ca471dc20511061258q636689c0se9e45b0f503e1299@mail.gmail.co
 m>
References: <436E6A0E.4070508@pobox.com> <436E2C3E.7060807@zope.com>
	<436E6A0E.4070508@pobox.com>
Message-ID: <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>

At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote:
>The main way this breaks down is when comparing objects of different
>types. While most comparisons typically are defined in terms of
>comparisons on simpler or contained objects, two objects of different
>types that happen to have the same "key" shouldn't necessarily be
>considered equal.

When I use this pattern, I often just include the object's type in the 
key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)
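
A minimal sketch of what including the type in the key might look like,
written for Python 3. The HashCmp helper, the _attrs hook and the example
classes are all assumptions made for illustration, not code from any
existing library:

```python
class HashCmp:
    """Hypothetical helper: equality and hashing via a type-tagged key."""

    def _attrs(self):
        raise NotImplementedError

    def _hashcmp(self):
        # The concrete type is the first element of the key.
        return (type(self),) + tuple(self._attrs())

    def __eq__(self, other):
        if not isinstance(other, HashCmp):
            return NotImplemented
        return self._hashcmp() == other._hashcmp()

    def __hash__(self):
        return hash(self._hashcmp())


class Point(HashCmp):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def _attrs(self):
        return (self.x, self.y)


class Size(HashCmp):
    def __init__(self, w, h):
        self.w, self.h = w, h

    def _attrs(self):
        return (self.w, self.h)


class Point3(Point):
    """Subclass that adds nothing, yet is a different type."""
```

With this key, Point(1, 2) and Size(1, 2) compare unequal even though
their attribute tuples match. The same holds for Point(1, 2) and
Point3(1, 2), so a subclass instance never equals a parent instance.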


From guido at python.org  Sun Nov  6 22:29:27 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 6 Nov 2005 13:29:27 -0800
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
Message-ID: <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.com>

On 11/6/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote:
> >The main way this breaks down is when comparing objects of different
> >types. While most comparisons typically are defined in terms of
> >comparisons on simpler or contained objects, two objects of different
> >types that happen to have the same "key" shouldn't necessarily be
> >considered equal.
>
> When I use this pattern, I often just include the object's type in the
> key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)

But how do you make that work with subclassing? (I'm guessing your
answer is that you don't. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From josh at janrain.com  Sun Nov  6 22:57:34 2005
From: josh at janrain.com (Josh Hoyt)
Date: Sun, 6 Nov 2005 13:57:34 -0800
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.com>
Message-ID: <34714aad0511061357x2dc3765y2cdca412dec4e432@mail.gmail.com>

On 11/6/05, Guido van Rossum <guido at python.org> wrote:
> On 11/6/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> > When I use this pattern, I often just include the object's type in the
> > key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)
>
> But how do you make that work with subclassing? (I'm guessing your
> answer is that you don't. :-)

If there is a well-defined desired behaviour for comparisons in the
face of subclassing (which I'm not sure if there is) then that
behaviour could become part of the definition of how __key__ works.
Since __key__ would be for clarity of intent and convenience of
implementation, adding default behaviour for the most common case
seems like it would be a good idea.

My initial thought was that all subclasses of the class where __key__
was defined would compare as equal if they return the same value. More
precisely, if two objects have the same __key__ method, and it returns
the same value, then they are equal. That does not solve the __cmp__
problem, unless the __key__ function is used as part of the ordering.

For example:

def getKey(obj):
    key = getattr(obj.__class__, '__key__')
    return (id(key), key(obj))

An obvious drawback is that if __key__ is overridden, then the
subclass where it is overridden and all further subclasses will no
longer have equality to the superclass. I think that this is probably
OK, except that it may be occasionally surprising.

Josh

From jcarlson at uci.edu  Mon Nov  7 00:12:36 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 06 Nov 2005 15:12:36 -0800
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <436E6A0E.4070508@pobox.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
Message-ID: <20051106144700.C01C.JCARLSON@uci.edu>


John Williams <jrw at pobox.com> wrote:
> 
> (This is kind of on a tangent to the original discussion, but I don't 
> want to create yet another subject line about object comparisons.)
> 
> Lately I've found that virtually all my implementations of __cmp__, 
> __hash__, etc. can be factored into this form inspired by the "key" 
> parameter to the built-in sorting functions:
> 
> class MyClass:
> 
>    def __key(self):
>      # Return a tuple of attributes to compare.
>      return (self.foo, self.bar, ...)
> 
>    def __cmp__(self, that):
>      return cmp(self.__key(), that.__key())
> 
>    def __hash__(self):
>      return hash(self.__key())
> 
> I wonder if it wouldn't make sense to formalize this pattern with a 
> magic __key__ method such that a class with a __key__ method would 
> behave as if it had inherited the definitions of __cmp__ and __hash__ above.

You probably already realize this, but I thought I would point out the
obvious.  Given a suitably modified MyClass...

>>> x = {}
>>> a = MyClass()
>>> a.a = 8
>>> x[a] = a
>>> a.a = 9
>>> x[a] = a
>>>
>>> x
{<__main__.MyClass instance at 0x007E0A08>: <__main__.MyClass instance at 0x007E0A08>,
 <__main__.MyClass instance at 0x007E0A08>: <__main__.MyClass instance at 0x007E0A08>}

Of course everyone is saying "Josiah, people shouldn't be doing that";
but they will.  Given a mechanism to offer hash-by-value, a large number
of users will think that it will work for what they want, regardless of
the fact that in order for it to really work, those attributes must be
read-only by semantics or access mechanisms.  Not everyone who uses
Python understands fully the concepts of mutability and immutability,
and very few will realize that the attributes returned by __key() need
to be immutable aspects of the instance of that class (you can perform
at most one assignment to the attribute during its lifetime, and that
assignment must occur before any hash calls).
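
The hazard can be sketched in a few lines of Python 3. The Record class
is hypothetical, and the example only mutates the object after storing
it, without inserting it a second time:

```python
class Record:
    """Hypothetical class that hashes and compares by a mutable attribute."""

    def __init__(self, a):
        self.a = a

    def _key(self):
        return (self.a,)

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


r = Record(8)
table = {r: "payload"}
r.a = 9  # mutate an attribute that feeds __hash__ after storing the object

still_stored = any(k is r for k in table)  # the entry is still in the dict
found_by_old_value = Record(8) in table    # stored hash matches, __eq__ fails
found_by_new_value = Record(9) in table    # __eq__ would match, stored hash does not
```

Here still_stored is true while both value lookups fail: the dict
remembers the hash computed at insertion time, so the entry can no
longer be reached through an equal key.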


Call me a pessimist, but I don't believe that using magical key methods
will be helpful for understanding or using Python.

 - Josiah


From pje at telecommunity.com  Mon Nov  7 01:12:21 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 06 Nov 2005 19:12:21 -0500
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
 and comparison
In-Reply-To: <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.co
 m>
References: <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>

At 01:29 PM 11/6/2005 -0800, Guido van Rossum wrote:
>On 11/6/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote:
> > >The main way this breaks down is when comparing objects of different
> > >types. While most comparisons typically are defined in terms of
> > >comparisons on simpler or contained objects, two objects of different
> > >types that happen to have the same "key" shouldn't necessarily be
> > >considered equal.
> >
> > When I use this pattern, I often just include the object's type in the
> > key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)
>
>But how do you make that work with subclassing? (I'm guessing your
>answer is that you don't. :-)

By either changing the subclass __init__ to initialize it with a different 
hashcmp value, or by redefining the method that computes it.


From pje at telecommunity.com  Mon Nov  7 01:15:01 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 06 Nov 2005 19:15:01 -0500
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
 and comparison
In-Reply-To: <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
References: <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.co m>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>

At 07:12 PM 11/6/2005 -0500, Phillip J. Eby wrote:
>At 01:29 PM 11/6/2005 -0800, Guido van Rossum wrote:
> >On 11/6/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote:
> > > >The main way this breaks down is when comparing objects of different
> > > >types. While most comparisons typically are defined in terms of
> > > >comparisons on simpler or contained objects, two objects of different
> > > >types that happen to have the same "key" shouldn't necessarily be
> > > >considered equal.
> > >
> > > When I use this pattern, I often just include the object's type in the
> > > key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)
> >
> >But how do you make that work with subclassing? (I'm guessing your
> >answer is that you don't. :-)
>
>By either changing the subclass __init__ to initialize it with a different
>hashcmp value, or by redefining the method that computes it.

Scratch that.  I realized 2 seconds after hitting "Send" that you meant the 
case where you want to compare instances with a common parent type.  And 
the answer is, I can't recall having needed to.  (Which is probably why it 
took me so long to realize what you meant.)


From fakeaddress at nowhere.org  Thu Nov  3 01:55:01 2005
From: fakeaddress at nowhere.org (Bryan Olson)
Date: Wed, 02 Nov 2005 16:55:01 -0800
Subject: [Python-Dev] PEP submission broken?
Message-ID: <43695FE5.1080803@nowhere.org>


Though I tried to submit a (pre-) PEP in the proper form through the proper
channels, it has disappeared into the ether.


In building a class that supports Python's slicing interface,

   http://groups.google.com/group/comp.lang.python/msg/8f35464483aa7d7b

I encountered a Python bug, which, upon further discussion, seemed to be
a combination of a wart and a documentation error.

   http://groups.google.com/group/comp.lang.python/browse_frm/thread/402d770b6f503c27

I submitted the bug report via SourceForge; the resolution was to document
the actual behavior.  Next I worked out what behavior I think would
eliminate the wart, wrote it up as a pre-PEP, and sent it to
peps at python.org on 27 Aug of this year.

I promptly received an automated response from Barry Warsaw, saying, in
part, "I get so much email that I can't promise a personal response."  I
gathered that he is a PEP editor. I did not infer from his reply that PEPs
are simply ignored, but this automated reply was the only response I ever
received. I subscribed to the Python-dev list, and watched, and waited;
nothing on my concern appeared.


One response on the comp.lang.python newsgroup noted that a popular
extension module would have difficulty maintaining consistency with my
proposed PEP.  My proposal does not break how the extension currently
works, but still, that's a valid point. There are variations which do not
have that problem, and I think I can see a course that will serve the
entire Python community. From what I can tell, we need to address fixing
the PEP process before there is any point in working on PEPs.



-- 
--Bryan

From ncoghlan at gmail.com  Mon Nov  7 10:32:57 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 07 Nov 2005 19:32:57 +1000
Subject: [Python-Dev] PEP submission broken?
In-Reply-To: <43695FE5.1080803@nowhere.org>
References: <43695FE5.1080803@nowhere.org>
Message-ID: <436F1F49.90606@gmail.com>

Bryan Olson wrote:
> From what I can tell, we need to address fixing the
> PEP process before there is any point in working on PEPs.

I think this is a somewhat fair point (although perhaps a bit overstated) -
David and Barry can be busy IRL, which can certainly slow down the process of
PEP submission. PEP 328 hung in limbo for a while on that basis (I'm going to
have to look into if and how PEP 328 relates to Python eggs one of these
days...).

Would it be worth having a PEP category on the RFE tracker, rather than 
submitting pre-PEPs directly to the PEP editors? The process still wouldn't 
be perfect, but it would widen the pool of people that can bring a pre-PEP to 
the attention of python-dev.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From bjourne at gmail.com  Mon Nov  7 14:06:11 2005
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Mon, 7 Nov 2005 14:06:11 +0100
Subject: [Python-Dev] Should the default equality operator compare
	values instead of identities?
In-Reply-To: <20051105162001.C007.JCARLSON@uci.edu>
References: <20051105151436.BFF7.JCARLSON@uci.edu>
	<b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com>
	<20051105162001.C007.JCARLSON@uci.edu>
Message-ID: <740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com>

How would the value equality operator deal with recursive objects?

class Foo:
    def __init__(self):
        self.foo = self

Seems to me that it would take at least some special-casing to get
Foo() == Foo() to evaluate to True in this case...

--
mvh Björn

From guido at python.org  Mon Nov  7 18:10:15 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 Nov 2005 09:10:15 -0800
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
Message-ID: <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>

Two more thoughts in this thread.

(1) The "key" idiom is a great pattern but I don't think it would work
well to make it a standard language API.

(2) I'm changing my mind about the default hash().

The original default hash() (which would raise TypeError if __eq__ was
overridden but __hash__ was not) is actually quite useful in some
situations. Basically, simplifying a bit, there are two types of
objects: those that represent *values* and those that do not. For
value-ish objects, overriding __eq__ is common and then __hash__ needs
to be overridden in order to get the proper dict and set behavior. In
a sense, __eq__ defines an "equivalence class" in the mathematical
sense.

But in many applications I've used objects for which object identity
is important.

Let me construct a hypothetical example: suppose we represent a car
and its parts as objects. Let's say each wheel is an object. Each
wheel is unique and we don't have equivalency classes for them.
However, it would be useful to construct sets of wheels (e.g. the set
of wheels currently on my car that have never had a flat tire). Python
sets use hashing just like dicts. The original hash() and __eq__
implementation would work exactly right for this purpose, and it seems
silly to have to add it to every object type that could possibly be
used as a set member (especially since this means that if a third
party library creates objects for you that don't implement __hash__
you'd have a hard time of adding it).
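
The wheel example can be sketched as follows. The Wheel class and its
attributes are hypothetical; the point is only that it defines neither
__eq__ nor __hash__ and still works as a set member through the
identity-based defaults:

```python
class Wheel:
    """Hypothetical part object; no __eq__ or __hash__ defined."""

    def __init__(self, position):
        self.position = position
        self.had_flat = False


positions = ("front-left", "front-right", "rear-left", "rear-right")
wheels = [Wheel(p) for p in positions]
wheels[2].had_flat = True

# The set of wheels currently on the car that have never had a flat tire.
never_flat = {w for w in wheels if not w.had_flat}
```

A second Wheel("front-left") created later would not be a member of
never_flat, since each wheel is equal only to itself.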

In short, I agree something's broken, but the fix should not be to
remove the default __hash__ and __eq__ altogether. Instead, the
default __hash__ should be made smarter (and perhaps the only way to
do this is to build the smarts into hash() again). I do agree that
__cmp__, __gt__ etc. should be left undefined by default. All of this
is Python 3000 only.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Mon Nov  7 21:49:21 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 7 Nov 2005 12:49:21 -0800
Subject: [Python-Dev] cross-compiling
Message-ID: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>

We've been having some issues and discussions at work about cross
compiling.  There are various people that have tried (are) cross
compiling python.  Right now the support kinda sucks due to a couple
of reasons.

First, distutils is required to build all the modules.  This means
that python must be built twice.  Once for the target machine and once
for the host machine.  The host machine is really not desired since
its only purpose is to run distutils.  I don't know the history of
why distutils is used.  I haven't had much of an issue with it since
I've never needed to cross compile.  What are the issues with not
requiring python to be built on the host machine (ie, not using
distutils)?

Second, in configure we try to run little programs (AC_TRY_RUN) to
determine what to set.  I don't know of any good alternative but to
force those to be defined manually for cross-compiled environments.
Any suggestions here?  I'm thinking we can skip the AC_TRY_RUNs
if host != target and we pick up the answers to those from a
user-supplied file.

I'm *not* suggesting that normal builds see any change in behaviour.
Nothing will change for most developers.  ie, ./configure ; make ;
./python will continue to work the same.  I only want to make it
possible to cross compile python by building it only on the target
platform.

n

PS.  I would be interested to hear from others who are doing cross
compiling and know more about it than me.

From guido at python.org  Mon Nov  7 22:04:53 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 Nov 2005 13:04:53 -0800
Subject: [Python-Dev] cross-compiling
In-Reply-To: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>
References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>
Message-ID: <ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com>

I know some folks have successfully used cross-compilation before. But
this was in a distant past. There was some support for it in the
configure script; surely you're using that? I believe it lets you
specify defaults for the TRY_RUN macros. But it's probably very
primitive.

About using distutils to build the extensions, this is because some
extensions require quite a bit of logic to determine the build
commands (e.g. look at BSDDB or Tkinter). There was a pre-distutils
way of building extensions using Modules/Setup* but this required
extensive manual editing if tools weren't in the place where they were
expected, and they never were.

I don't have time to look into this further right now, but I hope I
will in the future. Keep me posted!

--Guido

On 11/7/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> We've been having some issues and discussions at work about cross
> compiling.  There are various people that have tried (are) cross
> compiling python.  Right now the support kinda sucks due to a couple
> of reasons.
>
> First, distutils is required to build all the modules.  This means
> that python must be built twice.  Once for the target machine and once
> for the host machine.  The host machine is really not desired since
> its only purpose is to run distutils.  I don't know the history of
> why distutils is used.  I haven't had much of an issue with it since
> I've never needed to cross compile.  What are the issues with not
> requiring python to be built on the host machine (ie, not using
> distutils)?
>
> Second, in configure we try to run little programs (AC_TRY_RUN) to
> determine what to set.  I don't know of any good alternative but to
> force those to be defined manually for cross-compiled environments.
> Any suggestions here?  I'm thinking we can skip the AC_TRY_RUNs
> if host != target and we pick up the answers to those from a
> user-supplied file.
>
> I'm *not* suggesting that normal builds see any change in behaviour.
> Nothing will change for most developers.  ie, ./configure ; make ;
> ./python will continue to work the same.  I only want to make it
> possible to cross compile python by building it only on the target
> platform.
>
> n
>
> PS.  I would be interested to hear from others who are doing cross
> compiling and know more about it than me.


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jeremy at alum.mit.edu  Mon Nov  7 22:38:18 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 7 Nov 2005 16:38:18 -0500
Subject: [Python-Dev] cross-compiling
In-Reply-To: <ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com>
References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>
	<ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com>
Message-ID: <e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com>

On 11/7/05, Guido van Rossum <guido at python.org> wrote:
> About using distutils to build the extensions, this is because some
> extensions require quite a bit of logic to determine the build
> commands (e.g. look at BSDDB or Tkinter). There was a pre-distutils
> way of building extensions using Modules/Setup* but this required
> extensive manual editing if tools weren't in the place where they were
> expected, and they never were.

I think part of the problem is that setup.py has a bunch of heuristics
that are intended to do the right thing without user intervention. 
If, on the other hand, the user wants to intervene, because "the right
thing" is wrong for cross-compiling, you are kind of stuck.  I don't
think there is an obvious way to select the extension modules to build
and the C libraries for them to link against.

Jeremy

From bcannon at gmail.com  Mon Nov  7 22:41:33 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 7 Nov 2005 13:41:33 -0800
Subject: [Python-Dev] cross-compiling
In-Reply-To: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>
References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>
Message-ID: <bbaeab100511071341w49c8a593u12b1d0ab68ca1110@mail.gmail.com>

On 11/7/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> We've been having some issues and discussions at work about cross
> compiling.  There are various people that have tried (are) cross
> compiling python.  Right now the support kinda sucks due to a couple
> of reasons.

This might make a good sprint topic.  Maybe your employer might be
willing to get some people to come to hack on this?

I know I wouldn't mind seeing the whole build process cleaned up.  It
works well enough, but I think some things could stand to be updated
(speaking from experience of adding EXTRA_CFLAGS to the build
process), such as setup.py being made more modular.

-Brett

From barry at python.org  Mon Nov  7 23:05:38 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 07 Nov 2005 17:05:38 -0500
Subject: [Python-Dev] cross-compiling
In-Reply-To: <e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com>
References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>
	<ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com>
	<e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com>
Message-ID: <1131401138.4926.38.camel@geddy.wooz.org>

On Mon, 2005-11-07 at 16:38, Jeremy Hylton wrote:

> I think part of the problem is that setup.py has a bunch of heuristics
> that are intended to do the right thing without user intervention. 
> If, on the other hand, the user wants to intervene, because "the right
> thing" is wrong for cross-compiling, you are kind of stuck.  I don't
> think there is an obvious way to select the extension modules to build
> and the C libraries for them to link against.

This relates to an issue we've had to work around with the
distutils-based module builds in Python.  For some of the modules, we want the
auto-detection code to find versions of dependent libraries in locations
other than the "standard" system locations.  I don't think there's a
good way to convince the various setup.py scripts to look elsewhere for
things, short of modifying the code.

-Barry


From martin at v.loewis.de  Mon Nov  7 23:34:26 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 07 Nov 2005 23:34:26 +0100
Subject: [Python-Dev] Should the default equality operator
 compare values instead of identities?
In-Reply-To: <740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com>
References: <20051105151436.BFF7.JCARLSON@uci.edu>	<b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com>	<20051105162001.C007.JCARLSON@uci.edu>
	<740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com>
Message-ID: <436FD672.5070807@v.loewis.de>

Björn Lindqvist wrote:
> How would the value equality operator deal with recursive objects?
> 
> class Foo:
>     def __init__(self):
>         self.foo = self
> 
> Seems to me that it would take at least some special-casing to get
> Foo() == Foo() to evaluate to True in this case...

This is sort-of supported today:

 >>> a=[]
 >>> a.append(a)
 >>> b=[]
 >>> b.append(b)
 >>> a == b
True

Regards,
Martin

From martin at v.loewis.de  Mon Nov  7 23:38:34 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 07 Nov 2005 23:38:34 +0100
Subject: [Python-Dev] cross-compiling
In-Reply-To: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>
References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>
Message-ID: <436FD76A.3020401@v.loewis.de>

Neal Norwitz wrote:
> First, distutils is required to build all the modules. 

As Guido already suggests, this assertion is false. In a
cross-compilation environment, I would try to avoid distutils,
and indeed, the build process to do so is still supported.

> Second, in configure we try to run little programs (AC_TRY_RUN) to
> determine what to set.  I don't know of any good alternative but to
> force those to be defined manually for cross-compiled environments.
> Any suggestions here?  I'm thinking we can skip the AC_TRY_RUNs
> if host != target and we pickup the answers to those from a user
> supplied file.

You shouldn't be required to do that. Instead, just edit pyconfig.h
manually, to match the target. autoconf is designed to support that.

It would help if Makefile was target-independent (only host-dependent).
Not sure whether this is the case.

Regards,
Martin

From martin at v.loewis.de  Mon Nov  7 23:39:18 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 07 Nov 2005 23:39:18 +0100
Subject: [Python-Dev] cross-compiling
In-Reply-To: <e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com>
References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com>	<ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com>
	<e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com>
Message-ID: <436FD796.3010208@v.loewis.de>

Jeremy Hylton wrote:
> I think part of the problem is that setup.py has a bunch of heuristics
> that are intended to do the right thing without user intervention. 
> If, on the other hand, the user wants to intervene, because "the right
> thing" is wrong for cross-compiling, you are kind of stuck.  I don't
> think there is an obvious way to select the extension modules to build
> and the C libraries for them to link against.

Of course there is: Modules/Setup.

Regards,
Martin

From mwh at python.net  Tue Nov  8 00:02:12 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 07 Nov 2005 23:02:12 +0000
Subject: [Python-Dev] Should the default equality operator
	compare values instead of identities?
In-Reply-To: <436FD672.5070807@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Mon,
	07 Nov 2005 23:34:26 +0100")
References: <20051105151436.BFF7.JCARLSON@uci.edu>
	<b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com>
	<20051105162001.C007.JCARLSON@uci.edu>
	<740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com>
	<436FD672.5070807@v.loewis.de>
Message-ID: <2mbr0w12q3.fsf@starship.python.net>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> Björn Lindqvist wrote:
>> How would the value equality operator deal with recursive objects?
>> 
>> class Foo:
>>     def __init__(self):
>>         self.foo = self
>> 
>> Seems to me that it would take at least some special-casing to get
>> Foo() == Foo() to evaluate to True in this case...
>
> This is sort-of supported today:
>
>  >>> a=[]
>  >>> a.append(a)
>  >>> b=[]
>  >>> b.append(b)
>  >>> a == b
> True

Uh, I think this changed in Python 2.4:

>>> a = []
>>> a.append(a)
>>> b = []
>>> b.append(b)
>>> a == b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
RuntimeError: maximum recursion depth exceeded in cmp
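
For comparison, here is a minimal sketch of the same experiment on a current
Python 3 interpreter (an assumption; the sessions in this thread are Python 2).
The helper name compare_recursive_lists is made up for illustration.
Comparing a self-referential list with itself succeeds through the
per-element identity shortcut, while comparing two distinct self-referential
lists recurses until the interpreter gives up:

```python
def compare_recursive_lists():
    """Return the outcome of comparing self-referential lists."""
    a = []
    a.append(a)
    b = []
    b.append(b)

    # a[0] is a, so the element comparison short-circuits on identity.
    same = (a == a)

    # a[0] and b[0] are distinct objects, so the comparison keeps descending.
    try:
        different = (a == b)
    except RecursionError:
        different = "RecursionError"
    return same, different


result = compare_recursive_lists()
```

On Python 3 the exception is spelled RecursionError, a subclass of the
RuntimeError shown above.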

Cheers,
mwh

-- 
  First of all, email me your AOL password as a security measure. You
  may find that won't be able to connect to the 'net for a while. This
  is normal. The next thing to do is turn your computer upside down
  and shake it to reboot it.                     -- Darren Tucker, asr

From kbk at shore.net  Tue Nov  8 03:29:49 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Mon, 7 Nov 2005 21:29:49 -0500 (EST)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200511080229.jA82TnHG017341@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  365 open ( +5) /  2961 closed ( +5) /  3326 total (+10)
Bugs    :  904 open (+11) /  5367 closed (+14) /  6271 total (+25)
RFE     :  200 open ( +1) /   189 closed ( +0) /   389 total ( +1)

New / Reopened Patches
______________________

new function: os.path.relpath  (2005-10-27)
       http://python.org/sf/1339796  opened by  Richard Barran

commands.getstatusoutput()  (2005-11-02)
       http://python.org/sf/1346211  opened by  Dawa Lama 

Better dead code elimination for the AST compiler  (2005-11-02)
       http://python.org/sf/1346214  opened by  Rune Holm

A constant folding optimization pass for the AST  (2005-11-02)
       http://python.org/sf/1346238  opened by  Rune Holm

Remove inconsistent behavior between import and zipimport  (2005-11-03)
       http://python.org/sf/1346572  opened by  Osvaldo Santana Neto

Patch f. bug 495682 cannot handle http_proxy with user:pass@  (2005-11-05)
CLOSED http://python.org/sf/1349117  opened by  Johannes Nicolai

Patch f. bug 495682 cannot handle http_proxy with user:pass@  (2005-11-05)
       http://python.org/sf/1349118  opened by  Johannes Nicolai

[PATCH] 100x optimization for ngettext  (2005-11-06)
       http://python.org/sf/1349274  opened by  Joe Wreschnig

Fix for signal related abort in Visual Studio 2005  (2005-11-07)
       http://python.org/sf/1350409  opened by  Adal Chiriliuc

Redundant connect() call in logging.handlers.SysLogHandler  (2005-11-07)
       http://python.org/sf/1350658  opened by  Ken Lalonde

Patches Closed
______________

tarfile.py: fix for bug #1336623  (2005-10-26)
       http://python.org/sf/1338314  closed by  nnorwitz

Python 2.4.2 doesn't build with "--without-threads"  (2005-10-22)
       http://python.org/sf/1335054  closed by  nnorwitz

Speedup PyUnicode_DecodeCharmap  (2005-10-05)
       http://python.org/sf/1313939  closed by  lemburg

Allow use of non-latin1 chars in interactive shell  (2005-10-21)
       http://python.org/sf/1333679  closed by  loewis

Patch f. bug 495682 cannot handle http_proxy with user:pass@  (2005-11-05)
       http://python.org/sf/1349117  closed by  birkenfeld

New / Reopened Bugs
___________________

CVS webbrowser.py (1.40) bugs  (2005-10-27)
CLOSED http://python.org/sf/1339806  opened by  Greg Couch

TAB SETTINGS DONT WORK (win)  (2005-10-27)
       http://python.org/sf/1339883  opened by  reson5

time.strptime() with bad % code throws bad exception  (2005-10-27)
CLOSED http://python.org/sf/1340337  opened by  Steve R. Hastings

mmap does not accept length as 0  (2005-10-28)
       http://python.org/sf/1341031  opened by  liturgist

"\n" is incorrectly represented  (2005-10-30)
CLOSED http://python.org/sf/1341934  opened by  Pavel

Tkinter.Menu.delete doesn't delete command of entry  (2005-10-30)
       http://python.org/sf/1342811  opened by  Sverker Nilsson

Broken docs for os.removedirs  (2005-10-31)
       http://python.org/sf/1343671  opened by  David Kågedal

UNIX mmap leaks file descriptors  (2005-11-01)
CLOSED http://python.org/sf/1344508  opened by  Erwin S. Andreasen

colorsys tests, bug in frange  (2005-11-01)
CLOSED http://python.org/sf/1345263  opened by  Rune Holm

Python 2.4 and 2.3.5 won't build on OpenBSD 3.7  (2005-11-01)
       http://python.org/sf/1345313  opened by  Dan

doc typo  (2005-11-02)
CLOSED http://python.org/sf/1346026  opened by  Keith Briggs

Segfaults from unaligned loads in floatobject.c  (2005-11-02)
       http://python.org/sf/1346144  opened by  Rune Holm

Missing import in documentation  (2005-11-03)
CLOSED http://python.org/sf/1346395  opened by  Aggelos Orfanakos

selectmodule.c calls PyInt_AsLong without error checking  (2005-11-03)
CLOSED http://python.org/sf/1346533  opened by  Luke

_subprocess.c calls PyInt_AsLong without error checking  (2005-11-03)
       http://python.org/sf/1346547  opened by  Luke

httplib simply ignores CONTINUE  (2005-11-03)
       http://python.org/sf/1346874  opened by  Mike Looijmans

FeedParser does not comply with RFC2822  (2005-11-04)
       http://python.org/sf/1347874  opened by  Julian Phillips

pydoc seems to run some scripts!  (2005-11-04)
       http://python.org/sf/1348477  opened by  Olly Betts

email.Generators does not separates headers with "\r\n"  (2005-11-05)
       http://python.org/sf/1349106  opened by  Manlio Perillo

xmlrpclib does not use http proxy  (2005-11-06)
       http://python.org/sf/1349316  opened by  greatred

urllib.urlencode provides two features in one param  (2005-11-06)
       http://python.org/sf/1349732  opened by  Ori Avtalion

urllib2 blocked from news.google.com  (2005-11-07)
CLOSED http://python.org/sf/1349977  opened by  Michael Hoisie

built-in method .__cmp__  (2005-11-07)
       http://python.org/sf/1350060  opened by  Armin Rigo

"setdlopenflags" leads to crash upon "import"  (2005-11-07)
       http://python.org/sf/1350188  opened by  John Pye

CVS migration not in www.python.org docs  (2005-11-07)
       http://python.org/sf/1350498  opened by  Jim Jewett

zlib.crc32 doesn't handle 0xffffffff seed  (2005-11-07)
       http://python.org/sf/1350573  opened by  Danny Yoo

Bugs Closed
___________

CVS webbrowser.py (1.40) bugs  (2005-10-27)
       http://python.org/sf/1339806  deleted by  gregcouch

Memory keeping  (2005-10-26)
       http://python.org/sf/1338264  closed by  tim_one

tarfile can't extract some tar archives..  (2005-10-24)
       http://python.org/sf/1336623  closed by  nnorwitz

time.strptime() with bad % code throws bad exception  (2005-10-27)
       http://python.org/sf/1340337  closed by  bcannon

_socket module not build under cygwin  (2005-09-22)
       http://python.org/sf/1298709  closed by  jlt63

"\n" is incorrectly represented  (2005-10-30)
       http://python.org/sf/1341934  closed by  perky

pydoc HTTP reload failure  (2001-04-21)
       http://python.org/sf/417833  closed by  ping

UNIX mmap leaks file descriptors  (2005-10-31)
       http://python.org/sf/1344508  closed by  nnorwitz

colorsys tests, bug in frange  (2005-11-01)
       http://python.org/sf/1345263  closed by  nnorwitz

Python.h should include system headers properly [POSIX]  (2005-10-25)
       http://python.org/sf/1337400  closed by  loewis

doc typo  (2005-11-02)
       http://python.org/sf/1346026  closed by  nnorwitz

Missing import in documentation  (2005-11-02)
       http://python.org/sf/1346395  closed by  bcannon

selectmodule.c calls PyInt_AsLong without error checking  (2005-11-02)
       http://python.org/sf/1346533  closed by  nnorwitz

pydoc ignores $PAGER if TERM='dumb'  (2002-12-09)
       http://python.org/sf/651124  closed by  ping

urllib2 blocked from news.google.com  (2005-11-06)
       http://python.org/sf/1349977  closed by  bcannon

New / Reopened RFE
__________________

please support the free visual studio sdk compiler  (2005-11-04)
       http://python.org/sf/1348719  opened by  David McNab


From ronaldoussoren at mac.com  Tue Nov  8 08:37:16 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Tue, 8 Nov 2005 08:37:16 +0100
Subject: [Python-Dev] Should the default equality operator
	compare values instead of identities?
In-Reply-To: <436FD672.5070807@v.loewis.de>
References: <20051105151436.BFF7.JCARLSON@uci.edu>
	<b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com>
	<20051105162001.C007.JCARLSON@uci.edu>
	<740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com>
	<436FD672.5070807@v.loewis.de>
Message-ID: <994C0B10-7382-4492-9EDE-1AE47BF7FA32@mac.com>


On 7-nov-2005, at 23:34, Martin v. Löwis wrote:

> Björn Lindqvist wrote:
>> How would the value equality operator deal with recursive objects?
>>
>> class Foo:
>>     def __init__(self):
>>         self.foo = self
>>
>> Seems to me that it would take at least some special-casing to get
>> Foo() == Foo() to evaluate to True in this case...
>
> This is sort-of supported today:

But only for lists ;-)

 >>> a = {}
 >>> a[1] = a
 >>>
 >>> b = {}
 >>> b[1] = b
 >>>
 >>> a == b
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
RuntimeError: maximum recursion depth exceeded in cmp
 >>>

>
>  >>> a=[]
>  >>> a.append(a)
>  >>> b=[]
>  >>> b.append(b)
>  >>> a == b
> True
>
> Regards,
> Martin


From winlinchu at yahoo.it  Tue Nov  8 10:16:06 2005
From: winlinchu at yahoo.it (winlinchu)
Date: Tue, 8 Nov 2005 10:16:06 +0100 (CET)
Subject: [Python-Dev] Unifying decimal numbers.
Message-ID: <20051108091607.13272.qmail@web26010.mail.ukl.yahoo.com>

Python has now unified ints and long ints.
For Python 3k, a "Decimal" type (yes, Decimal, the
library module!) could be introduced in place of the
current float object. Of course, the Decimal type would
be rewritten in C.

Thanks.


		

From support at intercable.ru  Tue Nov  8 08:59:42 2005
From: support at intercable.ru (Technical Support of Intercable Co)
Date: Tue, 08 Nov 2005 10:59:42 +0300
Subject: [Python-Dev]  For Python 3k, drop default/implicit hash,
	and comparison
Message-ID: <43705AEE.2050600@intercable.ru>

Why can't 'identity' objects define:
    def __getKey__(self):
         return Key(self, id(self))
Then they would act as usual, while value objects can define
    def __getKey__(self):
         return Key(self, self.i, self.j, self.a[1])

(Key is an abstraction to handle subclassing)

Of course, there should be a way to handle comparison outside the class 
hierarchy (I think).
Today one can write:
 >>> class Boo(object):
    def __init__(self,s=""):
       self.s=s
    def __hash__(self):
       return hash(self.s)
    def __cmp__(self,other):
       if type(self)==type(other):
          return cmp(self.s,other.s)
       if type(other)==str:
          return cmp(self.s,other)
 >>> a={}
 >>> a['s']=1
 >>> a[Boo('s')]
1
 >>> a[Boo('z')]=2
 >>> a['z']
2
It is confusing and hardly useful, but possible.
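
A rough Python 3 sketch of the same trick, assuming __eq__ stands in for
__cmp__ (which Python 3 no longer consults). The class mirrors the Boo
example above; the variable names table, found and also_found are made up
for illustration:

```python
class Boo:
    """Value object that hashes and compares like the string it wraps."""

    def __init__(self, s=""):
        self.s = s

    def __hash__(self):
        # Same hash as the wrapped string, so both land in the same bucket.
        return hash(self.s)

    def __eq__(self, other):
        if isinstance(other, Boo):
            return self.s == other.s
        if isinstance(other, str):
            return self.s == other
        # Let Python try the other operand or fall back to identity.
        return NotImplemented


table = {"s": 1}
found = table[Boo("s")]      # looked up by a Boo, stored under a str
table[Boo("z")] = 2
also_found = table["z"]      # looked up by a str, stored under a Boo
```

Returning NotImplemented for unrelated types avoids the silent fall-through
that the __cmp__ version has when `other` is neither a Boo nor a str.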

Excuse my English.


From jcarlson at uci.edu  Tue Nov  8 19:26:16 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 08 Nov 2005 10:26:16 -0800
Subject: [Python-Dev] Unifying decimal numbers.
In-Reply-To: <20051108091607.13272.qmail@web26010.mail.ukl.yahoo.com>
References: <20051108091607.13272.qmail@web26010.mail.ukl.yahoo.com>
Message-ID: <20051108101410.C050.JCARLSON@uci.edu>


winlinchu <winlinchu at yahoo.it> wrote:
> Now, Python unified ints and long ints.
> For Python 3k, could be introduced a "Decimal" type
> (yes, Decimal, the library module!) in place of the
> actual float object. Of course, the Decimal type would
> be rewritten in C.

There is code which relies on standard IEEE 754 floating point math
(speed, behavior, rounding, etc.) that would break with the replacement
of floats with decimals.  Further, even if it were to be converted to C,
it would likely be hundreds of times slower than the processor-native
float operations.
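
A small sketch of the behavioural difference, using only the standard
decimal module; the variable names are illustrative. Code written against
binary floats would observe different results if literals silently became
decimals:

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 are not exactly representable.
float_sum = 0.1 + 0.2
float_exact = (float_sum == 0.3)

# Decimal arithmetic: the same literals are represented exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_exact = (decimal_sum == Decimal("0.3"))
```

Here float_exact is False and decimal_exact is True, which is exactly the
kind of change that existing IEEE 754 code would not expect.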

This discussion has been had before; search for:
    site:mail.python.org decimal replace float python
in Google to find such discussions.  For example:
    http://mail.python.org/pipermail/python-dev/2005-June/054416.html


 - Josiah


From mdehoon at c2b2.columbia.edu  Tue Nov  8 20:46:48 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Tue, 08 Nov 2005 14:46:48 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
Message-ID: <437100A7.5050907@c2b2.columbia.edu>

Dear Pythoneers,

I use Python heavily for scientific computing, originally for 
computational physics and nowadays for computational biology. One of the 
extension modules I support is pygist, a module for scientific 
visualization. For this (and similar) packages to work, it is important 
to have an event loop in Python.

Currently, event loops are available in Python via PyOS_InputHook, a 
pointer to a user-defined function that is called when Python is idle 
(waiting for user input). However, an event loop using PyOS_InputHook 
has some inherent limitations, so I am thinking about how to improve 
event loop support in Python.

As an example, consider the current implementation of Tkinter. What's 
nice about it is that events as well as user-typed Python commands are 
handled without having to call mainloop() explicitly (except on some 
platforms):
"import Tkinter; Tkinter.Tk()" causes a Tk window to pop up that remains 
responsive throughout. It works as follows (using Tkinter as an example; 
pygist works essentially the same):

1) Importing Tkinter causes PyOS_InputHook to be set to the EventHook 
function in _tkinter.c.
2) Before Python calls fgets to read the next Python command typed by 
the user, it checks PyOS_InputHook and calls it if it is not NULL.
3) The EventHook function in _tkinter runs the following loop:
     - Check if user input is present; if so, exit the loop
     - Handle a Tcl/Tk event, if present
     - Sleep for 20 milliseconds
4) Once the EventHook function returns, Python continues to read the 
next user command. After executing the command, return to 2).
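
The loop in step 3 can be sketched in Python as follows. This is an
illustration of the polling pattern only, not the C code in _tkinter.c; the
names event_hook, input_ready and handle_event are hypothetical stand-ins
for the checks described above:

```python
import time


def event_hook(input_ready, handle_event, sleep=time.sleep, interval=0.02):
    """Poll until user input is available, servicing events in between.

    input_ready  -- callable returning True once input is waiting
    handle_event -- callable that handles one pending event, returning
                    True if an event was actually handled
    Returns the number of events handled before input arrived.
    """
    handled = 0
    while not input_ready():      # exit the loop when input is present
        if handle_event():        # handle one event, if any
            handled += 1
        sleep(interval)           # the 20 ms sleep between polls
    return handled


# Usage with fake callables: input "arrives" on the fourth poll.
polls = iter([False, False, False, True])
naps = []
count = event_hook(lambda: next(polls), lambda: True, sleep=naps.append)
```

The sleep between polls is what makes this a busy-wait loop rather than a
blocking wait on the input and event sources.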

However, this implementation has the following problems:
1) Essentially, the event loop is a busy-wait loop with a 20 ms sleep in 
between. An event loop using select() (or equivalent on Windows) will 
give better performance.
2) Since this event loop runs inside Tkinter, there is no way for other 
extension modules to get their messages handled. Hence, we cannot have 
more than one extension module that needs an event loop. As an example, 
it would be nice to have a Tkinter GUI to steer a simulation and a 
(non-Tk) graphics output window to visualize the simulation.
3) Whereas PyOS_InputHook is called when Python is waiting for user 
input, it is not called when Python is waiting for anything else, for 
example one thread waiting for another. For example, IDLE uses two 
threads, one handling the GUI and one handling the user commands. When 
the second thread is waiting for the first thread (when waiting for user 
input to become available), PyOS_InputHook is not being called, and no 
Tkinter events are being handled. Hence, "import Tkinter; Tkinter.Tk()" 
does nothing when executed from an IDLE window. Which means that our 
scientific visualization software can only be run from Python started 
from the command line, whereas many users (especially on Windows) will 
want to use IDLE.

Now the problem I'm facing is that because of its integration with Tcl, 
this cannot be fixed easily with Tkinter as the GUI toolkit for Python. 
If the events to be handled were purely graphical events (move a window, 
repaint a window, etc.), there would be no harm in handling these events 
when waiting for e.g. another thread. With Tkinter, however, we cannot 
enter EventHook while waiting for another thread:
a) because EventHook only returns if user input is available (it doesn't 
wait for threads);
b) because EventHook also runs Tcl/Tk commands, and we wouldn't want to 
run some Tcl commands in some thread while waiting for another thread.

Therefore, as far as I can tell, there is no way to set up a true event 
loop in Python that works nicely with Tkinter, and there is no way to 
have an event loop function from IDLE.

So I'd like to ask the following questions:
1) Did I miss something? Is there some way to get an event loop with 
Tkinter?
2) Will Tkinter always be the standard toolkit for Python, or are there 
plans to replace it at some point?

I realize that Tkinter has been an important part of Python for some 
time now, and I don't expect it to be ditched just because of my event 
loop problems. At the same time, event loop support could use some 
improvement, so I'd like to call your attention to this issue. Tcl 
actually has event loops implemented quite nicely, and may serve as an 
example of how event loops may work in Python.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From janssen at parc.com  Tue Nov  8 21:32:59 2005
From: janssen at parc.com (Bill Janssen)
Date: Tue, 8 Nov 2005 12:32:59 PST
Subject: [Python-Dev] Unifying decimal numbers.
In-Reply-To: Your message of "Tue, 08 Nov 2005 10:26:16 PST."
	<20051108101410.C050.JCARLSON@uci.edu> 
Message-ID: <05Nov8.123259pst."58633"@synergy1.parc.xerox.com>

Might be more interesting to think about replacing ints and Decimal
with an implicit-denominator rational type.  In the HTTP-NG typing
proposal, we called this a "fixed-point" type.  See Section 4.5.1 of
http://www.w3.org/Protocols/HTTP-NG/1998/08/draft-frystyk-httpng-arch-00.txt
for details.

The current notion of "int" would be defined as a specific kind of
fixed-point type (a denominator of 1), but other fixed-point types
such as dollars (denominator of 100) or dozens (denominator of 1/12)
could also be defined.  The nice thing about type systems like this is
that they can accurately describe non-binary values, like 1/3.
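
A toy sketch of the idea, assuming a value is stored as an integer count
together with a denominator fixed by the type. The class name FixedPoint and
its methods are hypothetical and not taken from the HTTP-NG draft:

```python
from fractions import Fraction


class FixedPoint:
    """An integer count interpreted as count / denominator."""

    def __init__(self, count, denominator):
        self.count = int(count)
        self.denominator = Fraction(denominator)

    def value(self):
        """Exact rational value represented by this instance."""
        return Fraction(self.count) / self.denominator

    def __add__(self, other):
        # Only values of the same fixed-point type can be added directly.
        if self.denominator != other.denominator:
            raise TypeError("mismatched fixed-point types")
        return FixedPoint(self.count + other.count, self.denominator)


plain_int = FixedPoint(7, 1)                  # denominator of 1
dollars = FixedPoint(1999, 100)               # 19.99 in cents
dozens = FixedPoint(3, Fraction(1, 12))       # three dozen
one_third = FixedPoint(1, 3)                  # exactly 1/3
total = dollars + FixedPoint(1, 100)
```

Because the denominator belongs to the type rather than to each value,
addition stays a plain integer operation on the counts.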

Bill

From martin at v.loewis.de  Tue Nov  8 21:37:41 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 08 Nov 2005 21:37:41 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <437100A7.5050907@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>
Message-ID: <43710C95.30209@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
> 1) Did I miss something? Is there some way to get an event loop with 
> Tkinter?

Yes, and yes. You are missing multi-threading, which is the widely used
approach to doing things simultaneously in a single process. In one
thread, user interaction can occur; in another, computation. If you need
non-blocking interaction between the threads, use queues, or other
global variables. If you have other event sources, deal with them
in separate threads.
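
A minimal sketch of that arrangement, assuming a worker thread that does the
computation and a main thread that polls a queue without blocking. The
function run_with_worker and the squaring "computation" are placeholders:

```python
import queue
import threading
import time


def run_with_worker(jobs):
    """Compute squares in a worker thread; collect them in the caller."""
    results = queue.Queue()
    done = object()               # sentinel marking the end of the results

    def worker():
        for job in jobs:
            results.put(job * job)
        results.put(done)

    thread = threading.Thread(target=worker)
    thread.start()

    collected = []
    while True:
        try:
            item = results.get_nowait()   # never blocks the polling thread
        except queue.Empty:
            time.sleep(0.001)             # a GUI thread would handle events here
            continue
        if item is done:
            break
        collected.append(item)

    thread.join()
    return collected


squares = run_with_worker([1, 2, 3])
```

In a GUI program the polling loop would be driven by the toolkit's own timer
or idle callback rather than by an explicit sleep.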

Yes, it is possible to get event loops with Tkinter. At least on Unix,
you can install a file handler into the Tk event loop (through
createfilehandler), which gives you callbacks whenever there is some
activity on the files.

Furthermore, it is possible to turn the event loop around, by doing
dooneevent explicitly.

In principle, it would also be possible to expose Tcl events and
notifications in Tkinter (i.e. the
Tcl_CreateEventSource/Tcl_WaitForEvent family of APIs). If you think
this would help in your case, then contributions are welcome.

> 2) Will Tkinter always be the standard toolkit for Python, or are there 
> plans to replace it at some point?

Python does not evolve along a grand master plan. Instead, individual
contributors propose specific modifications, e.g. through PEPs.

I personally have no plan to replace Tkinter.

Regards,
Martin

From goodger at python.org  Wed Nov  9 00:54:41 2005
From: goodger at python.org (David Goodger)
Date: Tue, 08 Nov 2005 18:54:41 -0500
Subject: [Python-Dev] PEP submission broken?
In-Reply-To: <43695FE5.1080803__36597.7150541314$1131334275$gmane$org@nowhere.org>
References: <43695FE5.1080803__36597.7150541314$1131334275$gmane$org@nowhere.org>
Message-ID: <43713AC1.5060803@python.org>

[Bryan Olson]
> Though I tried to submit a (pre-) PEP in the proper form through the
> proper channels, it has disappeared into the ether.

Indeed, it has; I can't find it in my mailbox.  Could you re-send the
latest text?  I'll review it right away.

> From what I can tell, We need to address fixing the
> PEP process before there is any point in working on PEP's,

Email is imperfect; just send it again.

And "fakeaddress at nowhere.org" doesn't help ;-)

--
David Goodger <http://python.net/~goodger>

From goodger at python.org  Wed Nov  9 00:55:35 2005
From: goodger at python.org (David Goodger)
Date: Tue, 08 Nov 2005 18:55:35 -0500
Subject: [Python-Dev] PEP submission broken?
In-Reply-To: <436F1F49.90606@gmail.com>
References: <43695FE5.1080803@nowhere.org> <436F1F49.90606@gmail.com>
Message-ID: <43713AF7.1080600@python.org>

[Nick Coghlan]
> Would it be worth having a PEP category on the RFE tracker, rather than
> submitting pre-PEP's directly to the PEP editors?

Couldn't hurt.

--
David Goodger <http://python.net/~goodger>

From osantana at gmail.com  Wed Nov  9 03:33:47 2005
From: osantana at gmail.com (Osvaldo Santana Neto)
Date: Tue, 8 Nov 2005 23:33:47 -0300
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
Message-ID: <20051109023347.GA15823@localhost.localdomain>

Hi,

I'm working on a port of Python[1] to the Maemo Platform[2] and I've found an
inconsistent behavior between zipimport and the import hook with '.pyc' and
'.pyo' files. The shell session below shows this problem using
'module_c.pyc', 'module_o.pyo' and 'modules.zip' (with module_c and
module_o inside):

$ ls
module_c.pyc  module_o.pyo  modules.zip

$ python
>>> import module_c
>>> import module_o
ImportError: No module named module_o

$ python -O
>>> import module_c
ImportError: No module named module_c
>>> import module_o

$ rm *.pyc *.pyo
$ PYTHONPATH=modules.zip python
>>> import module_c
module_c
>>> import module_o
module_o

$ PYTHONPATH=modules.zip python -O
>>> import module_c
module_c
>>> import module_o
module_o

I've created a patch suggestion to remove this inconsistency[3] (*I* think
zipimport behaviour is better).

[1] http://pymaemo.sf.net/
[2] http://www.maemo.org/
[3] http://python.org/sf/1346572

-- 
Osvaldo Santana Neto (aCiDBaSe)
icq, url = (11287184, "http://www.pythonbrasil.com.br")

From guido at python.org  Wed Nov  9 04:14:51 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 Nov 2005 19:14:51 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <20051109023347.GA15823@localhost.localdomain>
References: <20051109023347.GA15823@localhost.localdomain>
Message-ID: <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>

You didn't show us what's in the zip file.  Can you show a zipinfo output?

My intention with import was always that without -O, *.pyo files are
entirely ignored; and with -O, *.pyc files are entirely ignored.

It sounds like you're saying that you want to change this so that .pyc
and .pyo are always honored (with .pyc preferred if -O is not present
and .pyo preferred if -O is present). I'm not sure that I like that
better. If that's how zipimport works, I think it's broken!
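
The two policies can be written down as a small sketch. The function
bytecode_suffixes is a hypothetical helper for illustration and is not part
of the import machinery:

```python
def bytecode_suffixes(optimize, honor_both=False):
    """Return the bytecode suffixes to try, in order of preference.

    optimize   -- True when the interpreter runs with -O
    honor_both -- False for the intended import rule (the other suffix is
                  ignored entirely); True for the behaviour reported for
                  zipimport (the other suffix is a fallback)
    """
    preferred, other = (".pyo", ".pyc") if optimize else (".pyc", ".pyo")
    return [preferred, other] if honor_both else [preferred]


intended_plain = bytecode_suffixes(optimize=False)
intended_opt = bytecode_suffixes(optimize=True)
reported_zip = bytecode_suffixes(optimize=False, honor_both=True)
```

Under the intended rule a module shipped only as .pyo is invisible without
-O, which matches the ImportError in the report.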

--Guido

On 11/8/05, Osvaldo Santana Neto <osantana at gmail.com> wrote:
> Hi,
>
> I'm working on Python[1] port for Maemo Platform[2] and I've found a
> inconsistent behavior in zipimport and import hook with '.pyc' and
> '.pyo' files. The shell section below show this problem using a
> 'module_c.pyc', 'module_o.pyo' and 'modules.zip' (with module_c and
> module_o inside):
>
> $ ls
> module_c.pyc  module_o.pyo  modules.zip
>
> $ python
> >>> import module_c
> >>> import module_o
> ImportError: No module named module_o
>
> $ python -O
> >>> import module_c
> ImportError: No module named module_c
> >>> import module_o
>
> $ rm *.pyc *.pyo
> $ PYTHONPATH=modules.zip python
> >>> import module_c
> module_c
> >>> import module_o
> module_o
>
> $ PYTHONPATH=modules.zip python -O
> >>> import module_c
> module_c
> >>> import module_o
> module_o
>
> I've create a patch suggestion to remove this inconsistency[3] (*I* think
> zipimport behaviour is better).
>
> [1] http://pymaemo.sf.net/
> [2] http://www.maemo.org/
> [3] http://python.org/sf/1346572
>
> --
> Osvaldo Santana Neto (aCiDBaSe)
> icq, url = (11287184, "http://www.pythonbrasil.com.br")


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gjc at inescporto.pt  Wed Nov  9 12:40:25 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Wed, 09 Nov 2005 11:40:25 +0000
Subject: [Python-Dev] Weak references: dereference notification
Message-ID: <1131536425.9130.10.camel@localhost>

  Hello,

  I have come across a situation where I find the current weak
references interface for extension types insufficient.

  Currently you only have a tp_weaklistoffset slot, pointing to a
PyObject with weak references.  However, in my case[1] I _really_ need
to be notified when a weak reference is dereferenced.  What happens now
is that, when you call a weakref object, a simple Py_INCREF is done on
the referenced object.  It would be easy to implement a new slot to
contain a function that should be called when a weak reference is
dereferenced.  Or, alternatively, a slot or class attribute that
indicates an alternative type that should be used to create weak
references: instead of the builtin weakref object, a subtype of it, so
you can override tp_call.
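
  At the Python level, the second alternative resembles subclassing
weakref.ref and overriding __call__. The sketch below is only an analogy
for the proposed C-level slot; NotifyingRef and on_deref are hypothetical
names, and the construction follows the pattern used for ref subclasses in
the standard weakref module:

```python
import weakref


class NotifyingRef(weakref.ref):
    """Weak reference that reports each successful dereference."""

    __slots__ = ("on_deref",)

    def __new__(cls, ob, on_deref, callback=None):
        self = super().__new__(cls, ob, callback)
        self.on_deref = on_deref
        return self

    def __init__(self, ob, on_deref, callback=None):
        super().__init__(ob, callback)

    def __call__(self):
        ob = super().__call__()   # strong reference, or None if dead
        if ob is not None:
            self.on_deref(ob)     # notification hook
        return ob


class Thing:
    pass


thing = Thing()
notified = []
ref = NotifyingRef(thing, lambda ob: notified.append(type(ob).__name__))
target = ref()
```

  What the sketch cannot show is how an extension type would tell the
interpreter to create such a subtype by default; that part is the proposal.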

  Does this sound acceptable?
 
  Regards.

[1] http://bugzilla.gnome.org/show_bug.cgi?id=320428

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.


From osantana at gmail.com  Wed Nov  9 15:33:02 2005
From: osantana at gmail.com (Osvaldo Santana)
Date: Wed, 9 Nov 2005 12:33:02 -0200
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
Message-ID: <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>

On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> You didn't show us what's in the zip file.  Can you show a zipinfo output?

$ zipinfo modules.zip
Archive:  modules.zip   426 bytes   2 files
-rw-r--r--  2.3 unx      109 bx defN 31-Oct-05 14:49 module_o.pyo
-rw-r--r--  2.3 unx      109 bx defN 31-Oct-05 14:48 module_c.pyc
2 files, 218 bytes uncompressed, 136 bytes compressed:  37.6%

> My intention with import was always that without -O, *.pyo files are
> entirely ignored; and with -O, *.pyc files are entirely ignored.
>
> It sounds like you're saying that you want to change this so that .pyc
> and .pyo are always honored (with .pyc preferred if -O is not present
> and .pyo preferred if -O is present). I'm not sure that I like that
> better. If that's how zipimport works, I think it's broken!

Yes, this is how zipimport works and I think this is good in cases
where a third-party binary module/package is available only with .pyo
files and others only with .pyc files (without .py source files, of
course).
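
The two lookup policies being compared can be sketched roughly as below.
This is only an illustration of the rules as stated in this thread, not
the actual import or zipimport code; the function names and the `strict`
flag are made up for the example.

```python
def bytecode_suffixes(optimize, strict):
    """Return the bytecode suffixes to try, in order of preference.

    strict=True models the intended rule (only one kind is honored);
    strict=False models the behaviour reported here for zipimport.
    Both are assumptions drawn from the discussion, not from the source.
    """
    preferred, other = (".pyo", ".pyc") if optimize else (".pyc", ".pyo")
    return [preferred] if strict else [preferred, other]


def pick_bytecode(names, module, optimize, strict):
    """Return the first matching file name for `module`, or None."""
    for suffix in bytecode_suffixes(optimize, strict):
        candidate = module + suffix
        if candidate in names:
            return candidate
    return None


archive = {"module_o.pyo", "module_c.pyc"}  # names from the zipinfo listing
strict_choice = pick_bytecode(archive, "module_o", optimize=False, strict=True)
lenient_choice = pick_bytecode(archive, "module_o", optimize=False, strict=False)
print(strict_choice)   # None: without -O the .pyo file is ignored
print(lenient_choice)  # module_o.pyo: the fallback suffix is honored
```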

I know we can rename the files, but is this a good solution? Well, I
don't have a strong opinion about the solution adopted and I would
really like to see other alternatives and opinions.

Thanks,
Osvaldo

--
Osvaldo Santana Neto (aCiDBaSe)
icq, url = (11287184, "http://www.pythonbrasil.com.br")

From guido at python.org  Wed Nov  9 16:39:29 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 07:39:29 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
Message-ID: <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>

Maybe it makes more sense to deprecate .pyo altogether and instead
have a post-load optimizer optimize .pyc files according to the
current optimization settings?

Unless others are interested in this nothing will happen.

I've never heard of a third party making their code available only as
.pyo, so the use case for changing things isn't very strong. In fact
the only use cases I know for not making .py available are in
situations where a proprietary "canned" application is distributed to
end users who have no intention or need to ever add to the code.

--Guido

On 11/9/05, Osvaldo Santana <osantana at gmail.com> wrote:
> On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> > You didn't show us what's in the zip file.  Can you show a zipinfo output?
>
> $ zipinfo modules.zip
> Archive:  modules.zip   426 bytes   2 files
> -rw-r--r--  2.3 unx      109 bx defN 31-Oct-05 14:49 module_o.pyo
> -rw-r--r--  2.3 unx      109 bx defN 31-Oct-05 14:48 module_c.pyc
> 2 files, 218 bytes uncompressed, 136 bytes compressed:  37.6%
>
> > My intention with import was always that without -O, *.pyo files are
> > entirely ignored; and with -O, *.pyc files are entirely ignored.
> >
> > It sounds like you're saying that you want to change this so that .pyc
> > and .pyo are always honored (with .pyc preferred if -O is not present
> > and .pyo preferred if -O is present). I'm not sure that I like that
> > better. If that's how zipimport works, I think it's broken!
>
> Yes, this is how zipimport works and I think this is good in cases
> where a third-party binary module/package is available only with .pyo
> files and others only with .pyc files (without .py source files, of
> course).
>
> I know we can rename the files, but is this a good solution? Well, I
> don't have a strong opinion about the solution adopted and I would
> really like to see other alternatives and opinions.
>
> Thanks,
> Osvaldo
>
> --
> Osvaldo Santana Neto (aCiDBaSe)
> icq, url = (11287184, "http://www.pythonbrasil.com.br")
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim at zope.com  Wed Nov  9 17:50:44 2005
From: jim at zope.com (Jim Fulton)
Date: Wed, 09 Nov 2005 11:50:44 -0500
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131536425.9130.10.camel@localhost>
References: <1131536425.9130.10.camel@localhost>
Message-ID: <437228E4.4070800@zope.com>

Gustavo J. A. M. Carneiro wrote:
>   Hello,
> 
>   I have come across a situation where I find the current weak
> references interface for extension types insufficient.
> 
>   Currently you only have a tp_weaklistoffset slot, pointing to a
> PyObject with weak references.  However, in my case[1] I _really_ need
> to be notified when a weak reference is dereferenced.  What happens now
> is that, when you call a weakref object, a simple Py_INCREF is done on
> the referenced object.  It would be easy to implement a new slot to
> contain a function that should be called when a weak reference is
> dereferenced.  Or, alternatively, a slot or class attribute that
> indicates an alternative type that should be used to create weak
> references: instead of the builtin weakref object, a subtype of it, so
> you can override tp_call.
> 
>   Does this sound acceptable?

Since you can now (as of 2.4) subclass the weakref.ref class, you should be able to
do this yourself in Python.  See for example, weakref.KeyedRef.
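
A minimal sketch of that suggestion, written for current Python 3 rather
than 2.4: a weakref.ref subclass that calls a hook each time it is
dereferenced. The names NotifyingRef and on_deref are hypothetical, and
the "freed immediately after del" behaviour relies on CPython reference
counting.

```python
import weakref


class NotifyingRef(weakref.ref):
    """Hypothetical weak reference that reports every live dereference."""

    __slots__ = ("on_deref",)

    def __new__(cls, obj, on_deref, callback=None):
        # weakref.ref.__new__ only accepts (obj, callback), so the extra
        # argument has to be filtered out here, as weakref.KeyedRef does.
        return super().__new__(cls, obj, callback)

    def __init__(self, obj, on_deref, callback=None):
        super().__init__(obj, callback)
        self.on_deref = on_deref

    def __call__(self):
        target = super().__call__()
        if target is not None:
            self.on_deref(target)  # notify only while the referent is alive
        return target


class Thing:
    pass


hits = []
thing = Thing()
ref = NotifyingRef(thing, lambda obj: hits.append(type(obj).__name__))
same = ref() is thing      # dereferencing triggers the hook once
del thing                  # CPython frees the object right away
dead = ref() is None       # no notification for a dead referent
```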

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From gjc at inescporto.pt  Wed Nov  9 18:14:59 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Wed, 09 Nov 2005 17:14:59 +0000
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <437228E4.4070800@zope.com>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
Message-ID: <1131556500.9130.18.camel@localhost>

On Wed, 2005-11-09 at 11:50 -0500, Jim Fulton wrote:
> Gustavo J. A. M. Carneiro wrote:
> >   Hello,
> > 
> >   I have come across a situation where I find the current weak
> > references interface for extension types insufficient.
> > 
> >   Currently you only have a tp_weaklistoffset slot, pointing to a
> > PyObject with weak references.  However, in my case[1] I _really_ need
> > to be notified when a weak reference is dereferenced.  What happens now
> > is that, when you call a weakref object, a simple Py_INCREF is done on
> > the referenced object.  It would be easy to implement a new slot to
> > contain a function that should be called when a weak reference is
> > dereferenced.  Or, alternatively, a slot or class attribute that
> > indicates an alternative type that should be used to create weak
> > references: instead of the builtin weakref object, a subtype of it, so
> > you can override tp_call.
> > 
> >   Does this sound acceptable?
> 
> Since you can now (as of 2.4) subclass the weakref.ref class, you should be able to
> do this yourself in Python.  See for example, weakref.KeyedRef.

 I know I can subclass it, but it doesn't change anything.  If people
keep writing code like weakref.ref(myobj) instead of myweakref(myobj),
it still won't work.

  I wouldn't want to have to teach users of the library that they need
to use an alternative type; that seldom works.

  Now, if there was a place in the type that contained information like 

	"for creating weak references of instances of this type, use this
weakref class"

and weakref.ref was smart enough to look up this type and use it, only
_then_ could it work.

  Thanks,

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.


From guido at python.org  Wed Nov  9 18:23:34 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 09:23:34 -0800
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131556500.9130.18.camel@localhost>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
Message-ID: <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>

> > Gustavo J. A. M. Carneiro wrote:
> > >   I have come across a situation where I find the current weak
> > > references interface for extension types insufficient.
> > >
> > >   Currently you only have a tp_weaklistoffset slot, pointing to a
> > > PyObject with weak references.  However, in my case[1] I _really_ need
> > > to be notified when a weak reference is dereferenced.

I find it a bit difficult to understand your use case from reading
through the bug discussion. Could you explain it here? If you can't
explain it you certainly won't get your problem solved! :-)

> > > What happens now
> > > is that, when you call a weakref object, a simple Py_INCREF is done on
> > > the referenced object.  It would be easy to implement a new slot to
> > > contain a function that should be called when a weak reference is
> > > dereferenced.  Or, alternatively, a slot or class attribute that
> > > indicates an alternative type that should be used to create weak
> > > references: instead of the builtin weakref object, a subtype of it, so
> > > you can override tp_call.
> > >
> > >   Does this sound acceptable?

[Jim Fulton]
> > Since you can now (as of 2.4) subclass the weakref.ref class, you should be able to
> > do this yourself in Python.  See for example, weakref.KeyedRef.
>
>  I know I can subclass it, but it doesn't change anything.  If people
> keep writing code like weakref.ref(myobj) instead of myweakref(myobj),
> it still won't work.
>
>   I wouldn't want to have to teach users of the library that they need
> to use an alternative type; that seldom works.
>
>   Now, if there was a place in the type that contained information like
>
>         "for creating weak references of instances of this type, use this
> weakref class"
>
> and weakref.ref was smart enough to lookup this type and use it, only
> _then_ it could work.

Looks like what you're looking for is a customizable factory function.
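
Such a factory could look roughly like the sketch below. Nothing like
this exists in the weakref module; the attribute name __weakref_type__
and the function make_ref are invented for illustration, and callers
would still have to use the factory instead of weakref.ref directly.

```python
import weakref


def make_ref(obj, callback=None):
    """Hypothetical factory: use the weakref class advertised by the type."""
    ref_type = getattr(type(obj), "__weakref_type__", weakref.ref)
    return ref_type(obj, callback)


class CountingRef(weakref.ref):
    """Example weakref subclass that counts dereferences."""

    calls = 0

    def __call__(self):
        type(self).calls += 1
        return super().__call__()


class Wrapper:
    __weakref_type__ = CountingRef  # the "advice" stored on the type


class Plain:
    pass


wrapper, plain = Wrapper(), Plain()
wrapper_ref = make_ref(wrapper)   # a CountingRef
plain_ref = make_ref(plain)       # an ordinary weakref.ref
wrapper_ref()
```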

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nico at tekNico.net  Wed Nov  9 17:24:01 2005
From: nico at tekNico.net (Nicola Larosa)
Date: Wed, 09 Nov 2005 17:24:01 +0100
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
Message-ID: <dkt7r2$amm$1@sea.gmane.org>

> Maybe it makes more sense to deprecate .pyo altogether and instead
> have a post-load optimizer optimize .pyc files according to the
> current optimization settings?

That would not be enough, because it would leave the docstrings in the .pyc
files.


> Unless others are interested in this nothing will happen.

The status quo is good enough, for "normal" imports. If zipimport works
differently, well, that's not nice.


> I've never heard of a third party making their code available only as
> .pyo,

*cough* Ahem, here we are (the firm I work for).


> so the use case for changing things isn't very strong. In fact
> the only use cases I know for not making .py available are in
> situations where a proprietary "canned" application is distributed to
> end users who have no intention or need to ever add to the code.

Well, exactly. :-)

-- 
Nicola Larosa - nico at tekNico.net

No inventions have really significantly eased the cognitive difficulty
of writing scalable concurrent applications and it is unlikely that any
will in the near term. [...] Most of all, threads do not help, in fact,
they make the problem worse in many cases. -- G. Lefkowitz, August 2005


From gjc at inescporto.pt  Wed Nov  9 18:52:19 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Wed, 09 Nov 2005 17:52:19 +0000
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
Message-ID: <1131558739.9130.40.camel@localhost>

On Wed, 2005-11-09 at 09:23 -0800, Guido van Rossum wrote:
> > > Gustavo J. A. M. Carneiro wrote:
> > > >   I have come across a situation where I find the current weak
> > > > references interface for extension types insufficient.
> > > >
> > > >   Currently you only have a tp_weaklistoffset slot, pointing to a
> > > > PyObject with weak references.  However, in my case[1] I _really_ need
> > > > to be notified when a weak reference is dereferenced.
> 
> I find reading through the bug discussion a bit difficult to
> understand your use case. Could you explain it here? If you can't
> explain it you certainly won't get your problem solved! :-)

  This is a typical PyObject wrapping C object (GObject) problem.  Both
PyObject and GObject have independent reference counts.  For each
GObject there is at most one PyObject wrapper.

  When the refcount on the wrapper drops to zero, tp_dealloc is called.
In tp_dealloc, if the GObject refcount is > 1, I do something
slightly evil: I 'resurrect' the PyObject (calling PyObject_Init), create
a weak reference to the GObject, and drop the "strong" reference.  I
call this a 'hibernation state'.


  Now the problem.  Suppose the user had a weak ref to the PyObject:

	1- At a certain point in time, when the wrapper is in hibernation state,
the user calls the weak ref;
	2- It gets a PyObject that contains a weak reference to the GObject;
	3- Now suppose whatever was holding the GObject ref drops its
reference, which was the last one, and the GObject dies;
	4- Now the user does something with the PyObject obtained through the
weakref -> invalid memory access.

  The cause for the problem is that between steps 2 and 3 the wrapper
needs to change the weak reference to the GObject to a strong one.
Unfortunately, I don't get any notification that 2 happened.

  BTW, I fixed this problem in the meantime with a bit more slightly
evil code.  I override tp_call of the standard weakref type :-P
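
  The race in steps 1-4 can be modelled in pure Python, as a rough
sketch only. The real code lives in C (tp_dealloc, PyObject_Init) and
resurrects the wrapper, which this model does not attempt; Native,
Wrapper, hibernate, wake and WakingRef are all hypothetical names, and
the immediate deallocation relies on CPython reference counting.

```python
import weakref


class Native:
    """Stand-in for the wrapped C object (a GObject in this thread)."""


class Wrapper:
    """Stand-in for the Python wrapper around a Native object."""

    def __init__(self, native):
        self._strong = native
        self._weak = None

    def hibernate(self):
        # Keep only a weak reference to the native object.
        self._weak = weakref.ref(self._strong)
        self._strong = None

    def wake(self):
        # Promote the weak reference back to a strong one if possible.
        if self._strong is None and self._weak is not None:
            self._strong = self._weak()

    @property
    def native(self):
        if self._strong is not None:
            return self._strong
        return self._weak()  # None once the native object has died


class WakingRef(weakref.ref):
    """Weak reference that wakes a hibernating wrapper when dereferenced."""

    def __call__(self):
        wrapper = super().__call__()
        if wrapper is not None:
            wrapper.wake()
        return wrapper


def run(ref_type):
    native = Native()
    wrapper = Wrapper(native)
    ref = ref_type(wrapper)
    wrapper.hibernate()
    obtained = ref()          # step 2: user dereferences the weak ref
    del native                # step 3: last outside reference goes away
    return obtained.native    # step 4: is the native object still there?


plain_result = run(weakref.ref)   # None: the native object is gone
waking_result = run(WakingRef)    # still alive: the wrapper was woken
```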

[...]
> > and weakref.ref was smart enough to lookup this type and use it, only
> > _then_ it could work.
> 
> Looks like what you're looking for is a customizable factory function.

  Sure, if weakref.ref could be such a factory, and could take "advice"
on what type of weakref to use for each class.

  Regards.

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.


From osantana at gmail.com  Wed Nov  9 19:15:04 2005
From: osantana at gmail.com (Osvaldo Santana)
Date: Wed, 9 Nov 2005 16:15:04 -0200
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
Message-ID: <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com>

On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> Maybe it makes more sense to deprecate .pyo altogether and instead
> have a post-load optimizer optimize .pyc files according to the
> current optimization settings?

I agree with this idea, but we have to think about docstrings (as
Nicola said in his e-mail).

Maybe we want to create a different and optimization-independent
option to remove docstrings from modules?

> Unless others are interested in this nothing will happen.
>
> I've never heard of a third party making their code available only as
> .pyo, so the use case for changing things isn't very strong. In fact
> the only use cases I know for not making .py available are in
> situations where a proprietary "canned" application is distributed to
> end users who have no intention or need to ever add to the code.

I have another use case: I'm porting Python to the Maemo platform and I
want to reduce the size of modules. The .pyo (-OO) files are smaller than
.pyc files (mainly because docstrings are removed), so we started using
this optimization flag to compile our Python distribution.
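
The docstring effect of -OO can be seen with a small sketch. It uses the
optimize argument of the built-in compile(), which exists in current
Python 3 and did not exist in the 2.4 interpreter discussed here; the
sample source and the load() helper are made up for the example.

```python
source = '''
def greet():
    """Return a friendly greeting."""
    return "hello"
'''


def load(optimize):
    """Compile the sample source at the given level and return greet()."""
    namespace = {}
    code = compile(source, "<demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["greet"]


doc_plain = load(0).__doc__   # level 0: docstring kept
doc_oo = load(2).__doc__      # level 2 (like -OO): docstring stripped
print(doc_plain)
print(doc_oo)
```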

In this case we want to force developers to call the Python interpreter
with -O flags, set PYTHONOPTIMIZE, or apply my patch :) to make this
more transparent.

I noticed this inconsistency when we stopped using zipimport in our
Python for Maemo distribution. We decided to stop using zipimport
because the device (Nokia 770) uses a compressed filesystem.

Some friends (mainly Gustavo Barbieri) helped me create the suggested
patch after some discussion on our PythonBrasil mailing list.

Thanks,
Osvaldo

--
Osvaldo Santana Neto (aCiDBaSe)
icq, url = (11287184, "http://www.pythonbrasil.com.br")

From guido at python.org  Wed Nov  9 20:32:54 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 11:32:54 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com>
Message-ID: <ca471dc20511091132v4a5df88fy835da4ef092be053@mail.gmail.com>

On 11/9/05, Osvaldo Santana <osantana at gmail.com> wrote:
> I noticed this inconsistency when we stopped using zipimport in our
> Python for Maemo distribution. We decided to stop using zipimport
> because the device (Nokia 770) uses a compressed filesystem.

I won't comment further on the brainstorm that's going on (this is
becoming a topic for c.l.py) but I think you are misunderstanding the
point of zipimport. It's not done (usually) for the compression but
for the index. Finding a name in the zipfile index is much more
efficient than doing a directory search; and the zip index can be
cached.
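
As a rough illustration of the index idea (using the zipfile module, not
zipimport's own internals): the archive's directory is read once into a
dict, after which each lookup is a dictionary access. The build_index
helper and the in-memory archive are assumptions made for the example.

```python
import io
import zipfile


def build_index(archive):
    """Read the archive's directory once; return a name -> ZipInfo dict."""
    with zipfile.ZipFile(archive) as zf:
        return {info.filename: info for info in zf.infolist()}


# Build a small in-memory archive so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("module_c.py", "x = 1\n")
    zf.writestr("pkg/__init__.py", "")

index = build_index(buffer)          # done once, then kept around
found = "module_c.py" in index       # later lookups never touch the file
missing = "module_x.py" in index
```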

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ronaldoussoren at mac.com  Wed Nov  9 20:40:02 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 9 Nov 2005 20:40:02 +0100
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131558739.9130.40.camel@localhost>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
	<1131558739.9130.40.camel@localhost>
Message-ID: <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>


On 9-nov-2005, at 18:52, Gustavo J. A. M. Carneiro wrote:

> On Wed, 2005-11-09 at 09:23 -0800, Guido van Rossum wrote:
>>>> Gustavo J. A. M. Carneiro wrote:
>>>>>   I have come across a situation where I find the current weak
>>>>> references interface for extension types insufficient.
>>>>>
>>>>>   Currently you only have a tp_weaklistoffset slot, pointing to a
>>>>> PyObject with weak references.  However, in my case[1] I  
>>>>> _really_ need
>>>>> to be notified when a weak reference is dereferenced.
>>
>> I find reading through the bug discussion a bit difficult to
>> understand your use case. Could you explain it here? If you can't
>> explain it you certainly won't get your problem solved! :-)
>
>   This is a typical PyObject wrapping C object (GObject) problem.
> Both PyObject and GObject have independent reference counts.  For each
> GObject there is at most one PyObject wrapper.
>
>   When the refcount on the wrapper drops to zero, tp_dealloc is called.
> In tp_dealloc, if the GObject refcount is > 1, I do something
> slightly evil: I 'resurrect' the PyObject (calling PyObject_Init), create
> a weak reference to the GObject, and drop the "strong" reference.  I
> call this a 'hibernation state'.

Why do you do that? The only reasons I can think of are that you hope
to gain some speed from this or that you want to support weak references
to the GObject.

For what it's worth, in PyObjC we don't support weak references to the
underlying Objective-C object and delete the proxy object when it is
garbage collected. Objective-C also has reference counts; we increase
the count in the constructor for the proxy object and decrease it again
in the destructor.

Ronald

From pje at telecommunity.com  Wed Nov  9 20:48:25 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 Nov 2005 14:48:25 -0500
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport  hooks
In-Reply-To: <ca471dc20511091132v4a5df88fy835da4ef092be053@mail.gmail.com>
References: <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com>
	<20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051109144523.01f4a6a8@mail.telecommunity.com>

At 11:32 AM 11/9/2005 -0800, Guido van Rossum wrote:
>On 11/9/05, Osvaldo Santana <osantana at gmail.com> wrote:
> > I noticed this inconsistency when we stopped using zipimport in our
> > Python for Maemo distribution. We decided to stop using zipimport
> > because the device (Nokia 770) uses a compressed filesystem.
>
>I won't comment further on the brainstorm that's going on (this is
>becoming a topic for c.l.py) but I think you are misunderstanding the
>point of zipimport. It's not done (usually) for the compression but
>for the index. Finding a name in the zipfile index is much more
>efficient than doing a directory search; and the zip index can be
>cached.

zipimport also helps distribution convenience - a large and elaborate 
package can be distributed in a single zipfile (such as is built by 
setuptools' "bdist_egg" command) and simply placed on PYTHONPATH or 
directly on sys.path.  And tools like py2exe can also append all an 
application's modules to an executable file in zipped form.
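
A small sketch of the "place the zipfile directly on sys.path" case,
written for current Python 3. The archive name bundle.zip and the module
name bundled_demo are made up for the example; a real distribution would
be built by a tool rather than by hand.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Create a throwaway archive containing one module.
archive = os.path.join(tempfile.mkdtemp(), "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("bundled_demo.py", "GREETING = 'hello from the zip'\n")

# Putting the archive itself on sys.path lets zipimport find the module.
sys.path.insert(0, archive)
bundled_demo = importlib.import_module("bundled_demo")

print(bundled_demo.GREETING)
print(bundled_demo.__file__)  # a path inside bundle.zip
```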


From barbieri at gmail.com  Wed Nov  9 22:12:38 2005
From: barbieri at gmail.com (Gustavo Sverzut Barbieri)
Date: Wed, 9 Nov 2005 19:12:38 -0200
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511091132v4a5df88fy835da4ef092be053@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com>
	<ca471dc20511091132v4a5df88fy835da4ef092be053@mail.gmail.com>
Message-ID: <9ef20ef30511091312lcaa1caetbe0c4bade802738a@mail.gmail.com>

On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> On 11/9/05, Osvaldo Santana <osantana at gmail.com> wrote:
> > I noticed this inconsistency when we stopped using zipimport in our
> > Python for Maemo distribution. We decided to stop using zipimport
> > because the device (Nokia 770) uses a compressed filesystem.
>
> I won't comment further on the brainstorm that's going on (this is
> becoming a topic for c.l.py) but I think you are misunderstanding the
> point of zipimport. It's not done (usually) for the compression but
> for the index. Finding a name in the zipfile index is much more
> efficient than doing a directory search; and the zip index can be
> cached.

Anyway, not loading .pyo files when no .pyc or .py is available is a
drawback, especially on Unix systems that have scripts starting with
"#!/usr/bin/python" or "#!/usr/bin/env python" when the system has only
.pyo files for any of a number of reasons (in this case, limited disk
space).


--
Gustavo Sverzut Barbieri
--------------------------------------
Computer Engineer 2001 - UNICAMP
Mobile: +55 (19) 9165 8010
 Phone:  +1 (347) 624 6296 @ sip.stanaphone.com
Jabber: gsbarbieri at jabber.org
  ICQ#: 17249123
   MSN: barbieri at gmail.com
   GPG: 0xB640E1A2 @ wwwkeys.pgp.net

From janssen at parc.com  Wed Nov  9 22:22:35 2005
From: janssen at parc.com (Bill Janssen)
Date: Wed, 9 Nov 2005 13:22:35 PST
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: Your message of "Wed, 09 Nov 2005 11:48:25 PST."
	<5.1.1.6.0.20051109144523.01f4a6a8@mail.telecommunity.com> 
Message-ID: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com>

It's a shame that

1)  there's no equivalent of "java -jar", i.e., "python -z FILE.ZIP", and

2)  the use of zipfiles is so poorly documented.

Bill

From bob at redivi.com  Wed Nov  9 22:38:33 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed, 9 Nov 2005 13:38:33 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com>
References: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com>
Message-ID: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>


On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote:

> It's a shame that
>
> 1)  there's no equivalent of "java -jar", i.e., "python -z  
> FILE.ZIP", and

This should work on a few platforms:
env PYTHONPATH=FILE.zip python -m some_module_in_the_zip

-bob


From theller at python.net  Wed Nov  9 22:48:07 2005
From: theller at python.net (Thomas Heller)
Date: Wed, 09 Nov 2005 22:48:07 +0100
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
References: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com>
	<A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>
Message-ID: <1x1ppk6g.fsf@python.net>

Bob Ippolito <bob at redivi.com> writes:

> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote:
>
>> It's a shame that
>>
>> 1)  there's no equivalent of "java -jar", i.e., "python -z  
>> FILE.ZIP", and
>
> This should work on a few platforms:
> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip

It should, yes - but it doesn't: -m doesn't work with zipimport.

Thomas


From bob at redivi.com  Wed Nov  9 22:55:04 2005
From: bob at redivi.com (Bob Ippolito)
Date: Wed, 9 Nov 2005 13:55:04 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <1x1ppk6g.fsf@python.net>
References: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com>
	<A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>
	<1x1ppk6g.fsf@python.net>
Message-ID: <EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com>


On Nov 9, 2005, at 1:48 PM, Thomas Heller wrote:

> Bob Ippolito <bob at redivi.com> writes:
>
>> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote:
>>
>>> It's a shame that
>>>
>>> 1)  there's no equivalent of "java -jar", i.e., "python -z
>>> FILE.ZIP", and
>>
>> This should work on a few platforms:
>> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip
>
> It should, yes - but it doesn't: -m doesn't work with zipimport.

That's dumb, someone should fix that.  Is there a bug filed?

-bob


From ncoghlan at gmail.com  Wed Nov  9 22:58:44 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Nov 2005 07:58:44 +1000
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>
References: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com>
	<A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>
Message-ID: <43727114.4030107@gmail.com>

Bob Ippolito wrote:
> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote:
> 
>> It's a shame that
>>
>> 1)  there's no equivalent of "java -jar", i.e., "python -z  
>> FILE.ZIP", and
> 
> This should work on a few platforms:
> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip

Really? I wrote the '-m' code, and I wouldn't expect that to work anywhere 
because 'execfile' and the C equivalent that -m relies on expect a real file.

PEP 338 goes some way towards fixing that by having a Python fallback to find
and execute the module if the current C code fails. If we had execmodule as a
Python function, it would make it much easier to add support for
compiling and executing the target module directly, rather than indirecting
through the file-system-dependent execfile. In theory this could be done in C, 
but execmodule is fairly long even written in Python. I'm actually fairly sure 
it *could* be written in C, but I think doing so would be horribly tedious 
(and not as useful in the long run).
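
For comparison, later Python releases ship a runpy module whose
run_module function locates a module through the import system and
executes its code directly, so it does not need a real file. The sketch
below assumes current Python 3; the archive app.zip and the module
zipped_main are invented for the example.

```python
import os
import runpy
import sys
import tempfile
import zipfile

# Build a throwaway archive holding the module to execute.
archive = os.path.join(tempfile.mkdtemp(), "app.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipped_main.py", "RESULT = 6 * 7\nRAN_AS = __name__\n")

sys.path.insert(0, archive)

# Execute the module as if it were the main script; the resulting
# module globals come back as a dict.
namespace = runpy.run_module("zipped_main", run_name="__main__")
print(namespace["RESULT"], namespace["RAN_AS"])
```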

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From gjc at inescporto.pt  Wed Nov  9 23:44:38 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Wed, 09 Nov 2005 22:44:38 +0000
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
	<1131558739.9130.40.camel@localhost>
	<9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>
Message-ID: <1131576278.8540.14.camel@localhost.localdomain>

On Wed, 2005-11-09 at 20:40 +0100, Ronald Oussoren wrote:
> On 9-nov-2005, at 18:52, Gustavo J. A. M. Carneiro wrote:
> 
> > On Wed, 2005-11-09 at 09:23 -0800, Guido van Rossum wrote:
> >>>> Gustavo J. A. M. Carneiro wrote:
> >>>>>   I have come across a situation where I find the current weak
> >>>>> references interface for extension types insufficient.
> >>>>>
> >>>>>   Currently you only have a tp_weaklistoffset slot, pointing to a
> >>>>> PyObject with weak references.  However, in my case[1] I  
> >>>>> _really_ need
> >>>>> to be notified when a weak reference is dereferenced.
> >>
> >> I find reading through the bug discussion a bit difficult to
> >> understand your use case. Could you explain it here? If you can't
> >> explain it you certainly won't get your problem solved! :-)
> >
> >   This is a typical PyObject wrapping C object (GObject) problem.
> > Both PyObject and GObject have independent reference counts.  For each
> > GObject there is at most one PyObject wrapper.
> >
> >   When the refcount on the wrapper drops to zero, tp_dealloc is called.
> > In tp_dealloc, if the GObject refcount is > 1, I do something
> > slightly evil: I 'resurrect' the PyObject (calling PyObject_Init), create
> > a weak reference to the GObject, and drop the "strong" reference.  I
> > call this a 'hibernation state'.
> 
> Why do you do that? The only reasons I can think of are that you hope
> to gain some speed from this or that you want to support weak
> references to the GObject.

  We want to support weak references to GObjects.  Mainly because that
support has always been there and we don't want/can't break API.  And it
does have some uses...

> 
> For what it's worth, in PyObjC we don't support weak references to the
> underlying Objective-C object and delete the proxy object when it is
> garbage collected. Objective-C also has reference counts; we increase
> the count in the constructor for the proxy object and decrease it again
> in the destructor.

  OK, but what if it is a subclass of a builtin type, with instance
variables?  What if the PyObject is GC'ed but the ObjC object remains
alive, and later you get a new reference to it?  Do you create a new
PyObject wrapper for it?  What happened to the instance variables?

  Our goal in wrapping GObject is that, once a Python wrapper for a
GObject instance is created, it never dies until the GObject dies too.
At the same time, once the Python wrapper loses all references, it
should not stop keeping the GObject alive.

  What happens currently, which is what I'm trying to change, is that
there is a reference loop between PyObject and GObject, so that
deallocation only happens with the help of the cyclic GC.  But relying
on the GC for _everything_ causes annoying problems:

	1- The GC runs only once in a while, not soon enough if, e.g., you have
an image object with several megabytes;

	2- It makes it hard to debug reference counting bugs, as the symptom
only appears when the GC runs, far away from the code that caused the
problem in the first place;

	3- Generally the GC has a lot more work, since every PyGTK object needs
it, and a GUI app can have lots of PyGTK objects.

  Regards.

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic


From p.f.moore at gmail.com  Wed Nov  9 23:56:13 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 9 Nov 2005 22:56:13 +0000
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com>
References: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>
	<1x1ppk6g.fsf@python.net>
	<EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com>
Message-ID: <79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com>

On 11/9/05, Bob Ippolito <bob at redivi.com> wrote:
>
> On Nov 9, 2005, at 1:48 PM, Thomas Heller wrote:
>
> > Bob Ippolito <bob at redivi.com> writes:
> >
> >> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote:
> >>
> >>> It's a shame that
> >>>
> >>> 1)  there's no equivalent of "java -jar", i.e., "python -z
> >>> FILE.ZIP", and
> >>
> >> This should work on a few platforms:
> >> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip
> >
> > It should, yes - but it doesn't: -m doesn't work with zipimport.
>
> That's dumb, someone should fix that.  Is there a bug filed?

I did, a while ago. http://www.python.org/sf/1250389
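
For reference, the zipimport side of this can be exercised from Python
itself.  The following is only a sketch: the archive name "bundle.zip"
and the module name are made up for illustration, and putting the
archive on sys.path is assumed to have the same effect as setting
PYTHONPATH=FILE.zip.  It says nothing about whether -m works with it.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a fresh zip archive and import it from there."""
    tmpdir = tempfile.mkdtemp()
    archive = os.path.join(tmpdir, "bundle.zip")  # hypothetical archive name
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)
    sys.path.insert(0, archive)  # assumed equivalent of PYTHONPATH=bundle.zip
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)


mod = import_from_zip("hello_from_zip", "GREETING = 'hi'\n")
print(mod.GREETING, mod.__file__)
```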

Paul.

From bcannon at gmail.com  Thu Nov 10 00:05:13 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 9 Nov 2005 15:05:13 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
Message-ID: <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>

On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> Maybe it makes more sense to deprecate .pyo altogether and instead
> have a post-load optimizer optimize .pyc files according to the
> current optimization settings?
>

But I thought part of the point of .pyo files was that they left out
docstrings and thus had a smaller footprint?  Plus I wouldn't be
surprised if we started to move away from bytecode optimization and
instead tried to do more AST transformations which would remove
possible post-load optimizations.

I would have no issue with removing .pyo files and have .pyc files
just be as optimized as the current settings dictate and leave it at
that.  Could have some metadata listing what optimizations occurred,
but do we really need to have a specific way to denote if bytecode has
been optimized?  Binary files compiled from C don't note what -O
optimization they were compiled with.  If someone distributes
optimized .pyc files chances are they are going to have a specific
compile step with py_compile and they will know what optimizations
they are using.

-Brett

From foom at fuhm.net  Thu Nov 10 00:15:02 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed, 9 Nov 2005 18:15:02 -0500
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
Message-ID: <4132F2DC-0981-49F0-8DF4-B20FF840290D@fuhm.net>


On Nov 9, 2005, at 6:05 PM, Brett Cannon wrote:

> I would have no issue with removing .pyo files and have .pyc files
> just be as optimized as the current settings dictate and leave it at
> that.  Could have some metadata listing what optimizations occurred,
> but do we really need to have a specific way to denote if bytecode has
> been optimized?  Binary files compiled from C don't note what -O
> optimization they were compiled with.  If someone distributes
> optimized .pyc files chances are they are going to have a specific
> compile step with py_compile and they will know what optimizations
> they are using.
>

This sounds quite sensible. The only thing I'd add is that iff there  
is a .py file of the same name, and the current optimization settings  
are different from those in the .pyc file, python should recompile  
the .py file.

James

From guido at python.org  Thu Nov 10 00:25:03 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 15:25:03 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
Message-ID: <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com>

On 11/9/05, Brett Cannon <bcannon at gmail.com> wrote:
> On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> > Maybe it makes more sense to deprecate .pyo altogether and instead
> > have a post-load optimizer optimize .pyc files according to the
> > current optimization settings?
>
> But I thought part of the point of .pyo files was that they left out
> docstrings and thus had a smaller footprint?

Very few people care about the smaller footprint (although one piped up here).

> Plus I wouldn't be
> surprised if we started to move away from bytecode optimization and
> instead tried to do more AST transformations which would remove
> possible post-load optimizations.
>
> I would have no issue with removing .pyo files and have .pyc files
> just be as optimized as the current settings dictate and leave it at
> that.  Could have some metadata listing what optimizations occurred,
> but do we really need to have a specific way to denote if bytecode has
> been optimized?  Binary files compiled from C don't note what -O
> optimization they were compiled with.  If someone distributes
> optimized .pyc files chances are they are going to have a specific
> compile step with py_compile and they will know what optimizations
> they are using.

Currently, .pyo files have some important semantic differences with
.pyc files; -O doesn't remove docstrings (that's -OO) but it does
remove asserts. I wouldn't want to accidentally use a .pyc file
without asserts compiled in unless the .py file wasn't around.
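
The semantic difference can be seen without restarting the interpreter.
This is a sketch that assumes a Python 3 interpreter whose built-in
compile() accepts an optimize argument (0 for no flag, 1 for -O, 2 for
-OO); the function name check and the helper build are made up for the
example.

```python
SOURCE = '''
def check(x):
    """Return x after asserting it is positive."""
    assert x > 0, "x must be positive"
    return x
'''


def build(optimize):
    """Compile SOURCE at the given level and return the resulting function."""
    namespace = {}
    code = compile(SOURCE, "<demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["check"]


plain, no_asserts, no_docs = build(0), build(1), build(2)

try:
    plain(-1)
    assert_fired = False
except AssertionError:
    assert_fired = True

print(assert_fired)                 # level 0 keeps the assert
print(no_asserts(-1))               # level 1 (-O) compiles the assert away
print(no_asserts.__doc__ is None)   # level 1 still keeps the docstring
print(no_docs.__doc__ is None)      # level 2 (-OO) drops the docstring too
```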

For application distribution, the following probably would work:

- instead of .pyo files, we use .pyc files
- the .pyc file records whether optimizations were applied, whether
asserts are compiled, and whether docstrings are retained
- if the compiler finds a .pyc that is inconsistent with the current
command line, it ignores it and rewrites it (if it is writable) just
as if the .py file were newer
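
The three rules above can be sketched as a small decision function.
Everything here is hypothetical: CompileFlags stands in for whatever
metadata a .pyc header would record, and the returned strings are just
labels for the three outcomes, not an existing import-system API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompileFlags:
    """Hypothetical metadata recorded in a .pyc file."""
    optimized: bool
    asserts: bool
    docstrings: bool


def flags_for_level(level):
    """Flags implied by optimization level 0, 1 (-O) or 2 (-OO)."""
    return CompileFlags(optimized=level > 0,
                        asserts=level == 0,
                        docstrings=level < 2)


def decide(cached, current, source_available, writable):
    """Decide what to do with a cached .pyc given the current settings."""
    if cached == current:
        return "use"
    if not source_available:
        return "use"  # nothing to recompile from, so take what is there
    # Inconsistent with the command line: treat it like a stale .pyc.
    return "recompile-and-write" if writable else "recompile-in-memory"


print(decide(flags_for_level(1), flags_for_level(0), True, True))
```

The last branch is the case the next paragraph worries about: a
read-only directory means recompiling on every start.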

However, this would be a major pain for the standard library and other
shared code -- there it's really nice to have a cache for each of the
optimization levels since usually regular users can't write the
.py[co] files there, meaning very slow always-recompilation if the
standard .pyc files aren't of the right level, causing unacceptable
start-up times.

The only solutions I can think of that use a single file actually
*increase* the file size by having unoptimized and optimized code
side-by-side, or some way to quickly skip the assertions -- the -OO
option is a special case that probably needs to be done differently
anyway and only for final distribution.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Thu Nov 10 01:04:15 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 9 Nov 2005 16:04:15 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com>
Message-ID: <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com>

On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> On 11/9/05, Brett Cannon <bcannon at gmail.com> wrote:
> > Plus I wouldn't be
> > surprised if we started to move away from bytecode optimization and
> > instead tried to do more AST transformations which would remove
> > possible post-load optimizations.
> >
> > I would have no issue with removing .pyo files and have .pyc files
> > just be as optimized as the current settings dictate and leave it at
> > that.  Could have some metadata listing what optimizations occurred,
> > but do we really need to have a specific way to denote if bytecode has
> > been optimized?  Binary files compiled from C don't note what -O
> > optimization they were compiled with.  If someone distributes
> > optimized .pyc files chances are they are going to have a specific
> > compile step with py_compile and they will know what optimizations
> > they are using.
>
> Currently, .pyo files have some important semantic differences with
> .pyc files; -O doesn't remove docstrings (that's -OO) but it does
> remove asserts. I wouldn't want to accidentally use a .pyc file
> without asserts compiled in unless the .py file wasn't around.
>
> For application distribution, the following probably would work:
>
> - instead of .pyo files, we use .pyc files
> - the .pyc file records whether optimizations were applied, whether
> asserts are compiled, and whether docstrings are retained
> - if the compiler finds a .pyc that is inconsistent with the current
> command line, it ignores it and rewrites it (if it is writable) just
> as if the .py file were newer
>
> However, this would be a major pain for the standard library and other
> shared code -- there it's really nice to have a cache for each of the
> optimization levels since usually regular users can't write the
> .py[co] files there, meaning very slow always-recompilation if the
> standard .pyc files aren't of the right level, causing unacceptable
> start-up times.
>

What if PEP 304 came into being?  Then people would have a place to
have the shared code's recompiled version stored and thus avoid the
overhead from repeated use.

> The only solutions I can think of that use a single file actually
> *increase* the file size by having unoptimized and optimized code
> side-by-side, or some way to quickly skip the assertions -- the -OO
> option is a special case that probably needs to be done differently
> anyway and only for final distribution.
>

One option would be to introduce an ASSERTION bytecode that has an
argument specifying the amount of bytecode for the assertion.  The
eval loop can then just ignore the bytecode if assertions are being
evaluated and fall through to the bytecode for the assertions (and
thus be the equivalent of NOP) or use the argument to jump forward
that number of bytes in the bytecode and completely skip over the
assertion (and thus be just like a JUMP_FORWARD).  Either way
assertions become slightly more costly but it should be very minimal.
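
A toy model of the idea, not CPython's eval loop: instructions are
(name, argument) tuples, the ASSERTION argument counts instructions
rather than bytes, and PUSH and CHECK are invented opcodes that exist
only so the example has something to skip.

```python
def run(program, debug):
    """Execute a toy instruction list and return the final stack."""
    stack = []
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "ASSERTION":
            if not debug:
                pc += arg  # behave like JUMP_FORWARD over the assertion
            # with debug on, fall through like a NOP
        elif op == "PUSH":
            stack.append(arg)
        elif op == "CHECK":
            if not stack.pop():
                raise AssertionError(arg)
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    return stack


PROGRAM = [
    ("PUSH", 42),
    ("ASSERTION", 2),           # the next two instructions are the assertion
    ("PUSH", False),
    ("CHECK", "always fails"),
    ("PUSH", "done"),
]

print(run(PROGRAM, debug=False))
```

With debug=True the same program raises AssertionError, so one
instruction stream serves both modes at the cost of one extra dispatch
per assertion.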

-Brett

From pje at telecommunity.com  Thu Nov 10 01:16:11 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 Nov 2005 19:16:11 -0500
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport  hooks
In-Reply-To: <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.co
 m>
References: <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>

At 03:25 PM 11/9/2005 -0800, Guido van Rossum wrote:
>The only solutions I can think of that use a single file actually
>*increase* the file size by having unoptimized and optimized code
>side-by-side, or some way to quickly skip the assertions -- the -OO
>option is a special case that probably needs to be done differently
>anyway and only for final distribution.

We could have a "JUMP_IF_NOT_DEBUG" opcode to skip over asserts and "if 
__debug__" blocks.  Then under -O we could either patch this to a plain 
jump, or compact the bytecode to remove the jumped-over part(s).
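
Both variants can be sketched on a toy instruction list.  This is an
illustration only: instructions are (name, argument) tuples, the jump
argument counts instructions rather than bytes, and the toy program has
no other jumps, so compaction does not need to fix up any offsets the
way a real implementation would.

```python
def patch_to_jump(program):
    """Turn every JUMP_IF_NOT_DEBUG into an unconditional JUMP_FORWARD."""
    return [("JUMP_FORWARD", arg) if op == "JUMP_IF_NOT_DEBUG" else (op, arg)
            for op, arg in program]


def compact(program):
    """Drop each JUMP_IF_NOT_DEBUG together with the block it guards."""
    out = []
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "JUMP_IF_NOT_DEBUG":
            pc += 1 + arg  # skip the opcode and the guarded instructions
        else:
            out.append((op, arg))
            pc += 1
    return out


PROGRAM = [
    ("PUSH", 42),
    ("JUMP_IF_NOT_DEBUG", 2),   # guards the two-instruction assert below
    ("PUSH", False),
    ("CHECK", "always fails"),
    ("PUSH", "done"),
]

print(patch_to_jump(PROGRAM))
print(compact(PROGRAM))
```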

By the way, while we're on this subject, can we make the optimization 
options be part of the compile() interface?  Right now the distutils has to 
actually exec another Python process whenever you want to compile code with 
a different optimization level than what's currently in effect, whereas if 
it could pass the desired level to compile(), this wouldn't be necessary.
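
What this would look like from the caller's side can be sketched with
py_compile, assuming a Python 3 interpreter where py_compile.compile()
accepts an optimize argument.  The module name demo_mod is made up, and
the sketch relies on the default cache location being chosen per level.

```python
import os
import py_compile
import tempfile


def compile_all_levels(source_text):
    """Byte-compile one source file at levels 0, 1 and 2 in-process."""
    tmpdir = tempfile.mkdtemp()
    src = os.path.join(tmpdir, "demo_mod.py")  # hypothetical module name
    with open(src, "w") as f:
        f.write(source_text)
    # No child interpreter is needed; the level is simply an argument.
    return {level: py_compile.compile(src, doraise=True, optimize=level)
            for level in (0, 1, 2)}


paths = compile_all_levels("assert True\n")
for level, path in sorted(paths.items()):
    print(level, os.path.basename(path))
```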


From guido at python.org  Thu Nov 10 01:33:00 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 16:33:00 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
Message-ID: <ca471dc20511091633m4b7869b7jc3bd847436f452ab@mail.gmail.com>

On 11/9/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 03:25 PM 11/9/2005 -0800, Guido van Rossum wrote:
> >The only solutions I can think of that use a single file actually
> >*increase* the file size by having unoptimized and optimized code
> >side-by-side, or some way to quickly skip the assertions -- the -OO
> >option is a special case that probably needs to be done differently
> >anyway and only for final distribution.
>
> We could have a "JUMP_IF_NOT_DEBUG" opcode to skip over asserts and "if
> __debug__" blocks.  Then under -O we could either patch this to a plain
> jump, or compact the bytecode to remove the jumped-over part(s).

That sounds very reasonable.

> By the way, while we're on this subject, can we make the optimization
> options be part of the compile() interface?  Right now the distutils has to
> actually exec another Python process whenever you want to compile code with
> a different optimization level than what's currently in effect, whereas if
> it could pass the desired level to compile(), this wouldn't be necessary.

Makes sense to me; we need a patch of course.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu Nov 10 01:35:14 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 16:35:14 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com>
	<bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com>
Message-ID: <ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com>

[Guido]
> > However, this would be a major pain for the standard library and other
> > shared code -- there it's really nice to have a cache for each of the
> > optimization levels since usually regular users can't write the
> > .py[co] files there, meaning very slow always-recompilation if the
> > standard .pyc files aren't of the right level, causing unacceptable
> > start-up times.
[Brett]
> What if PEP 304 came into being?  Then people would have a place to
> have the shared code's recompiled version stored and thus avoid the
> overhead from repeated use.

Still sounds suboptimal for the standard library; IMO it should "just work".

> > The only solutions I can think of that use a single file actually
> > *increase* the file size by having unoptimized and optimized code
> > side-by-side, or some way to quickly skip the assertions -- the -OO
> > option is a special case that probably needs to be done differently
> > anyway and only for final distribution.
>
> One option would be to introduce an ASSERTION bytecode that has an
> argument specifying the amount of bytecode for the assertion.  The
> eval loop can then just ignore the bytecode if assertions are being
> evaluated and fall through to the bytecode for the assertions (and
> thus be the equivalent of NOP) or use the argument to jump forward
> that number of bytes in the bytecode and completely skip over the
> assertion (and thus be just like a JUMP_FORWARD).  Either way
> assertions become slightly more costly but it should be very minimal.

I like Phillip's suggestion -- no new opcode, just a conditional jump
that can be easily optimized out.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Thu Nov 10 01:57:07 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 9 Nov 2005 16:57:07 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com>
	<bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com>
	<ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com>
Message-ID: <bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com>

On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> [Guido]
> > > However, this would be a major pain for the standard library and other
> > > shared code -- there it's really nice to have a cache for each of the
> > > optimization levels since usually regular users can't write the
> > > .py[co] files there, meaning very slow always-recompilation if the
> > > standard .pyc files aren't of the right level, causing unacceptable
> > > start-up times.
> [Brett]
> > What if PEP 304 came into being?  Then people would have a place to
> > have the shared code's recompiled version stored and thus avoid the
> > overhead from repeated use.
>
> Still sounds suboptimal for the standard library; IMO it should "just work".
>

Fair enough.

> > > The only solutions I can think of that use a single file actually
> > > *increase* the file size by having unoptimized and optimized code
> > > side-by-side, or some way to quickly skip the assertions -- the -OO
> > > option is a special case that probably needs to be done differently
> > > anyway and only for final distribution.
> >
> > One option would be to introduce an ASSERTION bytecode that has an
> > argument specifying the amount of bytecode for the assertion.  The
> > eval loop can then just ignore the bytecode if assertions are being
> > evaluated and fall through to the bytecode for the assertions (and
> > thus be the equivalent of NOP) or use the argument to jump forward
> > that number of bytes in the bytecode and completely skip over the
> > assertion (and thus be just like a JUMP_FORWARD).  Either way
> > assertions become slightly more costly but it should be very minimal.
>
> I like Phillip's suggestion -- no new opcode, just a conditional jump
> that can be easily optimized out.

Huh?  But Phillip is suggesting a new opcode that is essentially the
same as my proposal but naming it differently and saying the bytecode
should get changed directly instead of having the eval loop handle the
semantic differences based on whether -O is being used.

-Brett

From greg.ewing at canterbury.ac.nz  Thu Nov 10 01:57:43 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 Nov 2005 13:57:43 +1300
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131576278.8540.14.camel@localhost.localdomain>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
	<1131558739.9130.40.camel@localhost>
	<9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>
	<1131576278.8540.14.camel@localhost.localdomain>
Message-ID: <43729B07.6010907@canterbury.ac.nz>

Gustavo J. A. M. Carneiro wrote:

>   OK, but what if it is a subclass of a builtin type, with instance
> variables?  What if the PyObject is GC'ed but the ObjC object remains
> alive, and later you get a new reference to it?  Do you create a new
> PyObject wrapper for it?  What happened to the instance variables?

Your proposed scheme appears to involve destroying and
then re-initialising the Python wrapper. Isn't that
going to wipe out any instance variables it may
have had?

Also, it seems to me that as soon as the refcount on
the wrapper drops to zero, any weak references to it
will be broken. Or does your resurrection code
intervene before that happens?
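
The Python-level behaviour being asked about can be observed directly.
This is a sketch on CPython only: Wrapper is a stand-in class, not the
PyGTK wrapper type, and it does not show what a C-level tp_dealloc that
resurrects the object would do.

```python
import gc
import weakref


class Wrapper:
    """Stand-in for an extension-type wrapper object (hypothetical)."""


events = []
obj = Wrapper()
ref = weakref.ref(obj, lambda r: events.append("cleared"))

alive_before = ref() is obj
del obj        # the last strong reference goes away
gc.collect()   # not needed on CPython, harmless elsewhere

print(alive_before, ref() is None, events)
```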

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Thu Nov 10 02:01:57 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 17:01:57 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com>
	<bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com>
	<ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com>
	<bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com>
Message-ID: <ca471dc20511091701x36cc9061x142ad6afc2aeb853@mail.gmail.com>

> > I like Phillip's suggestion -- no new opcode, just a conditional jump
> > that can be easily optimized out.
>
> Huh?  But Phillip is suggesting a new opcode that is essentially the
> same as my proposal but naming it differently and saying the bytecode
> should get changed directly instead of having the eval loop handle the
> semantic differences based on whether -O is being used.

Sorry. Looking back they look pretty much the same to me. Somehow I
glanced over Phillip's code and thought he was proposing to use a
regular JUMP_IF opcode with the special __debug__ variable (which
would be a 3rd possibility, good if we had backwards compatibility
requirements for bytecode -- which we don't, fortunately :-).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mdehoon at c2b2.columbia.edu  Thu Nov 10 02:04:43 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Wed, 09 Nov 2005 20:04:43 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <43710C95.30209@v.loewis.de>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
Message-ID: <43729CAB.5070106@c2b2.columbia.edu>

Martin v. Löwis wrote:

> Michiel Jan Laurens de Hoon wrote:
>
>> 2) Will Tkinter always be the standard toolkit for Python, or are 
>> there plans to replace it at some point?
>
>
> Python does not evolve along a grand master plan. Instead, individual
> contributors propose specific modifications, e.g. through PEPs.

At this point, I can't propose a specific modification yet because I 
don't know the reasoning behind the original choice of Tk as 
the default GUI toolkit for Python (and hence, I don't know if those 
reasons are still valid today). I can see one disadvantage (using Tk 
limits our options to run an event loop for other Python extensions), 
and I am trying to find out why Tk was deemed more appropriate than 
other GUI toolkits anyway.

So let me rephrase the question: What is the advantage of Tk in 
comparison to other GUI toolkits? Is it Mac availability? More advanced 
widget set? Installation is easier? Portability? Switching to a 
different GUI toolkit would break too much existing code? I think that 
having the answer to this will stimulate further development of 
alternative GUI toolkits, which may give some future Python version a 
toolkit at least as good as Tk, and one that doesn't interfere with 
Python's event loop capabilities.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From bcannon at gmail.com  Thu Nov 10 02:49:39 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 9 Nov 2005 17:49:39 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511091701x36cc9061x142ad6afc2aeb853@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com>
	<bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com>
	<ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com>
	<bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com>
	<ca471dc20511091701x36cc9061x142ad6afc2aeb853@mail.gmail.com>
Message-ID: <bbaeab100511091749k3f8feb0fue5146474bb4d0deb@mail.gmail.com>

On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> > > I like Phillip's suggestion -- no new opcode, just a conditional jump
> > > that can be easily optimized out.
> >
> > Huh?  But Phillip is suggesting a new opcode that is essentially the
> > same as my proposal but naming it differently and saying the bytecode
> > should get changed directly instead of having the eval loop handle the
> > semantic differences based on whether -O is being used.
>
> Sorry.

No problem.  Figured you just misread mine.

> Looking back they look pretty much the same to me. Somehow I
> glanced over Phillip's code and thought he was proposing to use a
> regular JUMP_IF opcode with the special __debug__ variable (which
> would be a 3rd possibility, good if we had backwards compatibility
> requirements for bytecode -- which we don't, fortunately :-).
>

Fortunately.  =)

So does this mean you like the idea?  Should this all move forward somehow?

-Brett

From guido at python.org  Thu Nov 10 02:51:19 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 17:51:19 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <bbaeab100511091749k3f8feb0fue5146474bb4d0deb@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com>
	<bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com>
	<ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com>
	<bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com>
	<ca471dc20511091701x36cc9061x142ad6afc2aeb853@mail.gmail.com>
	<bbaeab100511091749k3f8feb0fue5146474bb4d0deb@mail.gmail.com>
Message-ID: <ca471dc20511091751t18052057j9cc7a4e9da627b5d@mail.gmail.com>

On 11/9/05, Brett Cannon <bcannon at gmail.com> wrote:
> On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> > > > I like Phillip's suggestion -- no new opcode, just a conditional jump
> > > > that can be easily optimized out.
> > >
> > > Huh?  But Phillip is suggesting a new opcode that is essentially the
> > > same as my proposal but naming it differently and saying the bytecode
> > > should get changed directly instead of having the eval loop handle the
> > > semantic differences based on whether -O is being used.
> >
> > Sorry.
>
> No problem.  Figured you just misread mine.
>
> > Looking back they look pretty much the same to me. Somehow I
> > glanced over Phillip's code and thought he was proposing to use a
> > regular JUMP_IF opcode with the special __debug__ variable (which
> > would be a 3rd possibility, good if we had backwards compatibility
> > requirements for bytecode -- which we don't, fortunately :-).
> >
>
> Fortunately.  =)
>
> So does this mean you like the idea?  Should this all move forward somehow?

I guess so. :-)

It will need someone thinking really hard about all the use cases,
edge cases, etc., implementation details, and writing up a PEP. Feel
like volunteering? You might squeeze Phillip as a co-author. He's a
really good one.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From Scott.Daniels at Acm.Org  Thu Nov 10 03:41:59 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Wed, 09 Nov 2005 18:41:59 -0800
Subject: [Python-Dev] int(string) (was: DRAFT: python-dev Summary for
 2005-09-01 through 2005-09-16)
In-Reply-To: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
Message-ID: <4372B377.6050806@Acm.Org>

Tim Peters wrote:
> ...
> Someone want a finite project that would _really_ help their Uncle
> Timmy in his slow-motion crusade to get Python on the list of "solved
> it!" languages for each problem on that magnificent site?
...
> Turns out it's _not_ input speed that's the problem here, and not even
> mainly the speed of integer mod:  the bulk of the time is spent in
> int(string).... 

OK, I got an idea about how to do this fast.  I started with Python
code, and I now have C code that should beat int(string) always while
getting a lot of speed making long values.  The working tables can be
used to do the reverse transformation (int or long to string in some
base) with a little finesse, but I haven't done that yet in C.

The code is pretty sprawly now (a lot left in for instrumentation and
testing pieces), but can eventually get smaller.  I gave myself time
to do this as a birthday present to myself.  It may take a while to
build a patch, but perhaps you can let me know how much speedup you
get using this code.  If you build this module, I'd suggest using
"from to_int import chomp" to get a function that works like int
(producing a long when needed and so on).

> If you can even track all the levels of C function calls that ends up 
> invoking <wink>, you find yourself in PyOS_strtoul(), which is a
> nifty all-purpose routine that accepts inputs in bases 2 thru 36, can
> auto-detect base, and does platform-independent overflow checking at
> the cost of a division per digit.  All those are features, but it
> makes for sloooow conversion.

OK, this code doesn't deal with unicode at all.  The key observations
are:
   A) to figure out the base, you pretty much need to get to the first
      digit; getting to the first non-zero digit is not that much worse.
   B) If you know the length of a string of digits (starting at the
      first non-zero digit) and the base, you know approximately how
      many bits the result will have.  You can do a single allocation if
      you are building a long.  You can tell if you need to test for
      overflow in building an int; there is one length per base where
      you must.

So the question becomes, is it worth taking two passes at the digits?
Well, it sure looks like it to me, but I haven't timed one or two-
character integers.  I do longs in "megadigits" -- the largest set
of digits that fits safely in SHIFT bits, so they have no need for
overflow checks.
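
The two-pass idea described above can be sketched in Python.  This is
only an illustration, not the C code from the to_int package: the
names parse_int and chunk_size are hypothetical, SHIFT = 15 is an
assumed digit width, and base auto-detection is left out.

```python
SHIFT = 15  # assumed number of bits per internal "digit" of a long
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def chunk_size(base, shift=SHIFT):
    """Largest k such that any k digits in `base` fit in `shift` bits."""
    k = 0
    while base ** (k + 1) <= (1 << shift):
        k += 1
    return k


def parse_int(text, base=10):
    """Convert text to an integer using two passes over the digits."""
    s = text.strip().lower()
    sign = 1
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    if not s:
        raise ValueError("no digits in %r" % (text,))

    # Pass 1: validate every character and collect the digit values.
    values = []
    for ch in s:
        d = DIGITS.find(ch)
        if d < 0 or d >= base:
            raise ValueError("invalid digit %r for base %d" % (ch, base))
        values.append(d)

    # Pass 2: fold the digits k at a time.  Each "megadigit" is below
    # 2**SHIFT by construction, so no per-digit overflow check is needed.
    k = chunk_size(base)
    result = 0
    for i in range(0, len(values), k):
        chunk = values[i:i + k]
        mega = 0
        for d in chunk:
            mega = mega * base + d
        result = result * base ** len(chunk) + mega
    return sign * result
```

With these assumptions a decimal string is consumed four digits at a
time, since 10**4 is the largest power of ten not exceeding 2**15.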

For further excitement, you can use a similar technique to go from
the number of bits to the string length.  That should make for a
fast convert int/long to string in any of 36 (or more, really) bases.
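
A rough sketch of that reverse direction, again in Python rather than
C: the bit count gives an upper bound on the number of digits, so the
output buffer can be sized once and filled from the right.  The names
max_digits and to_string are hypothetical, and the bound relies on
floating-point log2, which is an assumption that is fine for
illustration but would need care in real code.

```python
import math

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def max_digits(n_bits, base):
    """Upper bound on the digits needed to print any n_bits-bit value."""
    if n_bits == 0:
        return 1
    return int(n_bits / math.log2(base)) + 1


def to_string(value, base=10):
    """Convert an integer to a string using one preallocated buffer."""
    if value == 0:
        return "0"
    negative = value < 0
    value = abs(value)
    size = max_digits(value.bit_length(), base)
    buf = ["0"] * size          # single allocation, sized from the bit count
    pos = size
    while value:
        value, d = divmod(value, base)
        pos -= 1
        buf[pos] = DIGITS[d]    # fill from the least significant digit
    text = "".join(buf[pos:])   # drop any unused leading slots
    return "-" + text if negative else text
```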

I pass all of your mentioned test cases (including the one from a
later message).  I'm pretty much out of time for this project at
the moment, but encouraging words would help me steal some time
to finish.  For anyone wanting to look at the code, or try it
themselves:

Installer:
    http://members.dsl-only.net/~daniels/dist/to_int-0.10.win32-py2.4.exe
Just the 2.4 dll:
    http://members.dsl-only.net/~daniels/dist/to_int-0.10.win32.zip
Sources:
    http://members.dsl-only.net/~daniels/dist/to_int-0.10.zip


--Scott David Daniels
Scott.Daniels at Acm.Org


From greg.ewing at canterbury.ac.nz  Thu Nov 10 04:02:04 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 Nov 2005 16:02:04 +1300
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <43729CAB.5070106@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
Message-ID: <4372B82C.9010800@canterbury.ac.nz>

Michiel Jan Laurens de Hoon wrote:

> At this point, I can't propose a specific modification yet because I 
> don't know the reasoning that went behind the original choice of Tk as 
> the default GUI toolkit for Python

Probably because at the time it was really the
only cross-platform GUI toolkit that worked
about equally well (or equally badly, depending
on your point of view) on all the major
platforms.

I'm not sure the event-loop situation would be
much different with another one, anyway. From what
I've seen of GUI toolkits, they all have their own
form of event loop, and they all provide some way
of hooking other things into it (as does Tkinter),
but whichever one you're using, it likes to be in
charge. Code which blocks reading from standard
input doesn't fit very well into any of them.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From exarkun at divmod.com  Thu Nov 10 04:08:52 2005
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Wed, 9 Nov 2005 22:08:52 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4372B82C.9010800@canterbury.ac.nz>
Message-ID: <20051110030852.10365.1719239053.divmod.quotient.6042@ohm>

On Thu, 10 Nov 2005 16:02:04 +1300, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>Michiel Jan Laurens de Hoon wrote:
>
>> At this point, I can't propose a specific modification yet because I
>> don't know the reasoning that went behind the original choice of Tk as
>> the default GUI toolkit for Python
>
>Probably because at the time it was really the
>only cross-platform GUI toolkit that worked
>about equally well (or equally badly, depending
>on your point of view) on all the major
>platforms.
>
>I'm not sure the event-loop situation would be
>much different with another one, anyway. From what
>I've seen of GUI toolkits, they all have their own
>form of event loop, and they all provide some way
>of hooking other things into it (as does Tkinter),
>but whichever one you're using, it likes to be in
>charge. Code which blocks reading from standard
>input doesn't fit very well into any of them.
>

Of course, the problem could be approached from the 
other direction: the blocking reads could be replaced 
with something else...

Jean-Paul

From janssen at parc.com  Thu Nov 10 05:00:44 2005
From: janssen at parc.com (Bill Janssen)
Date: Wed, 9 Nov 2005 20:00:44 PST
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: Your message of "Wed, 09 Nov 2005 13:38:33 PST."
	<A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> 
Message-ID: <05Nov9.200052pst."58633"@synergy1.parc.xerox.com>

> This should work on a few platforms:
> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip

Yeah, that's not bad, but I hate setting PYTHONPATH.  I was thinking
more along the line of

  python -z ZIPFILE

where python would look at the ZIPFILE to see if there's a top-level
module called "__init__", and if so, load it.  That would allow
existing PYTHONPATH settings to still be used if the user cares.
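
A minimal sketch of what such a switch might do, written as an
ordinary Python function.  The -z flag itself is only a proposal
here; run_from_zip is a hypothetical helper, and the module name is
passed in explicitly rather than assumed to be "__init__".  The zip
file is put at the front of sys.path only for the duration of the
call, so any existing PYTHONPATH entries stay in effect.

```python
import runpy
import sys


def run_from_zip(zip_path, module_name):
    """Run module_name from inside zip_path as if it were the main script.

    Returns the globals dictionary of the executed module.
    """
    sys.path.insert(0, zip_path)   # let the zip import hook find the module
    try:
        return runpy.run_module(module_name, run_name="__main__")
    finally:
        sys.path.remove(zip_path)  # leave sys.path as we found it
```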

Bill

From falcon at intercable.ru  Wed Nov  9 08:24:04 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Wed, 09 Nov 2005 10:24:04 +0300
Subject: [Python-Dev]  Unifying decimal numbers.
Message-ID: <4371A414.2020400@intercable.ru>

Excuse my English

I think, we could just segregate tokens for decimal and real float and 
make them interoperable.
Motivation:
   Most of us work with business databases - all "floats" are really 
decimals, algebraic operations
should work without float inconsistency, and those operations are rare, so 
speed is not important.
But some of us use floats for speed in scientific and multimedia programs.

with
from __future__ import Decimal
we could:
a) interpret regular float constants as decimal
b) interpret float constants with suffix 'f' as float (like    1.5f    
345.2e-5f  etc)
c) result of operation with decimal operands should be decimal
 >>> 1.0/3.0
0.33333333333333333
d) result of operation with float operands should be float
 >>> 1.0f/3.0f
0.33333333333333331f
e) result of operation with decimal and float should be float (the decimal 
is converted to float and the operation is performed)
 >>> 1.0f/3.0
0.33333333333333331f
 >>> 1.0/3.0f
0.33333333333333331f
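
The rules c) to e) can be tried out today with the existing decimal
module, without any new literal syntax.  The sketch below is only an
emulation: div is a hypothetical helper, and the 17-digit precision is
an assumption chosen to match the digits shown above.  The decimal
module refuses to mix Decimal and float in arithmetic, so the helper
converts explicitly, as rule e) describes.

```python
from decimal import Context, Decimal

CTX = Context(prec=17)  # assumed precision, matching the example output


def div(a, b):
    """Divide following the proposed rules c), d) and e)."""
    if isinstance(a, float) or isinstance(b, float):
        # d) and e): any float operand makes the whole operation float
        return float(a) / float(b)
    # c): decimal operands give a decimal result
    return CTX.divide(a, b)


print(div(Decimal("1.0"), Decimal("3.0")))   # decimal / decimal
print(div(1.0, 3.0))                         # float / float
print(div(1.0, Decimal("3.0")))              # float / decimal -> float
```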



From bcannon at gmail.com  Thu Nov 10 06:14:14 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 9 Nov 2005 21:14:14 -0800
Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions
Message-ID: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>

I just finished fleshing out the dev FAQ
(http://www.python.org/dev/devfaq.html) with questions covering what
someone might need to know for regular usage.  If anyone thinks I
didn't cover something I should have, let me know.

-Brett

From stephen at xemacs.org  Thu Nov 10 06:19:42 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 10 Nov 2005 14:19:42 +0900
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <43729CAB.5070106@c2b2.columbia.edu> (Michiel Jan Laurens de
	Hoon's message of "Wed, 09 Nov 2005 20:04:43 -0500")
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
Message-ID: <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Michiel" == Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> writes:

    Michiel> What is the advantage of Tk in comparison to other GUI
    Michiel> toolkits?

IMO, Tk's _advantage_ is that it's there already.  As a standard
component, it works well for typical simple GUI applications (thus
satisfying "batteries included" IMO), and it's self-contained.  So I
would say it's at _no disadvantage_ to other toolkits.

Alternatives like PyGtk and wxWidgets are easily available and provide
some degree of cross-platform support for those who need something
more/different.

Is there some reason why you can't require users to install a toolkit
more suited to your application's needs?

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From mdehoon at c2b2.columbia.edu  Thu Nov 10 06:27:22 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Thu, 10 Nov 2005 00:27:22 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4372B82C.9010800@canterbury.ac.nz>
References: <437100A7.5050907@c2b2.columbia.edu>
	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>
	<4372B82C.9010800@canterbury.ac.nz>
Message-ID: <4372DA3A.8010206@c2b2.columbia.edu>

Greg Ewing wrote:

>I'm not sure the event-loop situation would be
>much different with another one, anyway. From what
>I've seen of GUI toolkits, they all have their own
>form of event loop, and they all provide some way
>of hooking other things into it (as does Tkinter),
>but whichever one you're using, it likes to be in
>charge.
>
It's not because it likes to be in charge, it's because there's no other 
way to do it in Python. In our scientific visualization software, we 
also have our own event loop. I'd much rather let a Python event loop 
handle our messages. Not only would it save us time programming (setting 
up an event loop in an extension module that passes control back to 
Python when needed is tricky), it would also give better performance, it 
would work with IDLE (which an event loop in an extension module cannot 
as explained in my previous post), and it would let different extension 
modules live happily together all using the same event loop.

Tkinter is a special case among GUI toolkits because it is married to 
Tcl. It doesn't just need to handle its GUI events, it also needs to run 
the Tcl interpreter in between. Which is why Tkinter needs to be in 
charge of the event loop. For other GUI toolkits, I don't see a reason 
why they'd need their own event loop.

> Code which blocks reading from standard
>input doesn't fit very well into any of them.
>  
>
Actually, this is not difficult to accomplish. For example, try Tcl's 
wish on Linux: It will pop up a (responsive) graphics window but 
continue to read Tcl commands from the terminal. This is done via a call 
to select (on Unix) or MsgWaitForMultipleObjects (on Windows). Both of 
these can listen for terminal input and GUI events at the same time.
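
A minimal sketch of the select-based approach on Unix, in Python.  It
assumes the GUI connection is available as a file descriptor (as an X
connection would be); wait_for_input, run_once and the two callbacks
are hypothetical names, and the Windows MsgWaitForMultipleObjects
variant is not shown.

```python
import os
import select


def wait_for_input(command_fd, gui_fd, timeout=None):
    """Block until either descriptor is readable; return the ready ones."""
    ready, _, _ = select.select([command_fd, gui_fd], [], [], timeout)
    return ready


def run_once(command_fd, gui_fd, on_command, on_gui_event, timeout=None):
    """Wait once, then dispatch whatever arrived to the matching handler.

    Returns the number of descriptors that were serviced.
    """
    ready = wait_for_input(command_fd, gui_fd, timeout)
    for fd in ready:
        data = os.read(fd, 4096)
        if fd == command_fd:
            on_command(data)      # e.g. a line typed at the terminal
        else:
            on_gui_event(data)    # e.g. raw bytes from the GUI connection
    return len(ready)
```

Calling run_once in a loop keeps the window responsive while the
terminal is idle, because neither source is read with a blocking call
of its own.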

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From mdehoon at c2b2.columbia.edu  Thu Nov 10 06:40:47 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Thu, 10 Nov 2005 00:40:47 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <437100A7.5050907@c2b2.columbia.edu>
	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <4372DD5F.70203@c2b2.columbia.edu>

Stephen J. Turnbull wrote:

>    Michiel> What is the advantage of Tk in comparison to other GUI
>    Michiel> toolkits?
>
>IMO, Tk's _advantage_ is that it's there already.  As a standard
>component, it works well for typical simple GUI applications (thus
>satisfying "batteries included" IMO), and it's self-contained.  So I
>would say it's at _no disadvantage_ to other toolkits.
>
>Alternatives like PyGtk and wxWidgets are easily available and provide
>some degree of cross-platform support for those who need something
>more/different.
>
>Is there some reason why you can't require users to install a toolkit
>more suited to your application's needs?
>  
>
My application doesn't need a toolkit at all. My problem is that because 
of Tkinter being the standard Python toolkit, we cannot have a decent 
event loop in Python. So this is the disadvantage I see in Tkinter.

--Michiel.


-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From ronaldoussoren at mac.com  Thu Nov 10 08:06:02 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 10 Nov 2005 08:06:02 +0100
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131576278.8540.14.camel@localhost.localdomain>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
	<1131558739.9130.40.camel@localhost>
	<9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>
	<1131576278.8540.14.camel@localhost.localdomain>
Message-ID: <49FAFACC-3892-49EA-9154-1AC43E533179@mac.com>


On 9-nov-2005, at 23:44, Gustavo J. A. M. Carneiro wrote:

> On Wed, 2005-11-09 at 20:40 +0100, Ronald Oussoren wrote:
>> On 9-nov-2005, at 18:52, Gustavo J. A. M. Carneiro wrote:
>>
>>> On Wed, 2005-11-09 at 09:23 -0800, Guido van Rossum wrote:
>>>>>> Gustavo J. A. M. Carneiro wrote:
>>>>>>>   I have come across a situation where I find the current weak
>>>>>>> references interface for extension types insufficient.
>>>>>>>
>>>>>>>   Currently you only have a tp_weaklistoffset slot, pointing  
>>>>>>> to a
>>>>>>> PyObject with weak references.  However, in my case[1] I
>>>>>>> _really_ need
>>>>>>> to be notified when a weak reference is dereferenced.
>>>>
>>>> I find reading through the bug discussion a bit difficult to
>>>> understand your use case. Could you explain it here? If you can't
>>>> explain it you certainly won't get your problem solved! :-)
>>>
>>>   This is a typical PyObject wrapping C object (GObject) problem.
>>> Both
>>> PyObject and GObject have independent reference counts.  For each
>>> GObject there is at most one PyObject wrapper.
>>>
>>>   When the refcount on the wrapper drops to zero, tp_dealloc is
>>> called.
>>> In tp_dealloc, and if the GObject refcount is > 1, I do something
>>> slightly evil: I 'resurrect' the PyObject (calling PyObject_Init),
>>> create
>>> a weak reference to the GObject, and drop the "strong" reference.  I
>>> call this a 'hibernation state'.
>>
>> Why do you do that? The only reasons I can think of are that you hope
>> to gain
>> some speed from this or that you want to support weak references to
>> the GObject.
>
>   We want to support weak references to GObjects.  Mainly because that
> support has always been there and we don't want/can't break API.   
> And it
> does have some uses...
>
>>
>> For what it's worth, in PyObjC we don't support weak references to the
>> underlying
>> Objective-C object and delete the proxy object when it is garbage
>> collected.
>> Objective-C also has reference counts, we increase that in the
>> constructor for
>> the proxy object and decrease it again in the destructor.
>
>   OK, but what if it is a subclass of a builtin type, with instance
> variables?  What if the PyObject is GC'ed but the ObjC object remains
> alive, and later you get a new reference to it?  Do you create a new
> PyObject wrapper for it?  What happened to the instance variables?

Our main goal is that there is at most one wrapper for a python object
alive at any one time. And likewise there is at most one Objective-C
wrapper for a python object.

If a PyObject is GC'ed and the ObjC object remains alive you will get
a new PyObject when a reference to the ObjC object passes into python
space again. That is no problem because the proxy object contains no
state other than the pointer to the ObjC object.

ObjC's runtime might be more flexible than that of GObject. If you  
create
a subclass of an ObjC class the PyObjC runtime will create a real ObjC
class for you and all object state, including Python instance variables,
is stored on the ObjC side.
>
>   Our goal in wrapping GObject is that, once a Python wrapper for a
> GObject instance is created, it never dies until the GObject dies too.
> At the same time, once the python wrapper loses all references, it
> should not stop keeping the GObject alive.

I tried that too, but ran into some very ugly issues and decided that
weak references are not important enough for that. There's also the  
problem
that this will keep the python proxy alive even when it is not needed  
anymore,
which gives significant overhead if you traverse a large data structure.


What I don't quite understand is how you know that your python wrapper
is the last reference to the GObject and your wrapper should not be
forcefully kept alive.

>
>   What happens currently, which is what I'm trying to change, is that
> there is a reference loop between PyObject and GObject, so that
> deallocation only happens with the help of the cyclic GC.  But relying
> on the GC for _everything_ causes annoying problems:

At one time I used Python's reference counts as the reference count  
of the
Objective-C object (ObjC's reference count management is done through  
method
calls and can therefore be overridden in subclasses). That did work, but
getting the semantics completely correct turned the code into a mess.  
Our
current solution is much more satisfying.

>
> 	1- The GC runs only once in a while, not soon enough if eg. you  
> have an
> image object with several megabytes;
>
> 	2- It makes it hard to debug reference counting bugs, as the symptom
> only appears when the GC runs, far away from the code that caused the
> problem in the first place;
>
> 	3- Generally the GC has a lot more work, since every PyGTK object  
> needs
> it, and a GUI app can have lots of PyGTK objects.
>
>   Regards.
>
> -- 
> Gustavo J. A. M. Carneiro
> <gjc at inescporto.pt> <gustavo at users.sourceforge.net>
> The universe is always one step beyond logic
>


From martin at v.loewis.de  Thu Nov 10 08:15:00 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 10 Nov 2005 08:15:00 +0100
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131576278.8540.14.camel@localhost.localdomain>
References: <1131536425.9130.10.camel@localhost>
	<437228E4.4070800@zope.com>	<1131556500.9130.18.camel@localhost>	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>	<1131558739.9130.40.camel@localhost>	<9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>
	<1131576278.8540.14.camel@localhost.localdomain>
Message-ID: <4372F374.4060709@v.loewis.de>

Gustavo J. A. M. Carneiro wrote:
>   OK, but what if it is a subclass of a builtin type, with instance
> variables?  What if the PyObject is GC'ed but the ObjC object remains
> alive, and later you get a new reference to it?  Do you create a new
> PyObject wrapper for it?  What happened to the instance variables?

Normally, wrappers don't have state. But if you do have state, this
is how it could work:

1. Make two Python objects, PyState and PyWrapper (actually,
    PyState doesn't need to be a Python object)
    PyState holds the instance variables, and PyWrapper just
    holds a pointer to a GObject.
2. When a Python reference to a GObject is created for the
    first time, create both a PyState and a PyWrapper. Have
    the GObject point to the PyState, and the PyWrapper to
    the GObject. Have the PyState weakly reference the
    PyWrapper.
3. When the refcount to the PyWrapper drops to zero, discard it.
4. When somebody asks for the data in the PyWrapper,
    go to the GObject, then to the PyState, and return the
    data from there.
5. When somebody wants a reference to a GObject which already
    has a PyState, check the weak reference to find out
    whether there is a PyWrapper already. If yes, return it;
    if not, create a new one (and weakly reference it).
6. When the GObject is discarded, drop the PyState as well.

This has the following properties:
1. There are no cyclic references for wrapping GObjects.
2. Weakly-referencing wrappers is supported; if there
    are no strong Python references to the wrapper,
    the wrapper goes away, and, potentially, the GObject
    as well.
3. The PyState object lives as long as the GObject.
4. Using "is" for GObjects/PyWrappers "works": there is
    at most one PyWrapper per GObject at any time.
5. id() of a GObject may change over time, if the wrapper
    becomes unreferenced and then recreated.
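
The scheme above can be modelled in pure Python to see the properties
in action.  This is only a sketch under stated assumptions: the
GObject class here is a stand-in for the C object, the attribute names
pystate, attrs and wrapper_ref are hypothetical, and wrap() plays the
role of "somebody wants a reference to a GObject".

```python
import weakref


class GObject:
    """Stand-in for the C-level GObject; it points at its PyState."""

    def __init__(self):
        self.pystate = None


class PyState:
    """Holds the instance variables and weakly references the wrapper."""

    def __init__(self):
        self.attrs = {}
        self.wrapper_ref = None


class PyWrapper:
    """Stateless proxy: only holds a pointer to the GObject."""

    __slots__ = ("_gobj", "__weakref__")

    def __init__(self, gobj):
        object.__setattr__(self, "_gobj", gobj)

    def __getattr__(self, name):
        # Step 4: go to the GObject, then to the PyState, for the data.
        try:
            return self._gobj.pystate.attrs[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self._gobj.pystate.attrs[name] = value


def wrap(gobj):
    """Return the live wrapper for gobj, creating one if needed."""
    state = gobj.pystate
    if state is None:                      # step 2: first reference
        state = gobj.pystate = PyState()
    wrapper = state.wrapper_ref() if state.wrapper_ref is not None else None
    if wrapper is None:                    # step 5: none alive, make a new one
        wrapper = PyWrapper(gobj)
        state.wrapper_ref = weakref.ref(wrapper)
    return wrapper
```

The only strong references run from wrapper to GObject to PyState, so
there is no cycle; dropping the last reference to a wrapper frees it
at once under reference counting, and a later wrap() call finds the
instance variables still in place on a fresh wrapper.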

Regards,
Martin

From martin at v.loewis.de  Thu Nov 10 08:26:51 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 10 Nov 2005 08:26:51 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <43729CAB.5070106@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
Message-ID: <4372F63B.70301@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
> At this point, I can't propose a specific modification yet because I 
> don't know the reasoning that went behind the original choice of Tk as 
> the default GUI toolkit for Python (and hence, I don't know if those 
> reasons are still valid today).

I don't know, either, but I guess that it was the only option as a
cross-platform GUI at the time when it was created.

> I can see one disadvantage (using Tk 
> limits our options to run an event loop for other Python extensions), 
> and I am trying to find out why Tk was deemed more appropriate than 
> other GUI toolkits anyway.

I don't think this is a disadvantage: my guess is that other GUI
toolkits share the very same problems. So even though it looks like
a limitation of Tkinter, it really is a fundamental limitation, and
Tk is not any worse than the competitors.

Also, I firmly believe that whatever your event processing
requirements are, there is a solution that meets all your end-user
needs. That solution might fail the requirement to be easy to implement
for you (IOW, it may take some work).

> So let me rephrase the question: What is the advantage of Tk in 
> comparison to other GUI toolkits?

It comes bundled with Python. If this sounds circular: It is. Whatever
the historical reasons for original inclusion were, I'm sure they
are not that important anymore. Today, what matters is that an actual
implementation of a GUI toolkit integration is actively being
maintained, in the Python core. This is something that is not the case
for any other GUI toolkit.

If you think it would be easy to change: it isn't. Somebody would
have to step forward and say "I will maintain it for the next 10
years". Removal of Tkinter would meet strong resistance, so it
would have to be maintained in addition to Tkinter. Nobody has
stepped forward making such an offer. For Tkinter, it's different:
because it *already* is part of Python, various maintainers fix
problems as they find them, and contributors contribute
improvements.

> Is it Mac availability? More advanced
> widget set? Installation is easier? Portability? 

These are all important, yes. But other GUI toolkits likely
have the same properties.

> Switching to a
> different GUI toolkit would break too much existing code?

Most definitely, yes. Switching would not be an option at all.
Another GUI toolkit would have to be an addition, not a replacement.

> I think that 
> having the answer to this will stimulate further development of 
> alternative GUI toolkits, which may give some future Python version a 
> toolkit at least as good as Tk, and one that doesn't interfere with 
> Python's event loop capabilities.

I personally don't think so. The task is just too huge for volunteers
to commit to.

Regards,
Martin

From Scott.Daniels at Acm.Org  Thu Nov 10 08:28:11 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Wed, 09 Nov 2005 23:28:11 -0800
Subject: [Python-Dev] to_int -- oops, one step missing for use.
In-Reply-To: <4372B377.6050806@Acm.Org>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
	<4372B377.6050806@Acm.Org>
Message-ID: <4372F68B.5050106@Acm.Org>

Well, wouldn't you know it.
I get the code right and mess up the directions.


Scott David Daniels wrote:
> if you build this module, I'd suggest using
> "from to_int import chomp" to get a function that works like int
> (producing a long when needed and so on).

Well, actually it is a bit more than that.
     "from to_int import chomp, _flag; _flag(1)"

This sets a flag to suppress the return of the length along
with the value from chomp.

--Scott David Daniels
Scott.Daniels at Acm.Org


From martin at v.loewis.de  Thu Nov 10 08:30:51 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 10 Nov 2005 08:30:51 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4372DA3A.8010206@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<4372B82C.9010800@canterbury.ac.nz>
	<4372DA3A.8010206@c2b2.columbia.edu>
Message-ID: <4372F72B.9060501@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
> It's not because it likes to be in charge, it's because there's no other 
> way to do it in Python.

As I said: this is simply not true.

> Tkinter is a special case among GUI toolkits because it is married to 
> Tcl. It doesn't just need to handle its GUI events, it also needs to run 
> the Tcl interpreter in between. 

That statement is somewhat deceiving: there isn't much interpreter to
run, really.

> Which is why Tkinter needs to be in 
> charge of the event loop. For other GUI toolkits, I don't see a reason 
> why they'd need their own event loop.

They need to fetch events from the operating system level, and dispatch
them to the widgets.

Regards,
Martin

From fredrik at pythonware.com  Thu Nov 10 09:04:24 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 10 Nov 2005 09:04:24 +0100
Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions
References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>
Message-ID: <dkuuu8$8i3$1@sea.gmane.org>

Brett Cannon wrote:

>I just finished fleshing out the dev FAQ
> (http://www.python.org/dev/devfaq.html) with questions covering what
> someone might need to know for regular usage.  If anyone thinks I
> didn't cover something I should have, let me know.

SVK!

</F> 




From ncoghlan at gmail.com  Thu Nov 10 10:11:50 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Nov 2005 19:11:50 +1000
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com>
References: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>	<1x1ppk6g.fsf@python.net>	<EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com>
	<79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com>
Message-ID: <43730ED6.1010807@gmail.com>

Paul Moore wrote:
> On 11/9/05, Bob Ippolito <bob at redivi.com> wrote:
>> On Nov 9, 2005, at 1:48 PM, Thomas Heller wrote:
>>
>>> Bob Ippolito <bob at redivi.com> writes:
>>>
>>>> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote:
>>>>
>>>>> It's a shame that
>>>>>
>>>>> 1)  there's no equivalent of "java -jar", i.e., "python -z
>>>>> FILE.ZIP", and
>>>> This should work on a few platforms:
>>>> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip
>>> It should, yes - but it doesn't: -m doesn't work with zipimport.
>> That's dumb, someone should fix that.  Is there a bug filed?
> 
> I did, a while ago. http://www.python.org/sf/1250389

Please consider looking at and commenting on PEP 328 - I got zero feedback 
when I wrote it, and basically assumed no-one else was bothered by the -m 
switch's fairly significant limitations (it went in close to the first Python 
2.4 alpha release, so we wanted to keep it simple).

The PEP and the associated patch currently only cover lifting the limitation 
against executing modules inside packages, but it should be possible to extend 
it to cover executing modules inside zip files as well (as you say, increasing 
use of eggs will only make the current limitations more annoying).

That discussion should probably happen on c.l.p, though. Cc me if you start 
one, and I can keep an eye on it through Google (I won't have time to 
participate actively, unfortunately :()

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Nov 10 10:15:14 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Nov 2005 19:15:14 +1000
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <43730ED6.1010807@gmail.com>
References: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>	<1x1ppk6g.fsf@python.net>	<EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com>	<79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com>
	<43730ED6.1010807@gmail.com>
Message-ID: <43730FA2.6070404@gmail.com>

Nick Coghlan wrote:
> Please consider looking at and commenting on PEP 328 - I got zero feedback 
> when I wrote it, and basically assumed no-one else was bothered by the -m 
> switch's fairly significant limitations (it went in close to the first Python 
> 2.4 alpha release, so we wanted to keep it simple).

Oops, that should be PEP 3*3*8. PEP 328 is something completely different. 
That'll teach me to post without checking the PEP number ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Nov 10 10:21:55 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Nov 2005 19:21:55 +1000
Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions
In-Reply-To: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>
References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>
Message-ID: <43731133.9000904@gmail.com>

Brett Cannon wrote:
> I just finished fleshing out the dev FAQ
> (http://www.python.org/dev/devfaq.html) with questions covering what
> someone might need to know for regular usage.  If anyone thinks I
> didn't cover something I should have, let me know.

Should the section "Developing on Windows" disappear now?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Nov 10 10:43:43 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Nov 2005 19:43:43 +1000
Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions
In-Reply-To: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>
References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>
Message-ID: <4373164F.8070606@gmail.com>

Brett Cannon wrote:
> I just finished fleshing out the dev FAQ
> (http://www.python.org/dev/devfaq.html) with questions covering what
> someone might need to know for regular usage.  If anyone thinks I
> didn't cover something I should have, let me know.

For question 1.2.10, I believe you also want:

   [miscellany]
   enable-auto-props = yes

so that "svn add" works properly.

Question 1.4.1 should cover the use of "svn diff" instead of "cvs diff" to 
make the patch.

On that note, we need to update the patch submission guidelines to point to 
SVN instead of CVS (those guidelines also still say context diffs are 
preferred to unified diffs, which I believe is no longer true).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From p.f.moore at gmail.com  Thu Nov 10 11:23:48 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 Nov 2005 10:23:48 +0000
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <43730ED6.1010807@gmail.com>
References: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com>
	<1x1ppk6g.fsf@python.net>
	<EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com>
	<79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com>
	<43730ED6.1010807@gmail.com>
Message-ID: <79990c6b0511100223l20d66617ka681223cf2fb7c0@mail.gmail.com>

On 11/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Paul Moore wrote:
> > On 11/9/05, Bob Ippolito <bob at redivi.com> wrote:
> >> On Nov 9, 2005, at 1:48 PM, Thomas Heller wrote:
> >>
> >>> Bob Ippolito <bob at redivi.com> writes:
> >>>
> >>>> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote:
> >>>>
> >>>>> It's a shame that
> >>>>>
> >>>>> 1)  there's no equivalent of "java -jar", i.e., "python -z
> >>>>> FILE.ZIP", and
> >>>> This should work on a few platforms:
> >>>> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip
> >>> It should, yes - but it doesn't: -m doesn't work with zipimport.
> >> That's dumb, someone should fix that.  Is there a bug filed?
> >
> > I did, a while ago. http://www.python.org/sf/1250389
>
> Please consider looking at and commenting on PEP 328 - I got zero feedback
> when I wrote it, and basically assumed no-one else was bothered by the -m
> switch's fairly significant limitations (it went in close to the first Python
> 2.4 alpha release, so we wanted to keep it simple).
>
> The PEP and the associated patch currently only cover lifting the limitation
> against executing modules inside packages, but it should be possible to extend
> it to cover executing modules inside zip files as well (as you say, increasing
> use of eggs will only make the current limitations more annoying).
>
> That discussion should probably happen on c.l.p, though. cc me if you start
> one, and I can keep an eye on it through Google (I won't have time to
> participate actively, unfortunately :()

I didn't respond simply because it seemed obvious that this should go
in, and I expected no debate. I assumed the only reason it didn't go
into 2.4 was because the issue came up too close to the release. Teach
me to assume, I guess...

FWIW, I'm +1 on PEP 338.
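
A minimal sketch of the zip case discussed above, assuming a Python recent
enough to ship the runpy module that grew out of PEP 338 (it did not exist
when this thread was written). The archive name bundle.zip and the module
name hello_zip are made up for illustration; runpy.run_module is the
library-level counterpart of the -m switch.

```python
import os
import runpy
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway archive containing one hypothetical module.
    archive = os.path.join(tmp, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("hello_zip.py", "greeting = 'hello from ' + __name__\n")

    # Putting the archive on sys.path lets zipimport locate the module.
    sys.path.insert(0, archive)
    try:
        # Run the module as a script, the way "python -m hello_zip" would.
        namespace = runpy.run_module("hello_zip", run_name="__main__")
    finally:
        sys.path.remove(archive)

print(namespace["greeting"])
```

On such a Python, "env PYTHONPATH=bundle.zip python -m hello_zip" should be
the command-line equivalent for an archive that stays on disk.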

Paul.

From gjc at inescporto.pt  Thu Nov 10 13:04:23 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Thu, 10 Nov 2005 12:04:23 +0000
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <4372F374.4060709@v.loewis.de>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
	<1131558739.9130.40.camel@localhost>
	<9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>
	<1131576278.8540.14.camel@localhost.localdomain>
	<4372F374.4060709@v.loewis.de>
Message-ID: <1131624263.4292.16.camel@localhost>

On Thu, 2005-11-10 at 08:15 +0100, "Martin v. Löwis" wrote:
> Gustavo J. A. M. Carneiro wrote:
> >   OK, but what if it is a subclass of a builtin type, with instance
> > variables?  What if the PyObject is GC'ed but the ObjC object remains
> > alive, and later you get a new reference to it?  Do you create a new
> > PyObject wrapper for it?  What happened to the instance variables?
> 
> Normally, wrappers don't have state. But if you do have state, this
> is how it could work:
> 
> 1. Make two Python objects, PyState and PyWrapper (actually,
>     PyState doesn't need to be a Python object)
>     PyState holds the instance variables, and PyWrapper just
>     holds a pointer to a GObject.
> 2. When a Python reference to a GObject is created for the
>     first time, create both a PyState and a PyWrapper. Have
>     the GObject point to the PyState, and the PyWrapper to
>     the GObject. Have the PyState weakly reference the
>     PyWrapper.
> 3. When the refcount to the PyWrapper drops to zero, discard it.
> 4. When somebody asks for the data in the PyWrapper,
>     go to the GObject, then to the PyState, and return the
>     data from there.
> 5. When somebody wants a reference to a GObject which already
>     has a PyState, check the weak reference to find out
>     whether there is a PyWrapper already. If yes, return it;
>     if not, create a new one (and weakly reference it).
> 6. When the GObject is discarded, drop the PyState as well.
> 
> This has the following properties:
> 1. There are no cyclic references for wrapping GObjects.
> 2. Weakly-referencing wrappers is supported; if there
>     are no strong Python references to the wrapper,
>     the wrapper goes away, and, potentially, the GObject
>     as well.
> 3. The PyState object lives as long as the GObject.
> 4. Using "is" for GObjects/PyWrappers "works": there is
>     at most one PyWrapper per GObject at any time.
> 5. id() of a GObject may change over time, if the wrapper
>     becomes unreferenced and then recreated.

  This was my first approach, actually, in patch 4.1 in [1].  Only your
property 2 above drove me to try a different approach -- the weakrefs
may become invalid while the GObject may still be alive.  That's a bit
"surprising".  Of course, if I could override weakref.ref() for GObject
wrapper types, even that could be worked around... ;-)
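
  For readers following along, Martin's six steps can be modelled in pure
Python roughly as follows. This is only a sketch: the GObject class here is
a hypothetical stand-in for the real C-level object, and get_wrapper is an
invented helper name, not part of any binding's API.

```python
import weakref


class GObject:
    """Hypothetical stand-in for the native object being wrapped."""

    def __init__(self):
        self.py_state = None  # set when the first wrapper is created


class PyState:
    """Holds instance variables; lives as long as the GObject does."""

    def __init__(self):
        self.attrs = {}
        self.wrapper_ref = None  # weak reference to the current PyWrapper


class PyWrapper:
    """Stateless proxy: all attribute data is stored in the PyState."""

    def __init__(self, gobj):
        object.__setattr__(self, "_gobj", gobj)

    def __getattr__(self, name):
        # Only called when normal lookup fails, so "_gobj" never lands here.
        try:
            return self._gobj.py_state.attrs[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._gobj.py_state.attrs[name] = value


def get_wrapper(gobj):
    """Return the live wrapper for gobj, creating one if necessary."""
    state = gobj.py_state
    if state is None:
        state = gobj.py_state = PyState()
    wrapper = state.wrapper_ref() if state.wrapper_ref is not None else None
    if wrapper is None:
        wrapper = PyWrapper(gobj)
        state.wrapper_ref = weakref.ref(wrapper)
    return wrapper


gobj = GObject()
wrapper = get_wrapper(gobj)
wrapper.colour = "red"
same_wrapper = get_wrapper(gobj) is wrapper  # "is" works while it is alive
del wrapper                                  # last strong reference goes away
revived = get_wrapper(gobj)                  # a new wrapper, same state
print(same_wrapper, revived.colour)
```

  The sketch also shows the surprise described above: any weakref taken on
the first wrapper is dead after the "del", even though the GObject and its
state are still around.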

  Thanks,

[1] http://bugzilla.gnome.org/show_bug.cgi?id=320428
-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.


From gjc at inescporto.pt  Thu Nov 10 13:12:58 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Thu, 10 Nov 2005 12:12:58 +0000
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <43729B07.6010907@canterbury.ac.nz>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
	<1131558739.9130.40.camel@localhost>
	<9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>
	<1131576278.8540.14.camel@localhost.localdomain>
	<43729B07.6010907@canterbury.ac.nz>
Message-ID: <1131624778.4292.22.camel@localhost>

On Thu, 2005-11-10 at 13:57 +1300, Greg Ewing wrote:
> Gustavo J. A. M. Carneiro wrote:
> 
> >   OK, but what if it is a subclass of a builtin type, with instance
> > variables?  What if the PyObject is GC'ed but the ObjC object remains
> > alive, and later you get a new reference to it?  Do you create a new
> > PyObject wrapper for it?  What happened to the instance variables?
> 
> Your proposed scheme appears to involve destroying and
> then re-initialising the Python wrapper. Isn't that
> going to wipe out any instance variables it may
> have had?

  The object isn't really destroyed.  Simply ob_refcnt drops to zero,
then tp_dealloc is called, which is supposed to destroy it.  But since I
wrote tp_dealloc, I choose not to destroy it, and revive it by calling
PyObject_Init(), which makes ob_refcnt == 1 again, among other things.

> 
> Also, it seems to me that as soon as the refcount on
> the wrapper drops to zero, any weak references to it
> will be broken. Or does your resurrection code
> intervene before that happens?

  Yes, I intervene before that happens.

  Regards.

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.


From abo at minkirri.apana.org.au  Thu Nov 10 14:47:00 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Thu, 10 Nov 2005 13:47:00 +0000
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4372DD5F.70203@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>
	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
Message-ID: <1131630420.12077.44.camel@warna.corp.google.com>

On Thu, 2005-11-10 at 00:40 -0500, Michiel Jan Laurens de Hoon wrote:
> Stephen J. Turnbull wrote:
> 
> >    Michiel> What is the advantage of Tk in comparison to other GUI
> >    Michiel> toolkits?
[...]
> My application doesn't need a toolkit at all. My problem is that because 
> of Tkinter being the standard Python toolkit, we cannot have a decent 
> event loop in Python. So this is the disadvantage I see in Tkinter.
[...]

I'm kind of surprised no-one has mentioned Twisted in this thread.

Twisted is an async-framework that I believe has support for using a
variety of different event-loops, including Tkinter and wxWidgets, as
well as its own.

It has been heavily re-factored many times, so if you want to see the
current Python "state of the art" way of doing this, I'd be having a
look at what they are doing.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/


From guido at python.org  Thu Nov 10 17:50:40 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 10 Nov 2005 08:50:40 -0800
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4372DD5F.70203@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
Message-ID: <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>

On 11/9/05, Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> wrote:
> My application doesn't need a toolkit at all. My problem is that because
> of Tkinter being the standard Python toolkit, we cannot have a decent
> event loop in Python. So this is the disadvantage I see in Tkinter.

That's a non-sequitur if I ever saw one. Who gave you that idea? There
is no connection.

(If there's *any* reason for Python not having a standard event loop
it's probably because I've never needed one.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mdehoon at c2b2.columbia.edu  Thu Nov 10 18:16:36 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Thu, 10 Nov 2005 12:16:36 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4372F72B.9060501@v.loewis.de>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<4372B82C.9010800@canterbury.ac.nz>
	<4372DA3A.8010206@c2b2.columbia.edu> <4372F72B.9060501@v.loewis.de>
Message-ID: <43738074.2030508@c2b2.columbia.edu>

Martin v. Löwis wrote:

> Michiel Jan Laurens de Hoon wrote:
>
>> It's not because it likes to be in charge, it's because there's no 
>> other way to do it in Python.
>
> As I said: this is simply not true.

You are right in the sense it is possible to get events handled using 
the solutions you proposed before (sorry for not responding to those 
earlier). But I don't believe that these are very good solutions:

> You are missing multi-threading, which is the widely used
> approach to doing things simultaneously in a single process. In one
> thread, user interaction can occur; in another, computation. If you need
> non-blocking interaction between the threads, use queues, or other
> global variables. If you have other event sources, deal with them
> in separate threads.

The problem with threading (apart from potential portability problems) 
is that Python doesn't let us know when it's idle. This would cause 
excessive repainting (I can give you an explicit example if you're 
interested).
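
For reference, the thread-plus-queue arrangement Martin describes might look
like the sketch below. The names compute and ui_loop are invented for
illustration, and the queue module is spelled Queue on the Python 2 releases
current at the time of this thread. It does not address the
idle-notification concern; it only shows the basic hand-off between a
computation thread and the thread doing the drawing.

```python
import queue
import threading


def compute(n, results):
    """Worker thread: produce values and hand them over via the queue."""
    for i in range(n):
        results.put(i * i)
    results.put(None)  # sentinel: no more work


def ui_loop(results):
    """Interaction thread: block on the queue and "repaint" per item."""
    painted = []
    while True:
        item = results.get()
        if item is None:
            break
        painted.append(item)  # a real application would redraw here
    return painted


results = queue.Queue()
worker = threading.Thread(target=compute, args=(3, results))
worker.start()
painted = ui_loop(results)
worker.join()
print(painted)
```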

But there is another solution with threads: Can we let Tkinter run in a 
separate thread instead?

> Yes, it is possible to get event loops with Tkinter. At least on Unix,
> you can install a file handler into the Tk event loop (through
> createfilehandler), which gives you callbacks whenever there is some
> activity on the files.

This works, but only if Tkinter is installed, and even then it will give 
poor performance due to the busy-loop with 20 ms sleep in between in 
Tkinter. Furthermore, this will not work with IDLE, because the Python 
thread that handles user commands never enters the Tkinter event loop, 
even if we import Tkinter. AFAIK, there is no easy solution to this.

> Furthermore, it is possible to turn the event loop around, by doing
> dooneevent explicitly. 

Here, the problem is that we don't know *when* to call dooneevent, so 
we'd have to do a busy-loop and sleep in between.

>> Tkinter is a special case among GUI toolkits because it is married to 
>> Tcl. It doesn't just need to handle its GUI events, it also needs to 
>> run the Tcl interpreter in between. 
>
> That statement is somewhat deceiving: there isn't much interpreter to
> run, really.

I may be wrong here, but I'd think that it would be dangerous to run 
Tkinter's event loop when one thread is waiting for another (as happens 
in IDLE).

>> Which is why Tkinter needs to be in charge of the event loop. For 
>> other GUI toolkits, I don't see a reason why they'd need their own 
>> event loop.
>
> They need to fetch events from the operating system level, and dispatch
> them to the widgets. 

This is a perfect task for an event loop located in Python, instead of 
in an extension module. I could write a prototype event loop for Python 
to demonstrate how this would work.
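
As a rough idea of what such a Python-level loop could look like, here is a
sketch assuming a modern Python with the selectors module. The EventLoop
class and its method names are hypothetical, not an existing or proposed
stdlib API; a real version would also need timers and idle callbacks.

```python
import selectors
import socket


class EventLoop:
    """Hypothetical minimal loop: dispatch callbacks for readable files."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._running = False

    def add_reader(self, fileobj, callback):
        # The callback is stored as the selector key's data.
        self._selector.register(fileobj, selectors.EVENT_READ, callback)

    def remove_reader(self, fileobj):
        self._selector.unregister(fileobj)

    def stop(self):
        self._running = False

    def run(self):
        self._running = True
        # Block in select() instead of polling with sleeps in between.
        while self._running and self._selector.get_map():
            for key, _events in self._selector.select():
                key.data(key.fileobj)


left, right = socket.socketpair()
received = []
loop = EventLoop()


def on_readable(sock):
    received.append(sock.recv(1024))
    loop.stop()


loop.add_reader(right, on_readable)
left.sendall(b"redraw")  # stands in for an event arriving from a GUI
loop.run()
loop.remove_reader(right)
left.close()
right.close()
print(received)
```

The point of the sketch is that the loop sleeps inside select() until
something happens, so no 20 ms busy-loop is needed.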

Sorry if I'm sounding negative, but we've actually considered many 
different things to get the event loop working for our scientific 
visualization software, and we were never able to come up with a 
satisfactory scheme within the current Python framework. Other packages 
have run into the same problem (e.g. matplotlib, which now recommends 
using the interactive ipython instead of regular python; the python 
extension for the Rasmol protein viewer is another).

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From pje at telecommunity.com  Thu Nov 10 18:46:34 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 Nov 2005 12:46:34 -0500
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport  hooks
In-Reply-To: <ca471dc20511091633m4b7869b7jc3bd847436f452ab@mail.gmail.com>
References: <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
	<20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>

At 04:33 PM 11/9/2005 -0800, Guido van Rossum wrote:
>On 11/9/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> > By the way, while we're on this subject, can we make the optimization
> > options be part of the compile() interface?  Right now the distutils has to
> > actually exec another Python process whenever you want to compile
> > code with
> > a different optimization level than what's currently in effect, whereas if
> > it could pass the desired level to compile(), this wouldn't be necessary.
>
>Makes sense to me; we need a patch of course.

But before we can do that, it's not clear to me if it should be part of the 
existing "flags" argument, or whether it should be separate.  Similarly, 
whether it's just going to be a level or an optimization bitmask in its own 
right might be relevant too.

For the current use case, obviously, a level argument suffices, with 'None' 
meaning "whatever the command-line level was" for backward 
compatibility.  And I guess we could go with that for now easily enough, 
I'd just like to know whether any of the AST or optimization mavens had 
anything they were planning in the immediate future that might affect how 
the API addition should be structured.
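
To make the "level argument" option concrete, here is a sketch that assumes
a Python whose compile() accepts an optimize keyword. Python 3.2 and later
do; nothing like it existed when this message was written. There the default
is -1 rather than None, meaning "use the interpreter's own level". The
run_at_level helper is invented for the example.

```python
source = "assert False, 'assertion ran'\nresult = 'done'\n"


def run_at_level(level):
    """Compile and run the snippet at an explicit optimization level."""
    code = compile(source, "<example>", "exec", optimize=level)
    namespace = {}
    exec(code, namespace)
    return namespace["result"]


try:
    run_at_level(0)          # level 0 keeps assert statements
    outcome = "no error"
except AssertionError:
    outcome = "assertion fired"

optimized = run_at_level(1)  # level 1 strips them, as -O would
print(outcome, optimized)
```

With an argument like this, a build tool can produce bytecode for another
level without spawning a second interpreter.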


From guido at python.org  Thu Nov 10 18:53:28 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 10 Nov 2005 09:53:28 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
	<5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>
Message-ID: <ca471dc20511100953l2f1f2748s1d721782cb12c53c@mail.gmail.com>

On 11/10/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 04:33 PM 11/9/2005 -0800, Guido van Rossum wrote:
> >On 11/9/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > By the way, while we're on this subject, can we make the optimization
> > > options be part of the compile() interface?  Right now the distutils has to
> > > actually exec another Python process whenever you want to compile
> > > code with
> > > a different optimization level than what's currently in effect, whereas if
> > > it could pass the desired level to compile(), this wouldn't be necessary.
> >
> >Makes sense to me; we need a patch of course.
>
> But before we can do that, it's not clear to me if it should be part of the
> existing "flags" argument, or whether it should be separate.  Similarly,
> whether it's just going to be a level or an optimization bitmask in its own
> right might be relevant too.
>
> For the current use case, obviously, a level argument suffices, with 'None'
> meaning "whatever the command-line level was" for backward
> compatibility.  And I guess we could go with that for now easily enough,
> I'd just like to know whether any of the AST or optimization mavens had
> anything they were planning in the immediate future that might affect how
> the API addition should be structured.

I'm not a big user of this API, please design as you see fit.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Thu Nov 10 18:55:19 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 Nov 2005 12:55:19 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <1131630420.12077.44.camel@warna.corp.google.com>
References: <4372DD5F.70203@c2b2.columbia.edu>
	<437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
Message-ID: <5.1.1.6.0.20051110125301.02b2d318@mail.telecommunity.com>

At 01:47 PM 11/10/2005 +0000, Donovan Baarda wrote:
>Twisted is an async-framework that I believe has support for using a
>variety of different event-loops, including Tkinter and wxWidgets, as
>well as its own.

Technically, it just gives Tkinter a chance to run every so often; you 
specifically *can't* use Tkinter's event loop.  Instead, you run the 
Twisted event loop after telling it that you'd like Tkinter to be kept in 
the loop, as it were.

But Twisted is definitely worth looking at for this sort of thing.  It's 
the nearest thing to a "standard Python event loop" that exists, apart from 
the asyncore stuff in the stdlib (which doesn't have any GUI support AFAIK).


From mdehoon at c2b2.columbia.edu  Thu Nov 10 19:07:17 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Thu, 10 Nov 2005 13:07:17 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>	
	<43729CAB.5070106@c2b2.columbia.edu>	
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
Message-ID: <43738C55.60509@c2b2.columbia.edu>

Guido van Rossum wrote:

>On 11/9/05, Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> wrote:
>
>>My application doesn't need a toolkit at all. My problem is that because
>>of Tkinter being the standard Python toolkit, we cannot have a decent
>>event loop in Python. So this is the disadvantage I see in Tkinter.
>
>That's a non-sequitur if I ever saw one. Who gave you that idea? There is no connection.
>
I have come to this conclusion after several years of maintaining a
scientific plotting package and trying to set up an event loop for it.
Whereas there are some solutions that more or less work, none of them
work very well, and the solutions that we found tend to break. Other
visualization packages are struggling with the same problem. I'm trying
the best I can to explain in my other posts why I feel that Tkinter is
the underlying reason, and why it would be difficult to solve.

>(If there's *any* reason for Python not having a standard event loop
>it's probably because I've never needed one.)
>
It's probably because we have gotten away with piggy-backing on Tcl's 
event loop for so long.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From mcherm at mcherm.com  Thu Nov 10 19:12:03 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu, 10 Nov 2005 10:12:03 -0800
Subject: [Python-Dev] (no subject)
Message-ID: <20051110101203.2v6dz00ya8ogs08o@login.werra.lunarpages.com>

Sokolov Yura writes:
> Excuse my English

No problem. Your command of English probably exceeds my command of any
other language.

> I think, we could just segregate tokens for decimal and real float and
> make them interoperable.
>    Most of us works with business databases - all "floats" are really
> decimals, algebraic operations
> should work without float inconsistency and those operations rare so
> speed is not important.
> But some of us use floats for speed in scientific and multimedia programs.

I'm not sure why you say "most" (have you seen some surveys of Python
programmers that I haven't seen?), but I think we all agree that there
are Python users who rarely have a need for machine floats, and others
who badly need them.

I'll take your specific suggestions out of order:
> with "from __future__ import Decimal" we could:
> c) result of operation with decimal operands should be decimal
>  >>> 1.0/3.0
> 0.33333333333333333

This already works.

> d) result of operation with float operands should be float
>  >>> 1.0f/3.0f
> 0.33333333333333331f

This already works.

> e) result of operation with decimal and float should be float (decimal
> converts into float and operation performed)
>  >>> 1.0f/3.0
> 0.33333333333333331f
>  >>> 1.0/3.0f
> 0.33333333333333331f

Mixing Decimal and float is nearly ALWAYS a user error. Doing it correctly
requires significant expertise in the peculiarities of floating point
representations. So Python protects the user by throwing exceptions when
attempts are made to mix Decimal and floats. This is the desired
behavior (and the experts already know how to work around it in the RARE
occasions when they need to).
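
A short sketch of that behavior with the decimal module; the variable names
are only for illustration, and going through repr() is one possible explicit
conversion, not the only one.

```python
from decimal import Decimal

# Decimal with Decimal stays Decimal.
third = Decimal("1.0") / Decimal("3.0")

# Decimal with float is refused rather than silently converted.
try:
    Decimal("1.0") / 3.0
    mixed = "allowed"
except TypeError:
    mixed = "TypeError"

# Mixing on purpose means converting explicitly first.
explicit = Decimal("1.0") / Decimal(repr(3.0))

print(third, mixed, explicit == third)
```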

> a) interpret regular float constants as decimal
> b) interpret float constants with suffix 'f' as float (like    1.5f
> 345.2e-5f  etc)

There are two different ideas here, which I will separate. The first
is a proposal that there be a way to provide Decimal literals. The second
proposal is that the ###.### be the literal for Decimals and that
###.###f be the literal for floats.

I'm in favor of the first idea. Decimals are useful enough that it would
be a good idea to provide some sort of literal for their use. This is
well worth a PEP. But if we DO agree that we ought to have literals for
both floats and Decimals, then we also need to decide which gets the
coveted "unadorned decimal literal" (ie, ###.###). Performance argues
in favor of floats (they run *MUCH* faster). Usability (particularly
for beginners) argues in favor of Decimals (they sometimes still have
surprising behavior, but less often than with binary floats). And
backward compatibility argues in favor of floats. Myself, I'm an
"expert" user (at least to this extent) and I could easily handle
either choice. If others felt like me, then it's likely that the
backward compatibility argument and the need to fight the pervasive
meme that "Python is slow" will win the day.

-- Michael Chermside


From tzot at mediconsa.com  Thu Nov 10 20:29:45 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Thu, 10 Nov 2005 21:29:45 +0200
Subject: [Python-Dev] Building Python with Visual C++ 2005 Express Edition
Message-ID: <dl073b$aro$1@sea.gmane.org>

I didn't see any mention of this product in the Python-Dev list, so I 
thought to let you know.

http://msdn.microsoft.com/vstudio/express/visualc/download/

There is also a link for a CD image (.img) file to download.

I am downloading now, so I don't know yet whether Python compiles with it 
without any problems.  So if anyone has previous experience, please reply.

PS
This page ( 
http://msdn.microsoft.com/vstudio/express/support/faq/default.aspx#pricing ) 
says that if you download it until Nov 7, 2006, it's a gift --the Microsoft 
VC++ compiler for free (perhaps a cut-down version).

Bits from the FAQ: 
http://msdn.microsoft.com/vstudio/express/support/faq/default.aspx

4. Can be used even for commercial products without licensing restrictions
40. It includes the optimizing compiler (without stuff like Profile Guided 
Optimizations)
41. Builds both native and managed applications (you 99% need to download 
the SDK too)
42. No MFC or ATL included



From bcannon at gmail.com  Thu Nov 10 20:36:51 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 10 Nov 2005 11:36:51 -0800
Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions
In-Reply-To: <43731133.9000904@gmail.com>
References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>
	<43731133.9000904@gmail.com>
Message-ID: <bbaeab100511101136q56ae01a2t29379079a933fc3b@mail.gmail.com>

On 11/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote:
> > I just finished fleshing out the dev FAQ
> > (http://www.python.org/dev/devfaq.html) with questions covering what
> > someone might need to know for regular usage.  If anyone thinks I
> > didn't cover something I should have, let me know.
>
> Should the section "Developing on Windows" disappear now?
>

Well, the whole dev doc section needs cleaning up and that includes
the dev FAQ.  I was planning on doing this at some point; might as
well start talking about it now.

In my mind, the steps in each of the major things to do (bugs and
patches) need better docs.  With that fleshed out, Intro to
Development can act as an overview of the process.  This should,
together with the dev FAQ, cover what someone needs to do dev work.

The question is how to structure the bug/patch guidelines.  There are
two options: dev FAQ entries much like the svn section or a more
classic layout of the info.  Both would have a bulleted list of the
steps necessary for a bug/patch.  The question is whether the
information is presented in paragraphs of text following the bulleted
list or as a list of questions.  What do people prefer?

-Brett

From martin at v.loewis.de  Thu Nov 10 20:40:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 10 Nov 2005 20:40:04 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <43738074.2030508@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<4372B82C.9010800@canterbury.ac.nz>	<4372DA3A.8010206@c2b2.columbia.edu>
	<4372F72B.9060501@v.loewis.de> <43738074.2030508@c2b2.columbia.edu>
Message-ID: <4373A214.6060201@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
>>You are missing multi-threading, which is the widely used
>>approach to doing things simultaneously in a single process.
> 
> The problem with threading (apart from potential portability problems) 
> is that Python doesn't let us know when it's idle. This would cause 
> excessive repainting (I can give you an explicit example if you're 
> interested).

I don't understand how these are connected: why do you need to know
when Python is idle for multi-threaded applications, and why does not
knowing that it is idle cause massive repainting?

Not sure whether an explicit example would help, though; one would
probably need to understand a lot of details of your application. Giving
a simplified version of the example might help (which would do 'print
"Repainting"' instead of actually repainting).

> But there is another solution with threads: Can we let Tkinter run in a 
> separate thread instead?

Yes, you can. Actually, Tkinter *always* runs in a separate thread 
(separate from all other threads).

> This works, but only if Tkinter is installed, and even then it will give 
> poor performance due to the busy-loop with 20 ms sleep in between in 
> Tkinter. Furthermore, this will not work with IDLE, because the Python 
> thread that handles user commands never enters the Tkinter event loop, 
> even if we import Tkinter. AFAIK, there is no easy solution to this.

Here I'm losing track. What is "this" which is no easy solution for?
Why do you need a callback when Python is idle in the first place?

> I may be wrong here, but I'd think that it would be dangerous to run 
> Tkinter's event loop when one thread is waiting for another (as happens 
> in IDLE).

I don't understand. Threads don't wait for each other. Threads wait for
events (which might be generated by some other thread, of course).
However, there is no problem to run the Tkinter event loop when some
unrelated thread is blocked.

> Sorry if I'm sounding negative, but we've actually considered many 
> different things to get the event loop working for our scientific 
> visualization software, and we were never able to come up with a 
> satisfactory scheme within the current Python framework.

I really don't see what the problem is. Why does the visualization
framework care that Tkinter is around? What are the events that the
visualization framework needs to process?

Regards,
Martin

From martin at v.loewis.de  Thu Nov 10 20:44:00 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 10 Nov 2005 20:44:00 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <43738C55.60509@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>
	<43710C95.30209@v.loewis.de>		<43729CAB.5070106@c2b2.columbia.edu>		<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>		<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu>
Message-ID: <4373A300.3080501@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
> I have come to this conclusion after several years of maintaining a
> scientific plotting package and trying to set up an event loop for
> it. Whereas there are some solutions that more or less work, none of
> them work very well, and the solutions that we found tend to break.
> Other visualization packages are struggling with the same problem.

As you can see, the problem is not familiar to anybody reading
python-dev.

> I'm trying the best I can to explain in my other posts why I feel
> that Tkinter is the underlying reason, and why it would be difficult
> to solve.

Before trying to explain the reason, please try to explain the
problem first. What is it *really* that you want to do which
you feel you currently can't do?

Regards,
Martin


From martin at v.loewis.de  Thu Nov 10 20:47:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 10 Nov 2005 20:47:04 +0100
Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions
In-Reply-To: <43731133.9000904@gmail.com>
References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>
	<43731133.9000904@gmail.com>
Message-ID: <4373A3B8.3090402@v.loewis.de>

Nick Coghlan wrote:
> Should the section "Developing on Windows" disappear now?

I think so, yes (along with the document it refers to).

Regards,
Martin

From martin at v.loewis.de  Thu Nov 10 20:52:16 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 10 Nov 2005 20:52:16 +0100
Subject: [Python-Dev] Building Python with Visual C++ 2005 Express
	Edition
In-Reply-To: <dl073b$aro$1@sea.gmane.org>
References: <dl073b$aro$1@sea.gmane.org>
Message-ID: <4373A4F0.7010202@v.loewis.de>

Christos Georgiou wrote:
> I didn't see any mention of this product in the Python-Dev list, so I 
> thought to let you know.
> 
> http://msdn.microsoft.com/vstudio/express/visualc/download/
> 
> There is also a link for a CD image (.img) file to download.
> 
> I am downloading now, so I don't know yet whether Python compiles with it 
> without any problems.  So if anyone has previous experience, please reply.

I don't have previous experience, but I think it likely shares the
issues that VS.NET 2005 has with the current code:
1. the project files are for VS.NET 2003. In theory, conversion to
    the new format is supported, but I don't know whether this conversion
    works flawlessly.
2. MS broke ISO C conformance in VS.NET 2005 in a way that affects
    Python's signal handling. There is a patch on SF which addresses
    the issue, but that hasn't been checked in yet.

Regards,
Martin

From bcannon at gmail.com  Thu Nov 10 22:38:50 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 10 Nov 2005 13:38:50 -0800
Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions
In-Reply-To: <4373164F.8070606@gmail.com>
References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com>
	<4373164F.8070606@gmail.com>
Message-ID: <bbaeab100511101338v59de0200o3c28f150958457c0@mail.gmail.com>

On 11/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote:
> > I just finished fleshing out the dev FAQ
> > (http://www.python.org/dev/devfaq.html) with questions covering what
> > someone might need to know for regular usage.  If anyone thinks I
> > didn't cover something I should have, let me know.
>
> For question 1.2.10, I believe you also want:
>
>    [miscellany]
>    enable-auto-props = yes
>
> so that "svn add" works properly.

Added.  Missed that I had it in my personal config.  =)

>
> Question 1.4.1 should cover the use of "svn diff" instead of "cvs diff" to
> make the patch.
>

Changed.

> On that note, we need to update the patch submission guidelines to point to
> SVN instead of CVS (those guidelines also still say context diffs are
> preferred to unified diffs, which I believe is no longer true).
>

Fixed and fixed.

-Brett

From mhammond at skippinet.com.au  Thu Nov 10 22:49:20 2005
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Fri, 11 Nov 2005 08:49:20 +1100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <43738C55.60509@c2b2.columbia.edu>
Message-ID: <DAELJHBGPBHPJKEBGGLNIEBBICAD.mhammond@skippinet.com.au>

Michiel wrote:
> Guido van Rossum wrote:
>
> >On 11/9/05, Michiel Jan Laurens de Hoon
> <mdehoon at c2b2.columbia.edu> wrote:
> >
> >
> >>My application doesn't need a toolkit at all. My problem is that because
> >>of Tkinter being the standard Python toolkit, we cannot have a decent
> >>event loop in Python. So this is the disadvantage I see in Tkinter.
> >>
> >>
> >
> >That's a non-sequitur if I ever saw one. Who gave you that idea?
> There is no connection.
> >
> I have come to this conclusion after several years of maintaining
> a scientific plotting package and trying to set up an event loop
> for it. Whereas there are some solutions that more or less work,
> none of them work very well, and the solutions that we found tend
> to break. Other visualization packages are struggling with the
> same problem. I'm trying the best I can to explain in my other
> posts why I feel that Tkinter is the underlying reason, and why
> it would be difficult to solve.

I believe this problem all boils down to this paragraph from the first mail
on this topic:

: Currently, event loops are available in Python via PyOS_InputHook, a
: pointer to a user-defined function that is called when Python is idle
: (waiting for user input). However, an event loop using PyOS_InputHook
: has some inherent limitations, so I am thinking about how to improve
: event loop support in Python.

Either we have an unusual definition of "event loop" (as many many other
toolkits have implemented event loops without PyOS_InputHook), or the
requirement is for an event loop that plays nicely with the "interactive
loop" in Python.exe.

Assuming the latter, I would suggest simply not trying to do that!  Look at
the "code" module for a way you can create your own interactive loop that
plays nicely with your event loop (rather than trying to do it the other way
around).
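
A minimal sketch of that idea, assuming a toolkit that exposes some
"process pending events" callable (pump_events below is a hypothetical
stand-in, and the scripted input lines exist only to keep the example
self-contained). It subclasses code.InteractiveConsole and gives the
event loop a turn every time the console asks for input:

```python
import code


class EventLoopConsole(code.InteractiveConsole):
    """Interactive console that pumps an event loop before each prompt."""

    def __init__(self, pump_events, lines, locals=None):
        super().__init__(locals)
        self._pump_events = pump_events  # hypothetical toolkit callback
        self._lines = iter(lines)        # scripted input, for illustration

    def raw_input(self, prompt=""):
        # Let the GUI handle its pending events, then fetch the next line.
        self._pump_events()
        try:
            return next(self._lines)
        except StopIteration:
            # InteractiveConsole.interact() stops on EOFError.
            raise EOFError


namespace = {}
pumped = []
console = EventLoopConsole(lambda: pumped.append(1),
                           ["x = 1 + 1", "y = x * 3"],
                           namespace)
console.interact(banner="", exitmsg="")
```

A real console would read from the terminal or a GUI widget instead of
a list; the point is only that the console, not the interpreter's
built-in prompt, decides when the event loop runs.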

Otherwise, I suggest you get very specific about what this event loop should
do.  From a previous mail in this thread (an exchange between you and
Martin):

> >> Which is why Tkinter needs to be in charge of the event loop. For
> >> other GUI toolkits, I don't see a reason why they'd need their own
> >> event loop.
> >
> > They need to fetch events from the operating system level, and dispatch
> > them to the widgets.

> This is a perfect task for an event loop located in Python, instead of
> in an extension module.

I believe the point Martin was trying to make is that we have 2 "unknown"
quantities here - the "operating system" and the "widgets".  Each OS
delivers raw GUI events differently, and each GUI framework consumes and
generates events differently.  I can't see what a single event loop would
look like.  Even on Windows there is no single, standard "event loop"
construct - MFC and VB apps both have custom message loops.  Mozilla XUL
applications (which are very close to being able to be written in Python
<wink>) have an event loop that could not possibly be expressed in Python -
but they do expose a way to *call* their standard event loop (which is quite
a different thing - you are asking to *implement* it.)

> I could write a prototype event loop for Python
> to demonstrate how this would work.

I think that would be the best way forward - this may all simply be one big
misunderstanding <wink>.  The next step after that would be to find even one
person who currently uses an event-loop based app, and for whom your event
loop would work.

Mark.


From ulrich.berning at desys.de  Fri Nov 11 14:32:21 2005
From: ulrich.berning at desys.de (Ulrich Berning)
Date: Fri, 11 Nov 2005 14:32:21 +0100
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport  hooks
In-Reply-To: <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>
References: <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>	<20051109023347.GA15823@localhost.localdomain>	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>	<5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
	<5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>
Message-ID: <43749D65.4040001@desys.de>

Phillip J. Eby wrote:

>At 04:33 PM 11/9/2005 -0800, Guido van Rossum wrote:
>  
>
>>On 11/9/05, Phillip J. Eby <pje at telecommunity.com> wrote:
>>    
>>
>>>By the way, while we're on this subject, can we make the optimization
>>>options be part of the compile() interface?  Right now the distutils has to
>>>actually exec another Python process whenever you want to compile
>>>code with
>>>a different optimization level than what's currently in effect, whereas if
>>>it could pass the desired level to compile(), this wouldn't be necessary.
>>>      
>>>
>>Makes sense to me; we need a patch of course.
>>    
>>
>
>But before we can do that, it's not clear to me if it should be part of the 
>existing "flags" argument, or whether it should be separate.  Similarly, 
>whether it's just going to be a level or an optimization bitmask in its own 
>right might be relevant too.
>
>For the current use case, obviously, a level argument suffices, with 'None' 
>meaning "whatever the command-line level was" for backward 
>compatibility.  And I guess we could go with that for now easily enough, 
>I'd just like to know whether any of the AST or optimization mavens had 
>anything they were planning in the immediate future that might affect how 
>the API addition should be structured.
>
>  
>
I'm using a totally different approach for the above problem. I have 
implemented two functions in the sys module, that make the startup flags 
accessible at runtime. This also solves some other problems I had, as 
you will see in the examples below:


The first function makes most of the flags readable (I have omitted the
flags that are documented as deprecated in the code):

sys.getrunflag(name) -> integer

Return one of the interpreter run flags. Possible names are 'Optimize', 
'Verbose', 'Interactive', 'IgnoreEnvironment', 'Debug', 
'DivisionWarning', 'NoSite', 'NoZipImport', 'UseClassExceptions', 
'Unicode', 'Frozen', 'Tabcheck'. getrunflag('Optimize') for example 
returns the current value of Py_OptimizeFlag.


The second function makes a few flags writable:

sys.setrunflag(name, value) -> integer

Set an interpreter run flag. The only flags that can be changed at 
runtime are Py_VerboseFlag ('Verbose') and Py_OptimizeFlag ('Optimize'). 
Returns the previous value of the flag.


As you can see, I have also introduced the new flag Py_NoZipImport that 
can be activated with -Z at startup. This bypasses the activation of 
zipimport and is very handy, if you edit modules stored in the 
filesystem, that are normally imported from a zip archive and you want 
to test your modifications. With this flag, there is no need to delete, 
rename or update the zip archive or to modify sys.path to ensure that 
your changed modules are imported from the filesystem and not from the 
zip archive.


And here are a few usable examples for the new functions:

1.)  You have an application, that does a huge amount of imports and 
some of them are mysterious, so you want to track them in verbose mode. 
You could start python with -v or -vv, but then you get hundreds or 
thousands of lines of output. Instead, you can do the following:

import sys
import ...
import ...
oldval = sys.setrunflag('Verbose', 1) # -v, use 2 for -vv
import ...
import ...
sys.setrunflag('Verbose', oldval)
import ...
import ...

Now, you get only verbose messages for the imports that you want to track.

2.) You need to generate optimized byte code (without assertions and 
docstrings) from a source code, no matter how the interpreter was started:

import sys
...
source = ...
oldval = sys.setrunflag('Optimize', 2) # -OO, use 1 for -O
bytecode = compile(source, ...)
sys.setrunflag('Optimize', oldval)
...
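
For comparison, later Python 3 releases (3.2 onwards) accept the level
directly through the "optimize" keyword of compile(), which covers this
use case without changing a process-wide flag. A minimal sketch; the
source string and the helper name build() are illustrative only:

```python
# Illustrative source with a docstring and an assertion.
source = '''
def f(x):
    """Docstring that level 2 strips."""
    assert x > 0, "x must be positive"
    return x
'''


def build(level):
    """Compile the source at the given optimization level and return f."""
    ns = {}
    code_obj = compile(source, "<example>", "exec", optimize=level)
    exec(code_obj, ns)
    return ns["f"]


plain = build(0)     # keeps assertions and docstrings
stripped = build(2)  # like -OO: drops both
```

Here plain(-1) raises AssertionError, while stripped(-1) simply returns
-1 and has no docstring, regardless of how the interpreter was started.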

3.) You have to build a command line for the running application (e.g. 
for registration in the registry) and need to check, if you are running 
a script or a frozen executable (this assumes, that your freeze tool 
sets the Py_FrozenFlag):

import sys
...
if sys.getrunflag('Frozen'):
    commandline = sys.executable
else:
    commandline = '%s %s' % (sys.executable, sys.argv[0])
...

NOTE: My own freeze tool sib.py, which is part of the VendorID package 
(www.riverbankcomputing.co.uk/vendorid) doesn't set the Py_FrozenFlag 
yet. I will provide an update soon.

----

And now back to the original subject:

I have done nearly the same changes, that Osvaldo provided with his 
patch and I would highly appreciate if this patch goes into the next 
release. The main reason why I changed the import behavior was 
pythonservice.exe from the win32 extensions. pythonservice.exe imports 
the module that contains the service class, but because 
pythonservice.exe doesn't run in optimized mode, it will only import a 
.py or a .pyc file, not a .pyo file. Because we always generate bytecode 
with -OO at distribution time, we either had to change the behavior of 
pythonservice.exe or change the import behavior of Python.
It is essential for us to remove assertions and docstrings in our 
commercial Python applications at distribution time, because assertions 
are meaningful only at development time and docstrings may contain 
implementation details, that our customers should never see (this makes 
reverse engineering a little bit harder, but not impossible).
Another important reason is, that the number of files in the standard 
library is reduced dramatically. I can have .py and .pyc files in 
lib/pythonX.Y containing assertions and docstrings and put optimized 
code in lib/pythonXY.zip. Then, if I need docstrings, I just bypass 
zipimport as described above with -Z. On the customer site, we only need 
to install the zip archive (our customers are not developers, just end 
users; they may not even recognize that we use Python for application 
development).

Guido, if it was intentional to separate slightly different generated 
bytecode into different files and if you have good reasons for doing 
this, why have I never seen a .pyoo file :-)

For instance, nobody would give the output of a C compiler a different 
extension when different compiler flags are used.

I would appreciate to see the generation of .pyo files completely 
removed in the next release. Just create them as .pyc files, no matter 
if -O or -OO is used or not. At runtime, Python should just ignore (jump 
over?) assert statements when started with -O or ignore assert 
statements and docstrings when started with -OO if they are in the .pyc 
file.

----

There are two reasons, why I haven't started a discussion about those 
issues earlier on this list:

1.) I have done those changes (and a lot of other minor and major 
changes) in Python 2.3.x at a time when Python 2.4.x came up and we 
still use Python 2.3.5. I just wanted to wait until I have the next 
release of our Python runtime environment (RTE) ready that will contain 
the most recent Python version.

2.) In the last months, my time was limited, because I first had to 
stabilize the current RTE (developers and customers were waiting on it).


A few notes about this Python runtime environment:

The RTE that I maintain contains everything (mainly, but not only 
Python) that is needed to run our applications on the customer site. The 
applications are distributed as frozen binaries either containing the 
whole application or only the frozen main script together with an 
additional zip archive holding the application specific modules and 
packages. Tools like py2exe or cx_Freeze follow a different approach: 
they always package a kind of runtime environment together with the 
application.  There is nothing bad with this approach, but if you 
provide more than one application, you waste resources and it may be 
harder to maintain (bugfixes, updates, etc.).
Our RTE currently runs on AIX, HP-UX, IRIX, Windows and Linux, 
SunOS/Solaris will follow in the near future. It gets installed together 
with the application(s) on the customer site if it is not already there 
in the appropriate version. It is vendor specific (more in the sense of 
a provider, not necessarily in the sense of a seller) and shouldn't 
interfere with any other software already installed on the target site.

This RTE is the result of years of experience of commercial application 
development with Python (we started with Python-1.3.x a long long time 
ago). My guideline is: Application developers and/or customers should 
not worry about the requirements needed to get the applications running 
on the target site. Developers should spend all their time for 
application development and customers can expect a complete installation 
with only a very small set of system requirements/dependencies.

In general, I would like to discuss those things that increase Python's 
usability especially in a commercial environment and things that make 
Python more consistent across platforms (e.g. unifying the filesystem 
layout on Windows and UNIX/Linux), but I don't know if this is the right 
mailing list.

Ulli










From jimjjewett at gmail.com  Fri Nov 11 15:27:13 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 11 Nov 2005 09:27:13 -0500
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
Message-ID: <fb6fbf560511110627w7435754v9175076eb256932f@mail.gmail.com>

Ulrich Berning wrote:

[He already has a patch that does much of what is being discussed]

> I have also introduced the new flag Py_NoZipImport that
> can be activated with -Z at startup. This bypasses the
> activation of zipimport

I think -Z could be confusing; I would expect it to work more like
the recent suggestion that it name a specific Zip file to use as
the only (or at least the first) source of modules.

I do see that the switch is handy; I'm just suggesting a different
name, such as -nozip or -skip file.zip.

-jJ

From jimjjewett at gmail.com  Fri Nov 11 16:32:42 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 11 Nov 2005 10:32:42 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook,
	and Tkinter - Summary attempt
Message-ID: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com>

There has been enough misunderstanding in this thread
that the summarizers are going to have trouble.  So I'm
posting this draft in hopes of clarification; please correct
me.

(1)  There is some pre-discussion attached to patches
1049855 and 1252236.  Martin Loewis and Michiel
de Hoon agreed that the fixes were fragile, and that a
larger change should be discussed on python-dev.

(2)  Michiel writes visualization software; he (and
others, such as the writers of matplotlib) has trouble
creating a good event loop, because the GUI toolkit
(especially Tkinter?) wants its own event loop to be in
charge.

Note that this isn't the first time this sort of problem has
come up; usually it is phrased in terms of a problem with
Tix, or not being able to run turtle while in IDLE.

Event loops by their very nature are infinite loops;
once they start, everything else is out of luck unless it
gets triggered by an event or is already started.

(3)  Donovan Baarda suggested looking at Twisted for
state of the art in event loop integration.  Unfortunately,
as Phillip Eby notes, it works by not using the Tkinter
event loop.  It decides for itself when to call dooneevent.

(4)  Michiel doesn't actually need Tkinter (or any other GUI
framework?) for his own project, but he has to play nice
with it because his users expect to be able to use other
tools -- particularly IDLE -- while running his software.

(5)  It is possible to run Tkinter's dooneevent version
as part of your own event loop (as Twisted does), but
you can't really listen for its events, so you end up with
a busy loop polling, and stepping into lots of "I have
nothing to do" functions for every client eventloop.

You can use Tkinter's loop, but once it goes to sleep
waiting for input, everything sort of stalls out for a while,
and even non-Tkinter events get queued instead of
processed.

(6)  Mark Hammond suggests that it might be easier to
replace the interactive portions of python based on the
"code" module.  matplotlib suggests using ipython
instead of standard python for similar reasons.

If that is really the simplest answer (and telling users
which IDE to use is acceptable), then ... I think Michiel
has a point.

(7)  One option might be to always start Tk in a new
thread, rather than letting it take over the main thread.

There was some concern (see patch 1049855) that
Tkinter doesn't -- and shouldn't -- require threading.

My thoughts are that some of the biggest problems
with the event loop (waiting on a mutex) won't happen
in non-threaded python, and that even dummy_thread
would be an improvement over the current state (by
forcing the event loop to start last).  I may well be
missing something, but obviously I'm not sure what that is.

-jJ

From guido at python.org  Fri Nov 11 17:15:18 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 11 Nov 2005 08:15:18 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <43749D65.4040001@desys.de>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
	<5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>
	<43749D65.4040001@desys.de>
Message-ID: <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com>

On 11/11/05, Ulrich Berning <ulrich.berning at desys.de> wrote:
> Guido, if it was intentional to separate slightly different generated
> bytecode into different files and if you have good reasons for doing
> this, why have I never seen a .pyoo file :-)

Because -OO was an afterthought and not implemented by me.

> For instance, nobody would give the output of a C compiler a different
> extension when different compiler flags are used.

But the usage is completely different. With C you explicitly manage
when compilation happens. With Python you don't. When you first run
your program with -O but it crashes, and then you run it again without
-O to enable assertions, you would be very unhappy if the bytecode
cached in a .pyo file would be reused!
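
The underlying requirement is that bytecode compiled at different
levels must never share a cache file. As an illustration from later
CPython releases (3.5 onwards, PEP 488), the level is encoded in the
cached file name rather than in the extension. A small sketch, where
"spam.py" is a hypothetical module path:

```python
import importlib.util

# "spam.py" is a hypothetical module path used only for illustration.
plain_cache = importlib.util.cache_from_source("spam.py", optimization="")
opt1_cache = importlib.util.cache_from_source("spam.py", optimization=1)
opt2_cache = importlib.util.cache_from_source("spam.py", optimization=2)

# Three distinct file names, so a run without -O never picks up
# bytecode that was cached with assertions removed.
print(plain_cache)
print(opt1_cache)
print(opt2_cache)
```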

> I would appreciate to see the generation of .pyo files completely
> removed in the next release.

You seem to forget the realities of backwards compatibility. While
there are ways to cache bytecode without having multiple extensions,
we probably can't do that until Python 3.0.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mdehoon at c2b2.columbia.edu  Fri Nov 11 17:58:31 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Fri, 11 Nov 2005 11:58:31 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook,
 and Tkinter - Summary attempt
In-Reply-To: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com>
References: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com>
Message-ID: <4374CDB7.2020001@c2b2.columbia.edu>

I think this is an excellent summary of the discussion so far. Probably 
clearer than my own posts.
Thanks, Jim!

--Michiel.

Jim Jewett wrote:

>There has been enough misunderstanding in this thread
>that the summarizers are going to have trouble.  So I'm
>posting this draft in hopes of clarification; please correct
>me.
>
>(1)  There is some pre-discussion attached to patches
>1049855 and 1252236.  Martin Loewis and Michiel
>de Hoon agreed that the fixes were fragile, and that a
>larger change should be discussed on python-dev.
>
>(2)  Michiel writes visualization software; he (and
>others, such as the writers of matplotlib) has trouble
>creating a good event loop, because the GUI toolkit
>(especially Tkinter?) wants its own event loop to be in
>charge.
>
>Note that this isn't the first time this sort of problem has
>come up; usually it is phrased in terms of a problem with
>Tix, or not being able to run turtle while in IDLE.
>
>Event loops by their very nature are infinite loops;
>once they start, everything else is out of luck unless it
>gets triggered by an event or is already started.
>
>(3)  Donovan Baarda suggested looking at Twisted for
>state of the art in event loop integration.  Unfortunately,
>as Phillip Eby notes, it works by not using the Tkinter
>event loop.  It decides for itself when to call dooneevent.
>
>(4)  Michiel doesn't actually need Tkinter (or any other GUI
>framework?) for his own project, but he has to play nice
>with it because his users expect to be able to use other
>tools -- particularly IDLE -- while running his software.
>
>(5)  It is possible to run Tkinter's dooneevent version
>as part of your own event loop (as Twisted does), but
>you can't really listen for its events, so you end up with
>a busy loop polling, and stepping into lots of "I have
>nothing to do" functions for every client eventloop.
>
>You can use Tkinter's loop, but once it goes to sleep
>waiting for input, everything sort of stalls out for a while,
>and even non-Tkinter events get queued instead of
>processed.
>
>(6)  Mark Hammond suggests that it might be easier to
>replace the interactive portions of python based on the
>"code" module.  matplotlib suggests using ipython
>instead of standard python for similar reasons.
>
>If that is really the simplest answer (and telling users
>which IDE to use is acceptable), then ... I think Michiel
>has a point.
>
>(7)  One option might be to always start Tk in a new
>thread, rather than letting it take over the main thread.
>
>There was some concern (see patch 1049855) that
>Tkinter doesn't -- and shouldn't -- require threading.
>
>My thoughts are that some of the biggest problems
>with the event loop (waiting on a mutex) won't happen
>in non-threaded python, and that even dummy_thread
>would be an improvement over the current state (by
>forcing the event loop to start last).  I may well be
>missing something, but obviously I'm not sure what that is.
>
>-jJ


-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From mdehoon at c2b2.columbia.edu  Fri Nov 11 18:56:57 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Fri, 11 Nov 2005 12:56:57 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4373A300.3080501@v.loewis.de>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>		<43729CAB.5070106@c2b2.columbia.edu>		<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>		<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>
	<4373A300.3080501@v.loewis.de>
Message-ID: <4374DB69.2080804@c2b2.columbia.edu>

Martin v. Löwis wrote:

>Before trying to explain the reason, please try to explain the
>problem first. What is it *really* that you want to do which
>you feel you currently can't do?
>  
>
Probably I should have started the discussion with this; sorry if I 
confused everybody. But here it is:

I have an extension module for scientific visualization. This extension 
module opens one or more windows, in which plots can be made. Something 
similar to the plotting capabilities of Matlab.

For the graphics windows to remain responsive, I need to make sure that 
its events get handled. So I need an event loop. At the same time, the 
user can enter new Python commands, which also need to be handled.

To achieve this, I make use of PyOS_InputHook, a pointer to a function 
which gets called just before going into fgets to read the next Python 
command. I use PyOS_InputHook to enter an event loop inside my extension 
module. This event loop handles the window events, and returns as soon 
as a new Python command is available at stdin, at which point we 
continue to fgets as usual.
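
The control flow described here can be sketched in Python, as a rough
model only (the real hook is a C function pointer). The function name,
the handle_pending_events callback and the pipe standing in for stdin
are all assumptions made for the example, and select() on a pipe is
Unix-only:

```python
import os
import select


def run_events_until_input(input_fd, handle_pending_events,
                           poll_interval=0.01):
    """Process window events until input_fd is readable, then return."""
    while True:
        handle_pending_events()
        readable, _, _ = select.select([input_fd], [], [], poll_interval)
        if readable:
            return  # a new command is waiting; go back to reading it


# Demonstration with a pipe in place of stdin.
read_fd, write_fd = os.pipe()
calls = []


def fake_events():
    # Stand-in for the window system; "types" a command on the third turn.
    calls.append(1)
    if len(calls) == 3:
        os.write(write_fd, b"print(1)\n")


run_events_until_input(read_fd, fake_events)
line = os.read(read_fd, 100)
os.close(read_fd)
os.close(write_fd)
```

This version polls with a short timeout. A toolkit that can wait on its
own event source and on stdin at the same time would avoid the polling.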

While this approach more or less works, there are two problems that I 
have run into:

1) What if the user decides to import Tkinter next? Tkinter notices that 
PyOS_InputHook is already set, and does not reset it to its own event 
loop. Hence, Tkinter's events are not handled. Similarly, if a user 
imports Tkinter before my extension module, I don't reset 
PyOS_InputHook, so Tkinter's events are handled but not mine. If I were 
to reset PyOS_InputHook to my extension module's event loop, then my 
events get handled but not Tkinter's.

2) On Windows, many users will use IDLE to run Python. IDLE uses two 
Python threads, one for the GUI and one for the user's Python commands. 
Each has its own PyOS_InputHook. If I import my extension module (or 
Tkinter, for that matter), the user-Python's PyOS_InputHook gets set to 
the corresponding event loop function. So far so good. However, 
PyOS_InputHook doesn't actually get called:
Between Python commands, the GUI-Python is waiting for the user to type 
something, and the user-Python is waiting for the GUI-Python. While the 
user-Python is waiting for the GUI-Python thread, no call is made to 
PyOS_InputHook, therefore we don't enter an event loop, and no events 
get handled. Hence, neither my extension module nor Tkinter work when 
run from IDLE.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From fredrik at pythonware.com  Fri Nov 11 19:14:57 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 11 Nov 2005 19:14:57 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook,
	and Tkinter - Summary attempt
References: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com>
Message-ID: <dl2n2s$a5e$1@sea.gmane.org>

Jim Jewett wrote:

> (6)  Mark Hammond suggests that it might be easier to
> replace the interactive portions of python based on the
> "code" module.  matplotlib suggests using ipython
> instead of standard python for similar reasons.
>
> If that is really the simplest answer (and telling users
> which IDE to use is acceptable), then ... I think Michiel
> has a point.

really?  Python comes with a module that makes it trivial to get
a fully working interpreter console under any kind of UI toolkit
with very little effort, and you think that proves that we need to
reengineer the CPython interpreter to support arbitrary event loops
so you don't have to use that module?

as usual, you make absolutely no sense whatsoever.

(or was "..." short for "CPython's interactive mode should use
the code module" ?)

</F> 




From jimjjewett at gmail.com  Fri Nov 11 21:44:34 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 11 Nov 2005 15:44:34 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
Message-ID: <fb6fbf560511111244v6bf66043n27674c23670f42e9@mail.gmail.com>

>> (6)  Mark Hammond suggests that it might be easier to
>> replace the interactive portions of python based on the
>> "code" module.  matplotlib suggests using ipython
>> instead of standard python for similar reasons.

>> If that is really the simplest answer (and telling users
>> which IDE to use is acceptable), then ... I think Michiel
>> has a point.

Fredrik Lundh wrote:

> really?  Python comes with a module that makes it trivial to get
> a fully working interpreter console ...

Using an event loop (or an external GUI) should not require
forking the entire interactive mode, no matter how trivial that
fork is.

The subtle differences between interactive mode and IDLE
already cause occasional problems; the same would be true
of code.interact() if it were more widely used.

Part of Michiel's pain is that users want to make their own
decisions on whether to use IDLE or emacs or vt100, and
they want to mix and match toolkits.  They already run into
unexpected freezes because of the event loop conflicts.
If every extension writer also relied on their own subclasses
of the interactive mode, users would be in for far more
unpleasant surprises.

The right answer might be to run each event loop in a
separate thread.  The answer might be to have a python
event loop that re-dispatches single events to the other
frameworks.  The answer might be a way to chain
PyOS_InputHook functions like atexit does for shutdown
functions.  The answer might be something else
entirely.

But I'm pretty sure that the right answer does not involve
adding an additional layer of potential incompatibility.

-jJ

From fredrik at pythonware.com  Fri Nov 11 22:34:19 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 11 Nov 2005 22:34:19 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
References: <fb6fbf560511111244v6bf66043n27674c23670f42e9@mail.gmail.com>
Message-ID: <dl32ot$dkm$1@sea.gmane.org>

Jim Jewett wrote:

> > really?  Python comes with a module that makes it trivial to get
> > a fully working interpreter console ...
>
> Using an event loop (or an external GUI) should not require
> forking the entire interactive mode, no matter how trivial that
> fork is.

repeating a bogus argument doesn't make it any better.

</F>




From skip at pobox.com  Sat Nov 12 03:32:57 2005
From: skip at pobox.com (skip@pobox.com)
Date: Fri, 11 Nov 2005 20:32:57 -0600
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4374DB69.2080804@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
Message-ID: <17269.21593.575449.78938@montanaro.dyndns.org>


    Michiel> 1) What if the user decides to import Tkinter next? Tkinter
    Michiel>    notices that PyOS_InputHook is already set, and does not
    Michiel>    reset it to its own event loop. Hence, Tkinter's events are
    Michiel>    not handled. Similarly, if a user imports Tkinter before my
    Michiel>    extension module, I don't reset PyOS_InputHook, so Tkinter's
    Michiel>    events are handled but not mine. If I were to reset
    Michiel>    PyOS_InputHook to my extension module's event loop, then my
    Michiel>    events get handled but not Tkinter's.

This sounds sort of like the situation that existed with sys.exitfunc before
the creation of the atexit module.  Can't we develop an API similar to that
so that many different event-loop-wanting packages can play nice together?
(Then again, maybe I'm just being too simpleminded.)

Skip

From avi at argo.co.il  Thu Nov 10 10:26:05 2005
From: avi at argo.co.il (Avi Kivity)
Date: Thu, 10 Nov 2005 11:26:05 +0200
Subject: [Python-Dev] indented longstrings?
Message-ID: <4373122D.30507@argo.co.il>

Python's longstring facility is very useful, but unhappily breaks 
indentation. I find myself writing code like

    msg = ('From: %s\r\n'
           + 'To: %s\r\n'
           + 'Subject: Host failure report for %s\r\n'
           + 'Date: %s\r\n'
           + '\r\n'
           + '%s\r\n') % (fr, ', '.join(to), host, time.ctime(), err)
    mail.sendmail(fr, to, msg)

instead of

    msg = ('''From: %s
To: %s
Subject: Host failure report for %s
Date: %s

%s
''') % (fr, ', '.join(to), host, time.ctime(), err)
    mail.sendmail(fr, to, msg)


while wishing for a 

    msg = i'''From: %s
              To: %s
              Subject: Host failure report for %s
              Date: %s

              %s
              ''' % (fr, ', '.join(to), host, time.ctime(), err)
    mail.sendmail(fr, to, msg.replace('\n', '\r\n'))

isn't it so much prettier?


(((an indented longstring, i''' ... ''' behaves like a regular 
longstring except that indentation on the lines following the beginning 
of the longstring is stripped up to the first character position of the 
longstring on the first line. non-blanks before that character position 
are a syntax error)))

Avi


From falcon at intercable.ru  Thu Nov 10 21:48:14 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Thu, 10 Nov 2005 23:48:14 +0300
Subject: [Python-Dev]  (no subject)
Message-ID: <4373B20E.3060002@intercable.ru>

>
>
>Mixing Decimal and float is nearly ALWAYS a user error. Doing it correctly
>requires significant expertise in the peculiarities of floating point
>representations. 
>
So I think the user should declare floats explicitly (###.###f); then he will fall into float space only if
he wishes to.


>So Python protects the user by throwing exceptions when
>attempts are made to mix Decimal and floats.

I hate it. I want to get a float when I wish to get a float. In that case I would like to write #f.
I want to stay with decimals by default. (And I want decimals written in C.)

But it is just the opinion of a young, inexperienced/unpractised man.



Excuse my English.



From skip at pobox.com  Sat Nov 12 03:56:32 2005
From: skip at pobox.com (skip@pobox.com)
Date: Fri, 11 Nov 2005 20:56:32 -0600
Subject: [Python-Dev] indented longstrings?
In-Reply-To: <4373122D.30507@argo.co.il>
References: <4373122D.30507@argo.co.il>
Message-ID: <17269.23008.424606.403292@montanaro.dyndns.org>


    Avi> Python's longstring facility is very useful, but unhappily breaks
    Avi> indentation. I find myself writing code like

    Avi>     msg = ('From: %s\r\n'
    Avi>            + 'To: %s\r\n'
    Avi>            + 'Subject: Host failure report for %s\r\n'
    Avi>            + 'Date: %s\r\n'
    Avi>            + '\r\n'
    Avi>            + '%s\r\n') % (fr, ', '.join(to), host, time.ctime(), err)
    Avi>     mail.sendmail(fr, to, msg)

This really belongs on comp.lang.python, at least until you've exhausted the
existing possibilities and found them lacking.  However, try:

    msg = ('From: %s\r\n'
           'To: %s\r\n'
           'Subject: Host failure report for %s\r\n'
           'Date: %s\r\n'
           '\r\n'
           '%s\r\n') % (fr, ', '.join(to), host, time.ctime(), err)

or

    msg = ('''\
From: %s
To: %s
Subject: Host failure report for %s
Date: %s

%s
''') % (fr, ', '.join(to), host, time.ctime(), err)

or (untested)

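    import re

    def istring(s):
        # Strip leading blanks after each newline; [ \t] rather than \s
        # so that blank lines in the message survive.
        return re.sub(r"(\r?\n)[ \t]+", r"\1", s)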

    msg = """From: %s
             To: %s
             Subject: Host failure report for %s
             Date: %s

             %s
             """
    msg = istring(msg) % (fr, ', '.join(to), host, time.ctime(), err)

Skip

From greg.ewing at canterbury.ac.nz  Sat Nov 12 04:12:58 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 Nov 2005 16:12:58 +1300
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131624778.4292.22.camel@localhost>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
	<1131556500.9130.18.camel@localhost>
	<ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
	<1131558739.9130.40.camel@localhost>
	<9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com>
	<1131576278.8540.14.camel@localhost.localdomain>
	<43729B07.6010907@canterbury.ac.nz>
	<1131624778.4292.22.camel@localhost>
Message-ID: <43755DBA.8070709@canterbury.ac.nz>

Gustavo J. A. M. Carneiro wrote:

>   The object isn't really destroyed.  Simply ob_refcnt drops to zero,
> then tp_dealloc is called, which is supposed to destroy it.  But since I
> wrote tp_dealloc, I choose not to destroy it,

Be aware that a C subclass of your wrapper that overrides
tp_dealloc is going to have its tp_dealloc called before
yours, and will therefore be partly destroyed before you
get control.

Greg


From bob at redivi.com  Sat Nov 12 04:31:34 2005
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 11 Nov 2005 19:31:34 -0800
Subject: [Python-Dev] indented longstrings?
In-Reply-To: <4373122D.30507@argo.co.il>
References: <4373122D.30507@argo.co.il>
Message-ID: <2BF318ED-4FB9-4E7B-A3E2-1AA131E63B16@redivi.com>


On Nov 10, 2005, at 1:26 AM, Avi Kivity wrote:

> Python's longstring facility is very useful, but unhappily breaks
> indentation. I find myself writing code like

http://docs.python.org/lib/module-textwrap.html
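
For illustration, a minimal sketch of how textwrap.dedent could cover the
example from the original post.  The helper name build_message and its
arguments are assumptions made for this sketch only; the conversion to
\r\n follows the msg.replace() call in the proposal.

```python
import textwrap
import time


def build_message(fr, to, host, err):
    """Build the failure report from an indented triple-quoted template.

    build_message is a hypothetical helper, not an existing API.
    """
    # dedent() removes the whitespace common to all lines, so the
    # template can be indented to match the surrounding code.
    template = textwrap.dedent("""\
        From: %s
        To: %s
        Subject: Host failure report for %s
        Date: %s

        %s
        """)
    msg = template % (fr, ', '.join(to), host, time.ctime(), err)
    # Mail headers want CRLF line endings.
    return msg.replace('\n', '\r\n')
```

The backslash after the opening quotes keeps the first line from being
empty, so no separate strip() is needed.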

-bob


From s.joaopaulo at gmail.com  Sat Nov 12 05:23:42 2005
From: s.joaopaulo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Paulo_Silva?=)
Date: Sat, 12 Nov 2005 04:23:42 +0000
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
	<5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>
	<43749D65.4040001@desys.de>
	<ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com>
Message-ID: <787073ca0511112023l29794930n@mail.gmail.com>

Hi (first post here, note that English is not my native language),

One thing we shouldn't forget is that Osvaldo is porting Python to a
platform that does not have much disk space. He needs Python modules
with just the essentials.
I like ideas like the __debug__ opcode, but in Osvaldo's use case there
are limits on how much a Python module's size can grow.

What is the problem with the interpreter looking for both .pyc and .pyo? I
believe zipimport's way of looking up modules is more useful.

--
Até mais..
João Paulo da Silva
LinuxUser #355914
ICQ: 265770691 | Jabber: joaopinga at jabber.org


PS: Guido, sorry the PVT...

From greg.ewing at canterbury.ac.nz  Sat Nov 12 05:33:31 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 Nov 2005 17:33:31 +1300
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <17269.21593.575449.78938@montanaro.dyndns.org>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
Message-ID: <4375709B.4010009@canterbury.ac.nz>

skip at pobox.com wrote:

> This sounds sort of like the situation that existed with sys.exitfunc before
> the creation of the atexit module.  Can't we develop an API similar to that
> so that many different event-loop-wanting packages can play nice together?

I can't see how that would help. If the different hooks know
nothing about each other, there's no way for one to know when
to give up control to the next one in the chain.

Greg

From greg.ewing at canterbury.ac.nz  Sat Nov 12 05:39:57 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 Nov 2005 17:39:57 +1300
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4374DB69.2080804@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
Message-ID: <4375721D.6040907@canterbury.ac.nz>

Michiel Jan Laurens de Hoon wrote:

> I have an extension module for scientific visualization. This extension 
> module opens one or more windows, in which plots can be made.

What sort of windows are these? Are you using an existing
GUI toolkit, or rolling your own?

> For the graphics windows to remain responsive, I need to make sure that 
> its events get handled. So I need an event loop.

How about running your event loop in a separate thread?

Greg


From skip at pobox.com  Sat Nov 12 06:15:03 2005
From: skip at pobox.com (skip@pobox.com)
Date: Fri, 11 Nov 2005 23:15:03 -0600
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4375709B.4010009@canterbury.ac.nz>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
Message-ID: <17269.31319.806622.939477@montanaro.dyndns.org>


    >> This sounds sort of like the situation that existed with sys.exitfunc
    >> before the creation of the atexit module.  Can't we develop an API
    >> similar to that so that many different event-loop-wanting packages
    >> can play nice together?

    Greg> I can't see how that would help. If the different hooks know
    Greg> nothing about each other, there's no way for one to know when to
    Greg> give up control to the next one in the chain.

If I have a Gtk app I have to feed other (socket, callback) pairs to it.  It
takes care of adding it to the select() call.  Python could dictate that the
way to play ball is for other packages (Tkinter, PyGtk, wxPython, etc) to
feed Python the (socket, callback) pair.  Then you have a uniform way to
control event-driven applications.  Today, a package like Michiel's has no
idea what sort of event loop it will encounter.  If Python provided the
event loop API it would be the same no matter what widget set happened to be
used.

The sticking point is probably that a number of such packages presume they
will always provide the main event loop and have no way to feed their
sockets to another event loop controller.  That might present some hurdles
for the various package writers/Python wrappers.
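
A rough sketch of what such a central loop might look like, assuming
nothing beyond select() from the standard library.  The EventLoop class
and its method names are invented for illustration; no such API exists
in Python today.

```python
import select
import socket


class EventLoop:
    """Hypothetical central loop fed with (socket, callback) pairs."""

    def __init__(self):
        self._callbacks = {}

    def register(self, sock, callback):
        # A toolkit wrapper hands over its socket and the function to
        # call when that socket becomes readable.
        self._callbacks[sock] = callback

    def unregister(self, sock):
        self._callbacks.pop(sock, None)

    def run_once(self, timeout=0.0):
        """Dispatch every socket that is readable right now.

        Returns the number of sockets that were readable.
        """
        if not self._callbacks:
            return 0
        readable, _, _ = select.select(list(self._callbacks), [], [], timeout)
        for sock in readable:
            callback = self._callbacks.get(sock)
            if callback is not None:
                callback(sock)
        return len(readable)
```

Whoever owns the prompt would call run_once() while waiting for input,
so every registered package gets served by the same loop.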

Skip


From skip at pobox.com  Sat Nov 12 14:21:59 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 12 Nov 2005 07:21:59 -0600
Subject: [Python-Dev] Mapping cvs version numbers to svn revisions?
Message-ID: <17269.60535.243900.974801@montanaro.dyndns.org>

In a bug report I filed Neal Norwitz referred me to an earlier, fixed, bug
report from before the cvs-to-svn switch.  The file versions were thus cvs
version numbers instead of svn revisions.  Is it possible to map from cvs
version number to svn?  In this particular situation I can fairly easily
infer the revision number because I know Neal made the change and roughly
where in the given file(s) he was making changes, but I doubt that would
always be true.  Did cvs2svn save that mapping somewhere?

Thx,

Skip

From p.f.moore at gmail.com  Sat Nov 12 16:20:35 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 12 Nov 2005 15:20:35 +0000
Subject: [Python-Dev] Building Python with Visual C++ 2005 Express
	Edition
In-Reply-To: <4373A4F0.7010202@v.loewis.de>
References: <dl073b$aro$1@sea.gmane.org> <4373A4F0.7010202@v.loewis.de>
Message-ID: <79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com>

On 11/10/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Christos Georgiou wrote:
> > I didn't see any mention of this product in the Python-Dev list, so I
> > thought to let you know.
> >
> > http://msdn.microsoft.com/vstudio/express/visualc/download/
> >
> > There is also a link for a CD image (.img) file to download.
> >
> > I am downloading now, so I don't know yet whether Python compiles with it
> > without any problems.  So if anyone has previous experience, please reply.
>
> I don't have previous experience, but I think it likely shares the
> issues that VS.NET 2005 has with the current code:
> 1. the project files are for VS.NET 2003. In theory, conversion to
>     the new format is supported, but I don't know whether this conversion
>     works flawlessly.
> 2. MS broke ISO C conformance in VS.NET 2005 in a way that affects
>     Python's signal handling. There is a patch on SF which addresses
>     the issue, but that hasn't been checked in yet.

FWIW, I downloaded Visual C++ 2005 Express edition, and the latest
platform SDK, and had a go at building Python trunk.

I just followed the build instructions from PCBuild\readme.txt as best
I could - of the optional packages, I only got zlib to work. The
issues with the other modules may or may not be serious - for example,
bzip2 is now at version 1.0.3, and the old source isn't available.
Just renaming the directory didn't work, but I didn't bother
investigating further.

I applied the patch you mentioned, but otherwise left everything unchanged.

The project file conversions seemed to go fine, and the debug builds
were OK, although the deprecation warnings for all the "insecure" CRT
functions were a pain. It might be worth adding
_CRT_SECURE_NO_DEPRECATE to the project defines somehow.

I then ran the test suite, which mostly worked.

Results:

235 tests OK.
5 tests failed:
    test_asynchat test_cookie test_grammar test_mmap test_profile
58 tests skipped:
    test__locale test_aepack test_al test_applesingle test_bsddb185
    test_bsddb3 test_cd test_cl test_cmd_line test_code
    test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp
    test_codecmaps_kr test_codecmaps_tw test_coding test_commands
    test_crypt test_curses test_dbm test_dl test_fcntl test_float
    test_fork1 test_functional test_gdbm test_gl test_grp test_hashlib
    test_hashlib_speed test_imgfile test_ioctl test_largefile
    test_linuxaudiodev test_macfs test_macostools test_mhlib test_nis
    test_normalization test_openpty test_ossaudiodev test_plistlib
    test_poll test_posix test_pty test_pwd test_resource
    test_scriptpackages test_signal test_socket_ssl test_socketserver
    test_sunaudiodev test_threadsignals test_timeout test_timing
    test_urllib2net test_urllibnet test_xdrlib
7 skips unexpected on win32:
    test_hashlib test_cmd_line test_xdrlib test_code test_float
    test_coding test_functional

I'm not sure what to make of the "unexpected" skips...

The output for the failed tests was:

test_asynchat
test test_asynchat produced unexpected output:
**********************************************************************
*** lines 2-3 of actual output doesn't appear in expected output after line 1:
+ Connected
+ Received: 'hello world'
**********************************************************************

test_cookie
test test_cookie produced unexpected output:
**********************************************************************
*** mismatch between lines 3-4 of expected output and lines 3-4 of actual output:
- Set-Cookie: chips=ahoy
+ Set-Cookie: chips=ahoy;
?                       +
- Set-Cookie: vienna=finger
+ Set-Cookie: vienna=finger;
?                          +
*** mismatch between line 6 of expected output and line 6 of actual output:
- Set-Cookie: chips=ahoy
+ Set-Cookie: chips=ahoy;
?                       +
*** mismatch between line 8 of expected output and line 8 of actual output:
- Set-Cookie: vienna=finger
+ Set-Cookie: vienna=finger;
?                          +
*** mismatch between line 10 of expected output and line 10 of actual output:
- Set-Cookie: keebler="E=mc2; L=\"Loves\"; fudge=\012;"
+ Set-Cookie: keebler="E=mc2; L=\"Loves\"; fudge=\012;";
?                                                      +
*** mismatch between line 12 of expected output and line 12 of actual output:
- Set-Cookie: keebler="E=mc2; L=\"Loves\"; fudge=\012;"
+ Set-Cookie: keebler="E=mc2; L=\"Loves\"; fudge=\012;";
?                                                      +
*** mismatch between line 14 of expected output and line 14 of actual output:
- Set-Cookie: keebler=E=mc2
+ Set-Cookie: keebler=E=mc2;
?                          +
*** mismatch between lines 16-17 of expected output and lines 16-17 of actual output:
- Set-Cookie: keebler=E=mc2
+ Set-Cookie: keebler=E=mc2;
?                          +
- Set-Cookie: Customer="WILE_E_COYOTE"; Path=/acme
+ Set-Cookie: Customer="WILE_E_COYOTE"; Path=/acme;
?                                                 +
*** mismatch between line 19 of expected output and line 19 of actual output:
-         <script type="text/javascript">
+         <SCRIPT LANGUAGE="JavaScript">
*** mismatch between line 21 of expected output and line 21 of actual output:
-         document.cookie = "Customer="WILE_E_COYOTE"; Path=/acme; Version=1";
?                                                                            -
+         document.cookie = "Customer="WILE_E_COYOTE"; Path=/acme; Version=1;"
?                                                                           +
*** mismatch between line 26 of expected output and line 26 of actual output:
-         <script type="text/javascript">
+         <SCRIPT LANGUAGE="JavaScript">
*** mismatch between line 28 of expected output and line 28 of actual output:
-         document.cookie = "Customer="WILE_E_COYOTE"; Path=/acme";
?                                                                 -
+         document.cookie = "Customer="WILE_E_COYOTE"; Path=/acme;"
?                                                                +
**********************************************************************

test_grammar
test test_grammar produced unexpected output:
**********************************************************************
*** line 37 of expected output missing:
- yield_stmt
**********************************************************************

test_mmap
test test_mmap produced unexpected output:
**********************************************************************
*** lines 34-35 of expected output missing:
-   Ensuring that passing 0 as map length sets map size to current file size.
-   Ensuring that passing 0 as map length sets map size to current file size.
**********************************************************************

test_profile
test test_profile produced unexpected output:
**********************************************************************
*** mismatch between line 10 of expected output and line 10 of actual output:
-         1    0.000    0.000    1.000    1.000 <string>:1(<module>)
?                                                          ^^^^^^^^
+         1    0.000    0.000    1.000    1.000 <string>:1(?)
?                                                          ^
**********************************************************************

I don't have time to investigate much further, and not all of these
look to be VC 2005 Express issues (for example, the test_profile and
test_cookie errors look like code issues rather than compiler ones),
but I don't have an alternative compiler to check.

I hope this is of some use - it would be brilliant if VC 2005 Express
could be a supported build environment. (Of course, MS have updated
the CRT again, so binaries built with VC 2005 Express aren't binary
compatible with extensions built for the standard release... :-( )

Regards,
Paul.

From martin at v.loewis.de  Sat Nov 12 19:02:33 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 12 Nov 2005 19:02:33 +0100
Subject: [Python-Dev] Building Python with Visual C++ 2005 Express
	Edition
In-Reply-To: <79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com>
References: <dl073b$aro$1@sea.gmane.org> <4373A4F0.7010202@v.loewis.de>
	<79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com>
Message-ID: <43762E39.3020005@v.loewis.de>

Paul Moore wrote:
> I hope this is of some use - it would be brilliant if VC 2005 Express
> could be a supported build environment. (Of course, MS have updated
> the CRT again, so binaries built with VC 2005 Express aren't binary
> compatible with extensions built for the standard release... :-( )

It is not really practical to support two build environments fully;
as MS changed the format of the project files again, one would have
to maintain two sets of project files (actually, it would be three
sets, as we keep the VC6 files as well).

So really having the VS2005 files in subversion isn't an option;
trying to make conversion go smoothly all the time certainly is a
desirable goal.

Using VS2005 for official builds would only be an option with the
next major release (2.5), and I personally don't see that happening:
AFAICT, it is not that much of a change as VS2003 was (i.e. for
Python, nothing is gained AFAICT); also, I'm getting the impression
that VS2005 has too many bugs (*) to be useful, so I recommend
skipping that release completely, and then going to VS2006 (or whenever
that is released).

Regards,
Martin

(*) besides the really sad changes in the CRT which break ISO C
compliance, pre-release versions of the IDE were really unstable.
That might have improved for the release, of course. In addition,
I'm aware of various problems with .NET 2.0; something that doesn't
affect Python too much, though.

From martin at v.loewis.de  Sat Nov 12 19:10:08 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 12 Nov 2005 19:10:08 +0100
Subject: [Python-Dev] Mapping cvs version numbers to svn revisions?
In-Reply-To: <17269.60535.243900.974801@montanaro.dyndns.org>
References: <17269.60535.243900.974801@montanaro.dyndns.org>
Message-ID: <43763000.1000500@v.loewis.de>

skip at pobox.com wrote:
> In a bug report I filed Neal Norwitz referred me to an earlier, fixed, bug
> report from before the cvs-to-svn switch.  The file versions were thus cvs
> version numbers instead of svn revisions.  Is it possible to map from cvs
> version number to svn?  

It would have been possible in the process of using cvs2svn, which could
have generated subversion properties to collect the CVS revision
numbers. I decided against doing so, as this will become less important
over time, and I was uncertain if we would still have to carry those
properties around on the trunk forever. I also expected that in most
cases, it should be easy to find the relationship from the commit
messages. Also, nobody requested that feature in the test installation.

If somebody wants to come up with something (e.g. rerunning the 
conversion, only to create some kind of mapping file): the
tarball that was used to do the conversion is at

http://svn.python.org/snapshots/python-cvsroot-final.tar.bz2

Regards,
Martin

From noamraph at gmail.com  Sat Nov 12 20:06:59 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sat, 12 Nov 2005 21:06:59 +0200
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <17269.31319.806622.939477@montanaro.dyndns.org>
References: <437100A7.5050907@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
Message-ID: <b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com>

On 11/12/05, skip at pobox.com <skip at pobox.com> wrote:
> If I have a Gtk app I have to feed other (socket, callback) pairs to it.  It
> takes care of adding it to the select() call.  Python could dictate that the
> way to play ball is for other packages (Tkinter, PyGtk, wxPython, etc) to
> feed Python the (socket, callback) pair.  Then you have a uniform way to
> control event-driven applications.  Today, a package like Michiel's has no
> idea what sort of event loop it will encounter.  If Python provided the
> event loop API it would be the same no matter what widget set happened to be
> used.
>
> The sticking point is probably that a number of such packages presume they
> will always provide the main event loop and have no way to feed their
> sockets to another event loop controller.  That might present some hurdles
> for the various package writers/Python wrappers.
>
I think that in order to solve Michiel's problem, there's no need for
something like that, since probably neither of the "loops" is
listening to sockets.

Currently, Tkinter sets PyOS_InputHook to call its "dooneevent"
repeatedly while Python code isn't being executed. It turns out to
work excellently. All that is needed to make Tkinter and Michiel's
code run together is a way to say "add this callback to the input
hook" instead of the current "replace the current input hook with this
callback". Then, when the interpreter is idle, it will call all the
registered callbacks, one at a time, and everyone would be happy.

To make this work with IDLE, or other interactive shells written in
Python, you need to expose a function which will run all the
registered callbacks. Then IDLE can call that function repeatedly when
it's idle, and you'll get the same behaviour you have in the regular
interactive shell. Specifically for IDLE, I know where that place is -
since there's no way to generally invoke the input hook, I wrote a
patch that calls _tkinter.dooneevent(_tkinter.DONT_WAIT) in the right
place, and it works fine.

Concerning threads - please don't. The "do one event at a time while
the interpreter is idle" method works fine. Most programs aren't
designed to be thread-safe, and since Tkinter does many callbacks to
Python functions, you'll get unexpected behaviour if it's on another
thread.

I hope I made myself clear. This solution is simple, and works
whenever a "do one event" function is available.
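
To make the idea concrete, here is a small sketch in pure Python.  The
names add_input_hook, remove_input_hook and run_input_hooks are
hypothetical; the real PyOS_InputHook is a single C function pointer,
and this only models the "add instead of replace" behaviour.

```python
# Hypothetical registry: callbacks are added, never silently replaced.
_input_hooks = []


def add_input_hook(callback):
    """Register a "do one event" callback; duplicates are ignored."""
    if callback not in _input_hooks:
        _input_hooks.append(callback)


def remove_input_hook(callback):
    """Unregister a callback if it is present."""
    if callback in _input_hooks:
        _input_hooks.remove(callback)


def run_input_hooks():
    """Call every registered callback once, in registration order.

    This is the function an interactive shell such as IDLE could call
    repeatedly while it is idle.  Returns the number of callbacks run.
    """
    hooks = list(_input_hooks)  # copy: a callback may unregister itself
    for callback in hooks:
        callback()
    return len(hooks)
```

Each callback is expected to handle at most one pending event and then
return, so that no toolkit starves the others.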

Have a good week,
Noam

From martin at v.loewis.de  Sat Nov 12 20:17:25 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 12 Nov 2005 20:17:25 +0100
Subject: [Python-Dev] Checking working copy consistency
Message-ID: <43763FC5.8070307@v.loewis.de>

Hi Skip,

I made a script that runs through a subversion
sandbox and checks whether all md5sums are correct.
Please run that on your working copy to see whether
there are still any inconsistent files.

Regards,
Martin

-------------- next part --------------
A non-text attachment was scrubbed...
Name: svncheck.py
Type: text/x-python
Size: 649 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20051112/ab1cb332/svncheck.py

From fperez.net at gmail.com  Sat Nov 12 20:46:10 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Sat, 12 Nov 2005 12:46:10 -0700
Subject: [Python-Dev] Event loops, PyOS_InputHook,
	and Tkinter - Summary attempt
References: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com>
Message-ID: <dl5gq9$scn$1@sea.gmane.org>

Jim Jewett wrote:

> (6)  Mark Hammond suggests that it might be easier to
> replace the interactive portions of python based on the
> "code" module.  matplotlib suggests using ipython
> instead of standard python for similar reasons.
> 
> If that is really the simplest answer (and telling users
> which IDE to use is acceptable), then ... I think Michiel
> has a point.

I don't claim to understand all the low-level details of this discussion, by a
very long shot.  But as the author of ipython, at least I'll mention what
ipython does to help in this problem.  Whether that is a satisfactory solution
for everyone or not, I won't get into.

For starters, ipython is an extension of the code.InteractiveConsole class, even
though by now I've changed so much that I could probably just stop using any
inheritance at all.  But this is just to put ipython in the context of the
stdlib.  

When I started using matplotlib, I wanted to be able to run my code
interactively and get good plotting, as much as I used to have with Gnuplot
before (IPython ships with extended Gnuplot support beyond what the default
Gnuplot.py module provides).  With help from John Hunter (matplotlib - mpl for
short - author), we were able to add support for ipython to happily coexist
with matplotlib when either the GTK or the WX backends were used. mpl can plot
to Tk, GTK, WX, Qt and FLTK; Tk worked out of the box (because of the Tkinter
event loop integration in Python), and with our hacks we got GTK and WX to
work.  Earlier this year, with the help of a few very knowledgeable Qt
developers, we extended the same ideas to add support for Qt as well.  As part
of this effort, ipython can generically (meaning, outside of matplotlib)
support interactive non-blocking control of WX, GTK and Qt apps; you get that
by starting it with

ipython -wthread/-gthread/-qthread

The details of how this works are slightly different for each toolkit, but the
overall approach is the same for all.  We just register with each toolkit's
idle/timer system a callback to execute pending code which is waiting in what
is essentially a one-entry queue.  I have a private branch where I'm adding
similar support for OpenGL windows using the GLUT idle function, though it's not
ready for release yet.  So far this has worked quite well.  
If anyone wants to see the details, the relevant code is here:

http://projects.scipy.org/ipython/ipython/file/ipython/trunk/IPython/Shell.py
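
The pattern described above (a callback, registered with the toolkit's
idle/timer system, that runs whatever code is waiting in a one-entry queue)
can be sketched roughly as follows. This is an illustration only and is not
taken from Shell.py; PendingCodeRunner, submit and on_idle are hypothetical
names, and the toolkit's idle hook is simulated by calling on_idle by hand.

```python
import queue

class PendingCodeRunner:
    """Hypothetical stand-in for the pattern, not IPython's actual code."""

    def __init__(self, namespace=None):
        self.namespace = {} if namespace is None else namespace
        self._pending = queue.Queue(maxsize=1)   # the "one-entry queue"

    def submit(self, source):
        # Called from the thread that reads user input; blocks while an
        # earlier submission is still waiting to be run.
        self._pending.put(source)

    def on_idle(self):
        # Meant to be registered with a toolkit's idle/timer hook so that it
        # runs in the GUI thread. Returns True if something was executed.
        try:
            source = self._pending.get_nowait()
        except queue.Empty:
            return False
        exec(compile(source, "<input>", "exec"), self.namespace)
        return True

runner = PendingCodeRunner()
runner.submit("x = 6 * 7")
ran = runner.on_idle()         # a real toolkit would call this periodically
ran_again = runner.on_idle()   # nothing is pending the second time
```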

It may not be perfect, and it may well be the wrong approach.  If so, I'll be
glad to learn how to do it better: I know very little about threading and I got
this to work more or less by stumbling in the dark.

In particular, one thing that definitely does NOT work is mixing TWO GUI
toolkits together.  There is a hack (the -tk option) to try to allow mixing of
ONE of Qt/WX/GTK with Tk, but it has only ever worked on Debian, and we don't
really understand why.  I think it's some obscure combination of how the
low-level threading support for many different libraries is compiled in Debian.

As far as using IDLE/Emacs/whatever (I use Xemacs personally for my own
editing), our approach has been to simply tell people that the _interactive
shell_ should be ipython always.  They can use anything they want to edit their
code with, but they should execute their scripts with ipython.  ipython has a
%run command which allows code execution with a ton of extra control, so the
work cycle with ipython is more or less:

1. open the editor you like to use with your foo.py code.  Hack on foo.py

2. whenever you wish to test your code, save foo.py

3. switch to the ipython window, and type 'run foo'.  Play with the results
interactively (the foo namespace updates the interactive one after completion).

4. rinse, repeat.

In the matplotlib/scipy mailing lists we've more or less settled on 'this is
what we support.  If you don't like it, go write your own'.  It may not be
perfect, but it works reasonably for us (I use this system 10 hours a day in
scientific production work, and so does John, so we do eat our own dog food).

Given that ipython is trivial to install (it's pure python code with no extra
dependencies under *nix and very few under win32), and that it provides so much
additional functionality on top of the default interactive interpreter, we've
had no complaints so far.

OK, I hope this information is useful to some of you.  Feel free to contact me
if you have any questions (I will monitor the thread, but I follow py-dev on
gmane, so I do miss things sometimes).

Cheers,

f


From noamraph at gmail.com  Sat Nov 12 20:52:32 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sat, 12 Nov 2005 21:52:32 +0200
Subject: [Python-Dev] str.dedent
In-Reply-To: <ca471dc2050914161070f1f425@mail.gmail.com>
References: <dga72k$cah$1@sea.gmane.org>
	<ca471dc2050914161070f1f425@mail.gmail.com>
Message-ID: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>

Following Avi's suggestion, can I raise this thread up again? I think
that Reinhold's .dedent() method can be a good idea after all.

The idea is to add a method called "dedent" to strings. It would do
exactly what the current textwrap.dedent function does. The motivation
is to be able to write multilined strings easily without damaging the
visual indentation of the source code, like this:

def foo():
    msg = '''\
             From: %s
             To: %s
             Subject: Host failure report for %s
             Date: %s

             %s
             '''.dedent() % (fr, ', '.join(to), host, time.ctime(), err)

Writing multilined strings without spaces in the beginning of lines
makes functions harder to read, since although the Python parser is
happy with it, it breaks the visual indentation.
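
For comparison, the same message can already be built with the existing
textwrap.dedent function while keeping the source indented. A minimal
sketch; build_report and its parameters are hypothetical names standing in
for the variables used in the example above.

```python
import textwrap
import time

def build_report(fr, to, host, err):
    # The literal stays indented with the surrounding code; dedent() strips
    # the common leading whitespace before the values are filled in.
    msg = textwrap.dedent('''\
        From: %s
        To: %s
        Subject: Host failure report for %s
        Date: %s

        %s
        ''') % (fr, ', '.join(to), host, time.ctime(), err)
    return msg

report = build_report('a@x', ['b@y', 'c@z'], 'h', 'boom')
```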

On 9/15/05, Guido van Rossum <guido at python.org> wrote:
> From the sound of it, it's probably not worth endowing every string
> object with this method and hardcoding its implementation forever in C
> code. There are so many corner cases and variations on the
> functionality of "dedenting" a block that it's better to keep it as
> Python source code.

I've looked at the textwrap.dedent() function, and it's really simple
and well defined: Given a string s, take s.expandtabs().split('\n').
Take the minimal number of whitespace chars at the beginning of each
line (not counting lines with nothing but whitespaces), and remove it
from each line.

This means that the Python source code is simple, and there would be
no problems to write it in C.
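
The algorithm as described above can be sketched in a few lines. This is
an illustration of the description, not the textwrap source; simple_dedent
is a hypothetical name, and the sample text is made up.

```python
import textwrap

def simple_dedent(s):
    """Remove the smallest common indentation from every line of s."""
    lines = s.expandtabs().split('\n')
    # Indentation widths of the lines that are not whitespace-only.
    widths = [len(line) - len(line.lstrip()) for line in lines if line.strip()]
    margin = min(widths) if widths else 0
    return '\n'.join(line[margin:] for line in lines)

sample = '''\
    Sender: a
      To: b

    body
'''
result = simple_dedent(sample)
```

On this sample the sketch gives the same result as textwrap.dedent; inputs
with tabs or whitespace-only lines may be treated differently by the library
function.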

On 9/15/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
>
> -1
>
> Let it continue to live in textwrap where the existing pure python code
> adequately serves all string-like objects.  It's not worth losing the
> duck typing by attaching new methods to str, unicode, UserString, and
> everything else aspiring to be string-like.
>
> String methods should be limited to generic string manipulations.
> String applications should be in other namespaces.  That is why we don't
> have str.md5(), str.crc32(), str.ziplib(), etc.
>
> Also, I don't want to encourage dedenting as a way of life --- programs
> using it often are likely to be doing things the hard way.
>
I think that the difference between "dedent" and "md5", "crc32" and
such is the fact that making "dedent" a method helps writing code that
is easier to read.

Strings already have a lot of methods which don't make code clearer
the way "dedent" will, such as center, capitalize, expandtabs, and
many others. I think that given these, there's no reason not to add
"dedent" as a string method.

Noam

From raymond.hettinger at verizon.net  Sat Nov 12 21:18:02 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 12 Nov 2005 15:18:02 -0500
Subject: [Python-Dev] str.dedent
In-Reply-To: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
Message-ID: <000001c5e7c6$2f959440$2523c797@oemcomputer>

> The motivation
> is to be able to write multilined strings easily without damaging the
> visual indentation of the source code

That is somewhat misleading.  We already have that ability.  What is
being proposed is moving existing code to a different namespace.  So the
motivation is really something like:

   I want to write 
       s = s.dedent() 
   because it is too painful to write
       s = textwrap.dedent(s)



Raymond


From tcdelaney at optusnet.com.au  Sat Nov 12 21:32:01 2005
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Sun, 13 Nov 2005 07:32:01 +1100
Subject: [Python-Dev] Building Python with Visual C++ 2005 ExpressEdition
References: <dl073b$aro$1@sea.gmane.org>
	<4373A4F0.7010202@v.loewis.de><79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com>
	<43762E39.3020005@v.loewis.de>
Message-ID: <001401c5e7c8$2333cd00$0201a8c0@ryoko>

"Martin v. Löwis" wrote:

> Using VS2005 for official builds would only be an option with the
> next major release (2.5), and I personally don't see that happening:
> AFAICT, it is not that much of a change as VS2003 was (i.e. for
> Python, nothing is gained AFAICT); also, I'm getting the impression
> that VS2005 has too many bugs (*) to be useful, so I recommend to
> skip that release completely, and go then to VS2006 (or whenever
> that is released).

With Microsoft changing the CRT all the time, I think I'd much prefer seeing 
effort going towards MinGW becoming the official Windows build platform. 
There was a considerable amount of angst with the 2.4 release that can be 
blamed solely on the CRT change (and hence different DLLs to link to). And 
with them deprecating ISO standard functions ...

Tim Delaney 


From Scott.Daniels at Acm.Org  Sat Nov 12 22:49:03 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sat, 12 Nov 2005 13:49:03 -0800
Subject: [Python-Dev] to_int -- oops, one step missing for use.
In-Reply-To: <4372F68B.5050106@Acm.Org>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>	<4372B377.6050806@Acm.Org>
	<4372F68B.5050106@Acm.Org>
Message-ID: <dl5o07$d6g$1@sea.gmane.org>

OK, Tim and I corresponded off-list (after that Aahz graciously
suggested direct mail).  The code I have is still stand-alone,
but I get good speeds: 60% - 70% of the speed of int(string, base).

It will take a little bit to figure out how it best belongs in
the Python sources, so don't look for anything for a couple of weeks.
Also, I'd appreciate someone testing the code on a 64-bit machine.
Essentially all I'll need is that person to do a build and
then run the tests.  Unfortunately the posted module tests only
cover the 32-bit long cases, so what I need is another test tried
on a 64-bit long machine (that uses 64-bit longs in Python).
So, if you have a Python installation where
     sys.maxint == (1 << 63) - 1
is True, and you'd like to help, here's what I need.

If you already have the zip, retrieve:
      http://members.dsl-only.net/~daniels/dist/test_hi_powers.py

If you don't already have the zip, retrieve:
      http://members.dsl-only.net/~daniels/dist/to_int-0.10.zip
      (I just added the test_hi_powers.py to the tests in the zip)

Unpack the zip, do the build:
     $ python setup_x.py build

copy the built module into the test directory, cd to that dir, and
run test_hi_powers.py.  Let me know if the tests pass or fail.

Thanks.

--Scott David Daniels
Scott.Daniels at Acm.Org


From martin at v.loewis.de  Sat Nov 12 22:53:28 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 12 Nov 2005 22:53:28 +0100
Subject: [Python-Dev] Building Python with Visual C++ 2005 ExpressEdition
In-Reply-To: <001401c5e7c8$2333cd00$0201a8c0@ryoko>
References: <dl073b$aro$1@sea.gmane.org>	<4373A4F0.7010202@v.loewis.de><79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com>	<43762E39.3020005@v.loewis.de>
	<001401c5e7c8$2333cd00$0201a8c0@ryoko>
Message-ID: <43766458.4040803@v.loewis.de>

Tim Delaney wrote:
> With Microsoft changing the CRT all the time, I think I'd much prefer seeing 
> effort going towards MinGW becoming the official Windows build platform. 
> There was a considerable amount of angst with the 2.4 release that can be 
> blamed solely on the CRT change (and hence different DLLs to link to). And 
> with them deprecating ISO standard functions ...

The problem (for me, at least) is that VC is so much more convenient to
work with. That said, I would personally use what other people
contribute (and perhaps only invoke the build process for the actual
packaging).

So for this to happen, somebody would have to step forward and volunteer
as the "windows port maintainer" for the coming years; starting with the
changes to the build process.

This may be more tricky than it sounds at first: a strategy for building
the libraries that we include (such as gzip, openssl, Tcl/Tk) would be
needed as well. Plus, that person would have to defend the decision to
drop VC (just as I am in the position of defending the switch to VS
2003).

Regards,
Martin

From noamraph at gmail.com  Sat Nov 12 23:24:08 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 13 Nov 2005 00:24:08 +0200
Subject: [Python-Dev] str.dedent
In-Reply-To: <000001c5e7c6$2f959440$2523c797@oemcomputer>
References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<000001c5e7c6$2f959440$2523c797@oemcomputer>
Message-ID: <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>

On 11/12/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > The motivation
> > is to be able to write multilined strings easily without damaging the
> > visual indentation of the source code
>
> That is somewhat misleading.  We already have that ability.  What is
> being proposed is moving existing code to a different namespace.  So the
> motivation is really something like:
>
>    I want to write
>        s = s.dedent()
>    because it is too painful to write
>        s = textwrap.dedent(s)
>
Sorry, I didn't mean to mislead. I wrote "easily" - I guess using the
current textwrap.dedent isn't really hard, but still, writing:

import textwrap
...

    r = some_func(textwrap.dedent('''\
                                  line1
                                  line2'''))

Seems harder to me than simply

    r = some_func('''\
                  line1
                  line2'''.dedent())

This example brings up another reason why "dedent" as a method is a
good idea: It is a common convention to indent things according to the
last opening bracket. "dedent" as a function makes the indentation
grow by at least 7 characters, and by 16 characters if you don't do
"from textwrap import dedent".

Another reason to make it a method is that I think it focuses
attention on the string, which comes first, instead of on the
"textwrap.dedent", which is only there to make the code look nicer.

And, a last reason: making dedent a built-in method makes it a more
"official" way of doing things, and I think that this way of writing a
multilined string inside an indented block is really the best way to
do it.

Noam

From skip at pobox.com  Sun Nov 13 00:45:40 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 12 Nov 2005 17:45:40 -0600
Subject: [Python-Dev] Checking working copy consistency
In-Reply-To: <43763FC5.8070307@v.loewis.de>
References: <43763FC5.8070307@v.loewis.de>
Message-ID: <17270.32420.240635.71017@montanaro.dyndns.org>


    Martin> I made a script that runs through a subversion sandbox and
    Martin> checks whether all md5sums are correct.  Please run that on your
    Martin> working copy to see whether there are still any inconsistent
    Martin> files.

Thanks Martin.  I got no complaints (trunk, release23-maint,
release24-maint, peps), though see my next message...

Skip

From skip at pobox.com  Sun Nov 13 00:48:17 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 12 Nov 2005 17:48:17 -0600
Subject: [Python-Dev] Is some magic required to check out new files from svn?
Message-ID: <17270.32577.193894.694593@montanaro.dyndns.org>


Is there some magic required to check out new files from the repository?
I'm trying to build on the trunk and am getting compilation errors about
code.h not being found.  If I remember correctly, this is a new file brought
over from the ast branch.  Using cvs I would have executed something like
"cvs up -dPA ." if I found I was missing something (usually a new directory)
and wanted to make sure I was in sync with the trunk.

I read the developer's FAQ and the output of "svn up --help".  Executing
"svn up" or "svn info" tells me I'm already at rev 41430, which is the
latest rev, right?  Creating a fresh build subdirectory followed by
configure and make gives me this error:

    ../Objects/frameobject.c:6:18: code.h: No such file or directory

Sure enough, I have no code.h in my Include directory.

Before I wipe out Include and svn up again is there any debugging I can do
for someone smarter in the ways of Subversion than me?  Regarding my
checksum problems (which are not appearing at the moment), Martin asked for

    1. what specific revision you had checked out (svn info)
    2. what the recorded checksum is (see .svn/entries)
    3. what the committed-rev is
    4. what the actual checksum is on the file on disk
        (.svn/text-base/filename.base)
    5. whether or not the checksums svn reports match the
        ones you determined yourself.

I don't think #2, #4 or #5 apply here.  According to .svn/entries I have:

    <entry
       committed-rev="41430"
       name=""
       committed-date="2005-11-12T15:55:04.419664Z"
       url="svn+ssh://pythondev at svn.python.org/python/trunk"
       last-author="fredrik.lundh"
       kind="dir"
       uuid="6015fed2-1504-0410-9fe1-9d1591cc4771"
       prop-time="2005-11-12T18:00:07.000000Z"
       revision="41430"/>

Here's "svn info" output:

    Path: .
    URL: svn+ssh://pythondev at svn.python.org/python/trunk
    Repository UUID: 6015fed2-1504-0410-9fe1-9d1591cc4771
    Revision: 41430
    Node Kind: directory
    Schedule: normal
    Last Changed Author: fredrik.lundh
    Last Changed Rev: 41430
    Last Changed Date: 2005-11-12 09:55:04 -0600 (Sat, 12 Nov 2005)
    Properties Last Updated: 2005-11-12 12:00:07 -0600 (Sat, 12 Nov 2005)

I was running 1.2.0.  I just downloaded and built 1.2.3.  It made no
difference.

This is getting kinda frustrating.  I haven't got a lot of confidence in
Subversion at this point.

Skip

From greg.ewing at canterbury.ac.nz  Sun Nov 13 00:40:23 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 13 Nov 2005 12:40:23 +1300
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <17269.31319.806622.939477@montanaro.dyndns.org>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
Message-ID: <43767D67.7070307@canterbury.ac.nz>

skip at pobox.com wrote:

> Python could dictate that the
> way to play ball is for other packages (Tkinter, PyGtk, wxPython, etc) to
> feed Python the (socket, callback) pair.  Then you have a uniform way to
> control event-driven applications.

Certainly, if all other event-driven packages are willing
to change their ways, they can be made to work together.
There's not much that can be done with them the way they
are, however.
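
To make the quoted proposal concrete, here is a rough sketch of what
feeding a (socket, callback) pair to a central loop could look like. It is
an assumption-laden illustration built on the standard selectors module;
register and dispatch_once are hypothetical names, not an API that any of
the packages mentioned provides.

```python
import selectors
import socket

selector = selectors.DefaultSelector()

def register(sock, callback):
    # Hypothetical registration call: remember the callback with the socket.
    selector.register(sock, selectors.EVENT_READ, callback)

def dispatch_once(timeout=None):
    # Block until a registered socket is readable, then call its callback.
    handled = 0
    for key, _events in selector.select(timeout):
        key.data(key.fileobj)
        handled += 1
    return handled

received = []
reader, writer = socket.socketpair()
register(reader, lambda sock: received.append(sock.recv(16)))
writer.send(b"event")
handled = dispatch_once(timeout=1.0)
selector.unregister(reader)
reader.close()
writer.close()
```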

Also, putting the main event loop in Python then gives
Python itself a privileged position that it shouldn't
necessarily have.

Ultimately I think there needs to be an event dispatching
mechanism provided by the OS, that is universally used
by all packages that want events. With the proliferation
of event-driven systems these days, it's becoming as
fundamental a requirement as file I/O and deserves
serious OS support, I think.

Greg


From greg.ewing at canterbury.ac.nz  Sun Nov 13 00:50:00 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 13 Nov 2005 12:50:00 +1300
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com>
References: <437100A7.5050907@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
	<b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com>
Message-ID: <43767FA8.7090209@canterbury.ac.nz>

Noam Raphael wrote:

> All that is needed to make Tkinter and Michiel's
> code run together is a way to say "add this callback to the input
> hook" instead of the current "replace the current input hook with this
> callback". Then, when the interpreter is idle, it will call all the
> registered callbacks, one at a time, and everyone would be happy.

Except for those who don't like busy waiting.

Greg

From noamraph at gmail.com  Sun Nov 13 01:30:37 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 13 Nov 2005 02:30:37 +0200
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <43767FA8.7090209@canterbury.ac.nz>
References: <437100A7.5050907@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
	<b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com>
	<43767FA8.7090209@canterbury.ac.nz>
Message-ID: <b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com>

On 11/13/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Noam Raphael wrote:
>
> > All that is needed to make Tkinter and Michiel's
> > code run together is a way to say "add this callback to the input
> > hook" instead of the current "replace the current input hook with this
> > callback". Then, when the interpreter is idle, it will call all the
> > registered callbacks, one at a time, and everyone would be happy.
>
> Except for those who don't like busy waiting.
>
I'm not sure I understand what you meant. If you meant that it will
work slowly - a lot of people (including me) are using Tkinter without
a mainloop from the interactive shell, and don't feel the difference.
It uses exactly the method I described.
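
The "add this callback" idea quoted above can be sketched as a small
registry. This is only an illustration in Python; InputHookRegistry is a
hypothetical name, and the real input hook is a C-level function pointer
that holds a single callback.

```python
class InputHookRegistry:
    """Hypothetical registry; not an existing Python or Tkinter API."""

    def __init__(self):
        self._callbacks = []

    def add(self, callback):
        # "Add this callback" rather than "replace the current hook".
        self._callbacks.append(callback)

    def run_all(self):
        # Called whenever the interpreter is idle: each callback gets a turn.
        for callback in list(self._callbacks):
            callback()

calls = []
registry = InputHookRegistry()
registry.add(lambda: calls.append("tk"))
registry.add(lambda: calls.append("plot"))
registry.run_all()
```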

Noam

From jepler at unpythonic.net  Sun Nov 13 01:35:09 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Sat, 12 Nov 2005 18:35:09 -0600
Subject: [Python-Dev] to_int -- oops, one step missing for use.
In-Reply-To: <dl5o07$d6g$1@sea.gmane.org>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
	<4372B377.6050806@Acm.Org> <4372F68B.5050106@Acm.Org>
	<dl5o07$d6g$1@sea.gmane.org>
Message-ID: <20051113003509.GC27610@unpythonic.net>

$ python2.4 -c 'import sys; print sys.maxint, sys.maxint == (1<<63) - 1'
9223372036854775807 True
$ python2.4 test_hi_powers.py 
Test 0.2 of to_int 0.16
......................................................................
----------------------------------------------------------------------
Ran 70 tests in 0.006s

OK

From ianb at colorstudy.com  Sun Nov 13 03:00:42 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 12 Nov 2005 20:00:42 -0600
Subject: [Python-Dev] str.dedent
In-Reply-To: <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>
References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>	<000001c5e7c6$2f959440$2523c797@oemcomputer>
	<b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>
Message-ID: <43769E4A.5040408@colorstudy.com>

Noam Raphael wrote:
> Sorry, I didn't mean to mislead. I wrote "easily" - I guess using the
> current textwrap.dedent isn't really hard, but still, writing:
> 
> import textwrap
> ...
> 
>     r = some_func(textwrap.dedent('''\
>                                   line1
>                                   line2'''))
> 
> Seems harder to me than simply
> 
>     r = some_func('''\
>                   line1
>                   line2'''.dedent())

I think a better argument for this is that dedenting a literal string is
more of a syntactic operation than a functional one.  You don't think
"oh, I bet I'll need to do some dedenting on line 200 of this module, I
better import textwrap".  Instead you start writing a long string
literal once you get to line 200.  You can do it a few ways:

  some_func("line1\nline2")
  some_func("line1\n"
            "line2")
  some_func("""\
line1
line2""")
  # If nice whitespace would be pretty but not required:
  some_func("""
            line1
            line2""")

I often do that last one with HTML and SQL.

In practice textwrap.dedent() isn't one of the ways you are likely to
write this statement.  At least I've never done it that way (and I hit
the issue often), and I don't think I've seen code that has used that in
this circumstance.

Additionally I don't think textwrapping has anything particular to do
with dedenting, except perhaps that both functions were required when
that module was added.

I guess I just find the import cruft at the top of my files a little
annoying, and managing them rather tedious, so saying that you should
import textwrap because it makes a statement deep in the file look a
little prettier is unrealistic.  At the same time, the forms that don't
use it are rather ugly or sloppy.


-- 
Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org

From mhammond at skippinet.com.au  Sun Nov 13 04:15:25 2005
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Sun, 13 Nov 2005 14:15:25 +1100
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport  hooks
In-Reply-To: <43749D65.4040001@desys.de>
Message-ID: <DAELJHBGPBHPJKEBGGLNCEEKIDAD.mhammond@skippinet.com.au>


> release. The main reason why I changed the import behavior was
> pythonservice.exe from the win32 extensions. pythonservice.exe imports
> the module that contains the service class, but because
> pythonservice.exe doesn't run in optimized mode, it will only import a
> .py or a .pyc file, not a .pyo file. Because we always generate bytecode
> with -OO at distribution time, we either had to change the behavior of
> pythonservice.exe or change the import behavior of Python.

While ignoring the question of how Python should in the future handle
optimizations, I think it is safe to state that pythonservice.exe should
have the same basic functionality and operation in this regard as python.exe
does.  It doesn't sound too difficult to modify pythonservice to accept -O
flags, and to modify the service installation process to allow this flag to
be specified.  I'd certainly welcome any such patches.

Although getting off-topic for this list, note that for recent pywin32
releases, it is possible to host a service using python.exe directly, and
this is the technique py2exe uses to host service executables.  It would
take a little more work to set things up to work like that, but that's
probably not too unreasonable for a custom application with specialized
distribution requirements.  Using python.exe obviously means you get full
access to the  command-line facilities it provides.

So while I believe your idea for getting and setting these flags sounds
reasonable, and also believe that at face value the zipimport semantics
appear sane, I'm not sure we should use a weakness in a Python tool to
justify a change to Python itself.

Mark


From nico at tekNico.net  Sun Nov 13 10:05:58 2005
From: nico at tekNico.net (Nicola Larosa)
Date: Sun, 13 Nov 2005 10:05:58 +0100
Subject: [Python-Dev] OT pet peeve (was: Re: str.dedent)
In-Reply-To: <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>
References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>	<000001c5e7c6$2f959440$2523c797@oemcomputer>
	<b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>
Message-ID: <dl6vlo$nl2$1@sea.gmane.org>

> Sorry, I didn't mean to mislead. I wrote "easily" - I guess using the
> current textwrap.dedent isn't really hard, but still, writing:
> 
> import textwrap
> ....
> 
>     r = some_func(textwrap.dedent('''\
>                                   line1
>                                   line2'''))
> 
> Seems harder to me than simply
> 
>     r = some_func('''\
>                   line1
>                   line2'''.dedent())
> 
> This example brings up another reason why "dedent" as a method is a
> good idea: It is a common convention to indent things according to the
> last opening bracket. "dedent" as a function makes the indentation
> grow by at least 7 characters, and by 16 characters if you don't do
> "from textwrap import dedent".

It's a common convention, but a rather ugly one. It makes it harder to break
lines at 78-80 chars, and to use long enough identifiers.

I find it more useful to go straight to the next line, indenting the usual
four spaces (and also separating nested stuff):

    r = some_func(
        textwrap.dedent(
            '''\
            line1
            line2'''))

This style uses up more vertical space, but I find it also gives code a
clearer overall shape.

-- 
Nicola Larosa - nico at tekNico.net

Use of threads can be very deceptive. [...] in almost all cases they
also make debugging, testing, and maintenance vastly more difficult
and sometimes impossible.
  http://java.sun.com/products/jfc/tsc/articles/threads/threads1.html#why


From fredrik at pythonware.com  Sun Nov 13 10:11:23 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 13 Nov 2005 10:11:23 +0100
Subject: [Python-Dev] Is some magic required to check out new files from
	svn?
References: <17270.32577.193894.694593@montanaro.dyndns.org>
Message-ID: <dl6vvl$o9c$1@sea.gmane.org>

skip at pobox.com wrote:

> I read the developer's FAQ and the output of "svn up --help".  Executing
> "svn up" or "svn info" tells me I'm already at rev 41430, which is the
> latest rev, right?  Creating a fresh build subdirectory followed by
> configure and make gives me this error:
>
>    ../Objects/frameobject.c:6:18: code.h: No such file or directory
>
> Sure enough, I have no code.h in my Include directory.

what does

    svn status Include/code.h

say?  if it says

    !    Include/code.h

what happens if you do

    svn revert Include/code.h

?

doing a full

    svn status

and looking for ! entries will tell you if more files are missing.
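
If the status output is long, the "!" entries can be picked out with a few
lines of Python. A small sketch, assuming the "!" sits in the first column
and the remaining status columns are blank, as in the line shown above;
missing_paths is a hypothetical helper and the second sample line is made
up for illustration.

```python
def missing_paths(status_output):
    """Return the paths that svn status flags with '!' in the first column."""
    missing = []
    for line in status_output.splitlines():
        if line.startswith("!"):
            missing.append(line[1:].strip())
    return missing

# First line as quoted above; the "M" line is an invented example.
sample = "!    Include/code.h\nM    Objects/frameobject.c\n"
found = missing_paths(sample)
```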

</F> 




From martin at v.loewis.de  Sun Nov 13 10:35:15 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 Nov 2005 10:35:15 +0100
Subject: [Python-Dev] Is some magic required to check out new files from
 svn?
In-Reply-To: <17270.32577.193894.694593@montanaro.dyndns.org>
References: <17270.32577.193894.694593@montanaro.dyndns.org>
Message-ID: <437708D3.8010406@v.loewis.de>

skip at pobox.com wrote:
> Is there some magic required to check out new files from the repository?
> I'm trying to build on the trunk and am getting compilation errors about
> code.h not being found.  If I remember correctly, this is a new file brought
> over from the ast branch.  Using cvs I would have executed something like
> "cvs up -dPA ." if I found I was missing something (usually a new directory)
> and wanted to make sure I was in sync with the trunk.

code.h should live in Include. It was originally committed to CVS, so it
is in the subversion repository from day one; it should always have
been there since you started using subversion.

Do you have code.h mentioned in Include/.svn/entries?

> This is getting kinda frustrating.  I haven't got a lot of confidence in
> Subversion at this point.

I can understand that. However, you should get confidence from the fact
that nobody else is seeing these problems :-)

I recommend using pre-built binaries, e.g. the ones from

http://metissian.com/projects/macosx/subversion/

I would also recommend throwing away the sandbox completely and checking it
out from scratch. Please report whether this gives you code.h.

Regards,
Martin

From krumms at gmail.com  Sun Nov 13 11:34:44 2005
From: krumms at gmail.com (Thomas Lee)
Date: Sun, 13 Nov 2005 20:34:44 +1000
Subject: [Python-Dev] Implementation of PEP 341
Message-ID: <437716C4.8050309@gmail.com>

Hi all,

I've been using Python for a few years and, as of a few days ago, 
finally decided to put the effort into contributing code back to the 
project.

I'm attempting to implement PEP 341 (unification of try/except and 
try/finally) against HEAD. However, this being my first attempt at a 
change to the syntax there's been a bit of a learning curve.

I've modified Grammar/Grammar to use the new try_stmt grammar, updated 
Parser/Python.asdl to accept a stmt* finalbody for TryExcept instances 
and modified Python/ast.c to handle the changes to Python.asdl - 
generating an AST for the finalbody.

All that remains as far as I can see is to modify Python/compile.c to 
generate the necessary code and update Modules/parsermodule.c to 
accommodate the changes to the grammar. (If anybody has further input as 
to what needs to be done here, I'm all ears!)

The difficulty I'm having is in Python/compile.c: currently there are 
two functions which generate the code for the two existing try_stmt 
paths. compiler_try_finally doesn't need any changes as far as I can 
see. compiler_try_except, however, now needs to generate code to handle 
TryExcept.finalbody (which I added to Parser/Python.asdl).

This sounds easy enough, but the following is causing me difficulty:

/* BEGIN */

ADDOP_JREL(c, SETUP_EXCEPT, except);
compiler_use_next_block(c, body);
if (!compiler_push_fblock(c, EXCEPT, body))
    return 0;
VISIT_SEQ(c, stmt, s->v.TryExcept.body);
ADDOP(c, POP_BLOCK);
compiler_pop_fblock(c, EXCEPT, body);

/* END */

A couple of things confuse me here:
1. What's the purpose of the push_fblock/pop_fblock calls?
2. Do I need to add "ADDOP_JREL(c, SETUP_FINALLY, end);" before/after 
SETUP_EXCEPT? Or will this conflict with the SETUP_EXCEPT op? I don't 
know enough about the internals of SETUP_EXCEPT/SETUP_FINALLY to know 
what to do here.

Also, in compiler_try_finally we see this code:

/* BEGIN */

ADDOP_JREL(c, SETUP_FINALLY, end);
compiler_use_next_block(c, body);
if (!compiler_push_fblock(c, FINALLY_TRY, body))
    return 0;
VISIT_SEQ(c, stmt, s->v.TryFinally.body);
ADDOP(c, POP_BLOCK);
compiler_pop_fblock(c, FINALLY_TRY, body);

ADDOP_O(c, LOAD_CONST, Py_None, consts);

/* END */

Why the LOAD_CONST Py_None? Does this serve any purpose? some sort of 
weird pseudo return value? Or does it have a semantic purpose that I'll 
have to reproduce in compiler_try_except?

Cheers, and thanks for any help you can provide :)

Tom



From ncoghlan at gmail.com  Sun Nov 13 13:27:26 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 13 Nov 2005 22:27:26 +1000
Subject: [Python-Dev] Implementation of PEP 341
In-Reply-To: <437716C4.8050309@gmail.com>
References: <437716C4.8050309@gmail.com>
Message-ID: <4377312E.2000002@gmail.com>

Thomas Lee wrote:
> Hi all,
> 
> I've been using Python for a few years and, as of a few days ago, 
> finally decided to put the effort into contributing code back to the 
> project.
> 
> I'm attempting to implement PEP 341 (unification of try/except and 
> try/finally) against HEAD. However, this being my first attempt at a 
> change to the syntax there's been a bit of a learning curve.

Thanks for having a go at this.

> I've modified Grammar/Grammar to use the new try_stmt grammar, updated 
> Parser/Python.asdl to accept a stmt* finalbody for TryExcept instances 
> and modified Python/ast.c to handle the changes to Python.asdl - 
> generating an AST for the finalbody.

Consider leaving the AST definition alone, and simply changing the frontend 
parser to process:

   try:
     BLOCK1
   except:
     BLOCK2
   finally:
     BLOCK3

almost precisely as if it were written:

   try:
     try:
       BLOCK1
     except:
       BLOCK2
   finally:
     BLOCK3

That is, generate a TryExcept inside a TryFinally at the AST level, rather 
than trying to give TryExcept the ability to handle a finally block directly.

Specifically, if you've determined that a finally clause is present in the 
extended statement in Python/ast.c, do something like:

   inner_seq = asdl_seq_new(1);
   asdl_seq_SET(inner_seq, 0,
                TryExcept(body_seq, handlers, else_seq, LINENO(n)));
   return TryFinally(inner_seq, finally_seq, LINENO(n));

body_seq and else_seq actually have meaningful names like suite_seq1 and 
suite_seq2 in the current code ;)

Semantics-wise, this is exactly the behaviour we want, and making it pure 
syntactic sugar means the backend doesn't need to care about the new syntax at 
all. It also significantly lessens the risk of the change causing any problems 
in the compilation of normal try-except blocks.
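
The claimed equivalence can be checked from the Python side. This is a
small sketch, assuming an interpreter that already accepts the unified
PEP 341 syntax; the function names unified and nested are invented for
the illustration.

```python
def unified(fail, log):
    # The unified form proposed by PEP 341.
    try:
        log.append("body")
        if fail:
            raise ValueError("boom")
    except ValueError:
        log.append("handler")
    finally:
        log.append("cleanup")
    return log


def nested(fail, log):
    # The same statement written as try/except inside try/finally.
    try:
        try:
            log.append("body")
            if fail:
                raise ValueError("boom")
        except ValueError:
            log.append("handler")
    finally:
        log.append("cleanup")
    return log


for fail in (False, True):
    assert unified(fail, []) == nested(fail, [])
print(unified(True, []))
```

Both forms run the handler before the cleanup code, and run the cleanup
code whether or not an exception was raised.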

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From skip at pobox.com  Sun Nov 13 14:08:15 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 13 Nov 2005 07:08:15 -0600
Subject: [Python-Dev] Is some magic required to check out new files from
 svn?
In-Reply-To: <dl6vvl$o9c$1@sea.gmane.org>
References: <17270.32577.193894.694593@montanaro.dyndns.org>
	<dl6vvl$o9c$1@sea.gmane.org>
Message-ID: <17271.15039.201796.513101@montanaro.dyndns.org>


    >> ../Objects/frameobject.c:6:18: code.h: No such file or directory
    >> 
    >> Sure enough, I have no code.h in my Include directory.

    Fredrik> what does

    Fredrik>     svn status Include/code.h

    Fredrik> say?  if it says

It reports nothing.

    Fredrik> doing a full

    Fredrik>     svn status

    Fredrik> and looking for ! entries will tell you if more files are missing.

The full svn status output is

    % svn status
    !      .
    !      Python

Just for the heck of it, I tried "svn revert Include/code.h" and got

    Skipped 'Include/code.h'

code.h is not mentioned in Include/.svn/entries.

Skip

From jjl at pobox.com  Sun Nov 13 14:23:56 2005
From: jjl at pobox.com (John J Lee)
Date: Sun, 13 Nov 2005 13:23:56 +0000 (UTC)
Subject: [Python-Dev] Is some magic required to check out new files from
 svn?
In-Reply-To: <17270.32577.193894.694593@montanaro.dyndns.org>
References: <17270.32577.193894.694593@montanaro.dyndns.org>
Message-ID: <Pine.LNX.4.58.0511131320190.6217@alice>

On Sat, 12 Nov 2005 skip at pobox.com wrote:
[...]
> Before I wipe out Include and svn up again is there any debugging I can do
> for someone smarter in the ways of Subversion than me?  Regarding my
[...]

Output of the svnversion command?  That shows switched and locally
modified files, etc.

I'm not an svn guru, but I find that command useful, especially to point
out when I switched some deep directory then forgot about it.


John

From skip at pobox.com  Sun Nov 13 14:27:29 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 13 Nov 2005 07:27:29 -0600
Subject: [Python-Dev] Is some magic required to check out new files from
 svn?
In-Reply-To: <437708D3.8010406@v.loewis.de>
References: <17270.32577.193894.694593@montanaro.dyndns.org>
	<437708D3.8010406@v.loewis.de>
Message-ID: <17271.16193.325664.851527@montanaro.dyndns.org>


    Martin> code.h should live in Include. It was originally committed to
    Martin> CVS, so it is in the subversion repository from day one; it
    Martin> should always have been there since you started using
    Martin> subversion.

Sorry, I had some strange idea it was new with the ast branch.

    Martin> Do you have code.h mentioned in Include/.svn/entries?

Nope.

    Martin> I recommend to use pre-built binaries, e.g. the ones from

    Martin> http://metissian.com/projects/macosx/subversion/

That was where I got the 1.2.0 version I was having trouble with originally.
I built 1.2.3 from source.  I'll give the prebuilt 1.2.3 a try.

    Martin> I would also recommend to throw away the sandbox completely and
    Martin> check it out from scratch. Please report whether this gives you
    Martin> code.h.

Yes, it does (still with my built-from-source 1.2.3).

Skip

From skip at pobox.com  Sun Nov 13 14:33:55 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 13 Nov 2005 07:33:55 -0600
Subject: [Python-Dev] Is some magic required to check out new files from
 svn?
In-Reply-To: <Pine.LNX.4.58.0511131320190.6217@alice>
References: <17270.32577.193894.694593@montanaro.dyndns.org>
	<Pine.LNX.4.58.0511131320190.6217@alice>
Message-ID: <17271.16579.87469.834712@montanaro.dyndns.org>


    John> Output of the svnversion command?  That shows switched and locally
    John> modified files, etc.

    John> I'm not an svn guru, but I find that command useful, especially to
    John> point out when I switched some deep directory then forgot about
    John> it.

Thanks, I'll remember it for next time...

Skip

From martin at v.loewis.de  Sun Nov 13 14:40:06 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 Nov 2005 14:40:06 +0100
Subject: [Python-Dev] Is some magic required to check out new files from
 svn?
In-Reply-To: <17271.16193.325664.851527@montanaro.dyndns.org>
References: <17270.32577.193894.694593@montanaro.dyndns.org>
	<437708D3.8010406@v.loewis.de>
	<17271.16193.325664.851527@montanaro.dyndns.org>
Message-ID: <43774236.3040700@v.loewis.de>

skip at pobox.com wrote:
>     Martin> code.h should live in Include. It was originally committed to
>     Martin> CVS, so it is in the subversion repository from day one; it
>     Martin> should always have been there since you started using
>     Martin> subversion.
> 
> Sorry, I had some strange idea it was new with the ast branch.

It was, yes. However, the conversion to subversion happened after the
ast branch was checked in.

>     Martin> I would also recommend to throw away the sandbox completely and
>     Martin> check it out from scratch. Please report whether this gives you
>     Martin> code.h.
> 
> Yes, it does (still with my built-from-source 1.2.3).

Ok. I am now convinced (also because of the other information you 
reported) that you indeed had continued to use one of the test 
conversion repositories from before the switchover. That would explain
all the problems you see.

Regards,
Martin

From martin at v.loewis.de  Sun Nov 13 14:50:20 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 Nov 2005 14:50:20 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4374DB69.2080804@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>		<43729CAB.5070106@c2b2.columbia.edu>		<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>		<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>	<4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
Message-ID: <4377449C.5080206@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
> I have an extension module for scientific visualization. This extension 
> module opens one or more windows, in which plots can be made. Something 
> similar to the plotting capabilities of Matlab.
> 
> For the graphics windows to remain responsive, I need to make sure that 
> its events get handled. So I need an event loop. At the same time, the 
> user can enter new Python commands, which also need to be handled.

My recommendation: create a thread for the graphics window, which runs
the event loop of the graphics window. That way, you are completely
independent of any other event loops that may happen. It is also
independent of the operating system (as long as the thread module
is available).
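
A minimal sketch of that arrangement, using the threading and queue
modules of a current standard library. The queue merely stands in for
the toolkit's real event source, and the names event_loop, events and
handled are hypothetical.

```python
import queue
import threading


def event_loop(events, handled):
    """Handle events one at a time until a None sentinel arrives."""
    while True:
        event = events.get()      # blocks until an event is available
        if event is None:
            break
        handled.append(event)     # stand-in for real event handling


events = queue.Queue()
handled = []
worker = threading.Thread(target=event_loop, args=(events, handled))
worker.start()

# The main thread stays free to execute other commands meanwhile.
for name in ("expose", "resize", "click"):
    events.put(name)
events.put(None)                  # ask the loop to stop
worker.join()
print(handled)
```

A real graphics toolkit may also restrict which thread is allowed to
touch its windows, which this sketch does not address.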

Regards,
Martin

From martin at v.loewis.de  Sun Nov 13 14:54:48 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 Nov 2005 14:54:48 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <DAELJHBGPBHPJKEBGGLNIEBBICAD.mhammond@skippinet.com.au>
References: <DAELJHBGPBHPJKEBGGLNIEBBICAD.mhammond@skippinet.com.au>
Message-ID: <437745A8.60104@v.loewis.de>

Mark Hammond wrote:
> : Currently, event loops are available in Python via PyOS_InputHook, a
> : pointer to a user-defined function that is called when Python is idle
> : (waiting for user input). However, an event loop using PyOS_InputHook
> : has some inherent limitations, so I am thinking about how to improve
> : event loop support in Python.
> 
> Either we have an unusual definition of "event loop" (as many many other
> toolkits have implemented event loops without PyOS_InputHook), or the
> requirement is for an event loop that plays nicely with the "interactive
> loop" in Python.exe.

I would guess there is an unusual definition of an "event loop". It is
probably that inside the hook, a "process_some_events()" function is
invoked, which loops until some event queue is empty; this is not the
usual infinite-loop-until-user-terminates-program. For this to work, you
need a guarantee that the hook is invoked frequently.
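
To make the distinction concrete, here is a sketch of what I guess such
a hook does. The queue named pending and the handler argument are
assumptions for the illustration, not part of any real toolkit.

```python
from collections import deque

pending = deque()   # hypothetical event queue filled by the toolkit


def process_some_events(handle):
    """Handle queued events until the queue is empty, then return."""
    count = 0
    while pending:
        handle(pending.popleft())
        count += 1
    return count


seen = []
pending.extend(["expose", "resize"])
print(process_some_events(seen.append))   # drains the two queued events
print(process_some_events(seen.append))   # nothing queued: returns at once
```

Unlike a true event loop, this returns as soon as the queue is empty, so
events that arrive later wait until the hook is called again.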

Again, I still think running the loop (as a true event loop) in a
separate thread would probably solve the problem.

Regards,
Martin

From kozlovsky at mail.spbnit.ru  Sun Nov 13 14:57:40 2005
From: kozlovsky at mail.spbnit.ru (Alexander Kozlovsky)
Date: Sun, 13 Nov 2005 16:57:40 +0300
Subject: [Python-Dev] str.dedent
In-Reply-To: <000001c5e7c6$2f959440$2523c797@oemcomputer>
References: <000001c5e7c6$2f959440$2523c797@oemcomputer>
Message-ID: <5610377368.20051113165740@mail.spbnit.ru>

Raymond Hettinger wrote:
> That is somewhat misleading.  We already have that ability.  What is
> being proposed is moving existing code to a different namespace.  So the
> motivation is really something like:
> 
>    I want to write 
>        s = s.dedent() 
>    because it is too painful to write
>        s = textwrap.dedent(s)

From a technical point of view, there is nothing wrong with placing
this functionality in textwrap. But from a usability point of view,
using textwrap.dedent is like importing some module to do string
concatenation or integer addition.

In the textwrap module this function is placed in the section "Loosely (!)
related functionality". When a Python beginner tries to find the "Pythonic"
way of dedenting (and she knows that in Python "there
should be one -- and preferably only one -- obvious way to do it"),
it is very unlikely that she will think "Which module might contain
standard string dedenting? Yes, of course, textwrap! I'm sure
I'll find the necessary function there!"

> String methods should be limited to generic string manipulations.
> String applications should be in other namespaces.  That is why we don't
> have str.md5(), str.crc32(), str.ziplib(), etc.

I think dedenting must be classified as a "generic string manipulation".

The need for string dedenting results from meaningful indentation
and the widespread use of text editors with folding support. Multiline
strings without leading whitespace break correct folding in some editors.
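
For reference, this is how the existing function is used today to keep
a multiline literal aligned with the surrounding code. It is only a
small sketch; the function name report and the message text are made up.

```python
import textwrap


def report(host):
    # The literal is indented to match the code; dedent removes the
    # whitespace that is common to all of its lines.
    msg = textwrap.dedent("""\
        Subject: Host failure report for %s

        %s is down
        """) % (host, host)
    return msg


print(report("alpha"))
```

The backslash after the opening quotes keeps the string from starting
with an empty line.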


Best regards,
 Alexander                            mailto:kozlovsky at mail.spbnit.ru


From ncoghlan at gmail.com  Sun Nov 13 15:36:18 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 Nov 2005 00:36:18 +1000
Subject: [Python-Dev] Is some magic required to check out new files from
 svn?
In-Reply-To: <43774236.3040700@v.loewis.de>
References: <17270.32577.193894.694593@montanaro.dyndns.org>	<437708D3.8010406@v.loewis.de>	<17271.16193.325664.851527@montanaro.dyndns.org>
	<43774236.3040700@v.loewis.de>
Message-ID: <43774F62.6090502@gmail.com>

Martin v. Löwis wrote:
> skip at pobox.com wrote:
>>     Martin> I would also recommend to throw away the sandbox completely and
>>     Martin> check it out from scratch. Please report whether this gives you
>>     Martin> code.h.
>>
>> Yes, it does (still with my built-from-source 1.2.3).
> 
> Ok. I am now convinced (also because of the other information you 
> reported) that you indeed had continued to use one of the test 
> conversion repositories from before the switchover. That would explain
> all the problems you see.

FWIW, I haven't been following Skip's subversion woes closely, but the 
behaviour he reported seems to match the symptoms I got when I tried to update 
my test sandbox after the official changeover (I blew the sandbox away 
completely as soon as I got checksum errors, though, so I didn't see any of 
the later strangeness).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Sun Nov 13 15:45:16 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 Nov 2005 00:45:16 +1000
Subject: [Python-Dev] Implementation of PEP 341
In-Reply-To: <4377486B.1090400@gmail.com>
References: <437716C4.8050309@gmail.com> <4377312E.2000002@gmail.com>
	<4377486B.1090400@gmail.com>
Message-ID: <4377517C.9000808@gmail.com>

Thomas Lee wrote:
> Implemented as you suggested and tested. I'll submit the patch to the 
> tracker on sourceforge shortly. Are you guys still after contextual 
> diffs as per the developer pages, or is an svn diff the preferred way to 
> submit patches now?

svn diff should be fine. Although I thought Brett had actually updated those 
pages after the move to svn. . .

> Thanks very much for all your help, Nick. It was extremely informative.

I think we can chalk up a respectable win for the AST-based compiler - the 
trick I suggested wouldn't really have been practical without the AST layer 
between the parser and the compiler.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From mal at egenix.com  Sun Nov 13 18:43:54 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 13 Nov 2005 18:43:54 +0100
Subject: [Python-Dev] str.dedent
In-Reply-To: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
References: <dga72k$cah$1@sea.gmane.org>	<ca471dc2050914161070f1f425@mail.gmail.com>
	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
Message-ID: <43777B5A.6030602@egenix.com>

Noam Raphael wrote:
> Following Avi's suggestion, can I raise this thread up again? I think
> that Reinhold's .dedent() method can be a good idea after all.
> 
> The idea is to add a method called "dedent" to strings. It would do
> exactly what the current textwrap.dedent function does. 

You are missing a point here: string methods were introduced
to make switching from plain 8-bit strings to Unicode easier.

As such they are only needed in cases where an algorithm
has to work on the respective internals differently or where direct
access to the internals makes a huge difference in terms
of performance.

In your use case, the algorithm is independent of the data type
internals and can be defined solely by using existing string
method APIs.
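
As a rough illustration of that point, a dedent built only from existing
string methods might look like the sketch below. This is not the
textwrap implementation: it simply counts leading space and tab
characters and does not check that mixed tab/space prefixes agree.

```python
def dedent(text):
    """Strip the leading whitespace common to all non-blank lines."""
    lines = text.split("\n")
    margins = [len(line) - len(line.lstrip(" \t"))
               for line in lines if line.strip()]
    margin = min(margins) if margins else 0
    # Blank lines are emptied; the rest lose the first `margin` characters.
    return "\n".join(line[margin:] if line.strip() else ""
                     for line in lines)


print(dedent("    a\n      b\n\n    c\n"))
```

Only split, strip, lstrip and join are needed, none of which depends on
how the string type stores its data.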

> The motivation
> is to be able to write multilined strings easily without damaging the
> visual indentation of the source code, like this:
> 
> def foo():
>     msg = '''\
>              From: %s
>              To: %s\r\n'
>              Subject: Host failure report for %s
>              Date: %s
> 
>              %s
>              '''.dedent() % (fr, ', '.join(to), host, time.ctime(), err)
> 
> Writing multilined strings without spaces in the beginning of lines
> makes functions harder to read, since although the Python parser is
> happy with it, it breaks the visual indentation.

This is really a minor compiler/parser issue and not one which
warrants adding another string method.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 13 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2005-10-17: Released mxODBC.Zope.DA 1.0.9        http://zope.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From solipsis at pitrou.net  Sun Nov 13 18:51:47 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 13 Nov 2005 18:51:47 +0100
Subject: [Python-Dev] str.dedent
In-Reply-To: <43777B5A.6030602@egenix.com>
References: <dga72k$cah$1@sea.gmane.org>
	<ca471dc2050914161070f1f425@mail.gmail.com>
	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<43777B5A.6030602@egenix.com>
Message-ID: <1131904308.5684.8.camel@fsol>


> You are missing a point here: string methods were introduced
> to make switching from plain 8-bit strings to Unicode easier.

Is that the only purpose?
I agree with the OP that using string methods is much nicer and more
convenient than having to import separate modules.
Especially, it is nice to just type help(str) in the interactive prompt
and get the list of supported methods.

Also, these methods are living in the namespace of the supported
objects. It feels very natural, and goes hand in hand with Python's
object-oriented nature.

(just my 2 cents - I am not arguing for or against the specific case of
dedent, by the way)

Regards

Antoine.



From nnorwitz at gmail.com  Sun Nov 13 20:41:57 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 13 Nov 2005 11:41:57 -0800
Subject: [Python-Dev] ast status, memory leaks, etc
Message-ID: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>

There's still more clean up work to go, but the current AST is
hopefully much closer to the behaviour before it was checked in. 
There are still a few small memory leaks.

After running the test suite, the total references were around 380k
(down from over 1,000k).  I'm not sure exactly what the total refs
were just before AST was checked in, but I believe it was over 340k. 
So there are likely some more ref leaks that should be investigated. 
It would be good to know the exact number before AST was checked in
and now, minus any new tests.

There is one memory reference error in test_coding:

Invalid read of size 1
   at 0x41304E: tok_nextc (tokenizer.c:876)
   by 0x413874: PyTokenizer_Get (tokenizer.c:1099)
   by 0x411962: parsetok (parsetok.c:124)
   by 0x498D1F: PyParser_ASTFromFile (pythonrun.c:1292)
   by 0x48D79A: load_source_module (import.c:777)
   by 0x48E90F: load_module (import.c:1665)
   by 0x48ED61: import_submodule (import.c:2259)
   by 0x48EF60: load_next (import.c:2079)
   by 0x48F44D: import_module_ex (import.c:1921)
   by 0x48F715: PyImport_ImportModuleEx (import.c:1955)
   by 0x46D090: builtin___import__ (bltinmodule.c:44)
 Address 0x1863E8F6 is 2 bytes before a block of size 8192 free'd
   at 0x11B1BA8A: free (vg_replace_malloc.c:235)
   by 0x4127DB: decoding_fgets (tokenizer.c:167)
   by 0x412F1F: tok_nextc (tokenizer.c:823)
   by 0x413874: PyTokenizer_Get (tokenizer.c:1099)
   by 0x411962: parsetok (parsetok.c:124)
   by 0x498D1F: PyParser_ASTFromFile (pythonrun.c:1292)
   by 0x48D79A: load_source_module (import.c:777)
   by 0x48E90F: load_module (import.c:1665)
   by 0x48ED61: import_submodule (import.c:2259)
   by 0x48EF60: load_next (import.c:2079)
   by 0x48F44D: import_module_ex (import.c:1921)
   by 0x48F715: PyImport_ImportModuleEx (import.c:1955)
   by 0x46D090: builtin___import__ (bltinmodule.c:44)

I had a patch for this somewhere, I'll try to find it.  However, I
only fixed this exact error, there was another path that could still
be problematic.

Most of the memory leaks show up when we are forking in:

test_fork1
test_pty
test_subprocess

Here's what I have so far.  There are probably some more.  It would be
great if someone could try to find and fix these leaks.

n
--
 16 bytes in 1 blocks are definitely lost in loss record 25 of 599
    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
    by 0x4CA102: alias (Python-ast.c:1066)
    by 0x4CD918: alias_for_import_name (ast.c:2199)
    by 0x4D0C4E: ast_for_stmt (ast.c:2244)
    by 0x4D15E3: PyAST_FromNode (ast.c:234)
    by 0x499078: Py_CompileStringFlags (pythonrun.c:1275)
    by 0x46D6DF: builtin_compile (bltinmodule.c:457)

 56 bytes in 1 blocks are definitely lost in loss record 87 of 599
    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
    by 0x4C9C92: Name (Python-ast.c:860)
    by 0x4CE4BA: ast_for_expr (ast.c:1222)
    by 0x4D1021: ast_for_stmt (ast.c:1900)
    by 0x4D15E3: PyAST_FromNode (ast.c:234)
    by 0x499078: Py_CompileStringFlags (pythonrun.c:1275)
    by 0x46D6DF: builtin_compile (bltinmodule.c:457)

 112 bytes in 2 blocks are definitely lost in loss record 198 of 674
    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
    by 0x4C9C92: Name (Python-ast.c:860)
    by 0x4CE4BA: ast_for_expr (ast.c:1222)
    by 0x4D1021: ast_for_stmt (ast.c:1900)
    by 0x4D16D5: PyAST_FromNode (ast.c:275)
    by 0x499078: Py_CompileStringFlags (pythonrun.c:1275)
    by 0x46D6DF: builtin_compile (bltinmodule.c:457)

 56 bytes in 1 blocks are definitely lost in loss record 89 of 599
    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
    by 0x4C9C92: Name (Python-ast.c:860)
    by 0x4CF3AF: ast_for_arguments (ast.c:650)
    by 0x4D1BFF: ast_for_funcdef (ast.c:830)
    by 0x4D15E3: PyAST_FromNode (ast.c:234)
    by 0x499161: PyRun_StringFlags (pythonrun.c:1275)
    by 0x47B1B2: PyEval_EvalFrameEx (ceval.c:4221)
    by 0x47CCCC: PyEval_EvalCodeEx (ceval.c:2739)
    by 0x47ABCC: PyEval_EvalFrameEx (ceval.c:3657)
    by 0x47CCCC: PyEval_EvalCodeEx (ceval.c:2739)
    by 0x4C27F8: function_call (funcobject.c:550)

 112 bytes in 2 blocks are definitely lost in loss record 189 of 651
    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
    by 0x4C9C92: Name (Python-ast.c:860)
    by 0x4CE4BA: ast_for_expr (ast.c:1222)
    by 0x4D02F7: ast_for_stmt (ast.c:2028)
    by 0x4D16D5: PyAST_FromNode (ast.c:275)
    by 0x499078: Py_CompileStringFlags (pythonrun.c:1275)
    by 0x46D6DF: builtin_compile (bltinmodule.c:457)

 56 bytes in 1 blocks are definitely lost in loss record 118 of 651
    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
    by 0x4C9A41: Num (Python-ast.c:751)
    by 0x4CE578: ast_for_expr (ast.c:1237)
    by 0x4CF4ED: ast_for_arguments (ast.c:629)
    by 0x4D1BFF: ast_for_funcdef (ast.c:830)
    by 0x4D15E3: PyAST_FromNode (ast.c:234)
    by 0x499161: PyRun_StringFlags (pythonrun.c:1275)
    by 0x47B1B2: PyEval_EvalFrameEx (ceval.c:4221)
    by 0x47CCCC: PyEval_EvalCodeEx (ceval.c:2739)
    by 0x47ABCC: PyEval_EvalFrameEx (ceval.c:3657)
    by 0x47CCCC: PyEval_EvalCodeEx (ceval.c:2739)
    by 0x4C27F8: function_call (funcobject.c:550)

 112 (56 direct, 56 indirect) bytes in 1 blocks are definitely lost in
loss record 185 of 651
    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
    by 0x4C97CA: GeneratorExp (Python-ast.c:648)
    by 0x4CEE4F: ast_for_expr (ast.c:1251)
    by 0x4D1021: ast_for_stmt (ast.c:1900)
    by 0x4D16D5: PyAST_FromNode (ast.c:275)
    by 0x499078: Py_CompileStringFlags (pythonrun.c:1275)
    by 0x46D6DF: builtin_compile (bltinmodule.c:457)

 1024 bytes in 1 blocks are definitely lost in loss record 441 of 651
    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
    by 0x43F8C4: PyObject_Malloc (obmalloc.c:500)
    by 0x4B808F: PyNode_AddChild (node.c:95)
    by 0x4B8386: PyParser_AddToken (parser.c:126)
    by 0x411944: parsetok (parsetok.c:165)
    by 0x499062: Py_CompileStringFlags (pythonrun.c:1271)
    by 0x46D6DF: builtin_compile (bltinmodule.c:457)

From nyamatongwe at gmail.com  Sun Nov 13 23:02:48 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Mon, 14 Nov 2005 09:02:48 +1100
Subject: [Python-Dev] Building Python with Visual C++ 2005 ExpressEdition
In-Reply-To: <43766458.4040803@v.loewis.de>
References: <dl073b$aro$1@sea.gmane.org> <4373A4F0.7010202@v.loewis.de>
	<79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com>
	<43762E39.3020005@v.loewis.de> <001401c5e7c8$2333cd00$0201a8c0@ryoko>
	<43766458.4040803@v.loewis.de>
Message-ID: <50862ebd0511131402q768b97d3g8593859178cf7e16@mail.gmail.com>

Martin v. Löwis:

> The problem (for me, at least) is that VC is so much more convenient to
> work with.

   In my experience Visual C++ has always produced faster, more
compact code than Mingw. While this may not be true with current
releases, I'd want to ensure that the normal Python download for
Windows didn't become slower. Visual C++ 2005 includes profile guided
optimization (although this is not included in the Express Edition)
and it would be interesting to see how much of a difference this
makes. Microsoft was willing to give some copies of VS to Python
developers before so I expect they'd be willing to give some copies of
VS Professional or Team System.

Tim Delaney:

> There was a considerable amount of angst with the 2.4 release that can be
> blamed solely on the CRT change (and hence different DLLs to link to). And
> with them deprecating ISO standard functions ...

   One solution to CRT change is to drop direct linking of modules to
the CRT and vector them through the core DLL. The core PythonXX.DLL
would expose an array of functions (malloc, strdup, getcwd, ...) that
would be called by all modules indirectly. Then, it no longer matters
which compiler version or compiler you build extension modules with.
It's quite a lot of work to do this, as each CRT call site needs to
change or a well-thought-through macro scheme needs to be developed.

Paul Moore:

> The project file conversions seemed to go fine, and the debug builds
> were OK, although the deprecation warnings for all the "insecure" CRT
> functions were a pain. It might be worth adding
> _CRT_SECURE_NO_DEPRECATE to the project defines somehow.

   I haven't tried to build Python with VC++ 2005 yet, but other code
has also required _CRT_NONSTDC_NO_DEPRECATE for some of the file
system calls.

   Neil

From bcannon at gmail.com  Mon Nov 14 00:40:47 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sun, 13 Nov 2005 15:40:47 -0800
Subject: [Python-Dev] Implementation of PEP 341
In-Reply-To: <4377517C.9000808@gmail.com>
References: <437716C4.8050309@gmail.com> <4377312E.2000002@gmail.com>
	<4377486B.1090400@gmail.com> <4377517C.9000808@gmail.com>
Message-ID: <bbaeab100511131540y46cef4e6yf2496aa4f24fbec8@mail.gmail.com>

On 11/13/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Thomas Lee wrote:
> > Implemented as you suggested and tested. I'll submit the patch to the
> > tracker on sourceforge shortly. Are you guys still after contextual
> > diffs as per the developer pages, or is an svn diff the preferred way to
> > submit patches now?
>
> svn diff should be fine. Although I thought Brett had actually updated those
> pages after the move to svn. . .
>

I did.  But the docs just need to be revamped.  But I can't start on
that work until people tell me if they prefer FAQ-style (question
listing all steps and then a question covering each step) or
essay-style (bulleted list and then a definition/paragraph on each
step) for bug/patch guidelines.

> > Thanks very much for all your help, Nick. It was extremely informative.
>
> I think we can chalk up a respectable win for the AST-based compiler - the
> trick I suggested wouldn't really have been practical without the AST layer
> between the parser and the compiler.
>

Yeah, this is a total win for the AST compiler.  I would not have
wanted to attempt this with the old CST compiler.

-Brett

From mdehoon at c2b2.columbia.edu  Mon Nov 14 01:25:34 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Sun, 13 Nov 2005 19:25:34 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4373A214.6060201@v.loewis.de>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<4372B82C.9010800@canterbury.ac.nz>	<4372DA3A.8010206@c2b2.columbia.edu>	<4372F72B.9060501@v.loewis.de>
	<43738074.2030508@c2b2.columbia.edu> <4373A214.6060201@v.loewis.de>
Message-ID: <4377D97E.9060507@c2b2.columbia.edu>

Martin v. Löwis wrote:

>Michiel Jan Laurens de Hoon wrote:
>  
>
>>The problem with threading (apart from potential portability problems) 
>>is that Python doesn't let us know when it's idle. This would cause 
>>excessive repainting (I can give you an explicit example if you're 
>>interested).
>>    
>>
>I don't understand how these are connected: why do you need to know
>when Python is idle for multi-threaded applications, and why does not
>knowing that it is idle cause massive repainting?
>
>Not sure whether an explicit example would help, though; one would
>probably need to understand a lot of details of your application. Giving
>a simplified version of the example might help (which would do 'print
>"Repainting"' instead of actually repainting).
>  
>
As an example, consider a function plot(y,x) that plots a graph of y as 
a function of x.

If I use threading, and Python doesn't let us know when it's idle, then 
the plot function needs to invalidate the window to trigger repainting. 
Otherwise, the event loop doesn't realize that there is something new to 
plot.

Now if I want to draw two graphs:

def f():
    x = arange(1000)*0.01
    y = sin(x)
    plot(y,x)
    plot(2*y,x)

and I execute f(), then after the first plot(y,x), I get a graph of y 
vs. x with x between 0 and 10 and y between -1 and 1. After the second 
plot, the y-axis runs from -2 to 2, and we need to draw (y,x) as well as 
(2*y,x). So the first repainting was in vain.

If, however, Python contains an event loop that takes care of events as 
well as Python commands, redrawing won't happen until Python has 
executed all plot commands -- so no repainting in vain here.

I agree with you though that threads are a good solution for extension 
modules for which a standard event loop is not suitable, and for which 
graphics performance is not essential -- such as Tkinter (see my next post).

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From mdehoon at c2b2.columbia.edu  Mon Nov 14 01:51:32 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Sun, 13 Nov 2005 19:51:32 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4373A214.6060201@v.loewis.de>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<4372B82C.9010800@canterbury.ac.nz>	<4372DA3A.8010206@c2b2.columbia.edu>	<4372F72B.9060501@v.loewis.de>
	<43738074.2030508@c2b2.columbia.edu> <4373A214.6060201@v.loewis.de>
Message-ID: <4377DF94.2090003@c2b2.columbia.edu>

Martin v. Löwis wrote:

>Michiel Jan Laurens de Hoon wrote:
>  
>
>>But there is another solution with threads: Can we let Tkinter run in a 
>>separate thread instead?
>>    
>>
>
>Yes, you can. Actually, Tkinter *always* runs in a separate thread 
>(separate from all other threads).
>  
>
Are you sure? If Tkinter is running in a separate thread, then why does 
it need PyOS_InputHook?
Maybe I'm misunderstanding the code in _tkinter.c, but it appears that 
the call to Tcl_DoOneEvent and the main interpreter (the one that reads 
the user commands from stdin) are in the same thread.

Anyway, if we can run Tkinter's event loop in a thread separate from the 
main interpreter, then we can avoid all interference with other event 
loops, and also improve Tkinter's behavior itself:
1) Since this event loop doesn't need to check stdin any more, we can 
avoid the busy-wait-sleep loop by calling Tcl_DoOneEvent without the 
TCL_DONT_WAIT flag, and hence get better performance.
2) With the event loop in a separate thread, we can use Tkinter from 
IDLE also.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From mdehoon at c2b2.columbia.edu  Mon Nov 14 02:04:55 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Sun, 13 Nov 2005 20:04:55 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4375721D.6040907@canterbury.ac.nz>
References: <437100A7.5050907@c2b2.columbia.edu>
	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>
	<4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>
	<4375721D.6040907@canterbury.ac.nz>
Message-ID: <4377E2B7.60309@c2b2.columbia.edu>

Greg Ewing wrote:

>Michiel Jan Laurens de Hoon wrote:
>  
>
>>I have an extension module for scientific visualization. This extension 
>>module opens one or more windows, in which plots can be made.
>>    
>>
>
>What sort of windows are these? Are you using an existing
>GUI toolkit, or rolling your own?
>  
>
Rolling my own. There's not much GUI to my window, basically it's just a 
window where I draw stuff.

>>For the graphics windows to remain responsive, I need to make sure that 
>>its events get handled. So I need an event loop.
>>    
>>
>How about running your event loop in a separate thread?
>  
>
I agree that this works for some extension modules, but not very well 
for extension modules for which graphical performance is critical (see 
my reply to Martin). Secondly, I think that by thinking this through, we 
can come up with a suitable event loop framework for Python (probably 
similar to what Skip is proposing) that works without having to resort 
to threads. So we give users a choice: use the event loop if possible or 
preferable, and use a thread otherwise.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From greg.ewing at canterbury.ac.nz  Mon Nov 14 02:07:35 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 14 Nov 2005 14:07:35 +1300
Subject: [Python-Dev] str.dedent
In-Reply-To: <43769E4A.5040408@colorstudy.com>
References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<000001c5e7c6$2f959440$2523c797@oemcomputer>
	<b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>
	<43769E4A.5040408@colorstudy.com>
Message-ID: <4377E357.5010808@canterbury.ac.nz>

Ian Bicking wrote:

> I think a better argument for this is that dedenting a literal string is
> more of a syntactic operation than a functional one.  You don't think
> "oh, I bet I'll need to do some dedenting on line 200 of this module, I
> better import textwrap".

And regardless of the need to import, there's a feeling
that it's something that ought to be done at compile
time, or even parse time.
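
For reference, the run-time approach being compared against looks
roughly like this. It is only a minimal sketch using the standard
library's textwrap.dedent; the usage() function and its text are made
up for illustration.

```python
import textwrap

def usage():
    # The literal is indented to line up with the surrounding code.
    # dedent() strips the common leading whitespace, but it does so at
    # run time, on every call, rather than at compile or parse time.
    return textwrap.dedent("""\
        usage: prog [options]
          -h  show this help
        """)

text = usage()
```

The relative indentation of the second line survives; only the common
eight-space margin is removed.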

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From mdehoon at c2b2.columbia.edu  Mon Nov 14 02:20:07 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Sun, 13 Nov 2005 20:20:07 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <17269.31319.806622.939477@montanaro.dyndns.org>
References: <437100A7.5050907@c2b2.columbia.edu>
	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>
	<4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>	<17269.21593.575449.78938@montanaro.dyndns.org>	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
Message-ID: <4377E647.7080708@c2b2.columbia.edu>

skip at pobox.com wrote:

>If I have a Gtk app I have to feed other (socket, callback) pairs to it.  It
>takes care of adding it to the select() call.  Python could dictate that the
>way to play ball is for other packages (Tkinter, PyGtk, wxPython, etc) to
>feed Python the (socket, callback) pair.  Then you have a uniform way to
>control event-driven applications.  Today, a package like Michiel's has no
>idea what sort of event loop it will encounter.  If Python provided the
>event loop API it would be the same no matter what widget set happened to be
>used.
>  
>
This is essentially how Tcl does it (and which, btw, is currently being 
used in Tkinter):
Tcl has the functions Tcl_CreateFileHandler and Tcl_DeleteFileHandler, 
which allow a user to add a file descriptor to the list of file 
descriptors to select() on, and to specify a callback function to be 
called when the file descriptor is signaled. A similar API in Python 
would give users a clean way to hook into the event loop, independent 
of which other packages are hooked into the event loop.
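
To make the idea concrete, here is a rough sketch of what such a
registry of (file object, callback) pairs might look like in Python.
Everything in it is hypothetical: the EventLoop class and its method
names are invented for illustration and are not an existing Python or
Tkinter API. It uses the standard selectors module, and a socket pair
stands in for a GUI connection.

```python
import selectors
import socket

class EventLoop:
    """Hypothetical registry of (file object, callback) pairs."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def create_file_handler(self, fileobj, callback):
        # Similar in spirit to Tcl_CreateFileHandler: watch fileobj for
        # readability and remember which callback to run.
        self._selector.register(fileobj, selectors.EVENT_READ, callback)

    def delete_file_handler(self, fileobj):
        self._selector.unregister(fileobj)

    def run_once(self, timeout=None):
        # Block until something is readable, then dispatch.
        # There is no busy-waiting here.
        events = self._selector.select(timeout)
        for key, _mask in events:
            key.data(key.fileobj)
        return len(events)

# Usage: the callback records whatever arrives on the socket.
received = []
reader, writer = socket.socketpair()
loop = EventLoop()
loop.create_file_handler(reader, lambda sock: received.append(sock.recv(16)))
writer.send(b"expose")
handled = loop.run_once(timeout=1.0)
loop.delete_file_handler(reader)
reader.close()
writer.close()
```

Each package would register its own descriptor, so none of them needs
to know which other packages are hooked in.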

>The sticking point is probably that a number of such packages presume they
>will always provide the main event loop and have no way to feed their
>sockets to another event loop controller.  That might present some hurdles
>for the various package writers/Python wrappers.
>  
>
This may not be such a serious problem. Being able to hook into Python's 
event loop is important only if users want to be able to use the 
extension module in interactive mode. For an extension module such as 
PyGtk, the developers may decide that PyGtk is likely to be run in 
non-interactive mode only, for which the PyGtk mainloop is sufficient. 
Having an event loop API in Python won't hurt them.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From greg.ewing at canterbury.ac.nz  Mon Nov 14 02:24:16 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 14 Nov 2005 14:24:16 +1300
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4377E2B7.60309@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<4375721D.6040907@canterbury.ac.nz>
	<4377E2B7.60309@c2b2.columbia.edu>
Message-ID: <4377E740.70904@canterbury.ac.nz>

Michiel Jan Laurens de Hoon wrote:
> Greg Ewing wrote:
> 
> > How about running your event loop in a separate thread?
> 
> I agree that this works for some extension modules, but not very well 
> for extension modules for which graphical performance is critical

I don't understand. If the main thread is idle, your thread
should get all the time it wants.

I'd actually expect this to give better interactive response,
since you aren't doing busy-wait pauses all the time -- the
thread can wake up as soon as an event arrives for it.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From mdehoon at c2b2.columbia.edu  Mon Nov 14 02:25:18 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Sun, 13 Nov 2005 20:25:18 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com>
References: <437100A7.5050907@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>
	<4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>	<17269.21593.575449.78938@montanaro.dyndns.org>	<4375709B.4010009@canterbury.ac.nz>	<17269.31319.806622.939477@montanaro.dyndns.org>	<b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com>	<43767FA8.7090209@canterbury.ac.nz>
	<b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com>
Message-ID: <4377E77E.5030000@c2b2.columbia.edu>

Noam Raphael wrote:

>On 11/13/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>  
>
>>Noam Raphael wrote:
>>    
>>
>>>All that is needed to make Tkinter and Michiels'
>>>code run together is a way to say "add this callback to the input
>>>hook" instead of the current "replace the current input hook with this
>>>callback". Then, when the interpreter is idle, it will call all the
>>>registered callbacks, one at a time, and everyone would be happy.
>>>      
>>>
>>Except for those who don't like busy waiting.
>>    
>>
>I'm not sure I understand what you meant. If you meant that it will
>work slowly - a lot of people (including me) are using Tkinter without
>a mainloop from the interactive shell, and don't feel the difference.
>It uses exactly the method I described.
>  
>
This depends on what kind of extension module you run. I agree, for 
Tkinter you probably won't notice the difference -- although you are 
still wasting processor cycles. However, if graphics performance is 
important, busy-waiting is not ideal.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From greg.ewing at canterbury.ac.nz  Mon Nov 14 02:27:48 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 14 Nov 2005 14:27:48 +1300
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com>
References: <437100A7.5050907@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
	<b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com>
	<43767FA8.7090209@canterbury.ac.nz>
	<b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com>
Message-ID: <4377E814.6060805@canterbury.ac.nz>

Noam Raphael wrote:
> On 11/13/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> > Noam Raphael wrote:
> >
> > > callback". Then, when the interpreter is idle, it will call all the
> > > registered callbacks, one at a time, and everyone would be happy.
> >
> > Except for those who don't like busy waiting.
>
> I'm not sure I understand what you meant. If you meant that it will
> work slowly - a lot of people (including me) are using Tkinter without
> a mainloop from the interactive shell, and don't feel the difference.

Busy waiting is less efficient and less responsive than
a solution which is able to avoid it. In many cases there
will be little noticeable difference, but there will be
some people who don't like it because it's not really
the "right" solution to this sort of problem.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From fperez.net at gmail.com  Mon Nov 14 02:30:53 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Sun, 13 Nov 2005 18:30:53 -0700
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
References: <437100A7.5050907@c2b2.columbia.edu>
	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>
	<4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>	<17269.21593.575449.78938@montanaro.dyndns.org>	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
	<4377E647.7080708@c2b2.columbia.edu>
Message-ID: <dl8pce$6e0$1@sea.gmane.org>

Michiel Jan Laurens de Hoon wrote:

> For an extension module such as
> PyGtk, the developers may decide that PyGtk is likely to be run in
> non-interactive mode only, for which the PyGtk mainloop is sufficient.

Did you read my reply? ipython, based on code.py, implements a few simple
threading tricks (they _are_ simple, since I know next to nothing about
threading) and gives you interactive use of PyGTK, WXPython and PyQt
applications in a manner similar to Tkinter.  Meaning, you can from the command
line make a window, change its title, add buttons to it, etc, all the while
your interactive prompt remains responsive as well as the GUI.  With that
support, matplotlib can be used to do scientific plotting with any of these
toolkits and no blocking of any kind (cross-thread signal handling is another
story, but you didn't ask about that).

As I said, there may be something in your problem that I don't understand.  But
it is certainly possible, today, to have a non-blocking Qt/WX/GTK-based
scientific plotting application with interactive input.  The ipython/matplotlib
combo has done precisely that for over a year (well, Qt support was added this
April).

Cheers,

f


From mdehoon at c2b2.columbia.edu  Mon Nov 14 02:39:36 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Sun, 13 Nov 2005 20:39:36 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4377E740.70904@canterbury.ac.nz>
References: <437100A7.5050907@c2b2.columbia.edu>
	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>
	<4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>	<4375721D.6040907@canterbury.ac.nz>	<4377E2B7.60309@c2b2.columbia.edu>
	<4377E740.70904@canterbury.ac.nz>
Message-ID: <4377EAD8.7050105@c2b2.columbia.edu>

Greg Ewing wrote:

>Michiel Jan Laurens de Hoon wrote:
>  
>
>>Greg Ewing wrote:
>>    
>>
>>>How about running your event loop in a separate thread?
>>>      
>>>
>>I agree that this works for some extension modules, but not very well 
>>for extension modules for which graphical performance is critical
>>    
>>
>
>I don't understand. If the main thread is idle, your thread
>should get all the time it wants.
>
>I'd actually expect this to give better interactive response,
>since you aren't doing busy-wait pauses all the time -- the
>thread can wake up as soon as an event arrives for it.
>  
>
This is exactly the problem. Drawing one picture may consist of many 
Python commands to draw the individual elements (for example, several 
graphs overlaying each other). We don't know where in the window each 
element will end up until we have the list of elements complete. For 
example, the axis may change (see my example to Martin). Or, if we're 
drawing a 3D picture, then one element may obscure another.

Now, if we have our plotting extension module in a separate thread, the 
window will be repainted each time a new element is added. Imagine a 
picture of 1000 elements: we'd have to draw 1+2+...+1000 times.

So this is tricky: we want repainting to start as soon as possible, but 
not sooner. Being able to hook into Python's event loop allows us to do so.
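
To put a number on that: if the k-th repaint redraws k elements, the
total is 1+2+...+n = n(n+1)/2 element draws, against n when repainting
is deferred until the picture is complete. A small sketch of the
arithmetic (the function names are made up for illustration):

```python
def element_draws_eager(n):
    # Repaint after every added element: the k-th repaint redraws
    # all k elements added so far.
    return sum(range(1, n + 1))

def element_draws_deferred(n):
    # Repaint once, after all n elements have been added.
    return n

eager = element_draws_eager(1000)        # 500500
deferred = element_draws_deferred(1000)  # 1000
```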


--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From mdehoon at c2b2.columbia.edu  Mon Nov 14 02:43:21 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Sun, 13 Nov 2005 20:43:21 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <dl8pce$6e0$1@sea.gmane.org>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>	<4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>	<17269.21593.575449.78938@montanaro.dyndns.org>	<4375709B.4010009@canterbury.ac.nz>	<17269.31319.806622.939477@montanaro.dyndns.org>	<4377E647.7080708@c2b2.columbia.edu>
	<dl8pce$6e0$1@sea.gmane.org>
Message-ID: <4377EBB9.6070706@c2b2.columbia.edu>

Fernando Perez wrote:

>Michiel Jan Laurens de Hoon wrote:
>  
>
>>For an extension module such as
>>PyGtk, the developers may decide that PyGtk is likely to be run in
>>non-interactive mode only, for which the PyGtk mainloop is sufficient.
>>    
>>
>
>Did you read my reply? ipython, based on code.py, implements a few simple
>threading tricks (they _are_ simple, since I know next to nothing about
>threading) and gives you interactive use of PyGTK, WXPython and PyQt
>applications in a manner similar to Tkinter.
>
That may be, and I think that's a good thing, but it's not up to me to 
decide if PyGtk should support interactive use. The PyGtk developers 
decide whether they want to spend time on that, and they may 
decide not to, no matter how simple it may be.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From foom at fuhm.net  Mon Nov 14 03:06:23 2005
From: foom at fuhm.net (James Y Knight)
Date: Sun, 13 Nov 2005 21:06:23 -0500
Subject: [Python-Dev] str.dedent
In-Reply-To: <4377E357.5010808@canterbury.ac.nz>
References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<000001c5e7c6$2f959440$2523c797@oemcomputer>
	<b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>
	<43769E4A.5040408@colorstudy.com>
	<4377E357.5010808@canterbury.ac.nz>
Message-ID: <BA7FC0D5-FEC3-453E-B2C6-B1082DCFE9ED@fuhm.net>


On Nov 13, 2005, at 8:07 PM, Greg Ewing wrote:

> Ian Bicking wrote:
>
>
>> I think a better argument for this is that dedenting a literal  
>> string is
>> more of a syntactic operation than a functional one.  You don't think
>> "oh, I bet I'll need to do some dedenting on line 200 of this  
>> module, I
>> better import textwrap".
>>
>
> And regardless of the need to import, there's a feeling
> that it's something that ought to be done at compile
> time, or even parse time.

ITYM "If only python were lisp". (macros, or even reader  
macros)

James

From fperez.net at gmail.com  Mon Nov 14 03:15:59 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Sun, 13 Nov 2005 19:15:59 -0700
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>	<4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>	<17269.21593.575449.78938@montanaro.dyndns.org>	<4375709B.4010009@canterbury.ac.nz>	<17269.31319.806622.939477@montanaro.dyndns.org>	<4377E647.7080708@c2b2.columbia.edu>
	<dl8pce$6e0$1@sea.gmane.org> <4377EBB9.6070706@c2b2.columbia.edu>
Message-ID: <dl8s10$brm$1@sea.gmane.org>

Michiel Jan Laurens de Hoon wrote:

> Fernando Perez wrote:

>>Did you read my reply? ipython, based on code.py, implements a few simple
>>threading tricks (they _are_ simple, since I know next to nothing about
>>threading) and gives you interactive use of PyGTK, WXPython and PyQt
>>applications in a manner similar to Tkinter.
>>
> That may be, and I think that's a good thing, but it's not up to me to
> decide if PyGtk should support interactive use. The PyGtk developers
> decide whether they want to spend time on that, and they may
> decide not to, no matter how simple it may be.

OK, I must really not be making myself very clear.  I am not saying anything
about the pygtk developers: what I said is that this can be done by the
application writer, trivially, today.  There's nothing you need from the
authors of GTK.  Don't take my word for it, look at the code:

1. You can download ipython, it's a trivial pure-python install.  Grab
matplotlib and see for yourself (which also addresses the repaint issues you
mentioned).  You can test the gui support without mpl as well.

2. If you don't want to download/install ipython, just look at the code that
implements these features:

http://projects.scipy.org/ipython/ipython/file/ipython/trunk/IPython/Shell.py

3. If you really want to see how simple this is, you can run this single,
standalone script:

http://ipython.scipy.org/tmp/pyint-gtk.py

I wrote this when I was trying to understand the necessary threading tricks for
GTK, it's a little multithreaded GTK shell based on code.py.  230 lines of code
total, including readline support and (optional) matplotlib support.  Once this
was running, the ideas in it were folded into the more complex ipython
codebase.


At this point, I should probably stop posting on this thread.  I think this is
drifting off-topic for python-dev, and I am perhaps misunderstanding the
essence of your problem for some reason.  All I can say is that many people are
doing scientific interactive plotting with ipython/mpl and all the major GUI
toolkits, and they seem pretty happy about it.

Best,

f


From edloper at gradient.cis.upenn.edu  Mon Nov 14 06:35:31 2005
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Mon, 14 Nov 2005 00:35:31 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
Message-ID: <43782223.2070609@gradient.cis.upenn.edu>

As I understand it, you want to improve the performance of interactively 
run plot commands by queuing up all the plot sub-commands, and then 
drawing them all at once.  Hooking into a python event loop certainly 
isn't the only way to do this.  Perhaps you could consider the following 
approach:
   - The plot event loop is in a separate thread, accepting messages
     from the interactive thread.
   - These messages can contain plot commands; and they can also contain
     two new commands:
       - suspend -- stop plotting, and start saving commands in a queue.
       - resume -- execute all commands in the queue (with whatever
         increased efficiency tricks you're using)

Then you can either just add functions to generate these messages, and 
call them at appropriate places; or set PyOS_InputHook to wrap each 
interactive call with a suspend/resume pair.
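
A rough sketch of the message scheme, to show the shape of it. The
PlotThread class and the string messages are invented for illustration,
and "painting" here just records which commands were drawn together.

```python
import queue
import threading

class PlotThread(threading.Thread):
    """Hypothetical plot thread fed by messages from the interactive thread."""

    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
        self.repaints = []      # one entry per repaint: the commands drawn
        self._pending = []
        self._suspended = False

    def run(self):
        while True:
            msg = self.inbox.get()   # blocks; no busy-waiting
            if msg == "quit":
                break
            elif msg == "suspend":
                self._suspended = True
            elif msg == "resume":
                self._suspended = False
                if self._pending:
                    self._repaint()
            else:
                # An ordinary plot command: queue it, and draw at once
                # unless we have been told to hold off.
                self._pending.append(msg)
                if not self._suspended:
                    self._repaint()

    def _repaint(self):
        self.repaints.append(list(self._pending))
        self._pending.clear()

plotter = PlotThread()
plotter.start()
for msg in ("suspend", "plot(y, x)", "plot(2*y, x)", "resume", "quit"):
    plotter.inbox.put(msg)
plotter.join()
```

With the suspend/resume pair around them, both plot commands end up in
a single repaint.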

But note that putting an event loop in a separate thread will be 
problematic if you want any of the events to generate callbacks into 
user code -- this could cause all sorts of nasty race-conditions!  Using 
a separate thread for an event loop only seems practical to me if the 
event loop will never call back into user code (or if you're willing to 
put the burden on your users of making sure everything is thread safe).

-Edward


From ronaldoussoren at mac.com  Mon Nov 14 07:39:28 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 14 Nov 2005 07:39:28 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4377E647.7080708@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
	<4377E647.7080708@c2b2.columbia.edu>
Message-ID: <B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com>


On 14-nov-2005, at 2:20, Michiel Jan Laurens de Hoon wrote:

> skip at pobox.com wrote:
>
>> If I have a Gtk app I have to feed other (socket, callback) pairs  
>> to it.  It
>> takes care of adding it to the select() call.  Python could  
>> dictate that the
>> way to play ball is for other packages (Tkinter, PyGtk, wxPython,  
>> etc) to
>> feed Python the (socket, callback) pair.  Then you have a uniform  
>> way to
>> control event-driven applications.  Today, a package like  
>> Michiel's has no
>> idea what sort of event loop it will encounter.  If Python  
>> provided the
>> event loop API it would be the same no matter what widget set  
>> happened to be
>> used.
>>
>>
> This is essentially how Tcl does it (and which, btw, is currently  
> being
> used in Tkinter):
> Tcl has the functions Tcl_CreateFileHandler/Tcl_DeleteFileHandler,
> which allow a user to add a file descriptor to the list of file
> descriptors to select() on, and to specify a callback function to be
> called when the file descriptor is signaled. A similar
> API in Python would give users a clean way to hook into the event  
> loop,
> independent of which other packages are hooked into the event loop.

... except when the GUI you're using doesn't expose (or even use) a file
descriptor that you can use with select. Not all the world is Linux.

BTW. I find using the term 'event loop' for the interactive mode very
confusing.

Ronald


From jcarlson at uci.edu  Mon Nov 14 08:16:28 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 13 Nov 2005 23:16:28 -0800
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4377D97E.9060507@c2b2.columbia.edu>
References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
Message-ID: <20051113230400.A403.JCARLSON@uci.edu>


I personally like Edward Loper's idea of just running your own event
handler which deals with drawing, suspend/resume, etc...

> If, however, Python contains an event loop that takes care of events as 
> well as Python commands, redrawing won't happen until Python has 
> executed all plot commands -- so no repainting in vain here.

...but even without posting and reading events as stated above, one
could check for plot events every 1/100th of a second.  If there is an
update, and it has been 10/100 seconds since that undrawn event happened,
redraw. Tune that 10 up/down to alter responsiveness characteristics.
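
Something along these lines, as a sketch only. The DeferredRedraw class
is made up for illustration, and time is counted in ticks of 1/100th of
a second so the example needs no real timer.

```python
class DeferredRedraw:
    """Hypothetical helper: redraw once an undrawn update is old enough."""

    def __init__(self, delay_ticks=10):
        self.delay_ticks = delay_ticks
        self.first_undrawn = None   # tick of the oldest update not yet drawn
        self.redraws = 0

    def update(self, now):
        # Called by plot(); only the oldest undrawn update matters.
        if self.first_undrawn is None:
            self.first_undrawn = now

    def poll(self, now):
        # Called on every tick by whatever timer the toolkit provides.
        if (self.first_undrawn is not None
                and now - self.first_undrawn >= self.delay_ticks):
            self.redraws += 1
            self.first_undrawn = None
            return True
        return False

# Two plot() calls at ticks 0 and 3, then poll on every tick up to 20.
timer = DeferredRedraw(delay_ticks=10)
timer.update(0)
timer.update(3)
fired = [t for t in range(21) if timer.poll(t)]
```

Both updates are covered by the one redraw at tick 10.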

Or heck, if you are really lazy, people can use plot() calls, but
until an update_plot() is called, the plot isn't updated.

There are many reasonable solutions to your problem, not all of which
involve changing Python's event loop.

 - Josiah


From ncoghlan at gmail.com  Mon Nov 14 08:51:25 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 Nov 2005 17:51:25 +1000
Subject: [Python-Dev] Revamping the bug/patch guidelines (was Re:
 Implementation of PEP 341)
In-Reply-To: <bbaeab100511131540y46cef4e6yf2496aa4f24fbec8@mail.gmail.com>
References: <437716C4.8050309@gmail.com> <4377312E.2000002@gmail.com>	
	<4377486B.1090400@gmail.com> <4377517C.9000808@gmail.com>
	<bbaeab100511131540y46cef4e6yf2496aa4f24fbec8@mail.gmail.com>
Message-ID: <437841FD.3060707@gmail.com>

Brett Cannon wrote:
> On 11/13/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Thomas Lee wrote:
>>> Implemented as you suggested and tested. I'll submit the patch to the
>>> tracker on sourceforge shortly. Are you guys still after contextual
>>> diffs as per the developer pages, or is an svn diff the preferred way to
>>> submit patches now?
>> svn diff should be fine. Although I thought Brett had actually updated those
>> pages after the move to svn. . .
>>
> 
> I did.  But the docs just need to be revamped.  But I can't start on
> that work until people tell me if they prefer FAQ-style (question
> listing all steps and then a question covering each step) or
> essay-style (bulleted list and then a definition/paragraph on each
> step) for bug/patch guidelines.

I'd prefer essay-style for the guidelines themselves, with appropriate 
pointers to the guidelines from the dev FAQ.

However, I also think either approach will work, so I suggest going with 
whichever you find easier to write :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From martin at v.loewis.de  Mon Nov 14 08:53:07 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 14 Nov 2005 08:53:07 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4377DF94.2090003@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<4372B82C.9010800@canterbury.ac.nz>	<4372DA3A.8010206@c2b2.columbia.edu>	<4372F72B.9060501@v.loewis.de>	<43738074.2030508@c2b2.columbia.edu>
	<4373A214.6060201@v.loewis.de> <4377DF94.2090003@c2b2.columbia.edu>
Message-ID: <43784263.3020100@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
>>Yes, you can. Actually, Tkinter *always* runs in a separate thread 
>>(separate from all other threads).
>> 
> Are you sure? If Tkinter is running in a separate thread, then why does 
> it need PyOS_InputHook?

Well, my statement was (somewhat deliberately) misleading. That separate
thread might be the main thread (and, in many cases, is). The main 
thread is still a "separate" thread (separate from all others).

Regards,
Martin

From fperez.net at gmail.com  Mon Nov 14 08:54:56 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Mon, 14 Nov 2005 00:54:56 -0700
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
Message-ID: <dl9fsh$oca$1@sea.gmane.org>

Josiah Carlson wrote:

> Or heck, if you are really lazy, people can use plot() calls, but
> until an update_plot() is called, the plot isn't updated.

I really recommend that those interested in all these issues have a look at
matplotlib.  All of this has been dealt with there already, a long time ago, in
detail.  The solutions may not be perfect, but they do work for a fairly wide
range of uses, including the interactive case.

There may be a good reason why mpl's approach is insufficient, but I think that
the discussion here would be more productive if that were stated precisely and
explicitly.  Up to this point, all the requirements I've been able to
understand clearly work just fine with ipython/mpl (though I may well have
missed the key issue, I'm sure).

Cheers,

f


From martin at v.loewis.de  Mon Nov 14 09:07:50 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 14 Nov 2005 09:07:50 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4377D97E.9060507@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<4372B82C.9010800@canterbury.ac.nz>	<4372DA3A.8010206@c2b2.columbia.edu>	<4372F72B.9060501@v.loewis.de>	<43738074.2030508@c2b2.columbia.edu>
	<4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
Message-ID: <437845D6.3080301@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
> If, however, Python contains an event loop that takes care of events as 
> well as Python commands, redrawing won't happen until Python has 
> executed all plot commands -- so no repainting in vain here.

Ah, I think now I understand the problem. It seems that you don't care
at all about event loops. What you really want to know is "when is
Python idle?", with "being idle" defined as "there are no commands being
processed at the interactive interpreter", or perhaps "there are no
commands being processed in the main thread", or perhaps "there are no
commands being processed in any thread".

Is that a correct problem statement? If so, please don't say that you
want an event loop. Instead, it appears that you want to hook into
the interpreter loop.

As others have commented, it should be possible to get nearly the
same effect without such hooking. For example, if you choose to
redraw at most 10 times per second, you will still get good
performance. Alternatively, you could choose to redraw if there was
no drawing command for 100ms.
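
A minimal sketch of that second alternative, assuming the plotting
module can be polled from a timer. RedrawThrottle, redraw and
quiet_period are hypothetical names, not part of any existing module;
the injectable clock is only there so the behaviour can be checked
without sleeping.

```python
import time


class RedrawThrottle:
    """Coalesce drawing commands into one redraw after a quiet period.

    Hypothetical helper for illustration; `redraw` is whatever callable
    actually repaints the plot window.
    """

    def __init__(self, redraw, quiet_period=0.1, clock=time.monotonic):
        self._redraw = redraw
        self._quiet_period = quiet_period
        self._clock = clock
        self._last_command = None  # time of the most recent undrawn command

    def command(self):
        """Record that a drawing command changed the plot."""
        self._last_command = self._clock()

    def poll(self):
        """Call periodically (say every 10 ms); redraw once things go quiet."""
        if self._last_command is None:
            return False  # nothing pending
        if self._clock() - self._last_command < self._quiet_period:
            return False  # commands are still arriving
        self._last_command = None
        self._redraw()
        return True
```

With a 100 ms quiet period, a burst of plot commands costs one redraw
instead of one per command.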

Regards,
Martin

From gmccaughan at synaptics-uk.com  Mon Nov 14 10:20:50 2005
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Mon, 14 Nov 2005 09:20:50 +0000
Subject: [Python-Dev] str.dedent
In-Reply-To: <43777B5A.6030602@egenix.com>
References: <dga72k$cah$1@sea.gmane.org>
	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<43777B5A.6030602@egenix.com>
Message-ID: <200511140920.51724.gmccaughan@synaptics-uk.com>

On Sunday 2005-11-13 17:43, Marc-Andre Lemburg wrote:

[Noam Raphael:]
> > The idea is to add a method called "dedent" to strings. It would do
> > exactly what the current textwrap.dedent function does. 

[Marc-Andre:]
> You are missing a point here: string methods were introduced
> to make switching from plain 8-bit strings to Unicode easier.
> 
> As such they are only needed in cases where an algorithm
> has to work on the resp. internals differently or where direct
> access to the internals makes a huge difference in terms
> of performance.

In a language that generally pays as much attention to
practical usability as Python, it seems a pity to say
(as you seem to be implying) that whether something is
a string method or a function in (say) the "textwrap"
module should be determined by internal implementation
details.

> > Writing multilined strings without spaces in the beginning of lines
> > makes functions harder to read, since although the Python parser is
> > happy with it, it breaks the visual indentation.
> 
> This is really a minor compiler/parser issue and not one which
> warrants adding another string method.

Adding another string method seems easier, and a smaller
change, than altering the compiler or parser. What's your
point here? I think I must be missing something.

-- 
g



From ulrich.berning at desys.de  Mon Nov 14 10:54:29 2005
From: ulrich.berning at desys.de (Ulrich Berning)
Date: Mon, 14 Nov 2005 10:54:29 +0100
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain>	
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>	
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>	
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>	
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>	
	<5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>	
	<5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>	
	<43749D65.4040001@desys.de>
	<ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com>
Message-ID: <43785ED5.40000@desys.de>

Guido van Rossum schrieb:

>On 11/11/05, Ulrich Berning <ulrich.berning at desys.de> wrote:
>
>>For instance, nobody would give the output of a C compiler a different
>>extension when different compiler flags are used.
>
>But the usage is completely different. With C you explicitly manage
>when compilation happens. With Python you don't. When you first run
>your program with -O but it crashes, and then you run it again without
>-O to enable assertions, you would be very unhappy if the bytecode
>cached in a .pyo file would be reused!
>
The other way round definitely makes more sense. At development time, I 
would never use Python with -O or -OO. I use it only at distribution 
time, after doing all the tests, to generate optimized bytecode.

However, this problem could easily be solved if the value of 
Py_OptimizeFlag were stored together with the generated bytecode. At 
import time, the cached bytecode would not be reused if the current 
value of Py_OptimizeFlag doesn't match the stored value (if the .py file 
isn't there any longer, we could either raise an exception or 
emit a warning and reuse the bytecode anyway). And if we do this a 
little more cleverly, we could refuse to reuse optimized bytecode when we 
are running without -O or -OO, and ignore assertions and docstrings in 
unoptimized bytecode when we are running with -O or -OO.
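
A rough sketch of the check, in Python rather than in the import
machinery itself. The header layout, the MAGIC value and the function
names are all made up for illustration; this is not the real
.pyc/.pyo format.

```python
import marshal
import struct

# Hypothetical layout for this sketch only: 4-byte magic, 1-byte optimize
# level, 4-byte source mtime, then the marshalled code object.
MAGIC = b"SKCH"
HEADER = struct.Struct("<4sBI")


def dump_cached(code, optimize, mtime):
    """Serialize a code object together with the optimize level used."""
    return HEADER.pack(MAGIC, optimize, mtime) + marshal.dumps(code)


def load_cached(data, optimize, mtime):
    """Return the cached code object, or None if it must not be reused."""
    magic, stored_optimize, stored_mtime = HEADER.unpack_from(data)
    if magic != MAGIC or stored_mtime != mtime:
        return None  # wrong format or stale source
    if stored_optimize != optimize:
        return None  # compiled under a different -O setting
    return marshal.loads(data[HEADER.size:])
```

A None result would make the importer recompile from the .py file, or
fall back to the exception/warning choice above when the source is gone.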

>>I would appreciate seeing the generation of .pyo files completely
>>removed in the next release.
>
>You seem to forget the realities of backwards compatibility. While
>there are ways to cache bytecode without having multiple extensions,
>we probably can't do that until Python 3.0.
>
Can you please explain what backwards compatibility means in this 
context? Generated bytecode is neither upwards nor backwards compatible. 
No matter what I try, I always get a 'Bad magic number' when I try to 
import bytecode generated with a different Python version.
The most obvious software that may depend on the existence of .pyo 
files are the various freeze/packaging tools like py2exe, py2app, 
cx_Freeze and Installer.  I haven't checked them in detail, but after a 
short inspection, they seem to be independent of the existence of .pyo 
files. I can't imagine that there is any other Python software that 
depends on the existence of .pyo files, but maybe I'm totally wrong in 
this wild guess.

Ulli


From mal at egenix.com  Mon Nov 14 11:41:33 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Nov 2005 11:41:33 +0100
Subject: [Python-Dev] str.dedent
In-Reply-To: <200511140920.51724.gmccaughan@synaptics-uk.com>
References: <dga72k$cah$1@sea.gmane.org>	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>	<43777B5A.6030602@egenix.com>
	<200511140920.51724.gmccaughan@synaptics-uk.com>
Message-ID: <437869DD.7040800@egenix.com>

Gareth McCaughan wrote:
> On Sunday 2005-11-13 17:43, Marc-Andre Lemburg wrote:
> 
> [Noam Raphael:]
> 
>>>The idea is to add a method called "dedent" to strings. It would do
>>>exactly what the current textwrap.dedent function does. 
> 
> 
> [Marc-Andre:]
> 
>>You are missing a point here: string methods were introduced
>>to make switching from plain 8-bit strings to Unicode easier.
>>
>>As such they are only needed in cases where an algorithm
>>has to work on the resp. internals differently or where direct
>>access to the internals makes a huge difference in terms
>>of performance.
> 
> 
> In a language that generally pays as much attention to
> practical usability as Python, it seems a pity to say
> (as you seem to be implying) that whether something is
> a string method or a function in (say) the "textwrap"
> module should be determined by internal implementation
> details.

We have to draw a line somewhere - otherwise you could
just as well add all functions that accept single
string arguments as methods to the basestring
sub-classes.

>>>Writing multilined strings without spaces in the beginning of lines
>>>makes functions harder to read, since although the Python parser is
>>>happy with it, it breaks the visual indentation.
>>
>>This is really a minor compiler/parser issue and not one which
>>warrants adding another string method.
> 
> Adding another string method seems easier, and a smaller
> change, than altering the compiler or parser. What's your
> point here? I think I must be missing something.

The point is that the presented use case does not
originate in a common need (to dedent strings), but
from a desire to write Python code with embedded
indented triple-quoted strings which lies in the scope
of the parser, not that of string objects.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 14 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2005-10-17: Released mxODBC.Zope.DA 1.0.9        http://zope.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From guido at python.org  Mon Nov 14 14:23:57 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 Nov 2005 08:23:57 -0500
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <43785ED5.40000@desys.de>
References: <20051109023347.GA15823@localhost.localdomain>
	<ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
	<b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
	<ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
	<bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com>
	<5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com>
	<5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com>
	<43749D65.4040001@desys.de>
	<ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com>
	<43785ED5.40000@desys.de>
Message-ID: <ca471dc20511140523uf064144m9546a06abe5b07a8@mail.gmail.com>

On 11/14/05, Ulrich Berning <ulrich.berning at desys.de> wrote:
> >You seem to forget the realities of backwards compatibility. While
> >there are ways to cache bytecode without having multiple extensions,
> >we probably can't do that until Python 3.0.
> >
> Please can you explain what backwards compatibility means in this
> context? Generated bytecode is neither upwards nor backwards compatible.

No, but the general format of .pyc/.pyo files hasn't changed since
1991 (magic number, timestamp, marshalled data) and while the magic
number has changed many times, the API for getting it has been stable
for probably 10 years. Lots of tools (you mention a few) have been
written that read or write these files and these would all to some
extent have to be taught about the changes (most likely the changes will
include a change to the file header).
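
To make the layout concrete, here is a small sketch that writes and
reads a blob in the order given above (magic number, timestamp,
marshalled data). write_classic and read_classic are illustrative
names, and the blob is built in memory; the header of files written by
later CPython releases carries additional fields, so this is not a
general .pyc reader.

```python
import importlib.util
import marshal
import struct


def write_classic(code, mtime):
    """Pack a code object as magic number + timestamp + marshalled data."""
    header = importlib.util.MAGIC_NUMBER + struct.pack("<I", mtime)
    return header + marshal.dumps(code)


def read_classic(data):
    """Split a blob in that layout into (magic, mtime, code object)."""
    magic = data[:4]
    (mtime,) = struct.unpack("<I", data[4:8])
    return magic, mtime, marshal.loads(data[8:])
```

Any tool that slices the header at fixed offsets like this is what
would break if a field were added.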

> No matter what I try, I always get a 'Bad magic number' when I try to
> import bytecode generated with a different Python version.
> The most obvious software, that may depend on the existence of .pyo
> files are the various freeze/packaging tools like py2exe, py2app,
> cx_Freeze and Installer.  I haven't checked them in detail, but after a
> short inspection, they seem to be independent of the existence of .pyo
> files. I can't imagine that there is any other Python software, that
> depends on the existence of .pyo files, but maybe I'm totally wrong in
> this wild guess.

It's not just the existence of .pyo files. It's the format of the .pyc
files that will have to change to accommodate multiple versions of
bytecode.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com  Mon Nov 14 16:00:21 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 14 Nov 2005 09:00:21 -0600
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
	<4377E647.7080708@c2b2.columbia.edu>
	<B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com>
Message-ID: <17272.42629.420626.88192@montanaro.dyndns.org>


    Ronald> ... except when the GUI you're using doesn't expose (or even
    Ronald> use) a file descriptor that you can use with select. Not all the
    Ronald> world is Linux.

Can you be more specific?  Are you referring to Windows?  I'm not suggesting
you'd be able to use the same exact implementation on Unix and non-Unix
platforms.  You might well have to do different things across different
platforms.  Hopefully it would look the same to the programmer though, both
across platforms and across toolkits.  I can't imagine any of the X-based
widget toolkits on Unix systems would use anything other than select() on a
socket at the bottom.

Skip

From ulrich.berning at desys.de  Mon Nov 14 16:53:30 2005
From: ulrich.berning at desys.de (Ulrich Berning)
Date: Mon, 14 Nov 2005 16:53:30 +0100
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport  hooks
In-Reply-To: <DAELJHBGPBHPJKEBGGLNCEEKIDAD.mhammond@skippinet.com.au>
References: <DAELJHBGPBHPJKEBGGLNCEEKIDAD.mhammond@skippinet.com.au>
Message-ID: <4378B2FA.9020308@desys.de>

Mark Hammond schrieb:

>>release. The main reason why I changed the import behavior was
>>pythonservice.exe from the win32 extensions. pythonservice.exe imports
>>the module that contains the service class, but because
>>pythonservice.exe doesn't run in optimized mode, it will only import a
>>.py or a .pyc file, not a .pyo file. Because we always generate bytecode
>>with -OO at distribution time, we either had to change the behavior of
>>pythonservice.exe or change the import behavior of Python.
>
>While ignoring the question of how Python should in the future handle
>optimizations, I think it safe to state that pythonservice.exe should
>have the same basic functionality and operation in this regard as python.exe
>does.  It doesn't sound too difficult to modify pythonservice to accept -O
>flags, and to modify the service installation process to allow this flag to
>be specified.  I'd certainly welcome any such patches.
>
>Although getting off-topic for this list, note that for recent pywin32
>releases, it is possible to host a service using python.exe directly, and
>this is the technique py2exe uses to host service executables.  It would
>take a little more work to set things up to work like that, but that's
>probably not too unreasonable for a custom application with specialized
>distribution requirements.  Using python.exe obviously means you get full
>access to the  command-line facilities it provides.
>
Although off-topic for this list, I should give a reply.

I have done both.
My first approach was to change pythonservice.exe to accept -O and -OO 
and set the Py_OptimizeFlag accordingly.
Today, we aren't using pythonservice.exe any longer. I have done nearly 
all the required changes in win32serviceutil.py to let python.exe host 
the services. It requires no changes to the services, everything should 
work as before. The difference is, that the service module is always 
executed as a script now. This requires an additional (first) argument 
'--as-service' when the script runs as a service.

NOTE: Debugging services doesn't work yet.

---
Installing the service C:\svc\testService.py is done the usual way:
C:\svc>C:\Python23\python.exe testService.py install

The resulting ImagePath value in the registry is then:
"C:\Python23\python.exe" C:\svc\testService.py --as-service

After finishing development and testing, we convert the script into an 
executable with our own tool sib.py:
C:\svc>C:\Python23\python.exe C:\Python23\sib.py -n testService -d . testService.py
C:\svc>nmake

Now, we just do:
C:\svc>testService.exe update

The resulting ImagePath value in the registry is then changed to:
"C:\testService.exe" --as-service

Starting, stopping and removing works as usual:
C:\svc>testService.exe start
C:\svc>testService.exe stop
C:\svc>testService.exe remove
---

Because not everything works as before (debugging doesn't work, but we 
do not use it), I haven't provided a patch yet. As soon as I have 
completed it, I will make a patch available.

Ulli




From mdehoon at c2b2.columbia.edu  Mon Nov 14 17:07:45 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Mon, 14 Nov 2005 11:07:45 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <17272.42629.420626.88192@montanaro.dyndns.org>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
	<4377E647.7080708@c2b2.columbia.edu>
	<B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com>
	<17272.42629.420626.88192@montanaro.dyndns.org>
Message-ID: <4378B651.4080707@c2b2.columbia.edu>

skip at pobox.com wrote:

>    Ronald> ... except when the GUI you're using doesn't expose (or even
>    Ronald> use) a file descriptor that you can use with select. Not all the
>    Ronald> world is Linux.
>
>Can you be more specific?  Are you referring to Windows?  I'm not suggesting
>you'd be able to use the same exact implementation on Unix and non-Unix
>platforms.  You might well have to do different things across different
>platforms.  Hopefully it would look the same to the programmer though, both
>across platforms and across toolkits.  I can't imagine any of the X-based
>widget toolkits on Unix systems would use anything other than select() on a
>socket at the bottom.
>
>Skip
>  
>
As far as I know, that is correct (except that some systems use poll 
instead of select). For our extension module, we use select or poll to 
wait for events on Unix (using X). I have not run into problems with 
this on the Unix systems I have used, nor have I received complaints 
from users that this didn't work.
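
As a sketch of that wait (in Python, and not our module's actual
code): a socketpair stands in for the connection to the X server, and
wait_for_events is a hypothetical name.

```python
import select
import socket


def wait_for_events(sources, timeout):
    """Block until one of `sources` is readable or `timeout` seconds pass.

    Returns the list of readable sources (empty on timeout).
    """
    readable, _, _ = select.select(sources, [], [], timeout)
    return readable


if __name__ == "__main__":
    gui_end, server_end = socket.socketpair()  # stand-in for the X socket
    server_end.send(b"expose")                 # pretend an event arrived
    print(wait_for_events([gui_end], 1.0) == [gui_end])
    gui_end.close()
    server_end.close()
```

On Unix the same call can watch stdin alongside the GUI socket, which
is what lets one loop serve both the prompt and the window.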

On Windows, the situation is even easier. MsgWaitForMultipleObjects can 
wait for events on all windows created by the thread as well as stdin 
(the same function is used in Tcl's event loop). In contrast to Unix' 
select, we don't need to tell MsgWaitForMultipleObjects which callback 
function is associated with each window.

--Michiel.


-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From fredrik at pythonware.com  Mon Nov 14 17:19:19 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 14 Nov 2005 17:19:19 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
References: <437100A7.5050907@c2b2.columbia.edu>	<43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu>	<4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>	<17269.21593.575449.78938@montanaro.dyndns.org>	<4375709B.4010009@canterbury.ac.nz>	<17269.31319.806622.939477@montanaro.dyndns.org>	<4377E647.7080708@c2b2.columbia.edu><dl8pce$6e0$1@sea.gmane.org>
	<4377EBB9.6070706@c2b2.columbia.edu>
Message-ID: <dlade8$ntq$1@sea.gmane.org>

Michiel Jan Laurens de Hoon wrote:

> >Did you read my reply? ipython, based on code.py, implements a few simple
> >threading tricks (they _are_ simple, since I know next to nothing about
> >threading) and gives you interactive use of PyGTK, WXPython and PyQt
> >applications in a manner similar to Tkinter.
> >
> That may be, and I think that's a good thing, but it's not up to me to
> decide if PyGtk should support interactive use. The PyGtk developers
> decide whether they want to spend time on that, and they may
> decide not to, no matter how simple it may be.

can you *please* start reading the posts you're replying to?

</F>




From fredrik at pythonware.com  Mon Nov 14 17:30:34 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 14 Nov 2005 17:30:34 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
References: <437100A7.5050907@c2b2.columbia.edu><43710C95.30209@v.loewis.de>	<43729CAB.5070106@c2b2.columbia.edu>	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>	<4372DD5F.70203@c2b2.columbia.edu>	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>	<43738C55.60509@c2b2.columbia.edu><4373A300.3080501@v.loewis.de>	<4374DB69.2080804@c2b2.columbia.edu>	<4375721D.6040907@canterbury.ac.nz>	<4377E2B7.60309@c2b2.columbia.edu><4377E740.70904@canterbury.ac.nz>
	<4377EAD8.7050105@c2b2.columbia.edu>
Message-ID: <dlae3b$quk$1@sea.gmane.org>

Michiel Jan Laurens de Hoon wrote:

> This is exactly the problem. Drawing one picture may consist of many
> Python commands to draw the individual elements (for example, several
> graphs overlaying each other). We don't know where in the window each
> element will end up until we have the list of elements complete. For
> example, the axis may change (see my example to Martin). Or, if we're
> drawing a 3D picture, then one element may obscure another.
>
> Now, if we have our plotting extension module in a separate thread, the
> window will be repainted each time a new element is added. Imagine a
> picture of 1000 elements: we'd have to draw 1+2+...+1000 times.
>
> So this is tricky: we want repainting to start as soon as possible, but
> not sooner. Being able to hook into Python's event loop allows us to do so.

the solution to your problem is called damage/repair, is not tricky at
all, and is supported by every GUI toolkit under the sun.

(if you don't know how it works, google for "widget damage repair")
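
a toy model of the idea, not any particular toolkit's API (Canvas, add
and repair are made-up names): adding an element only marks the widget
as damaged, and the actual painting happens once, when the toolkit is
idle.

```python
class Canvas:
    """Toy widget: add() only records damage; repair() paints once."""

    def __init__(self):
        self.elements = []
        self.damaged = False
        self.paint_count = 0  # number of element paints actually performed

    def add(self, element):
        self.elements.append(element)
        self.damaged = True  # cheap: no drawing happens here

    def repair(self):
        """Called by the toolkit when idle; repaints only if damaged."""
        if not self.damaged:
            return
        for _ in self.elements:
            self.paint_count += 1  # stand-in for drawing one element
        self.damaged = False
```

with 1000 elements added before the next idle moment, this paints 1000
elements once, not 1+2+...+1000.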

</F>




From ronaldoussoren at mac.com  Mon Nov 14 19:19:14 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 14 Nov 2005 19:19:14 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <17272.42629.420626.88192@montanaro.dyndns.org>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de>
	<43729CAB.5070106@c2b2.columbia.edu>
	<87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp>
	<4372DD5F.70203@c2b2.columbia.edu>
	<ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com>
	<43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de>
	<4374DB69.2080804@c2b2.columbia.edu>
	<17269.21593.575449.78938@montanaro.dyndns.org>
	<4375709B.4010009@canterbury.ac.nz>
	<17269.31319.806622.939477@montanaro.dyndns.org>
	<4377E647.7080708@c2b2.columbia.edu>
	<B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com>
	<17272.42629.420626.88192@montanaro.dyndns.org>
Message-ID: <E6BF31EF-835C-4212-B8DA-1401FF917312@mac.com>


On 14-nov-2005, at 16:00, skip at pobox.com wrote:

>
>     Ronald> ... except when the GUI you're using doesn't expose (or even
>     Ronald> use) a file descriptor that you can use with select. Not all the
>     Ronald> world is Linux.
>
> Can you be more specific?  Are you referring to Windows?

I was thinking of MacOS X. It does have an event loop, but doesn't expose
a file descriptor to the user and might not even use one.

Adding Python's input to the runloop of the GUI might be easier (e.g. feed
the stdin file descriptor to the GUI-toolkit-du-jour and process
information when that runloop tells you that data is present). We have an
example of that in the PyObjC source tree.

I'd say either choice won't be very good. The problem is that you must
interleave the execution of Python code with running the event loop to get
nice behaviour, which suggests threading to me. If you don't interleave
you can easily block the GUI while Python code is executing.

> I'm not suggesting you'd be able to use the same exact implementation
> on Unix and non-Unix platforms.  You might well have to do different
> things across different platforms.  Hopefully it would look the same to
> the programmer though, both across platforms and across toolkits.

Twisted anyone? ;-) ;-)

> I can't imagine any of the X-based widget toolkits on Unix systems
> would use anything other than select() on a socket at the bottom.

I'd be very surprised if an X-based toolkit didn't use a select-loop
somewhere.

Ronald

>
> Skip


From ronaldoussoren at mac.com  Mon Nov 14 19:21:13 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 14 Nov 2005 19:21:13 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <20051113230400.A403.JCARLSON@uci.edu>
References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
Message-ID: <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>


On 14-nov-2005, at 8:16, Josiah Carlson wrote:

>
> I personally like Edward Loper's idea of just running your own event
> handler which deals with drawing, suspend/resume, etc...
>
>> If, however, Python contains an event loop that takes care of events as
>> well as Python commands, redrawing won't happen until Python has
>> executed all plot commands -- so no repainting in vain here.
>
> ...but even without posting and reading events as stated above, one
> could check for plot events every 1/100th of a second.  If there is an
> update, and it has been 10/100 seconds since that undrawn event happened,
> redraw. Tune that 10 up/down to alter responsiveness characteristics.
>
> Or heck, if you are really lazy, people can use plot() calls, but
> until an update_plot() is called, the plot isn't updated.

I wonder why nobody has suggested a separate thread for managing the GUI
and using the hook in Python's event loop to issue the call to update_plot.

Ronald
>
> There are many reasonable solutions to your problem, not all of which
> involve changing Python's event loop.
>
>  - Josiah


From mdehoon at c2b2.columbia.edu  Mon Nov 14 20:00:56 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Mon, 14 Nov 2005 14:00:56 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>
References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
	<AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>
Message-ID: <4378DEE8.70109@c2b2.columbia.edu>

Ronald Oussoren wrote:

> I wonder why nobody has suggested a separate thread for managing the
> GUI and using the hook in Python's event loop to issue the call to
> update_plot.
>
Ha. That's probably the best solution I've heard so far, short of adding 
a Tcl-like event loop API to Python.
There are two remaining issues though:
1) Currently, there's only one PyOS_InputHook. So we're stuck if we find 
that some other extension module already set PyOS_InputHook. An easy 
solution would be to have a PyOS_AddInputHook/PyOS_RemoveInputHook API, 
and let Python maintain a list of input hooks to be called.
2) All extension modules have to agree to return immediately from a call 
to the hook function. Tkinter currently does not do this.
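
To illustrate what I mean in 1), here is a model of the bookkeeping in
Python. The real thing would be a C API next to PyOS_InputHook;
PyOS_AddInputHook/PyOS_RemoveInputHook do not exist today, and the
function names below are just stand-ins for them.

```python
_input_hooks = []  # hooks in registration order


def add_input_hook(hook):
    """Register a hook; registering the same hook twice has no effect."""
    if hook not in _input_hooks:
        _input_hooks.append(hook)


def remove_input_hook(hook):
    """Unregister a hook; unknown hooks are ignored."""
    if hook in _input_hooks:
        _input_hooks.remove(hook)


def run_input_hooks():
    """What the interpreter would call while waiting for input.

    Iterates over a copy so a hook may unregister itself. Each hook must
    return promptly (issue 2), or the others never get a turn.
    """
    for hook in list(_input_hooks):
        hook()
```

Each extension module then adds and removes only its own hook, and
nobody has to check whether PyOS_InputHook is already taken.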

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From noamraph at gmail.com  Mon Nov 14 20:14:39 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Mon, 14 Nov 2005 21:14:39 +0200
Subject: [Python-Dev] str.dedent
In-Reply-To: <437869DD.7040800@egenix.com>
References: <dga72k$cah$1@sea.gmane.org>
	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<43777B5A.6030602@egenix.com>
	<200511140920.51724.gmccaughan@synaptics-uk.com>
	<437869DD.7040800@egenix.com>
Message-ID: <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com>

On 11/14/05, M.-A. Lemburg <mal at egenix.com> wrote:
> We have to draw a line somewhere - otherwise you could
> just as well add all functions that accept single
> string arguments as methods to the basestring
> sub-classes.

Please read my first post in this thread - I think there's more reason
for 'dedent' to be a string method than there is, for example, for
'expandtabs', since it allows you to write clearer code.
>
> The point is that the presented use case does not
> originate in a common need (to dedent strings), but
> from a desire to write Python code with embedded
> indented triple-quoted strings which lies in the scope
> of the parser, not that of string objects.
>
That's a theoretical argument. In practice, if you do it in the
parser, you have two options:
1. Automatically dedent all strings.
2. Add a 'd' or some other letter before the string.

Option 1 breaks backwards compatibility, and makes the parser do
unexpected things. Option 2 adds another string-prefix letter, which
is confusing, and it will also be hard to find out what that letter
means. On the other hand, adding ".dedent()" at the end is very clear,
and is just as easy.
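
For comparison, here is a minimal sketch of what is possible today with
the existing textwrap.dedent function (written for a current Python 3).
The function name usage_message and the text of the message are made up
for illustration; the proposed str.dedent method would presumably do
the same job as the call on the last line of the function.

```python
import textwrap


def usage_message():
    # The literal is indented to match the surrounding code;
    # textwrap.dedent strips the common leading whitespace at run time.
    text = """\
        usage: frob [options] FILE
          -v   be verbose
          -q   be quiet
        """
    return textwrap.dedent(text)


print(usage_message(), end="")
```

The relative indentation of the option lines survives, because only the
margin common to all lines is removed.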

Now, about performance, please see the message I'll post in a few minutes...

Noam

From noamraph at gmail.com  Mon Nov 14 20:33:13 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Mon, 14 Nov 2005 21:33:13 +0200
Subject: [Python-Dev] str.dedent
In-Reply-To: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
References: <dga72k$cah$1@sea.gmane.org>
	<ca471dc2050914161070f1f425@mail.gmail.com>
	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
Message-ID: <b348a0850511141133s69d7c10ck4a82898da0107401@mail.gmail.com>

Just two additional notes:

On 9/15/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
>
> -1
>
> Let it continue to live in textwrap where the existing pure python code
> adequately serves all string-like objects.  It's not worth losing the
> duck typing by attaching new methods to str, unicode, UserString, and
> everything else aspiring to be string-like.

It may seem like the 'dedent' code would have to be written a lot of
times, but I've checked the examples. It may be needed to write
different versions for 'str' and for 'unicode', but these are going to
be unified. In UserString you'll have to add exactly one line:

    def dedent(self): return self.data.dedent()

I've just taken the line created for 'isalpha' and replaced 'isalpha'
with 'dedent'. So in the long run, there will be exactly one
implementation of 'dedent' in the Python code. (I don't know of any
other objects which try to provide the full string interface.)

Another reason for preferring a 'dedent' method over a 'dedent'
function in some module is that it leaves room to add, sometime in the
future, a compiler optimization that dedents the string
at compile time (this can't work for a function, since the function is
looked up at run time). This would solve the performance problem
completely, so that there will be an easy way to write multilined
strings which do not interfere with the visual structure of the code,
without the need to worry about performance. I'm not saying that this
optimization has to be done now, just that 'dedent' as a method makes
it possible, which adds to the other arguments for making it a method.

Noam

From aahz at pythoncraft.com  Mon Nov 14 20:56:50 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 14 Nov 2005 11:56:50 -0800
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <20051113230400.A403.JCARLSON@uci.edu>
References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
Message-ID: <20051114195650.GA2732@panix.com>

On Sun, Nov 13, 2005, Josiah Carlson wrote:
> 
> I personally like Edward Loper's idea of just running your own event
> handler which deals with drawing, suspend/resume, etc...
> 
>> If, however, Python contains an event loop that takes care of events as 
>> well as Python commands, redrawing won't happen until Python has 
>> executed all plot commands -- so no repainting in vain here.
> 
> ...but even without posting and reading events as stated above, one
> could check for plot events every 1/100th of a second.  If there is an
> update, and it has been 10/100 seconds since that undrawn event happened,
> redraw. Tune that 10 up/down to alter responsiveness characteristics.

...and that's exactly what my sample threaded GUI application does.

Can we please move this thread to comp.lang.python?
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From skip at pobox.com  Mon Nov 14 21:04:02 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 14 Nov 2005 14:04:02 -0600
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4378DEE8.70109@c2b2.columbia.edu>
References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
	<AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>
	<4378DEE8.70109@c2b2.columbia.edu>
Message-ID: <17272.60850.25579.583878@montanaro.dyndns.org>


    Michiel> 1) Currently, there's only one PyOS_InputHook. So we're stuck
    Michiel>    if we find that some other extension module already set
    Michiel>    PyOS_InputHook. An easy solution would be to have a
    Michiel>    PyOS_AddInputHook/PyOS_RemoveInputHook API, and let Python
    Michiel>    maintain a list of input hooks to be called.

I think we've come more-or-less full circle to the point where I jumped onto
this spinning thread.  If there is only a single input hook function, you
probably need to write a slightly higher level module that manages the hook.
See sys.exitfunc and the atexit module for a simple example.
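
A rough sketch of such a manager, in Python for illustration only.
PyOS_InputHook is a C-level pointer and cannot be set from Python code,
so the HookManager class and its add/remove methods are hypothetical
names that only show the pattern: one callable occupies the single
slot and fans out to a list, much as atexit does for sys.exitfunc.

```python
class HookManager:
    """Fan a single hook slot out to many callbacks."""

    def __init__(self):
        self._hooks = []

    def add(self, func):
        # Registering the same callback twice has no effect.
        if func not in self._hooks:
            self._hooks.append(func)

    def remove(self, func):
        self._hooks.remove(func)

    def __call__(self):
        # This object is what would sit in the one available slot.
        # Every registered hook is expected to return promptly.
        for func in list(self._hooks):
            func()


calls = []
manager = HookManager()
manager.add(lambda: calls.append("gui"))
manager.add(lambda: calls.append("plot"))
manager()  # stands in for the interpreter calling the single hook
print(calls)
```

Iterating over a copy of the list lets a hook remove itself while the
manager is running.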

Skip

From fredrik at pythonware.com  Mon Nov 14 21:01:00 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 14 Nov 2005 21:01:00 +0100
Subject: [Python-Dev] str.dedent
References: <dga72k$cah$1@sea.gmane.org><b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com><43777B5A.6030602@egenix.com><200511140920.51724.gmccaughan@synaptics-uk.com><437869DD.7040800@egenix.com>
	<b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com>
Message-ID: <dlaqds$8sb$1@sea.gmane.org>

Noam Raphael wrote:

> That's a theoretical argument. In practice, if you do it in the
> parser, you have two options:
>
> 1. Automatically dedent all strings.
> 2. Add a 'd' or some other letter before the string.
>
> Option 1 breaks backwards compatibility, and makes the parser do
> unexpected things. Option 2 adds another string-prefix letter, which
> is confusing, and it will also be hard to find out what that letter
> means. On the other hand, adding ".dedent()" at the end is very clear,
> and is just as easy.

so is putting the string constant in a global variable, outside the scope
you're in, like you'd do with any other constant.

(how about a new rule: you cannot post to a zombie thread on python-
dev unless they've fixed/reviewed/applied or otherwise processed at least
one tracker item earlier the same day.  there are hundreds of items on the
bugs and patches trackers that could need some loving care)

</F> 




From noamraph at gmail.com  Mon Nov 14 21:12:04 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Mon, 14 Nov 2005 22:12:04 +0200
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <4378DEE8.70109@c2b2.columbia.edu>
References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
	<AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>
	<4378DEE8.70109@c2b2.columbia.edu>
Message-ID: <b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com>

On 11/14/05, Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> wrote:
> Ronald Oussoren wrote:
>
> > I wonder why nobody has suggested a separate thread for managing the GUI
> > and using the hook in Python's event loop to issue the call to update_plot.
> >
> Ha. That's probably the best solution I've heard so far, short of adding
> a Tcl-like event loop API to Python.

No. It is definitely a bad solution.

Where I work, we do a lot of plotting from the interactive
interpreter, using Tkinter. I always wondered how it worked, and
assumed that it was done using threading. So when people started using
IDLE, and those plots didn't show up, I've found the solution of
calling the Tkinter main() function from a thread. Everything seemed
to work fine, until...

It didn't. Strange freezes started to appear, only when working from
IDLE. This made me investigate a bit, and I've found that Tkinter
isn't run from a separate thread - the dooneevent() function is called
repeatedly by PyOS_InputHook while the interpreter is idle.

The conclusions:
1. Don't use threads when you don't have to. Tkinter does callbacks to
Python code, and most code isn't designed to work reliably in
a multithreaded environment.
2. The non-threading solution works *really* well - the fact is that I
hadn't noticed the difference between multi-threaded mode and
single-threaded mode, until things began to freeze in the
multi-threaded mode.

Noam

From noamraph at gmail.com  Mon Nov 14 23:25:24 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 15 Nov 2005 00:25:24 +0200
Subject: [Python-Dev] str.dedent
In-Reply-To: <dlaqds$8sb$1@sea.gmane.org>
References: <dga72k$cah$1@sea.gmane.org>
	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<43777B5A.6030602@egenix.com>
	<200511140920.51724.gmccaughan@synaptics-uk.com>
	<437869DD.7040800@egenix.com>
	<b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com>
	<dlaqds$8sb$1@sea.gmane.org>
Message-ID: <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com>

On 11/14/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
> so is putting the string constant in a global variable, outside the scope
> you're in, like you'd do with any other constant.

Usually when I use a constant a single time, I write it where I use
it, and don't give it a name. I don't do:

messagea = "The value of A is "

... (a long class definition)
    print messagea, A

This is what I mean when I say "constant" - a value which is known
when I write the code, not necessarily an arbitrary value that may
change, so I write it at the beginning of the program for others to
know it's there.

There's no reason why multilined strings that are used only once
should be defined at the beginning of a program (think about a simple
CGI script, which prints HTML parts in a function.)
>
> (how about a new rule: you cannot post to a zombie thread on python-
> dev unless they've fixed/reviewed/applied or otherwise processed at least
> one tracker item earlier the same day.  there are hundreds of items on the
> bugs and patches trackers that could need some loving care)
>
I posted to this thread because it was relevant to a new post about
dedenting strings. Anyway, I looked at bug 1356720 (Ctrl+C for copy
does not work when caps-lock is on), and posted there a very simple
patch which will most probably solve the problem. I also looked at bug
1337987 (IDLE, F5 and wrong external file content. (on error!)). One
problem it raises is that IDLE doesn't have a "revert" command and
that it doesn't notice if the file was changed outside of IDLE. I am
planning to fix it.

The other problem that is reported in that bug is that exceptions show
misleading code lines when the source file was changed but wasn't
loaded into Python. Perhaps in compiled code, not only the file name
should be written but also its modification time? This way, when
tracebacks print lines of changed files, they can warn if the line
might not be the right line.

Noam

From fredrik at pythonware.com  Mon Nov 14 23:27:28 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 14 Nov 2005 23:27:28 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
References: <4373A214.6060201@v.loewis.de>
	<4377D97E.9060507@c2b2.columbia.edu><20051113230400.A403.JCARLSON@uci.edu><AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com><4378DEE8.70109@c2b2.columbia.edu>
	<b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com>
Message-ID: <dlb30h$771$1@sea.gmane.org>

Noam Raphael wrote:

> It didn't. Strange freezes started to appear, only when working from
> IDLE. This made me investigate a bit, and I've found that Tkinter
> isn't run from a separate thread - the dooneevent() function is called
> repeatedly by PyOS_InputHook while the interpreter is idle.

repeatedly?

The standard myreadline implementation only calls the hook *once* for
each line it reads from stdin:

    if (PyOS_InputHook != NULL)
        (void)(PyOS_InputHook)();
    errno = 0;
    p = fgets(buf, len, fp);
    if (p != NULL)
        return 0; /* No error */

which isn't enough to keep any event pump going...

If you want any other behaviour, you need GNU readline, or a GUI toolkit
that takes control over the InputHook, just like Tkinter.  And that won't
help you if you want portable code; for example, Tkinter on Windows only
keeps the event pump running as long as the user doesn't type anything.
As soon as the user touches the keyboard, the pump stops.

To see this in action, try this:

    >>> from Tkinter import *
    >>> label = Label(text="hello")
    >>> label.pack()

and then type

    >>> label.after(1000, lambda: label.config(bg="red"))

and press return.  The widget updates after a second.

Next, type

    >>> label.after(1000, lambda: label.config(bg="blue"))

press return, and immediately press space.  This time, nothing happens,
until you press return again.

If you want to write portable code that keeps things running "in the
background" while the users hack away at the standard interactive
prompt, InputHook won't help you.

</F>




From noamraph at gmail.com  Mon Nov 14 23:39:13 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 15 Nov 2005 00:39:13 +0200
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <dlb30h$771$1@sea.gmane.org>
References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
	<AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>
	<4378DEE8.70109@c2b2.columbia.edu>
	<b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com>
	<dlb30h$771$1@sea.gmane.org>
Message-ID: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com>

On 11/15/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
> If you want to write portable code that keeps things running "in the
> background" while the users hack away at the standard interactive
> prompt, InputHook won't help you.
>
So probably it should be improved, or changed a bit, to work also on
Windows. Or perhaps it's Tkinter. Anyway, what I'm saying is - don't
use threads! Process events in the main thread while it doesn't run
the user's Python code. If he runs another thread - that's his
problem. The implicit event loop should never execute Python code
while a user's Python code is running in the main thread.

Noam

From BruceEckel-Python3234 at mailblocks.com  Mon Nov 14 23:46:58 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Mon, 14 Nov 2005 15:46:58 -0700
Subject: [Python-Dev] Coroutines (PEP 342)
In-Reply-To: <4359047B.6020203@gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
	<ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
	<4359047B.6020203@gmail.com>
Message-ID: <1147958111.20051114154658@gmail.com>

I just finished reading PEP 342, and it appears to follow Hoare's
Communicating Sequential Processes (CSP) where a process is a
coroutine, and the communication is via yield and send(). It seems that
if you follow that form (and you don't seem forced to, pythonically),
then synchronization is not an issue.

What is not clear to me, and is not discussed in the PEP, is whether
coroutines can be distributed among multiple processors. If that is or
isn't possible I think it should be explained in the PEP, and I'd be
interested in knowing about it here (and ideally why it would or wouldn't
work).

Thanks.

Bruce Eckel



From martin at v.loewis.de  Mon Nov 14 23:48:34 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 14 Nov 2005 23:48:34 +0100
Subject: [Python-Dev] str.dedent
In-Reply-To: <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com>
References: <dga72k$cah$1@sea.gmane.org>	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>	<43777B5A.6030602@egenix.com>	<200511140920.51724.gmccaughan@synaptics-uk.com>	<437869DD.7040800@egenix.com>	<b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com>	<dlaqds$8sb$1@sea.gmane.org>
	<b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com>
Message-ID: <43791442.8050109@v.loewis.de>

Noam Raphael wrote:
> There's no reason why multilined strings that are used only once
> should be defined at the beginning of a program (think about a simple
> CGI script, which prints HTML parts in a function.)

I find that simple CGI scripts are precisely the example *for* putting
multi-line string literals at the beginning of a file. There are
multiple styles for writing such things:
1. Put headers and trailers into separate strings. This tends to become
    tedious to maintain, since you always have to find the matching
    string (e.g. if you add an opening tag in the header, you have
    to put the closing tag in the trailer).

2. Use interpolation (e.g. % substitution), and put the strings into
    the code. This works fine for single line strings. For multi-line
    strings, the HTML code tends to clutter the view of the algorithm,
    whether it is indented or not. Functions should fit on a single
    screen of text, and adding multiline text into functions tends
    to break this requirement.

3. Use interpolation, and put the templates at the beginning. This makes
    the templates easy to inspect, and makes it easy to follow the code
    later in the file. It is the style I use and recommend.
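
    A minimal sketch of style 3, written for a current Python 3. The
    names PAGE and render are made up for illustration, and the
    html.escape call is an addition that the style itself does not
    require.

```python
from html import escape

# Template kept at the top of the file, where it is easy to inspect.
PAGE = """\
<html>
<head><title>%(title)s</title></head>
<body>
<h1>%(title)s</h1>
<p>%(body)s</p>
</body>
</html>
"""


def render(title, body):
    # The function body stays short; the markup lives in PAGE.
    return PAGE % {"title": escape(title), "body": escape(body)}


print(render("Hello", "a < b"), end="")
```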

Of course, it may occasionally become necessary to have a few-lines
string literally in a function; in most cases, indenting it along
with the rest of the function is fine, as HTML can stand extra spaces
with no problems.

Regards,
Martin


From greg.ewing at canterbury.ac.nz  Tue Nov 15 01:41:42 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 15 Nov 2005 13:41:42 +1300
Subject: [Python-Dev] str.dedent
In-Reply-To: <BA7FC0D5-FEC3-453E-B2C6-B1082DCFE9ED@fuhm.net>
References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<000001c5e7c6$2f959440$2523c797@oemcomputer>
	<b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>
	<43769E4A.5040408@colorstudy.com> <4377E357.5010808@canterbury.ac.nz>
	<BA7FC0D5-FEC3-453E-B2C6-B1082DCFE9ED@fuhm.net>
Message-ID: <43792EC6.4030707@canterbury.ac.nz>

James Y Knight wrote:

> ITYM you mean "If only python were lisp". (macros, or even reader  macros)

No, I mean it would be more satisfying if there
were a syntax for expressing multiline string
literals that didn't force it to be at the left
margin. The lack of such in such an otherwise
indentation-savvy language seems a wart.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From pje at telecommunity.com  Tue Nov 15 04:24:52 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 Nov 2005 22:24:52 -0500
Subject: [Python-Dev] Coroutines (PEP 342)
In-Reply-To: <1147958111.20051114154658@gmail.com>
References: <4359047B.6020203@gmail.com>
	<5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
	<ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
	<4359047B.6020203@gmail.com>
Message-ID: <5.1.1.6.0.20051114221533.01f25260@mail.telecommunity.com>

At 03:46 PM 11/14/2005 -0700, Bruce Eckel wrote:
>I just finished reading PEP 342, and it appears to follow Hoare's
>Communicating Sequential Processes (CSP) where a process is a
>coroutine, and the communication is via yield and send(). It seems that
>if you follow that form (and you don't seem forced to, pythonically),
>then synchronization is not an issue.
>
>What is not clear to me, and is not discussed in the PEP, is whether
>coroutines can be distributed among multiple processors.

If you were to write a trampoline that used multiple threads, *and* you 
were using a Python implementation that supported multiple processors (e.g. 
Jython, IronPython, ?), *and* that Python implementation supported PEP 342, 
then yes.

However, that just means the answer is, "if you can run Python code on 
multiple processors, you can run Python code on multiple processors".  PEP 
342 itself has nothing to say about that issue, which exists independently 
of the PEP.

So, the PEP doesn't address what you're asking about, because the GIL still 
exists in CPython, and will continue to exist.  Also, guaranteeing 
encapsulation of the coroutines would be *hard*, because lots of Python 
objects like modules, functions, and the like would be shared between more 
than one coroutine, and so then the issue of locking raises its ugly head 
again.


>  If that is or
>isn't possible I think it should be explained in the PEP, and I'd be
>interested in knowing about it here (and ideally why it would or wouldn't
>work).

The PEP is entirely unrelated (and entirely orthogonal) to whether a given 
Python implementation can interpret Python code on multiple processors 
simultaneously.

The only difference between what PEP 342 does and what Twisted does today 
is in syntax.  PEP 342 just provides a syntax that lets you avoid writing 
your code in CPS (continuation-passing style) with lots of callbacks.
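
As a small illustration of that syntax, here is a sketch of a
generator-based coroutine (written for a current Python 3; the name
averager is made up). State that a callback-style version would have
to carry between calls lives in ordinary local variables, and values
arrive through send().

```python
def averager():
    """Coroutine: receives numbers via send(), yields the running mean."""
    total = 0.0
    count = 0
    mean = None
    while True:
        value = yield mean  # pause here until the caller sends a value
        total += value
        count += 1
        mean = total / count


co = averager()
next(co)            # advance to the first yield
print(co.send(10))  # 10.0
print(co.send(20))  # 15.0
```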

PEP 342 is implemented in the current Python SVN HEAD, by the way, if you 
want to experiment with the implementation.


From abo at minkirri.apana.org.au  Tue Nov 15 10:25:29 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Tue, 15 Nov 2005 09:25:29 +0000
Subject: [Python-Dev] Coroutines (PEP 342)
In-Reply-To: <1147958111.20051114154658@gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
	<ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
	<4359047B.6020203@gmail.com>  <1147958111.20051114154658@gmail.com>
Message-ID: <1132046729.17944.11.camel@warna.corp.google.com>

On Mon, 2005-11-14 at 15:46 -0700, Bruce Eckel wrote:
[...]
> What is not clear to me, and is not discussed in the PEP, is whether
> coroutines can be distributed among multiple processors. If that is or
> isn't possible I think it should be explained in the PEP, and I'd be
> interested in knowing about it here (and ideally why it would or wouldn't
> work).

Even if different coroutines could be run on different processors, there
would be nothing gained except extra overheads of interprocessor memory
duplication and communication delays.

The whole process communication via yield and send effectively means
only one co-routine is running at a time, and all the others are blocked
waiting for a yield or send.

This was the whole point; it is a convenient abstraction that appears to
do work in parallel, while actually doing it sequentially, avoiding the
overheads and possible race conditions of threads.

It has the problem that a single co-routine can monopolise execution,
hence the other name "co-operative multi-tasking", where co-operation is
the requirement for it to work.
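
A minimal sketch of that behaviour, assuming a simple round-robin
scheduler (the names worker and run_round_robin are hypothetical, and
the code is written for a current Python 3). Exactly one generator
runs at any moment, and a task that never yields would starve the
others.

```python
from collections import deque


def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # run this task until its next yield
        except StopIteration:
            continue    # task finished; do not reschedule it
        queue.append(task)


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

The log shows the two workers interleaving step by step, even though
everything executes sequentially in a single thread.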

At least... that's the way I understood it... I could be totally
mistaken...

-- 
Donovan Baarda <abo at minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/


From ncoghlan at iinet.net.au  Tue Nov 15 10:31:03 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Tue, 15 Nov 2005 19:31:03 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
Message-ID: <4379AAD7.2050506@iinet.net.au>

Transferring part of the discussion of Thomas Lee's PEP 341 patch to 
python-dev. . .

Neal Norwitz wrote in the SF patch tracker:
> Thomas, I hope you will write up this experience in coding
> this patch.  IMO it clearly demonstrates a problem with the
> new AST code that needs to be addressed.  ie, Memory
> management is not possible to get right.  I've got a 700+
> line patch to ast.c to correct many more memory issues
> (hopefully that won't cause conflicts with this patch).  I
> would like to hear ideas of how the AST code can be improved
> to make it much easier to not leak memory and be safe at the
> same time.

As Neal pointed out, it's tricky to write code for the AST parser and compiler 
without accidentally letting memory leak when the parser or compiler runs into 
a problem and has to bail out on whatever it was doing. Thomas's patch got to 
v5 (based on Neal's review comments) with memory leaks still in it, my review 
got rid of some of them, and we think Neal's last review of v6 of the patch 
got rid of the last of them.

I am particularly concerned about the returns hidden inside macros in the AST 
compiler's symbol table generation and bytecode generation steps. At the 
moment, every function in compile.c which allocates code blocks (or anything 
else for that matter) and then calls one of the VISIT_* macros is a memory 
leak waiting to happen.

Something I've seen used successfully (and used myself) to deal with similar 
resource-management problems in C code is to use a switch statement, rather 
than getting goto-happy.

Specifically, the body of the entire function is written inside a switch 
statement, with 'break' then used as the equivalent of "raise Exception". For 
example:

   PyObject* switchAsTry()
   {
     switch(0) {
       default:
         /* Real function body goes here */
         return result;
     }
     /* Error cleanup code goes here */
     return NULL;
   }

It avoids the potential for labelling problems that arises when goto's are 
used for resource cleanup. It's a far cry from real exception handling, but 
it's the best solution I've seen within the limits of C.

A particular benefit comes when macros which may abort function execution are 
used inside the function - if those macros are rewritten to use break instead 
of return, then the function gets a chance to clean up after an error.

Cheers,
Nick.

P.S. Getting rid of the flow control macros entirely is another option, of 
course, but it would make compile.c and symtable.c a LOT harder to follow. 
Raymond Chen's articles notwithstanding, a preprocessor-based mini-language 
does make sense in some situations, and I think this is one of them. 
Particularly since the flow control macros are private to the relevant 
implementation files.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From mwh at python.net  Tue Nov 15 10:50:42 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 15 Nov 2005 09:50:42 +0000
Subject: [Python-Dev] str.dedent
In-Reply-To: <43792EC6.4030707@canterbury.ac.nz> (Greg Ewing's message of
	"Tue, 15 Nov 2005 13:41:42 +1300")
References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<000001c5e7c6$2f959440$2523c797@oemcomputer>
	<b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com>
	<43769E4A.5040408@colorstudy.com> <4377E357.5010808@canterbury.ac.nz>
	<BA7FC0D5-FEC3-453E-B2C6-B1082DCFE9ED@fuhm.net>
	<43792EC6.4030707@canterbury.ac.nz>
Message-ID: <2mveyuxmrx.fsf@starship.python.net>

Greg Ewing <greg.ewing at canterbury.ac.nz> writes:

> James Y Knight wrote:
>
>> ITYM you mean "If only python were lisp". (macros, or even reader  macros)
>
> No, I mean it would be more satisfying if there
> were a syntax for expressing multiline string
> literals that didn't force it to be at the left
> margin. The lack of such in such an otherwise
> indentation-savvy language seems a wart.

Wasn't there a PEP about this?  Yes, 295.  But that was rejected, I
presume[*] because it proposed changing all multi-line string literals, a
plainly doomed idea (well, it would make *me* squeal, anyway).

Cheers,
mwh
(who finds the whole issue rather hard to care about)

[*] The reason for rejection isn't in the PEP, grumble.

-- 
  I would hereby duly point you at the website for the current pedal
  powered submarine world underwater speed record, except I've lost
  the URL.                                         -- Callas, cam.misc

From imbaczek at gmail.com  Tue Nov 15 13:22:21 2005
From: imbaczek at gmail.com (=?ISO-8859-2?Q?Marek_=22Baczek=22_Baczy=F1ski?=)
Date: Tue, 15 Nov 2005 13:22:21 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <4379AAD7.2050506@iinet.net.au>
References: <4379AAD7.2050506@iinet.net.au>
Message-ID: <5f3d2c310511150422x3e2d670r@mail.gmail.com>

2005/11/15, Nick Coghlan <ncoghlan at iinet.net.au>:
> Specifically, the body of the entire function is written inside a switch
> statement, with 'break' then used as the equivalent of "raise Exception". For
> example:
>
>    PyObject* switchAsTry()
>    {
>      switch(0) {
>        default:
>          /* Real function body goes here */
>          return result;
>      }
>      /* Error cleanup code goes here */
>      return NULL;
>    }
>
> It avoids the potential for labelling problems that arises when goto's are
> used for resource cleanup. It's a far cry from real exception handling, but
> it's the best solution I've seen within the limits of C.

<delurk>
do {
    ....
    ....
} while (0);


Same benefit and saves some typing :)

Now back to my usual hiding place.
</delurk>

--
{ Marek Baczyński :: UIN 57114871 :: GG 161671 :: JID imbaczek at jabber.gda.pl  }
{ http://www.vlo.ids.gda.pl/ | imbaczek at poczta fm | http://www.promode.org }
.. .. .. .. ... ... ...... evolve or face extinction ...... ... ... .. .. .. ..

From ncoghlan at gmail.com  Tue Nov 15 13:26:54 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 Nov 2005 22:26:54 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <5f3d2c310511150422x3e2d670r@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<5f3d2c310511150422x3e2d670r@mail.gmail.com>
Message-ID: <4379D40E.9050002@gmail.com>

Marek Baczek Baczyński wrote:
> 2005/11/15, Nick Coghlan <ncoghlan at iinet.net.au>:
>> It avoids the potential for labelling problems that arises when goto's are
>> used for resource cleanup. It's a far cry from real exception handling, but
>> it's the best solution I've seen within the limits of C.
> 
> <delurk>
> do {
>     ....
>     ....
> } while (0);
> 
> 
> Same benefit and saves some typing :)

Heh. Good point. I spend so much time working with a certain language I tend 
to forget do/while loops exist ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From krumms at gmail.com  Tue Nov 15 14:17:13 2005
From: krumms at gmail.com (Thomas Lee)
Date: Tue, 15 Nov 2005 23:17:13 +1000
Subject: [Python-Dev] PEP 341 patch & memory management (was: Memory
 management in the AST parser & compiler)
In-Reply-To: <4379D40E.9050002@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<5f3d2c310511150422x3e2d670r@mail.gmail.com>
	<4379D40E.9050002@gmail.com>
Message-ID: <4379DFD9.3010209@gmail.com>

Interesting trick!

The PEP 341 patch is now using Marek's 'do ... while' resource cleanup 
trick instead of the nasty goto voodoo.

I've also fixed the last remaining bug that Neal pointed out. I'm 
running the unit tests right now, shall have the updated (and hopefully 
final) PEP 341 patch up on sourceforge within the next 15 minutes.

If anybody has feedback/suggestions for the patch, please let me know. 
I'm new to this stuff, so I'm still finding my way around :)

Cheers,
Tom

Nick Coghlan wrote:

>Marek Baczek Baczyński wrote:
>>2005/11/15, Nick Coghlan <ncoghlan at iinet.net.au>:
>>>It avoids the potential for labelling problems that arises when goto's are
>>>used for resource cleanup. It's a far cry from real exception handling, but
>>>it's the best solution I've seen within the limits of C.
>>
>><delurk>
>>do {
>>    ....
>>    ....
>>} while (0);
>>
>>Same benefit and saves some typing :)
>
>Heh. Good point. I spend so much time working with a certain language I tend 
>to forget do/while loops exist ;)
>
>Cheers,
>Nick.


From mwh at python.net  Tue Nov 15 18:29:21 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 15 Nov 2005 17:29:21 +0000
Subject: [Python-Dev] Gothenburg PyPy Sprint II: 7th - 11th December 2005
Message-ID: <2mfypxyg3y.fsf@starship.python.net>

Gothenburg PyPy Sprint II: 7th - 11th December 2005 
======================================================
(NOTE: internal EU-only sprint starts on the 5th!)

The next PyPy sprint is scheduled to be in December 2005 in Gothenburg,
Sweden.  Its main focus is heading towards phase 2, which means JIT
work, alternate threading models and logic programming (but there are
also other possible topics).  We'll give newcomer-friendly
introductions.  To learn more about the new PyPy Python-in-Python
implementation look here: 

    http://codespeak.net/pypy 

Goals and topics of the sprint 
------------------------------

We have released pypy-0.8.0_, which is officially a "research base" for
future work.  The goal of the Gothenburg sprint is to start exploring
new directions and continue in the directions started at the Paris
sprint.

The currently scheduled main topics are:

 - The L3 interpreter, a small fast interpreter for "assembler-level"
   flow graphs.  This is heading towards JIT work.

 - Stackless: write an app-level interface, which might be either
   Tasklets, as in "Stackless CPython", or the more limited Greenlets.

 - Porting C modules from CPython.  (_socket is not finished)

 - Optimization/debugging work in general.  In particular our thread
   support is far from stable at the moment and unaccountably slow.

 - Experimentation: logic programming in Python.  A first step might be
   to try to add logic variables to PyPy.


.. _`pypy-0.8.0`: http://codespeak.net/pypy/dist/pypy/doc/release-0.8.0.html

Location & Accommodation
------------------------

The sprint will be held in the apartment of Laura Creighton and Jacob
Halen which is in Gotabergsgatan 22.  The location is central in
Gothenburg.  It is between the tram_ stops of Vasaplatsen and Valand,
where many lines call.

.. _tram: http://www.vasttrafik.se

Probably cheapest and not too far away is to book accommodation at `SGS
Veckobostader`_.  (You can have a 10% discount there; ask in the
pypy-sprint mailing list for details.  We also have some possibilities of
free accommodation.)

.. _`SGS Veckobostader`: http://www.sgsveckobostader.com

Exact times 
-----------

The public PyPy sprint is held Wednesday 7th - Sunday 11th December
2005.  There is a sprint for people involved with the EU part of the
project on the two days before the "official" sprint.  Hours will be
from 10:00 until people have had enough.  It's a good idea to arrive a
day before the sprint starts and leave a day later.  In the middle of
the sprint there usually is a break day and it's usually ok to take
half-days off if you feel like it.


Network, Food, currency 
------------------------ 

Sweden is not part of the Euro zone. One SEK (krona in singular, kronor
in plural) is roughly 1/10th of a Euro (9.15 SEK to 1 Euro).

The venue is central in Gothenburg.  There is a large selection of
places to get food around, from edible-and-cheap to outstanding.

You normally need a wireless network card to access the network, but we
can provide a wireless/ethernet bridge.

Sweden uses the same kind of plugs as Germany. 230V AC.

Registration etc.pp. 
-------------------- 

Please subscribe to the `PyPy sprint mailing list`_, introduce yourself
and post a note that you want to come.  Feel free to ask any questions
there!  There is also a separate `Gothenburg people`_ page tracking who
is already expected to come.  If you have commit rights on codespeak then
you can yourself modify a checkout of

  http://codespeak.net/svn/pypy/extradoc/sprintinfo/gothenburg-2005/people.txt

.. _`PyPy sprint mailing list`: http://codespeak.net/mailman/listinfo/pypy-sprint
.. _`Gothenburg people`: http://codespeak.net/pypy/extradoc/sprintinfo/gothenburg-2005/people.html

Cheers,
mwh

-- 
  Speaking from personal experience, I can attest that the barrel of
  any firearm (or black powder weapon) pointed at one appears large
  enough to walk down, hands at full extension above the head,
  without touching the top.                       -- Mike Andrews, asr

From niko at alum.mit.edu  Tue Nov 15 18:50:53 2005
From: niko at alum.mit.edu (Niko Matsakis)
Date: Tue, 15 Nov 2005 18:50:53 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <4379AAD7.2050506@iinet.net.au>
References: <4379AAD7.2050506@iinet.net.au>
Message-ID: <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>

> As Neal pointed out, it's tricky to write code for the AST parser  
> and compiler
> without accidentally letting memory leak when the parser or  
> compiler runs into
> a problem and has to bail out on whatever it was doing. Thomas's  
> patch got to
> v5 (based on Neal's review comments) with memory leaks still in it,  
> my review
> got rid of some of them, and we think Neal's last review of v6 of  
> the patch
> got rid of the last of them.

Another lurker's 2 cents:

My experience with compilers in particular is that an arena is the
way to go for memory management.  I haven't looked at the AST code,
but this can take a variety of forms: anything from a linked list of
pointers to free, to something which allocates memory in large
blocks and parcels it out.  The goal is just to be able to free the
memory en masse whatever happens and not have to track individual
pointers.

Generally, compilers have memory allocations which operate in phases
and so are very amenable to arenas.  You might have one memory pool
for the long-lived representation, one that is freed and recreated
between passes, etc.

If you need to keep the AST around long term, then a mark-sweep  
garbage collector combined with a linked list might even be a good idea.

Obviously, the whole thing is a tradeoff of peak memory size (which  
goes up) against correctness (which is basically ensured, and at  
least easily auditable).


Niko

From jeremy at alum.mit.edu  Tue Nov 15 20:42:13 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 15 Nov 2005 14:42:13 -0500
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
Message-ID: <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>

On 11/15/05, Niko Matsakis <niko at alum.mit.edu> wrote:
> > As Neal pointed out, it's tricky to write code for the AST parser
> > and compiler
> > without accidentally letting memory leak when the parser or
> > compiler runs into
> > a problem and has to bail out on whatever it was doing. Thomas's
> > patch got to
> > v5 (based on Neal's review comments) with memory leaks still in it,
> > my review
> > got rid of some of them, and we think Neal's last review of v6 of
> > the patch
> > got rid of the last of them.
>
> Another lurker's 2 cents:
>
> My experience with compilers in particular is that an arena is the
> way to go for memory management.  I haven't looked at the AST code,
> but this can take a variety of forms: anything from linked lists of
> pointers to free from something which allocates memory in large
> blocks and parcels them out.  The goal is just to be able to free the
> memory en-masse whatever happens and not have to track individual
> pointers.

Thanks for the message.  I was going to suggest the same thing.  I
think it's primarily a question of how to add an arena layer.  The AST
phase has a mixture of malloc/free and Python object allocation.  It
should be straightforward to change the malloc/free code to use an
arena API.  We'd probably need a separate mechanism to associate a set
of PyObject* with the arena and have those DECREFed.

Jeremy

>
> Generally, compilers have memory allocations which operate in phases
> and so are very amenable to arenas.  You might have one memory pool
> for long lived representation, one that  is freed and recreated
> between passes, etc.
>
> If you need to keep the AST around long term, then a mark-sweep
> garbage collector combined with a linked list might even be a good idea.
>
> Obviously, the whole thing is a tradeoff of peak memory size (which
> goes up) against correctness (which is basically ensured, and at
> least easily auditable).
>
>
> Niko
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu
>

From bcannon at gmail.com  Tue Nov 15 20:48:51 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 15 Nov 2005 11:48:51 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
Message-ID: <bbaeab100511151148i704bdbb3oabc1b7b5dd509a67@mail.gmail.com>

On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> On 11/15/05, Niko Matsakis <niko at alum.mit.edu> wrote:
> > > As Neal pointed out, it's tricky to write code for the AST parser
> > > and compiler
> > > without accidentally letting memory leak when the parser or
> > > compiler runs into
> > > a problem and has to bail out on whatever it was doing. Thomas's
> > > patch got to
> > > v5 (based on Neal's review comments) with memory leaks still in it,
> > > my review
> > > got rid of some of them, and we think Neal's last review of v6 of
> > > the patch
> > > got rid of the last of them.
> >
> > Another lurker's 2 cents:
> >
> > My experience with compilers in particular is that an arena is the
> > way to go for memory management.  I haven't looked at the AST code,
> > but this can take a variety of forms: anything from linked lists of
> > pointers to free from something which allocates memory in large
> > blocks and parcels them out.  The goal is just to be able to free the
> > memory en-masse whatever happens and not have to track individual
> > pointers.
>
> Thanks for the message.  I was going to suggest the same thing.  I
> think it's primarily a question of how to add an arena layer.  The AST
> phase has a mixture of malloc/free and Python object allocation.  It
> should be straightforward to change the malloc/free code to use an
> arena API.  We'd probably need a separate mechanism to associate a set
> of PyObject* with the arena and have those DECREFed.
>

Might just need two lists; malloc'ed pointers and PyObject pointers. 
Could redefine Py_INCREF and Py_DECREF locally for ast.c and compile.c
to use the arena API and thus hide the detail.  Otherwise just a big,
flashing "USE THIS API" sign will be needed.
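
To make the "two lists" idea concrete, here is a toy model in Python
rather than C.  It is only a sketch of the bookkeeping, not a proposed
API: the names Arena, malloc, add_object, free_all, compile_something
and run are all made up for illustration, bytearrays stand in for
malloc'ed blocks, and plain Python objects stand in for PyObject
pointers that would need a DECREF.

```python
class Arena:
    """Toy arena: remembers everything handed out so it can be freed at once."""

    def __init__(self):
        self.blocks = []    # stands in for the list of malloc'ed pointers
        self.objects = []   # stands in for the list of PyObject pointers

    def malloc(self, size):
        block = bytearray(size)
        self.blocks.append(block)
        return block

    def add_object(self, obj):
        self.objects.append(obj)
        return obj

    def free_all(self):
        """Release both lists; return how many entries of each kind were freed."""
        freed = (len(self.blocks), len(self.objects))
        self.blocks.clear()
        self.objects.clear()
        return freed


def compile_something(arena, fail):
    """Hypothetical compiler pass that may bail out halfway through."""
    arena.malloc(64)
    arena.add_object(["node"])
    if fail:
        raise SyntaxError("bail out halfway")
    arena.malloc(32)
    return "ok"


def run(fail):
    """Run the pass, then free everything in one place, error or not."""
    arena = Arena()
    result = None
    try:
        result = compile_something(arena, fail)
    except SyntaxError:
        pass
    return result, arena.free_all()
```

The point of the sketch is that compile_something() has no cleanup
code of its own on the error path; the single free_all() call in
run() covers both the success and the failure case.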

I have gone ahead and added this as a possible topic to sprint on at PyCon.

-Brett

> Jeremy
>
> >
> > Generally, compilers have memory allocations which operate in phases
> > and so are very amenable to arenas.  You might have one memory pool
> > for long lived representation, one that  is freed and recreated
> > between passes, etc.
> >
> > If you need to keep the AST around long term, then a mark-sweep
> > garbage collector combined with a linked list might even be a good idea.
> >
> > Obviously, the whole thing is a tradeoff of peak memory size (which
> > goes up) against correctness (which is basically ensured, and at
> > least easily auditable).
> >
> >
> > Niko

From nnorwitz at gmail.com  Tue Nov 15 22:57:05 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 15 Nov 2005 13:57:05 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
Message-ID: <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>

On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
>
> Thanks for the message.  I was going to suggest the same thing.  I
> think it's primarily a question of how to add an arena layer.  The AST
> phase has a mixture of malloc/free and Python object allocation.  It
> should be straightforward to change the malloc/free code to use an
> arena API.  We'd probably need a separate mechanism to associate a set
> of PyObject* with the arena and have those DECREFed.

Well good.  It seems we all agree there is a problem and on the
general solution.  I haven't thought about Brett's idea to see if it
could work or not.  It would be great if we had someone start working
to improve the situation.  It could well be that we live with the
current code for 2.5, but it would be great to use arenas for 2.6 at
least.

Niko, Marek, how would you like to lose your lurker status? ;-)

n

From noamraph at gmail.com  Wed Nov 16 00:34:19 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 16 Nov 2005 01:34:19 +0200
Subject: [Python-Dev] str.dedent
In-Reply-To: <43791442.8050109@v.loewis.de>
References: <dga72k$cah$1@sea.gmane.org>
	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>
	<43777B5A.6030602@egenix.com>
	<200511140920.51724.gmccaughan@synaptics-uk.com>
	<437869DD.7040800@egenix.com>
	<b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com>
	<dlaqds$8sb$1@sea.gmane.org>
	<b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com>
	<43791442.8050109@v.loewis.de>
Message-ID: <b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com>

Thanks for your examples. I understand that sometimes it's a good idea
not to write the HTML inside the function (although it may be nice to
sometimes write it just before the function - and if it's a method,
then we get the same indentation problem.)

However, as you said, sometimes it is desired to write multilined
strings inside functions. You think it's ok to add white spaces to the
HTML code; I personally prefer not to add varying indentation to my
output according to the indentation level of the code that generated
it.

I just wanted to add another use case: long messages. Consider these
lines from idlelib/run.py:133:

        msg = "IDLE's subprocess can't connect to %s:%d.  This may be due "\
              "to your personal firewall configuration.  It is safe to "\
              "allow this internal connection because no data is visible on "\
              "external ports." % address
        tkMessageBox.showerror("IDLE Subprocess Error", msg, parent=root)

and from idlelib/PyShell.py:734:

    def display_port_binding_error(self):
        tkMessageBox.showerror(
            "Port Binding Error",
            "IDLE can't bind TCP/IP port 8833, which is necessary to "
            "communicate with its Python execution server.  Either "
            "no networking is installed on this computer or another "
            "process (another IDLE?) is using the port.  Run IDLE with the -n "
            "command line switch to start without a subprocess and refer to "
            "Help/IDLE Help 'Running without a subprocess' for further "
            "details.",
            master=self.tkconsole.text)

I know, of course, that it could be written using textwrap.dedent, but
I think that not having to load a module will encourage the use of
dedent; if I have to load a module, I might say, "oh, I can live with
all those marks around the text, there's no need for another module",
and then, any time I want to change that message, I have a lot of
editing work to do.
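
For comparison, here is a minimal sketch of the textwrap.dedent
spelling of such a message, written in present-day Python.  The helper
name port_binding_message is hypothetical and the text is shortened
from the PyShell.py message above.  Note that dedent only strips the
common leading whitespace; the line breaks stay in the string.

```python
import textwrap


def port_binding_message(port):
    """Build a long message from an indented triple-quoted string."""
    # The backslash after the opening quotes keeps the string from
    # starting with an empty line.
    text = """\
        IDLE can't bind TCP/IP port %d, which is necessary to
        communicate with its Python execution server.  Either
        no networking is installed on this computer or another
        process (another IDLE?) is using the port.
        """ % port
    # Remove the indentation that exists only because of the code layout.
    return textwrap.dedent(text)
```

Editing the message then means editing plain text, with no quote
marks or backslashes to keep aligned.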

Noam

From Scott.Daniels at Acm.Org  Wed Nov 16 00:31:45 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Tue, 15 Nov 2005 15:31:45 -0800
Subject: [Python-Dev] Behavoir question.
Message-ID: <dldr4m$n0v$1@sea.gmane.org>

Since I am fiddling with int/long conversions to/from string:

Is the current behavior intentional (or mandatory?):

     v = int('       55555555555555555555555555555555555555555       ')
works, but:
     v = int('       55555555555555555555555555555555555555555L      ')
fails.

--Scott David Daniels
Scott.Daniels at Acm.Org


From mdehoon at c2b2.columbia.edu  Wed Nov 16 00:48:43 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Tue, 15 Nov 2005 18:48:43 -0500
Subject: [Python-Dev]  Conclusion: Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com>
References: <4373A214.6060201@v.loewis.de>
	<4377D97E.9060507@c2b2.columbia.edu>	<20051113230400.A403.JCARLSON@uci.edu>	<AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>	<4378DEE8.70109@c2b2.columbia.edu>	<b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com>	<dlb30h$771$1@sea.gmane.org>
	<b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com>
Message-ID: <437A73DB.9000705@c2b2.columbia.edu>

Thanks everybody for contributing to this discussion. I didn't expect it 
to become this extensive.
I think that by now, everybody has had their chance to voice their opinion.
It seems safe to conclude that there is no consensus on this topic.

So what I'm planning to do is to write a small extension module that 
implements some of the ideas that came up in this discussion, and see 
how they perform in the wild. It will give us an idea of what works, 
what doesn't, and what the user demand is for such functionality, and 
will help us if this issue happens to turn up again at some point in the 
future.

Thanks again,

--Michiel.


-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From pje at telecommunity.com  Wed Nov 16 01:19:38 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 15 Nov 2005 19:19:38 -0500
Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook,
 and  Tkinter
In-Reply-To: <437A73DB.9000705@c2b2.columbia.edu>
References: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com>
	<4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
	<AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>
	<4378DEE8.70109@c2b2.columbia.edu>
	<b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com>
	<dlb30h$771$1@sea.gmane.org>
	<b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051115191823.01f1a4c0@mail.telecommunity.com>

At 06:48 PM 11/15/2005 -0500, Michiel Jan Laurens de Hoon wrote:
>Thanks everybody for contributing to this discussion. I didn't expect it
>to become this extensive.
>I think that by now, everybody has had their chance to voice their opinion.
>It seems safe to conclude that there is no consensus on this topic.

Just a question: did you ever try using IPython, and confirm whether it 
does or does not address the issue you were having?  As far as I could 
tell, you never confirmed or denied that point.


From mdehoon at c2b2.columbia.edu  Wed Nov 16 02:34:02 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Tue, 15 Nov 2005 20:34:02 -0500
Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook,
	and  Tkinter
In-Reply-To: <5.1.1.6.0.20051115191823.01f1a4c0@mail.telecommunity.com>
References: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com>
	<4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
	<AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>
	<4378DEE8.70109@c2b2.columbia.edu>
	<b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com>
	<dlb30h$771$1@sea.gmane.org>
	<b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com>
	<5.1.1.6.0.20051115191823.01f1a4c0@mail.telecommunity.com>
Message-ID: <437A8C8A.808@c2b2.columbia.edu>

Phillip J. Eby wrote:

> At 06:48 PM 11/15/2005 -0500, Michiel Jan Laurens de Hoon wrote:
>
>> Thanks everybody for contributing to this discussion. I didn't expect it
>> to become this extensive.
>> I think that by now, everybody has had their chance to voice their 
>> opinion.
>> It seems safe to conclude that there is no consensus on this topic.
>
>
> Just a question: did you ever try using IPython, and confirm whether 
> it does or does not address the issue you were having?  As far as I 
> could tell, you never confirmed or denied that point.
>
Yes I did try IPython.

First of all, IPython, being pure Python code, does not affect the 
underlying Python's loop (at the C level). So just running Python 
through IPython does not fix our event loop problem. On Windows, for 
example, after importing IPython into IDLE (which most of our users will 
want to use), our graphics window still freezes.

This leaves us with the possibility of using IPython's event loop, which 
it runs on top of regular Python. But if we use that, we'd either have 
to convince all our users to switch to IPython (which is impossible) or 
we have to maintain two mechanisms to hook our extension module into the 
event loop: one for Python and one for IPython.

There are several other reasons why the alternative solutions that came 
up in this discussion are more attractive than IPython:
1) AFAICT, IPython is not intended to work with IDLE.
2) I didn't get the impression that the IPython developers understand 
why and how their event loop works very well (which made it hard to 
respond to their posts). I am primarily interested in understanding the 
problem first and then coming up with a suitable mechanism for events. 
Without such understanding, IPython's event loop smells too much like a 
hack.
3) IPython adds another layer on top of Python. For IPython's purpose, 
that's fine. But if we're just interested in event loops, I think it is 
hard to argue that another layer is absolutely necessary. So rather than 
setting up an event loop in a layer on top of Python, I'd prefer to find 
a solution within the context of Python itself (be it threads, an event 
loop, or PyOS_InputHook).
4) Call me a sentimental fool, but I just happen to like regular Python.

My apologies in advance to the IPython developers if I misunderstood how 
it works.

--Michiel.


-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



From fperez.net at gmail.com  Wed Nov 16 03:03:52 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Tue, 15 Nov 2005 19:03:52 -0700
Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook,
	and  Tkinter
References: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com>
	<4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu>
	<20051113230400.A403.JCARLSON@uci.edu>
	<AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com>
	<4378DEE8.70109@c2b2.columbia.edu>
	<b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com>
	<dlb30h$771$1@sea.gmane.org>
	<5.1.1.6.0.20051115191823.01f1a4c0@mail.telecommunity.com>
	<437A8C8A.808@c2b2.columbia.edu>
Message-ID: <dle429$d5d$1@sea.gmane.org>

Michiel Jan Laurens de Hoon wrote:

> There are several other reasons why the alternative solutions that came
> up in this discussion are more attractive than IPython:
> 1) AFAICT, IPython is not intended to work with IDLE.

Not so far, but mostly by accident. The necessary changes are fairly easy
(mainly abstracting out assumptions about being in a tty).  I plan on making
ipython embeddable inside any GUI (including IDLE), as there is much demand for
that.

> 2) I didn't get the impression that the IPython developers understand
> why and how their event loop works very well (which made it hard to
> respond to their posts). I am primarily interested in understanding the
> problem first and then come up with a suitable mechanism for events.
> Without such understanding, IPython's event loop smells too much like a
> hack.

I said I did get that code off the ground by stumbling in the dark, but I tried
to explain to you what it does, which is pretty simple:

a. You find, for each toolkit, what its timer/idle mechanism is.  This requires
reading a little about each toolkit's API, as they all do it slightly
differently.  But the principle is always the same, only the implementation
details change.

b. You subclass threading.Thread, as you do for all threading code.  The run
method of this class manages a one-entry queue where code is put for execution
from stdin.

c. The timer you set up with the info from (a) calls the method which executes
the code object from the queue in (b), with suitable locking.

That's pretty much it.  Following this same idea, just this week I implemented
an ipython-for-OpenGL shell.  All I had to do was look up what OpenGL uses for
an idle callback.
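
Steps (a) to (c) can be sketched without any GUI toolkit.  This is not
the actual ipython code, only an illustration in present-day Python:
the class name InputThread, the method on_timer and the list of lines
standing in for stdin are all assumptions.  In a real program the
toolkit's timer or idle callback would call on_timer() from the GUI
thread.

```python
import queue
import threading


class InputThread(threading.Thread):
    """Compiles input lines and hands them to the GUI thread one at a time."""

    def __init__(self, lines, namespace):
        super().__init__(daemon=True)
        self.lines = lines                      # stand-in for reading stdin
        self.namespace = namespace              # where the code is executed
        self.pending = queue.Queue(maxsize=1)   # the one-entry queue

    def run(self):
        for line in self.lines:
            code = compile(line, "<input>", "exec")
            # put() blocks until the previous entry has been consumed;
            # queue.Queue does the locking internally.
            self.pending.put(code)

    def on_timer(self):
        """To be called by the toolkit's timer/idle callback (GUI thread)."""
        try:
            code = self.pending.get_nowait()
        except queue.Empty:
            return False        # nothing to run this tick
        exec(code, self.namespace)
        return True
```

The user's code therefore always runs in the thread that owns the
event loop, while the blocking read from the input happens elsewhere.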

> 3) IPython adds another layer on top of Python. For IPython's purpose,
> that's fine. But if we're just interested in event loops, I think it is
> hard to argue that another layer is absolutely necessary. So rather than
> setting up an event loop in a layer on top of Python, I'd prefer to find
> a solution within the context of Python itself (be it threads, an event
> loop, or PyOS_InputHook).

I gave you a link to a 200 line script which implements the core idea for GTK
without any ipython at all.  I explained that in my message.  I don't know how
to be more specific, ipython-independent or clear with you.

> 4) Call me a sentimental fool, but I just happen to like regular Python.

That's fine.  I'd argue that ipython is exceptionally useful in a scientific
computing workflow, but I'm obviously biased.  Many others in the scientific
community seem to agree with me, though, given the frequency of ipython prompts
in posts to the scientific computing lists.  

But this is free software in a free world: use whatever you like.  All I'm
interested in is clarifying a technical issue, not evangelizing ipython;
that's why I gave you a link to a non-ipython example which implements the key
idea using only the standard python library.

> My apologies in advance to the IPython developers if I misunderstood how
> it works.

No problem.  But your posts so far seem to indicate you hardly read what I said,
as I've had to repeat several key points over and over (the non-ipython
solutions, for example).

Cheers,

f


From ironfroggy at gmail.com  Wed Nov 16 08:16:49 2005
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Wed, 16 Nov 2005 02:16:49 -0500
Subject: [Python-Dev] Behavoir question.
In-Reply-To: <dldr4m$n0v$1@sea.gmane.org>
References: <dldr4m$n0v$1@sea.gmane.org>
Message-ID: <76fd5acf0511152316g68164f7em1f4fac0fc4b1d976@mail.gmail.com>

On 11/15/05, Scott David Daniels <Scott.Daniels at acm.org> wrote:
> Since I am fiddling with int/long conversions to/from string:
>
> Is the current behavior intentional (or mandatory?):
>
>      v = int('       55555555555555555555555555555555555555555       ')
> works, but:
>      v = int('       55555555555555555555555555555555555555555L      ')
> fails.
>
> --Scott David Daniels
> Scott.Daniels at Acm.Org

int(s) works where s is a string representing a number. 10L does not
represent a number directly, but is Python syntax for making an
integer constant a long, and not an int. (Consider that both are
representations of mathematical integers, though in Python we only call
one of them an integer by terminology.)

So, what you're asking is like asking for list('[1,2]') to return [1,2]. If you
need this functionality, maybe you need a regex match and eval().
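
A minimal sketch of the regex route, written in present-day Python and
using int() on the matched digits instead of evaluating the string.
The helper name parse_int_literal is hypothetical, and it only handles
plain decimal digits with an optional sign and an optional trailing
L or l.

```python
import re

# Optional surrounding whitespace, optional sign, decimal digits,
# then at most one trailing 'L' or 'l'.
_INT_LITERAL = re.compile(r"\s*([+-]?\d+)[lL]?\s*")


def parse_int_literal(text):
    """Convert '  123  ' or '  123L  ' to an integer."""
    match = _INT_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError("not an integer literal: %r" % (text,))
    return int(match.group(1))
```

Anything that does not match the pattern, such as a bare 'L' or
'12 L', raises ValueError just as int() would.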

From oliphant at ee.byu.edu  Wed Nov 16 08:20:47 2005
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed, 16 Nov 2005 00:20:47 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
Message-ID: <437ADDCF.7080906@ee.byu.edu>


I know (thanks to Google) that much has been said in the past about the 
Python Memory Manager.  My purpose in posting is simply to give a 
use-case example of how the current memory manager (in Python 2.4.X) can 
be problematic in scientific/engineering code.

Scipy core is a replacement for Numeric.  One of the things scipy core 
does is define a new python scalar object for every data type that an 
array can have (currently 21).   This has many advantages and is made 
feasible by the ability of Python to subtype in C.   These scalars all 
inherit from the standard Python types where there is a correspondence.

More to the point, however, these scalar objects were allocated using 
the standard PyObject_New and PyObject_Del functions which of course use 
the Python memory manager.    One user ported his (long-running) code to 
the new scipy core and found much to his dismay that what used to 
consume around 100MB now completely dominated his machine consuming up 
to 2GB of memory after only a few iterations.  After searching many 
hours for memory leaks in scipy core (not a bad exercise anyway as some 
were found), the real problem was tracked to the fact that his code 
ended up creating and destroying many of these new array scalars.  

The Python memory manager was not reusing memory (even though 
PyObject_Del was being called).  I don't know enough about the memory 
manager to understand why that was happening.  However, changing the 
allocation from PyObject_New to malloc and from PyObject_Del to free, 
fixed the problems this user was seeing.   Now the code runs for a long 
time consuming only around 100MB at-a-time.

Thus, all of the objects in scipy core now use system malloc and system 
free for their memory needs.   Perhaps this is unfortunate, but it was 
the only solution I could see in the short term.

In the long term, what is the status of plans to re-work the Python 
Memory manager to free memory that it acquires (or improve the detection 
of already freed memory locations)?  I see from other postings that this 
has been a problem for other people as well.   Also, is there a 
recommended way for dealing with this problem other than using system 
malloc and system free (or I suppose writing your own specialized memory 
manager)?

Thanks for any feedback,


-Travis Oliphant



From bcannon at gmail.com  Wed Nov 16 09:56:41 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 16 Nov 2005 00:56:41 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
Message-ID: <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>

On 11/15/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> >
> > Thanks for the message.  I was going to suggest the same thing.  I
> > think it's primarily a question of how to add an arena layer.  The AST
> > phase has a mixture of malloc/free and Python object allocation.  It
> > should be straightforward to change the malloc/free code to use an
> > arena API.  We'd probably need a separate mechanism to associate a set
> > of PyObject* with the arena and have those DECREFed.
>
> Well good.  It seems we all agree there is a problem and on the
> general solution.  I haven't thought about Brett's idea to see if it
> could work or not.  It would be great if we had someone start working
> to improve the situation.  It could well be that we live with the
> current code for 2.5, but it would be great to use arenas for 2.6 at
> least.
>

I have been thinking about this some more to put off doing homework,
and I have some random ideas I just wanted to toss out there to make
sure I am not thinking about arena memory management incorrectly
(I have never actually encountered it directly before).

I think an arena API is going to be the best solution.  Pulling
trickery with redefining Py_INCREF and such, like I suggested, seems
like a pain and possibly error-prone.  With the compiler being a
specific corner of the core, having a special API for handling the
memory for PyObject* stuff seems reasonable.

We might need PyArena_Malloc() and PyArena_New() to handle malloc()
and PyObject* creation.  We could then have a struct that just stores
pointers to the allocated memory.  That could be a linked list with one
node per pointer, which has high memory overhead, or a linked list of
arrays, which should lower the overhead but makes the holes left in the
arrays by already-freed items a pain to handle.  We would then have a
PyArena_FreeAll() that would be strategically placed in the code for
when bad things happen and that would just traverse the lists and free
everything.  I assume having a way to free individual items might be
useful.  PyArena_New() and PyArena_Malloc() could return structs with
the info needed for a PyArena_Free(location_struct) to free the specific
item without triggering a complete freeing of all memory.  But this
usage should be discouraged and only used when proper memory management
is guaranteed.
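
A minimal sketch of the kind of arena described above, in plain C so it
stands alone.  All names here (arena_new, arena_track, arena_malloc,
arena_free_all) are made up for illustration and are not an existing
CPython API.  It uses the one-node-per-pointer linked list, and a generic
cleanup callback stands in for whatever would DECREF a PyObject*.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not an existing CPython API. */
typedef void (*arena_cleanup_fn)(void *);

typedef struct arena_entry {
    struct arena_entry *next;
    void *ptr;                 /* memory or object owned by the arena */
    arena_cleanup_fn cleanup;  /* how to release ptr (free, a DECREF wrapper, ...) */
} arena_entry;

typedef struct arena {
    arena_entry *head;  /* most recently registered entry first */
    size_t count;       /* number of entries currently owned */
} arena;

arena *arena_new(void)
{
    arena *a = malloc(sizeof(*a));
    if (a != NULL) {
        a->head = NULL;
        a->count = 0;
    }
    return a;
}

/* Hand ptr over to the arena.  Returns 0 on success, -1 on failure;
   on failure the caller still owns ptr. */
int arena_track(arena *a, void *ptr, arena_cleanup_fn cleanup)
{
    arena_entry *e;
    assert(a != NULL && ptr != NULL && cleanup != NULL);
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->ptr = ptr;
    e->cleanup = cleanup;
    e->next = a->head;
    a->head = e;
    a->count++;
    return 0;
}

/* malloc() replacement: the block is released by arena_free_all(). */
void *arena_malloc(arena *a, size_t size)
{
    void *p = malloc(size);
    if (p != NULL && arena_track(a, p, free) < 0) {
        free(p);
        return NULL;
    }
    return p;
}

/* Release everything the arena owns, then the arena itself. */
void arena_free_all(arena *a)
{
    arena_entry *e, *next;
    if (a == NULL)
        return;
    for (e = a->head; e != NULL; e = next) {
        next = e->next;
        e->cleanup(e->ptr);
        free(e);
    }
    free(a);
}
```

With something like this, an error path needs only one arena_free_all()
call instead of a hand-written cleanup for every variable.  Freeing
individual items early is deliberately left out, since that is the part
that brings the bookkeeping problems back.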

Boy am I wanting RAII from C++ for automatic freeing when scope is
left.  Maybe we need to come up with a similar thing, like all memory
that should be freed once a scope is left must use some special struct
that stores references to all created memory locally and then a free
call must be made at all exit points in the function using the special
struct.  Otherwise the pointer is stored in the arena and handled
en masse later.

Hopefully this all made some sense.  =)  Is this the basic strategy
that an arena setup would need?  If not, can someone enlighten me?


-Brett

From krumms at gmail.com  Wed Nov 16 10:49:50 2005
From: krumms at gmail.com (Thomas Lee)
Date: Wed, 16 Nov 2005 19:49:50 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
Message-ID: <437B00BE.7060007@gmail.com>

As the writer of the crappy code that sparked this conversation, I feel 
I should say something :)

Brett Cannon wrote:

>On 11/15/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
>  
>
>>On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
>>    
>>
>>>Thanks for the message.  I was going to suggest the same thing.  I
>>>think it's primarily a question of how to add an arena layer.  The AST
>>>phase has a mixture of malloc/free and Python object allocation.  It
>>>should be straightforward to change the malloc/free code to use an
>>>arena API.  We'd probably need a separate mechanism to associate a set
>>>of PyObject* with the arena and have those DECREFed.
>>>      
>>>
>>Well good.  It seems we all agree there is a problem and on the
>>general solution.  I haven't thought about Brett's idea to see if it
>>could work or not.  It would be great if we had someone start working
>>to improve the situation.  It could well be that we live with the
>>current code for 2.5, but it would be great to use arenas for 2.6 at
>>least.
>>
>>    
>>
>
> I have been thinking about this some more  to put off doing homework
>and I have some random ideas I just wanted to toss out there to make
>sure I am not thinking about arena memory management incorrectly
>(never actually encountered it directly before).
>
>I think an arena API is going to be the best solution.  Pulling
>trickery with redefining Py_INCREF and such like I suggested seems
>like a pain and possibly error-prone.  With the compiler being a
>specific corner of the core having a special API for handling the
>memory for PyObject* stuff seems reasonable.
>
>  
>
I agree. And it raises the learning curve for poor saps like myself. :)

>We might need PyArena_Malloc() and PyArena_New() to handle malloc()
>and PyObject* creation.  We could then have a struct that just stored
>pointers to the allocated memory (linked list for each pointer which
>gives high memory overhead or linked list of arrays that should lower
>memory but make having possible holes in the array for stuff already
>freed a pain to handle).  We would then have PyArena_FreeAll() that
>would be strategically placed in the code for when bad things happen
>that would just traverse the lists and free everything.  I assume
>having a way to free individual items might be useful.  Could have the
>PyArena_New() and _Malloc() return structs with the needed info for a
>PyArena_Free(location_struct) to be able to free the specific item
>without triggering a complete freeing of all memory.  But this usage
>should be discouraged and only used when proper memory management is
>guaranteed.
>
>  
>
An arena/pool (as I understood it from my quick skim) for the AST would 
probably best be implemented (IMHO) as an ADT based on a linked-list:

typedef struct _ast_pool_node {
  struct _ast_pool_node *next;
  PyObject *object; /* == NULL when data != NULL */
  void *data; /* == NULL when object != NULL */
} ast_pool_node;

deallocating a node could then be as simple as:

/* ast_pool_node *n */
ast_pool_node *next = n->next;  /* save before n itself is freed */
Py_XDECREF(n->object);          /* drop our reference; NULL is allowed */
if (n->data != NULL)
  free(n->data);
free(n);
/* then go on to free next */

I haven't really thought all that deeply about this, so somebody shoot 
me down if I'm completely off-base (Neal? :D). Every allocation of a 
seq/stmt within ast.c would have its memory saved to the pool within the 
function it's allocated in. Then before we return, we can just 
deallocate the pool/arena/whatever you want to call it.

The problem with this is that should we get to the end of the function 
and everything actually went okay (i.e. we return non-NULL), we then 
have to run through and deallocate all the nodes anyway (without 
deallocating n->object or n->data). Bah. Maybe we *would* be better off 
with a monolithic cleanup. I don't know.

>Boy am I wanting RAII from C++ for automatic freeing when scope is
>left.  Maybe we need to come up with a similar thing, like all memory
>that should be freed once a scope is left must use some special struct
>that stores references to all created memory locally and then a free
>call must be made at all exit points in the function using the special
>struct.  Otherwise the pointer is stored in the arena and handled
>en masse later.
>
>  
>
Which is basically what I just rambled on about up above, I think :)

>Hopefully this all made some sense.  =)  Is this the basic strategy
>that an arena setup would need?  If not, can someone enlighten me?
>
>
>-Brett


From ncoghlan at gmail.com  Wed Nov 16 11:33:22 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Nov 2005 20:33:22 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <4379D40E.9050002@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<5f3d2c310511150422x3e2d670r@mail.gmail.com>
	<4379D40E.9050002@gmail.com>
Message-ID: <437B0AF2.7010400@gmail.com>

Nick Coghlan wrote:
> Marek Baczek Baczyński wrote:
>> 2005/11/15, Nick Coghlan <ncoghlan at iinet.net.au>:
>>> It avoids the potential for labelling problems that arises when goto's are
>>> used for resource cleanup. It's a far cry from real exception handling, but
>>> it's the best solution I've seen within the limits of C.
>> <delurk>
>> do {
>>     ....
>>     ....
>> } while (0);
>>
>>
>> Same benefit and saves some typing :)
> 
> Heh. Good point. I spend so much time working with a certain language I tend 
> to forget do/while loops exist ;)

Thomas actually tried doing things this way, and the parser/compiler code 
needs to use loops, which means this trick won't work reliably.

So we'll need to do something smarter (such as the arena idea) to deal with 
the memory allocation problem.
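
For reference, a small sketch of the do/while(0) trick and of why loops
get in its way.  The functions build and build_many and the fail_step
and fail_index parameters are invented for the example; they only
simulate failures and are not taken from the parser code.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 on success, -1 on failure.  fail_step simulates a failure at
   step 1 or 2 (0 means no failure); *freed counts the buffers released. */
int build(int fail_step, int *freed)
{
    char *a = NULL, *b = NULL;
    int result = -1;

    do {
        a = malloc(16);
        if (a == NULL || fail_step == 1)
            break;              /* jumps to the cleanup below */
        b = malloc(16);
        if (b == NULL || fail_step == 2)
            break;
        result = 0;
    } while (0);

    /* single cleanup point, reached on success and on failure */
    if (a != NULL) { free(a); (*freed)++; }
    if (b != NULL) { free(b); (*freed)++; }
    return result;
}

/* The pitfall: inside a real loop, break leaves only that loop, so the
   error has to be carried out in a flag and tested again. */
int build_many(int n, int fail_index)
{
    int result = -1;

    do {
        int i, failed = 0;
        for (i = 0; i < n; i++) {
            if (i == fail_index) {
                failed = 1;
                break;          /* exits the for loop, not the do block */
            }
        }
        if (failed)
            break;              /* a second break is needed here */
        result = 0;
    } while (0);

    return result;
}
```

Forgetting the second break in build_many would let the code carry on as
if nothing had failed, which is the kind of unreliability meant here.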

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From skip at pobox.com  Wed Nov 16 11:59:03 2005
From: skip at pobox.com (skip@pobox.com)
Date: Wed, 16 Nov 2005 04:59:03 -0600
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437ADDCF.7080906@ee.byu.edu>
References: <437ADDCF.7080906@ee.byu.edu>
Message-ID: <17275.4343.724248.625173@montanaro.dyndns.org>


    Travis> More to the point, however, these scalar objects were allocated
    Travis> using the standard PyObject_New and PyObject_Del functions which
    Travis> of course use the Python memory manager.  One user ported his
    Travis> (long-running) code to the new scipy core and found much to his
    Travis> dismay that what used to consume around 100MB now completely
    Travis> dominated his machine consuming up to 2GB of memory after only a
    Travis> few iterations.  After searching many hours for memory leaks in
    Travis> scipy core (not a bad exercise anyway as some were found), the
    Travis> real problem was tracked to the fact that his code ended up
    Travis> creating and destroying many of these new array scalars.

What Python object were his array elements a subclass of?

    Travis> In the long term, what is the status of plans to re-work the
    Travis> Python Memory manager to free memory that it acquires (or
    Travis> improve the detection of already freed memory locations).  

None that I'm aware of.  It's seen a great deal of work in the past and
generally doesn't cause problems.  Maybe your user's usage patterns were
a bad corner case.  It's hard to tell without more details.

Skip

From niko at alum.mit.edu  Wed Nov 16 12:34:29 2005
From: niko at alum.mit.edu (Niko Matsakis)
Date: Wed, 16 Nov 2005 12:34:29 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
Message-ID: <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>

> Boy am I wanting RAII from C++ for automatic freeing when scope is
> left.  Maybe we need to come up with a similar thing, like all memory
> that should be freed once a scope is left must use some special struct
> that stores references to all created memory locally and then a free
> call must be made at all exit points in the function using the special
> struct.  Otherwise the pointer is stored in the arena and handled
> en masse later.

That made sense.  I think I'd be opposed to what you describe here
just because I think anything which *requires* that cleanup code be
placed in every function is error-prone.

Depending on how much you care about peak memory usage, you do not  
necessarily need to worry about freeing pointers as you go.  If you  
can avoid thinking about it, it makes things much simpler.

If you are concerned with peak memory usage, it gets more  
complicated, and you will begin to have greater possibility of user  
error.  The problem is that dynamically allocated memory often  
outlives the stack frame in which it was created.  There are several  
possibilities:

- If you use ref-counted memory, you can add to the ref count of the  
memory which outlives the stack frame; the problem is knowing when to  
drop it down again.  I think the easiest is to have two lists: one  
for memory which will go away quickly, and another for more permanent  
memory.  The more permanent memory list goes away at the end of the  
transform and is hopefully rarely used.

- Another idea is to have trees of arenas: the idea is that when an
arena is created, it is assigned a parent.  When an arena is freed,
any arenas in its subtree are also freed.  This way you can have one
master arena for exception handling, but if there is some sub-region
where allocations can be grouped together, you create a sub-arena and
free it when that region is complete.  Note that if you forget to
free a sub-arena, it will eventually be freed along with its parent.
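
A rough sketch of the tree-of-arenas idea in plain C.  The names pool,
pool_new, pool_alloc, pool_child_count and pool_free are hypothetical,
and each allocation gets its own bookkeeping node to keep the example
short; a real implementation would more likely carve allocations out of
larger chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; a sketch of parent/child arenas, not a real API. */
typedef struct pool_block {
    struct pool_block *next;
    void *mem;                   /* one allocation owned by the pool */
} pool_block;

typedef struct pool {
    struct pool *parent;
    struct pool *first_child;
    struct pool *next_sibling;
    pool_block *blocks;
} pool;

/* Create a pool; a non-NULL parent takes responsibility for freeing it. */
pool *pool_new(pool *parent)
{
    pool *p = malloc(sizeof(*p));
    if (p == NULL)
        return NULL;
    p->parent = parent;
    p->first_child = NULL;
    p->next_sibling = NULL;
    p->blocks = NULL;
    if (parent != NULL) {
        p->next_sibling = parent->first_child;
        parent->first_child = p;
    }
    return p;
}

/* Allocate size bytes (size > 0) that live as long as the pool does. */
void *pool_alloc(pool *p, size_t size)
{
    pool_block *b = malloc(sizeof(*b));
    if (b == NULL)
        return NULL;
    b->mem = malloc(size);
    if (b->mem == NULL) {
        free(b);
        return NULL;
    }
    b->next = p->blocks;
    p->blocks = b;
    return b->mem;
}

/* Number of direct children; only here to make the tree shape visible. */
size_t pool_child_count(const pool *p)
{
    size_t n = 0;
    const pool *c;
    for (c = p->first_child; c != NULL; c = c->next_sibling)
        n++;
    return n;
}

/* Free p's subtree and blocks; the caller has already unlinked p. */
static void pool_destroy(pool *p)
{
    pool *c, *next_c;
    pool_block *b, *next_b;
    for (c = p->first_child; c != NULL; c = next_c) {
        next_c = c->next_sibling;
        pool_destroy(c);
    }
    for (b = p->blocks; b != NULL; b = next_b) {
        next_b = b->next;
        free(b->mem);
        free(b);
    }
    free(p);
}

/* Free a pool early: unlink it from its parent, then free its subtree. */
void pool_free(pool *p)
{
    pool **link;
    if (p == NULL)
        return;
    if (p->parent != NULL) {
        for (link = &p->parent->first_child; *link != NULL;
             link = &(*link)->next_sibling) {
            if (*link == p) {
                *link = p->next_sibling;
                break;
            }
        }
    }
    pool_destroy(p);
}
```

Freeing a sub-pool early lowers peak memory, while a forgotten sub-pool
is still reclaimed when its parent goes away, which is the safety net
described in the second bullet.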

There is no one-size-fits-all solution.  The right one depends on how  
memory is used; but I think all of them are much simpler and less  
error prone than tracking individual pointers.

I'd actually be happy to hack on the AST code and try to clean up the  
memory usage, assuming that the 2.6 release is far enough out that I  
will have time to squeeze it in among the other things I am doing.


Niko

From krumms at gmail.com  Wed Nov 16 13:05:09 2005
From: krumms at gmail.com (Thomas Lee)
Date: Wed, 16 Nov 2005 22:05:09 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
Message-ID: <437B2075.1000102@gmail.com>

Niko Matsakis wrote:

>>Boy am I wanting RAII from C++ for automatic freeing when scope is
>>left.  Maybe we need to come up with a similar thing, like all memory
>>that should be freed once a scope is left must use some special struct
>>that stores references to all created memory locally and then a free
>>call must be made at all exit points in the function using the special
>>struct.  Otherwise the pointer is stored in the arena and handled
>>en masse later.
>>    
>>
>
>That made sense.  I think I'd be opposed to what you describe here  
>just because I think anything which *requires* that cleanup code be  
>placed on every function is error prone.
>
>  
>
Placing it in every function isn't really the problem: at the moment 
it's more the fact we have to keep track of too many variables at any 
given time to properly deallocate it all. Cleanup code gets tricky very 
fast.

Then it gets further complicated by the fact that 
stmt_ty/expr_ty/mod_ty/etc. deallocate members (usually asdl_seq 
instances in my experience) - so if a construction takes place, all of a 
sudden you have to make sure you don't deallocate those members a second 
time in the cleanup code :S it gets tricky very quickly.

Even if it meant we had just one function call - one, safe function call 
that deallocated all the memory allocated within a function - that we 
had to put before each and every return, that's better than what we 
have. Is it the best solution? Maybe not. But that's what we're looking 
for here I guess :)


From krumms at gmail.com  Wed Nov 16 13:11:26 2005
From: krumms at gmail.com (Thomas Lee)
Date: Wed, 16 Nov 2005 22:11:26 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <437B2075.1000102@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com>
Message-ID: <437B21EE.5040804@gmail.com>

By the way, I liked the sound of the arena/pool tree - really good idea.



From fredrik at pythonware.com  Wed Nov 16 13:05:42 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 16 Nov 2005 13:05:42 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com><13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com>
Message-ID: <dlf7ak$ckg$1@sea.gmane.org>

Thomas Lee wrote:

> Even if it meant we had just one function call - one, safe function call
> that deallocated all the memory allocated within a function - that we
> had to put before each and every return, that's better than what we
> have.

alloca?

(duck)

</F> 




From collinw at gmail.com  Wed Nov 16 14:09:23 2005
From: collinw at gmail.com (Collin Winter)
Date: Wed, 16 Nov 2005 14:09:23 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
Message-ID: <43aa6ff70511160509y5abdd8a9y4ec8c131e429b4c0@mail.gmail.com>

On 11/16/05, Niko Matsakis <niko at alum.mit.edu> wrote:
> - Another idea is to have trees of arenas: the idea is that when an
> arena is created, it is assigned a parent.  When an arena is freed,
> any arenas in its subtree are also freed.  This way you can have one
> master arena for exception handling, but if there is some sub-region
> where allocations can be grouped together, you create a sub-arena and
> free it when that region is complete.  Note that if you forget to
> free a sub-arena, it will eventually be freed.

You might be able to draw some inspiration from the Apache Portable
Runtime. It includes a memory pool management scheme that might be of
some interest.

The main project page is http://apr.apache.org, with the docs for the
mempool API located at
http://apr.apache.org/docs/apr/group__apr__pools.html

Collin Winter

From ncoghlan at gmail.com  Wed Nov 16 14:11:02 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Nov 2005 23:11:02 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <437B00BE.7060007@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<437B00BE.7060007@gmail.com>
Message-ID: <437B2FE6.7080206@gmail.com>

Thomas Lee wrote:
> As the writer of the crappy code that sparked this conversation, I feel 
> I should say something :)

Don't feel bad about it. It turned out the 'helpful' review comments from Neal 
and me didn't originally work out very well either ;)

With the AST compiler being so new, this is the first serious attempt to 
introduce modifications based on it. It's already better than the old CST 
compiler, but that memory management in the parser is a cow :)

>> Hopefully this all made some sense.  =)  Is this the basic strategy
>> that an arena setup would need?  If not, can someone enlighten me?

I think we need to be explicit about the problems we're trying to solve before 
deciding on what kind of solution we want :)

1. Cleaning up after failures in symtable.c and compile.c
   It turns out this is already dealt with in the case of code blocks - the 
compiler state handles a linked list of blocks which it automatically frees 
when the compiler state is cleaned up.
   So the only rule that needs to be followed in these files is to *never* 
call any of the VISIT_* macros while there is a Python object which requires 
DECREF'ing, or a C pointer which needs to be freed.
   This rule was being broken in a couple of places in compile.c (with respect 
to strings). I was the offender in both cases I found - the errors date from 
when this was still on the ast-branch in CVS.
   I've fixed those errors in SVN, and added a note to the comment at the top 
of compile.c, to help others avoid making the same mistake I did.
   It's fragile in some ways, but it does work. It makes the actual 
compilation code look clean (because there isn't any cleanup code), but it 
also makes that code look *wrong* (because the lack of cleanup code makes the 
calls to "compiler_new_block" look unbalanced), which is a little disconcerting.

2. Parsing a token stream into the AST in ast.c
   This is the bit that has caused Thomas grief (the PEP 341 patch only needs 
to modify the front end parser). When building an AST node, each of the 
contained AST nodes or sequences has to be built first. That means that, if 
there's a problem with any of the later subnodes, the earlier subnodes need to 
be freed.
   The key problem with memory management in this module is that the free 
method to be invoked is dependent on the nature of the AST node to be freed. 
In the case of a node sequence, it is dependent on the nature of the contained 
elements.
   So not only do you have to remember to free the memory, you have to 
remember to free it the *right way*.

Would it be worth the extra memory needed to store a pointer to an AST node's 
"free" method in the AST type structure itself? And do the same for ASDL 
sequences?

Then a simple FREE_AST macro would be able to "do the right thing" when it 
came to freeing either AST nodes or sequences. In particular, ASDL sequences 
would be able to free their contents without knowing what those contents 
actually are.

That wouldn't eliminate the problem with memory leaks or double-deletion, but 
it would eliminate some of the mental overhead of dealing with figuring out 
which freeing function to invoke.
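
A minimal sketch of what that could look like, using stand-in types
rather than the real stmt_ty/expr_ty/asdl_seq definitions.  The names
ast_header, name_node, seq and the live_nodes counter are assumptions
made for the example; only FREE_AST comes from the suggestion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; the real AST and ASDL sequence layouts differ. */
typedef struct ast_header {
    void (*free_fn)(void *self);  /* knows how to free this kind of node */
} ast_header;

/* Works on anything whose first member is an ast_header; NULL is ignored. */
#define FREE_AST(p) \
    do { \
        ast_header *h_ = (ast_header *)(p); \
        if (h_ != NULL) \
            h_->free_fn(h_); \
    } while (0)

size_t live_nodes = 0;  /* bookkeeping so the sketch can be checked */

typedef struct name_node {
    ast_header head;
    char *id;
} name_node;

typedef struct seq {
    ast_header head;
    size_t size;
    void *elements[];   /* each element starts with an ast_header */
} seq;

static void name_free(void *self)
{
    name_node *n = self;
    free(n->id);
    free(n);
    live_nodes--;
}

name_node *name_new(const char *id)
{
    size_t len = strlen(id) + 1;
    name_node *n = malloc(sizeof(*n));
    if (n == NULL)
        return NULL;
    n->id = malloc(len);
    if (n->id == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->id, id, len);
    n->head.free_fn = name_free;
    live_nodes++;
    return n;
}

static void seq_free(void *self)
{
    seq *s = self;
    size_t i;
    for (i = 0; i < s->size; i++)
        FREE_AST(s->elements[i]);  /* no need to know the element type */
    free(s);
    live_nodes--;
}

seq *seq_new(size_t size)
{
    size_t i;
    seq *s = malloc(sizeof(*s) + size * sizeof(void *));
    if (s == NULL)
        return NULL;
    s->head.free_fn = seq_free;
    s->size = size;
    for (i = 0; i < size; i++)
        s->elements[i] = NULL;
    live_nodes++;
    return s;
}
```

The cost is one function pointer per node and per sequence.  As noted,
this only picks the right free function; it does nothing about freeing
a node twice or not at all.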

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From krumms at gmail.com  Wed Nov 16 15:15:20 2005
From: krumms at gmail.com (Thomas Lee)
Date: Thu, 17 Nov 2005 00:15:20 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <437B2FE6.7080206@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>	<437B00BE.7060007@gmail.com>
	<437B2FE6.7080206@gmail.com>
Message-ID: <437B3EF8.2030001@gmail.com>

Just messing around with some ideas. I was trying to avoid the ugly 
macros (note my earlier whinge about a learning curve) but they're the 
cleanest way I could think of to get around the problem without 
resorting to a mass deallocation right at the end of the AST run. Which 
may not be all that bad given we're going to keep everything in-memory 
anyway until an error occurs ... anyway, anyway, I'm getting sidetracked :)

The idea is to ensure that all allocations within a single function are 
made using the pool so that a function finishes what it starts. This 
way, if the function fails it alone is responsible for cleaning up its 
own pool and that's all. No funkiness needed for sequences, because each 
member of the sequence belongs to the pool too. Note that the stmt_ty 
instances are also allocated using the pool.

This breaks interfaces all over the place though. Not exactly a pretty 
change :) But yeah, maybe somebody smarter than I will come up with 
something a bit cleaner.

--

/* snip! */

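/* AST_SUCCESS is an expression so that "return AST_SUCCESS(...)" works;
   AST_FAILURE is wrapped in do/while(0) so it is safe after an unbraced if */
#define AST_SUCCESS(pool, result) (result)
#define AST_FAILURE(pool, result) \
    do { asdl_pool_free(pool); return (result); } while (0)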

static stmt_ty
ast_for_try_stmt(struct compiling *c, const node *n)
{
    /* with the pool stuff, we wouldn't need to declare _all_ the variables
       here either. I'm just lazy. */

    asdl_pool *pool;
    int i;
    const int nch = NCH(n);
    int n_except = (nch - 3)/3;
    stmt_ty result_st = NULL, except_st = NULL;
    asdl_seq *body = NULL, *orelse = NULL, *finally = NULL;
    asdl_seq *inner = NULL, *handlers = NULL;

    REQ(n, try_stmt);

    /* c->pool is the parent of pool. when pool is freed
       (via AST_FAILURE), it is also removed from c->pool's list of
       children */
    pool = asdl_pool_new(c->pool);
    if (pool == NULL)
        return NULL;  /* no pool was created, so there is nothing to free */

    body = ast_for_suite(c, CHILD(n, 2));
    if (body == NULL)
        AST_FAILURE(pool, NULL);

    if (TYPE(CHILD(n, nch - 3)) == NAME) {
        if (strcmp(STR(CHILD(n, nch - 3)), "finally") == 0) {
            if (nch >= 9 && TYPE(CHILD(n, nch - 6)) == NAME) {
                /* we can assume it's an "else",
                   because nch >= 9 for try-else-finally and
                   it would otherwise have a type of except_clause */
                orelse = ast_for_suite(c, CHILD(n, nch - 4));
                if (orelse == NULL)
                    AST_FAILURE(pool, NULL);
                n_except--;
            }

            finally = ast_for_suite(c, CHILD(n, nch - 1));
            if (finally == NULL)
                AST_FAILURE(pool, NULL);
            n_except--;
        }
        else {
            /* we can assume it's an "else",
               otherwise it would have a type of except_clause */
            orelse = ast_for_suite(c, CHILD(n, nch - 1));
            if (orelse == NULL)
                AST_FAILURE(pool, NULL);
            n_except--;
        }
    }
    else if (TYPE(CHILD(n, nch - 3)) != except_clause) {
        ast_error(n, "malformed 'try' statement");
        AST_FAILURE(pool, NULL);
    }

    if (n_except > 0) {
        /* process except statements to create a try ... except */
        handlers = asdl_seq_new(pool, n_except);
        if (handlers == NULL)
            AST_FAILURE(pool, NULL);

        for (i = 0; i < n_except; i++) {
            excepthandler_ty e = ast_for_except_clause(c,
                                                    CHILD(n, 3 + i * 3),
                                                    CHILD(n, 5 + i * 3));
            if (!e)
                AST_FAILURE(pool, NULL);
            asdl_seq_SET(handlers, i, e);
        }

        except_st = TryExcept(pool, body, handlers, orelse, LINENO(n));
        if (except_st == NULL)
            AST_FAILURE(pool, NULL);

        /* if a 'finally' is present too, we nest the TryExcept within a
           TryFinally to emulate try ... except ... finally */
        if (finally != NULL) {
            inner = asdl_seq_new(pool, 1);
            if (inner == NULL)
                AST_FAILURE(pool, NULL);
            asdl_seq_SET(inner, 0, except_st);
            result_st = TryFinally(pool, inner, finally, LINENO(n));
            if (result_st == NULL)
                AST_FAILURE(pool, NULL);
        }
        else
            result_st = except_st;
    }
    else {
        /* no exceptions: must be a try ... finally */
        assert(orelse == NULL);
        assert(finally != NULL);
        result_st = TryFinally(pool, body, finally, LINENO(n));
        if (result_st == NULL)
            AST_FAILURE(pool, NULL);
    }

    /* pool deallocated when c->pool is deallocated */
    return AST_SUCCESS(pool, result_st);
}




From jimjjewett at gmail.com  Wed Nov 16 16:29:01 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 16 Nov 2005 10:29:01 -0500
Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook, and Tkinter
Message-ID: <fb6fbf560511160729s7953fb42k3de1fcc23774b4f6@mail.gmail.com>

Phillip J. Eby:

> did you ever try using IPython, and confirm whether it
> does or does not address the issue

As I understand it, using IPython (or otherwise changing
the interactive mode) works fine *if* you just want a point
solution -- get something up in some environment chosen
by the developer.

Michiel is looking to create a component that will work in
whatever environment the *user* chooses.  Telling users
"you must go through this particular interface" is not
acceptable.  Therefore, IPython is only a workaround,
not a solution.

On the other hand, IPython is clearly a *good* workaround.
The dance described in
http://mail.python.org/pipermail/python-dev/2005-November/058057.html
is short enough that a real solution might well be built on
IPython; it just isn't quite done yet.

-jJ

From arigo at tunes.org  Wed Nov 16 17:20:32 2005
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 16 Nov 2005 17:20:32 +0100
Subject: [Python-Dev] Is some magic required to check out new files from
	svn?
In-Reply-To: <17271.15039.201796.513101@montanaro.dyndns.org>
References: <17270.32577.193894.694593@montanaro.dyndns.org>
	<dl6vvl$o9c$1@sea.gmane.org>
	<17271.15039.201796.513101@montanaro.dyndns.org>
Message-ID: <20051116162032.GA17196@code1.codespeak.net>

Hi,

On Sun, Nov 13, 2005 at 07:08:15AM -0600, skip at pobox.com wrote:
> The full svn status output is
> 
>     % svn status
>     !      .
>     !      Python

The "!" definitely means that these items are missing, or for
directories, incomplete in some way.  You need to play around until the
"!" goes away; for example, you may try

    svn revert -R .     # revert to pristine state, recursively

if you have no local changes you want to keep, followed by 'svn up'.  If
it still doesn't help, then I'm lost about the cause and would just
recommend doing a fresh checkout.


A bientot,

Armin.

From oliphant at ee.byu.edu  Wed Nov 16 19:20:48 2005
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed, 16 Nov 2005 11:20:48 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <17275.4343.724248.625173@montanaro.dyndns.org>
References: <437ADDCF.7080906@ee.byu.edu>
	<17275.4343.724248.625173@montanaro.dyndns.org>
Message-ID: <437B7880.10004@ee.byu.edu>

skip at pobox.com wrote:

>    Travis> More to the point, however, these scalar objects were allocated
>    Travis> using the standard PyObject_New and PyObject_Del functions which
>    Travis> of course use the Python memory manager.  One user ported his
>    Travis> (long-running) code to the new scipy core and found much to his
>    Travis> dismay that what used to consume around 100MB now completely
>    Travis> dominated his machine consuming up to 2GB of memory after only a
>    Travis> few iterations.  After searching many hours for memory leaks in
>    Travis> scipy core (not a bad exercise anyway as some were found), the
>    Travis> real problem was tracked to the fact that his code ended up
>    Travis> creating and destroying many of these new array scalars.
>
>What Python object were his array elements a subclass of?
>  
>
These were all scipy core arrays.  The elements were therefore all 
C-like numbers (floats and integers I think).  If he obtained an element 
in Python, he would get an instance of a new "array" scalar object which 
is a builtin extension type written in C.  The important issue though is 
that these "array" scalars were allocated using PyObject_New and 
deallocated using PyObject_Del.  The problem is that the Python memory 
manager did not free the memory. 

>    Travis> In the long term, what is the status of plans to re-work the
>    Travis> Python Memory manager to free memory that it acquires (or
>    Travis> improve the detection of already freed memory locations).  
>
>None that I'm aware of.  It's seen a great deal of work in the past and
>generally doesn't cause problems.  Maybe your user's usage patterns were
>a bad corner case.  It's hard to tell without more details.
>  
>
I think definitely, his usage pattern represented a "bad" corner case.  
An unusable "corner" case in fact.   At any rate, moving to use the 
system free and malloc fixed the immediate problem.  I mainly wanted to 
report the problem here just as another piece of anecdotal evidence.

-Travis


From jcarlson at uci.edu  Wed Nov 16 21:12:31 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 16 Nov 2005 12:12:31 -0800
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437B7880.10004@ee.byu.edu>
References: <17275.4343.724248.625173@montanaro.dyndns.org>
	<437B7880.10004@ee.byu.edu>
Message-ID: <20051116120346.A434.JCARLSON@uci.edu>


Travis Oliphant <oliphant at ee.byu.edu> wrote:
> 
> skip at pobox.com wrote:
> 
> >    Travis> More to the point, however, these scalar objects were allocated
> >    Travis> using the standard PyObject_New and PyObject_Del functions which
> >    Travis> of course use the Python memory manager.  One user ported his
> >    Travis> (long-running) code to the new scipy core and found much to his
> >    Travis> dismay that what used to consume around 100MB now completely
> >    Travis> dominated his machine consuming up to 2GB of memory after only a
> >    Travis> few iterations.  After searching many hours for memory leaks in
> >    Travis> scipy core (not a bad exercise anyway as some were found), the
> >    Travis> real problem was tracked to the fact that his code ended up
> >    Travis> creating and destroying many of these new array scalars.
> >
> >What Python object were his array elements a subclass of?
> 
> These were all scipy core arrays.  The elements were therefore all 
> C-like numbers (floats and integers I think).  If he obtained an element 
> in Python, he would get an instance of a new "array" scalar object which 
> is a builtin extension type written in C.  The important issue though is 
> that these "array" scalars were allocated using PyObject_New and 
> deallocated using PyObject_Del.  The problem is that the Python memory 
> manager did not free the memory. 

This is not a bug, and there don't seem to be any plans to change the
behavior: python.org/sf/1338264

If I remember correctly, arrays from the Python standard library (import
array), as well as numarray and Numeric, all store values in their
pure C representations (they don't use PyObject_New unless someone uses
the Python interface to fetch a particular element).  This saves the
overhead of allocating base objects, as well as the 3-5x space blowup
when using Python integers (depending on whether your platform has 32 or
64 bit ints).
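
The difference between packed C storage and per-element objects can be
checked with a short sketch.  This assumes a CPython recent enough to
provide sys.getsizeof; the variable names are made up for the example,
and the ratio it measures is for floats on whatever interpreter runs
it, not the 3-5x figure quoted above for integers.

```python
import sys
from array import array

# 1000 doubles stored back to back in their C representation.
values = array("d", [float(i) for i in range(1000)])

packed_bytes_per_item = values.itemsize     # size of one C double
boxed_bytes_per_item = sys.getsizeof(1.0)   # size of one Python float object

# Indexing is the point where a Python object gets created for an element.
element = values[0]
```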


> I think definitely, his usage pattern represented a "bad" corner case.  
> An unusable "corner" case in fact.   At any rate, moving to use the 
> system free and malloc fixed the immediate problem.  I mainly wanted to 
> report the problem here just as another piece of anecdotal evidence.

On the one hand, using PyObjects embedded in an array in scientific
Python is a good idea; you can use all of the standard Python
manipulations on them.  On the other hand, other similar projects have
found it more efficient to never embed PyObjects in their arrays, and
just allocate them as necessary on access.

 - Josiah


From robert.kern at gmail.com  Wed Nov 16 21:41:00 2005
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 16 Nov 2005 12:41:00 -0800
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <20051116120346.A434.JCARLSON@uci.edu>
References: <17275.4343.724248.625173@montanaro.dyndns.org>	<437B7880.10004@ee.byu.edu>
	<20051116120346.A434.JCARLSON@uci.edu>
Message-ID: <dlg5gt$q1g$1@sea.gmane.org>

Josiah Carlson wrote:
> Travis Oliphant <oliphant at ee.byu.edu> wrote:

>>I think definitely, his usage pattern represented a "bad" corner case.  
>>An unusable "corner" case in fact.   At any rate, moving to use the 
>>system free and malloc fixed the immediate problem.  I mainly wanted to 
>>report the problem here just as another piece of anecdotal evidence.
> 
> On the one hand, using PyObjects embedded in an array in scientific
> Python is a good idea; you can use all of the standard Python
> manipulations on them.  On the other hand, other similar projects have
> found it more efficient to never embed PyObjects in their arrays, and
> just allocate them as necessary on access.

That's not what we're doing[1]. The scipy_core arrays here are just
blocks of C doubles. However, the offending code (I believe Chris
Fonnesbeck's PyMC, but I could be mistaken) frequently indexes into
these arrays to get scalar values. In scipy_core, we've defined a set of
numerical types that generally behave like Python ints and floats but
have the underlying storage of the appropriate C data type and have the
various array attributes and methods. When the result of an indexing
operation is a scalar (e.g., arange(10)[0]), it always returns an
instance of the appropriate scalar type. We are "just allocat[ing] them
as necessary on access."

[1] There *is* an array type for general PyObjects in scipy_core, but
that's not being used in the code that blows up and has nothing to do
with the problem Travis is talking about.

-- 
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter


From jcarlson at uci.edu  Thu Nov 17 00:04:32 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 16 Nov 2005 15:04:32 -0800
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <dlg5gt$q1g$1@sea.gmane.org>
References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org>
Message-ID: <20051116145820.A43A.JCARLSON@uci.edu>


Robert Kern <robert.kern at gmail.com> wrote:
> 
> [1] There *is* an array type for general PyObjects in scipy_core, but
> that's not being used in the code that blows up and has nothing to do
> with the problem Travis is talking about.

I seemed to have misunderstood the discussion.  Was the original user
accessing and saving copies of many millions of these doubles?  That's
the only way that I would be able to explain the huge overhead, and in
that case, perhaps the user should have been storing them in scipy
arrays (or even Python array.arrays).

 - Josiah


From oliphant at ee.byu.edu  Thu Nov 17 00:47:48 2005
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed, 16 Nov 2005 16:47:48 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <20051116145820.A43A.JCARLSON@uci.edu>
References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org>
	<20051116145820.A43A.JCARLSON@uci.edu>
Message-ID: <437BC524.2030105@ee.byu.edu>

Josiah Carlson wrote:

>Robert Kern <robert.kern at gmail.com> wrote:
>  
>
>>[1] There *is* an array type for general PyObjects in scipy_core, but
>>that's not being used in the code that blows up and has nothing to do
>>with the problem Travis is talking about.
>>    
>>
>
>I seemed to have misunderstood the discussion.  Was the original user
>accessing and saving copies of many millions of these doubles?  
>
He *was* accessing them (therefore generating a call to an array-scalar 
object creation function).  But they *weren't being* saved.  They were 
being deleted soon after access.   That's why it was so confusing that 
his memory usage should continue to grow and grow so terribly.

As verified by removing usage of the Python PyObject_MALLOC function, it 
was the Python memory manager that was performing poorly.   Even though 
the array-scalar objects were deleted, the memory manager would not 
re-use their memory for later object creation. Instead, the memory 
manager kept allocating new arenas to cover the load (when it should 
have been able to re-use the old memory that had been freed by the 
deleted objects--- again, I don't know enough about the memory manager 
to say why this happened).

The fact that it did happen is what I'm reporting on.  If nothing will 
be done about it (which I can understand), at least this thread might 
help somebody else in a similar situation track down why their Python 
process consumes all of their memory even though their objects are being 
deleted appropriately.

Best,

-Travis


From nnorwitz at gmail.com  Thu Nov 17 01:08:48 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 16 Nov 2005 16:08:48 -0800
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437BC524.2030105@ee.byu.edu>
References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org>
	<20051116145820.A43A.JCARLSON@uci.edu> <437BC524.2030105@ee.byu.edu>
Message-ID: <ee2a432c0511161608n3d16ec63id47c8fc585b6efd1@mail.gmail.com>

On 11/16/05, Travis Oliphant <oliphant at ee.byu.edu> wrote:
>
> As verified by removing usage of the Python PyObject_MALLOC function, it
> was the Python memory manager that was performing poorly.   Even though
> the array-scalar objects were deleted, the memory manager would not
> re-use their memory for later object creation. Instead, the memory
> manager kept allocating new arenas to cover the load (when it should
> have been able to re-use the old memory that had been freed by the
> deleted objects--- again, I don't know enough about the memory manager
> to say why this happened).

Can you provide a minimal test case?  It's hard to do anything about
it if we can't reproduce it.

n

From skip at pobox.com  Thu Nov 17 00:32:54 2005
From: skip at pobox.com (skip@pobox.com)
Date: Wed, 16 Nov 2005 17:32:54 -0600
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <20051116145820.A43A.JCARLSON@uci.edu>
References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org>
	<20051116145820.A43A.JCARLSON@uci.edu>
Message-ID: <17275.49574.768079.524296@montanaro.dyndns.org>


    >> [1] There *is* an array type for general PyObjects in scipy_core, but
    >> that's not being used in the code that blows up and has nothing to do
    >> with the problem Travis is talking about.

    Josiah> I seemed to have misunderstood the discussion.  

I'm sorry, but I'm confused as well.  If these scipy arrays have elements
that are subclasses of floats shouldn't we be able to provoke this memory
growth using an array.array of floats?  Can you provide a simple script in
pure Python (no scipy) that demonstrates the problem?

Skip

From oliphant at ee.byu.edu  Thu Nov 17 03:15:04 2005
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed, 16 Nov 2005 19:15:04 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
Message-ID: <437BE7A8.5000503@ee.byu.edu>

Jim Jewett wrote:

>Do you have the code that caused problems?
>  
>
Yes.  I was able to reproduce his trouble and was trying to debug it.

>The things I would check first are
>
>(1)  Is he allocating (peak usage) a type (such as integers) that
>never gets returned to the free pool, in case you need more of that
>same type?
>  
>
No, I don't think so.

>(2)  Is he allocating new _types_, which I think don't get properly
>
> collected.
>  
>

Bingo.  Yes, definitely allocating new _types_ (an awful lot of them...) 
--- that's what the "array scalars" are: new types created in C.  If 
they don't get properly collected then that would definitely have 
created the problem.  It would seem this should be advertised when 
telling people to use PyObject_New for allocating new memory for an object.

>(3)  Is there something in his code that keeps a live reference, or at
>least a spotty memory usage so that the memory can't be cleanly
>released?
>
>  
>
No, that's where I thought the problem was, at first.  I spent a lot of 
time tracking down references.    What finally convinced me it was the 
Python memory manager was when I re-wrote the tp->alloc functions of the 
new types to use the system malloc instead of PyObject_Malloc.    As 
soon as I did this the problems disappeared and memory stayed constant. 

Thanks for your comments,

-Travis




From decker at dacafe.com  Wed Nov 16 03:33:13 2005
From: decker at dacafe.com (decker@dacafe.com)
Date: Wed, 16 Nov 2005 02:33:13 -0000 (Australia/Sydney)
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications
Message-ID: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>

Hello,


I would appreciate feedback concerning these patches before the next
"PythonD" (for DOS/DJGPP) is released.


Thanks in advance.




Regards,
Ben Decker
Systems Integrator
http://www.caddit.net




From tony.meyer at gmail.com  Thu Nov 17 01:36:32 2005
From: tony.meyer at gmail.com (Tony Meyer)
Date: Thu, 17 Nov 2005 13:36:32 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-09-16 to 2005-09-30
Message-ID: <6F1AA13E-2723-43D1-B6F1-7A7A9F1A6E1C@gmail.com>

It's been some time (all that concurrency discussion didn't help ;)  
but here's the second half of September.  Many apologies for the  
delay; hopefully you agree with Guido's 'better late than never', and  
I promise to try harder in the future.  Note that the delay is all my  
bad, and epithets should be directed at me and not Steve.  As usual,  
please read over if you have a chance, and direct comments/ 
corrections to tony.meyer at gmail.com or steven.bethard at gmail.com.   
(One particular question is whether the concurrency summary is too  
long).

=============
Announcements
=============

-----------------------------
QOTF: Quotes of the fortnight
-----------------------------

We have two quotes this week, one each from the two biggest threads  
of this fortnight: concurrency and conditional expressions.  The  
first quote, from Donovan Baarda, puts Python's approach to threading
into perspective:

     The reality is threads were invented as a low overhead way of  
easily implementing concurrent applications... ON A SINGLE PROCESSOR.  
Taking into account threading's limitations and objectives, Python's  
GIL is the best way to support threads. When hardware (seriously)  
moves to multiple processors, other concurrency models will start to  
shine.

Our second QOTF, by yours truly (hey, who could refuse a nomination  
from Guido?), is a not-so-subtle reminder to leave syntax decisions  
to Guido:

     Please no more syntax proposals! ... We need to leave the syntax  
to Guido.  We've already proved that ... we can't as a community  
agree on a syntax.  That's what we have a BDFL for. =)

Contributing threads:

- `GIL, Python 3, and MP vs. UP <http://mail.python.org/pipermail/ 
python-dev/2005-September/056609.html>`__
- `Adding a conditional expression in Py3.0 <http://mail.python.org/ 
pipermail/python-dev/2005-September/056617.html>`__

[SJB]

-------------------
Compressed MSI file
-------------------

Martin v. Löwis discovered that a little more than a `MiB`_ could be
saved in the Python installer by using LZX:21 instead of the standard
MSZIP when compressing the CAB file.  After confirmation from several
testers that the new format worked, the change (for Python 2.4.2 and
beyond) was made.

.. _MiB: http://en.wikipedia.org/wiki/Mebibyte

Contributing thread:

- `Compressing MSI files: 2.4.2 candidate? <http://mail.python.org/ 
pipermail/python-dev/2005-September/056694.html>`__

[TAM]

=========
Summaries
=========

-----------------------
Conditional expressions
-----------------------

Raymond Hettinger proposed that the ``and`` and ``or`` operators be  
modified in Python 3.0 to produce only booleans instead of producing  
objects, motivating this proposal in part by the common (mis-)use of  
``<cond> and <true-expr> or <false-expr>`` to emulate a conditional  
expression.  In response, Guido suggested that the conditional
expression discussion of `PEP 308`_ be reopened.  This time around,  
people seemed almost unanimously in support of adding a conditional  
expression, though as before they disagreed on syntax.  Fortunately,  
this time Guido cut the discussion short and pronounced a new syntax:  
``<true-expr> if <cond> else <false-expr>``.  Although it has not  
been implemented yet, the plan is for it to appear in Python 2.5.
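
The pitfall behind the ``and``/``or`` idiom can be shown in a few
lines.  This is only an illustrative sketch: the helper names
``pick_and_or`` and ``pick_conditional`` are made up for this example,
and the second function needs an interpreter that already implements
the pronounced syntax (Python 2.5 or later).

```python
def pick_and_or(cond, true_value, false_value):
    # The old emulation: silently picks false_value whenever
    # true_value is itself falsy (0, "", [], None, ...).
    return cond and true_value or false_value


def pick_conditional(cond, true_value, false_value):
    # The pronounced syntax: only cond decides which branch is used.
    return true_value if cond else false_value


emulated = pick_and_or(True, 0, 2)       # true branch is falsy
pronounced = pick_conditional(True, 0, 2)
```

The two helpers agree whenever the "true" value is itself truthy; they
differ only in cases like the one above, which is what makes the
emulation easy to get wrong.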

.. _PEP 308: http://www.python.org/peps/pep-0308.html

Contributing threads:

- `"and" and "or" operators in Py3.0 <http://mail.python.org/ 
pipermail/python-dev/2005-September/056510.html>`__
- `Adding a conditional expression in Py3.0 <http://mail.python.org/ 
pipermail/python-dev/2005-September/056546.html>`__
- `Conditional Expression Resolution <http://mail.python.org/ 
pipermail/python-dev/2005-September/056846.html>`__

[SJB]

---------------------
Concurrency in Python
---------------------

Once again, the subject of removing the global interpreter lock (GIL)  
came up.  Sokolov Yura suggested that the GIL be replaced with a  
system where there are thread-local GILs that cooperate to share  
writing; Martin v. Löwis suggested that he try to implement his
ideas, and predicted that he would find that doing so would be a lot  
of work, would require changes to all extension modules (likely to  
introduce new bugs, particularly race conditions), and possibly  
decrease performance.  This kicked off several long threads about  
multi-processor coding.

A long time ago (circa Python 1.5), Greg Stein experimented with free
threading, which did yield around a 1.6 times speedup on a dual- 
processor machine.  To avoid the overhead of multi-processor locking  
on a uniprocessor machine, a separate binary could be distributed.   
Some of the code apparently did make it into Python 1.5, but the  
issue died off because no-one provided working code, or a strategy  
for what to do with existing extension modules.

Guido pointed out that it is not clear at this time how multiple  
processors will be used as they become the norm.  With the threaded
programming model (e.g. in Java) there are problems with concurrent  
modification errors (without locking) or deadlocks and livelocks  
(with locking).  Guido's hunch (and mine, FWIW) is that instead of  
writing massively parallel applications, we will continue to write  
single-threaded applications that are tied together at the process  
level rather than at the thread level.  He also pointed out that it's  
likely that most problems get little benefit out of multiple processors.

Guido threw down the gauntlet: rather than the endless discussion  
about this topic, someone should come up with a GIL-free Python (not  
necessarily CPython) and demonstrate its worth.  Phillip J. Eby  
reminded everyone that Jython, IronPython, and PyPy exist, and that  
someone could, for example, create a multiprocessor-friendly backend  
for PyPy.

Guido also pointed out that fast threading benefits from fast context  
switches, which benefits from small register sets, and that the  
current trend in chips is towards larger register sets.  In addition,  
multiple processors with shared memory don't scale all that well  
(multiple processors with explicit interprocess communication (IPC)  
channels scale much better).  These all favour multi-processing over  
multi-threading.  Donovan Baarda went so far as to say (a QOTF, as  
above), that Python's GIL is the best way to support threads, which  
are for single-processor use, and that when multiple-processor  
platforms have matured more, other concurrency models will likewise
mature.  OTOH, Bob Ippolito pointed out that (in many operating  
systems) there isn't a lot of difference between threads and  
processes, and that threads can typically still use IPC.  Bob argued  
that the biggest argument for threading is that lots of existing C/C++
code uses threads.

Simon Percivall argued that the problem is that Python offers ("out  
of the box") some support for multi-threaded programming, but little  
for multi-process programming beyond the basics (e.g. data sharing,  
communication, control over running processes, dealing out tasks to  
be handled).  Simon suggested that the best way to stop people  
complaining about the GIL is to provide solid, standardized support  
for multi-process programming.  The idea of a "multiprocess" module  
gained a reasonable amount of support.

Phillip J. Eby outlined an idea he is considering PEPifying, in which  
one could switch all context variables (such as the Decimal context  
and the sys.* variables) simultaneously and instantaneously when
changing execution contexts (like switching between coroutines).  He  
has a prototype implementation of the basic idea, which is less than  
200 lines of Python and very fast.  However, he pointed out that it's  
not completely PEP-ready at this point, and he needs to continue  
considering various parts of the concept.

Bruce Eckel joined the thread, and suggested that low-level threads
people are only now catching up to objects, but as far as concurrency  
goes their brains still think in terms of threads, so they naturally  
apply thread concepts to objects.  He believes that pthread-style  
thinking is two steps backwards: you effectively throw open the  
innards of the object that you just spent time decoupling from the  
rest of your system, and the coupling is now unpredictable.

Bruce and Guido had discussed offlist "active objects": defining a  
class as "active" would install a worker thread and concurrent queue  
in each object of that class, automatically turn method calls into  
tasks and enqueue them, and prevent any interaction other than
enqueued messages.  Guido felt that if multiple active objects could  
co-exist in the same process, but be prevented (by the language  
implementation) from sharing data except via channels, and dynamic  
reallocation of active objects across multiple CPUs were possible,  
then this might be a solution.  He pointed out that an implementation  
would really be needed to prove this.
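
To make the "worker thread plus queue per object" idea concrete, here
is a minimal sketch.  It is not the design Bruce and Guido discussed:
the names ``ActiveObject``, ``send``, ``stop``, ``Counter`` and
``wait`` are all hypothetical, it is written for a current Python 3
(``queue`` module, ``daemon`` keyword), and it does nothing to enforce
isolation, so other code could still touch ``_count`` directly.

```python
import queue
import threading


class ActiveObject(object):
    """Hypothetical base class: one worker thread and one queue per instance."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # The worker is the only thread that ever executes the tasks.
        while True:
            task = self._tasks.get()
            if task is None:          # sentinel put by stop()
                break
            func, args, reply = task
            try:
                reply.put((True, func(*args)))
            except Exception as exc:
                reply.put((False, exc))

    def send(self, func, *args):
        """Enqueue a call and return a queue that will receive its outcome."""
        reply = queue.Queue(maxsize=1)
        self._tasks.put((func, args, reply))
        return reply

    def stop(self):
        self._tasks.put(None)
        self._worker.join()


class Counter(ActiveObject):
    def __init__(self):
        self._count = 0
        super().__init__()

    def _increment(self, amount):
        self._count += amount
        return self._count

    def increment(self, amount=1):
        # The public method only enqueues; the worker does the real work.
        return self.send(self._increment, amount)


def wait(reply):
    """Block until the enqueued call has run, then return or re-raise."""
    ok, value = reply.get()
    if not ok:
        raise value
    return value


counter = Counter()
replies = [counter.increment() for _ in range(3)]
results = [wait(reply) for reply in replies]
counter.stop()
```

Because a single worker drains the queue in order, the calls on one
object never overlap, which is the property the active-object idea
relies on.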

Phillip and Martin pointed out that preventing any interaction
other than enqueued messages is the difficult part; each active  
object would, for example, have to have its own sys.modules.  Phillip  
felt that such a solution (which Bruce posed as "a" solution, not  
"the" solution) wouldn't help with GIL removal, but would help with  
effective use of multiprocessor machines on platforms where fork() is  
available, if the API works across processes as well as threads.

Bruce then restarted the discussion, putting forth eight criteria  
that he felt would be necessary for the "pythonic" solution to  
concurrency.  Items on the list were discussed further, with some  
disagreement about what was possible.  The concurrency discussion  
continues next month...

Contributing threads:

- `Variant of removing GIL. <http://mail.python.org/pipermail/python- 
dev/2005-September/056423.html>`__
- `GIL, Python 3, and MP vs. UP (was Re: Variant of removing GIL.)  
<http://mail.python.org/pipermail/python-dev/2005-September/ 
056458.html>`__
- `GIL, Python 3, and MP vs. UP <http://mail.python.org/pipermail/ 
python-dev/2005-September/056498.html>`__
- `Active Objects in Python <http://mail.python.org/pipermail/python- 
dev/2005-September/056752.html>`__
- `Pythonic concurrency <http://mail.python.org/pipermail/python-dev/ 
2005-September/056801.html>`__
- `Pythonic concurrency - cooperative MT <http://mail.python.org/ 
pipermail/python-dev/2005-September/056860.html>`__

[TAM]

-----------------------------------
Removing nested function parameters
-----------------------------------

Brett Cannon proposed removing support for nested function parameters  
so that instead of being able to write::

     def f((x, y)):
         print x, y

you'd have to write something like::

     def f(arg):
         x, y = arg
         print x, y

Brett (with help from Guido) motivated this removal (for Python 3.0)  
by a few factors:

(1) The feature has low visibility: "For every user who is fond of  
them there are probably ten who have never even heard of it." - Guido
(2) The feature can be difficult to read for some people.
(3) The feature doesn't add any power to the language; the above  
functions emit essentially the same byte-code.
(4) The feature makes function parameter introspection difficult  
because tuple unpacking information is not stored in the function  
object.

In general, people were undecided on this proposal.  While a number  
of people said they used the feature and would miss it, many of them  
also said that their code wouldn't suffer that much if the feature  
was removed.  No decision had been made at the time of the summary.

Contributing thread:

- `removing nested tuple function parameters <http://mail.python.org/ 
pipermail/python-dev/2005-September/056459.html>`__

[SJB]

-----------------------------------------
Evaluating iterators in a boolean context
-----------------------------------------

In Python 2.4 some builtin iterators gained __len__ methods when the  
number of remaining items could be made available.  This broke some  
of Guido's code that tested iterators for their boolean value (to  
distinguish them from None).  Raymond Hettinger (who supplied the  
original patch) argued that `testing for None`_ using boolean tests  
was in general a bad idea, and that knowing the length of an  
iterator, when possible, had a number of use cases and allowed for  
some performance gains.  However, Guido felt strongly that iterators  
should not supply __len__ methods, as this would lead to some people  
writing code expecting this method, which would then break when it  
received an iterator which could not determine its own length.  The  
feature will be rolled back in Python 2.5, and Raymond will likely  
move the __len__ methods to private methods in order to maintain the  
performance gains.
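
The pitfall is easy to reproduce.  The following is a minimal sketch
written for current Python; ``Countdown`` and the two helper functions
are hypothetical names, and the class only imitates an iterator that
reports its remaining length::

    class Countdown(object):
        """Hypothetical iterator that exposes its remaining length."""

        def __init__(self, n):
            self.n = n

        def __iter__(self):
            return self

        def __next__(self):
            if self.n <= 0:
                raise StopIteration
            self.n -= 1
            return self.n

        next = __next__  # Python 2 spelling of the same method

        def __len__(self):
            # Because __len__ exists, bool() of an exhausted iterator is False.
            return self.n


    def uses_truth_test(it=None):
        # Fragile: an exhausted Countdown is indistinguishable from None here.
        return "iterator" if it else "nothing"


    def uses_identity_test(it=None):
        # Robust: only None itself means "no iterator was given".
        return "iterator" if it is not None else "nothing"

With an exhausted ``Countdown(0)``, the truth test takes the "nothing"
branch while the identity test still sees an iterator.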

.. _testing for None: http://www.python.org/peps/ 
pep-0290.html#testing-for-none

Contributing threads:

- `bool(iter([])) changed between 2.3 and 2.4 <http://mail.python.org/ 
pipermail/python-dev/2005-September/056576.html>`__
- `bool(container) [was bool(iter([])) changed between 2.3 and 2.4]  
<http://mail.python.org/pipermail/python-dev/2005-September/ 
056879.html>`__

[SJB]

--------------------------------------------------
Properties that only call the getter function once
--------------------------------------------------

Jim Fulton proposed adding a new builtin for a property-like  
descriptor that would only call the getter method once, so that  
something like::

    class Spam(object):

        @readproperty
        def eggs(self):
            ... expensive computation of eggs

            self.eggs = result
            return result

would only do the eggs computation once.  Currently, you can't do  
this with a property() because the ``self.eggs = result`` statement  
tries to call the property's ``fset`` method instead of replacing the  
property with the result of the eggs() call.  A few other people  
commented that they'd needed similar functionality at times, and  
Guido seemed moderately interested in the idea, but there was no  
final resolution.
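
One way to sketch the idea is a non-data descriptor, that is, one that
defines ``__get__`` but no ``__set__``, so that an instance attribute
of the same name takes precedence once it has been assigned.  This is
an assumption about how such a builtin might work, not Jim's
implementation; the ``calls`` counter and the stand-in computation are
only there for illustration::

    class readproperty(object):
        """Sketch: a property-like descriptor without __set__."""

        def __init__(self, fget):
            self.fget = fget
            self.__doc__ = fget.__doc__

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self          # accessed on the class itself
            return self.fget(obj)


    class Spam(object):
        calls = 0                    # counts how often the getter runs

        @readproperty
        def eggs(self):
            Spam.calls += 1
            result = sum(range(1000))    # stand-in for an expensive computation
            # No __set__ on the descriptor, so this lands in the instance
            # __dict__ and shadows the descriptor on later lookups.
            self.eggs = result
            return result

Reading ``Spam().eggs`` twice on the same instance runs the getter only
the first time.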

Contributing thread:

- `RFC: readproperty <http://mail.python.org/pipermail/python-dev/ 
2005-September/056769.html>`__

[SJB]

--------
Codetags
--------

Micah Elliott submitted his `Codetags PEP 350`_ (after revisions  
following the comp.lang.python discussion) to python-dev for  
comment.  A common feeling was that this (particularly synonyms) was  
over-engineering; Guido pointed out that he only uses XXX, and this  
is certainly the most common (although not only) example in the  
Python source itself.  Some suggestions were made, many of which  
Micah integrated into the PEP.

The suggestion was made that an implementation should precede  
approval of the PEP.  Micah indicated that he would continue  
development on the tools, and that he encourages anyone interested in  
using a standard set of codetags to give these a try.

.. _Codetags PEP 350: http://python.org/peps/pep-0350.html

- `PEP 350: Codetags <http://mail.python.org/pipermail/python-dev/ 
2005-September/056744.html>`__

[TAM]

----------------------------
Improving set implementation
----------------------------

Raymond Hettinger suggested a "small, but interesting, C project" to  
determine whether the setobject.c implementation would be improved by  
recoding the set_lookkey() function to optimize key insertion order  
using Brent's variation of Algorithm D (cf. Knuth vol. III, section
6.4, p525).  It has the potential to boost performance for
uniquification applications, with duplicate keys being identified more
quickly, and possibly also to retire dummy entries more frequently
during insertion operations.

Andrew Durdin pointed out that Brent's variation depends on the next
probe position for a key being derived from the key and its current
position, which is incompatible with the current perturbation system;
Raymond replaced perturbation with a secondary hash with linear
probing.  Antoine Pitrou did some `experimenting with this`_, with
results ranging from a 5% slowdown to a 2% speedup across various
benchmarks.

Raymond has also been experimenting with a simpler approach: whenever  
there are more than three probes, always swap the new key into the  
first position and then unconditionally re-insert the swapped-out  
key.  He reported that, most of the time, this gives an improvement,  
and it doesn't require changing the perturbation logic.  This simpler  
approach is cheap to implement, but the benefits are also smaller,
since it improves only the worst collisions.

.. _experimenting with this: http://pitrou.net/python/sets

- `C coding experiment <http://mail.python.org/pipermail/python-dev/ 
2005-September/055965.html>`__

[TAM]

--------------
Relative paths
--------------

Nathan Bullock suggested a ``relpath(path_a, path_b)`` addition to
os.path that returns a relative path from path_a to path_b.  Trent  
Mick pointed out that there are a `couple of`_ `recipes for this`_,  
as well as `Jason Orendorff's Path module`_.  Several people  
supported this idea, and hopefully either Nathan or one of the recipe  
authors will submit a patch with this functionality.
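
To make the proposed semantics concrete, here is a rough sketch of such
a function.  It is not taken from Nathan's proposal or from either
recipe; it assumes both arguments are absolute POSIX-style paths and
that ``path_a`` names a directory::

    import posixpath


    def relpath(path_a, path_b):
        """Return a relative path leading from directory path_a to path_b."""
        a_parts = [p for p in posixpath.normpath(path_a).split("/") if p]
        b_parts = [p for p in posixpath.normpath(path_b).split("/") if p]

        # Count the leading components the two paths share.
        common = 0
        for x, y in zip(a_parts, b_parts):
            if x != y:
                break
            common += 1

        # Climb out of what is left of path_a, then descend into path_b.
        parts = [".."] * (len(a_parts) - common) + b_parts[common:]
        return "/".join(parts) or "."

For example, going from ``/usr/lib/python2.4`` to ``/usr/share/doc``
requires climbing two levels and then descending into ``share/doc``.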

.. _couple of: http://aspn.activestate.com/ASPN/Cookbook/Python/ 
Recipe/302594
.. _recipes for this: http://aspn.activestate.com/ASPN/Cookbook/ 
Python/Recipe/208993
.. _Jason Orendorff's Path module: http://www.jorendorff.com/articles/ 
python/path/

Contributing threads:

- `os.path.diff(path1, path2) <http://mail.python.org/pipermail/ 
python-dev/2005-September/056391.html>`__
- `os.path.diff(path1, path2) (and a first post) <http:// 
mail.python.org/pipermail/python-dev/2005-September/056703.html>`__

[TAM]

----------------------------------
Adding a vendor-packages directory
----------------------------------

Rich Burridge followed up a `comp.lang.python thread`_ about a  
"vendor-packages" directory for Python by submitting a `patch`_ and  
asking for comments about the proposal on python-dev.  General  
consensus was that the proposal needed a better rationale, explaining  
why this improved on simply adding a .pth file to the site-packages  
directory.

Rich explained that the rationale is that Python files supplied by  
the vendor (Sun, Apple, RedHat, Microsoft) with their operating  
system software should go in a separate base directory to  
differentiate them from Python files installed specifically at the  
site.  However, Bob Ippolito pointed out that, as of OS X 10.4
("Tiger"), Apple already does this via a .pth file ("Extras.pth"),
which points to
``/System/Library/Frameworks/Python.framework/Versions/2.3/Extras/lib/python``
and includes wxPython by default.

Bob also pointed out that such a "vendor-packages.pth" should look
like ``import site; site.addsitedir('/usr/lib/python2.4/vendor-packages')``
so that packages like Numeric, PIL, and PyObjC, which
take advantage of .pth files themselves, work when installed to the
vendor-packages location.
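
The difference between merely listing a directory and calling
``site.addsitedir()`` on it can be seen in a small experiment.  The
sketch below uses a temporary directory in place of a real
vendor-packages location; the names ``demo.pth`` and ``extras`` are
made up for the example::

    import os
    import site
    import sys
    import tempfile

    # Stand-in for a vendor-packages directory.
    vendor_dir = tempfile.mkdtemp(prefix="vendor-packages-")

    # A package installed there ships its own .pth file pointing
    # at a subdirectory.
    extra_dir = os.path.join(vendor_dir, "extras")
    os.mkdir(extra_dir)
    with open(os.path.join(vendor_dir, "demo.pth"), "w") as f:
        f.write("extras\n")

    # addsitedir() appends the directory to sys.path *and* processes
    # the .pth files found inside it.
    site.addsitedir(vendor_dir)

After the call, both the directory itself and the ``extras``
subdirectory named in ``demo.pth`` appear on ``sys.path``.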

Phillip J. Eby pointed out that it would be good to have a document  
for "Python Distributors" that explained these kinds of things, and
suggested that perhaps a volunteer or two could be found within the  
distutils-SIG to do this.

.. _comp.lang.python thread: http://mail.python.org/pipermail/python- 
list/2005-September/300029.html
.. _patch: http://sourceforge.net/tracker/index.php? 
func=detail&aid=1298835&group_id=5470&atid=305470

Contributing thread:

- `vendor-packages directory <http://mail.python.org/pipermail/python- 
dev/2005-September/056682.html>`__

[TAM]

=======================
Version numbers on OS X
=======================

Guido asked if platform.system_alias() could be improved on OS X by
mapping uname()'s ``Darwin x.y`` to ``OS X 10.(x-4).y``.  Bob
Ippolito and others pointed out that this was not a good idea,
because uname() only reports on the kernel version number and not the
Cocoa API, which is really what OS X 10.x.y refers to.  Bob pointed
out that the correct way to do it using a public API is to use
gestalt, which is what platform.mac_ver() does.

On further inspection, it was discovered that parsing the
/System/Library/CoreServices/SystemVersion.plist property list is also a
supported API, and would not rely on access to the Carbon API set.
Bob and Wilfredo Sánchez Vega provided sample code that would parse
this plist; Marc-Andre Lemburg suggested that a patch be written for
system_alias() that would use this method (if possible) for Mac OS.
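
A rough sketch of the plist approach follows.  It is not the sample
code posted to the list; it assumes the ``plistlib`` module as found in
current Python versions, and ``mac_product_version`` is a hypothetical
helper name::

    import os
    import plistlib

    SYSTEM_VERSION_PLIST = "/System/Library/CoreServices/SystemVersion.plist"


    def mac_product_version(path=SYSTEM_VERSION_PLIST):
        """Return the ProductVersion string, or None if the plist is absent."""
        if not os.path.exists(path):
            return None              # not on OS X, or the file is missing
        with open(path, "rb") as f:
            info = plistlib.load(f)  # parse the property list into a dict
        return info.get("ProductVersion")

On systems without the file the function returns ``None``, so a caller
could fall back to other means of detection.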

Contributing thread:

- `Mapping Darwin 8.2.0 to Mac OS X 10.4.2 in platform.py <http:// 
mail.python.org/pipermail/python-dev/2005-September/056651.html>`__

[TAM]

================
Deferred Threads
================

- `Python 2.5a1, ast-branch and PEP 342 and 343 <http:// 
mail.python.org/pipermail/python-dev/2005-September/056449.html>`__


===============
Skipped Threads
===============

- `Visibility scope for "for/while/if" statements <http:// 
mail.python.org/pipermail/python-dev/2005-September/056669.html>`__
- `inplace operators and __setitem__ <http://mail.python.org/ 
pipermail/python-dev/2005-September/056766.html>`__
- `Repository for python developers <http://mail.python.org/pipermail/ 
python-dev/2005-September/056717.html>`__
- `For/while/if statements/comprehension/generator expressions  
unification <http://mail.python.org/pipermail/python-dev/2005- 
September/056508.html>`__
- `list splicing <http://mail.python.org/pipermail/python-dev/2005- 
September/056472.html>`__
- `Compatibility between Python 2.3.x and Python 2.4.x <http:// 
mail.python.org/pipermail/python-dev/2005-September/056437.html>`__
- `python optimization <http://mail.python.org/pipermail/python-dev/ 
2005-September/056441.html>`__
- `test__locale on Mac OS X <http://mail.python.org/pipermail/python- 
dev/2005-September/056463.html>`__
- `possible memory leak on windows (valgrind report) <http:// 
mail.python.org/pipermail/python-dev/2005-September/056478.html>`__
- `Mixins. <http://mail.python.org/pipermail/python-dev/2005- 
September/056481.html>`__
- `2.4.2c1 fails test_unicode on HP-UX ia64 <http://mail.python.org/ 
pipermail/python-dev/2005-September/056551.html>`__
- `2.4.2c1: test_macfs failing on Tiger (Mac OS X 10.4.2) <http:// 
mail.python.org/pipermail/python-dev/2005-September/056558.html>`__
- `test_ossaudiodev hangs <http://mail.python.org/pipermail/python- 
dev/2005-September/056559.html>`__
- `unintentional and unsafe use of realpath() <http://mail.python.org/ 
pipermail/python-dev/2005-September/056616.html>`__
- `Alternative name for str.partition() <http://mail.python.org/ 
pipermail/python-dev/2005-September/056630.html>`__
- `Weekly Python Patch/Bug Summary <http://mail.python.org/pipermail/ 
python-dev/2005-September/056713.html>`__
- `Possible bug in urllib.urljoin <http://mail.python.org/pipermail/ 
python-dev/2005-September/056736.html>`__
- `Trasvesal thought on syntax features <http://mail.python.org/ 
pipermail/python-dev/2005-September/056741.html>`__
- `Fixing pty.spawn() <http://mail.python.org/pipermail/python-dev/ 
2005-September/056750.html>`__
- `64-bit bytecode compatibility (was Re: [PEAK] ez_setup on 64-bit  
linux problem) <http://mail.python.org/pipermail/python-dev/2005- 
September/056811.html>`__
- `C API doc fix <http://mail.python.org/pipermail/python-dev/2005- 
September/056827.html>`__
- `David Mertz on CA state e-voting panel <http://mail.python.org/ 
pipermail/python-dev/2005-September/056840.html>`__
- `[PATCH][BUG] Segmentation Fault in xml.dom.minidom.parse <http:// 
mail.python.org/pipermail/python-dev/2005-September/056844.html>`__
- `linecache problem <http://mail.python.org/pipermail/python-dev/ 
2005-September/056856.html>`__


From tony.meyer at gmail.com  Thu Nov 17 01:36:33 2005
From: tony.meyer at gmail.com (Tony Meyer)
Date: Thu, 17 Nov 2005 13:36:33 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-01 to 2005-10-15
Message-ID: <9C7FCF00-A936-44D0-9D36-3263BA456E1A@gmail.com>

As you have noticed, there has been a summary delay recently.  This  
is my fault (insert your favourite thesis/work/leisure excuse here).   
Steve has generously covered my slackness by doing all of the October  
summaries himself (thanks!).  Anyway, if you have some moments to  
spare, cast your mind back to the start of October, and see if these  
reflect what happened.  Comments/corrections to tony.meyer at gmail.com  
or steven.bethard at gmail.com.  Thanks!

=============
Announcements
=============

----------------------------
QOTF: Quote of the Fortnight
----------------------------

 From Phillip J. Eby:

     So, if threads are "easy" in Python compared to other languages,
     it's *because of* the GIL, not in spite of it.

Contributing thread:

- `Pythonic concurrency <http://mail.python.org/pipermail/python-dev/ 
2005-October/057062.html>`__

[SJB]

----------------------------------------
GCC/G++ Issues on Linux: Patch available
----------------------------------------

Christoph Ludwig provided the previously `promised patch`_ to address  
some of the issues in compiling Python with GCC/G++ on Linux.  The  
patch_ keeps ELF systems like x86 / Linux from having any
dependencies on the C++ runtime, and allows systems that require
main() to be a C++ function to be configured appropriately.

.. _promised patch: http://www.python.org/dev/summary/ 
2005-07-01_2005-07-15.html#gcc-g-issues-on-linux
.. _patch: http://python.org/sf/1324762

Contributing thread:

- `[C++-sig] GCC version compatibility <http://mail.python.org/ 
pipermail/python-dev/2005-October/057230.html>`__

[SJB]

=========
Summaries
=========

---------------------
Concurrency in Python
---------------------

Michael Sparks spent a bit of time describing the current state and
future goals of the Kamaelia_ project.  Mainly, Kamaelia aims to make  
concurrency as simple and easy to use as possible.  A scheduler  
manages a set of generators that communicate with each other through  
Queues.  The long term goals include being able to farm the various  
generators off into threads or processes as needed, so that whether
your concurrency model is cooperative, threaded or process-based,  
your code can basically look the same.
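
The general shape of such a design, a scheduler resuming generators
that talk to each other through queues, can be sketched in a few
lines.  This is not Kamaelia's API; all names below are hypothetical
and a plain ``deque`` stands in for a queue::

    from collections import deque


    def producer(outbox, n):
        """Put the numbers 0..n-1 into outbox, one per scheduler turn."""
        for i in range(n):
            outbox.append(i)
            yield                    # hand control back to the scheduler


    def doubler(inbox, results):
        """Consume whatever is waiting in inbox and record it doubled."""
        while True:
            while inbox:
                results.append(inbox.popleft() * 2)
            yield


    def run(tasks, max_steps=100):
        """Round-robin scheduler: resume each generator in turn."""
        tasks = deque(tasks)
        steps = 0
        while tasks and steps < max_steps:
            task = tasks.popleft()
            try:
                next(task)
            except StopIteration:
                pass                 # finished tasks are simply dropped
            else:
                tasks.append(task)   # still alive: back of the line
            steps += 1

Because the components only touch their queues, the same generator
bodies could in principle be driven by a different scheduler.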

There was also continued discussion about how "easy" threads are.   
Shane Hathaway made the point that it's actually locking that's  
"insanely difficult", and approaches that simplify how much you need  
to think about locking can keep threading relatively easy -- this was  
one of the strong points of ZODB.  A fairly large camp also got  
behind the claim that threads are easy if you're limited to only  
message passing.  There were also a few comments about how Python  
makes threading easier, e.g. through the GIL (see `QOTF: Quote of the  
Fortnight`_) and through threading.threads's encapsulation of
thread-local resources as instance attributes.
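
As an illustration of the message-passing style, the sketch below has
a worker thread that shares nothing with its caller except two queues,
so no explicit locks appear in the code.  The function names are made
up for the example::

    import threading

    try:
        import queue                 # Python 3
    except ImportError:
        import Queue as queue        # Python 2


    def worker(inbox, outbox):
        """Square every number received; stop when None arrives."""
        while True:
            item = inbox.get()
            if item is None:
                break
            outbox.put(item * item)


    def square_all(numbers):
        """Send a list of numbers to a worker thread, collect the results."""
        inbox, outbox = queue.Queue(), queue.Queue()
        t = threading.Thread(target=worker, args=(inbox, outbox))
        t.start()
        for n in numbers:
            inbox.put(n)
        inbox.put(None)              # sentinel: no more work
        t.join()
        return [outbox.get() for _ in numbers]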

.. _Kamaelia: http://kamaelia.sourceforge.ne

Contributing threads:

- `Pythonic concurrency - cooperative MT <http://mail.python.org/ 
pipermail/python-dev/2005-October/056898.html>`__
- `Pythonic concurrency <http://mail.python.org/pipermail/python-dev/ 
2005-October/057023.html>`__

[SJB]

-------------------------------------
Organization of modules for threading
-------------------------------------

A few people took issue with the current organization of the  
threading modules into Queue, thread and threading.  Guido views  
Queue as an application of threading, so putting it in the threading  
module is inappropriate (though with a deeper package structure, it  
should definitely be a sibling).  Nick Coghlan suggested that Queue  
should be in a threadtools module (in parallel with itertools), while  
Skip proposed a hierarchy of modules with thread and lock being in  
the lowest level one, and Thread and Queue being in the highest  
level.  Aahz suggested (and Guido approved) deprecating the thread  
module and renaming it to _thread at least in Python 3.0.  It seems  
the deprecation may happen sooner though.

Contributing threads:

- `Making Queue.Queue easier to use <http://mail.python.org/pipermail/ 
python-dev/2005-October/057184.html>`__
- `Autoloading? (Making Queue.Queue easier to use) <http:// 
mail.python.org/pipermail/python-dev/2005-October/057216.html>`__
- `threadtools (was Re: Autoloading? (Making Queue.Queue easier to  
use)) <http://mail.python.org/pipermail/python-dev/2005-October/ 
057262.html>`__
- `Threading and synchronization primitives <http://mail.python.org/ 
pipermail/python-dev/2005-October/057269.html>`__

[SJB]

-------------------------
Speed of Unicode decoding
-------------------------

Tony Nelson found that decoding with a codec like mac-roman or  
iso8859-1 can take around ten times as long as decoding with utf-8.   
Walter Dörwald provided a patch_ that implements the mapping using a
unicode string of length 256 where undefined characters are mapped to  
u"\ufffd".  This dropped the decode time for mac-roman to nearly the  
speed of the utf-8 decoding.  Hye-Shik Chang showed off a fastmap  
decoder with comparable performance.  In the end, Walter's patch was  
accepted.
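
The idea behind the table-based approach can be shown in pure Python.
This is only an illustration of the technique, not the C code from the
patch; ``TOY_TABLE`` describes a made-up codec in which byte 0x81 is
undefined::

    # Latin-1 maps each byte to the code point with the same value.
    LATIN1_TABLE = "".join(chr(i) for i in range(256))

    # Hypothetical codec: like Latin-1, but byte 0x81 is undefined.
    TOY_TABLE = LATIN1_TABLE[:0x81] + "\ufffd" + LATIN1_TABLE[0x82:]


    def charmap_decode(data, table):
        """Decode bytes by indexing a 256-character table with each byte."""
        return "".join(table[byte] for byte in bytearray(data))

Each byte costs one index operation into the string, which is what
makes this cheaper than a dictionary lookup per character.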

.. _patch: http://www.python.org/sf/1313939

Contributing thread:

- `Unicode charmap decoders slow <http://mail.python.org/pipermail/ 
python-dev/2005-October/056958.html>`__

[SJB]

------------------
Updates to PEP 343
------------------

Jason Orendorff proposed replacing the __enter__() and __exit__()  
methods on context managers with a simple __with__() method instead.   
While Guido was unconvinced that __enter__() and __exit__() should be
dropped, he was convinced that context managers should have a
__with__() method, in parallel with the __iter__() method for iterators.
There was some talk of special-casing the @contextmanager decorator
on the __with__() method, but no conclusion.

Contributing threads:

- `Proposed changes to PEP 343 <http://mail.python.org/pipermail/ 
python-dev/2005-October/057040.html>`__
- `PEP 343 and __with__ <http://mail.python.org/pipermail/python-dev/ 
2005-October/056931.html>`__

[SJB]

----------------------
str and unicode issues
----------------------

Martin Blais wanted to completely disable the implicit conversions  
between unicode and str, so that you would always be forced to call  
either .encode() or .decode() to convert between one and the other.   
This is already available through adding
``sys.setdefaultencoding('undefined')`` to your sitecustomize.py file,
but the suggestion
started another long discussion over unicode issues.  Antoine Pitrou  
suggested that a good rule of thumb is to convert to unicode  
everything that is semantically textual, and to only use str for what  
is to be semantically treated as a string of bytes.  Fredrik Lundh  
argued against this for efficiency reasons -- pure ASCII text would  
consume more space as a unicode object.

There were suggestions that in Python 3.0, opening files in text mode  
will require an encoding and produce string objects, while opening  
files in binary mode will produce bytes objects.  The bytes() type  
will be a mutable array of bytes, which can be converted to a string  
object by specifying an encoding.
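
A small sketch of what this separation looks like in practice, written
against a Python in which text and byte strings are distinct types
(the sample data is made up; ``bytearray`` stands in here for the
mutable array of bytes described above)::

    raw = b"na\xc3\xafve caf\xc3\xa9"    # bytes, as read in binary mode
    text = raw.decode("utf-8")           # explicit decoding yields text

    buf = bytearray(raw)                 # a mutable array of bytes
    buf[0:2] = b"NA"                     # modified in place


    def mixing_is_rejected():
        """Return True if adding text and bytes raises TypeError."""
        try:
            text + raw
        except TypeError:
            return True
        return False

The byte string is two items longer than the text because each of the
two accented characters takes two bytes in UTF-8.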

Contributing threads:

- `Divorcing str and unicode (no more implicit conversions). <http:// 
mail.python.org/pipermail/python-dev/2005-October/056916.html>`__
- `unifying str and unicode <http://mail.python.org/pipermail/python- 
dev/2005-October/056934.html>`__
- `bytes type <http://mail.python.org/pipermail/python-dev/2005- 
October/056945.html>`__

[SJB]

----------------------------------------------------------------------
Allowing \*args syntax in tuple unpacking and before keyword arguments
----------------------------------------------------------------------

Gustavo Niemeyer proposed the oft-seen request for allowing the \*args
syntax in tuple unpacking, e.g.::

     for first, second, *rest in iterator:

Guido requested a PEP, saying that he wasn't convinced that there was  
much of a gain over the already valid::

     for item in iterator:
         (first, second), rest = item[:2], item[2:]

Greg Ewing and others didn't like Guido's suggestion as it violates  
DRY (Don't Repeat Yourself).  Others also chimed in with some  
examples in support of the proposal, but no one has yet put together  
a PEP.
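
For comparison, the two spellings are shown side by side below.  The
starred form only runs on Python versions that accept a starred
assignment target (an assumption here), and the sample records are
invented for the example::

    records = [("alice", 30, "admin", "staff"), ("bob", 25)]


    def split_starred(item):
        # Proposed spelling: the remainder is collected into a list.
        first, second, *rest = item
        return first, second, rest


    def split_sliced(item):
        # Slicing spelling: item has to be mentioned twice.
        (first, second), rest = item[:2], list(item[2:])
        return first, second, rest

Both functions return the same triple for every record; the slicing
version additionally requires ``item`` to support slicing.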

In a related matter, Guido indicated that he wants to be able to  
write keyword-only arguments after a \*args, so that you could, for  
example, write::

     f(a, b, *args, foo=1, bar=2, **kwds)

People seemed almost unanimously in support of this proposal, but, to  
quote Nick Coghlan, it has still "never bugged anyone enough for them  
to actually get around to fixing it".
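
On a Python that accepts this signature (an assumption; it is a syntax
error otherwise), the behaviour would look like the sketch below, where
``foo`` and ``bar`` can only be supplied by name::

    def f(a, b, *args, foo=1, bar=2, **kwds):
        """Return every argument so the binding rules are visible."""
        return a, b, args, foo, bar, kwds

Extra positional arguments always land in ``args`` and never spill
over into ``foo`` or ``bar``.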

Contributing thread:

- `Extending tuple unpacking <http://mail.python.org/pipermail/python- 
dev/2005-October/057056.html>`__

[SJB]

----------
AST Branch
----------

Guido gave the AST branch a three week ultimatum: either the branch  
should be merged into MAIN within the next three weeks, or the branch  
should be abandoned entirely.  This jump-started work on the branch,  
and the team was hoping to merge the changes the weekend of October  
15th.

Contributing threads:

- `Python 2.5a1, ast-branch and PEP 342 and 343 <http:// 
mail.python.org/pipermail/python-dev/2005-September/056449.html>`__
- `Python 2.5 and ast-branch <http://mail.python.org/pipermail/python- 
dev/2005-October/056986.html>`__
- `AST branch update <http://mail.python.org/pipermail/python-dev/ 
2005-October/057281.html>`__

[SJB]

-----------------------------------
Allowing "return obj" in generators
-----------------------------------

Piet Delport suggested having ``return obj`` in generators be  
translated into ``raise StopIteration(obj)``.  The return value of a  
generator function would thus be available as the first arg in the  
StopIteration exception.  Guido asked for some examples to give the  
idea a better motivation, and felt uncomfortable with the return  
value being silently ignored in for-loops.  The idea was postponed  
until at least one release after a PEP 342 implementation enters  
Python, so that people can have some more experience with coroutines.
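
The proposed behaviour can be tried on Python versions that allow
``return`` with a value inside a generator (an assumption; older
versions reject it at compile time).  ``averager`` and ``drain`` are
hypothetical names::

    def averager(values):
        """Yield running averages, then return the grand total."""
        total = 0
        for count, value in enumerate(values, 1):
            total += value
            yield total / float(count)
        return total                 # carried by StopIteration


    def drain(gen):
        """Collect yielded items plus the generator's return value."""
        items = []
        while True:
            try:
                items.append(next(gen))
            except StopIteration as exc:
                return items, exc.value

A plain ``for`` loop or ``list()`` over ``averager(...)`` sees only the
yielded averages, which is the silent discarding Guido was uneasy
about.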

Contributing threads:

- `Proposal for 2.5: Returning values from PEP 342 enhanced  
generators <http://mail.python.org/pipermail/python-dev/2005-October/ 
056957.html>`__
- `PEP 342 suggestion: start(), __call__() and unwind_call() methods  
<http://mail.python.org/pipermail/python-dev/2005-October/ 
057042.html>`__
- `New PEP 342 suggestion: result() and allow "return with
arguments" in generators (was Re: PEP 342 suggestion: start(),
__call__() and unwind_call() methods) <http://mail.python.org/ 
pipermail/python-dev/2005-October/057116.html>`__

[SJB]

-----------------------------
API for the line-number table
-----------------------------

Greg Ewing suggested trying to simplify the line-number table  
(lnotab) by simply matching each byte-code index with a file and line  
number.  Phillip J. Eby pointed out that this would make the stdlib  
take up an extra megabyte, suggesting two tables instead, one  
matching bytecodes to line numbers, and one matching the first line- 
number of a chunk with its file.  Michael Hudson suggested that what  
we really want is an API for accessing the lnotab, so that the  
implementation that is chosen is less important.  The conversation  
trailed off without a resolution.
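
To show what going through an API rather than the raw table looks
like, the sketch below uses ``dis.findlinestarts()`` as available in
current Python versions; ``sample`` and ``line_table`` are made-up
names::

    import dis


    def sample(x):
        y = x + 1
        return y * 2


    def line_table(func):
        """Return (bytecode offset, line number) pairs for func."""
        return list(dis.findlinestarts(func.__code__))

Callers of ``line_table()`` never see how the table is encoded, so the
encoding can change without breaking them.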

Contributing thread:

- `Simplify lnotab? (AST branch update) <http://mail.python.org/ 
pipermail/python-dev/2005-October/057285.html>`__

[SJB]

------------------------------
Current directory and sys.path
------------------------------

A question about the status of `the CurrentVersion registry entry`_  
led to a discussion about the different behaviors of sys.path across  
platforms.  Apparently, on Windows, sys.path includes the current  
directory and the directory of the script being executed, while on  
Linux, it only includes the directory of the script.

.. _the CurrentVersion registry entry: http://www.python.org/windows/ 
python/registry.html

Contributing thread:

- `PythonCore\CurrentVersion <http://mail.python.org/pipermail/python- 
dev/2005-October/057095.html>`__

[SJB]

----------------------------------
Changing the __class__ of builtins
----------------------------------

As of Python 2.3, you can no longer change the __class__ of any  
builtin.  Phillip J. Eby suggested that these rules might be overly  
strict; modules and other mutable objects could probably reasonably  
have their __class__s changed.  No one seemed really opposed to the  
idea, but no one offered up a patch to make the change either.
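
On Python versions that permit the assignment for module objects (an
assumption; under the strict rules described above it fails), the
relaxed behaviour looks like this sketch.  ``ConfigModule`` and the
``settings`` module are invented for the example::

    import types


    class ConfigModule(types.ModuleType):
        """Hypothetical module subclass adding a computed attribute."""

        @property
        def debug_level(self):
            return 2 if getattr(self, "DEBUG", False) else 0


    settings = types.ModuleType("settings")
    settings.DEBUG = True
    settings.__class__ = ConfigModule    # swap in the subclass

After the assignment, ``settings.debug_level`` is computed by the
property on the subclass.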

Contributing thread:

- `Assignment to __class__ of module? (Autoloading? (Making  
Queue.Queue easier to use)) <http://mail.python.org/pipermail/python- 
dev/2005-October/057253.html>`__

[SJB]

------------------------------------------
exec function specification for Python 3.0
------------------------------------------

In Python 3.0, exec is slated to become a function (instead of a  
statement).  Currently, the presence of an exec statement in a  
function can cause some subtle changes since Python has to worry  
about exec modifying function locals.  Guido suggested that the
exec() function could require a namespace, basically dropping
exec-in-local-namespace altogether.  People seemed generally in favor of the
proposal, though no official specification was established.
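
A sketch of the explicit-namespace style, assuming a Python in which
exec is already a function; ``run_snippet`` is a hypothetical helper::

    def run_snippet(source):
        """Execute source in a fresh namespace and return what it defined."""
        namespace = {}
        exec(source, namespace)
        # exec() adds a __builtins__ entry to the dict; hide it.
        namespace.pop("__builtins__", None)
        return namespace

Since everything the snippet binds ends up in the dictionary, the
function's own locals are never affected.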

Contributing thread:

- `PEP 3000 and exec <http://mail.python.org/pipermail/python-dev/ 
2005-October/057135.html>`__

[SJB]

------------------------------------
Adding opcodes to speed up self.attr
------------------------------------

Phillip J. Eby experimented with adding LOAD_SELF and SELF_ATTR  
opcodes to improve the speed of object-oriented programming.  This  
gained about a 5% improvement in pystone, which isn't organized in a  
very OO manner.  People seemed uncertain as to whether paying the  
cost of adding two opcodes to gain a 5% speedup was worth it.  No  
decision had been made at the time of this summary.

Contributing thread:

- `LOAD_SELF and SELF_ATTR opcodes <http://mail.python.org/pipermail/ 
python-dev/2005-October/057321.html>`__

[SJB]

--------------------------------------
Dropping support for --disable-unicode
--------------------------------------

Reinhold Birkenfeld tried unsuccessfully to make the test-suite pass  
with --disable-unicode set.  M.-A. Lemburg suggested that the feature  
should be ripped out entirely, to simplify the code.  Martin v. Löwis
suggested deprecating it to give people a chance to object.  The plan  
is now to add a note to the configure switch that the feature will be  
removed in Python 2.6.

Contributing threads:

- `Tests and unicode <http://mail.python.org/pipermail/python-dev/ 
2005-October/056897.html>`__
- `--disable-unicode (Tests and unicode) <http://mail.python.org/ 
pipermail/python-dev/2005-October/056920.html>`__

[SJB]

-----------------------------------------
Bug in __getitem__ inheritance at C level
-----------------------------------------

Travis Oliphant discovered that the addition of the mp_item and
sq_item descriptors and the resolution of any competition for
__getitem__ calls is done *before* the inheritance of any slots
takes place.  This means that if you create a type in C that supports
the sequence protocol, and try to inherit the mapping protocol from
a parent C type which does not support the sequence protocol,
__getitem__ will point to the parent type's __getitem__ instead of  
the child type's __getitem__.  This seemed like more of a bug than a  
feature, so the behavior may be changed in future Pythons.

Contributing thread:

- `Why does __getitem__ slot of builtin call sequence methods first?  
<http://mail.python.org/pipermail/python-dev/2005-October/ 
056901.html>`__

[SJB]

================
Deferred Threads
================

- `Early PEP draft (For Python 3000?) <http://mail.python.org/ 
pipermail/python-dev/2005-October/057251.html>`__
- `Pythonic concurrency - offtopic <http://mail.python.org/pipermail/ 
python-dev/2005-October/057294.html>`__

===============
Skipped Threads
===============

- `PEP 350: Codetags <http://mail.python.org/pipermail/python-dev/ 
2005-October/056894.html>`__
- `Active Objects in Python <http://mail.python.org/pipermail/python- 
dev/2005-October/056896.html>`__
- `IDLE development <http://mail.python.org/pipermail/python-dev/2005- 
October/056907.html>`__
- `Help needed with MSI permissions <http://mail.python.org/pipermail/ 
python-dev/2005-October/056908.html>`__
- `C API doc fix <http://mail.python.org/pipermail/python-dev/2005- 
October/056910.html>`__
- `Static builds on Windows (continued) <http://mail.python.org/ 
pipermail/python-dev/2005-October/056976.html>`__
- `Removing the block stack (was Re: PEP 343 and __with__) <http:// 
mail.python.org/pipermail/python-dev/2005-October/057001.html>`__
- `Removing the block stack <http://mail.python.org/pipermail/python- 
dev/2005-October/057008.html>`__
- `Lexical analysis and NEWLINE tokens <http://mail.python.org/ 
pipermail/python-dev/2005-October/057014.html>`__
- `PyObject_Init documentation <http://mail.python.org/pipermail/ 
python-dev/2005-October/057039.html>`__
- `Sourceforge CVS access <http://mail.python.org/pipermail/python- 
dev/2005-October/057051.html>`__
- `__doc__ behavior in class definitions <http://mail.python.org/ 
pipermail/python-dev/2005-October/057066.html>`__
- `Sandboxed Threads in Python <http://mail.python.org/pipermail/ 
python-dev/2005-October/057082.html>`__
- `Weekly Python Patch/Bug Summary <http://mail.python.org/pipermail/ 
python-dev/2005-October/057092.html>`__
- `test_cmd_line failure on Kubuntu 5.10 with GCC 4.0 <http:// 
mail.python.org/pipermail/python-dev/2005-October/057094.html>`__
- `defaultproperty (was: Re: RFC: readproperty) <http:// 
mail.python.org/pipermail/python-dev/2005-October/057120.html>`__
- `async IO and helper threads <http://mail.python.org/pipermail/ 
python-dev/2005-October/057121.html>`__
- `defaultproperty <http://mail.python.org/pipermail/python-dev/2005- 
October/057129.html>`__
- `Fwd: defaultproperty <http://mail.python.org/pipermail/python-dev/ 
2005-October/057131.html>`__
- `C.E.R. Thoughts <http://mail.python.org/pipermail/python-dev/2005- 
October/057137.html>`__
- `problem with genexp <http://mail.python.org/pipermail/python-dev/ 
2005-October/057175.html>`__
- `Python-Dev Digest, Vol 27, Issue 44 <http://mail.python.org/ 
pipermail/python-dev/2005-October/057207.html>`__
- `Europeans attention please! <http://mail.python.org/pipermail/ 
python-dev/2005-October/057233.html>`__


From tony.meyer at gmail.com  Thu Nov 17 01:36:36 2005
From: tony.meyer at gmail.com (Tony Meyer)
Date: Thu, 17 Nov 2005 13:36:36 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to 2005-10-31
Message-ID: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com>

And this one brings us up-to-date (apart from the fortnight ending  
yesterday).  Again, if you have the time, please send any comments/ 
corrections to us.  Once again thanks to Steve for covering me and  
getting this all out on his own.

=============
Announcements
=============

--------------
AST for Python
--------------

As of October 21st, Python's compiler now uses a real Abstract Syntax  
Tree (AST)!  This should make experimenting with new syntax much  
easier, as well as allowing some optimizations that were difficult  
with the previous Concrete Syntax Tree (CST).  While there is no  
Python interface to the AST yet, one is intended for the not-so- 
distant future.

Thanks again to all who contributed, most notably: Armin Rigo, Brett  
Cannon, Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Neil  
Schemenauer, Nick Coghlan and Tim Peters.

Contributing threads:

- `AST branch merge status <http://mail.python.org/pipermail/python- 
dev/2005-October/057347.html>`__
- `AST branch update <http://mail.python.org/pipermail/python-dev/ 
2005-October/057387.html>`__
- `AST branch is in? <http://mail.python.org/pipermail/python-dev/ 
2005-October/057483.html>`__
- `Questionable AST wibbles <http://mail.python.org/pipermail/python- 
dev/2005-October/057489.html>`__
- `[Jython-dev] Re: AST branch is in? <http://mail.python.org/ 
pipermail/python-dev/2005-October/057642.html>`__

[SJB]

--------------------
Python on Subversion
--------------------

As of October 27th, Python is now on Subversion!  The new repository  
is http://svn.python.org/projects/.  Check the `Developers FAQ`_ for  
information on how to get yourself set up with Subversion.  Thanks
again to Martin v. Löwis for making this possible!

.. _Developers FAQ: http://www.python.org/dev/devfaq.html#subversion-svn

Contributing threads:

- `Migrating to subversion <http://mail.python.org/pipermail/python- 
dev/2005-October/057424.html>`__
- `Freezing the CVS on Oct 26 for SVN switchover <http:// 
mail.python.org/pipermail/python-dev/2005-October/057537.html>`__
- `CVS is read-only <http://mail.python.org/pipermail/python-dev/2005- 
October/057679.html>`__
- `Conversion to Subversion is complete <http://mail.python.org/ 
pipermail/python-dev/2005-October/057690.html>`__

[SJB]

---------------
Faster decoding
---------------

M.-A. Lemburg checked in Walter Dörwald's patches that improve
decoding speeds by using a character map.  These should make decoding
from mac-roman or iso8859-1 nearly as fast as decoding from utf-8.
Thanks again, guys!

Contributing threads:

- `Unicode charmap decoders slow <http://mail.python.org/pipermail/ 
python-dev/2005-October/057341.html>`__
- `New codecs checked in <http://mail.python.org/pipermail/python-dev/ 
2005-October/057505.html>`__
- `KOI8_U (New codecs checked in) <http://mail.python.org/pipermail/ 
python-dev/2005-October/057576.html>`__

[SJB]


=========
Summaries
=========

---------------------
Strings in Python 3.0
---------------------

Guido proposed that in Python 3.0, all character strings would be  
unicode, possibly with multiple internal representations.  Some of  
the issues:

- Multiple implementations could make the C API difficult.  If utf-8,  
utf-16 and utf-32 are all possible, what types should the C API pass  
around?

- Windows expects utf-16, so using any other encoding will mean that  
calls to Windows will have to convert to and from utf-16.  However,  
even in current Python, all strings passed to Windows system calls  
have to undergo 8 bit to utf-16 conversion.

- Surrogates (two code units encoding one code point) can slow  
indexing down because the number of bytes per character isn't  
constant.  Note that even though utf-32 doesn't need surrogates, they  
may still be used (and must be interpreted correctly) in utf-32  
data.  Also, in utf-32, "graphemes" (which correspond better to the
traditional concept of a "character" than code points do) may still
be composed of multiple code points, e.g. "é" (e with an acute accent)
can be written as "e" followed by a combining acute accent.
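
The grapheme point can be illustrated with the standard unicodedata
module.  This is only a minimal sketch, assuming a present-day Python 3
interpreter where str is already unicode; the variable names are
illustrative and do not come from the thread.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The two spellings render identically but differ in code point count.
lengths = (len(composed), len(decomposed))

# Normalization converts between the two forms.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)
```

Indexing by code point therefore does not guarantee indexing by
user-perceived character, whatever the internal encoding.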

This last issue was particularly vexing -- Guido thinks "it's a bad  
idea to offer an indexing operation that isn't O(1)".  A number of  
proposals were put forward, including:

- Adding a flag to strings to indicate whether or not they have any  
surrogates in them.  This makes indexing O(1) when no surrogates are  
in a string, but O(N) otherwise.

- Using a B-tree instead of an array for storage.  This would make  
all indexing O(log N).

- Discouraging using the indexing operations by providing an  
alternate API for strings.  This would require creating iterator-like  
objects that keep track of position in the unicode object.  Coming up  
with an API that's as usable as the slicing API seemed difficult though.
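
The first proposal (a per-string surrogate flag) might look roughly
like the sketch below.  Utf16Text and its attributes are hypothetical
names invented for illustration; nothing like this class was posted in
the thread, and a real implementation would live in C.

```python
class Utf16Text:
    """Hypothetical string stored as UTF-16 code units with a surrogate flag."""

    def __init__(self, text):
        data = text.encode("utf-16-le")
        # Split the encoded bytes into 16-bit code units.
        self.units = [int.from_bytes(data[i:i + 2], "little")
                      for i in range(0, len(data), 2)]
        self.has_surrogates = any(0xD800 <= u <= 0xDFFF for u in self.units)

    def code_point_at(self, index):
        """Return the code point at a non-negative code point index."""
        if not self.has_surrogates:
            return self.units[index]          # O(1): one unit per code point
        pos = 0                               # O(N): walk past surrogate pairs
        for _ in range(index):
            pos += 2 if 0xD800 <= self.units[pos] <= 0xDBFF else 1
        unit = self.units[pos]
        if 0xD800 <= unit <= 0xDBFF:
            low = self.units[pos + 1]
            return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
        return unit
```

The cost of the flag is one scan at construction time; the slow path
is only taken for strings that actually contain surrogate pairs.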

Contributing thread:

- `Divorcing str and unicode (no more implicit conversions). <http:// 
mail.python.org/pipermail/python-dev/2005-October/057362.html>`__

[SJB]

-------------------
Unicode identifiers
-------------------

Martin v. Löwis suggested lifting the restriction that identifiers be
ASCII.  There was some concern about confusability, with the  
contention that confusions like "O" (uppercase O) for "0" (zero) and  
"1" (one) for "l" (lowercase L) would only multiply if larger  
character sets were allowed.  Guido seemed less concerned about this  
problem than about how easy it would be to share code across
languages.  Neil Hodgson pointed out that even though a  
transliteration into English exists for Japanese, the coders he knew  
preferred to use relatively meaningless names, and Oren Tirosh  
indicated that Israeli programmers often preferred transliterations  
for local business terminology.  In either case, with or without  
unicode identifiers the code would already be hard to share.  In the  
end, people seemed mostly in favor of the idea, though there was some  
suggestion that it should wait until Python 3.0.

Contributing threads:

- `Divorcing str and unicode (no more implicit conversions). <http:// 
mail.python.org/pipermail/python-dev/2005-October/057362.html>`__
- `i18n identifiers (was: Divorcing str and unicode (no more implicit  
conversions). <http://mail.python.org/pipermail/python-dev/2005- 
October/057812.html>`__
- `i18n identifiers <http://mail.python.org/pipermail/python-dev/2005- 
October/057813.html>`__

[SJB]

-----------------
Property variants
-----------------

People still seem not quite pleased with properties, both in the  
syntax, and in how they interact with inheritance.  Guido proposed  
changing the property() builtin to accept strings for fget, fset and  
fdel in addition to functions (as it currently does).  If strings  
were passed, the property() object would have late-binding behavior,  
that is, the function to call wouldn't be looked-up until the  
attribute was accessed.  Properties whose fget, fset and fdel  
functions can be overridden in subclasses might then look like::

     class C(object):
         foo = property('getFoo', 'setFoo', None, 'the foo property')
         def getFoo(self):
             return self._foo
         def setFoo(self, foo):
             self._foo = foo
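
The late-binding behavior can be approximated today with a small
descriptor.  The sketch below is an assumption about how such a
property might behave, not Guido's implementation; late_property is a
hypothetical name, and the fdel argument is left out for brevity.

```python
class late_property(object):
    """Hypothetical descriptor: accessor names are resolved on each access."""

    def __init__(self, fget=None, fset=None, doc=None):
        self.fget, self.fset, self.__doc__ = fget, fset, doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return getattr(obj, self.fget)()      # looked up on the instance

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        getattr(obj, self.fset)(value)


class C(object):
    foo = late_property('getFoo', 'setFoo', 'the foo property')

    def getFoo(self):
        return self._foo

    def setFoo(self, foo):
        self._foo = foo


class D(C):
    def getFoo(self):                         # override is picked up
        return self._foo * 2
```

Because the method is found by name at access time, D does not need to
redeclare foo for its getFoo override to take effect.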

There were mixed reactions to this proposal.  People liked getting  
the expected behavior in subclasses, but it does violate DRY (Don't  
Repeat Yourself).  I posted an `alternative solution`_ using  
metaclasses that would allow you to write properties like::

     class C(object):
         class foo(Property):
             """The foo property"""
             def get(self):
                 return self._foo
             def set(self, foo):
                 self._foo = foo

which operates correctly with subclasses and follows DRY, but  
introduces a confusion about the referent of "self".  There were
also a few suggestions of introducing a new syntax for properties  
(see `Generalizing the class declaration syntax`_) which would have  
produced things like::

     class C(object):
         Property foo():
             """The foo property"""
             def get(self):
                 return self._foo
             def set(self, foo):
                 self._foo = foo

At the moment at least, it looks like we'll be sticking with the  
status quo.

.. _alternative solution: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/442418

Contributing threads:

- `Definining properties - a use case for class decorators? <http:// 
mail.python.org/pipermail/python-dev/2005-October/057350.html>`__
- `Defining properties - a use case for class decorators? <http:// 
mail.python.org/pipermail/python-dev/2005-October/057407.html>`__
- `properties and block statement <http://mail.python.org/pipermail/ 
python-dev/2005-October/057419.html>`__
- `Property syntax for Py3k (properties and block statement) <http:// 
mail.python.org/pipermail/python-dev/2005-October/057427.html>`__

[SJB]

-------------------
PEP 343 resolutions
-------------------

After Guido accepted the idea of adding a __with__() method to the  
context protocol, `PEP 343`_ was reverted to "Proposed" until the  
remaining details could be ironed out.  The end results were:

     - The slot name "__context__" will be used instead of "__with__".
     - The builtin name "context" is currently off-limits due to its
       ambiguity.
     - Generator-iterators do NOT have a native context.
     - The builtin function "contextmanager" will convert a
       generator-function into a context manager.
     - The "__context__" slot will NOT be special cased.  If it
       defines a generator, the __context__() function should be
       decorated with @contextmanager.
     - When the result of a __context__() call returns an object that
       lacks an __enter__() or __exit__() method, an AttributeError
       will be raised.
     - Only locks, files and decimal.Context objects will gain
       __context__() methods in Python 2.5.

Guido seemed to agree with all of these, but has not yet pronounced  
on the revised `PEP 343`_.
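
For readers who want to see the generator-to-context-manager
conversion in action, here is a minimal sketch.  It assumes the
decorator as it exists in released Python, where it is imported from
the contextlib module rather than being a builtin; the logged()
function and the events list are illustrative names only.

```python
from contextlib import contextmanager

events = []


@contextmanager
def logged(name):
    events.append("enter " + name)       # runs on entry to the with-block
    try:
        yield name                       # value bound by "as"
    finally:
        events.append("exit " + name)    # runs on exit, even after an error


with logged("demo") as value:
    events.append("body " + value)
```

The code before the yield plays the role of __enter__() and the code
after it plays the role of __exit__().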

.. _PEP 343: http://www.python.org/peps/pep-0343.html

Contributing threads:

- `PEP 343 updated <http://mail.python.org/pipermail/python-dev/2005- 
October/057349.html>`__
- `Proposed resolutions for open PEP 343 issues <http:// 
mail.python.org/pipermail/python-dev/2005-October/057516.html>`__
- `PEP 343 - multiple context managers in one statement <http:// 
mail.python.org/pipermail/python-dev/2005-October/057637.html>`__
- `PEP 343 updated with outcome of recent discussions <http:// 
mail.python.org/pipermail/python-dev/2005-October/057769.html>`__

[SJB]

---------------
Freeze protocol
---------------

Barry Warsaw proposed `PEP 351`_, which suggests a freeze() builtin
which would call the __freeze__() method on an object if that object  
was not hashable.  This would allow dicts to automatically make  
frozen copies of mutable objects when they were used as dict keys.   
It could reduce the need for "x" and "frozenx" builtin pairs, since  
the frozen versions could be automatically derived when needed.   
Raymond Hettinger indicated some problems with the proposal:

- sets.Set supported something similar, but found that it was not  
really helpful in practice.
- Freezing a list into a tuple is not appropriate since they do not  
have all the same methods.
- Errors can arise when the mutable object gets out of sync with its  
frozen copy.
- Manually freezing things when necessary is relatively simple.

Noam Raphael proposed a copy-on-change mechanism which would  
essentially give frozen copies of an object a reference to that  
object.  When the object is about to be modified, a copy would be  
made, and all frozen copies would be pointed at this.  Thus an object  
that was mutable but never changed could have lightweight frozen  
copies, while an object that did change would have to pay the usual  
copying costs.  Noam and Josiah Carlson then had a rather heated  
debate about how feasible such a copy-on-change mechanism would be  
for Python.
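
The basic protocol (not the copy-on-change variant) could be sketched
in pure Python roughly as follows.  This freeze() is a hypothetical
stand-in written for illustration, and FreezableList is an invented
example class; neither exists in the standard library.

```python
def freeze(obj):
    """Hypothetical freeze(): hashable objects pass through unchanged."""
    try:
        hash(obj)
        return obj
    except TypeError:
        pass
    freezer = getattr(obj, "__freeze__", None)
    if freezer is None:
        raise TypeError("object is not freezable")
    return freezer()


class FreezableList(list):
    """Illustrative list subclass that freezes to a tuple."""

    def __freeze__(self):
        return tuple(freeze(item) for item in self)
```

A dict could then call freeze() on a key before hashing it, which is
the automatic behavior the PEP has in mind.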

.. _PEP 351: http://www.python.org/peps/pep-0351.html

Contributing thread:

- `PEP 351, the freeze protocol <http://mail.python.org/pipermail/ 
python-dev/2005-October/057543.html>`__

[SJB]

----------------------------------
Required superclass for Exceptions
----------------------------------

Guido and Brett Cannon introduced `PEP 352`_ which proposes that all  
Exceptions be required to derive from a new exception class,  
BaseException.  The children of BaseException would be
KeyboardInterrupt, SystemExit and Exception (which would contain the  
remainder of the current hierarchy).  The goal here is to make the  
following code do the right thing::

     try:
         ...
     except Exception:
         ...

Currently, this code fails to catch string exceptions and other  
exceptions that do not derive from Exception, and it (probably)  
inappropriately catches KeyboardInterrupt and SystemExit which are  
supposed to indicate that Python is shutting down.  The current plan  
is to introduce BaseException and have KeyboardInterrupt and  
SystemExit multiply inherit from Exception and BaseException.  The  
PEP lists the roadmap for deprecating the various other types of
exceptions.
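
The intended end state can be checked in a present-day Python 3
interpreter, where KeyboardInterrupt and SystemExit derive directly
from BaseException rather than using the transitional multiple
inheritance described above.  The helper names below (run, bad_value,
exit_request) are illustrative only.

```python
# In current Python 3 the two shutdown exceptions sit outside Exception.
hierarchy = {
    "KeyboardInterrupt": issubclass(KeyboardInterrupt, Exception),
    "SystemExit": issubclass(SystemExit, Exception),
    "ValueError": issubclass(ValueError, Exception),
}


def run(action):
    """Return a label describing which handler caught the exception."""
    try:
        action()
    except Exception:
        return "caught by 'except Exception'"
    except BaseException:
        return "passed through to 'except BaseException'"
    return "no exception"


def bad_value():
    raise ValueError("bad input")


def exit_request():
    raise SystemExit(1)
```

Here run(bad_value) is handled by the first clause, while
run(exit_request) skips it and reaches the BaseException clause.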

The PEP also attempts to standardize on the arguments to Exception  
objects, so that by Python 3.0, all Exceptions will support a single  
argument which will be stored as their "message" attribute.

Guido was ready to accept it on October 31st, but it has not been  
marked as Accepted yet.

.. _PEP 352: http://www.python.org/peps/pep-0352.html

Contributing threads:

- `PEP 352: Required Superclass for Exceptions <http:// 
mail.python.org/pipermail/python-dev/2005-October/057736.html>`__
- `PEP 352 Transition Plan <http://mail.python.org/pipermail/python- 
dev/2005-October/057750.html>`__

[SJB]

-----------------------------------------
Generalizing the class declaration syntax
-----------------------------------------

Michele Simionato suggested a generalization of the class declaration  
syntax, so that::

     <callable> <name> <tuple>:
         <definitions>

would be translated into::

     <name> = <callable>("<name>", <tuple>, <dict-of-definitions>)

Where <dict-of-definitions> is simply the namespace that results from  
executing <definitions>. This would actually remove the need for the  
class keyword, as classes could be declared as::

     type <classname> <bases>:
         <definitions>

There were a few requests for a PEP, but nothing has been made  
available yet.
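
The translation target already works in existing Python, since type
can be called with a name, a tuple of bases and a dict of definitions.
A minimal sketch, with C, x and greet as made-up names:

```python
def greet(self):
    return "hello from " + type(self).__name__


# Same result as a class statement named C with object as its base
# and x and greet defined in its body.
C = type("C", (object,), {"x": 1, "greet": greet})
```

The proposal would essentially let any callable stand where type
stands in this call.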

Contributing thread:

- `Definining properties - a use case for class decorators? <http:// 
mail.python.org/pipermail/python-dev/2005-October/057435.html>`__

[SJB]

--------------------
Task-local variables
--------------------

Phillip J. Eby introduced a pre-PEP proposing a mechanism similar to  
thread-local variables, to help co-routine schedulers to swap state  
between tasks.  Essentially, the scheduler would be required to take  
a snapshot of a coroutine's variables before a swap, and restore that  
snapshot when the coroutine is swapped back.  Guido asked people to  
hold off on more PEP 343-related proposals until with-blocks have  
been out in the wild for at least a release or two.

Contributing thread:

- `Pre-PEP: Task-local variables <http://mail.python.org/pipermail/ 
python-dev/2005-October/057464.html>`__

[SJB]

-----------------------------------------
Attribute-style access for all namespaces
-----------------------------------------

Eyal Lotem proposed replacing the globals() and locals() dicts with  
"module" and "frame" objects that would have attribute-style access  
instead of __getitem__-style access.  Josiah Carlson noted that the  
first is already available by doing
``module = __import__(__name__)``, and suggested that monkeying around
with function locals is never a good idea, so adding additional
support for doing so is not useful.

Contributing threads:

- `Early PEP draft (For Python 3000?) <http://mail.python.org/ 
pipermail/python-dev/2005-October/057251.html>`__

[SJB]

---------------------------------
Yielding all items of an iterator
---------------------------------

Gustavo J. A. M. Carneiro was looking for a nicer way of indicating  
that all items of an iterable should be yielded.  Currently, you  
probably want to use a for-loop to express this, e.g.::

     for step in animate(win, xrange(10)): # slide down
         yield step

Andrew Koenig suggested that the syntax::

     yield from <x>

be equivalent to::

     for i in x:
         yield i

People seemed uncertain as to whether or not there were enough use  
cases to merit the additional syntax.
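
For comparison, the two spellings are shown side by side below.  This
sketch assumes Python 3.3 or later, where a ``yield from`` statement
was eventually added by PEP 380; the function names are illustrative.

```python
def explicit(iterable):
    # The for-loop spelling available at the time of the thread.
    for item in iterable:
        yield item


def delegated(iterable):
    # The delegating spelling, added in Python 3.3 by PEP 380.
    yield from iterable
```

For plain iteration the two generators produce the same items.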

Contributing thread:

- `Coroutines, generators, function calling <http://mail.python.org/ 
pipermail/python-dev/2005-October/057405.html>`__

[SJB]

-----------------------------------------
Getting an AST without the Python runtime
-----------------------------------------

Thanks to the merging of the AST branch, Evan Jones was able to fully  
divorce the Python parser from the Python runtime so that you can get
AST objects without having to have Python running.  He made the  
divorced AST parser available on `his site`_.

.. _his site: http://evanjones.ca/software/pyparser.html

Contributing thread:

- `Parser and Runtime: Divorced! <http://mail.python.org/pipermail/ 
python-dev/2005-October/057684.html>`__

[SJB]

===============
Skipped Threads
===============

- `Pythonic concurrency - offtopic <http://mail.python.org/pipermail/ 
python-dev/2005-October/057294.html>`__
- `Sourceforge CVS access <http://mail.python.org/pipermail/python- 
dev/2005-October/057342.html>`__
- `Weekly Python Patch/Bug Summary <http://mail.python.org/pipermail/ 
python-dev/2005-October/057343.html>`__
- `Guido v. Python, Round 1 <http://mail.python.org/pipermail/python- 
dev/2005-October/057366.html>`__
- `Autoloading? (Making Queue.Queue easier to use) <http:// 
mail.python.org/pipermail/python-dev/2005-October/057368.html>`__
- `problem with genexp <http://mail.python.org/pipermail/python-dev/ 
2005-October/057370.html>`__
- `PEP 3000 and exec <http://mail.python.org/pipermail/python-dev/ 
2005-October/057380.html>`__
- `Pythonic concurrency - offtopic <http://mail.python.org/pipermail/ 
python-dev/2005-October/057442.html>`__
- `enumerate with a start index <http://mail.python.org/pipermail/ 
python-dev/2005-October/057459.html>`__
- `list splicing <http://mail.python.org/pipermail/python-dev/2005- 
October/057479.html>`__
- `bool(iter([])) changed between 2.3 and 2.4 <http://mail.python.org/ 
pipermail/python-dev/2005-October/057481.html>`__
- `A solution to the evils of static typing and interfaces? <http:// 
mail.python.org/pipermail/python-dev/2005-October/057485.html>`__
- `PEP 267 -- is the semantics change OK? <http://mail.python.org/ 
pipermail/python-dev/2005-October/057506.html>`__
- `DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16  
<http://mail.python.org/pipermail/python-dev/2005-October/ 
057508.html>`__
- `int(string) (was: DRAFT: python-dev Summary for 2005-09-01 through  
2005-09-16) <http://mail.python.org/pipermail/python-dev/2005-October/ 
057510.html>`__
- `LXR site for Python CVS <http://mail.python.org/pipermail/python- 
dev/2005-October/057511.html>`__
- `int(string) <http://mail.python.org/pipermail/python-dev/2005- 
October/057512.html>`__
- `Comparing date+time w/ just time <http://mail.python.org/pipermail/ 
python-dev/2005-October/057514.html>`__
- `AST reverts PEP 342 implementation and IDLE starts working again  
<http://mail.python.org/pipermail/python-dev/2005-October/ 
057528.html>`__
- `cross compiling python for embedded systems <http:// 
mail.python.org/pipermail/python-dev/2005-October/057534.html>`__
- `Inconsistent Use of Buffer Interface in stringobject.c <http:// 
mail.python.org/pipermail/python-dev/2005-October/057589.html>`__
- `Reminder: PyCon 2006 submissions due in a week <http:// 
mail.python.org/pipermail/python-dev/2005-October/057618.html>`__
- `MinGW and libpython24.a <http://mail.python.org/pipermail/python- 
dev/2005-October/057624.html>`__
- `make testall hanging on HEAD? <http://mail.python.org/pipermail/ 
python-dev/2005-October/057662.html>`__
- `"? operator in python" <http://mail.python.org/pipermail/ 
python-dev/2005-October/057673.html>`__
- `[Docs] MinGW and libpython24.a <http://mail.python.org/pipermail/ 
python-dev/2005-October/057693.html>`__
- `Help with inotify <http://mail.python.org/pipermail/python-dev/ 
2005-October/057705.html>`__
- `[Python-checkins] commit of r41352 - in python/trunk: . Lib Lib/ 
distutils Lib/distutils/command Lib/encodings <http://mail.python.org/ 
pipermail/python-dev/2005-October/057780.html>`__
- `svn:ignore <http://mail.python.org/pipermail/python-dev/2005- 
October/057783.html>`__
- `svn checksum error <http://mail.python.org/pipermail/python-dev/ 
2005-October/057790.html>`__
- `svn:ignore (Was: [Python-checkins] commit of r41352 - in python/ 
trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings)  
<http://mail.python.org/pipermail/python-dev/2005-October/ 
057793.html>`__
- `StreamHandler eating exceptions <http://mail.python.org/pipermail/ 
python-dev/2005-October/057798.html>`__
- `a different kind of reduce... <http://mail.python.org/pipermail/ 
python-dev/2005-October/057814.html>`__


From mozbugbox at yahoo.com.au  Thu Nov 17 05:55:08 2005
From: mozbugbox at yahoo.com.au (JustFillBug)
Date: Thu, 17 Nov 2005 04:55:08 +0000 (UTC)
Subject: [Python-Dev] Problems with the Python Memory Manager
References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org>
	<20051116145820.A43A.JCARLSON@uci.edu>
	<437BC524.2030105@ee.byu.edu>
Message-ID: <slrndno3ou.l63.mozbugbox@mozbugbox.somehost.org>

On 2005-11-16, Travis Oliphant <oliphant at ee.byu.edu> wrote:
> Josiah Carlson wrote:
>>I seemed to have misunderstood the discussion.  Was the original user
>>accessing and saving copies of many millions of these doubles?  
>>
> He *was* accessing them (therefore generating a call to an array-scalar 
> object creation function).  But they *weren't being* saved.  They were 
> being deleted soon after access.   That's why it was so confusing that 
> his memory usage should continue to grow and grow so terribly.
>
> As verified by removing usage of the Python PyObject_MALLOC function, it 
> was the Python memory manager that was performing poorly.   Even though 
> the array-scalar objects were deleted, the memory manager would not 
> re-use their memory for later object creation. Instead, the memory 
> manager kept allocating new arenas to cover the load (when it should 
> have been able to re-use the old memory that had been freed by the 
> deleted objects--- again, I don't know enough about the memory manager 
> to say why this happened).

Well, the user has to call garbage collection before the memory is
freed. Python won't free memory when it can allocate more. It sucks, but
that is my experience with Python. I mean, when Python starts swapping on
my machine, I have to add manual garbage collection calls to my code.




From ronaldoussoren at mac.com  Thu Nov 17 07:06:02 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 17 Nov 2005 07:06:02 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437BE7A8.5000503@ee.byu.edu>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
	<437BE7A8.5000503@ee.byu.edu>
Message-ID: <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>


On 17-nov-2005, at 3:15, Travis Oliphant wrote:

> Jim Jewett wrote:
>
>>
>
>> (2)  Is he allocating new _types_, which I think don't get properly
>>
>> collected.
>>
>>
>
> Bingo.  Yes, definitely allocating new _types_ (an awful lot of  
> them...)
> --- that's what the "array scalars" are: new types created in C.

Do you really mean that someArray[1] will create a new type to represent
the second element of someArray? I would guess that you create an
instance of a type defined in your extension.

Ronald


From fredrik at pythonware.com  Thu Nov 17 09:29:52 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 17 Nov 2005 09:29:52 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
References: <20051116120346.A434.JCARLSON@uci.edu>
	<dlg5gt$q1g$1@sea.gmane.org><20051116145820.A43A.JCARLSON@uci.edu>
	<437BC524.2030105@ee.byu.edu>
Message-ID: <dlhf20$dib$1@sea.gmane.org>

Travis Oliphant wrote:

> The fact that it did happen is what I'm reporting on.  If nothing will
> be done about it (which I can understand), at least this thread might
> help somebody else in a similar situation track down why their Python
> process consumes all of their memory even though their objects are being
> deleted appropriately.

since that doesn't happen in other applications, I'm not sure this thread
will help much -- unless you can provide us with enough details to figure
out what makes this case so much different...

</F> 




From fredrik at pythonware.com  Thu Nov 17 09:44:06 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 17 Nov 2005 09:44:06 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
	<437BE7A8.5000503@ee.byu.edu>
Message-ID: <dlhfsm$fq6$1@sea.gmane.org>

Travis Oliphant wrote:

> Bingo.  Yes, definitely allocating new _types_ (an awful lot of them...)
> --- that's what the "array scalars" are: new types created in C.

are you allocating PyTypeObject structures dynamically?

why are you creating an awful lot of new type objects to represent the
contents of a homogenous array?

> If they don't get properly collected then that would definitely have
> created the problem.  It would seem this should be advertised when
> telling people to use PyObject_New for allocating new memory for
> an object.

PyObject_New creates a new instance of a given type; it doesn't, in itself,
create a new type.

at this point, your description doesn't make much sense.  more information
is definitely needed...

</F> 




From mwh at python.net  Thu Nov 17 10:42:23 2005
From: mwh at python.net (Michael Hudson)
Date: Thu, 17 Nov 2005 09:42:23 +0000
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437BE7A8.5000503@ee.byu.edu> (Travis Oliphant's message of
	"Wed, 16 Nov 2005 19:15:04 -0700")
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
	<437BE7A8.5000503@ee.byu.edu>
Message-ID: <2mek5ftxts.fsf@starship.python.net>

Travis Oliphant <oliphant at ee.byu.edu> writes:

> Bingo.  Yes, definitely allocating new _types_ (an awful lot of them...) 
> --- that's what the "array scalars" are: new types created in C.

Ah!  And, er, why?

> If they don't get properly collected then that would definitely have
> created the problem.

types do get collected -- but only after the cycle collector has run.
If you can still reproduce the problem can you try again but calling
'gc.set_threshold(1)'?

> It would seem this should be advertised when telling people to use
> PyObject_New for allocating new memory for an object.

Nevertheless, I think it would be good if pymalloc freed its arenas.
I think the reason it doesn't is the worry that people might be
calling PyObject_Free without holding the GIL, but that's been
verboten for several years now so we can probably just let them
suffer.

I think there's even a patch on SF to do this...

Cheers,
mwh

-- 
  The use of COBOL cripples the mind; its teaching should, therefore,
  be regarded as a criminal offence.
           -- Edsger W. Dijkstra, SIGPLAN Notices, Volume 17, Number 5

From oliphant at ee.byu.edu  Thu Nov 17 11:00:10 2005
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu, 17 Nov 2005 03:00:10 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
	<437BE7A8.5000503@ee.byu.edu>
	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>
Message-ID: <437C54AA.9020203@ee.byu.edu>


>>
>> Bingo.  Yes, definitely allocating new _types_ (an awful lot of  
>> them...)
>> --- that's what the "array scalars" are: new types created in C.
>
>
> Do you really mean that someArray[1] will create a new type to represent
> the second element of someArray? I would guess that you create an
> instance of a type defined in your extension.


O.K.  my bad.   I can see that I was confusing in my recent description 
and possibly misunderstood the questions I was asked.   It can get 
confusing given the dynamic nature of Python.

The array scalars are new statically defined (in C) types (just like 
regular Python integers and regular Python floats).   The ndarray is 
also a statically defined type.  The ndarray holds raw memory 
interpreted in a certain fashion (very similar to Python's array 
module).   Each ndarray can have a certain data type.   For every data 
type that an array can be, there is a corresponding "array scalar" 
type.   All of these are statically defined types.   We are only talking 
about instances of these defined types. 

When the result of a user operation with an ndarray is a scalar, an 
instance of the appropriate "array scalar" type is created and passed 
back to the user.   Previously we were using PyObject_New in the 
tp_alloc slot and PyObject_Del in the tp_free slot of the typeobject 
structure in order to create and destroy the memory for these instances.

In this particular application, the user ended up creating many, many 
instances of these array scalars and then deleting them soon after.  
Despite the fact that he was not retaining any references to these 
scalars (PyObject_Del had been called on them), his application crawled 
to a halt after only several hundred iterations, consuming all of the 
available system memory.   To verify that indeed no references were 
being kept, I did a detailed analysis of the result of sys.getobjects() 
using a debug build of Python.

When I replaced PyObject_New (with malloc and PyObject_Init) and 
PyObject_Del (with free) for  the "array scalars" types in scipy core,  
the user's memory problems magically disappeared. 

I therefore assume that the problem is the memory manager in Python.   
Initially, I thought this was the old problem of Python not freeing 
memory once it grabs it.  But, that should not have been a problem here, 
because the code quickly frees most of the objects it creates and so 
Python should have been able to re-use the memory. 

So, I now believe that his code (plus the array scalar extension type) 
was actually exposing a real bug in the memory manager itself.  In 
theory, the Python memory manager should have been able to re-use the 
memory for the array-scalar instances because they are always the same 
size.  In practice, the memory was apparently not being re-used but 
instead new blocks were being allocated to handle the load.

His code is quite complicated and it is difficult to replicate the 
problem.   I realize this is not helpful for fixing the Python memory 
manager, and I wish I could be more helpful.   However,  replacing 
PyObject_New with malloc does solve the problem for us and that may help 
anybody else in this situation in the future. 

Best regards,

-Travis






From walter at livinglogic.de  Thu Nov 17 21:21:48 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Thu, 17 Nov 2005 21:21:48 +0100
Subject: [Python-Dev] Iterating a closed StringIO
Message-ID: <437CE65C.7010107@livinglogic.de>

Currently StringIO.StringIO and cStringIO.StringIO behave differently 
when iterating a closed stream:

s = StringIO.StringIO("foo")
s.close()
s.next()

gives StopIteration, but

s = cStringIO.StringIO("foo")
s.close()
s.next()

gives "ValueError: I/O operation on closed file".

Should they raise the same exception? Should this be fixed for 2.5?

Bye,
    Walter Dörwald

From bcannon at gmail.com  Thu Nov 17 21:46:15 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 17 Nov 2005 12:46:15 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <dlf7ak$ckg$1@sea.gmane.org>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
Message-ID: <bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com>

On 11/16/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
> Thomas Lee wrote:
>
> > Even if it meant we had just one function call - one, safe function call
> > that deallocated all the memory allocated within a function - that we
> > had to put before each and every return, that's better than what we
> > have.
>
> alloca?
>
> (duck)
>

But how widespread is its support (e.g., does Windows have it)?

-Brett

From bcannon at gmail.com  Thu Nov 17 21:56:30 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 17 Nov 2005 12:56:30 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <437B3EF8.2030001@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<437B00BE.7060007@gmail.com> <437B2FE6.7080206@gmail.com>
	<437B3EF8.2030001@gmail.com>
Message-ID: <bbaeab100511171256h57ce0596rab12bd197b1e6da6@mail.gmail.com>

On 11/16/05, Thomas Lee <krumms at gmail.com> wrote:
> Just messing around with some ideas. I was trying to avoid the ugly
> macros (note my earlier whinge about a learning curve) but they're the
> cleanest way I could think of to get around the problem without
> resorting to a mass deallocation right at the end of the AST run. Which
> may not be all that bad given we're going to keep everything in-memory
> anyway until an error occurs ... anyway, anyway, I'm getting sidetracked :)
>
> The idea is to ensure that all allocations within a single function are
> made using the pool so that a function finishes what it starts. This
> way, if the function fails it alone is responsible for cleaning up its
> own pool and that's all. No funkyness needed for sequences, because each
> member of the sequence belongs to the pool too. Note that the stmt_ty
> instances are also allocated using the pool.
>
> This breaks interfaces all over the place though. Not exactly a pretty
> change :) But yeah, maybe somebody smarter than I will come up with
> something a bit cleaner.
>
> --
>
> /* snip! */
>
> #define AST_SUCCESS(pool, result) return result
> #define AST_FAILURE(pool, result) asdl_pool_free(pool); return result
>

This is actually exactly what I was thinking of; macros that handle
returns and specify whether the return signals a success or failure.

One tweak I would make is to possibly lock down the variable name with
AST_POOL_ALLOC() at the start of a function that creates _arena_pool. 
That way you don't need to pass in the specific pool.  I don't see why
we will need to have multiple pools within a function.  This also
allows the VISIT_* macros to be easily modified and not suddenly
require another argument to specify the arena name.

And all of this is easy to police since you can grep for 'return' and
make sure that each one is meant to be there and should not actually be
one of the macros.  This basically gives us the mini-language that Nick
mentioned way back at the beginning of this thread.

Oh, and tweak the macros to be within ``do { ... } while(0)`` (``if
(1) AST_FAILURE(pool, NULL);`` will not expand properly otherwise).

-Brett

From guido at python.org  Thu Nov 17 22:03:49 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 Nov 2005 13:03:49 -0800
Subject: [Python-Dev] Iterating a closed StringIO
In-Reply-To: <437CE65C.7010107@livinglogic.de>
References: <437CE65C.7010107@livinglogic.de>
Message-ID: <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com>

On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote:
> Currently StringIO.StringIO and cStringIO.StringIO behave differently
> when iterating a closed stream:
>
> s = StringIO.StringIO("foo")
> s.close()
> s.next()
>
> gives StopIteration, but
>
> s = cStringIO.StringIO("foo")
> s.close()
> s.next()
>
> gives "ValueError: I/O operation on closed file".
>
> Should they raise the same exception? Should this be fixed for 2.5?

I think cStringIO is doing the right thing; "real" files behave the same way.

Submit a patch for StringIO (also docs please) and assign it to me and
I'll make sure it goes in.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at gmail.com  Thu Nov 17 23:27:41 2005
From: aleaxit at gmail.com (Alex Martelli)
Date: Thu, 17 Nov 2005 14:27:41 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com>
Message-ID: <80315A07-6C80-4E27-9CA8-F62719775307@gmail.com>


On Nov 17, 2005, at 12:46 PM, Brett Cannon wrote:
    ...
>> alloca?
>>
>> (duck)
>>
>
> But how widespread is its support (e.g., does Windows have it)?

Yep, spelled with a leading underscore:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt__alloca.asp


Alex


From blais at furius.ca  Fri Nov 18 00:26:22 2005
From: blais at furius.ca (Martin Blais)
Date: Thu, 17 Nov 2005 18:26:22 -0500
Subject: [Python-Dev] Coroutines (PEP 342)
In-Reply-To: <1147958111.20051114154658@gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
	<ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
	<4359047B.6020203@gmail.com> <1147958111.20051114154658@gmail.com>
Message-ID: <8393fff0511171526o37738c50iabbb2f73eb59c56e@mail.gmail.com>

On 11/14/05, Bruce Eckel <BruceEckel-Python3234 at mailblocks.com> wrote:
> I just finished reading PEP 342, and it appears to follow Hoare's
> Communicating Sequential Processes (CSP) where a process is a
> coroutine, and the communication is via yield and send(). It seems that
> if you follow that form (and you don't seem forced to, pythonically),
> then synchronization is not an issue.
>
> What is not clear to me, and is not discussed in the PEP, is whether
> coroutines can be distributed among multiple processors. If that is or
> isn't possible I think it should be explained in the PEP, and I'd be
> interested in knowing about it here (and ideally why it would or wouldn't
> work).

It seems to me that coroutines and PEP 342 have very little to do
with concurrency itself. Generators form very convenient units of
parallelization if you're willing to schedule them yourself, but they
run concurrently only *potentially*, i.e. only if you write a
scheduler that supports running generator iterations on two
processors at once. Otherwise there is no concurrency abstraction,
unlike with threads: it's cooperative, and you can clearly see in the
code the points where "switching" occurs (next(...), yield ...).

Please beat me with a stick if this is lunatic...
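
To make the "do some scheduling of them yourself" point concrete, here is a minimal sketch of a cooperative round-robin scheduler over plain generators. The names worker and run_round_robin are hypothetical, it uses Python 3 syntax, and it runs everything on a single processor, so it shows the switching points and not any real parallelism.

```python
from collections import deque

def worker(name, steps):
    """A task that hands control back to the scheduler at every yield."""
    for i in range(steps):
        yield f"{name}:{i}"  # explicit, visible switch point

def run_round_robin(tasks):
    """Advance each generator one step at a time until all are exhausted."""
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))  # resume the task until its next yield
        except StopIteration:
            continue                  # task finished; do not requeue it
        queue.append(task)
    return trace

trace = run_round_robin([worker("a", 2), worker("b", 3)])
print(trace)
```

The two tasks interleave only at the yield statements, which is the cooperative behaviour described above.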

From walter at livinglogic.de  Fri Nov 18 01:03:26 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 18 Nov 2005 01:03:26 +0100
Subject: [Python-Dev] Iterating a closed StringIO
In-Reply-To: <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com>
References: <437CE65C.7010107@livinglogic.de>
	<ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com>
Message-ID: <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de>

On 17.11.2005 at 22:03, Guido van Rossum wrote:

> On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote:
>> Currently StringIO.StringIO and cStringIO.StringIO behave differently
>> when iterating a closed stream:
>>
>> s = StringIO.StringIO("foo")
>> s.close()
>> s.next()
>>
>> gives StopIteration, but
>>
>> s = cStringIO.StringIO("foo")
>> s.close()
>> s.next()
>>
>> gives "ValueError: I/O operation on closed file".
>>
>> Should they raise the same exception? Should this be fixed for 2.5?
>
> I think cStringIO is doing the right thing; "real" files behave the  
> same way.
>
> Submit a patch for StringIO (also docs please) and assign it to me and
> I'll make sure it goes in.

http://www.python.org/sf/1359365

Doc/lib/libstringio.tex only states "See the description of file  
objects for operations", so I'm not sure how to update the  
documentation.

Bye,
    Walter Dörwald


From krumms at gmail.com  Fri Nov 18 02:00:56 2005
From: krumms at gmail.com (Thomas Lee)
Date: Fri, 18 Nov 2005 11:00:56 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <80315A07-6C80-4E27-9CA8-F62719775307@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>	<437B2075.1000102@gmail.com>
	<dlf7ak$ckg$1@sea.gmane.org>	<bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com>
	<80315A07-6C80-4E27-9CA8-F62719775307@gmail.com>
Message-ID: <437D27C8.9040703@gmail.com>

Portability may also be an issue to take into consideration:

http://www.eskimo.com/~scs/C-faq/q7.32.html
http://archives.neohapsis.com/archives/postfix/2001-05/1305.html

Cheers,
Tom

Alex Martelli wrote:

>On Nov 17, 2005, at 12:46 PM, Brett Cannon wrote:
>    ...
>  
>
>>>alloca?
>>>
>>>(duck)
>>>
>>>      
>>>
>>But how widespread is its support (e.g., does Windows have it)?
>>    
>>
>
>Yep, spelled with a leading underscore:
>http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt__alloca.asp
>
>
>Alex
>


From guido at python.org  Fri Nov 18 02:16:23 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 Nov 2005 17:16:23 -0800
Subject: [Python-Dev] Iterating a closed StringIO
In-Reply-To: <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de>
References: <437CE65C.7010107@livinglogic.de>
	<ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com>
	<140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de>
Message-ID: <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com>

On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote:
> On 17.11.2005 at 22:03, Guido van Rossum wrote:
>
> > On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote:
> >> Currently StringIO.StringIO and cStringIO.StringIO behave differently
> >> when iterating a closed stream:
> >>
> >> s = StringIO.StringIO("foo")
> >> s.close()
> >> s.next()
> >>
> >> gives StopIteration, but
> >>
> >> s = cStringIO.StringIO("foo")
> >> s.close()
> >> s.next()
> >>
> >> gives "ValueError: I/O operation on closed file".
> >>
> >> Should they raise the same exception? Should this be fixed for 2.5?
> >
> > I think cStringIO is doing the right thing; "real" files behave the
> > same way.
> >
> > Submit a patch for StringIO (also docs please) and assign it to me and
> > I'll make sure it goes in.
>
> http://www.python.org/sf/1359365

Thanks!

> Doc/lib/libstringio.tex only states "See the description of file
> objects for operations", so I'm not sure how to update the
> documentation.

OK, so that's a no-op.

I hope there isn't anyone here who believes this patch would be a bad idea?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at gmail.com  Fri Nov 18 02:19:45 2005
From: aleaxit at gmail.com (Alex Martelli)
Date: Thu, 17 Nov 2005 17:19:45 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <437D27C8.9040703@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>	<437B2075.1000102@gmail.com>
	<dlf7ak$ckg$1@sea.gmane.org>	<bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com>
	<80315A07-6C80-4E27-9CA8-F62719775307@gmail.com>
	<437D27C8.9040703@gmail.com>
Message-ID: <7E441ACF-1ADE-4141-953E-C64272D0629D@gmail.com>


On Nov 17, 2005, at 5:00 PM, Thomas Lee wrote:

> Portability may also be an issue to take into consideration:

Of course -- but so is anno domini... the eskimo.com FAQ is (C) 1995,  
and the neohapsis.com page just points to the eskimo.com one:

> http://www.eskimo.com/~scs/C-faq/q7.32.html
> http://archives.neohapsis.com/archives/postfix/2001-05/1305.html

In 2006, I'm not sure the need to avoid alloca is anywhere near as
strong.  Sure, it could be wrapped into a LOCALLOC macro (with a
companion LOCFREE one), the macro expanding to alloca/nothing on
systems which do have alloca and to malloc/free elsewhere -- this
would keep the sources just as cluttered, but still speed things up
where feasible.  E.g., on my iBook, a silly benchmark just freeing
and allocating 80,000 hunks of 1000 bytes takes 13ms with alloca, 57ms
without...


Alex


From walter at livinglogic.de  Fri Nov 18 11:29:17 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 18 Nov 2005 11:29:17 +0100
Subject: [Python-Dev] Iterating a closed StringIO
In-Reply-To: <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com>
References: <437CE65C.7010107@livinglogic.de>
	<ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com>
	<140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de>
	<ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com>
Message-ID: <5D3C125B-A1D4-476A-BF5C-51346238A0F6@livinglogic.de>

On 18.11.2005 at 02:16, Guido van Rossum wrote:

> On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote:
>> On 17.11.2005 at 22:03, Guido van Rossum wrote:
>>
>>> On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote:
>>>> [...]
>>>> Should they raise the same exception? Should this be fixed for 2.5?
>>>
>>> I think cStringIO is doing the right thing; "real" files behave the
>>> same way.
>>>
>>> Submit a patch for StringIO (also docs please) and assign it to  
>>> me and
>>> I'll make sure it goes in.
>>
>> http://www.python.org/sf/1359365
>
> Thanks!
>
>> Doc/lib/libstringio.tex only states "See the description of file
>> objects for operations", so I'm not sure how to update the
>> documentation.
>
> OK, so that's a no-op.
>
> I hope there isn't anyone here who believes this patch would be a  
> bad idea?

I wonder whether we should introduce a new exception class for these
kinds of error. IMHO ValueError is much too unspecific.

What about StateError?

Bye,
    Walter Dörwald


From walter at livinglogic.de  Fri Nov 18 13:15:33 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 18 Nov 2005 13:15:33 +0100
Subject: [Python-Dev] isatty() on closed StringIO (was: Iterating a closed
	StringIO)
In-Reply-To: <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com>
References: <437CE65C.7010107@livinglogic.de>	
	<ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com>	
	<140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de>
	<ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com>
Message-ID: <437DC5E5.1030302@livinglogic.de>

Guido van Rossum wrote:

> On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote:
> 
>>On 17.11.2005 at 22:03, Guido van Rossum wrote:
>>
>>
>>>On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote:
>>>
>>>>Currently StringIO.StringIO and cStringIO.StringIO behave differently
>>>>when iterating a closed stream:
 > [...]
> 
> I hope there isn't anyone here who believes this patch would be a bad idea?

BTW, isatty() has a similar problem:

 >>> import StringIO, cStringIO
 >>> s = StringIO.StringIO()
 >>> s.close()
 >>> s.isatty()
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/usr/local/lib/python2.4/StringIO.py", line 93, in isatty
     _complain_ifclosed(self.closed)
   File "/usr/local/lib/python2.4/StringIO.py", line 40, in _complain_ifclosed
     raise ValueError, "I/O operation on closed file"
ValueError: I/O operation on closed file
 >>> s = cStringIO.StringIO()
 >>> s.close()
 >>> s.isatty()
False

I guess cStringIO.StringIO.isatty() should raise an exception too.
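
As a point of reference only, a small sketch of how the in-memory streams in Python 3's io module treat isatty() after close(). This is an assumption-laden comparison with a later implementation, not a statement about cStringIO in 2.4, and the helper name isatty_after_close is invented here.

```python
import io

def isatty_after_close(stream):
    """Return isatty()'s result on a closed stream, or the exception name."""
    stream.close()
    try:
        return stream.isatty()
    except Exception as exc:
        return type(exc).__name__

results = {
    "StringIO": isatty_after_close(io.StringIO()),
    "BytesIO": isatty_after_close(io.BytesIO()),
}
print(results)
```

Both classes raise instead of answering False, which matches the behaviour proposed above.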

Bye,
    Walter Dörwald

From walter at livinglogic.de  Fri Nov 18 14:26:09 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 18 Nov 2005 14:26:09 +0100
Subject: [Python-Dev] Another StringIO/cStringIO discrepancy
Message-ID: <437DD671.40809@livinglogic.de>

 >>> import StringIO, cStringIO
 >>> s = StringIO.StringIO()
 >>> s.truncate(-42)
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/usr/local/lib/python2.4/StringIO.py", line 203, in truncate
     raise IOError(EINVAL, "Negative size not allowed")
IOError: [Errno 22] Negative size not allowed
 >>> s = cStringIO.StringIO()
 >>> s.truncate(-42)
 >>>

From michael.walter at gmail.com  Fri Nov 18 14:32:44 2005
From: michael.walter at gmail.com (Michael Walter)
Date: Fri, 18 Nov 2005 14:32:44 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <7E441ACF-1ADE-4141-953E-C64272D0629D@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com>
	<80315A07-6C80-4E27-9CA8-F62719775307@gmail.com>
	<437D27C8.9040703@gmail.com>
	<7E441ACF-1ADE-4141-953E-C64272D0629D@gmail.com>
Message-ID: <877e9a170511180532nc1ba329m48f1e61e2338e6df@mail.gmail.com>

The behavior of libiberty's alloca() replacement might be interesting as well:

http://gcc.gnu.org/onlinedocs/libiberty/Functions.html#index-alloca-59

Regards,
Michael

On 11/18/05, Alex Martelli <aleaxit at gmail.com> wrote:
>
> On Nov 17, 2005, at 5:00 PM, Thomas Lee wrote:
>
> > Portability may also be an issue to take into consideration:
>
> Of course -- but so is anno domini... the eskimo.com FAQ is (C) 1995,
> and the neohapsis.com page just points to the eskimo.com one:
>
> > http://www.eskimo.com/~scs/C-faq/q7.32.html
> > http://archives.neohapsis.com/archives/postfix/2001-05/1305.html
>
> In 2006, I'm not sure the need to avoid alloca is anywhere near as
> strong.  Sure, it could be wrapped into a LOCALLOC macro (with a
> companion LOCFREE one), the macro expanding to alloca/nothing on
> systems which do have alloca and to malloc/free elsewhere -- this
> would keep the sources just as cluttered, but still speed things up
> where feasible.  E.g., on my iBook, a silly benchmark just freeing
> and allocating 80,000 hunks of 1000 bytes takes 13ms with alloca, 57ms
> without...
>
>
> Alex
>

From ncoghlan at gmail.com  Fri Nov 18 14:57:19 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 18 Nov 2005 23:57:19 +1000
Subject: [Python-Dev] Iterating a closed StringIO
In-Reply-To: <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com>
References: <437CE65C.7010107@livinglogic.de>	<ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com>	<140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de>
	<ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com>
Message-ID: <437DDDBF.7010309@gmail.com>

Guido van Rossum wrote:
> 
> I hope there isn't anyone here who believes this patch would be a bad idea?

Not me, but the Iterator protocol docs may need a minor tweak. Currently they 
say this:

"The intention of the protocol is that once an iterator's next() method raises 
StopIteration, it will continue to do so on subsequent calls. Implementations 
that do not obey this property are deemed broken."

This wording is a bit too strong, as it's perfectly acceptable for an object 
to provide other methods which affect the result of subsequent calls to the 
next() method (examples being the seek() and close() methods in the file 
interface).

The current wording does describe the basic intent of the API correctly, but 
you could be forgiven for thinking that it ruled out modifying the state of a 
completed iterator in a way that restarts it, or causes it to raise an 
exception other than StopIteration.
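
A short sketch of the seek() case, assuming Python 3's io.StringIO as a stand-in for a file object: the iterator is exhausted, and a call to another method on the same object then restarts it.

```python
import io

stream = io.StringIO("one\ntwo\n")

first_pass = list(stream)               # consume every line
exhausted = next(stream, None) is None  # the iterator is now finished

stream.seek(0)                          # another method changes the state...
restarted = next(stream)                # ...and next() yields data again

print(first_pass, exhausted, repr(restarted))
```

So a "completed" iterator producing values again is not necessarily a broken one; here it is the documented effect of seek().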

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From raymond.hettinger at verizon.net  Fri Nov 18 15:29:38 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 18 Nov 2005 09:29:38 -0500
Subject: [Python-Dev] Iterating a closed StringIO
In-Reply-To: <437DDDBF.7010309@gmail.com>
Message-ID: <005401c5ec4c$81d4be40$91af2c81@oemcomputer>

[Guido van Rossum]
> > I hope there isn't anyone here who believes this patch would be a
> > bad idea?

[Nick Coghlan]
> Not me, but the Iterator protocol docs may need a minor tweak.
> Currently they say this:
>
> "The intention of the protocol is that once an iterator's next()
> method raises StopIteration, it will continue to do so on subsequent
> calls. Implementations that do not obey this property are deemed
> broken."


FWIW, here is wording for PEP 342's close() method:

""" 4. Add a close() method for generator-iterators, which raises
       GeneratorExit at the point where the generator was paused.  If
       the generator then raises StopIteration (by exiting normally, or
       due to already being closed) or GeneratorExit (by not catching
       the exception), close() returns to its caller.  If the generator
       yields a value, a RuntimeError is raised.  If the generator
       raises any other exception, it is propagated to the caller.
       close() does nothing if the generator has already exited due to
       an exception or normal exit. """
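
The rules in that wording can be exercised with a small sketch. It is written against the generator close() method as it exists in current Python; the generator and helper names are hypothetical.

```python
def cooperative():
    try:
        yield "working"
    except GeneratorExit:
        raise  # let the exit propagate, so close() returns to its caller

def stubborn():
    try:
        yield "working"
    except GeneratorExit:
        yield "refusing to stop"  # yielding a value while being closed

def close_outcome(make_generator):
    gen = make_generator()
    next(gen)  # pause the generator at its first yield
    try:
        gen.close()
    except RuntimeError:
        return "RuntimeError"
    return "closed quietly"

outcomes = [close_outcome(cooperative), close_outcome(stubborn)]

finished = cooperative()
list(finished)    # run the generator to its normal end
finished.close()  # closing an already-finished generator does nothing

print(outcomes)
```

The first generator lets GeneratorExit through and close() simply returns; the second yields a value during close() and triggers the RuntimeError the text describes.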


For Walter's original question, my preference is to change the behavior
of regular files to raise StopIteration when next() is called on an
iterator for a closed file.  The current behavior is an implementation
artifact stemming from a file being its own iterator object.  In the
future, it is possible that iter(somefileobject) will return a distinct
iterator object and perhaps allow multiple, distinct iterators over the
same file.

Also, it is sometimes nice to wrap one iterator with another (perhaps
with itertools or somesuch).  That use case depends on the underlying
iterator raising StopIteration instead of some other exception:

    f = open(somefilename)
    for lineno, line in enumerate(f): 
        . . .
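
A sketch of what the wrapping case looks like when the underlying stream is closed part-way through. It assumes Python 3's io.StringIO in place of a real file, and only illustrates how the exception travels through the wrapper, not which behaviour is preferable.

```python
import io

stream = io.StringIO("a\nb\n")
wrapped = enumerate(stream)  # enumerate() just forwards next() calls

first = next(wrapped)        # the stream is still open here
stream.close()

try:
    next(wrapped)
    outcome = "no error"
except StopIteration:
    outcome = "StopIteration"  # would end a for loop silently
except ValueError:
    outcome = "ValueError"     # surfaces through the wrapper instead

print(first, outcome)
```

With this stream the wrapper does not end quietly; the ValueError from the closed stream reaches the caller.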


Raymond


From jimjjewett at gmail.com  Fri Nov 18 16:29:23 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 18 Nov 2005 10:29:23 -0500
Subject: [Python-Dev] Memory management in the AST parser & compiler
Message-ID: <fb6fbf560511180729y1037f23cv832cd5edc1f1c327@mail.gmail.com>

There is a public-domain implementation of alloca at

http://www.cs.purdue.edu/homes/apm/courses/BITSC461-fall03/listen-code/listen-1.0-dave/lsl_cpp/alloca.c

It would still fail on architectures that don't use a stack frame; other
than that, it seems like a reasonable fallback, if alloca is otherwise
desirable.

-jJ

From guido at python.org  Fri Nov 18 16:49:55 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 Nov 2005 07:49:55 -0800
Subject: [Python-Dev] Iterating a closed StringIO
In-Reply-To: <005401c5ec4c$81d4be40$91af2c81@oemcomputer>
References: <437DDDBF.7010309@gmail.com>
	<005401c5ec4c$81d4be40$91af2c81@oemcomputer>
Message-ID: <ca471dc20511180749x1f8fa1dfn7e79ce7070af88be@mail.gmail.com>

On 11/18/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> For Walter's original question, my preference is to change the behavior
> of regular files to raise StopIteration when next() is called on an
> iterator for a closed file.

I disagree. As long as there is a possibility that you might still
want to use the iterator (even if it's exhausted) you shouldn't close
the file. Closing a file is a strong indicator that you believe that
there is no more use of the file, and *all* file methods change their
behavior at that point; e.g. read() on a closed file raises an
exception instead of returning an empty string. This is to catch the
*bug* of closing a file that is still being used.

Now it's questionable whether ValueError is the best exception in this
case, since that is an exception which reasonable programmers often
catch (e.g. when parsing a string that's supposed to represent an
int). But I propose to leave the choice of exception reform for Python
3000.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri Nov 18 16:52:12 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 Nov 2005 07:52:12 -0800
Subject: [Python-Dev] Another StringIO/cStringIO discrepancy
In-Reply-To: <437DD671.40809@livinglogic.de>
References: <437DD671.40809@livinglogic.de>
Message-ID: <ca471dc20511180752m5658acd7lc43e7762e3063e47@mail.gmail.com>

On 11/18/05, Walter Dörwald <walter at livinglogic.de> wrote:
>  >>> import StringIO, cStringIO
>  >>> s = StringIO.StringIO()
>  >>> s.truncate(-42)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
>    File "/usr/local/lib/python2.4/StringIO.py", line 203, in truncate
>      raise IOError(EINVAL, "Negative size not allowed")
> IOError: [Errno 22] Negative size not allowed
>  >>> s = cStringIO.StringIO()
>  >>> s.truncate(-42)
>  >>>

Well, what does a regular file say in this case?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From walter at livinglogic.de  Fri Nov 18 17:30:33 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 18 Nov 2005 17:30:33 +0100
Subject: [Python-Dev] Another StringIO/cStringIO discrepancy
In-Reply-To: <ca471dc20511180752m5658acd7lc43e7762e3063e47@mail.gmail.com>
References: <437DD671.40809@livinglogic.de>
	<ca471dc20511180752m5658acd7lc43e7762e3063e47@mail.gmail.com>
Message-ID: <437E01A9.40208@livinglogic.de>

Guido van Rossum wrote:

> On 11/18/05, Walter Dörwald <walter at livinglogic.de> wrote:
> 
>> >>> import StringIO, cStringIO
>> >>> s = StringIO.StringIO()
>> >>> s.truncate(-42)
>>Traceback (most recent call last):
>>   File "<stdin>", line 1, in ?
>>   File "/usr/local/lib/python2.4/StringIO.py", line 203, in truncate
>>     raise IOError(EINVAL, "Negative size not allowed")
>>IOError: [Errno 22] Negative size not allowed
>> >>> s = cStringIO.StringIO()
>> >>> s.truncate(-42)
>> >>>
> 
> 
> Well, what does a regular file say in this case?

IOError: [Errno 22] Invalid argument
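
For what it's worth, a sketch of the same negative-size check against the in-memory streams of Python 3's io module. These are later implementations, so this is only an assumption-labelled comparison, and the helper name truncate_negative is made up.

```python
import io

def truncate_negative(stream):
    """Call truncate(-42) and report the exception name, if any."""
    try:
        stream.truncate(-42)
    except Exception as exc:
        return type(exc).__name__
    return "no error"

results = {
    cls.__name__: truncate_negative(cls())
    for cls in (io.StringIO, io.BytesIO)
}
print(results)
```

Neither class silently accepts the negative size.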

Bye,
    Walter Dörwald

From walter at livinglogic.de  Fri Nov 18 17:51:51 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 18 Nov 2005 17:51:51 +0100
Subject: [Python-Dev] isatty() on closed StringIO
In-Reply-To: <20051118101801.U90899@familjen.svensson.org>
References: <437CE65C.7010107@livinglogic.de>
	<ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com>
	<140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de>
	<ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com>
	<437DC5E5.1030302@livinglogic.de>
	<20051118101801.U90899@familjen.svensson.org>
Message-ID: <437E06A7.1030908@livinglogic.de>

Paul Svensson wrote:

> On Fri, 18 Nov 2005, Walter Dörwald wrote:
> 
>> BTW, isatty() has a similar problem:
>>
>> >>> import StringIO, cStringIO
>> >>> s = StringIO.StringIO()
>> >>> s.close()
>> >>> s.isatty()
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in ?
>>   File "/usr/local/lib/python2.4/StringIO.py", line 93, in isatty
>>     _complain_ifclosed(self.closed)
>>   File "/usr/local/lib/python2.4/StringIO.py", line 40, in _complain_ifclosed
>>     raise ValueError, "I/O operation on closed file"
>> ValueError: I/O operation on closed file
>> >>> s = cStringIO.StringIO()
>> >>> s.close()
>> >>> s.isatty()
>> False
>>
>> I guess cStringIO.StringIO.isatty() should raise an exception too.
> 
> 
> Why ?  Is there any doubt that it's not a tty ?

No, but for real files a ValueError is raised too.

Bye,
    Walter Dörwald

From martin at v.loewis.de  Fri Nov 18 23:13:25 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 18 Nov 2005 23:13:25 +0100
Subject: [Python-Dev] str.dedent
In-Reply-To: <b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com>
References: <dga72k$cah$1@sea.gmane.org>	
	<b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com>	
	<43777B5A.6030602@egenix.com>	
	<200511140920.51724.gmccaughan@synaptics-uk.com>	
	<437869DD.7040800@egenix.com>	
	<b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com>	
	<dlaqds$8sb$1@sea.gmane.org>	
	<b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com>	
	<43791442.8050109@v.loewis.de>
	<b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com>
Message-ID: <437E5205.2010001@v.loewis.de>

Noam Raphael wrote:
> I just wanted to add another use case: long messages. Consider those
> lines from idlelib/run.py:133
> 
>         msg = "IDLE's subprocess can't connect to %s:%d.  This may be due "\
>               "to your personal firewall configuration.  It is safe to "\
>               "allow this internal connection because no data is visible on "\
>               "external ports." % address
>         tkMessageBox.showerror("IDLE Subprocess Error", msg, parent=root)

You are missing an important point here: There are intentionally no line
breaks in this string; it must be a single line, or else showerror will
break it in funny ways. So converting it to a multi-line string would
break it, dedent or not.
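
A small sketch of the difference, using a shortened version of the message and a made-up address tuple. textwrap.dedent stands in for the proposed str.dedent, which is an assumption about how such a method would behave.

```python
import textwrap

address = ("127.0.0.1", 8833)  # hypothetical host and port

# Adjacent string literals are joined at compile time: no line breaks.
single = ("IDLE's subprocess can't connect to %s:%d.  This may be due "
          "to your personal firewall configuration." % address)

# A dedented triple-quoted string keeps the newline between its lines.
multi = textwrap.dedent("""\
    IDLE's subprocess can't connect to %s:%d.  This may be due
    to your personal firewall configuration.""" % address)

print("\n" in single, "\n" in multi)
```

Getting the single-line form back from the dedented one takes an extra step such as " ".join(multi.splitlines()).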

Regards,
Martin

From guido at python.org  Sat Nov 19 05:44:51 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 Nov 2005 20:44:51 -0800
Subject: [Python-Dev] Enjoy a week without me
Message-ID: <ca471dc20511182044v59c3de85gc58a5814d5330713@mail.gmail.com>

Folks, I'm off for a week with my wife's family (and one unlucky
turkey :-) in a place where I can't care about email. I will be back
here on Monday Nov 28.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From kbk at shore.net  Sat Nov 19 07:34:23 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat, 19 Nov 2005 01:34:23 -0500 (EST)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200511190634.jAJ6YNMh017166@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  379 open (+14) /  2968 closed ( +7) /  3347 total (+21)
Bugs    :  910 open ( +6) /  5384 closed (+17) /  6294 total (+23)
RFE     :  200 open ( +0) /   191 closed ( +2) /   391 total ( +2)

New / Reopened Patches
______________________

PythonD DJGPP-specific patch set for porting to DOS.  (2005-11-08)
       http://python.org/sf/1351020  opened by  Ben Decker

PythonD new file: python2.4/plat-ms-dos5/djstat.py  (2005-11-08)
       http://python.org/sf/1351036  opened by  Ben Decker

askyesnocancel helper for tkMessageBox  (2005-11-08)
       http://python.org/sf/1351744  opened by  Fredrik Lundh

fix for resource leak in _subprocess  (2005-11-09)
CLOSED http://python.org/sf/1351997  opened by  Fredrik Lundh

[PATCH] Bug #1351707  (2005-11-10)
       http://python.org/sf/1352711  opened by  Thomas Lee

Small upgrades to platform.platform()  (2005-11-10)
       http://python.org/sf/1352731  opened by  daishi harada

a faster Modulefinder  (2005-11-11)
       http://python.org/sf/1353872  opened by  Thomas Heller

support whence argument for GzipFile.seek (bug #1316069)  (2005-11-12)
       http://python.org/sf/1355023  opened by  Fredrik Lundh

PEP 341 - Unification of try/except and try/finally  (2005-11-14)
       http://python.org/sf/1355913  opened by  Thomas Lee

Delete Python-ast.[ch] during "make depclean" (#1355883)  (2005-11-14)
CLOSED http://python.org/sf/1355940  opened by  Thomas Lee

Python-ast.h & Python-ast.c generated twice (#1355883)  (2005-11-14)
       http://python.org/sf/1355971  opened by  Thomas Lee

Sort nodes when writing to file  (2005-11-14)
CLOSED http://python.org/sf/1356571  opened by  Johan Str?m

potential crash and free memory read  (2005-11-15)
       http://python.org/sf/1357836  opened by  Neal Norwitz

ftplib dir() problem with certain servers  (2005-11-17)
       http://python.org/sf/1359217  opened by  Stuart D. Gathman

Iterating closed StringIO.StringIO  (2005-11-18)
       http://python.org/sf/1359365  opened by  Walter Dörwald

Speed charmap encoder  (2005-11-18)
       http://python.org/sf/1359618  opened by  Martin v. Löwis

Patch for (Doc) #1357604  (2005-11-18)
       http://python.org/sf/1359879  opened by  Peter van Kampen

Add XML-RPC Fault Interoperability to XMLRPC libraries  (2005-11-18)
       http://python.org/sf/1360243  opened by  Joshua Ginsberg

correct display of pathnames in SimpleHTTPServer  (2005-11-18)
       http://python.org/sf/1360443  opened by  Ori Avtalion

Auto Complete module for IDLE  (2005-11-19)
       http://python.org/sf/1361016  opened by  Jerry

Patches Closed
______________

Redundant connect() call in logging.handlers.SysLogHandler  (2005-11-07)
       http://python.org/sf/1350658  closed by  vsajip

incomplete support for AF_PACKET in socketmodule.c  (2004-11-19)
       http://python.org/sf/1069624  closed by  gustavo

fix for resource leak in _subprocess  (2005-11-09)
       http://python.org/sf/1351997  closed by  effbot

Info Associated with Merge to AST  (2005-01-07)
       http://python.org/sf/1097671  closed by  kbk

Delete Python-ast.[ch] during "make depclean" (#1355883)  (2005-11-13)
       http://python.org/sf/1355940  closed by  montanaro

Sort nodes when writing to file  (2005-11-14)
       http://python.org/sf/1356571  closed by  effbot

CodeContext - an extension to show you where you are  (2004-04-16)
       http://python.org/sf/936169  closed by  kbk

New / Reopened Bugs
___________________

win32serviceutil bug  (2005-11-08)
CLOSED http://python.org/sf/1351545  opened by  Tim Graber

Switch to make pprint.pprint display ints and longs in hex  (2005-11-08)
       http://python.org/sf/1351692  opened by  Mark Hirota

zipimport produces incomplete IOError instances  (2005-11-08)
       http://python.org/sf/1351707  opened by  Fred L. Drake, Jr.

CVS webbrowser.py (1.40) bugs  (2005-10-26)
CLOSED http://python.org/sf/1338995  reopened by  montanaro

SVN webbrowser.py fix 41419 didn't  (2005-11-09)
       http://python.org/sf/1352621  opened by  Greg Couch

poplib.POP3_SSL() class incompatible with socket.timeout  (2005-11-10)
       http://python.org/sf/1353269  opened by  Charles

Http redirection error in urllib2.py  (2005-11-10)
       http://python.org/sf/1353433  opened by  Thomas Dehn

Python drops core when stdin is bogus  (2005-11-10)
       http://python.org/sf/1353504  opened by  Skip Montanaro

Error in documentation for os.walk  (2005-11-11)
CLOSED http://python.org/sf/1353793  opened by  Martin Geisler

logging: Default handlers broken  (2005-11-11)
CLOSED http://python.org/sf/1354052  opened by  Jonathan S. Joseph

Interactive help fails with Windows Installer  (2005-11-11)
CLOSED http://python.org/sf/1354265  opened by  Colin J. Williams

shutil.move() does not preserve ownership  (2005-11-13)
       http://python.org/sf/1355826  opened by  lightweight

Incorrect Decimal-float behavior for +  (2005-11-13)
       http://python.org/sf/1355842  opened by  Connelly

make depend/clean issues w/ ast  (2005-11-13)
       http://python.org/sf/1355883  opened by  Skip Montanaro

Division Error  (2005-11-13)
CLOSED http://python.org/sf/1355903  opened by  Azimuth

Ctrl+C for copy does not work when caps-lock is on  (2005-11-14)
       http://python.org/sf/1356720  opened by  Lenny Domnitser

Tix.py class HList missing info_bbox  (2005-11-14)
       http://python.org/sf/1356969  opened by  Ron Provost

urllib/urllib2 cannot ftp files which are not listable.  (2005-11-15)
       http://python.org/sf/1357260  opened by  Bugs Fly

os.path.makedirs DOES handle UNC paths  (2005-11-15)
       http://python.org/sf/1357604  opened by  j vickroy

suprocess cannot handle shell arguments  (2005-11-16)
       http://python.org/sf/1357915  opened by  Pierre Ossman

Incorrect handling of unicode "strings" in asynchat.py  (2005-11-16)
CLOSED http://python.org/sf/1358186  opened by  Holger Lehmann

subprocess.py fails on Windows when there is no console  (2005-11-16)
       http://python.org/sf/1358527  opened by  Martin Blais

Incorrect documentation of raw unidaq string literals  (2005-11-17)
       http://python.org/sf/1359053  opened by  Michael Haggerty

Prefer configured browser over Mozilla and friends  (2005-11-17)
       http://python.org/sf/1359150  opened by  Ville Skyttä

bdist_rpm still can't handle dashes in versions  (2005-11-18)
       http://python.org/sf/1360200  opened by  jared jennings

telnetlib expect() and read_until() do not time out properly  (2005-11-18)
       http://python.org/sf/1360221  opened by  Duncan Grisby

Bugs Closed
___________

"setdlopenflags" leads to crash upon "import"  (2005-11-07)
       http://python.org/sf/1350188  closed by  nnorwitz

pydoc seems to run some scripts!  (2005-11-04)
       http://python.org/sf/1348477  closed by  nnorwitz

cgitb.py report wrong line number  (2005-04-06)
       http://python.org/sf/1178148  closed by  ping

win32serviceutil bug  (2005-11-08)
       http://python.org/sf/1351545  closed by  nnorwitz

CVS webbrowser.py (1.40) bugs  (2005-10-27)
       http://python.org/sf/1338995  closed by  birkenfeld

__getslice__ taking priority over __getitem__  (2005-10-17)
       http://python.org/sf/1328278  closed by  birkenfeld

_subprocess.c calls PyInt_AsLong without error checking  (2005-11-03)
       http://python.org/sf/1346547  closed by  effbot

Syntax error on large file with MBCS encoding  (2005-03-15)
       http://python.org/sf/1163244  closed by  mhammond

setgroups rejects long integer arguments  (2004-01-02)
       http://python.org/sf/869197  closed by  loewis

Error in documentation for os.walk  (2005-11-11)
       http://python.org/sf/1353793  closed by  tim_one

logging: Default handlers broken  (2005-11-11)
       http://python.org/sf/1354052  closed by  vsajip

Significant memory leak with PyImport_ReloadModule  (2005-08-11)
       http://python.org/sf/1256669  closed by  birkenfeld

Interactive help fails with Windows Installer  (2005-11-11)
       http://python.org/sf/1354265  closed by  loewis

os.remove fails on win32 with read-only file  (2004-12-29)
       http://python.org/sf/1092701  closed by  effbot

Division Error  (2005-11-13)
       http://python.org/sf/1355903  closed by  effbot

IDLE, F5 – wrong external file content. (on error!)  (2005-10-25)
       http://python.org/sf/1337987  closed by  kbk

Random core dumps  (2004-11-10)
       http://python.org/sf/1063937  closed by  nnorwitz

Incorrect handling of unicode "strings" in asynchat.py  (2005-11-16)
       http://python.org/sf/1358186  closed by  effbot

New / Reopened RFE
__________________

python.desktop  (2005-11-10)
       http://python.org/sf/1353344  opened by  Björn Lindqvist

RFE Closed
__________

please support the free visual studio sdk compiler  (2005-11-05)
       http://python.org/sf/1348719  closed by  loewis

fix for ms stdio tables   (2005-10-11)
       http://python.org/sf/1324176  closed by  loewis


From gregory.petrosyan at gmail.com  Sat Nov 19 18:01:35 2005
From: gregory.petrosyan at gmail.com (Gregory Petrosyan)
Date: Sat, 19 Nov 2005 20:01:35 +0300
Subject: [Python-Dev] How to stay almost backwards compatible with all these
	new cool features
Message-ID: <6306f97b0511190901g6757acbej@mail.gmail.com>

Here are some of my ideas on the subject. Maybe some of them are rather
foolish, others -- rather simple and common... I just want to add my 2
cents to Python development.

1) What is the reason for making Python backwards incompatible (call it
'BIC', and let 'BC' stand for 'backwards compatible')? The reason is
revolution. But how much can we get from intensive evolution alone?

2) Is there any way to both stay (almost) BC and evolve intensively?
Yes. The general rule is rather simple: make the old way of doing
something *deprecated* (but do *not* remove it entirely) and *add* the
new way of doing it. But how do we _force_ users to use the new way
instead of the old? My proposal: Python should raise DeprecationError
after the old way is used:

old_way()
                should be equivalent to
old_way()
raise DeprecationError('description')

So people who want to use the old way should write something like

try:
    old_way()
except DeprecationError:
    pass

I think they will soon migrate to the new style :-)
[Manual/semi-automatic migration]

Another benefit is that apps written for an old-way Python version
could be run under a new-way Python after this simple modification
(there might be a standard script for making old apps compatible with
new versions of Python!).   [Automatic migration]
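
For comparison, Python's existing 'warnings' machinery can already
approximate this proposal: a deprecated function emits a
DeprecationWarning, and a filter can escalate that warning to an
exception. The following is only a minimal sketch; old_way() and
new_way() are hypothetical names used for illustration. Note that,
unlike the proposal above, the exception is raised at the warn() call,
before the rest of old_way() runs.

```python
import warnings


def new_way():
    """Hypothetical replacement API."""
    return "result"


def old_way():
    """Hypothetical deprecated API that still works but warns."""
    warnings.warn(
        "old_way() is deprecated; use new_way() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_way()


# Lenient mode: the call succeeds and the warning is merely recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_way()

# Strict mode: the same warning is turned into an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_way()
        escalated = False
    except DeprecationWarning:
        escalated = True
```

The strict filter is opt-in here, which is the main difference from
the proposal: callers decide whether deprecation is fatal.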

3) Staying BC means no revolutions in syntax. But there are problems
with some of the new-style features:

a) The 'raise' statement.
I dislike the idea of inheriting all exceptions from one base class and
of removing 'raise' in favor of '.raise()'. Reason: we can think of
'raise' as a more powerful variant of 'return'. The classic example is
a recursive search in a binary tree: raising the result there seems
very elegant and agile. Exception != Error is true IMHO.
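
The "raise as a more powerful return" idea for tree search might look
roughly like the sketch below. The Node and Found classes and the
find() helper are made up for illustration; the search visits every
node and makes no assumption about ordering.

```python
class Found(Exception):
    """Carries a search result up through all recursion levels."""

    def __init__(self, node):
        super().__init__(node)
        self.node = node


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def _search(node, target):
    # Raising unwinds the whole recursion at once, so no caller has to
    # check and forward a return value.
    if node is None:
        return
    if node.value == target:
        raise Found(node)
    _search(node.left, target)
    _search(node.right, target)


def find(root, target):
    """Return the first node holding target, or None if absent."""
    try:
        _search(root, target)
    except Found as hit:
        return hit.node
    return None


tree = Node(1, Node(2, Node(4)), Node(3))
```

Here find(tree, 4) returns the leaf node without any intermediate
frame inspecting the result, while find(tree, 99) falls through and
returns None.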

b) Interfaces.
I like them. But isn't it ugly to write:

interface  SuperInterface: ...

Note: 'interface' is repeated there twice! And this is *not* a BC
solution at all. Remember exception classes:

class MyCommonError(Exception): ...
         but not
exception MyCommonError(...): ...

and it seems to be OK! And I have great agility with it: as mentioned,
I can raise just 'some_object', not only an 'exception'.
So, my proposal is the syntax

class SuperInterface(Interface): ...
         or maybe
class SuperInterface: ...
         but not
interface  SuperInterface: ...

Note: the first two variants are BC solutions!
And *yes*, you should be able to implement *any* class. Example:

class Fish(object):
    def swim(self):
        do_swim()
    # else ...

class Dog(object):
    def bark(self):
        do_bark()
    # else ...

class SharkLikeDog(Fish) implements Dog

Isn't it very good-looking?
Note: IMHO Type == Implemented interface. That's why every type/class
can be used as an interface. (Sorry for the type/class mess.)
Could we find some benefits in it?
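
One way to experiment with the "type == implemented interface" idea
without new syntax is structural checking. The sketch below assumes a
Python version that provides 'typing.Protocol' (much later than the
versions discussed in this thread); Barks, Fish and SharkLikeDog are
illustrative names, and this is not the 'implements' syntax proposed
above.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Barks(Protocol):
    """Anything with a bark() method counts as implementing Barks."""

    def bark(self) -> str: ...


class Fish:
    def swim(self) -> str:
        return "swimming"


class SharkLikeDog(Fish):
    # No explicit declaration: having bark() is what makes it a Barks.
    def bark(self) -> str:
        return "woof"


pet = SharkLikeDog()
is_barker = isinstance(pet, Barks)
plain_fish_is_barker = isinstance(Fish(), Barks)
```

The runtime check only looks for the presence of the method, not its
signature, so it is weaker than a declared interface.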

c) I like Optional TypeChecking. But I think it could be improved by
implementing some sort of Optional InterfaceChecking. Maybe like this:

def f(a implements B, c: D = 'HELLO') implements E:
    # ?function implements interface? well, maybe it can be some type
    # check interface?
    # some code here
or
def f(a implements B, c: D = 'HELLO') -> implements E:
    # some code here



Summary
--------------
Well, I think the main idea is (2):
- Don't remove; make it strongly deprecated
Then:
- Some changes to interfaces implementation
- etc ('raise' statement, InterfaceCheck -- see (3) )


Sorry for my English and for the mess.


-- Regards, Gregory.

From arigo at tunes.org  Sat Nov 19 19:08:55 2005
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 19 Nov 2005 19:08:55 +0100
Subject: [Python-Dev] s/hotshot/lsprof
Message-ID: <20051119180855.GA26733@code1.codespeak.net>

Hi!

The current Python profilers situation is a mess.

'profile.Profile' is the ages-old pure Python profiler.  At the end of a
run, it builds a dict that is inspected by 'pstats.Stats'.  It has some
recent support for profiling C calls, which however makes it crash in
some cases [1].  And of course it's slow (makes a run take about 10x
longer).

'hotshot', new from 2.2, is considerably faster (reportedly, only 30% added
overhead).  The log file is then loaded and turned into an instance of
the same 'pstats.Stats'.  This loading takes ages.  The reason is that
the log file only records events, and loading is done by instantiating a
'profile.Profile' and sending it all the events.  In other words, it
takes exactly as long as the time it spared in the first place!
Moreover, for some reason, the results given by hotshot sometimes seem
quite wrong.  (I don't understand why, but I've seen it myself, and it's
been reported by various people, e.g. [2].)  'hotshot' doesn't know
about C calls, but it can log line events, although this information is
lost(!) in the final conversion to a 'pstats.Stats'.

'lsprof' is a third profiler by Brett Rosen and Ted Czotter, posted on
SF in June [2].  Michael Hudson and I did some minor clean-ups and
improvements on it, and it seems to be quite useful.  It is, for
example, the only one of the three profilers that managed to give
sensible information about the PyPy translation process without
crashing, allowing us to accelerate it from over 30 to under 20
minutes.  The SF patch contains a more detailed account of the reasons
for writing 'lsprof'.  The current version [3] supports neither C calls
nor line events.  It has its own simple interface, which is not
compatible with either of the other two profilers.  However, unlike the
other two profilers, it can record detailed stats about children, which
I found quite useful (e.g. how much time is spent in a function when it
is called by another specific function).
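
To illustrate what per-caller ("children") statistics give you, here
is a small sketch using the 'pstats' reporting interface. It assumes a
Python that ships the 'cProfile' module (not part of the standard
library at the time of this thread; its documentation credits lsprof
by Brett Rosen and Ted Czotter). The function names are invented for
the example.

```python
import cProfile
import io
import pstats


def helper(n):
    return sum(i * i for i in range(n))


def caller_a():
    return helper(1000)


def caller_b():
    return helper(2000)


def main():
    caller_a()
    caller_b()


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Write the report into a string buffer instead of stdout.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
# Restrict the report to functions whose name matches "helper" and
# list, per caller, how often and for how long each one called it.
stats.print_callers("helper")
report = buf.getvalue()
```

The resulting report breaks the time spent in helper() down by
caller_a and caller_b, which is the kind of information a flat
per-function total cannot show.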

Therefore, I think it would be a great idea to add 'lsprof' to the
standard library.  Unless there are objections, it seems that the best
plan is to keep 'profile.py' as a pure Python implementation and replace
'hotshot' with 'lsprof'.  Indeed, I don't see any obvious advantage that
'hotshot' has over 'lsprof', and I certainly see more than one downside.
Maybe someone has a use for (and undocumented ways to fish for) line
events generated by hotshot.  Well, there is a script [4] to convert
hotshot log files to some format that a KDE tool [5] can display.  (It
even looks like hotshot files were designed with this in mind.)  Given
that the people doing that can still compile 'hotshot' as a separate
extension module, it doesn't strike me as a particularly good reason to
keep Yet Another Profiler in the standard library.

So here is my plan:

Unify the interfaces of the pure Python and the C profilers a bit more.
This also means that 'lsprof' should be made to use a pstats-compatible
log format.  The 'pstats' documentation specifically says that the file
format can change: that would give 'lsprof' a place to store its
detailed children stats.

Then we can provide a dummy 'hotshot.py' for compatibility, remove its
documentation, and provide documentation for 'lsprof'.

If anyone feels like this is a bad idea, please speak up.


A bientot,

Armin


[1] https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=1117670

[2] http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=1212837

[3] http://codespeak.net/svn/user/arigo/hack/misc/lsprof (Subversion)

[4] http://mail.python.org/pipermail/python-list/2003-September/183887.html

[5] http://kcachegrind.sourceforge.net/cgi-bin/show.cgi

From steven.bethard at gmail.com  Sat Nov 19 20:18:18 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 19 Nov 2005 12:18:18 -0700
Subject: [Python-Dev] str.dedent
In-Reply-To: <437E5205.2010001@v.loewis.de>
References: <dga72k$cah$1@sea.gmane.org> <43777B5A.6030602@egenix.com>
	<200511140920.51724.gmccaughan@synaptics-uk.com>
	<437869DD.7040800@egenix.com>
	<b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com>
	<dlaqds$8sb$1@sea.gmane.org>
	<b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com>
	<43791442.8050109@v.loewis.de>
	<b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com>
	<437E5205.2010001@v.loewis.de>
Message-ID: <d11dcfba0511191118y1da61245tcb6e1a221918b55a@mail.gmail.com>

On 11/18/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Noam Raphael wrote:
> > I just wanted to add another use case: long messages. Consider those
> > lines from idlelib/run.py:133
> >
> >         msg = "IDLE's subprocess can't connect to %s:%d.  This may be due "\
> >               "to your personal firewall configuration.  It is safe to "\
> >               "allow this internal connection because no data is visible on "\
> >               "external ports." % address
> >         tkMessageBox.showerror("IDLE Subprocess Error", msg, parent=root)
>
> You are missing an important point here: There are intentionally no line
> breaks in this string; it must be a single line, or else showerror will
> break it in funny ways. So converting it to a multi-line string would
> break it, dedent or not.

Only if you didn't include newline escapes, e.g.::

    msg = textwrap.dedent('''\
        IDLE's subprocess can't connect to %s:%d.  This may be due \
        to your personal firewall configuration.  It is safe to \
        allow this internal connection because no data is visible on \
        external ports.''' % address)

STeVe
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From noamraph at gmail.com  Sat Nov 19 21:48:00 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sat, 19 Nov 2005 22:48:00 +0200
Subject: [Python-Dev] str.dedent
In-Reply-To: <d11dcfba0511191118y1da61245tcb6e1a221918b55a@mail.gmail.com>
References: <dga72k$cah$1@sea.gmane.org>
	<200511140920.51724.gmccaughan@synaptics-uk.com>
	<437869DD.7040800@egenix.com>
	<b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com>
	<dlaqds$8sb$1@sea.gmane.org>
	<b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com>
	<43791442.8050109@v.loewis.de>
	<b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com>
	<437E5205.2010001@v.loewis.de>
	<d11dcfba0511191118y1da61245tcb6e1a221918b55a@mail.gmail.com>
Message-ID: <b348a0850511191248q72a1a134y27f1b756960817a@mail.gmail.com>

On 11/19/05, Steven Bethard <steven.bethard at gmail.com> wrote:
> > You are missing an important point here: There are intentionally no line
> > breaks in this string; it must be a single line, or else showerror will
> > break it in funny ways. So converting it to a multi-line string would
> > break it, dedent or not.
>
> Only if you didn't include newline escapes, e.g.::
>
>     msg = textwrap.dedent('''\
>         IDLE's subprocess can't connect to %s:%d.  This may be due \
>         to your personal firewall configuration.  It is safe to \
>         allow this internal connection because no data is visible on \
>         external ports.''' % address)
>

Unfortunately, it won't help, since the 'dedent' method won't treat
those spaces as indentation.
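
A small sketch of why this happens, using a shortened stand-in message
rather than the IDLE text: the backslash-newline pairs are removed
when the literal is parsed, so textwrap.dedent() receives one long
line and can only strip the margin at its very start. The whitespace
collapsing at the end is just one possible workaround, and it also
squeezes the double spaces after full stops.

```python
import textwrap

# After parsing, this literal is a single line:
# "    first part     second part"
continued = textwrap.dedent('''\
    first part \
    second part''')

# dedent() removed only the leading margin; the indentation of the
# continuation line survives as a run of spaces inside the text.
has_embedded_run = "     " in continued

# Possible workaround: collapse every whitespace run to one space.
collapsed = " ".join(continued.split())
```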

But if those messages were printed to the standard error, the line
breaks would be ok, and the use case valid.

Noam

From martin at v.loewis.de  Sat Nov 19 23:06:16 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 19 Nov 2005 23:06:16 +0100
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
Message-ID: <437FA1D8.7060600@v.loewis.de>

decker at dacafe.com wrote:
> I would appreciate feedback concerning these patches before the next
> "PythonD" (for DOS/DJGPP) is released.

PEP 11 says that DOS is not supported anymore since Python 2.0. So
I am -1 on reintroducing support for it.

Regards,
Martin

From aahz at pythoncraft.com  Sun Nov 20 00:06:24 2005
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 19 Nov 2005 15:06:24 -0800
Subject: [Python-Dev] How to stay almost backwards compatible with all
	these new cool features
In-Reply-To: <6306f97b0511190901g6757acbej@mail.gmail.com>
References: <6306f97b0511190901g6757acbej@mail.gmail.com>
Message-ID: <20051119230624.GA11188@panix.com>

On Sat, Nov 19, 2005, Gregory Petrosyan wrote:
>
> Here's some of my ideas about subject. Maybe some of them are rather
> foolish, others -- rather simple and common... I just want to add my 2
> cents to Python development.

This message was more appropriate for comp.lang.python; most of what you
talk about has already been discussed before, and the rest has to do
with user-level changes.  Please continue this discussion there if
you're interested in the subject.  Thank you.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From aahz at pythoncraft.com  Sun Nov 20 00:08:40 2005
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 19 Nov 2005 15:08:40 -0800
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051119180855.GA26733@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
Message-ID: <20051119230840.GB11188@panix.com>

On Sat, Nov 19, 2005, Armin Rigo wrote:
>
> If anyone feels like this is a bad idea, please speak up.

This sounds like a good idea, and your presentation already looks almost
like a PEP.  How about going ahead and making it a formal PEP, which
will make it easier to push through the dev process?
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From martin at v.loewis.de  Sun Nov 20 00:55:57 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 Nov 2005 00:55:57 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051119180855.GA26733@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
Message-ID: <437FBB8D.50501@v.loewis.de>

Armin Rigo wrote:
> If anyone feels like this is a bad idea, please speak up.

As stated, it certainly is a bad idea. To make it a good idea, there
should also be some commitment to maintain this library for a number
of years. So who would be maintaining it, and what are their plans
for doing so?

Regards,
Martin

From bcannon at gmail.com  Sun Nov 20 01:12:28 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sat, 19 Nov 2005 16:12:28 -0800
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051119180855.GA26733@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
Message-ID: <bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com>

Just an FYI for everyone while we are talking about profilers: Floris
Bruynooghe (whom I am cc'ing on this so he can contribute to the
conversation) wrote, for Google's Summer of Code, a replacement for
'profile' that uses Hotshot directly.  Thanks to his direct use of
Hotshot and rewrite of pstats, it loads Hotshot data 30% faster and
also removes the need to keep 'profile' around with its slightly
questionable license.

You can find his project at
http://savannah.nongnu.org/projects/pyprof/ .  I believe he also
tweaked Hotshot to accept custom timing functions.  I have not had a
chance to go over his code to clean it up for putting it up on SF, but
I thought people should be aware of it.

-Brett

On 11/19/05, Armin Rigo <arigo at tunes.org> wrote:
> Hi!
>
> The current Python profilers situation is a mess.
>
> 'profile.Profile' is the ages-old pure Python profiler.  At the end of a
> run, it builds a dict that is inspected by 'pstats.Stats'.  It has some
> recent support for profiling C calls, which however make it crash in
> some cases [1].  And of course it's slow (makes a run take about 10x
> longer).
>
> 'hotshot', new from 2.2, is quite faster (reportedly, only 30% added
> overhead).  The log file is then loaded and turned into an instance of
> the same 'pstats.Stats'.  This loading takes ages.  The reason is that
> the log file only records events, and loading is done by instantiating a
> 'profile.Profile' and sending it all the events.  In other words, it
> takes exactly as long as the time it spared in the first place!
> Moreover, for some reasons, the results given by hotshot seem sometimes
> quite wrong.  (I don't understand why, but I've seen it myself, and it's
> been reported by various people, e.g. [2].)  'hotshot' doesn't know
> about C calls, but it can log line events, although this information is
> lost(!) in the final conversion to a 'pstats.Stats'.
>
> 'lsprof' is a third profiler by Brett Rosen and Ted Czotter, posted on
> SF in June [2].  Michael Hudson and me did some minor clean-ups and
> improvements on it, and it seems to be quite useful.  It is, for
> example, the only of the three profilers that managed to give sensible
> information about the PyPy translation process without crashing,
> allowing us to accelerate it from over 30 to under 20 minutes.  The SF
> patch contains a more detailed account on the reasons for writing
> 'lsprof'.  The current version [3] does not support C calls nor line
> events.  It has its own simple interface, which is not compatible with
> any of the other two profilers.  However, unlike the other two
> profilers, it can record detailed stats about children, which I found
> quite useful (e.g. how much take is spent in a function when it is
> called by another specific function).
>
> Therefore, I think it would be a great idea to add 'lsprof' to the
> standard library.  Unless there are objections, it seems that the best
> plan is to keep 'profile.py' as a pure Python implementation and replace
> 'hotshot' with 'lsprof'.  Indeed, I don't see any obvious advantage that
> 'hotshot' has over 'lsprof', and I certainly see more than one downside.
> Maybe someone has a use for (and undocumented ways to fish for) line
> events generated by hotshot.  Well, there is a script [4] to convert
> hotshot log files to some format that a KDE tool [5] can display.  (It
> even looks like hotshot files were designed with this in mind.)  Given
> that the people doing that can still compile 'hotshot' as a separate
> extension module, it doesn't strike me as a particularly good reason to
> keep Yet Another Profiler in the standard library.
>
> So here is my plan:
>
> Unify a bit more the interfaces of the pure Python and the C profilers.
> This also means that 'lsprof' should be made to use a pstats-compatible
> log format.  The 'pstats' documentation specifically says that the file
> format can change: that would give 'lsprof' a place to store its
> detailed children stats.
>
> Then we can provide a dummy 'hotshot.py' for compatibility, remove its
> documentation, and provide documentation for 'lsprof'.
>
> If anyone feels like this is a bad idea, please speak up.
>
>
> A bientot,
>
> Armin
>
>
> [1] https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=1117670
>
> [2] http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=1212837
>
> [3] http://codespeak.net/svn/user/arigo/hack/misc/lsprof (Subversion)
>
> [4] http://mail.python.org/pipermail/python-list/2003-September/183887.html
>
> [5] http://kcachegrind.sourceforge.net/cgi-bin/show.cgi

From nnorwitz at gmail.com  Sun Nov 20 01:15:01 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sat, 19 Nov 2005 16:15:01 -0800
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
Message-ID: <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>

I lied a bit in my previous status.  I said that the refs used at the
end of a regression test run from a clean state (*) were down to 380k.
Well, if I had remembered to remove all the .pyc's this would have
been true.  Here are the numbers now:

Before AST: r39757 [362766 refs]
Before AST: svn up [356255 refs] 266 OK 31 skipped
clean:             [342367 refs] 267 OK 31 skipped

(*) Before each run I did:  find . -name '*.pyc' | xargs rm

Unless I screwed up again, the first line is from clean at revision
39757 which was just before the AST merge.  The second line was a
selective update of other files that didn't have any relationship to
AST (primarily compile.c and symtable.c).  The last run is after my
recent checkin.

So even with an additional test, we are finishing a regrtest.py run
with fewer references.  I don't know of any constructs which leak
references.

A patch was posted for the free memory read I reported earlier (not
related to AST branch).  It's on SF, I don't know the #.

There are many potential memory leaks in the AST code in error
conditions (hopefully these are only possible when running out of
memory).  It really needs the arena implementation to fix them and get
it right.  There are also still a few printfs in the AST code which
should be changed to SystemErrors.

There are still 2 memory leaks while running the regression tests.
They show up when running test_fork1 and test_pty.  There may be more;
valgrind crashed on me during the last run, which was also before I
fixed some of the reference leaks.  It would be great if people could
localize the leaks.

512 bytes in 1 blocks are definitely lost in loss record 319 of 548
   at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
   by 0x433CC4: new_arena (obmalloc.c:500)
   by 0x433EA8: PyObject_Malloc (obmalloc.c:706)
   by 0x43734B: PyString_FromStringAndSize (stringobject.c:74)
   by 0x4655B5: optimize_code (compile.c:957)
   by 0x467B86: makecode (compile.c:4092)
   by 0x467F00: assemble (compile.c:4166)
   by 0x46AA94: compiler_mod (compile.c:1755)
   by 0x46AC8B: PyAST_Compile (compile.c:285)
   by 0x47A870: run_mod (pythonrun.c:1195)
   by 0x47B0E8: PyRun_StringFlags (pythonrun.c:1159)
   by 0x45767A: builtin_eval (bltinmodule.c:589)
   by 0x41684F: PyObject_Call (abstract.c:1777)
   by 0x45EB4B: PyEval_CallObjectWithKeywords (ceval.c:3432)
   by 0x457E4E: builtin_map (bltinmodule.c:938)

1280 bytes in 2 blocks are definitely lost in loss record 383 of 548
   at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
   by 0x433CC4: new_arena (obmalloc.c:500)
   by 0x433EA8: PyObject_Malloc (obmalloc.c:706)
   by 0x4953F3: PyNode_AddChild (node.c:95)
   by 0x495611: shift (parser.c:112)
   by 0x4958F0: PyParser_AddToken (parser.c:244)
   by 0x411704: parsetok (parsetok.c:166)
   by 0x47AE4F: PyParser_ASTFromFile (pythonrun.c:1292)
   by 0x472338: parse_source_module (import.c:777)
   by 0x47262B: load_source_module (import.c:905)
   by 0x4735B3: load_module (import.c:1665)
   by 0x473C4B: import_submodule (import.c:2259)
   by 0x473DE0: load_next (import.c:2079)
   by 0x4741D5: import_module_ex (import.c:1914)
   by 0x474389: PyImport_ImportModuleEx (import.c:1955)

From martin at v.loewis.de  Sun Nov 20 10:58:14 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 Nov 2005 10:58:14 +0100
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>
Message-ID: <438048B6.2030103@v.loewis.de>

Neal Norwitz wrote:
> There are still 2 memory leaks while running the regression tests. 
> They show up when running test_fork1 and test_pty.  There may be more,
> valgrind crashed on me the last run which was also before I fixed some
> of the reference leaks.  It would be great if people could localize
> the leaks.

Can somebody please give a quick explanation how valgrind can give
*any* reasonable leak analysis when obmalloc is used? In the current
implementation, obmalloc never ever calls free(3), so all pool memory
should appear to have leaked.

So if valgrind does *not* report all memory as leaked: how does it
find out?

> 512 bytes in 1 blocks are definitely lost in loss record 319 of 548
>    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
>    by 0x433CC4: new_arena (obmalloc.c:500)

See

http://mail.python.org/pipermail/python-dev/2004-June/045253.html

This is the resizing of the list of arenas, which is a deliberate
leak. It just happened to be exhausted in this particular call
stack.

> 1280 bytes in 2 blocks are definitely lost in loss record 383 of 548
>    at 0x11B1AF13: malloc (vg_replace_malloc.c:149)
>    by 0x433CC4: new_arena (obmalloc.c:500)

Likewise.

Regards,
Martin

From jepler at unpythonic.net  Sun Nov 20 16:08:51 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Sun, 20 Nov 2005 09:08:51 -0600
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <437FA1D8.7060600@v.loewis.de>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de>
Message-ID: <20051120150850.GA27838@unpythonic.net>

On Sat, Nov 19, 2005 at 11:06:16PM +0100, "Martin v. Löwis" wrote:
> decker at dacafe.com wrote:
> > I would appreciate feedback concerning these patches before the next
> > "PythonD" (for DOS/DJGPP) is released.
> 
> PEP 11 says that DOS is not supported anymore since Python 2.0. So
> I am -1 on reintroducing support for it.

If we have someone who is volunteering the time to make it work, not just today
but in the future as well, we shouldn't rule out re-adding support.

I've taken a glance at the patch.  There are probably a few things to quarrel
over--for instance, it looks like a site.py change will cause python to print
a blank line when it's started, and the removal of a '#define HAVE_FORK 1' in 
posixmodule.c---but this still doesn't mean the re-addition of DOS as a supported
platform should be rejected out of hand.

Jeff

From martin at v.loewis.de  Sun Nov 20 19:00:27 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 Nov 2005 19:00:27 +0100
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <20051120150850.GA27838@unpythonic.net>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de>
	<20051120150850.GA27838@unpythonic.net>
Message-ID: <4380B9BB.5030208@v.loewis.de>

jepler at unpythonic.net wrote:
> I've taken a glance at the patch.  There are probably a few things to quarrel
> over--for instance, it looks like a site.py change will cause python to print
> a blank line when it's started, and the removal of a '#define HAVE_FORK 1' in 
> posixmodule.c---but this still doesn't mean the re-addition of DOS as a supported
> platform should be rejected out of hand.

Well, my experience is that people contributing "minority" ports run
away after getting their patches accepted more often than not (that so
happened with the BeOS port and the VMS port, to take recent examples).
So I would prefer to see some strong commitment from the porter.

Even so, I don't think I'm willing to commit such a patch myself.
If somebody else thinks this is worthwhile, I won't object.

Regards,
Martin

From nas at arctrix.com  Fri Nov 18 18:28:03 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 18 Nov 2005 17:28:03 +0000 (UTC)
Subject: [Python-Dev] Memory management in the AST parser & compiler
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
Message-ID: <dll2v3$78g$1@sea.gmane.org>

Fredrik Lundh <fredrik at pythonware.com> wrote:
> Thomas Lee wrote:
>
>> Even if it meant we had just one function call - one, safe function call
>> that deallocated all the memory allocated within a function - that we
>> had to put before each and every return, that's better than what we
>> have.
>
> alloca?

Perhaps we should use the memory management technique that the rest
of Python uses: reference counting.  I don't see why the AST
structures couldn't be PyObjects.

  Neil


From mwh at python.net  Sun Nov 20 22:43:45 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 20 Nov 2005 21:43:45 +0000
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <437FBB8D.50501@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sun,
	20 Nov 2005 00:55:57 +0100")
References: <20051119180855.GA26733@code1.codespeak.net>
	<437FBB8D.50501@v.loewis.de>
Message-ID: <2mk6f3ro4u.fsf@starship.python.net>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> Armin Rigo wrote:
>> If anyone feels like this is a bad idea, please speak up.
>
> As stated, it certainly is a bad idea.

This is a bit extreme...

> To make it a good idea, there should also be some commitment to
> maintain this library for a number of years. So who would be
> maintaining it, and what are their plans for doing so?

Well, the post was made by Armin who has been involved in CPython
development for quite a few years now, and mentioned that work on
lsprof was done by me who has been around for even longer -- neither
of us are going to quit anytime soon.

Cheers,
mwh

-- 
  I think if we have the choice, I'd rather we didn't explicitly put
  flaws in the reST syntax for the sole purpose of not insulting the
  almighty.                                    -- /will on the doc-sig

From martin at v.loewis.de  Sun Nov 20 23:15:14 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 Nov 2005 23:15:14 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <2mk6f3ro4u.fsf@starship.python.net>
References: <20051119180855.GA26733@code1.codespeak.net>	<437FBB8D.50501@v.loewis.de>
	<2mk6f3ro4u.fsf@starship.python.net>
Message-ID: <4380F572.9040402@v.loewis.de>

Michael Hudson wrote:
>>As stated, it certainly is a bad idea.
> 
> 
> This is a bit extreme...

Yes, my apologies :-(

>>To make it a good idea, there should also be some commitment to
>>maintain this library for a number of years. So who would be
>>maintaining it, and what are their plans for doing so?
> 
> 
> Well, the post was made by Armin who has been involved in CPython
> development for quite a few years now, and mentioned that work on
> lsprof was done by me who has been around for even longer -- neither
> of us are going to quit anytime soon.

The same could be said about hotshot, which was originally contributed
by Fred Drake, and hacked by Tim Peters, yourself, and others. Yet, now
people want to remove it again.

I'm really concerned that the same fate will happen to any new
profiling library: anybody but the original author will hate it,
write his own, and then suggest to replace the existing one.
It is the "let's build it from scratch" attitude which makes
me nervous.

Perhaps the library could be distributed separately for some time, e.g.
as a package in the cheeseshop. When it proves to be mature, I probably
would object less.

Regards,
Martin

From fredrik at pythonware.com  Sun Nov 20 23:33:42 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 20 Nov 2005 23:33:42 +0100
Subject: [Python-Dev] s/hotshot/lsprof
References: <20051119180855.GA26733@code1.codespeak.net>	<437FBB8D.50501@v.loewis.de><2mk6f3ro4u.fsf@starship.python.net>
	<4380F572.9040402@v.loewis.de>
Message-ID: <dlqtk8$37q$1@sea.gmane.org>

Martin v. Löwis wrote:

> The same could be said about hotshot, which was originally contributed
> by Fred Drake, and hacked by Tim Peters, yourself, and others. Yet, now
> people want to remove it again.
>
> I'm really concerned that the same fate will happen to any new
> profiling library: anybody but the original author will hate it,
> write his own, and then suggest to replace the existing one.

is this some intrinsic property of profilers?  if the existing tool has
problems, why not improve the tool itself?  do we really need CADT-
based development in the standard library?

(on the other hand, I'm not sure we need a profiler as part of the
standard library either, but that's me...)

</F>




From nnorwitz at gmail.com  Mon Nov 21 01:14:07 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 20 Nov 2005 16:14:07 -0800
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <438048B6.2030103@v.loewis.de>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>
	<438048B6.2030103@v.loewis.de>
Message-ID: <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>

On 11/20/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
> Can somebody please give a quick explanation how valgrind can give
> *any* reasonable leak analysis when obmalloc is used? In the current
> implementation, obmalloc never ever calls free(3), so all pool memory
> should appear to have leaked.
>
> So if valgrind does *not* report all memory as leaked: how does it
> find out?

Thanks for reminding me I wanted to do the next step and test without pymalloc.

Valgrind can't find certain kinds of leaks when pymalloc is holding on
to memory, true.  However, remember that lots of allocations are
forwarded to the system malloc().  For example, any request > 256
bytes goes directly to system malloc.  Also, PyMem_*() call the system
functions.
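
To make the routing concrete, here is a toy model of which allocator
ends up serving a request.  It is only a sketch: the 256-byte threshold
is the figure quoted above, the function name is made up for
illustration, and the handling of zero-byte requests is an assumption,
not something checked against obmalloc.c.

```python
# Toy model of the allocator routing described above.
# SMALL_REQUEST_THRESHOLD uses the 256-byte figure from this message;
# allocator_for() is a hypothetical helper, not a CPython API.
SMALL_REQUEST_THRESHOLD = 256  # bytes


def allocator_for(nbytes):
    """Return the name of the allocator that would serve `nbytes`."""
    if nbytes < 0:
        raise ValueError("request size must be non-negative")
    # Assumption: zero-byte requests are not served from pymalloc pools.
    if 1 <= nbytes <= SMALL_REQUEST_THRESHOLD:
        return "pymalloc"
    return "system malloc"


for size in (8, 256, 257, 4096):
    print(size, "->", allocator_for(size))
```

Blocks that take the second path are ordinary malloc() blocks as far as
Valgrind is concerned, so a leak there shows up even when pymalloc is
enabled.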

The core is pretty clean already, since I've been running Valgrind
pretty regularly over the years.  Before Valgrind I used purify going
back to 2000 or 2001.  Barry had used purify before me at some point
AFAIK.  So nearly all of the leaks have already been fixed.  It's
pretty much only new code that starts showing leaks.

To give you an example, I ran the entire regression suite through
Valgrind after configuring --without-pymalloc.  I only found 3
additional problems in new code.  There was also one problem in older
code (Python/modsupport.c).

The big benefit of running with pymalloc is that it only takes about
1.25 to 1.50 hours to run on my box.  When running without pymalloc, I
estimate it takes about 5 times longer.  Plus it requires a lot of
extra work since I need to run the tests in batches.  I only have 1 GB
of RAM and it takes a lot more than that when running without
pymalloc.

> This is the resizing of the list of arenas, which is a deliberate
> leak. It just happened to be exhausted in this particular call
> stack.

Thanks I was going to look into the resizing and forgot about it. 
Running without pymalloc confirmed that there weren't more serious
problems.

n

From nnorwitz at gmail.com  Mon Nov 21 01:21:41 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 20 Nov 2005 16:21:41 -0800
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <438048B6.2030103@v.loewis.de>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>
	<438048B6.2030103@v.loewis.de>
Message-ID: <ee2a432c0511201621w714f035er7f1ecd8072b10247@mail.gmail.com>

I would really like it if someone could run Purify (or another memory
tool) on Windows.  Purify on any other (Unix) platform would be
nice, but I doubt it will show much more.  By using different tools,
problems not found by one tool may be found by the other.  Plus there
is Windows-specific code that isn't exercised at all right now.

Any takers?

I still think the total references at the end of a test run are high,
342291.  I don't have anything to base this number on.  Some strategic
interning should help this number go down a bit.  I suppose I
shouldn't worry much since these references don't seem to become
actual memory leaks.

n

From skip at pobox.com  Mon Nov 21 02:43:33 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 20 Nov 2005 19:43:33 -0600
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <dlqtk8$37q$1@sea.gmane.org>
References: <20051119180855.GA26733@code1.codespeak.net>
	<437FBB8D.50501@v.loewis.de> <2mk6f3ro4u.fsf@starship.python.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
Message-ID: <17281.9797.171955.583286@montanaro.dyndns.org>


    Fredrik> (on the other hand, I'm not sure we need a profiler as part of
    Fredrik> the standard library either, but that's me...)

Painful though hotshot can be at times, I occasionally find it extremely
useful to zoom in on trouble spots.  I haven't used profile in a while and
haven't tried lsprof yet.  I would think having something readily available
(whether in the standard library or not) would be handy when needed,
hopefully with nothing more than "python setup.py install" required to make
it available.

Skip


From tim.peters at gmail.com  Mon Nov 21 02:55:49 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 20 Nov 2005 20:55:49 -0500
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051119180855.GA26733@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
Message-ID: <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>

[Armin Rigo]
...
> ...
> 'hotshot', new from 2.2, is quite faster (reportedly, only 30% added
> overhead).  The log file is then loaded and turned into an instance of
> the same 'pstats.Stats'.  This loading takes ages.  The reason is that
> the log file only records events, and loading is done by instantiating a
> 'profile.Profile' and sending it all the events.  In other words, it
> takes exactly as long as the time it spared in the first place!

We should note that hotshot didn't intend to reduce total time
overhead.  What it's aiming at here is to be less disruptive (than
profile.py) to the code being profiled _while_ that code is running. 
On modern boxes, any kind of profiling gimmick has the unfortunate
side effect of _changing_ the runtime behavior of the code being
profiled, at least by polluting I and D caches with droppings from the
profiling code itself (or, in the case of profile.py, possibly
overwhelming I and top-level D caches -- and distorting non-profiling
runtime so badly that, e.g., networked apps may end up taking entirely
different code paths).

hotshot tries to stick with tiny little C functions that pack away a
tiny amount of data each time, and avoid memory alloc/dealloc, to try
to minimize this disruption.  It looked like it was making real
progress on this at one time ;-)

> Moreover, for some reasons, the results given by hotshot seem sometimes
> quite wrong.  (I don't understand why, but I've seen it myself, and it's
> been reported by various people, e.g. [2].)  'hotshot' doesn't know
> about C calls, but it can log line events, although this information is
> lost(!) in the final conversion to a 'pstats.Stats'.

Ya, hotshot isn't finished.  It had corporate support for its initial
development, but lost that, and became an orphan then.  That's the
eventual fate of most profilers, alas.  They're fiddly, difficult, and
always wrong in some respect.  Because of this, the existence of an
eager maintainer without a real life is more important than the code
;-).

From tim.peters at gmail.com  Mon Nov 21 03:02:58 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 20 Nov 2005 21:02:58 -0500
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <dlqtk8$37q$1@sea.gmane.org>
References: <20051119180855.GA26733@code1.codespeak.net>
	<437FBB8D.50501@v.loewis.de> <2mk6f3ro4u.fsf@starship.python.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
Message-ID: <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com>

[Martin v. Löwis]
>> I'm really concerned that the same fate will happen to any new
>> profiling library: anybody but the original author will hate it,
>> write his own, and then suggest to replace the existing one.

[Fredrik Lundh]
> is this some intrinsic property of profilers?  if the existing tool has
> problems, why not improve the tool itself?

How many regexp engines has Python gone through now?  Profilers are
even more irritating to write and maintain than those -- and you
presumably know why you started over from scratch instead of improving
pcre, or whatever-the-heck-it-was that came before that ;-)

> do we really need CADT-based development in the standard library?

Since I didn't know what that meant, Google helpfully told me:

     Center for Alcohol & Drug Treatment

Fits, anyway <wink>.

From steve at holdenweb.com  Mon Nov 21 04:04:09 2005
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 21 Nov 2005 03:04:09 +0000
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>	<437FBB8D.50501@v.loewis.de>
	<2mk6f3ro4u.fsf@starship.python.net>	<4380F572.9040402@v.loewis.de>
	<dlqtk8$37q$1@sea.gmane.org>
	<1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com>
Message-ID: <43813929.3080000@holdenweb.com>

Tim Peters wrote:
> [Martin v. Löwis]
> 
>>>I'm really concerned that the same fate will happen to any new
>>>profiling library: anybody but the original author will hate it,
>>>write his own, and then suggest to replace the existing one.
> 
> 
> [Fredrik Lundh]
> 
>>is this some intrinsic property of profilers?  if the existing tool has
>>problems, why not improve the tool itself?
> 
> 
> How many regexp engines has Python gone through now?  Profilers are
> even more irritating to write and maintain than those -- and you
> presumably know why you started over from scratch instead of improving
> pcre, or whatever-the-heck-it-was that came before that ;-)
> 
> 
>>do we really need CADT-based development in the standard library?
> 
> 
> Since I didn't know what that meant, Google helpfully told me:
> 
>      Center for Alcohol & Drug Treatment
> 
I suspect you may already know that Fredrik referred to

         Cascade of Attention-Deficit Teenagers

Where's the BDFL to say "yes" or "no" when you need one?

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From amk at amk.ca  Mon Nov 21 05:12:08 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Sun, 20 Nov 2005 23:12:08 -0500
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <dlqtk8$37q$1@sea.gmane.org>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
Message-ID: <20051121041208.GA7924@rogue.amk.ca>

On Sun, Nov 20, 2005 at 11:33:42PM +0100, Fredrik Lundh wrote:
> do we really need CADT-based development in the standard library?

I didn't recognize the acronym, but Google told me CADT = "Cascade of
Attention-Deficit Teenagers"; see http://www.jwz.org/doc/cadt.html
for a rant.

--amk


From steve at holdenweb.com  Mon Nov 21 05:07:45 2005
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 21 Nov 2005 04:07:45 +0000
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>	<438048B6.2030103@v.loewis.de>
	<ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
Message-ID: <43814811.2070004@holdenweb.com>

Neal Norwitz wrote:
[...]

> To give you an example, I ran the entire regression suite through
> Valgrind after configuring --without-pymalloc.  I only found 3
> additional problems in new code.  There was also one problem in older
> code (Python/modsupport.c).
> 
> The big benefit of running with pymalloc is that it only takes about
> 1.25 to 1.50 hours to run on my box.  When running without pymalloc, I
> estimate it takes about 5 times longer.  Plus it requires a lot of
> extra work since I need to run the tests in batches.  I only have 1 GB
> of RAM and it takes a lot more than that when running without
> pymalloc.
> 
Is there maybe a machine in the SourceForge compile farm that could be
used for this work?

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From fdrake at acm.org  Mon Nov 21 05:11:19 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 20 Nov 2005 23:11:19 -0500
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>
	<dlqtk8$37q$1@sea.gmane.org>
	<1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com>
Message-ID: <200511202311.20271.fdrake@acm.org>

On Sunday 20 November 2005 21:02, Tim Peters wrote:
 > Since I didn't know what that meant, Google helpfully told me:
 >
 >      Center for Alcohol & Drug Treatment

On Sunday 20 November 2005 22:04, Steve Holden wrote:
 > I suspect you may already know that Fredrik referred to
 >
 >          Cascade of Attention-Deficit Teenagers

Yes, our former office in McLean, Virginia was known by many names.  :-)

 > Where's the BDFL to say "yes" or "no" when you need one?

Actually, he was just in the next room for HotShot.  Guess he was distracted 
by the photons from the Window, which I was protected from (ironically, by 
his office).


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From nas at arctrix.com  Mon Nov 21 05:53:13 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 21 Nov 2005 04:53:13 +0000 (UTC)
Subject: [Python-Dev] s/hotshot/lsprof
References: <20051119180855.GA26733@code1.codespeak.net>
	<1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
Message-ID: <dlrjro$l5g$1@sea.gmane.org>

Tim Peters <tim.peters at gmail.com> wrote:
> We should note that hotshot didn't intend to reduce total time
> overhead.  What it's aiming at here is to be less disruptive (than
> profile.py) to the code being profiled _while_ that code is running. 

A statistical profiler (e.g.
http://wingolog.org/archives/2005/10/28/profiling) would be a nice
addition, IMHO.  I guess we should get the current profilers in
shape first though.

  Neil


From martin at v.loewis.de  Mon Nov 21 07:39:01 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 21 Nov 2005 07:39:01 +0100
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <ee2a432c0511201621w714f035er7f1ecd8072b10247@mail.gmail.com>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>	
	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>	
	<438048B6.2030103@v.loewis.de>
	<ee2a432c0511201621w714f035er7f1ecd8072b10247@mail.gmail.com>
Message-ID: <43816B85.1080407@v.loewis.de>

Neal Norwitz wrote:
> I still think the total references at the end of a test run are high,
> 342291.  I don't have anything to base this number on.  Some strategic
> interning should help this number go down a bit.  I suppose I
> shouldn't worry much since these references don't seem to become
> actual memory leaks.

You could try to classify the objects remaining, counting them
by type. Perhaps selectively clearing out sys.modules to what
it is after startup might also give insights.
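
A minimal sketch of both ideas, assuming the objects of interest are
visible to the cycle collector.  gc.get_objects() only returns
container objects tracked by the collector, so plain strings and
numbers are not counted; the two helper names are made up for
illustration.

```python
import gc
import sys
from collections import Counter


def count_objects_by_type(limit=10):
    """Return the `limit` most common type names among GC-tracked objects."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(limit)


def modules_loaded_since(baseline):
    """Return names in sys.modules that were not present in `baseline`."""
    return sorted(set(sys.modules) - set(baseline))


# Take the baseline right after startup, run the tests, then compare.
startup_modules = set(sys.modules)
for name, count in count_objects_by_type(5):
    print("%8d  %s" % (count, name))
print(modules_loaded_since(startup_modules))
```

Comparing the per-type counts before and after a test run should point
at the types whose instances accumulate.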

Regards,
Martin

From martin at v.loewis.de  Mon Nov 21 07:44:50 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 21 Nov 2005 07:44:50 +0100
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de>
	<20051120150850.GA27838@unpythonic.net>
	<25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>
Message-ID: <43816CE2.2020808@v.loewis.de>

decker at dacafe.com wrote:
> The local python community here in Sydney indicated that python.org is
> only upset when groups port the source to 'obscure' systems and *don't*
> submit patches... It is possible that I was misinformed.

I never heard such concerns. I personally wouldn't notice if somebody
ported Python, and did not feed back the patches.

Sometimes, people ask "there is this and that port, why isn't it
integrated", to which the answer is in most cases "because authors
didn't contribute". This is not being upset - it is merely a fact.
This port (DJGPP) is the first one in a long time (IIRC) where
anybody proposed rejecting it.

> I am not sure about the future myself. DJGPP 2.04 has been parked at beta
> for two years now. It might be fair to say that the *general* DJGPP
> developer base has shrunk a little bit. But the PythonD userbase has
> actually grown since the first release three years ago. For the time
> being, people get very angry when the servers go down here :-)

It's not that much availability of the platform I worry about, but the
commitment of the Python porter. We need somebody to forward bug
reports to, and somebody to intervene if incompatible changes are made.
This person would also indicate that the platform is no longer
available, and hence the port can be removed.

Regards,
Martin

From martin at v.loewis.de  Mon Nov 21 08:12:53 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 21 Nov 2005 08:12:53 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <dlqtk8$37q$1@sea.gmane.org>
References: <20051119180855.GA26733@code1.codespeak.net>	<437FBB8D.50501@v.loewis.de><2mk6f3ro4u.fsf@starship.python.net>	<4380F572.9040402@v.loewis.de>
	<dlqtk8$37q$1@sea.gmane.org>
Message-ID: <43817375.6040108@v.loewis.de>

Fredrik Lundh wrote:
> is this some intrinsic property of profilers?  if the existing tool has
> problems, why not improve the tool itself?  do we really need CADT-
> based development in the standard library?

It is, IMO, intrinsic to parts of the library that aren't used much.
If bugs are in the heavily-used parts of the library, like regular
expressions, it doesn't matter much if the original author goes
away for some period of time - other contributors will fix the bugs
that they care about, and not by rewriting the entire thing.

If the library is less used, this kind of model is more likely,
as resistance to replacing the existing library will be lower.

> (on the other hand, I'm not sure we need a profiler as part of the
> standard library either, but that's me...)

It's a battery, to some.

Regards,
Martin

From martin at v.loewis.de  Mon Nov 21 08:16:19 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 21 Nov 2005 08:16:19 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>	<437FBB8D.50501@v.loewis.de>
	<2mk6f3ro4u.fsf@starship.python.net>	<4380F572.9040402@v.loewis.de>
	<dlqtk8$37q$1@sea.gmane.org>
	<1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com>
Message-ID: <43817443.4000402@v.loewis.de>

Tim Peters wrote:
>      Center for Alcohol & Drug Treatment

Besides Jamie Zawinski's definition, Google also told me it stands
for

        Computer Aided Drafting Technology

where "to draft" turns out to have two different meanings :-)

Regards,
Martin

From arigo at tunes.org  Mon Nov 21 12:14:26 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 21 Nov 2005 12:14:26 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>
	<bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com>
Message-ID: <20051121111426.GA13478@code1.codespeak.net>

Hi Brett, hi Floris,

On Sat, Nov 19, 2005 at 04:12:28PM -0800, Brett Cannon wrote:
> Just  for everyone's FYI while we are talking about profilers, Floris
> Bruynooghe (who I am cc'ing on this so he can contribute to the
> conversation), for Google's Summer of Code, wrote a replacement for
> 'profile' that uses Hotshot directly.  Thanks to his direct use of
> Hotshot and rewrite of pstats it loads Hotshot data 30% faster and
> also alleviates keeping 'profile' around and its slightly questionable
> license.

Thanks for the note!  30% faster than an incredibly long time is still
quite long, but that's an improvement, I suppose.  However, this code is
not ready yet.  For example the new loader gives wrong results in the
presence of recursive function calls.


A bientot,

Armin.

From arigo at tunes.org  Mon Nov 21 12:14:30 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 21 Nov 2005 12:14:30 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>
	<1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
Message-ID: <20051121111430.GB13478@code1.codespeak.net>

Hi Tim,

On Sun, Nov 20, 2005 at 08:55:49PM -0500, Tim Peters wrote:
> We should note that hotshot didn't intend to reduce total time
> overhead.  What it's aiming at here is to be less disruptive (than
> profile.py) to the code being profiled _while_ that code is running. 

> hotshot tries to stick with tiny little C functions that pack away a
> tiny amount of data each time, and avoid memory alloc/dealloc, to try
> to minimize this disruption.  It looked like it was making real
> progress on this at one time ;-)

I see the point.  I suppose that we can discuss if hotshot is really
nicer on the D cache, as it produces a constant stream of data, whereas
classical profilers like lsprof would in the common case only update a
few counters in existing data structures.  I can tweak lsprof a bit
more, though -- there is a malloc on each call, but it could be avoided.
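
The difference between the two designs can be sketched with
sys.setprofile.  This is a toy under stated assumptions, not the code
of either profiler: the class names and run_profiled() are invented
for illustration, and both hooks are written in Python whereas the real
ones are in C.

```python
import sys
import time
from collections import defaultdict


class EventLogProfiler:
    """Event-stream style: append one record per event, aggregate later."""

    def __init__(self):
        self.log = []

    def __call__(self, frame, event, arg):
        if event in ("call", "return"):
            self.log.append((event, frame.f_code.co_name, time.perf_counter()))


class CounterProfiler:
    """Counter style: update a per-function counter in place."""

    def __init__(self):
        self.calls = defaultdict(int)

    def __call__(self, frame, event, arg):
        if event == "call":
            self.calls[frame.f_code.co_name] += 1


def run_profiled(profiler, func, *args):
    """Run func(*args) with `profiler` installed as the profile hook."""
    sys.setprofile(profiler)
    try:
        return func(*args)
    finally:
        sys.setprofile(None)


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


log_profiler = EventLogProfiler()
run_profiled(log_profiler, fib, 10)
counter_profiler = CounterProfiler()
run_profiled(counter_profiler, fib, 10)
print(len(log_profiler.log), "log records;", dict(counter_profiler.calls))
```

The log grows with every call and return, while the counter version
keeps touching the same dictionary entry.  Which of the two is kinder
to the caches in practice is exactly the open question.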

Still, people generally agree that profile.py, while taking a longer
time overall, gives more meaningful results than hotshot.  Now Brett's
student, Floris, extended hotshot to allow custom timers.  This is
essential, because it enables testing.  The timing parts of hotshot were
not tested previously.
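
Why a custom timer makes the timing code testable can be shown with a
small sketch.  Everything here is hypothetical (FakeClock, TinyProfiler
and the workload functions are invented, and this is not hotshot's
interface): once the clock is injected, a test can advance it by known
amounts and compare the reported totals exactly.

```python
import sys


class FakeClock:
    """Deterministic timer: time moves only when the test changes `now`."""

    def __init__(self):
        self.now = 0

    def __call__(self):
        return self.now


class TinyProfiler:
    """Toy profiler recording inclusive time per function name."""

    def __init__(self, timer):
        self.timer = timer
        self.total = {}
        self._starts = []  # stack of entry timestamps

    def _hook(self, frame, event, arg):
        if event == "call":
            self._starts.append(self.timer())
        elif event == "return" and self._starts:
            name = frame.f_code.co_name
            elapsed = self.timer() - self._starts.pop()
            self.total[name] = self.total.get(name, 0) + elapsed

    def run(self, func, *args):
        sys.setprofile(self._hook)
        try:
            return func(*args)
        finally:
            sys.setprofile(None)


clock = FakeClock()


def inner():
    clock.now += 3  # pretend inner() takes 3 ticks


def outer():
    clock.now += 2
    inner()
    clock.now += 1


profiler = TinyProfiler(clock)
profiler.run(outer)
print(profiler.total)
```

With a real clock the expected numbers are unknowable; with the fake
one, any interval the profiler drops shows up as a wrong total.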

Given the high correlation between untestedness and brokenness, you bet
that Floris' adapted test_profile for hotshot gives wrong numbers.  (My
guess is that Floris overlooked that test_profile was an output test, so
he didn't compare the resulting numbers with the expected ones.)
Looking at the errors in the numbers pointed us immediately to the bug
in the C code.  Some time intervals are lost: the ones before an
exception is raised or a C function is called or returns.  That's a lot
of them.  The current hotshot is hence not so much a profiler as "a
reflection on the meaning of time" (quoting Samuele).

> Ya, hotshot isn't finished.  It had corporate support for its initial
> development, but lost that, and became an orphan then.

I will check in the bug fix for hotshot, but the question is what's the
point.  I would argue that lsprof even with children call stats is much
simpler than hotshot.  Lines-of-code also reflect that (factor of 2).
Obviously hotshot can do much more (undocumented, unmaintained) things
beside profiling if you get the correct tools.  This plays in favour of
lsprof as a stdlib-integrated useful-for-common-people maintained piece
of software and hotshot as distributed together with the tools that can
use its full potential.


A bientot,

Armin.

From arigo at tunes.org  Mon Nov 21 12:41:01 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 21 Nov 2005 12:41:01 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <43817375.6040108@v.loewis.de>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
Message-ID: <20051121114101.GC13478@code1.codespeak.net>

Hi Martin,

On Mon, Nov 21, 2005 at 08:12:53AM +0100, "Martin v. Löwis" wrote:
> If bugs are in the heavily-used parts of the library, like regular
> expressions, it doesn't matter much if the original author goes
> away for some period of time - other contributors will fix the bugs
> that they care about, and not by rewriting the entire thing.

I see no incremental way of fixing some of the downsides of hotshot,
like its huge log file size and loading time.  I doubt people often find
the motivation to dig into this large orphaned piece of software.
Instead, they rewrite their own profilers, because writing a basic one
is not difficult.  It is much less difficult than, say, writing a basic
regular expression engine (but even the latter has gotten rewritten at
times) -- unless you want to go into the advanced corners mentioned by
Tim.

Some guys posted their 'lsprof' on SF because it was well-polished and
they found it useful, so here I am, arguing for a standard library
containing preferably simple pieces of code that work and are practical
for the common advertised use case.  I'm not even sure in this case why
we are arguing: the new piece of code's interface can be made 100%
compatible with the documented parts of the previous interface; the
previous module has been around for longer but so far it produced
half-meaningless numbers due to bugs.


A bientot,

Armin.

From jepler at unpythonic.net  Mon Nov 21 14:50:48 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Mon, 21 Nov 2005 07:50:48 -0600
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <20051121070845.GA12993@ithaca04.ddaustralia.local>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de>
	<20051120150850.GA27838@unpythonic.net>
	<25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>
	<43816CE2.2020808@v.loewis.de>
	<20051121070845.GA12993@ithaca04.ddaustralia.local>
Message-ID: <20051121135047.GA22167@unpythonic.net>

On Mon, Nov 21, 2005 at 06:08:45PM +1100, Ben Decker wrote:
> I think the port has been supported for three years now. I am not sure what
> kind of commitment you are looking for, but the patch and software are
> supplied under the same terms of liability and warranty as anything else
> under the GPL. 

Python is not GPL software.  If your patch is under the terms of the GPL, it
cannot be accepted into Python.

Jeff

From barry at python.org  Mon Nov 21 15:25:05 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Nov 2005 09:25:05 -0500
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051121111430.GB13478@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
	<20051121111430.GB13478@code1.codespeak.net>
Message-ID: <1132583105.10235.32.camel@geddy.wooz.org>

On Mon, 2005-11-21 at 12:14 +0100, Armin Rigo wrote:

> Still, people generally agree that profile.py, while taking a longer
> time overall, gives more meaningful results than hotshot.  Now Brett's
> student, Floris, extended hotshot to allow custom timers.  This is
> essential, because it enables testing.  The timing parts of hotshot were
> not tested previously.

hotshot used to produce incorrect data because it couldn't track exits
from functions due to exception propagation.  We fixed that a while back
and since then it's been pretty useful for us.  While I'm not sure I
like the idea of three profilers in the stdlib, I think in this case
(unless they're incompatible) it would make sense to keep hotshot
around, at least until any new profiler proves it's better over a couple
of releases.

-Barry


From fredrik at pythonware.com  Mon Nov 21 15:41:03 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 21 Nov 2005 15:41:03 +0100
Subject: [Python-Dev] ast status, memory leaks, etc
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com><ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com><438048B6.2030103@v.loewis.de>
	<ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
Message-ID: <dlsma2$kj1$1@sea.gmane.org>

Neal Norwitz wrote:

> The big benefit of running with pymalloc is that it only takes about
> 1.25 to 1.50 hours to run on my box.  When running without pymalloc, I
> estimate it takes about 5 times longer.  Plus it requires a lot of
> extra work since I need to run the tests in batches.  I only have 1 GB
> of RAM and it takes a lot more than that when running without
> pymalloc.

sounds like the PSF should buy you some more RAM.

</F>




From arigo at tunes.org  Mon Nov 21 16:09:33 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 21 Nov 2005 16:09:33 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <1132583105.10235.32.camel@geddy.wooz.org>
References: <20051119180855.GA26733@code1.codespeak.net>
	<1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
	<20051121111430.GB13478@code1.codespeak.net>
	<1132583105.10235.32.camel@geddy.wooz.org>
Message-ID: <20051121150932.GA7134@code1.codespeak.net>

Hi Barry,

On Mon, Nov 21, 2005 at 09:25:05AM -0500, Barry Warsaw wrote:
> hotshot used to produce incorrect data because it couldn't track exits
> from functions due to exception propagation.  We fixed that a while back

It might be me, but I find it a bit odd that you didn't do anything with
this fix.  I'm sure that for each alternate profiler posted on SF there
are ten half-finished ones on somebody's box.  The problem of hotshot
producing slightly wrong data is not new, and in hindsight the
discrepancies only became larger in 2.4 with the introduction of new
tracing events (C function calls). 
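
For reference, the events a profiler has to account for can be inspected
with sys.setprofile.  The sketch below is illustrative only and is not
hotshot or lsprof code: collect_profile_events and sample are made-up
names, and the events reported around the setprofile calls themselves may
differ between Python versions.

```python
import sys


def collect_profile_events(func):
    """Run func() and return the (event, name) pairs seen by a profile hook."""
    events = []

    def hook(frame, event, arg):
        # For C-level events, arg is the builtin being called or returning;
        # for Python-level events, the frame identifies the function.
        if event.startswith("c_"):
            events.append((event, getattr(arg, "__name__", repr(arg))))
        else:
            events.append((event, frame.f_code.co_name))

    sys.setprofile(hook)
    try:
        func()
    finally:
        sys.setprofile(None)
    return events


def sample():
    # One Python-level call that makes one C-level call.
    len("abc")


if __name__ == "__main__":
    for event in collect_profile_events(sample):
        print(event)
```

A profiler that only timestamps the Python-level "call" and "return"
events has to decide what to do with the time that elapses around the
"c_call" and "c_return" events; that is the kind of interval under
discussion here.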

At this point I'm interpreting your mail as saying that you don't really
mind if hotshot is in the standard library or not, because you are using
your own fixed version anyway.  Nobody is proposing to wipe out hotshot
from the face of the planet.  Sorry if I sound offensive, but I'd rather
hear the opinion of people that care about the stdlib.

Armin

From barry at python.org  Mon Nov 21 17:40:37 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Nov 2005 11:40:37 -0500
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051121150932.GA7134@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
	<20051121111430.GB13478@code1.codespeak.net>
	<1132583105.10235.32.camel@geddy.wooz.org>
	<20051121150932.GA7134@code1.codespeak.net>
Message-ID: <1132591237.10237.51.camel@geddy.wooz.org>

On Mon, 2005-11-21 at 16:09 +0100, Armin Rigo wrote:

> It might be me, but I find it a bit odd that you didn't do anything with
> this fix.  

Hi Armin.  Actually it was SF #900092 that I was referring to.  We fixed
this bug and those patches were applied to CVS (pre-svn conversion) for
both 2.4.2 and 2.5a1.  So at least the ones I was talking about are
already in there!

> At this point I'm interpreting your mail as saying that you don't really
> mind if hotshot is in the standard library or not, because you are using
> your own fixed version anyway.  Nobody is proposing to wipe out hotshot
> from the face of the planet.  Sorry if I sound offensive, but I'd rather
> hear the opinion of people that care about the stdlib.

I think you just misunderstood me.  I definitely care about the stdlib
and no, we strongly prefer not to use some locally hacked up Python.
E.g. we were running 2.4.1 with this (and a few other patches) until
2.4.2 came out, but now we're pretty much using pristine Python 2.4.2.

So I still think hotshot can stay in the stdlib for a few releases,
unless it's totally incompatible with lsprof, and then it's worth
discussing.

-Barry


From arigo at tunes.org  Mon Nov 21 18:03:04 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 21 Nov 2005 18:03:04 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <1132591237.10237.51.camel@geddy.wooz.org>
References: <20051119180855.GA26733@code1.codespeak.net>
	<1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
	<20051121111430.GB13478@code1.codespeak.net>
	<1132583105.10235.32.camel@geddy.wooz.org>
	<20051121150932.GA7134@code1.codespeak.net>
	<1132591237.10237.51.camel@geddy.wooz.org>
Message-ID: <20051121170304.GA8711@code1.codespeak.net>

Hi Barry,

On Mon, Nov 21, 2005 at 11:40:37AM -0500, Barry Warsaw wrote:
> Hi Armin.  Actually it was SF #900092 that I was referring to.

Ah, we're talking about different things then.  The patch in SF #900092
is not related to hotshot, it's just ceval.c not producing enough events
to allow a precise timing of exceptions.  (Now that ceval.c is fixed, we
could remove a few hacks from profile.py, BTW.)

I am referring to a specific bug of hotshot which entirely drops some
genuine time intervals, all the time.  It's untested code!  A minimal
test like Floris' test_profile shows it clearly.


A bientot,

Armin.

From nnorwitz at gmail.com  Mon Nov 21 20:05:19 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 21 Nov 2005 11:05:19 -0800
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <dlsma2$kj1$1@sea.gmane.org>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>
	<438048B6.2030103@v.loewis.de>
	<ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
	<dlsma2$kj1$1@sea.gmane.org>
Message-ID: <ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com>

On 11/21/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
>
> sounds like the PSF should buy you some more RAM.

I think I still have some allocation from the PSF. Wanna have a party? ;-)

Seriously, I don't know that more RAM would help too much.  I didn't
notice much swapping, but maybe if I had run in bigger chunks
--without-pymalloc I would have.

I think a bigger bang for the buck would be to buy a Windows box with
Purify.  Rational was a real pain to deal with, maybe it's better now
that IBM bought them.  Parasoft (Insure++) was even worse to deal
with.  There would be many other benefits for someone to do more
testing on Windows.  The worst part of all this is ... it's still
Windows.

I'm not tied to Purify, I just don't know anything that works better. 
I've never used any such tool on Windows though.

n

From bcannon at gmail.com  Mon Nov 21 20:38:09 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 21 Nov 2005 11:38:09 -0800
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051121114101.GC13478@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
Message-ID: <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>

On 11/21/05, Armin Rigo <arigo at tunes.org> wrote:
> Hi Martin,
>
> On Mon, Nov 21, 2005 at 08:12:53AM +0100, "Martin v. Löwis" wrote:
> > If bugs are in the heavily-used parts of the library, like regular
> > expressions, it doesn't matter much if the original author goes
> > away for some period of time - other contributors will fix the bugs
> > that they care about, and not by rewriting the entire thing.
>
> I see no incremental way of fixing some of the downsides of hotshot,
> like its huge log file size and loading time.  I doubt people often find
> the motivation to dig into this large orphaned piece of software.
> Instead, they rewrite their own profilers, because writing a basic one
> is not difficult.  It is much less difficult than, say, writing a basic
> regular expression engine (but even the latter has gotten rewritten at
> times) -- unless you want to go into the advanced corners mentioned by
> Tim.
>
> Some guys posted their 'lsprof' on SF because it was well-polished and
> they found it useful, so here I am, arguing for a standard library
> containing preferably simple pieces of code that work and are practical
> for the common advertised use case.  I'm not even sure in this case why
> we are arguing: the new piece of code's interface can be made 100%
> compatible with the documented parts of the previous interface; the
> previous module has been around for longer but so far it produced
> half-meaningless numbers due to bugs.
>

Just because it is starting to feel like the objections are getting
spread out amongst various parts of this thread, I want to try to
summarize them as I remember them and give my input on them.

So one objection seems to be the question of maintenance.  Who is
going to keep this code updated and running?  As has been pointed out,
Hotshot is not perfect and its development basically stopped.  So
people being a little on edge about yet another profiler that might
not be maintained seems reasonable.

But this worry, in my mind, is alleviated since I believe both Michael
and Armin are willing to maintain the code.  With them both willing to
make sure it stays working (which is a pretty damn good commitment
since we have two core developers willing to keep this going and not
just one) I think this worry is dealt with.

The other issue seems to be some people wanting to keep Hotshot around
for a few releases until lsprof can prove its worth.  I believe this
is what Barry is asking for.  Now Armin has said that a wrapper around
lsprof can be written that will match Hotshot's public API so its need
is not there if lsprof works and the wrapper is good.

If it wasn't Armin or someone else whose opinion I trusted, I would
say go ahead and keep Hotshot around and then eventually do the
wrapper.  But since it is Armin making this claim and the PyPy team
uses this thing (who has several members who I think know what they
are doing  =)  I have faith in them coming up with a good wrapper. 
Thus I say removing Hotshot is fine.

Lastly, there is the argument of whether we should even include a
profiler.  Personally I say yes.  It is another battery that is rather
nice.  I think if the profiler finally had a good reputation of being
accurate and useful it would get more play in the real world.  Plus we
already include other development tools such as IDLE with Python so it
seems fitting to include other dev tools when we have the code and a
maintenance commitment.

In other words, I say let Armin and Michael add lsprof and the
wrappers for it (all while removing any redundant profilers that they
have wrappers for) with them knowing we will have a public stoning at
PyCon the instant they don't keep it all working.  =)

-Brett

From jeremy at alum.mit.edu  Mon Nov 21 20:42:32 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 21 Nov 2005 14:42:32 -0500
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
	<bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>
Message-ID: <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com>

Here's another attempt to disentangle some issues:
- Should lsprof be added to the standard distribution?
- Should hotshot be removed from the standard distribution?

These two aren't at all related, unless you believe that two is the
maximum number of profilers allowed per Python distribution.

I've never trusted results from hotshot, but I'd rather see it fixed
than removed.

Jeremy

On 11/21/05, Brett Cannon <bcannon at gmail.com> wrote:
> On 11/21/05, Armin Rigo <arigo at tunes.org> wrote:
> > Hi Martin,
> >
> > On Mon, Nov 21, 2005 at 08:12:53AM +0100, "Martin v. Löwis" wrote:
> > > If bugs are in the heavily-used parts of the library, like regular
> > > expressions, it doesn't matter much if the original author goes
> > > away for some period of time - other contributors will fix the bugs
> > > that they care about, and not by rewriting the entire thing.
> >
> > I see no incremental way of fixing some of the downsides of hotshot,
> > like its huge log file size and loading time.  I doubt people often find
> > the motivation to dig into this large orphaned piece of software.
> > Instead, they rewrite their own profilers, because writing a basic one
> > is not difficult.  It is much less difficult than, say, writing a basic
> > regular expression engine (but even the latter has gotten rewritten at
> > times) -- unless you want to go into the advanced corners mentioned by
> > Tim.
> >
> > Some guys posted their 'lsprof' on SF because it was well-polished and
> > they found it useful, so here I am, arguing for a standard library
> > containing preferably simple pieces of code that work and are practical
> > for the common advertised use case.  I'm not even sure in this case why
> > we are arguing: the new piece of code's interface can be made 100%
> > compatible with the documented parts of the previous interface; the
> > previous module has been around for longer but so far it produced
> > half-meaningless numbers due to bugs.
> >
>
> Just because it is starting to feel like the objections are getting
> spread out amongst various parts of this thread, I want to try to
> summarize them as I remember them and give my input on them.
>
> So one objection seems to be the question of maintenance.  Who is
> going to keep this code updated and running?  As has been pointed out,
> Hotshot is not perfect and its development basically stopped.  So
> people being a little on edge about yet another profiler that might
> not be maintained seems reasonable.
>
> But this worry, in my mind, is alleviated since I believe both Michael
> and Armin are willing to maintain the code.  With them both willing to
> make sure it stays working (which is a pretty damn good commitment
> since we have two core developers willing to keep this going and not
> just one) I think this worry is dealt with.
>
> The other issue seems to be some people wanting to keep Hotshot around
> for a few releases until lsprof can prove its worth.  I believe this
> is what Barry is asking for.  Now Armin has said that a wrapper around
> lsprof can be written that will match Hotshot's public API so its need
> is not there if lsprof works and the wrapper is good.
>
> If it wasn't Armin or someone else whose opinion I trusted, I would
> say go ahead and keep Hotshot around and then eventually do the
> wrapper.  But since it is Armin making this claim and the PyPy team
> uses this thing (who has several members who I think know what they
> are doing  =)  I have faith in them coming up with a good wrapper.
> Thus I say removing Hotshot is fine.
>
> Lastly, there is the argument of whether we should even include a
> profiler.  Personally I say yes.  It is another battery that is rather
> nice.  I think if the profiler finally had a good reputation of being
> accurate and useful it would get more play in the real world.  Plus we
> already include other development tools such as IDLE with Python so it
> seems fitting to include other dev tools when we have the code and a
> maintenance commitment.
>
> In other words, I say let Armin and Michael add lsprof and the
> wrappers for it (all while removing any redundant profilers that they
> have wrappers for) with them knowing we will have a public stoning at
> PyCon the instant they don't keep it all working.  =)
>
> -Brett

From skip at pobox.com  Mon Nov 21 21:06:24 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 21 Nov 2005 14:06:24 -0600
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
	<bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>
	<e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com>
Message-ID: <17282.10432.609056.20620@montanaro.dyndns.org>


    Jeremy> Here's another attempt to disentangle some issues:
    Jeremy> - Should lsprof be added to the standard distribution?
    Jeremy> - Should hotshot be removed from the standard distribution?

Adding another log to the fire, what about statprof, a sampling profiler,
which Neil Schemenauer mentioned?  I installed it here at work.  Seems to
work as advertised.  Took me about two minutes to modify our main app to
accept a -P command line flag to enable statprof profiling.  It has the
beauty of being minimally invasive since it only samples the execution state
every 100ms or so.  Of course, sampling profilers have their own warts, but
they avoid some of the problems of instrumenting profilers.
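
To make the sampling idea concrete, here is a minimal sketch.  It is not
statprof's implementation: it swaps in a background thread that polls
sys._current_frames() (a CPython-specific call), and the SamplingProfiler
class and its methods are hypothetical names chosen for illustration.

```python
import collections
import sys
import threading


class SamplingProfiler:
    """Toy sampling profiler: periodically note which function a thread is in."""

    def __init__(self, interval=0.1):
        self.interval = interval              # seconds between samples
        self.counts = collections.Counter()   # function name -> sample count
        self._stop = threading.Event()
        self._thread = None

    def record(self, frame):
        """Attribute one sample to the function executing in frame."""
        self.counts[frame.f_code.co_name] += 1

    def _run(self, target_id):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.interval):
            frame = sys._current_frames().get(target_id)
            if frame is not None:
                self.record(frame)

    def start(self):
        """Begin sampling the calling thread from a background thread."""
        self._stop.clear()
        self._thread = threading.Thread(
            target=self._run, args=(threading.get_ident(),), daemon=True)
        self._thread.start()

    def stop(self):
        """Stop sampling and wait for the background thread to finish."""
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None
```

The profiled code runs without any per-call hook; the cost is that
functions shorter than the sampling interval may never be observed.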

Another tack to take would be to modify the generated byte code to only
increment counts for each basic block, similar to what gcc's -pg flag does.
I think that would yield a fully instrumented profiler, but one that's less
invasive than the current alternatives.  It could maybe be implemented as an
import hook.  Of course, such a beast has yet to be written, so this email
and a couple bucks will get you a cup of coffee.

This entire discussion simply serves to demonstrate that there are lots of
different ways to skin this particular cat.  How many of these various
alternatives belong in the standard library remains to be seen.

Skip

From bcannon at gmail.com  Mon Nov 21 21:16:58 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 21 Nov 2005 12:16:58 -0800
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <17282.10432.609056.20620@montanaro.dyndns.org>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
	<bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>
	<e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com>
	<17282.10432.609056.20620@montanaro.dyndns.org>
Message-ID: <bbaeab100511211216v596ff9acg150fd2994e98b159@mail.gmail.com>

On 11/21/05, skip at pobox.com <skip at pobox.com> wrote:
>
>     Jeremy> Here's another attempt to disentangle some issues:
>     Jeremy> - Should lsprof be added to the standard distribution?
>     Jeremy> - Should hotshot be removed from the standard distribution?
>
> Adding another log to the fire, what about statprof, a sampling profiler,
> which Neil Schemenauer mentioned?  I installed it here at work.  Seems to
> work as advertised.  Took me about two minutes to modify our main app to
> accept a -P command line flag to enable statprof profiling.  It has the
> beauty of being minimally invasive since it only samples the execution state
> every 100ms or so.  Of course, sampling profilers have their own warts, but
> they avoid some of the problems of instrumenting profilers.
>

My question is whether anyone is willing to maintain it in the stdlib?

-Brett

From bcannon at gmail.com  Mon Nov 21 21:30:16 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 21 Nov 2005 12:30:16 -0800
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
	<bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>
	<e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com>
Message-ID: <bbaeab100511211230v304cb37dw5dd16e2a81f5572e@mail.gmail.com>

On 11/21/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> Here's another attempt to disentangle some issues:
> - Should lsprof be added to the standard distribution?
> - Should hotshot be removed from the standard distribution?
>
> These two aren't at all related, unless you believe that two is the
> maximum number of profilers allowed per Python distribution.
>

They aren't related if Hotshot provides some functionality that lsprof
cannot provide (such as profiling C code; I thought Nick Bastin added
support for this?).  But if there isn't, then there is some soft
relatedness between them since it means that if lsprof is added then
hotshot could be removed without backwards-compatibility issues.  They
are not mutually exclusive, but one being true does influence the
other.

And as for how many profilers to have, I personally think one is
plenty if they all provide similar type of output using similar
techniques.  But backwards-compatibility obviously is going to make
total removal of a module and its API hard so I am thinking more
towards Python 3000 and having the best solution in now.  Otherwise we
should do what must be done to fix hotshot and stick with it.

-Brett

From arigo at tunes.org  Mon Nov 21 22:15:56 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 21 Nov 2005 22:15:56 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051121164104.GA8898@laurie.sheepb.homeip.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
	<20051121111430.GB13478@code1.codespeak.net>
	<20051121164104.GA8898@laurie.sheepb.homeip.net>
Message-ID: <20051121211556.GA10821@code1.codespeak.net>

Hi Floris,

On Mon, Nov 21, 2005 at 04:41:04PM +0000, Floris Bruynooghe wrote:
> > Now Brett's
> > student, Floris, extended hotshot to allow custom timers.  This is
> > essential, because it enables testing.  The timing parts of hotshot were
> > not tested previously.
> 
> Don't be too enthusiastic here.

Testing is done by feeding the profiler something that is not a real
timer function, but gives easy to predict answers.  Then we check that
the profiler accounted all this pseudo-time to the correct functions in
the correct way.  This is one of the few ways to reliably test a
profiler, which is why it is essential.
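
The approach can be sketched as follows.  This is a minimal illustration
using the pure-Python profile module rather than hotshot or lsprof; the
names ticks, fake_timer, helper, main and profile_with_fake_timer are
invented for the example and do not come from test_profile itself.

```python
import profile
import pstats

ticks = 0  # pseudo-clock, advanced only by the profiled code itself


def fake_timer():
    """Return the pseudo-clock instead of real time."""
    return ticks


def helper():
    global ticks
    ticks += 3      # pretend this function takes 3 units


def main():
    global ticks
    ticks += 1      # 1 unit of own time before the calls
    helper()
    helper()
    ticks += 2      # 2 units of own time after the calls


def profile_with_fake_timer():
    """Profile main() against the pseudo-clock.

    Returns {function name: (calls, own time, cumulative time)}.
    """
    prof = profile.Profile(timer=fake_timer)
    prof.runcall(main)
    stats = pstats.Stats(prof).stats
    result = {}
    for (filename, lineno, name), (cc, nc, tt, ct, callers) in stats.items():
        result[name] = (nc, tt, ct)
    return result
```

Because the clock only moves when the profiled code says so, the expected
figures can be worked out by hand (helper: 2 calls, 6 units; main: 3 units
of its own and 9 cumulative) and compared exactly, with no tolerance for
machine speed.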

> Iirc I did compare the output of test_profile between profile and my
> wrapper.  This was one of my checks to make sure it was wrapped
> correctly.  So could you tell me how they are different?

test_profile works as I explained above.  Running it with hotshot shows
different numbers, which means that there is a bug (and not just some
difference in real speed).  More precisely, a specific number of the
pseudo-clock-ticks are dropped for no reason other than a bug, and
don't show up in the final results at all.


A bientot,

Armin

From martin at v.loewis.de  Mon Nov 21 22:16:16 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 21 Nov 2005 22:16:16 +0100
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <20051121070845.GA12993@ithaca04.ddaustralia.local>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de>
	<20051120150850.GA27838@unpythonic.net>
	<25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>
	<43816CE2.2020808@v.loewis.de>
	<20051121070845.GA12993@ithaca04.ddaustralia.local>
Message-ID: <43823920.3070802@v.loewis.de>

Ben Decker wrote:
> I think the port has been supported for three years now. I am not
> sure what kind of commitment you are looking for, but the patch and
> software are supplied under the same terms of liability and warranty
> as anything else under the GPL.

That (licensed under GPL) would be an issue, as we are not accepting
GPL-licensed code. I would guess that you are flexible in licensing,
though: we would request that you allow us to relicense the contribution
under the terms at

http://www.python.org/psf/contrib.html

The commitment I was looking for was rather a statement like
"I will be maintaining it for several coming years; when I ever
stop maintaining it, feel free to remove the code again".

So it is not that much past history (although this also matters,
and three years of availability is certainly a good record); it
is more important to somehow commit to future support, so that
we are not left alone with code we cannot maintain if you
ever drop out.

Regards,
Martin

From arigo at tunes.org  Mon Nov 21 22:23:09 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 21 Nov 2005 22:23:09 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051121164503.GB8898@laurie.sheepb.homeip.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com>
	<20051121111426.GA13478@code1.codespeak.net>
	<20051121164503.GB8898@laurie.sheepb.homeip.net>
Message-ID: <20051121212309.GB10821@code1.codespeak.net>

Hi Floris,

On Mon, Nov 21, 2005 at 04:45:03PM +0000, Floris Bruynooghe wrote:
> Afaik I did test recursive calls etc.

It seems to show up in any test case I try, e.g.

    import hprofile
    def wait(m):
        if m > 0:
            wait(m-1)
    def f(n):
        wait(n)
        if n > 1:
            return n*f(n-1)
        else:
            return 1
    hprofile.run("f(500)", 'dump-hprof')

The problem is in the cumulative time column, which (on this machine)
says 163 seconds for both f() and wait().  The whole program finishes in
1 second...  The same log file loaded with hotshot.stats doesn't have
this problem.
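
One property worth checking here is that the cumulative time reported
for a recursive function never exceeds the total elapsed time.  The
sketch below checks it against a pseudo-clock with the pure-Python
profile module; it does not use hprofile or hotshot, and the names ticks,
fake_timer, fact and recursive_timings are hypothetical.

```python
import profile
import pstats

ticks = 0  # pseudo-clock: advances by one for every call of fact()


def fake_timer():
    return ticks


def fact(n):
    global ticks
    ticks += 1
    return 1 if n <= 1 else n * fact(n - 1)


def recursive_timings(n):
    """Profile fact(n); return (calls, own time, cumulative time, elapsed)."""
    start = ticks
    prof = profile.Profile(timer=fake_timer)
    prof.runcall(fact, n)
    elapsed = ticks - start
    stats = pstats.Stats(prof).stats
    for (filename, lineno, name), (cc, nc, tt, ct, callers) in stats.items():
        if name == "fact":
            return nc, tt, ct, elapsed
    raise LookupError("fact not found in profile")
```

A loader that adds up the cumulative time of every nested activation
separately would report far more than the elapsed figure for the same
run, which is the kind of symptom described above.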


A bientot,

Armin.

From martin at v.loewis.de  Mon Nov 21 22:29:55 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 21 Nov 2005 22:29:55 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051121114101.GC13478@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
Message-ID: <43823C53.8080403@v.loewis.de>

Armin Rigo wrote:
> I see no incremental way of fixing some of the downsides of hotshot,
> like its huge log file size and loading time.

I haven't looked into the details myself, but it appears that some
google-summer-of-code contributor has found some way of fixing it.

> I doubt people often find
> the motivation to dig into this large orphaned piece of software.

As Fredrik says: this sounds like the CADT model. The code isn't really
orphaned - it's just that it isn't used much. Contributions to this
code certainly would still be accepted (and happily so).

So essentially: fixing bugs isn't fun, but rewriting it from scratch is.

> I'm not even sure in this case why
> we are arguing

That's pretty obvious to me: because some people are shy of letting
version 0.8 of the old software be replaced with version 0.8 of the
new software, which is then replaced with version 0.8 of the next
rewrite.

Instead, we should stick to what we have, and improve it.

Now, it might be that in this specific case, replacing the library
really is the right thing to do. It would be if:
1. it has improvements over the current library already
   (certified by users other than the authors), AND
2. it has no drawbacks over the current library, AND
3. there is some clear indication that it will get better maintenance
   than the previous library.

I'm not certain lsprof has properties 2 and 3; property 1, so far,
is only asserted by the library author himself.

Perhaps it is true what Fredrik Lundh says: there shouldn't be a
profiler in the standard library at all.

Regards,
Martin

From jimjjewett at gmail.com  Mon Nov 21 22:36:33 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 21 Nov 2005 16:36:33 -0500
Subject: [Python-Dev] s/hotshot/lsprof
Message-ID: <fb6fbf560511211336w3d5bc7dbn71c2154bf5455c99@mail.gmail.com>

Jeremy Hylton jeremy at alum.mit.edu
>  Should lsprof be added to the standard distribution?
>  Should hotshot be removed from the standard distribution?

>  These two aren't at all related, unless you believe that two is the
>  maximum number of profiles allowed per Python distribution.

One is a better number.

("There should be one-- and preferably only one --obvious way to do it.")

Adding a second (let alone third) module to the stdlib to do
the same thing just makes the documentation bulkier,
and makes the "where do I start" problem harder for beginners.

And yes, I think beginners are the most important audience
here; anyone sufficiently comfortable with python to make an
intelligent choice between different code profilers is probably
also able to install 3rd-party modules anyway.

Note that I have no objection to (and would like to see) a
section in the module documentation saying "This is just
one alternative; many people prefer XXX because of YYY".
This mention would provide enough endorsement for
anyone ready to choose another profiler.

Even putting the alternatives into a single stdlib package
(and making it clear that they are alternatives, rather than
complementary building blocks) is better than simply
leaving them all scattered throughout the stdlib as
roll-the-dice-to-pick alternatives.

-jJ

From martin at v.loewis.de  Mon Nov 21 22:40:02 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 21 Nov 2005 22:40:02 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>	
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>	
	<43817375.6040108@v.loewis.de>	
	<20051121114101.GC13478@code1.codespeak.net>
	<bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>
Message-ID: <43823EB2.8040108@v.loewis.de>

Brett Cannon wrote:
> But this worry, in my mind, is alleviated since I believe both Michael
> and Armin are willing to maintain the code.  With them both willing to
> make sure it stays working (which is a pretty damn good commitment
> since we have two core developers willing to keep this going and not
> just one) I think this worry is dealt with.

So far, neither of them has explicitly said so: Michael said he will
be around; and I'm certain that is the case for Python as a whole.
An explicit commitment to lsprof maintenance would help (me, at least).

> In other words, I say let Armin and Michael add lsprof and the
> wrappers for it (all while removing any redundant profilers that they
> have wrappers for) with them knowing we will have a public stoning at
> PyCon the instant they don't keep it all working.  =)

I would prefer to see some advance support from lsprof users, confirming
that this is really a good thing to have.

Regards,
Martin

From ncoghlan at gmail.com  Mon Nov 21 23:02:38 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Nov 2005 08:02:38 +1000
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <fb6fbf560511211336w3d5bc7dbn71c2154bf5455c99@mail.gmail.com>
References: <fb6fbf560511211336w3d5bc7dbn71c2154bf5455c99@mail.gmail.com>
Message-ID: <438243FE.1020504@gmail.com>

Jim Jewett wrote:
> Jeremy Hylton jeremy at alum.mit.edu
>>  Should lsprof be added to the standard distribution?
>>  Should hotshot be removed from the standard distribution?
> 
>>  These two aren't at all related, unless you believe that two is the
>>  maximum number of profiles allowed per Python distribution.
> 
> One is a better number.
> 
> ("There should be one-- and preferably only one --obvious way to do it.")
> 
> Adding a second (let alone third) module to the stdlib to do
> the same thing just makes the documentation bulkier,
> and makes the "where do I start" problem harder for beginners.
> 
> And yes, I think beginners are the most important audience
> here; anyone sufficiently comfortable with python to make an
> intelligent choice between different code profilers is probably
> also able to install 3rd-party modules anyway.

Chiming in as a user of 'profile' who has also attempted to use hotshot. . .

I used profile heavily when we were working on the implementation of the decimal 
module, trying to figure out where the bottlenecks were (e.g., profile showed 
that converting to integers to do arithmetic and back to sequences to do 
rounding was a net win, despite the conversion costs in switching back and 
forth between the two formats).

I tried using hotshot to do the same thing (profiled runs of the arithmetic 
tests took a *long* time), and found the results to be well-nigh useless (I 
seem to recall it was related to the fact that profile separated out C calls, 
while hotshot didn't).

So my experience of hotshot has been "sure it's slightly less invasive, but it 
doesn't actually work". If hotshot can be replaced with something that 
actually works as intended, or if lsprof can be added in a way that is more 
closely coupled with profile (so that there is a clear choice between "less 
invasive but less detailed results" and "more detailed results but more 
invasive during execution"), I'd be quite happy.

If a statistical profiler was later added to round out the minimally invasive 
end, that actually makes for a decent profiling toolset:

1. Use the statistical profiler to identify potential problem areas
2. Use hotshot/lsprof to further analyse the potential problem areas
3. Use profile to get detailed results on the bottlenecks

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From nyamatongwe at gmail.com  Mon Nov 21 23:04:40 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Tue, 22 Nov 2005 09:04:40 +1100
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>
	<438048B6.2030103@v.loewis.de>
	<ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
	<dlsma2$kj1$1@sea.gmane.org>
	<ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com>
Message-ID: <50862ebd0511211404m2190f880sa210eda18c216140@mail.gmail.com>

Neal Norwitz:

> I think a bigger bang for the buck would be to buy a Windows box with
> Purify.  Rational was a real pain to deal with, maybe it's better now
> that IBM bought them.  Parasoft (Insure++) was even worse to deal
> with.

   My experience with the other Windows option, BoundsChecker, is
similarly negative and I haven't bothered upgrading for a couple of
versions (so can only use it with VC++ 6). The original developer,
NuMega, were great but they were absorbed into Compuware which seems
to see it more as a source of consulting income than as a product. I'm
fairly experienced with BoundsChecker and related programs (like their
profiler) so could run it over a test suite if a license was provided.
A demonstration license can probably not be installed on my machine
due to earlier installs.

   Neil

From fredrik at pythonware.com  Mon Nov 21 23:07:06 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 21 Nov 2005 23:07:06 +0100
Subject: [Python-Dev] ast status, memory leaks, etc
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com><ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com><438048B6.2030103@v.loewis.de><ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com><dlsma2$kj1$1@sea.gmane.org>
	<ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com>
Message-ID: <dltgef$irs$1@sea.gmane.org>

Neal Norwitz wrote:

> I think a bigger bang for the buck would be to buy a Windows box with
> Purify.  Rational was a real pain to deal with, maybe it's better now
> that IBM bought them.  Parasoft (Insure++) was even worse to deal
> with.  There would be many other benefits for someone to do more
> testing on Windows.

I don't think there's a shortage of Windows boxes among the python-dev
crowd (I have plenty).  Does anyone know what kind of box you need to
run purify these days?

(looks like a license costs $780 in the US but $1100 in Sweden.  hmm...)

> The worst part of all this is ... it's still Windows.

Some of us are OS agnostics, you know.

</F>




From nnorwitz at gmail.com  Mon Nov 21 23:24:36 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 21 Nov 2005 14:24:36 -0800
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <dltgef$irs$1@sea.gmane.org>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>
	<438048B6.2030103@v.loewis.de>
	<ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
	<dlsma2$kj1$1@sea.gmane.org>
	<ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com>
	<dltgef$irs$1@sea.gmane.org>
Message-ID: <ee2a432c0511211424v649272f1of439ad5cdad00301@mail.gmail.com>

On 11/21/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
>
> I don't think there's a shortage of Windows boxes among the python-dev
> crowd (I have plenty).  Does anyone know what kind of box you need to
> run purify these days?

Dunno, but it would probably be fine on a reasonably new box with at
least 1 GB of RAM.

If you are interested in using purify for python, I think that would
be great and doubt there would be an issue for the PSF to buy a copy.

> (looks like a license costs $780 in the US but $1100 in Sweden.  hmm...)

There was also PurifyPlus (I think that was the name) for $1380 or so.
 My guess is that also included Quantify and the other program bundled
together.

> > The worst part of all this is ... it's still Windows.
>
> Some of us are OS agnostics, you know.

Yeah, it was meant as a joke (though also my preference).  Guess I
shouldn't go on tour. :-)

n

From skip at pobox.com  Mon Nov 21 23:30:47 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 21 Nov 2005 16:30:47 -0600
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <bbaeab100511211216v596ff9acg150fd2994e98b159@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
	<bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com>
	<e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com>
	<17282.10432.609056.20620@montanaro.dyndns.org>
	<bbaeab100511211216v596ff9acg150fd2994e98b159@mail.gmail.com>
Message-ID: <17282.19095.558236.430148@montanaro.dyndns.org>


    Brett> My question is whether anyone is willing to maintain it in the
    Brett> stdlib?

My answer is: I'm not sure it matters at this point.  There are so many
profiling possibilities, it doesn't seem like we yet know which options are
the best.  There is some tacit crowning of "best of breed" when a package is
added to the standard library, so we probably shouldn't be adding every
candidate that comes along until we have a better idea of the best way to do
things.

Skip


From martin at v.loewis.de  Mon Nov 21 23:48:29 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 21 Nov 2005 23:48:29 +0100
Subject: [Python-Dev] svn diff -r {2001-01-01}
Message-ID: <43824EBD.50402@v.loewis.de>

Greg Stein points out that because of the way the subversion
conversion was done, by-date revision specifications won't
work. Subversion assumes that time is monotonically increasing
over revision numbers - it does a binary search to find out
the revision that immediately precedes(?) the specified date.

Yet, as the conversion was done project-by-project (toplevel
svn dirs), commit time sometimes goes backward along with
increasing revision numbers; this breaks the algorithm
svn uses.
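
Why non-monotonic dates break a binary search can be seen in a small
sketch.  This is not Subversion's actual code; the commit years are made
up, and both function names are hypothetical.  It only shows that a
bisection over dates that restart partway through the revision sequence
lands on a different revision than a full scan does.

```python
import bisect

# Made-up commit years, indexed by revision (index 0 is the first revision).
# Two projects were converted one after the other, so the years start over.
commit_years = [2001, 2003, 2005, 2000, 2002, 2004]


def by_binary_search(years, target):
    """Index of the last entry <= target; only valid if years is sorted."""
    position = bisect.bisect_right(years, target)
    return position - 1 if position else None


def by_full_scan(years, target):
    """Index of the entry with the newest year <= target; no ordering assumed."""
    candidates = [(year, index) for index, year in enumerate(years) if year <= target]
    return max(candidates)[1] if candidates else None


print(by_binary_search(commit_years, 2003))  # index 4, the 2002 commit
print(by_full_scan(commit_years, 2003))      # index 1, the 2003 commit
```

On sorted input the two functions agree, which is why dates after the
switchover are unaffected.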

There are two ways in which you might want to use date specifications
(that I can think of): svn diff (find the changes since some date)
and svn up (check out revision at some date). If you need to
do such operations, you will have to look up the closest
revision number manually (e.g. in viewcvs, or through svn log).
If this is a common operation, I'm sure it would be possible
to put a table of commit dates for python/ somewhere, to find
the necessary revision number more quickly.

For dates past the switchover, everything is fine. So

svn diff -r{00:00} Lib/

works fine.

Regards,
Martin

From simon at arrowtheory.com  Tue Nov 22 05:30:38 2005
From: simon at arrowtheory.com (Simon Burton)
Date: Tue, 22 Nov 2005 15:30:38 +1100
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to
 2005-10-31
In-Reply-To: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com>
References: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com>
Message-ID: <20051122153038.030c8586.simon@arrowtheory.com>

On Thu, 17 Nov 2005 13:36:36 +1300
Tony Meyer <tony.meyer at gmail.com> wrote:

> 
> --------------
> AST for Python
> --------------
> 
> As of October 21st, Python's compiler now uses a real Abstract Syntax  
> Tree (AST)!  This should make experimenting with new syntax much  
> easier, as well as allowing some optimizations that were difficult  
> with the previous Concrete Syntax Tree (CST). 

> While there is no  
> Python interface to the AST yet, one is intended for the not-so- 
> distant future.

OK, who is doing this ? I am mad keen to get this happening.

Simon.


-- 
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com 

From abkhd at hotmail.com  Tue Nov 22 06:16:15 2005
From: abkhd at hotmail.com (A.B., Khalid)
Date: Tue, 22 Nov 2005 05:16:15 +0000
Subject: [Python-Dev] test_cmd_line on Windows
Message-ID: <BAY12-F7A12E367A717740C27791AB520@phx.gbl>

Currently test_directories of test_cmd_line fails on the latest Python 2.4.2 
from svn branch and from the svn head. The reason it seems is that the test 
assumes that the local language of Windows is English and so tries to find 
the string " denied" in the returned system error messages of the commands
("python .") and ("python < .").

But while it is true that the first command ("python .") does return an 
English string error message even on so-called non-English versions of 
Windows, the same does not seem to be true for the second command ("python < 
."), which seems to return a locale-related string error message. And since 
the latter test is looking for the English " denied" in a non-English 
language formatted string, the test fails in non-English versions of Windows.

Regards
Khalid



From nnorwitz at gmail.com  Tue Nov 22 06:20:51 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 21 Nov 2005 21:20:51 -0800
Subject: [Python-Dev] Fwd: [Python-checkins] commit of r41497 -
	python/trunk/Lib/test
In-Reply-To: <20051122051745.B440A1E400B@bag.python.org>
References: <20051122051745.B440A1E400B@bag.python.org>
Message-ID: <ee2a432c0511212120r2a4429c9lbda2ead70a3156f6@mail.gmail.com>

I just checked in the modification below.  I'm not sure if this
behaviour is on purpose or by accident.  Do we want to support hex
values in floats?
Do we want to support p, similar to e in floats?

Here are the lines from the test:

+            self.assertEqual(float("  0x3.1  "), 3.0625)
+            self.assertEqual(float("  -0x3.p-1  "), -1.5)

n

---------- Forwarded message ----------
From: neal.norwitz at python.org <neal.norwitz at python.org>
Date: Nov 21, 2005 9:17 PM
Subject: [Python-checkins]  commit of r41497 - python/trunk/Lib/test
To: python-checkins at python.org


Author: neal.norwitz
Date: Tue Nov 22 06:17:40 2005
New Revision: 41497

Modified:
   python/trunk/Lib/test/test_builtin.py
Log:
improve test coverage in Python/pystrtod.c and Python/mystrtoul.c.

Modified: python/trunk/Lib/test/test_builtin.py
==============================================================================
--- python/trunk/Lib/test/test_builtin.py       (original)
+++ python/trunk/Lib/test/test_builtin.py       Tue Nov 22 06:17:40 2005
@@ -545,6 +545,34 @@
             self.assertEqual(float(unicode("  3.14  ")), 3.14)
             self.assertEqual(float(unicode("  \u0663.\u0661\u0664 
",'raw-unicode-escape')), 3.14)

+    def test_float_with_comma(self):
+        # set locale to something that doesn't use '.' for the decimal point
+        try:
+            import locale
+            orig_locale = locale.setlocale(locale.LC_NUMERIC, '')
+            locale.setlocale(locale.LC_NUMERIC, 'fr_FR')
+        except:
+            # if we can't set the locale, just ignore this test
+            return
+
+        try:
+            self.assertEqual(locale.localeconv()['decimal_point'], ',')
+        except:
+            # this test is worthless, just skip it and reset the locale
+            locale.setlocale(locale.LC_NUMERIC, orig_locale)
+            return
+
+        try:
+            self.assertEqual(float("  3,14  "), 3.14)
+            self.assertEqual(float("  +3,14  "), 3.14)
+            self.assertEqual(float("  -3,14  "), -3.14)
+            self.assertEqual(float("  0x3.1  "), 3.0625)
+            self.assertEqual(float("  -0x3.p-1  "), -1.5)
+            self.assertEqual(float("  25.e-1  "), 2.5)
+            self.assertEqual(fcmp(float("  .25e-1  "), .025), 0)
+        finally:
+            locale.setlocale(locale.LC_NUMERIC, orig_locale)
+
     def test_floatconversion(self):
         # Make sure that calls to __float__() work properly
         class Foo0:
@@ -682,6 +710,7 @@
         self.assertRaises(TypeError, int, 1, 12)

         self.assertEqual(int('0123', 0), 83)
+        self.assertEqual(int('0x123', 16), 291)

     def test_intconversion(self):
         # Test __int__()

From arigo at tunes.org  Tue Nov 22 07:01:47 2005
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 22 Nov 2005 07:01:47 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <43823C53.8080403@v.loewis.de>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
	<43823C53.8080403@v.loewis.de>
Message-ID: <20051122060146.GA14960@code1.codespeak.net>

Hi Martin,

On Mon, Nov 21, 2005 at 10:29:55PM +0100, "Martin v. Löwis" wrote:
> > I see no incremental way of fixing some of the downsides of hotshot,
> > like its huge log file size and loading time.
> 
> I haven't looked into the details myself, but it appears that some
> google-summer-of-code contributor has found some way of fixing it.

As discussed elsewhere on this thread: this contribution did not fix any
of the mentioned problems.  The goal was only to get rid of profile.py
by linking it to Hotshot.  So the log file size didn't change and the
loading time was only 20-30% better, which is still a really long time.

> So essentially: fixing bugs isn't fun, but rewriting it from scratch is.

Well, sorry for being interested in having fun.  And yes, I am formally
committing myself to maintaining this new piece of software, because
that also looks like fun: it's simple code that does just what you
expect from it.

Note that I may sound too negative about Hotshot.  I see by now that it
is a very powerful piece of code, full of careful design trade-offs and
capabilities.  It can do much more than what the minimalistic
documentation says, e.g. it can or could be used as the basis of a
tracing tool to debug software, to measure test coverage, etc. (with
external tools).  Moreover, it comes with carefully chosen drawbacks --
log file size and loading time -- for advanced reasons.  You won't find
them discussed in the documentation, which makes user experience mostly
negative, but you do find them in Tim's e-mails :-)

So no, I'm not willing to debug and maintain an "unfinished" (quoting
Tim) advanced piece of software doing much more than what common-people-
reading-the-stdlib-docs use it for.  That is not fun.

> Now, it might be that in this specific case, replacing the library
> really is the right thing to do. It would be if:
> 1.it has improvements over the current library already
>    (certified by users other than the authors), AND
> 2.it has no drawbacks over the current library, AND
> 3.there is some clear indication that it will get better maintenance
>    than the previous library.

1. Log file size (could reuse the existing compact profile.py format) --
good "profile-tweak-reprofile" round-trip time for the developer (no
ages spent loading the log) -- ability to interpret the logs in memory,
no need for a file -- collecting children call stats.  Positive early
user experience comes from the authors, me, and at least one other
company (Strakt) that cared enough to push for lsprof on the SF tracker.

There is this widespread user experience that hotshot is nice "but it
doesn't actually appear to work" (as Nick Coghlan put it).  Hotshot is
indeed buggy and has been producing wrong timings all along (up to and
including the current HEAD version) as shown by the test_profile found
in the Summer of Code project mentioned above.  Now we can fix that one,
and see if things get better.  In some sense this fix will discard the
meaning of any previous user experience, so that lsprof has now more of
it than Hotshot...

2. Drawbacks: there are many, as Hotshot has many more capabilities or
potential capabilities than lsprof.  None of them is to be found in the
documentation of Hotshot, though.  There is no drawback for people using
Hotshot only as documented.  Of course we might keep both Hotshot and
lsprof in the stdlib, if this sounds like a problem, but I really think
the stdlib could do with clean-ups more than pile-ups.

3. Maintenance group: two core developers.


A bientot,

Armin.

From bcannon at gmail.com  Tue Nov 22 08:35:37 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 21 Nov 2005 23:35:37 -0800
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051122060146.GA14960@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
	<43823C53.8080403@v.loewis.de>
	<20051122060146.GA14960@code1.codespeak.net>
Message-ID: <bbaeab100511212335v7be01235o71f932593b7d3fe0@mail.gmail.com>

On 11/21/05, Armin Rigo <arigo at tunes.org> wrote:
> Hi Martin,
>
> On Mon, Nov 21, 2005 at 10:29:55PM +0100, "Martin v. Löwis" wrote:
> > > I see no incremental way of fixing some of the downsides of hotshot,
> > > like its huge log file size and loading time.
> >
> > I haven't looked into the details myself, but it appears that some
> > google-summer-of-code contributor has found some way of fixing it.
>
> As discussed elsewhere on this thread: this contribution did not fix any
> of the mentioned problems.  The goal was only to get rid of profile.py
> by linking it to Hotshot.  So the log file size didn't change and the
> loading time was only 20-30% better, which is still a really long time.
>
> > So essentially: fixing bugs isn't fun, but rewriting it from scratch is.
>
> Well, sorry for being interested in having fun.  And yes, I am formally
> committing myself to maintaining this new piece of software, because
> that also looks like fun: it's simple code that does just what you
> expect from it.
>
> Note that I may sound too negative about Hotshot.  I see by now that it
> is a very powerful piece of code, full of careful design trade-offs and
> capabilities.  It can do much more than what the minimalistic
> documentation says, e.g. it can or could be used as the basis of a
> tracing tool to debug software, to measure test coverage, etc. (with
> external tools).  Moreover, it comes with carefully chosen drawbacks --
> log file size and loading time -- for advanced reasons.  You won't find
> them discussed in the documentation, which makes user experience mostly
> negative, but you do find them in Tim's e-mails :-)
>
> So no, I'm not willing to debug and maintain an "unfinished" (quoting
> Tim) advanced piece of software doing much more than what common-people-
> reading-the-stdlib-docs use it for.  That is not fun.
>
> > Now, it might be that in this specific case, replacing the library
> > really is the right thing to do. It would be if:
> > 1.it has improvements over the current library already
> >    (certified by users other than the authors), AND
> > 2.it has no drawbacks over the current library, AND
> > 3.there is some clear indication that it will get better maintenance
> >    than the previous library.
>
> 1. Log file size (could reuse the existing compact profile.py format) --
> good "profile-tweak-reprofile" round-trip time for the developer (no
> ages spent loading the log) -- ability to interpret the logs in memory,
> no need for a file -- collecting children call stats.  Positive early
> user experience comes from the authors, me, and at least one other
> company (Strakt) that cared enough to push for lsprof on the SF tracker.
>
> There is this widespread user experience that hotshot is nice "but it
> doesn't actually appear to work" (as Nick Coghlan put it).  Hotshot is
> indeed buggy and has been producing wrong timings all along (up to and
> including the current HEAD version) as shown by the test_profile found
> in the Summer of Code project mentioned above.  Now we can fix that one,
> and see if things get better.  In some sense this fix will discard the
> meaning of any previous user experience, so that lsprof has now more of
> it than Hotshot...
>
> 2. Drawbacks: there are many, as Hotshot has many more capabilities or
> potential capabilities than lsprof.  None of them is to be found in the
> documentation of Hotshot, though.  There is no drawback for people using
> Hotshot only as documented.  Of course we might keep both Hotshot and
> lsprof in the stdlib, if this sounds like a problem, but I really think
> the stdlib could do with clean-ups more than pile-ups.
>

I am perfectly happy with having lsprof be added with all of this and
point 3 (any chance we can replace profile with a wrapper to lsprof
without much issue?).  As for cleanup, I say Hotshot should stay if we
can get it working properly and document its power features.  If we
can't get it to that state then it should go (maybe not until Python
3.0, but eventually).

> 3. Maintenance group: two core developers.

-Brett

From fredrik at pythonware.com  Tue Nov 22 08:48:25 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 22 Nov 2005 08:48:25 +0100
Subject: [Python-Dev] [Python-checkins] commit of r41497
	-python/trunk/Lib/test
References: <20051122051745.B440A1E400B@bag.python.org>
	<ee2a432c0511212120r2a4429c9lbda2ead70a3156f6@mail.gmail.com>
Message-ID: <dluigb$2as$1@sea.gmane.org>

Neal Norwitz wrote:

> I just checked in the modification below.  I'm not sure if this
> behaviour is on purpose or by accident.

Python 2.4 on Linux:

>>> float("  0x3.1  ")
3.0625

Python 2.4 on Windows:

>>> float("  0x3.1  ")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: invalid literal for float(): 0x3.1
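
One way to sidestep the platform difference shown above is to parse
hexadecimal input explicitly instead of relying on whatever the C
library's strtod() accepts.  A minimal sketch, assuming a present-day
Python 3 where float.fromhex() is available (it did not exist in the
2.4 sessions above); parse_float is a hypothetical helper name.

```python
def parse_float(text):
    """Parse decimal or hexadecimal float text the same way on every platform."""
    text = text.strip()
    unsigned = text.lstrip("+-").lower()
    if unsigned.startswith("0x"):
        # float.fromhex() handles the sign and the optional 'p' exponent itself.
        return float.fromhex(text)
    return float(text)


print(parse_float("  0x3.1  "))
print(parse_float("  -0x3.0p-1  "))
print(parse_float("  25.e-1  "))
```

Hexadecimal support then becomes a deliberate choice made in one place
rather than an accident of the underlying platform.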

</F>




From phd at mail2.phd.pp.ru  Tue Nov 22 10:00:43 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Tue, 22 Nov 2005 12:00:43 +0300
Subject: [Python-Dev] svn diff -r {2001-01-01}
In-Reply-To: <43824EBD.50402@v.loewis.de>
References: <43824EBD.50402@v.loewis.de>
Message-ID: <20051122090043.GA30828@phd.pp.ru>

On Mon, Nov 21, 2005 at 11:48:29PM +0100, "Martin v. Löwis" wrote:
> you will have to look up the closest
> revision number manually (e.g. in viewcvs, or through svn log).

   svn annotate (aka svn blame) may help too.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From walter at livinglogic.de  Tue Nov 22 14:13:56 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue, 22 Nov 2005 14:13:56 +0100
Subject: [Python-Dev] test_cmd_line on Windows
In-Reply-To: <BAY12-F7A12E367A717740C27791AB520@phx.gbl>
References: <BAY12-F7A12E367A717740C27791AB520@phx.gbl>
Message-ID: <43831994.6060104@livinglogic.de>

A.B., Khalid wrote:

> Currently test_directories of test_cmd_line fails on the latest Python 2.4.2 
> from svn branch and from the svn head. The reason it seems is that the test 
> assumes that the local language of Windows is English and so tries to find 
> the string " denied" in the returned system error messages of the commands
> ("python .") and ("python < .").
> 
> But while it is true that the first command ("python .") does return an 
> English string error message even on so-called non-English versions of 
> Windows, the same does not seem to be true for the second command ("python < 
> ."), which seems to return a locale-related string error message. And since 
> the latter test is looking for the English " denied" in a non-English 
> language formatted string, the test fails in non-English versions of Windows.

Does the popen2.popen4() used by the test provide the return value of the 
executed command?

Using os.system() instead seems to provide enough information:

On Windows:

Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on 
win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> import os
 >>> os.system("python < .")
Zugriff verweigert
1
 >>> os.system("python <NUL:")
Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on 
win32
Type "help", "copyright", "credits" or "license" for more information.
 >>>
0
 >>>

On Linux:

Python 2.4.2 (#1, Oct  3 2005, 15:51:22)
[GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import os
 >>> os.system("python < .")
35584
 >>> os.system("python < /dev/null")
0

Can you provide a patch to test_cmd_line.py?
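
A minimal sketch of what a locale-independent check could look like, assuming
the test only needs to know that the child interpreter failed, not what the
error message says. It uses the subprocess module rather than popen2, and the
helper name exit_code_of is made up for illustration; it is not part of
test_cmd_line.py.

```python
import subprocess
import sys


def exit_code_of(args):
    """Run a child Python with the given arguments and return its exit code.

    Hypothetical helper: the output is captured and ignored, so the check
    does not depend on the language of any error message.
    """
    proc = subprocess.run(
        [sys.executable] + list(args),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    return proc.returncode


# A child that exits cleanly reports 0; a failing one reports non-zero.
ok = exit_code_of(["-c", "pass"])
failed = exit_code_of(["-c", "raise SystemExit(3)"])
```

Comparing exit codes this way sidesteps the " denied" string entirely.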

Bye,
    Walter Dörwald

From decker at dacafe.com  Mon Nov 21 01:42:32 2005
From: decker at dacafe.com (decker@dacafe.com)
Date: Mon, 21 Nov 2005 00:42:32 -0000 (Australia/Sydney)
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
 modifications
In-Reply-To: <20051120150850.GA27838@unpythonic.net>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de>
	<20051120150850.GA27838@unpythonic.net>
Message-ID: <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>

<quote who="jepler at unpythonic.net">
> On Sat, Nov 19, 2005 at 11:06:16PM +0100, "Martin v. Löwis" wrote:
>> decker at dacafe.com wrote:
>> > I would appreciate feedback concerning these patches before the next
>> > "PythonD" (for DOS/DJGPP) is released.
>>
>> PEP 11 says that DOS is not supported anymore since Python 2.0. So
>> I am -1 on reintroducing support for it.


The local python community here in Sydney indicated that python.org is
only upset when groups port the source to 'obscure' systems and *don't*
submit patches... It is possible that I was misinformed.


> If we have someone who is volunteering the time to make it work, not just
> today but in the future as well, we shouldn't rule out re-adding support.


I am not sure about the future myself. DJGPP 2.04 has been parked at beta
for two years now. It might be fair to say that the *general* DJGPP
developer base has shrunk a little bit. But the PythonD userbase has
actually grown since the first release three years ago. For the time
being, people get very angry when the servers go down here :-)


> I've taken a glance at the patch.  There are probably a few things to
> quarrel over--for instance, it looks like a site.py change will cause python
> to print a blank line when it's started, and the removal of a
> '#define HAVE_FORK 1' in posixmodule.c---but this still doesn't mean the
> re-addition of DOS as a supported platform should be rejected out of hand.


Well, that's for sure! These patches have never been reviewed by
python.org before, so I am sure that there are *plenty* of ways to better
fit DOS support into the Python source.

Fork will never work under DOS, no matter how much we dream :-)

The empty line 'print' was a legacy error to kludge the ANSI color scheme
to work correctly. Long story. It can be ignored. In fact, none of the
changes to site.py are essential for python to work under DOS. They are
'additions' that most of the PythonD userbase seem to enjoy, but few knew
how to do for themselves at one time. But they aren't essential to the
port.

The important aspects are the path and stat stuff. Nothing works without
them. I should mention that one thing that never did get ported was the
build scripts themselves to accommodate DJGPP-DOS. For a complete port, we
must still look at Modules/makesetup to remember that although directory
separators "\\" or "/" are OK, the path separator ":" is definitely not.
";" must be used.

So far, we have simply changed Setup and the Makefiles by hand after
initial configure.
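
The separator problem can be illustrated in Python itself. This is only a
sketch: makesetup is a shell script, the helper split_search_path and the
directory names are invented for the example, and ntpath/posixpath stand in
for the DOS and Unix conventions.

```python
import ntpath
import posixpath


def split_search_path(value, flavour):
    """Split a search-path string using the separator of the given path module."""
    return [part for part in value.split(flavour.pathsep) if part]


# ':' separates entries on POSIX, but on DOS/Windows it follows the drive
# letter, so ';' has to be used between entries there.
posix_dirs = split_search_path("/usr/lib:/usr/local/lib", posixpath)
dos_dirs = split_search_path("c:/djgpp/lib;c:\\python\\lib", ntpath)
```

Splitting the DOS-style value on ':' instead would cut every entry in two at
the drive letter, which is why the build scripts cannot assume ':'.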


Ben



From falcon at intercable.ru  Mon Nov 21 08:02:04 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Mon, 21 Nov 2005 10:02:04 +0300
Subject: [Python-Dev]  str.dedent
Message-ID: <438170EC.8090509@intercable.ru>

>>     msg = textwrap.dedent('''\
>>         IDLE's subprocess can't connect to %s:%d.  This may be due \
>>         to your personal firewall configuration.  It is safe to \
>>         allow this internal connection because no data is visible on \
>>         external ports.''' % address)
>>
>
> Unfortunately, it won't help, since the 'dedent' method won't treat
> those spaces as indentation.
>

So it would be useful to have the parser implicitly dedent any string with a
'd' prefix:

     msg = d'''\
         IDLE's subprocess can't connect to %s:%d.  This may be due \
         to your personal firewall configuration.  It is safe to \
         allow this internal connection because no data is visible on \
         external ports.''' % address
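
For reference, the quoted objection can be reproduced with a short sketch.
The escaped newlines are removed when the literal is parsed, so the whole
message is one line and only its first run of spaces counts as indentation.
The message text is shortened here for illustration.

```python
import textwrap

# Each trailing backslash swallows the newline, so the parsed string is a
# single line; the spaces before "to your" end up in the middle of it.
msg = textwrap.dedent('''\
    This may be due \
    to your firewall.''')

has_newline = "\n" in msg
```

dedent() strips the four leading spaces, but the four spaces that followed
the continuation stay embedded in the text.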



From bend at ddaustralia.com.au  Mon Nov 21 08:08:45 2005
From: bend at ddaustralia.com.au (Ben Decker)
Date: Mon, 21 Nov 2005 18:08:45 +1100
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <43816CE2.2020808@v.loewis.de>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de>
	<20051120150850.GA27838@unpythonic.net>
	<25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>
	<43816CE2.2020808@v.loewis.de>
Message-ID: <20051121070845.GA12993@ithaca04.ddaustralia.local>

> It's not that much availability of the platform I worry about, but the
> commitment of the Python porter. We need somebody to forward bug
> reports to, and somebody to intervene if incompatible changes are made.
> This person would also indicate that the platform is no longer
> available, and hence the port can be removed.
> 
> Regards,
> Martin


I think the port has been supported for three years now. I am not sure what kind of commitment you are looking for, but the patch and software are supplied under the same terms of liability and warranty as anything else under the GPL. 

Bug reports can be sent to either python at exemail.com.au, decker at dacafe.com or developemnt at exemail.com.au.

From tony.meyer at gmail.com  Mon Nov 21 11:14:25 2005
From: tony.meyer at gmail.com (Tony Meyer)
Date: Mon, 21 Nov 2005 23:14:25 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-11-01 through
	2005-11-15
Message-ID: <6c63de570511210214o69c0a5b6q4683fcbd974441d3@mail.gmail.com>

Surprise! It's November, and here's a November summary <wink>. Thanks to all
those that proofread the triple summary hit last week; if anyone can spare
some time to take a look over these in the next couple of days, that would
be great. As always, corrections and suggestions to tony.meyer at gmail.com or
steven.bethard at gmail.com. A couple of largish threads were skipped: one
continuing discussion about freezing (I couldn't come up with a summary
longer than "the heated debate continued"), and one on weak reference
dereference notifications (I wasn't sure what to say). If anyone wants those
summarized, let me know (ideally with some hints!) and I'll add them in.

=============
Announcements
=============

----------------------------------------
PyPy 0.8.0 and Gothenburg PyPy Sprint II
----------------------------------------

`PyPy 0.8.0`_ has been released. This third release of PyPy includes a
translatable parser and AST compiler, some speed enhancements (translated
PyPy is now about 10 times faster than 0.7, but still 10-20 times slower
than CPython), and increased language compliance; some experimental features
are also now translatable. This release also includes snapshots of interesting,
but not yet completed, subprojects including the OOtyper (a RTyper variation
for higher-level backends), a JavaScript backend, a limited (PPC) assembler
backend, and some bits for a socket module.

The next PyPy Sprint is also coming up soon. The Gothenburg PyPy Sprint II
is on the 7th to 11th of December 2005 in Gothenburg, Sweden. Its focus is
heading towards phase 2, which means JIT work, alternate threading modules,
and logic programming. Newcomer-friendly introductions will also be given.
The main topics that are currently scheduled are the L3 interpreter (a small
fast interpreter for "assembler-level" flow graphs), Stackless (e.g.
Tasklets or Greenlets), porting C modules from CPython,
optimization/debugging work, and logic programming in Python.

.. _`PyPy 0.8.0`: http://codespeak.net/pypy/dist/pypy/doc/release-0.8.0.html

Contributing threads:

(1) - `PyPy 0.8.0 is released! <
http://mail.python.org/pipermail/python-dev/2005-November/057878.html>`__
(1) - `Gothenburg PyPy Sprint II: 7th - 11th December 2005 <
http://mail.python.org/pipermail/python-dev/2005-November/058143.html>`__

[TAM]

------------------------
PyCon Sprint suggestions
------------------------

Every PyCon has featured a python-dev `sprint`_. For the past few years,
hacking on the AST branch has been a tradition, but since the AST branch has
now been merged into the trunk, other options are worth considering this
year. Several PEP implementations were suggested, including `PEP 343`_
('with:'), `PEP 308`_ ('x if y else z'), `PEP 328`_ ('absolute/relative
import'), and `PEP 341`_ ('unifying try/except and try/finally').

Suggestions to continue the AST theme were also made, including one of the
"global variable speedup" PEPs, `Guido's instance variable speedup idea`_,
using the new AST code to improve/extend/rewrite the optimization steps the
compiler performs, or rewriting PyChecker to operate from the AST
representation.

Phillip J. Eby also suggested working on the oft-mentioned bytes type. All
of these suggestions, as well as any others that are made, are being
recorded on the `PythonCore sprint wiki`_.

.. _sprint: http://wiki.python.org/moin/PyCon2006/Sprints
.. _PEP 343: http://www.python.org/peps/pep-0343.html
.. _PEP 308: http://www.python.org/peps/pep-0308.html
.. _PEP 328: http://www.python.org/peps/pep-0328.html
.. _PEP 341: http://www.python.org/peps/pep-0341.html
.. _Guido's instance variable speedup idea:
http://mail.python.org/pipermail/python-dev/2002-February/019854.html
.. _PythonCore sprint wiki:
http://wiki.python.org/moin/PyCon2006/Sprints/PythonCore

Contributing threads:

(13) - `python-dev sprint at PyCon <
http://mail.python.org/pipermail/python-dev/2005-November/057830.html>`__
(1) - `PEP 328 - absolute imports (python-dev sprint at PyCon) <
http://mail.python.org/pipermail/python-dev/2005-November/057853.html>`__

[TAM]

--------------------------------------
Reminder: Python is now on Subversion!
--------------------------------------

Just a reminder to everyone that the Python source repository_ is now hosted
on Subversion. A few minor bugs were fixed, so you can make SVK mirrors of
the repository successfully now. Be sure to check out the newly revised
Python Developers FAQ_ if you haven't already.

.. _repository: http://svn.python.org/projects/
.. _FAQ: http://www.python.org/dev/devfaq.html

Contributing threads:

(4) - `Freezing the CVS on Oct 26 for SVN switchover <
http://mail.python.org/pipermail/python-dev/2005-November/057823.html>`__
(1) - `svn checksum error <
http://mail.python.org/pipermail/python-dev/2005-November/057843.html>`__
(6) - `Problems with revision 4077 of new SVN repository <
http://mail.python.org/pipermail/python-dev/2005-November/057867.html>`__
(4) - `No more problems with new SVN repository <
http://mail.python.org/pipermail/python-dev/2005-November/057888.html>`__
(7) - `dev FAQ updated with day-to-day svn questions <
http://mail.python.org/pipermail/python-dev/2005-November/057999.html>`__
(2) - `Mapping cvs version numbers to svn revisions? <
http://mail.python.org/pipermail/python-dev/2005-November/058051.html>`__
(2) - `Checking working copy consistency <
http://mail.python.org/pipermail/python-dev/2005-November/058056.html>`__
(9) - `Is some magic required to check out new files from svn? <
http://mail.python.org/pipermail/python-dev/2005-November/058065.html>`__

[SJB]

---------------------------
Updating the Python-Dev FAQ
---------------------------

Brett Cannon has generously volunteered to clean up some of the developers'
documentation and wants to know if people would rather have the bug/patch
guidelines in a classic paragraph-style layout or a more FAQ-style
layout. If you have an opinion on the topic, please let him know!

Contributing threads:

(2) - `dev FAQ updated with day-to-day svn questions <
http://mail.python.org/pipermail/python-dev/2005-November/058025.html>`__
(1) - `Revamping the bug/patch guidelines (was Re: Implementation of PEP
341) <http://mail.python.org/pipermail/python-dev/2005-November/058108.html
>`__

[SJB]

=========
Summaries
=========

-----------
Event loops
-----------

This thread started in a discussion on SourceForge about patches 1049855_
and 1252236_; Martin v. Löwis and Michiel de Hoon agreed that the fixes were
fragile, and that a larger change should be discussed on python-dev. Michiel
writes visualization software; he (and others, such as the writers of
matplotlib) has trouble creating a good event loop, because the GUI toolkit
(especially Tkinter) wants its own event loop to be in charge. Michiel
doesn't actually need Tkinter for his own project, but he has to play nice
with it because his users expect to be able to use other tools --
particularly IDLE -- while running his software.

Note that this isn't the first time this sort of problem has come up;
usually it is phrased in terms of a problem with Tix, or not being able to
run turtle while in IDLE. Event loops by their very nature are infinite
loops; once they start, everything else is out of luck unless it gets
triggered by an event or is already started.

Donovan Baarda suggested looking at Twisted for state of the art in event
loop integration. Unfortunately, as Phillip Eby notes, it works by not using
the Tkinter event loop. It decides for itself when to call dooneevent
(do-one-event). It is possible to run Tkinter's dooneevent version as part
of your own event loop (as Twisted does), but you can't really listen for
its events, so you end up with a busy loop polling, and stepping into lots
of "I have nothing to do" functions for every client eventloop. You can use
Tkinter's loop, but once it goes to sleep waiting for input, everything sort
of stalls out for a while, and even non-Tkinter events get queued instead of
processed.
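
The polling approach described above can be sketched with toy event sources.
This is only an illustration under stated assumptions: CountingSource,
do_one_event and run_polling_loop are invented names, not Tkinter's or
Twisted's API, and the sources simply count down a fixed number of events.

```python
import time


class CountingSource:
    """Toy stand-in for a toolkit: holds a fixed number of pending events."""

    def __init__(self, pending):
        self.pending = pending
        self.handled = 0

    def do_one_event(self):
        """Handle at most one event without blocking; report whether one was handled."""
        if self.pending == 0:
            return False
        self.pending -= 1
        self.handled += 1
        return True


def run_polling_loop(sources, idle_sleep=0.001, max_idle_rounds=3):
    """Poll every source in turn; return how many rounds found nothing to do."""
    idle_rounds = 0
    while idle_rounds < max_idle_rounds:
        did_work = False
        for source in sources:
            if source.do_one_event():    # non-blocking, like a "do one event" call
                did_work = True
        if not did_work:
            idle_rounds += 1             # wasted round: everyone had nothing to do
            time.sleep(idle_sleep)       # without this the loop spins at full speed
    return idle_rounds


gui = CountingSource(pending=3)
network = CountingSource(pending=1)
wasted = run_polling_loop([gui, network])
```

Every idle round calls into each source just to be told there is nothing to
do, which is the busy-polling cost the thread complains about; blocking inside
one source's own loop avoids that but starves the others.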

Mark Hammond suggests that it might be easier to replace the interactive
portions of python based on the "code" module. matplotlib suggests using
ipython instead of standard python for similar reasons. Another option might
be to always start Tk in a new thread, rather than letting it take over the
main thread. There was some concern (see patch 1049855) that Tkinter doesn't
- and shouldn't - require threading.

[Jim Jewett posted a summary of this very repetitive and confusing (to the
participants, not just summarizers!) thread towards its end, which this
summary is very heavily based on. Many thanks Jim!]

Contributing threads:

(60) - `Event loops, PyOS_InputHook, and Tkinter <
http://mail.python.org/pipermail/python-dev/2005-November/057954.html>`__
(4) - `Event loops, PyOS_InputHook, and Tkinter - Summary attempt <
http://mail.python.org/pipermail/python-dev/2005-November/058034.html>`__

.. _1049855: http://www.python.org/sf/1049855
.. _1252236: http://www.python.org/sf/1252236

[TAM]

-----------------------------
Importing .pyc and .pyo files
-----------------------------

Osvaldo Santana Neto pointed out that if a .pyo file exists, but a .pyc
doesn't, then a regularly run python will not import it (unless run with
-O), but if the .pyo is in a zip file (which is on the PYTHONPATH) then it
will import it. He felt that the inconsistency should be addressed and that
the zipimport behaviour was preferable. However, Guido said that his
intention was always that, without -O, *.pyo files are entirely ignored
(and, with -O, *.pyc files are entirely ignored). In other words, it is the
zipimport behaviour that is incorrect.

Guido suggested that perhaps .pyo should be deprecated altogether and
instead we could have a post-load optimizer optimize .pyc files according to
the current optimization settings. The two use cases presented for including
.pyo files but not .py files were in situations where disk space is at a
premium, and where a proprietary "canned" application is distributed to end
users who have no intention or need to ever add to the code.

A suggestion was that a new bytecode could be introduced for assertions that
would turn into a jump if assertions were disabled (with -O). Guido thought
that the idea had potential, but pointed out that it would take someone
thinking really hard about all the use cases, edge cases, implementation
details, and so on, in order to write a PEP. He suggested that Brett and
Phillip might be suitable volunteers for this.

Contributing thread:

(40) - `Inconsistent behaviour in import/zipimport hooks <
http://mail.python.org/pipermail/python-dev/2005-November/057959.html>`__

[TAM]

---------------------------------------
Default __hash__() and __eq__() methods
---------------------------------------

Noam Raphael suggested that having the default __hash__() and __eq__()
methods based off of the object's id() might have been a mistake. He
proposed that the default __hash__() method be removed, and the default
__eq__() method compare the two objects' __dict__ and slot members. Jim
Fulton offered a counter-proposal that both the default __hash__() and
__eq__() methods should be dropped for Python 3.0, but Guido convinced him
that removing __eq__() is probably a bad idea; it would mean an object
wouldn't compare equal to itself.

In the end, Guido decided that having a default __hash__() method based on
id() isn't really a bad decision; without it, you couldn't have sets of
"identity objects" (objects which don't have a usefully defined value-based
comparison). He suggested that the right decision was to make the hash()
function smarter, and have it raise an exception if a class redefined
__eq__() without redefining __hash__(). (In fact, this is what it used to
do, but it was lost when object.__hash__() was introduced.)
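
A short sketch of the behaviour under discussion, run on a current Python 3
rather than the 2.x interpreters of this thread. There, a class that defines
__eq__() without __hash__() becomes unhashable, which is close in spirit to
the check Guido describes. The class names are invented for the example.

```python
class Token:
    """An "identity object": no value-based comparison is defined."""


class Point:
    """Defines value equality but deliberately leaves out __hash__()."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


a, b = Token(), Token()
tokens = {a, b, a}            # default identity-based hash and equality

try:
    hash(Point(1, 2))
    point_is_hashable = True
except TypeError:
    point_is_hashable = False
```

The identity-based default keeps sets of Token objects working, while Point
has to supply its own __hash__() before it can be used as a set member or
dictionary key.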

Contributing threads:

(11) - `Why should the default hash(x) == id(x)? <
http://mail.python.org/pipermail/python-dev/2005-November/057859.html>`__
(14) - `Should the default equality operator compare values instead of
identities? <
http://mail.python.org/pipermail/python-dev/2005-November/057868.html>`__
(13) - `For Python 3k, drop default/implicit hash, and comparison <
http://mail.python.org/pipermail/python-dev/2005-November/057924.html>`__

[SJB]

---------------------------
Indented multi-line strings
---------------------------

Avi Kivity reintroduced the oft-requested means of writing a multi-line
string without getting the spaces from the code indentation. The usual
options were presented::

    def f(...):
        ...
        msg = ('From: %s\n'
               'To: %s\n'
               'Subject: Host failure report for %s\n')
        ...
        msg = '''\
    From: %s
    To: %s
    Subject: Host failure report for %s'''
        ...
        msg = textwrap.dedent('''\
            From: %s
            To: %s
            Subject: Host failure report for %s''')

Noam Raphael suggested that to simplify the latter option, textwrap.dedent()
should become a string method, str.dedent(). There were also a few
suggestions that this sort of dedenting should have syntactic support (e.g.
with an appropriate string prefix). In general, the discussion harkened back
to `PEP 295`_, a similar proposal that was previously rejected. People
tossed the ideas around for a bit, but it didn't look like any changes were
likely to be made.
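
As a quick check of what textwrap.dedent() buys, the sketch below compares a
flush-left triple-quoted string with an indented one passed through dedent().
The variable names are chosen for the example only.

```python
import textwrap

# Written flush against the left margin, breaking the code's indentation.
flush_left = '''\
From: %s
To: %s
Subject: Host failure report for %s'''

# Written at the surrounding indentation level and cleaned up at run time.
dedented = textwrap.dedent('''\
    From: %s
    To: %s
    Subject: Host failure report for %s''')

same = (flush_left == dedented)
```

The two strings are identical; the difference is that dedent() does its work
on every call, which is part of why a string method or prefix was proposed.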

.. _PEP 295: http://www.python.org/peps/pep-0295.html

Contributing threads:

(3) - `indented longstrings? <
http://mail.python.org/pipermail/python-dev/2005-November/058042.html>`__
(18) - `str.dedent <
http://mail.python.org/pipermail/python-dev/2005-November/058058.html>`__
(1) - `OT pet peeve (was: Re: str.dedent) <
http://mail.python.org/pipermail/python-dev/2005-November/058072.html>`__

[SJB]

------------------
Continued AST work
------------------

Neal Norwitz has been chasing down memory leaks; he believes that the
current AST is now as good as before the AST branch was merged in. Nick
Coghlan explained that he is particularly concerned about the returns hidden
inside macros in the AST compiler's symbol table generation and bytecode
generation steps. Niko Matsakis suggested that an arena is the way to go for
memory management; the goal is to be able to free memory en-masse whatever
happens and not have to track individual pointers. Jeremy Hylton noted that
the AST phase has a mixture of malloc/free and Python object allocation; he
felt that it should be straightforward to change the malloc/free to use an
arena API, but that a separate mechanism would be needed to associate a set
of PyObject* with the arena. The arena concept gained general approval, and
there was some discussion about how best to implement it.

In other AST news, Rune Holm submitted two_ patches_ for the AST compiler
that add better dead code elimination and constant folding, and Thomas Lee is
attempting to implement `PEP 341`_ (unification of try/except and
try/finally) and asked for some help (Nick Coghlan gave some suggestions).

.. _two: http://www.python.org/sf/1346214
.. _patches: http://www.python.org/sf/1346238
.. _PEP 341: http://python.org/peps/pep-0341.html

Contributing threads:

(1) - `Optimizations on the AST representation <
http://mail.python.org/pipermail/python-dev/2005-November/057865.html>`__
(4) - `Implementation of PEP 341 <
http://mail.python.org/pipermail/python-dev/2005-November/058075.html>`__
(1) - `ast status, memory leaks, etc <
http://mail.python.org/pipermail/python-dev/2005-November/058089.html>`__
(7) - `Memory management in the AST parser & compiler <
http://mail.python.org/pipermail/python-dev/2005-November/058138.html>`__
(1) - `PEP 341 patch & memory management (was: Memory management in the
AST parser & compiler) <
http://mail.python.org/pipermail/python-dev/2005-November/058142.html>`__

[TAM]

---------------------------------------------------------------------
Adding functional methods (reduce, partial, etc.) to function objects
---------------------------------------------------------------------

Raymond Hettinger suggested that some of the functionals, like map or
partial, might be appropriate as attributes of function objects. This would
allow code like::

    results = f.map(data)
    newf = f.partial(somearg)

A number of people liked the idea, but it was pointed out that map() and
partial() are intended to work with any callable, and turning these into
attributes of function objects would make it hard to use them with classes
that define __call__(). Guido emphasized this point, saying that
complicating the callable interface was a bad idea.
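
The objection can be made concrete with a small sketch. It uses the builtin
map() and functools.partial as found in current Python, not the proposed
method spelling, and the Scale class and add function are invented for the
example.

```python
from functools import partial


class Scale:
    """A callable class instance, not a function object."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


def add(a, b):
    return a + b


double = Scale(2)
doubled = list(map(double, [1, 2, 3]))   # map() accepts any callable
add_ten = partial(add, 10)               # partial() over a plain function
triple = partial(Scale(3))               # ...and over a callable instance
```

With the method spelling, Scale would have to grow its own map and partial
attributes before double.map(data) could work, which is the extra burden on
the callable interface that Guido objected to.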

Contributing thread:

(9) - `a different kind of reduce... <
http://mail.python.org/pipermail/python-dev/2005-November/057828.html>`__

[SJB]

--------------------------------------------------
Distributing debug build binaries (python2x_d.dll)
--------------------------------------------------

David Abrahams asked whether it would be possible for python.org to make a
debug build of the Python DLL more accessible. Thomas Heller pointed out that
the Microsoft debug runtime DLLs are not distributable (which is why the
Windows installer does not include the debug version of the Python DLL), and
that the ActiveState distribution contains Python debug DLLs.

Tim Peters explained that when he used to collect up the debug-build bits at
the time the official installer was built, they weren't included in the main
installer, because they bloated its size for something that most users don't
want. He explained that he stopped collecting the bits because no two users
wanted the same set of stuff, and so it grew so large that people complained
that it was too big.

Tim suggested that the best thing to do would be to define precisely what an
acceptable distribution format is and what exactly it should contain. Martin
indicated that he would accept a patch that picked up the files and packages
them, and he would include them in the official distribution.

Contributing thread:

(12) - `Plea to distribute debugging lib <
http://mail.python.org/pipermail/python-dev/2005-November/057896.html>`__

[TAM]

-----------------------------------
Creating a python-dev-announce list
-----------------------------------

Jack Jansen suggested that a low-volume, moderated, python-dev-announce
mailing list be created for time-critical announcements for people
developing Python. The main benefit would be the ability to keep up with
important announcements such as new releases, the switch to svn, and so on,
even when developers don't have time to keep up with all threads.
Additionally, it would be easier to separate out such announcements, even
when following all threads.

Although these summaries exist (and the announcements section at the top
pretty much covers what Jack is after), the summaries occur at least a week
after the end of the period that they cover, which could be as much as three
weeks after any announcement (if it occurred on the first of a month, for
example).

I suggested that a simpler possibility might follow along the lines of the
PEP topic that the python-checkins list provides (a feature of Mailman).
This would still require some sort of effort by the announcer (e.g. putting
some sort of tag in the subject), but wouldn't require an additional list,
or additional moderators.

However, Martin pointed out that this would put an extra burden on people to
remember to post to such a list; this burden would also exist using the
Mailman topic mechanism. There wasn't much apparent support for the list, so
this seems unlikely to occur at present. Of course, that could be because
the people that would like it are too busy to have noticed the thread yet
<0.5 wink>, so perhaps there is more to come.

Contributing thread:

(7) - `Proposal: can we have a python-dev-announce mailing list? <
http://mail.python.org/pipermail/python-dev/2005-November/057880.html>`__

[TAM]

===============
Skipped Threads
===============

(2) - `Divorcing str and unicode (no more implicit conversions). <
http://mail.python.org/pipermail/python-dev/2005-November/057827.html>`__
(1) - `[C++-sig] GCC version compatibility <
http://mail.python.org/pipermail/python-dev/2005-November/057831.html>`__
(2) - `PYTHOPN_API_VERSION <
http://mail.python.org/pipermail/python-dev/2005-November/057879.html>`__
(3) - `Adding examples to PEP 263 <
http://mail.python.org/pipermail/python-dev/2005-November/057891.html>`__
(3) - `Class decorators vs metaclasses <
http://mail.python.org/pipermail/python-dev/2005-November/057904.html>`__
(2) - `PEP 352 Transition Plan <
http://mail.python.org/pipermail/python-dev/2005-November/057911.html>`__
(4) - `PEP submission broken? <
http://mail.python.org/pipermail/python-dev/2005-November/057935.html>`__
(7) - `cross-compiling <
http://mail.python.org/pipermail/python-dev/2005-November/057939.html>`__
(1) - `[OTAnn] Feedback <
http://mail.python.org/pipermail/python-dev/2005-November/057941.html>`__
(1) - `Weekly Python Patch/Bug Summary <
http://mail.python.org/pipermail/python-dev/2005-November/057949.html>`__
(4) - `Unifying decimal numbers. <
http://mail.python.org/pipermail/python-dev/2005-November/057951.html>`__
(1) - `int(string) (was: DRAFT: python-dev Summary for 2005-09-01 through
2005-09-16) <
http://mail.python.org/pipermail/python-dev/2005-November/057994.html>`__
(3) - `to_int -- oops, one step missing for use. <
http://mail.python.org/pipermail/python-dev/2005-November/058006.html>`__
(2) - `(no subject) <
http://mail.python.org/pipermail/python-dev/2005-November/058023.html>`__
(7) - `Building Python with Visual C++ 2005 Express Edition <
http://mail.python.org/pipermail/python-dev/2005-November/058024.html>`__
(3) - `Coroutines (PEP 342) <
http://mail.python.org/pipermail/python-dev/2005-November/058133.html>`__
(13) - `Weak references: dereference notification <
http://mail.python.org/pipermail/python-dev/2005-November/057961.html>`__
(8) - `apparent ruminations on mutable immutables (was: PEP 351, the freeze
protocol) <
http://mail.python.org/pipermail/python-dev/2005-November/057839.html>`__

From fb102 at soton.ac.uk  Mon Nov 21 17:41:04 2005
From: fb102 at soton.ac.uk (Floris Bruynooghe)
Date: Mon, 21 Nov 2005 16:41:04 +0000
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051121111430.GB13478@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com>
	<20051121111430.GB13478@code1.codespeak.net>
Message-ID: <20051121164104.GA8898@laurie.sheepb.homeip.net>

Hello

On Mon, Nov 21, 2005 at 12:14:30PM +0100, Armin Rigo wrote:
> On Sun, Nov 20, 2005 at 08:55:49PM -0500, Tim Peters wrote:
> > We should note that hotshot didn't intend to reduce total time
> > overhead.  What it's aiming at here is to be less disruptive (than
> > profile.py) to the code being profiled _while_ that code is running. 
> 
> > hotshot tries to stick with tiny little C functions that pack away a
> > tiny amount of data each time, and avoid memory alloc/dealloc, to try
> > to minimize this disruption.  It looked like it was making real
> > progress on this at one time ;-)
> 
> I see the point.  I suppose that we can discuss if hotshot is really
> nicer on the D cache, as it produces a constant stream of data, whereas
> classical profilers like lsprof would in the common case only update a
> few counters in existing data structures.  I can tweak lsprof a bit
> more, though -- there is a malloc on each call, but it could be avoided.
> 
> Still, people generally agree that profile.py, while taking a longer
> time overall, gives more meaningful results than hotshot.

When I looked into this at the beginning of the summer I could find
none around on the net.  And since hotshot had been around a lot
longer than the new lsprof I just made a conservative choice.

> Now Brett's
> student, Floris, extended hotshot to allow custom timers.  This is
> essential, because it enables testing.  The timing parts of hotshot were
> not tested previously.

Don't be too enthusiastic here.  My aim was to replicate the profile
module and thus I needed to hack this into hotshot.  However I feel
like it is not entirely in hotshot's ideals to do this.  The problem
is that the call to the timing function is accounted to the code that
is being profiled afaik.  Since a generic timer interface was needed
this means that the call goes out from the C code back to Python and
back to whatever-the-timing-function-is-written-in.  Thus wrongly
accounting even more time to the profiled code (not sure how long
execing a python statement takes from a C module).  Just keep this in
mind.

> Given the high correlation between untestedness and brokenness, you bet
> that Floris' adapted test_profile for hotshot gives wrong numbers.  (My
> guess is that Floris overlooked that test_profile was an output test, so
> he didn't compare the resulting numbers with the expected ones.)

Iirc I did compare the output of test_profile between profile and my
wrapper.  This was one of my checks to make sure it was wrapped
correctly.  So could you tell me how they are different?

On a stdlib note, one recommended and good working profiler would
definitely be better than two or three all with their own quirks.


Greetings
Floris

-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org

From fb102 at soton.ac.uk  Mon Nov 21 17:45:03 2005
From: fb102 at soton.ac.uk (Floris Bruynooghe)
Date: Mon, 21 Nov 2005 16:45:03 +0000
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <20051121111426.GA13478@code1.codespeak.net>
References: <20051119180855.GA26733@code1.codespeak.net>
	<bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com>
	<20051121111426.GA13478@code1.codespeak.net>
Message-ID: <20051121164503.GB8898@laurie.sheepb.homeip.net>

On Mon, Nov 21, 2005 at 12:14:26PM +0100, Armin Rigo wrote:
> Hi Brett, hi Floris,
> 
> On Sat, Nov 19, 2005 at 04:12:28PM -0800, Brett Cannon wrote:
> > Just  for everyone's FYI while we are talking about profilers, Floris
> > Bruynooghe (who I am cc'ing on this so he can contribute to the
> > conversation), for Google's Summer of Code, wrote a replacement for
> > 'profile' that uses Hotshot directly.  Thanks to his direct use of
> > Hotshot and rewrite of pstats it loads Hotshot data 30% faster and
> > also alleviates keeping 'profile' around and its slightly questionable
> > license.
> 
> Thanks for the note!  30% faster than an incredibly long time is still
> quite long, but that's an improvement, I suppose.

It is indeed still a long time.  But it was more of a secondary aim
really.

>  However, this code is
> not ready yet.  For example the new loader gives wrong results in the
> presence of recursive function calls.

Afaik I did test recursive calls etc.  I must admit that I don't think
anyone else apart from me tested it, which is far from ideal and thus
it is bound to still have bugs.
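
A recursion check of the kind under discussion could look roughly like
the sketch below.  It assumes the standard profile and pstats modules
rather than the Summer of Code loader, and fact is a hypothetical
function chosen only because its call counts are easy to predict.

```python
import profile
import pstats


def fact(n):
    """Recursive factorial: fact(5) makes five calls, one of them primitive."""
    return 1 if n <= 1 else n * fact(n - 1)


prof = profile.Profile()
value = prof.runcall(fact, 5)
stats = pstats.Stats(prof).stats

# Each value is (primitive calls, total calls, tottime, cumtime, callers).
fact_entries = [
    entry for (_file, _line, name), entry in stats.items() if name == "fact"
]
primitive_calls, total_calls = fact_entries[0][:2]
```

A loader that mishandles recursion would typically show up here as a
primitive call count that differs from 1 or a total that differs from 5.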

Could you provide a test case for this?

Cheers
Floris

-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org

From amauryfa at gmail.com  Tue Nov 22 09:17:00 2005
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 22 Nov 2005 09:17:00 +0100
Subject: [Python-Dev]  ast status, memory leaks, etc
Message-ID: <e27efe130511220017i29eb0e0bl@mail.gmail.com>

Hello,

Purify is not so difficult to use: just run and learn to read the output ;-)
My config: Win2k using VC6sp5, and only 512Mb RAM.
I downloaded the snapshot dated 2005/11/21 05:01,
commented out #define WITH_PYMALLOC,
built in debug mode,
modified the rt.bat file to use purify,
and ran "rt -d".

Here are the most important results so far :

1 - Memory error in test_coding, while importing bad_coding.py :
IPR: Invalid pointer read in tok_nextc {1 occurrence}
    Reading 1 byte from 0x048af076 (1 byte at 0x048af076 illegal)
    Address 0x048af076 points into a malloc'd block in unallocated
region of heap 0x03120000
    Thread ID: 0x718
    Error location
        tok_nextc      [tokenizer.c:881]
        tok_get        [tokenizer.c:1104]
        PyTokenizer_Get [tokenizer.c:1495]
        parsetok       [parsetok.c:125]
        PyParser_ParseFileFlags [parsetok.c:89]
        PyParser_ASTFromFile [pythonrun.c:1293]
        parse_source_module [import.c:778]
        load_source_module [import.c:905]
        load_module    [import.c:1665]
        import_submodule [import.c:2259]

2 - Stack overflow in test_compile.test_extended_arg. No need to
Purify, the debug build is enough to reproduce the problem.

Because of the stack overflow, the test suite stopped. I ran some
random tests alone, to get memory leak reports, but there is no
significant message so far.
Today I'll try the complete test suite, excluding test_compile only.

--
Amaury

From arigo at tunes.org  Tue Nov 22 15:35:52 2005
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 22 Nov 2005 15:35:52 +0100
Subject: [Python-Dev] s/hotshot/lsprof
In-Reply-To: <bbaeab100511212335v7be01235o71f932593b7d3fe0@mail.gmail.com>
References: <20051119180855.GA26733@code1.codespeak.net>
	<4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org>
	<43817375.6040108@v.loewis.de>
	<20051121114101.GC13478@code1.codespeak.net>
	<43823C53.8080403@v.loewis.de>
	<20051122060146.GA14960@code1.codespeak.net>
	<bbaeab100511212335v7be01235o71f932593b7d3fe0@mail.gmail.com>
Message-ID: <20051122143552.GA19036@code1.codespeak.net>

Hi Brett,

On Mon, Nov 21, 2005 at 11:35:37PM -0800, Brett Cannon wrote:
> (any chance we can replace profile with a wrapper to lsprof
> without much issue?)

Yes.  In fact I am thinking about adding lsprof under the module name
'cProfile', to keep true to the (IMHO) good tradition of pickle/cPickle
and StringIO/cStringIO.

We could also just call it 'profile' and drop the existing profile.py,
but I'm not in favor of that.  Having pure Python equivalents of our
modules is good.  When I am in a good mood I am thinking that it would
instead be fun to rewrite profile.py to look exactly like lsprof.  Not
sure pstats would be that much fun, though, and I can't be bothered by
license issues too much.  Whoever cares can probably derive a pstats
replacement from the Summer of Code project.


A bientot,

Armin.

From vinay_sajip at red-dove.com  Tue Nov 22 16:17:19 2005
From: vinay_sajip at red-dove.com (Vinay Sajip)
Date: Tue, 22 Nov 2005 15:17:19 -0000
Subject: [Python-Dev] Proposed additional keyword argument in logging calls
Message-ID: <001a01c5ef77$d7682300$0200a8c0@alpha>

On numerous occasions, requests have been made for the ability to easily add
user-defined data to logging events. For example, a multi-threaded server
application may want to output specific information to a particular server
thread (e.g. the identity of the client, specific protocol options for the
client connection, etc.)

This is currently possible, but you have to subclass the Logger class and
override its makeRecord method to put custom attributes in the LogRecord.
These can then be output using a customised format string containing e.g.
"%(foo)s %(bar)d". The approach is usable but requires more work than
necessary.
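
For reference, the subclassing approach described above can be sketched
as follows.  ClientLogger and its client attribute are hypothetical
names, and the sketch is written against the current logging module,
whose makeRecord takes more arguments than it did in 2005, hence the
*args/**kwargs pass-through.

```python
import io
import logging


class ClientLogger(logging.Logger):
    """Hypothetical Logger subclass that stamps every record with a client id."""

    client = "unknown"

    def makeRecord(self, *args, **kwargs):
        record = super().makeRecord(*args, **kwargs)
        record.client = self.client        # custom attribute on the LogRecord
        return record


stream = io.StringIO()
handler = logging.StreamHandler(stream)
# The custom attribute is then available to the format string.
handler.setFormatter(logging.Formatter("%(client)s %(message)s"))

logger = ClientLogger("demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.client = "10.0.0.1"
logger.warning("hello")

output = stream.getvalue()
```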

I'd like to propose a simpler way of achieving the same result, which
requires use of an additional optional keyword argument in logging calls.
The signature of the (internal) Logger._log method would change from

  def _log(self, level, msg, args, exc_info=None)

to

  def _log(self, level, msg, args, exc_info=None, extra_info=None)

The extra_info argument will be passed to Logger.makeRecord, whose signature
will change from

  def makeRecord(self, name, level, fn, lno, msg, args, exc_info):

to

  def makeRecord(self, name, level, fn, lno, msg, args, exc_info,
extra_info)

makeRecord will, after doing what it does now, use the extra_info argument
as follows:

If type(extra_info) != types.DictType, it will be ignored.

Otherwise, any entries in extra_info whose keys are not already in the
LogRecord's __dict__ will be added to the LogRecord's __dict__.
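
The two rules above can be sketched as a stand-alone function.  This is
only an illustration of the proposed semantics, not the code that would
be checked in: merge_extra_info is a hypothetical helper operating on a
plain dict that stands in for the LogRecord's __dict__, and
"type(x) is not dict" is used as the equivalent of the types.DictType
comparison.

```python
def merge_extra_info(record_dict, extra_info):
    """Apply the proposed extra_info rules to a stand-in for LogRecord.__dict__."""
    if type(extra_info) is not dict:
        # Rule 1: anything that is not exactly a dict is silently ignored.
        return record_dict
    for key, value in extra_info.items():
        # Rule 2: existing attributes win; only new keys are added.
        if key not in record_dict:
            record_dict[key] = value
    return record_dict


record = {"msg": "connected", "levelname": "INFO"}
merge_extra_info(record, {"client": "10.0.0.1", "msg": "ignored"})
merge_extra_info(record, "whatever")     # not a dict, so nothing changes
```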

Can anyone see any problems with this approach? If not, I propose to post
the approach on python-list and then if there are no strong objections,
check it in to the trunk. (Since it could break existing code, I'm assuming
(please correct me if I'm wrong) that it shouldn't go into the
release24-maint branch.)

Of course, if anyone can suggest a better way of doing it, I'm all ears :-)

Regards,


Vinay Sajip


From nnorwitz at gmail.com  Tue Nov 22 19:13:13 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 22 Nov 2005 10:13:13 -0800
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <e27efe130511220017i29eb0e0bl@mail.gmail.com>
References: <e27efe130511220017i29eb0e0bl@mail.gmail.com>
Message-ID: <ee2a432c0511221013s75f52ab0he98e36860f7a3649@mail.gmail.com>

On 11/22/05, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote:
> Hello,
>
> Purify is not so difficult to use: just run and learn to read the output ;-)

Amaury,

Thank you for running Purify.

> 1 - Memory error in test_coding, while importing bad_coding.py :
> IPR: Invalid pointer read in tok_nextc {1 occurrence}

There is a patch for this on SourceForge.  It's pretty new.

> Because of the stack overflow, the test suite stopped. I ran some
> random tests alone, to get memory leak reports, but there is no
> significant message so far.
> Today I'll try the complete test suite, excluding test_compile only.

Great.  Thanks!

n

From bcannon at gmail.com  Tue Nov 22 20:31:34 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 22 Nov 2005 11:31:34 -0800
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to
	2005-10-31
In-Reply-To: <20051122153038.030c8586.simon@arrowtheory.com>
References: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com>
	<20051122153038.030c8586.simon@arrowtheory.com>
Message-ID: <bbaeab100511221131n39da6ad2q3604365b09d2e45@mail.gmail.com>

On 11/21/05, Simon Burton <simon at arrowtheory.com> wrote:
> On Thu, 17 Nov 2005 13:36:36 +1300
> Tony Meyer <tony.meyer at gmail.com> wrote:
>
> >
> > --------------
> > AST for Python
> > --------------
> >
> > As of October 21st, Python's compiler now uses a real Abstract Syntax
> > Tree (AST)!  This should make experimenting with new syntax much
> > easier, as well as allowing some optimizations that were difficult
> > with the previous Concrete Syntax Tree (CST).
>
> > While there is no
> > Python interface to the AST yet, one is intended for the not-so-
> > distant future.
>
> OK, who is doing this ? I am mad keen to get this happening.
>

No one yet.  Some ideas have been tossed around (read the thread for
details), but no one has sat down to hammer out the details.  Might
happen at PyCon.

-Brett

From barbieri at gmail.com  Tue Nov 22 20:48:38 2005
From: barbieri at gmail.com (Gustavo Sverzut Barbieri)
Date: Tue, 22 Nov 2005 17:48:38 -0200
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com>
	<ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com>
	<438048B6.2030103@v.loewis.de>
	<ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
Message-ID: <9ef20ef30511221148g905deefo548a8fb3e68a08ae@mail.gmail.com>

On 11/20/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> Thanks I was going to look into the resizing and forgot about it.
> Running without pymalloc confirmed that there weren't more serious
> problems.

At least with gentoo's Python 2.4.2, I get a bunch of errors from
invalid reads and jumps/moves that depend on uninitialized values in
PyObject_Free().

Running:

valgrind --leak-check=full --leak-resolution=high --show-reachable=yes
python -c "pass" 2> ~/python-2.4.2-valgrind.log

gives me the attached log file.

--
Gustavo Sverzut Barbieri
--------------------------------------
Computer Engineer 2001 - UNICAMP
Mobile: +55 (19) 9165 8010
 Phone:  +1 (347) 624 6296 @ sip.stanaphone.com
Jabber: gsbarbieri at jabber.org
  ICQ#: 17249123
   MSN: barbieri at gmail.com
   GPG: 0xB640E1A2 @ wwwkeys.pgp.net
-------------- next part --------------
A non-text attachment was scrubbed...
Name: python-2.4.2-valgrind.log.bz2
Type: application/x-bzip2
Size: 4080 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20051122/76707dd5/python-2.4.2-valgrind.log-0001.bin

From fredrik at pythonware.com  Tue Nov 22 20:55:52 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 22 Nov 2005 20:55:52 +0100
Subject: [Python-Dev] ast status, memory leaks, etc
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com><ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com><438048B6.2030103@v.loewis.de><ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
	<9ef20ef30511221148g905deefo548a8fb3e68a08ae@mail.gmail.com>
Message-ID: <dlvt41$cvl$1@sea.gmane.org>

Gustavo Sverzut Barbieri wrote:

> At least with gentoo's Python 2.4.2, I get a bunch of errors from
> invalid reads and jumps/moves that depend on uninitialized values in
> PyObject_Free().
>
> Running:
>
> valgrind --leak-check=full --leak-resolution=high --show-reachable=yes
> python -c "pass" 2> ~/python-2.4.2-valgrind.log


did you read the instructions ?

$ more Misc/README.valgrind

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Misc/README.valgrind?view=markup

</F> 




From p.f.moore at gmail.com  Tue Nov 22 21:10:36 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 22 Nov 2005 20:10:36 +0000
Subject: [Python-Dev] Proposed additional keyword argument in logging
	calls
In-Reply-To: <001a01c5ef77$d7682300$0200a8c0@alpha>
References: <001a01c5ef77$d7682300$0200a8c0@alpha>
Message-ID: <79990c6b0511221210g47bc10eas2531726871da92ba@mail.gmail.com>

On 11/22/05, Vinay Sajip <vinay_sajip at red-dove.com> wrote:
> makeRecord will, after doing what it does now, use the extra_info argument
> as follows:
>
> If type(extra_info) != types.DictType, it will be ignored.
>
> Otherwise, any entries in extra_info whose keys are not already in the
> LogRecord's __dict__ will be added to the LogRecord's __dict__.
>
> Can anyone see any problems with this approach?

I'd suggest that you raise an error if extra_info doesn't act like a
dictionary - probably, just try to add its entries and let any error
pass back to the caller. You definitely want to allow dict subclasses,
and anything that acts like a dictionary. And you want to catch errors
like

    log(..., extra_info = "whatever")

with a format of "... %(extra_info)s..." (ie, assuming that extra_info
is a single value - it's what I expected you to propose when I started
reading).
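
A sketch of this stricter variant, under the assumption that "acts like
a dictionary" means "has an items() method": the hypothetical helper
below accepts any mapping and simply lets the error from a non-mapping
argument propagate to the caller.

```python
from collections import OrderedDict


def merge_extra_info_strict(record_dict, extra_info=None):
    """Merge any mapping into record_dict; non-mappings raise instead of being ignored."""
    if extra_info is None:
        return record_dict
    # No type check: a str or int has no .items(), so AttributeError propagates.
    for key, value in extra_info.items():
        record_dict.setdefault(key, value)
    return record_dict


# A dict subclass is accepted.
record = merge_extra_info_strict({"msg": "hi"}, OrderedDict(client="c1"))

# A plain string is reported as an error rather than silently dropped.
try:
    merge_extra_info_strict({"msg": "hi"}, "whatever")
    string_rejected = False
except AttributeError:
    string_rejected = True
```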

The rest looks good (I don't have a need for it myself, but it looks
like a nice, clean solution to the problem you describe).

Paul.

From reinhold-birkenfeld-nospam at wolke7.net  Tue Nov 22 22:47:37 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Tue, 22 Nov 2005 22:47:37 +0100
Subject: [Python-Dev] something is wrong with test___all__
Message-ID: <dm03lq$41u$1@sea.gmane.org>

Hi,

on my machine, "make test" hangs at test_colorsys.

Careful investigation shows that when the bytecode is freshly generated
by "make all" (precisely in test___all__) the .pyc file is different from what a
direct call to "regrtest.py test_colorsys" produces.

Curiously, a call to "regrtest.py test___all__" instead of "make test" produces
the correct bytecode.

I can only suspect some AST bug here.

Reinhold

-- 
Mail address is perfectly valid!


From simon at arrowtheory.com  Wed Nov 23 10:15:29 2005
From: simon at arrowtheory.com (Simon Burton)
Date: Wed, 23 Nov 2005 09:15:29 +0000
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to
 2005-10-31
In-Reply-To: <bbaeab100511221131n39da6ad2q3604365b09d2e45@mail.gmail.com>
References: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com>
	<20051122153038.030c8586.simon@arrowtheory.com>
	<bbaeab100511221131n39da6ad2q3604365b09d2e45@mail.gmail.com>
Message-ID: <20051123091529.6b5ae4d7.simon@arrowtheory.com>

On Tue, 22 Nov 2005 11:31:34 -0800
Brett Cannon <bcannon at gmail.com> wrote:

> 
> On 11/21/05, Simon Burton <simon at arrowtheory.com> wrote:
> > On Thu, 17 Nov 2005 13:36:36 +1300
> > Tony Meyer <tony.meyer at gmail.com> wrote:
> >
> > >
> > > --------------
> > > AST for Python
> > > --------------
> > >
> > > As of October 21st, Python's compiler now uses a real Abstract Syntax
> > > Tree (AST)!  This should make experimenting with new syntax much
> > > easier, as well as allowing some optimizations that were difficult
> > > with the previous Concrete Syntax Tree (CST).
> >
> > > While there is no
> > > Python interface to the AST yet, one is intended for the not-so-
> > > distant future.
> >
> > OK, who is doing this ? I am mad keen to get this happening.
> >
> 
> No one yet.  Some ideas have been tossed around (read the thread for
> details), but no one has sat down to hammer out the details.  Might
> happen at PyCon.
> 
> -Brett

Yes, I've been reading the threads but I don't see anything about a Python interface.
The reason I'm asking is that I could probably convince my employer to let me (or an intern) work on it.
And PyCon is not until February. I am likely to start hacking on this before then.

Simon.

-- 
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com 

From steven.bethard at gmail.com  Tue Nov 22 23:30:07 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Tue, 22 Nov 2005 15:30:07 -0700
Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT:
	python-dev...)
Message-ID: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>

I wrote (in the summary):
> While there is no interface to the AST yet, one is
> intended for the not-so-distant future.

Simon Burton wrote:
> who is doing this ? I am mad keen to get this happening.

Brett Cannon wrote:
> No one yet.  Some ideas have been tossed around (read the thread for
> details), but no one has sat down to hammer out the details.  Might
> happen at PyCon.

Simon Burton wrote:
> Yes, i've been reading the threads but I don't see anything
> about a python interface. Why I'm asking is because I could
> probably convince my employer to let me (or an intern) work
> on it. And pycon is not until February. I am likely to start
> hacking on this before then.

Basically, all I saw was your post asking for a Python interface[1],
and a few "not yet" responses.  I suspect that if you were to
volunteer to head up the work on the Python interface, no one would be
likely to stop you. ;-)

[1]http://mail.python.org/pipermail/python-dev/2005-October/057611.html

Steve
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From bcannon at gmail.com  Wed Nov 23 01:02:40 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 22 Nov 2005 16:02:40 -0800
Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT:
	python-dev...)
In-Reply-To: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
Message-ID: <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>

On 11/22/05, Steven Bethard <steven.bethard at gmail.com> wrote:
> I wrote (in the summary):
> > While there is no interface to the AST yet, one is
> > intended for the not-so-distant future.
>
> Simon Burton wrote:
> > who is doing this ? I am mad keen to get this happening.
>
> Brett Cannon wrote:
> > No one yet.  Some ideas have been tossed around (read the thread for
> > details), but no one has sat down to hammer out the details.  Might
> > happen at PyCon.
>
> Simon Burton wrote:
> > Yes, i've been reading the threads but I don't see anything
> > about a python interface. Why I'm asking is because I could
> > probably convince my employer to let me (or an intern) work
> > on it. And pycon is not until February. I am likely to start
> > hacking on this before then.
>
> Basically, all I saw was your post asking for a Python interface[1],
> and a few "not yet" responses.  I suspect that if you were to
> volunteer to head up the work on the Python interface, no one would be
> likely to stop you. ;-)
>
> [1]http://mail.python.org/pipermail/python-dev/2005-October/057611.html
>

All of the discussion has just been "we hope to have it some day" with
no real planning.  =)  There are two problems to this topic: how to
get the AST structs into Python objects and how to allow Python code
to modify the AST before bytecode emission (or perhaps even after for
in-place optimization).

To get the AST into Python objects, there are two options.  One is to
use the AST grammar to generate struct -> serialized form -> Python
objects and vice-versa.  There might be some rough code already there
in the form of emitting a string that looks like Scheme code that
represents the AST.  Then Python code could use that to make up
objects, manipulate, translate back into its serialized form, and then
back into the AST structs.  It sounds like a lot but with the grammar
right there it should be an automated generation of code to make.

The other option is to have all AST structs be contained in PyObjects.
 Neil suggested this for the whole memory problem since we could then
just do proper refcounting and we all know how to do that (supposedly
=) .  With that then all it is to get access is to pass the PyObject
of the root out and make sure that the proper attributes or accessor
methods (I prefer the former) are available.  Once again this can be
auto-generated from the AST grammar.

The second problem is where to give access to the AST from within
Python.  One place is the command-line.  One could be able to specify
the path to function objects (using import syntax, e.g.,
``optimizations.static.folding``) on the command-line that are always
applied to all generated bytecode.  Another possibility is to have an
iterable in sys that is iterated over every time something has bytecode
generated.  Each call to the iterator would return a function that
took in an AST object and returned an AST object.  Another possibility
is to have a function (like ``ast()`` as a built-in)  to pass in a
code object and then have the AST returned for that code object.  If a
function was provided that took an AST and returned the bytecode then
selective AST access can be given instead of applying across the board
(this could allow for decorators that performed AST optimizations or
even hotshot stuff).
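
To make the "function that takes an AST object and returns an AST
object" idea concrete, here is a minimal sketch using the ast module
that later shipped in the standard library.  It is not the interface
proposed here; FoldAdd and optimize are hypothetical names, and the
folding rule is deliberately tiny.

```python
import ast


class FoldAdd(ast.NodeTransformer):
    """Hypothetical optimization: fold additions of integer literals."""

    def visit_BinOp(self, node):
        self.generic_visit(node)           # fold the operands first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, int)
                and isinstance(node.right.value, int)):
            folded = ast.Constant(value=node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


def optimize(tree):
    """AST in, AST out: the shape of hook described above."""
    return ast.fix_missing_locations(FoldAdd().visit(tree))


tree = optimize(ast.parse("x = 1 + 2 + 3"))
namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)   # AST -> bytecode -> run
folded_value = tree.body[0].value
```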

Obviously this is all pie-in-the-sky stuff.  Getting the memory leak
situation resolved is a bigger priority in my mind than any of this.
But if I had my way I think that having all AST objects be PyObjects
and then providing support for all three ways of getting access to the
AST (command-line, sys iterable, function for specific code object)
would be fantastic.

-Brett

From nnorwitz at gmail.com  Wed Nov 23 02:48:33 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 22 Nov 2005 17:48:33 -0800
Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT:
	python-dev...)
In-Reply-To: <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
	<bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
Message-ID: <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>

On 11/22/05, Brett Cannon <bcannon at gmail.com> wrote:
>
> But if I had my way I think that having all AST objects be PyObjects
> and then providing support for all three ways of getting access to the
> AST (command-line, sys iterable, function for specific code object)
> would be fantastic.

There needs to be a function that takes a filename (or string of code)
and returns an AST.  Hmm, it would be nice to give a function a module
name (like from an import statement) and have Python resolve it using
the normal sys.path iteration.

n

From bcannon at gmail.com  Wed Nov 23 03:32:59 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 22 Nov 2005 18:32:59 -0800
Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT:
	python-dev...)
In-Reply-To: <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
	<bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
	<ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
Message-ID: <bbaeab100511221832j7939e3e2wdd3a7bff42d4765a@mail.gmail.com>

On 11/22/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 11/22/05, Brett Cannon <bcannon at gmail.com> wrote:
> >
> > But if I had my way I think that having all AST objects be PyObjects
> > and then providing support for all three ways of getting access to the
> > AST (command-line, sys iterable, function for specific code object)
> > would be fantastic.
>
> There needs to be a function that takes a filename (or string of code)
> and returns an AST.

"Yes" and "I guess".  =)  I can see the filename to check a module
useful for stuff like PyChecker.  But for a string of code, I don't
think it would be that critical; if you provide a way to get the AST
for a code object you can just pass the string to compile() and then
get the AST from there.

>  Hmm, it would be nice to give a function a module
> name (like from an import statement) and have Python resolve it using
> the normal sys.path iteration.
>

Yep, import path -> filename path would be cool.

-Brett

From pje at telecommunity.com  Wed Nov 23 03:58:58 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 22 Nov 2005 21:58:58 -0500
Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT:
 python-dev...)
In-Reply-To: <bbaeab100511221832j7939e3e2wdd3a7bff42d4765a@mail.gmail.co
 m>
References: <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
	<d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
	<bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
	<ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com>

At 06:32 PM 11/22/2005 -0800, Brett Cannon wrote:
> >  Hmm, it would be nice to give a function a module
> > name (like from an import statement) and have Python resolve it using
> > the normal sys.path iteration.
> >
>
>Yep, import path -> filename path would be cool.

Zipped and frozen modules don't have filename paths, so I'd personally 
rather see fewer stdlib modules making the assumption that modules are 
files.  Instead, extensions to the PEP 302 loader protocol should be used 
to support introspection, assuming there aren't already equivalent 
capabilities available.  For example, PEP 302 allows a 'get_source()' 
method on loaders, and I believe the zipimport loader supports that.  (I 
don't know about frozen modules.)
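
A sketch of that loader-based approach, written against today's
standard library rather than anything available in 2005:
importlib.util.find_spec resolves the module name through the normal
import machinery, and the loader's PEP 302 get_source() supplies the
text, so no filename is assumed.  module_ast is a hypothetical helper
name.

```python
import ast
import importlib.util


def module_ast(module_name):
    """Return the AST of a module located via its loader, not via a file path."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot locate {module_name!r}")
    # get_source() is optional in the loader protocol, so probe for it.
    get_source = getattr(spec.loader, "get_source", None)
    source = get_source(module_name) if get_source else None
    if source is None:
        raise ImportError(f"no source available for {module_name!r}")
    return ast.parse(source, filename=spec.origin or "<unknown>")


tree = module_ast("colorsys")
function_names = [
    node.name for node in tree.body if isinstance(node, ast.FunctionDef)
]
```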

The main barrier to this being really usable is the absence of loader 
objects for the built-in import process.  This was proposed by PEP 302, but 
never actually implemented, probably due to time constraints on the Python 
2.3 release schedule.

It's relatively easy to implement this "missing loader class" in Python, 
though, and in fact the PEP 302 regression test in the stdlib does exactly 
that.  Some work, however, would be required to port this to C and expose 
it from an appropriate module (imp?).


From krumms at gmail.com  Wed Nov 23 09:44:27 2005
From: krumms at gmail.com (Thomas Lee)
Date: Wed, 23 Nov 2005 18:44:27 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <dll2v3$78g$1@sea.gmane.org>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>	<437B2075.1000102@gmail.com>
	<dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org>
Message-ID: <43842BEB.5000406@gmail.com>

Neil Schemenauer wrote:

>Fredrik Lundh <fredrik at pythonware.com> wrote:
>>Thomas Lee wrote:
>>>Even if it meant we had just one function call - one, safe function call
>>>that deallocated all the memory allocated within a function - that we
>>>had to put before each and every return, that's better than what we
>>>have.
>>
>>alloca?
>
>Perhaps we should use the memory management technique that the rest
>of Python uses: reference counting.  I don't see why the AST
>structures couldn't be PyObjects.
>
>  Neil
>
I'm +1 for reference counting. It's going to be a little error prone 
initially (certainly much less error prone than the current system in 
the long run), but the pooling/arena idea is going to screw with all 
sorts of stuff within the AST and possibly in bits of Python/compile.c 
too. At least, all my attempts wound up looking that way :)

Cheers,
Tom



From ncoghlan at gmail.com  Wed Nov 23 14:51:59 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 Nov 2005 23:51:59 +1000
Subject: [Python-Dev] PEP 302,
 PEP 338 and imp.getloader (was Re: a Python interface for the AST
 (WAS: DRAFT: python-dev...)
In-Reply-To: <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com>
References: <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>	<d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>	<bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>	<ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
	<5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com>
Message-ID: <438473FF.8020107@gmail.com>

Phillip J. Eby wrote:
> At 06:32 PM 11/22/2005 -0800, Brett Cannon wrote:
>>>  Hmm, it would be nice to give a function a module
>>> name (like from an import statement) and have Python resolve it using
>>> the normal sys.path iteration.
>>>
>> Yep, import path -> filename path would be cool.
> 
> Zipped and frozen modules don't have filename paths, so I'd personally 
> rather see fewer stdlib modules making the assumption that modules are 
> files.  Instead, extensions to the PEP 302 loader protocol should be used 
> to support introspection, assuming there aren't already equivalent 
> capabilities available.  For example, PEP 302 allows a 'get_source()' 
> method on loaders, and I believe the zipimport loader supports that.  (I 
> don't know about frozen modules.)
> 
> The main barrier to this being really usable is the absence of loader 
> objects for the built-in import process.  This was proposed by PEP 302, but 
> never actually implemented, probably due to time constraints on the Python 
> 2.3 release schedule.
> 
> It's relatively easy to implement this "missing loader class" in Python, 
> though, and in fact the PEP 302 regression test in the stdlib does exactly 
> that.  Some work, however, would be required to port this to C and expose 
> it from an appropriate module (imp?).

Prompted by this, I finally got around to reading PEP 302 to see how it 
related to PEP 338 (which is intended to fix the current limitations of the 
'-m' switch by providing a Python fallback when the basic C code can't find 
the module to be run).

The key thing that is missing is the "imp.getloader" functionality discussed 
at the end of PEP 302.

Using that functionality and the exec statement, PEP 338 could easily be 
modified to support any module accessed via a loader which supports get_code() 
(and it could probably also get rid of all of the current cruft dealing with 
normal filesystem packages).

So with that in mind, I'm thinking of updating PEP 338 to propose the following:

1. A new pure Python module called "runpy"

2. A function called "runpy.execmodule" that is very similar to execfile, but 
takes a module reference instead of a filename. It will NOT support 
modification of the caller's namespace (based on recent discussions regarding 
the exec statement). argv[0] and the name __file__ in the execution dictionary 
will be set to the file name for real files (those of type PY_SOURCE or 
PY_COMPILED), and the module reference otherwise. An optional argument will 
permit argv[0] (and __file__) to be forced to a specific value.**

3. A function called "runpy.get_source" that, given a module reference, 
retrieves the source code for that module via loader.get_source()

4. A function called "runpy.get_code" that, given a module reference, 
retrieves the code object for that module via loader.get_code()

5. A function called "runpy.is_runnable" that, given a module reference, 
determines if execmodule will work on that module (e.g. by checking that the 
loader provides the getcode method, that loader.is_package returns false, etc)

6. If invoked as a script, runpy interprets argv[1] as the module to run

7. If the '-m' switch fails to find a module, it invokes runpy as a fallback.
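
For comparison, a runpy module did later ship in the standard library,
though its API differs from the list above: the entry point is
run_module rather than execmodule.  The sketch below uses that later
API, with colorsys and the run name "colorsys_demo" chosen arbitrarily
for illustration.

```python
import runpy

# Locate the module through the import system, execute its code in a fresh
# namespace, and get that namespace back; the caller's globals are untouched.
module_globals = runpy.run_module("colorsys", run_name="colorsys_demo")

# Names defined by the module are available in the returned dictionary.
hsv = module_globals["rgb_to_hsv"](1.0, 0.0, 0.0)
```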

To make PEP 338 independent of the C implementation of imp.getloader for PEP 
302 being finished, it would propose two private elements in runpy: 
runpy._getloader and runpy._StandardImportMetaHook

If imp.getloader was available, it would be assigned to runpy._getloader, 
otherwise runpy would fall back on the Python equivalents.

** I'm open to suggestions on how to deal with argv[0] and __file__. They 
should be set to whatever __file__ would be set to by the module loader, but 
the Importer Protocol in PEP 302 doesn't seem to expose that information. The 
current proposal is a compromise that matches the existing behaviour of -m 
(which supports scripts like regrtest.py) while still giving a meaningful 
value for scripts which are not part of the normal filesystem.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From pj at place.org  Wed Nov 23 06:04:55 2005
From: pj at place.org (Paul Jimenez)
Date: Tue, 22 Nov 2005 23:04:55 -0600
Subject: [Python-Dev] urlparse brokenness
Message-ID: <20051123050455.9010E7FBF@place.org>


It is my assertion that urlparse is currently broken.  Specifically, I 
think that urlparse breaks an abstraction boundary with ill effect.

In writing a mailclient, I wished to allow my users to specify their
imap server as a url, such as 'imap://user:password at host:port/'. Which
worked fine. I then thought that the natural extension to support
configuration of imapssl would be 'imaps://user:password at host:port/'....
which failed - user:password at host:port got parsed as the *path* of
the URL instead of the network location. It turns out that urlparse
keeps a table of url schemes that 'use netloc'... that is to say,
that have a 'user:password at host:port' part to their URL. I think this
'special knowledge' about particular schemes 1) breaks an abstraction
boundary by having a function whose charter is to pull apart a
particularly-formatted string behave differently based on the meaning of
the string instead of the structure of it and 2) fails to be extensible
or forward compatible due to hardcoded 'magic' strings - if schemes were
somehow 'registerable' as 'netloc using' or not, then this objection
might be nullified, but the previous objection would still stand.

So I propose that urlsplit, the main offender, be replaced with something
that looks like:

def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')):
    """Parse a URL into 5 components:
    <scheme>://<netloc>/<path>?<query>#<fragment>
    Return a 5-tuple: (scheme, netloc, path, query, fragment).
    Note that we don't break the components up in smaller bits
    (e.g. netloc is a single string) and we don't expand % escapes."""
    key = url, scheme, allow_fragments, default
    cached = _parse_cache.get(key, None)
    if cached:
        return cached
    if len(_parse_cache) >= MAX_CACHE_SIZE: # avoid runaway growth
        clear_cache()

    if "://" in url:
        uscheme, npqf = url.split("://", 1)
    else:
        uscheme = scheme
        if not uscheme:
            uscheme = default[0]
        npqf = url
    pathidx = npqf.find('/')
    if pathidx == -1:  # not found
        netloc = npqf
        path, query, fragment = default[1:4]
    else:
        netloc = npqf[:pathidx]
        pqf = npqf[pathidx:]
        if '?' in pqf:
            path, qf = pqf.split('?',1)
        else:
            path, qf = pqf, ''.join(default[3:5])
        if ('#' in qf) and allow_fragments:
            query, fragment = qf.split('#',1)
        else:
            query, fragment = default[3:5]
    tuple = (uscheme, netloc, path, query, fragment)
    _parse_cache[key] = tuple
    return tuple

Note that I'm not sold on the _parse_cache, but I'm assuming it was there
for a reason so I'm leaving that functionality as-is.

If this isn't the right forum for this discussion, or the right place to 
submit code, please let me know.  Also, please cc: me directly on responses
as I'm not subscribed to the firehose that is python-dev.

  --pj


From aahz at pythoncraft.com  Wed Nov 23 17:55:29 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 23 Nov 2005 08:55:29 -0800
Subject: [Python-Dev] urlparse brokenness
In-Reply-To: <20051123050455.9010E7FBF@place.org>
References: <20051123050455.9010E7FBF@place.org>
Message-ID: <20051123165529.GA4322@panix.com>

On Tue, Nov 22, 2005, Paul Jimenez wrote:
>
> If this isn't the right forum for this discussion, or the right place
> to submit code, please let me know.  Also, please cc: me directly on
> responses as I'm not subscribed to the firehose that is python-dev.

This is the right forum for discussion.  You should post your patch to
SourceForge *before* starting a discussion on python-dev, including a
link to the patch in your post.  It is not essential, but it is certainly
a courtesy to subscribe to python-dev for the duration of the discussion;
you can feel free to filter threads you're not interested in.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From pje at telecommunity.com  Wed Nov 23 19:25:44 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 23 Nov 2005 13:25:44 -0500
Subject: [Python-Dev] PEP 302,
 PEP 338 and imp.getloader (was Re: a Python interface for the AST
 (WAS: DRAFT: python-dev...)
In-Reply-To: <438473FF.8020107@gmail.com>
References: <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com>
	<ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
	<d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
	<bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
	<ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
	<5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051123131857.03aba388@mail.telecommunity.com>

At 11:51 PM 11/23/2005 +1000, Nick Coghlan wrote:
>The key thing that is missing is the "imp.getloader" functionality discussed
>at the end of PEP 302.

This isn't hard to implement per se; setuptools for example has a 
'get_importer' function, and going from importer to loader is simple:

def get_importer(path_item):
     """Retrieve a PEP 302 "importer" for the given path item

     If there is no importer, this returns a wrapper around the builtin import
     machinery.  The returned importer is only cached if it was created by a
     path hook.
     """
     try:
         importer = sys.path_importer_cache[path_item]
     except KeyError:
         for hook in sys.path_hooks:
             try:
                 importer = hook(path_item)
             except ImportError:
                 pass
             else:
                 break
         else:
             importer = None

     sys.path_importer_cache.setdefault(path_item,importer)
     if importer is None:
         try:
             importer = ImpWrapper(path_item)
         except ImportError:
             pass
     return importer

So with the above function you could do something like:

def get_loader(fullname, path):
     for path_item in path:
         try:
             loader = get_importer(path_item).find_module(fullname)
             if loader is not None:
                 return loader
         except ImportError:
             continue
     else:
         return None

in order to implement the rest.


>** I'm open to suggestions on how to deal with argv[0] and __file__. They
>should be set to whatever __file__ would be set to by the module loader, but
>the Importer Protocol in PEP 302 doesn't seem to expose that information. The
>current proposal is a compromise that matches the existing behaviour of -m
>(which supports scripts like regrtest.py) while still giving a meaningful
>value for scripts which are not part of the normal filesystem.

Ugh.  Those are tricky, no question.  I can think of several simple answers 
for each, all of which are wrong in some way.  :)


From greg.ewing at canterbury.ac.nz  Thu Nov 24 04:47:02 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 24 Nov 2005 16:47:02 +1300
Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT:
 python-dev...)
In-Reply-To: <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
	<bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
Message-ID: <438537B6.7020009@canterbury.ac.nz>

Brett Cannon wrote:

> There are two problems to this topic; how to
> get the AST structs into Python objects and how to allow Python code
> to modify the AST before bytecode emission

I'm astounded to hear that the AST isn't made from
Python objects in the first place. Is there a particular
reason it wasn't done that way?

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From mike at skew.org  Thu Nov 24 06:38:39 2005
From: mike at skew.org (Mike Brown)
Date: Wed, 23 Nov 2005 22:38:39 -0700 (MST)
Subject: [Python-Dev] urlparse brokenness
In-Reply-To: <20051123050455.9010E7FBF@place.org>
Message-ID: <200511240538.jAO5cdb8012274@chilled.skew.org>

Paul Jimenez wrote:
> So I propose that urlsplit, the main offender, be replaced with something
> that looks like:
> 
> def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')):

+1 in principle.

You should probably do a
    global _parse_cache

and add 'is not None' after 'if cached'.
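
For illustration, here is a scheme-independent split with the 'is not None' test applied, written as a Python 3 sketch. The name simple_urlsplit and the cache size are assumptions, and the default tuple argument is left out for brevity. A global statement is not strictly required here because the cache dictionary is only mutated, never rebound. The sketch also keeps the query string when no fragment is present.

```python
_parse_cache = {}
MAX_CACHE_SIZE = 20  # assumed value for this sketch


def simple_urlsplit(url, scheme="", allow_fragments=True):
    """Split a URL into (scheme, netloc, path, query, fragment) by structure only."""
    key = (url, scheme, allow_fragments)
    cached = _parse_cache.get(key)
    if cached is not None:
        return cached
    if len(_parse_cache) >= MAX_CACHE_SIZE:  # avoid runaway growth
        _parse_cache.clear()

    if "://" in url:
        uscheme, rest = url.split("://", 1)
    else:
        uscheme, rest = scheme, url

    # Peel the fragment and query off the end before looking for the path.
    fragment = ""
    if allow_fragments and "#" in rest:
        rest, fragment = rest.split("#", 1)
    query = ""
    if "?" in rest:
        rest, query = rest.split("?", 1)

    slash = rest.find("/")
    if slash == -1:
        netloc, path = rest, ""
    else:
        netloc, path = rest[:slash], rest[slash:]

    result = (uscheme, netloc, path, query, fragment)
    _parse_cache[key] = result
    return result


print(simple_urlsplit("imaps://user:password@host:993/INBOX?x=1#top"))
```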

From bcannon at gmail.com  Thu Nov 24 07:52:12 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 23 Nov 2005 22:52:12 -0800
Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT:
	python-dev...)
In-Reply-To: <438537B6.7020009@canterbury.ac.nz>
References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
	<bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
	<438537B6.7020009@canterbury.ac.nz>
Message-ID: <bbaeab100511232252l40892b56vc348a5b899accef4@mail.gmail.com>

On 11/23/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Brett Cannon wrote:
>
> > There are two problems to this topic; how to
> > get the AST structs into Python objects and how to allow Python code
> > to modify the AST before bytecode emission
>
> I'm astounded to hear that the AST isn't made from
> Python objects in the first place. Is there a particular
> reason it wasn't done that way?
>

I honestly don't know, Greg.  All of the structs are generated by
Parser/asdl_c.py which reads in the AST definition from
Parser/Python.asdl .  The code that is used to allocate and initialize
the structs is in Python/Python-ast.c and is also auto-generated by
Parser/asdl_c.py .

I am guessing here, but it might have to do with type safety.  Some
nodes can be different kinds of subnodes (like the stmt node) and thus
are created using a single struct and a bunch of unions internally.  So
there is some added security that stuff is being done correctly.

Otherwise memory is the only other reason I can think of.  Or Jeremy
just didn't think of doing it that way when this was all started years
ago.  =)  But since it is all auto-generated it should be doable to
make them Python objects.

-Brett

From martin at v.loewis.de  Thu Nov 24 10:01:14 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 Nov 2005 10:01:14 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437ADDCF.7080906@ee.byu.edu>
References: <437ADDCF.7080906@ee.byu.edu>
Message-ID: <4385815A.5090705@v.loewis.de>

Travis Oliphant wrote:
> In the long term, what is the status of plans to re-work the Python 
> Memory manager to free memory that it acquires (or improve the detection 
> of already freed memory locations).

The Python memory manager does reuse memory that has been deallocated
earlier. There are patches "floating around" that make it return
unused memory to the system (which it currently doesn't).

Regards,
Martin

From martin at v.loewis.de  Thu Nov 24 10:06:59 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 Nov 2005 10:06:59 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437BC524.2030105@ee.byu.edu>
References: <20051116120346.A434.JCARLSON@uci.edu>
	<dlg5gt$q1g$1@sea.gmane.org>	<20051116145820.A43A.JCARLSON@uci.edu>
	<437BC524.2030105@ee.byu.edu>
Message-ID: <438582B3.80204@v.loewis.de>

Travis Oliphant wrote:
> As verified by removing usage of the Python PyObject_MALLOC function, it 
> was the Python memory manager that was performing poorly.   Even though 
> the array-scalar objects were deleted, the memory manager would not 
> re-use their memory for later object creation. Instead, the memory 
> manager kept allocating new arenas to cover the load (when it should 
> have been able to re-use the old memory that had been freed by the 
> deleted objects--- again, I don't know enough about the memory manager 
> to say why this happened).

One way (I think the only way) this could happen is if:
- the objects being allocated are all smaller than 256 bytes
- when allocating new objects, the requested size was different
   from any other size previously deallocated.

So if you first allocate 1,000,000 objects of size 200, and then
release them, and then allocate 1,000,000 objects of size 208,
the memory is not reused.

If the objects are all of same size, or all larger than 256 bytes,
this effect does not occur.
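
The size-class behaviour can be modelled in a few lines. This is a rough sketch rather than the allocator's actual code: the constants (8-byte rounding, a 256-byte small-object limit) are assumptions about obmalloc at the time, and size_class is a hypothetical helper.

```python
ALIGNMENT = 8                  # assumed: small requests are rounded up to 8 bytes
SMALL_REQUEST_THRESHOLD = 256  # assumed: larger requests go to the system malloc


def size_class(nbytes):
    """Return the small-object size class for a request, or None if not small."""
    if nbytes <= 0 or nbytes > SMALL_REQUEST_THRESHOLD:
        return None
    return (nbytes - 1) // ALIGNMENT


# 200 and 208 bytes fall into different classes, so blocks freed from one
# class are not handed out directly for requests in the other.
print(size_class(200), size_class(208))
```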

Regards,
Martin

From martin at v.loewis.de  Thu Nov 24 10:14:41 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 Nov 2005 10:14:41 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437C54AA.9020203@ee.byu.edu>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>	<437BE7A8.5000503@ee.byu.edu>	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>
	<437C54AA.9020203@ee.byu.edu>
Message-ID: <43858481.5060202@v.loewis.de>

Travis Oliphant wrote:
> So, I now believe that his code (plus the array scalar extension type) 
> was actually exposing a real bug in the memory manager itself.  In 
> theory, the Python memory manager should have been able to re-use the 
> memory for the array-scalar instances because they are always the same 
> size.  In practice, the memory was apparently not being re-used but 
> instead new blocks were being allocated to handle the load.

That is really very hard to believe. Most people on this list would
probably agree that obmalloc certainly *will* reuse deallocated memory
if the next request is for the very same size (number of bytes) that
the previously-released object had.

> His code is quite complicated and it is difficult to replicate the 
> problem.  

That the code is complex would not so much be a problem: we often
analyse complex code here. It is a problem that the code is not
available, and it would be a problem if the problem was not
reproducible even if you had the code (i.e. if the problem would
sometimes occur, but not the next day when you ran it again).

So if you can, please post the code somewhere, and add a bugreport
on sf.net/projects/python.

Regards,
Martin

From fredrik at pythonware.com  Thu Nov 24 10:19:31 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 24 Nov 2005 10:19:31 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
References: <20051116120346.A434.JCARLSON@uci.edu><dlg5gt$q1g$1@sea.gmane.org>	<20051116145820.A43A.JCARLSON@uci.edu><437BC524.2030105@ee.byu.edu>
	<438582B3.80204@v.loewis.de>
Message-ID: <dm40j6$bn7$1@sea.gmane.org>

Martin v. Löwis wrote:

> One way (I think the only way) this could happen is if:
> - the objects being allocated are all smaller than 256 bytes
> - when allocating new objects, the requested size was different
>    from any other size previously deallocated.
>
> So if you first allocate 1,000,000 objects of size 200, and then
> release them, and then allocate 1,000,000 objects of size 208,
> the memory is not reused.
>
> If the objects are all of same size, or all larger than 256 bytes,
> this effect does not occur.

but the allocator should be able to move empty pools between size
classes via the freepools list, right ?  or am I missing something ?

maybe what's happening here is more like

    So if you first allocate 1,000,000 objects of size 200, and then
    release most of them, and then allocate 1,000,000 objects of
    size 208, all memory might not be reused.

?

</F>




From robert.kern at gmail.com  Thu Nov 24 10:59:57 2005
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 24 Nov 2005 01:59:57 -0800
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <43858481.5060202@v.loewis.de>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>	<437BE7A8.5000503@ee.byu.edu>	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>	<437C54AA.9020203@ee.byu.edu>
	<43858481.5060202@v.loewis.de>
Message-ID: <dm42uu$i4m$1@sea.gmane.org>

Martin v. Löwis wrote:

> That the code is complex would not so much be a problem: we often
> analyse complex code here. It is a problem that the code is not
> available, and it would be a problem if the problem was not
> reproducible even if you had the code (i.e. if the problem would
> sometimes occur, but not the next day when you ran it again).

You can get the version of scipy_core just before the fix that Travis
applied:

  svn co -r 1488 http://svn.scipy.org/svn/scipy_core/trunk

The fix:

  http://projects.scipy.org/scipy/scipy_core/changeset/1489
  http://projects.scipy.org/scipy/scipy_core/changeset/1490

Here's some code that eats up memory with rev1488, but not with the HEAD:

"""
import scipy

a = scipy.arange(10)
for i in xrange(10000000):
    x = a[5]
"""

-- 
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter


From abo at minkirri.apana.org.au  Thu Nov 24 11:09:34 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Thu, 24 Nov 2005 10:09:34 +0000
Subject: [Python-Dev] urlparse brokenness
In-Reply-To: <20051123050455.9010E7FBF@place.org>
References: <20051123050455.9010E7FBF@place.org>
Message-ID: <1132826974.24108.6.camel@warna.corp.google.com>

On Tue, 2005-11-22 at 23:04 -0600, Paul Jimenez wrote:
> It is my assertion that urlparse is currently broken.  Specifically, I 
> think that urlparse breaks an abstraction boundary with ill effect.
> 
> In writing a mailclient, I wished to allow my users to specify their
> imap server as a url, such as 'imap://user:password at host:port/'. Which
> worked fine. I then thought that the natural extension to support

FWIW, I have a small addition related to this that I think would be
handy to add to the urlparse module. It is a pair of functions
"netlocparse()" and "netlocunparse()" that is for parsing and unparsing
"user:password at host:port" netloc's.

Feel free to use/add/ignore it...

http://minkirri.apana.org.au/~abo/projects/osVFS/netlocparse.py

-- 
Donovan Baarda <abo at minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/


From arigo at tunes.org  Thu Nov 24 12:38:58 2005
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 24 Nov 2005 12:38:58 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <dm42uu$i4m$1@sea.gmane.org>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
	<437BE7A8.5000503@ee.byu.edu>
	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>
	<437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de>
	<dm42uu$i4m$1@sea.gmane.org>
Message-ID: <20051124113858.GA9262@code1.codespeak.net>

Hi,

On Thu, Nov 24, 2005 at 01:59:57AM -0800, Robert Kern wrote:
> You can get the version of scipy_core just before the fix that Travis
> applied:

Now we can start debugging :-)

>   http://projects.scipy.org/scipy/scipy_core/changeset/1490

This changeset alone fixes the small example you provided.  However,
compiling python "--without-pymalloc" doesn't fix it, so we can't blame
the memory allocator.  That's all I can say; I am rather clueless as to
how the above patch manages to make any difference even without
pymalloc.


A bientot,

Armin

From arigo at tunes.org  Thu Nov 24 13:11:13 2005
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 24 Nov 2005 13:11:13 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <20051124113858.GA9262@code1.codespeak.net>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
	<437BE7A8.5000503@ee.byu.edu>
	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>
	<437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de>
	<dm42uu$i4m$1@sea.gmane.org>
	<20051124113858.GA9262@code1.codespeak.net>
Message-ID: <20051124121113.GA9444@code1.codespeak.net>

Hi,

Ok, here is the reason for the leak...

There is in scipy a type called 'int32_arrtype' which inherits from both
another scipy type called 'signedinteger_arrtype', and from 'int'.
Obscure!  This is not 100% officially allowed: you are inheriting from
two C types.  You're living dangerously!

Now in this case it mostly works as expected, because the parent scipy
type has no field at all, so it's mostly like inheriting from both
'object' and 'int' -- which is allowed, or would be if the bases were
written in the opposite order.  But still, something confuses the
fragile logic of typeobject.c.  (I'll leave this bit to scipy people to
debug :-)

The net result is that unless you force your own tp_free as in revision
1490, the type 'int32_arrtype' has tp_free set to int_free(), which is
the normal tp_free of 'int' objects.  This causes all deallocated
int32_arrtype instances to be added to the CPython free list of integers
instead of being freed!
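
A toy model of the effect, in Python. Nothing here is CPython code: ToyHeap and its methods are hypothetical names, and the model only counts blocks to show why parking freed instances on a free list that nothing drains looks like a leak.

```python
class ToyHeap:
    """Counts blocks taken from the allocator and not yet given back."""

    def __init__(self):
        self.outstanding = 0     # blocks the allocator still considers in use
        self.int_free_list = []  # blocks parked for reuse by plain ints only

    def alloc_scalar(self):
        # Each array scalar gets a fresh block from the allocator.
        self.outstanding += 1
        return object()

    def free_like_int(self, block):
        # Models the inherited int_free(): park the block, return nothing.
        self.int_free_list.append(block)

    def free_with_own_tp_free(self, block):
        # Models a type-specific tp_free: give the block back.
        self.outstanding -= 1


def run(free_method_name, iterations=1000):
    heap = ToyHeap()
    for _ in range(iterations):
        block = heap.alloc_scalar()
        getattr(heap, free_method_name)(block)
    return heap.outstanding


print(run("free_like_int"), run("free_with_own_tp_free"))
```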


A bientot,

Armin

From ncoghlan at gmail.com  Thu Nov 24 14:10:07 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 Nov 2005 23:10:07 +1000
Subject: [Python-Dev] PEP 302,
 PEP 338 and imp.getloader (was Re: a Python interface for the AST
 (WAS: DRAFT: python-dev...)
In-Reply-To: <5.1.1.6.0.20051123131857.03aba388@mail.telecommunity.com>
References: <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com>
	<ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
	<d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com>
	<bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com>
	<ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com>
	<5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com>
	<5.1.1.6.0.20051123131857.03aba388@mail.telecommunity.com>
Message-ID: <4385BBAF.8040104@gmail.com>

Phillip J. Eby wrote:
> This isn't hard to implement per se; setuptools for example has a 
> 'get_importer' function, and going from importer to loader is simple:

Thanks, I think I'll definitely be able to build something out of that.

> So with the above function you could do something like:
> 
> def get_loader(fullname, path):
>     for path_item in path:
>         try:
>             loader = get_importer(path_item).find_module(fullname)
>             if loader is not None:
>                 return loader
>         except ImportError:
>             continue
>     else:
>         return None
> 
> in order to implement the rest.

I think sys.meta_path needs to figure into that before digging through 
sys.path, but otherwise the concept seems basically correct.
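
As a sketch of what consulting sys.meta_path first might look like, here is a Python 3 version. It assumes the finders expose find_spec, which is how current interpreters spell the lookup; in Python 3 the sys.path search is itself one of the meta path finders, so one loop covers both. The function name is hypothetical.

```python
import sys


def find_loader_via_meta_path(fullname, path=None):
    """Ask each finder on sys.meta_path in turn for a loader for fullname."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # finder without the assumed method; skip it
        spec = find_spec(fullname, path)
        if spec is not None:
            return spec.loader
    return None


print(find_loader_via_meta_path("textwrap"))
```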

[NickC]
>> ** I'm open to suggestions on how to deal with argv[0] and __file__. They
>> should be set to whatever __file__ would be set to by the module 
>> loader, but
>> the Importer Protocol in PEP 302 doesn't seem to expose that 
>> information. The
>> current proposal is a compromise that matches the existing behaviour 
>> of -m
>> (which supports scripts like regrtest.py) while still giving a meaningful
>> value for scripts which are not part of the normal filesystem.

[PJE]
> Ugh.  Those are tricky, no question.  I can think of several simple 
> answers for each, all of which are wrong in some way.  :)

Indeed. I tried turning to "exec co in d" and "execfile(name, d)" for 
guidance, and didn't find any real help there. The only thing they 
automatically add to the supplied dictionary is __builtins__.

The consequence is that any code executed using "exec" or "execfile" sees its 
name as being "__builtin__" because the lookup for '__name__' falls back to 
the builtin namespace.

Further, "__file__" and "__loader__" won't be set at all when using these 
functions, which may be something of a surprise for some modules (to say the 
least).
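
The same behaviour can still be observed with the exec() function in Python 3, where the builtin module is named "builtins". A small demonstration (the variable names are only for illustration):

```python
namespace = {}
exec("seen_name = __name__", namespace)

# The only key exec added on its own is __builtins__.
added = sorted(k for k in namespace if k != "seen_name")
print(added)

# __name__ was not in the dictionary, so the lookup fell through to builtins.
print(namespace["seen_name"])

# __file__ and __loader__ are simply absent.
print("__file__" in namespace, "__loader__" in namespace)
```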

My current thinking is to actually try to distance the runpy module from 
"exec" and "execfile" significantly more than I'd originally intended. That 
way, I can explicitly focus on making it look like the item was invoked from 
the command line, without worrying about behaviour differences between this 
and the exec statement. It also means runpy can avoid the "implicitly modify 
the current namespace" behaviour that exec and execfile currently have.

The basic function runpy.run_code would look like:

   def run_code(code, init_globals=None,
                      mod_name=None, mod_file=None, mod_loader=None):
       """Executes a string of source code or a code object
          Returns the resulting top level namespace dictionary
       """
       # Handle omitted arguments
       if mod_name is None:
           mod_name = "<run>"
       if mod_file is None:
           mod_file = "<run>"
       if mod_loader is None:
           mod_loader = StandardImportLoader(".")
       # Set up the top level namespace dictionary
       run_globals = {}
       if init_globals is not None:
           run_globals.update(init_globals)
       run_globals.update(__name__ = mod_name,
                          __file__ = mod_file,
                          __loader__ = mod_loader)
       # Run it!
       exec code in run_globals
       return run_globals

Note that run_code always creates a new execution dictionary and returns it, 
in contrast to exec and execfile. This is so that naively doing:

   run_code("print 'Hi there!'", globals())

or:

   run_code("print 'Hi there!'", locals())

doesn't trash __name__, __file__ or __loader__ in the current module (which 
would be bad).
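
To make the "fresh dictionary" point concrete, here is the same idea as a Python 3 sketch. It follows the run_code outline above but uses the exec() function and leaves the loader as None instead of the StandardImportLoader default, which is an assumption made to keep the example self-contained.

```python
def run_code(code, init_globals=None,
             mod_name="<run>", mod_file="<run>", mod_loader=None):
    """Execute code in a new namespace and return that namespace."""
    run_globals = {}
    if init_globals is not None:
        run_globals.update(init_globals)  # copy, never alias, the caller's dict
    run_globals.update(__name__=mod_name,
                       __file__=mod_file,
                       __loader__=mod_loader)
    exec(code, run_globals)
    return run_globals


caller = {"__name__": "caller_module", "x": 1}
result = run_code("y = x + 1", caller)
print(result["y"], result["__name__"], caller["__name__"])
```

The caller's dictionary keeps its own __name__ and never sees y; only the returned dictionary does.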

And runpy.run_module would look something like:

   def run_module(mod_name, run_globals=None, run_name=None, as_script=False):
       loader = _get_loader(mod_name) # Handle lack of imp.get_loader
       code = loader.get_code(mod_name)
       filename = _get_filename(loader, mod_name) # Handle lack of protocol
       if run_name is None:
           run_name = mod_name
       if as_script:
           sys.argv[0] = filename
       return run_code(code, run_globals, run_name, filename, loader)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From duncan-pythondev at grisby.org  Thu Nov 24 15:11:30 2005
From: duncan-pythondev at grisby.org (Duncan Grisby)
Date: Thu, 24 Nov 2005 14:11:30 +0000
Subject: [Python-Dev] (no subject)
Message-ID: <E1EfHow-0002xd-Ar@apasphere.com>

Hi,

I posted this to comp.lang.python, but got no response, so I thought I
would consult the wise people here...

I have encountered a problem with the re module. I have a
multi-threaded program that does lots of regular expression searching,
with some relatively complex regular expressions. Occasionally, events
can conspire to mean that the re search takes minutes. That's bad
enough in and of itself, but the real problem is that the re engine
does not release the interpreter lock while it is running. All the
other threads are therefore blocked for the entire time it takes to do
the regular expression search.

Is there any fundamental reason why the re module cannot release the
interpreter lock, for at least some of the time it is running?  The
ideal situation for me would be if it could do most of its work with
the lock released, since the software is running on a multi processor
machine that could productively do other work while the re is being
processed. Failing that, could it at least periodically release the
lock to give other threads a chance to run?

A quick look at the code in _sre.c suggests that for most of the time,
no Python objects are being manipulated, so the interpreter lock could
be released. Has anyone tried to do that?

Thanks,

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --

From abo at minkirri.apana.org.au  Thu Nov 24 15:52:01 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Thu, 24 Nov 2005 14:52:01 +0000
Subject: [Python-Dev] (no subject)
In-Reply-To: <E1EfHow-0002xd-Ar@apasphere.com>
References: <E1EfHow-0002xd-Ar@apasphere.com>
Message-ID: <1132843921.25145.62.camel@warna.corp.google.com>

On Thu, 2005-11-24 at 14:11 +0000, Duncan Grisby wrote:
> Hi,
> 
> I posted this to comp.lang.python, but got no response, so I thought I
> would consult the wise people here...
> 
> I have encountered a problem with the re module. I have a
> multi-threaded program that does lots of regular expression searching,
> with some relatively complex regular expressions. Occasionally, events
> can conspire to mean that the re search takes minutes. That's bad
> enough in and of itself, but the real problem is that the re engine
> does not release the interpreter lock while it is running. All the
> other threads are therefore blocked for the entire time it takes to do
> the regular expression search.

I don't know if this will help, but in my experience compiling re's
often takes longer than matching them... are you sure that it's the
match and not a compile that is taking a long time? Are you using
pre-compiled re's or are you dynamically generating strings and using
them?

> Is there any fundamental reason why the re module cannot release the
> interpreter lock, for at least some of the time it is running?  The
> ideal situation for me would be if it could do most of its work with
> the lock released, since the software is running on a multi processor
> machine that could productively do other work while the re is being
> processed. Failing that, could it at least periodically release the
> lock to give other threads a chance to run?
> 
> A quick look at the code in _sre.c suggests that for most of the time,
> no Python objects are being manipulated, so the interpreter lock could
> be released. Has anyone tried to do that?

probably not... not many people would have several-minutes-to-match
re's.

I suspect it would be do-able... I suggest you put together a patch and
submit it on SF...


-- 
Donovan Baarda <abo at minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/


From duncan-pythondev at grisby.org  Thu Nov 24 16:00:57 2005
From: duncan-pythondev at grisby.org (Duncan Grisby)
Date: Thu, 24 Nov 2005 15:00:57 +0000
Subject: [Python-Dev]  Re: Regular expressions
In-Reply-To: Message from Donovan Baarda <abo@minkirri.apana.org.au> of "Thu,
	24 Nov 2005 14:52:01 GMT."
	<1132843921.25145.62.camel@warna.corp.google.com> 
Message-ID: <E1EfIao-00030y-OX@apasphere.com>

On Thursday 24 November, Donovan Baarda wrote:

> I don't know if this will help, but in my experience compiling re's
> often takes longer than matching them... are you sure that it's the
> match and not a compile that is taking a long time? Are you using
> pre-compiled re's or are you dynamically generating strings and using
> them?

It's definitely matching time. The res are all pre-compiled.

[...]
> > A quick look at the code in _sre.c suggests that for most of the time,
> > no Python objects are being manipulated, so the interpreter lock could
> > be released. Has anyone tried to do that?
> 
> probably not... not many people would have several-minutes-to-match
> re's.
> 
> I suspect it would be do-able... I suggest you put together a patch and
> submit it on SF...

The thing that scares me about doing that is that there might be
single-threadedness assumptions in the code that I don't spot. It's the
kind of thing where a patch could appear to work fine, but then
mysteriously fail due to some occasional race condition. Does anyone
know if there is any global state in _sre that would prevent it
being re-entered, or know for certain that there isn't?

Cheers,

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --

From fredrik at pythonware.com  Thu Nov 24 16:00:04 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 24 Nov 2005 16:00:04 +0100
Subject: [Python-Dev] (no subject)
References: <E1EfHow-0002xd-Ar@apasphere.com>
	<1132843921.25145.62.camel@warna.corp.google.com>
Message-ID: <dm4khk$8rg$1@sea.gmane.org>

Donovan Baarda wrote:

> I don't know if this will help, but in my experience compiling re's
> often takes longer than matching them... are you sure that it's the
> match and not a compile that is taking a long time? Are you using
> pre-compiled re's or are you dynamically generating strings and using
> them?

patterns with nested repeats can behave badly on certain types of non-
matching input. (each repeat is basically a loop, and if you nest enough
loops things can quickly get out of hand, even if the inner loop doesn't
do much...)
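
A minimal sketch of the effect, not taken from the original report: the
pattern (a+)+$ and the helper name time_failed_match are illustrative
assumptions.  The trailing "b" makes the match fail, so the engine has to
try every way of splitting the run of "a" characters between the inner and
the outer repeat, and the time grows roughly exponentially with the length
of the run.

```python
import re
import time

# Nested repeat: the inner a+ and the outer (...)+ can divide a run of
# "a" characters between them in exponentially many ways.
NESTED = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return (match, seconds) for n 'a' characters followed by one 'b'."""
    text = "a" * n + "b"          # the trailing "b" guarantees a failed match
    start = time.perf_counter()
    match = NESTED.match(text)
    return match, time.perf_counter() - start


# Timings depend on the machine, so none are quoted here.
for n in (10, 14, 18):
    match, seconds = time_failed_match(n)
    print(n, match, "%.4f s" % seconds)
```

Each extra "a" roughly doubles the work, so keep n small when trying this.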

</F> 




From tim.peters at gmail.com  Thu Nov 24 16:44:42 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 24 Nov 2005 10:44:42 -0500
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <438582B3.80204@v.loewis.de>
References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org>
	<20051116145820.A43A.JCARLSON@uci.edu> <437BC524.2030105@ee.byu.edu>
	<438582B3.80204@v.loewis.de>
Message-ID: <1f7befae0511240744t1d5b40fdsf0dd5e9201ae22cb@mail.gmail.com>

[Martin v. Löwis]
> One way (I think the only way) this could happen is if:
> - the objects being allocated are all smaller than 256 bytes
> - when allocating new objects, the requested size was different
>   from any other size previously deallocated.
>
> So if you first allocate 1,000,000 objects of size 200, and then
> release them, and then allocate 1,000,000 objects of size 208,
> the memory is not reused.

Nope, the memory is reused in this case.  While each obmalloc "pool" P
is devoted to a fixed size so long as at least one object from P is in
use, when all objects allocated from P have been released, P can be
reassigned to any other size class.

The comments in obmalloc.c are quite accurate.  This particular case
is talked about here:

"""
empty == all the pool's blocks are currently available for allocation
    On transition to empty, a pool is unlinked from its usedpools[] list,
    and linked to the front of the (file static) singly-linked freepools list,
    via its nextpool member.  The prevpool member has no meaning in this
    case.  Empty pools have no inherent size class:  the next time a
    malloc finds an empty list in usedpools[], it takes the first pool off of
    freepools.  If the size class needed happens to be the same as the
    size class the pool last had, some pool initialization can be skipped.
"""

Now if you end up allocating a million pools all devoted to 72-byte
objects, and leave one object from each pool in use, then all those
pools remain devoted to 72-byte objects.  Wholly empty pools can be
(and do get) reused freely, though.
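
The pool life cycle described above can be sketched with a toy model.
This is not obmalloc's code: the class names, the 4096-byte pool, and the
absence of pool headers and arenas are simplifying assumptions made purely
for illustration.

```python
POOL_BYTES = 4096   # assumed pool size; real pools also carry a header


class Pool:
    def __init__(self):
        self.size = None   # size class currently served; None when empty
        self.used = 0      # number of blocks handed out

    def capacity(self):
        return POOL_BYTES // self.size


class ToyAllocator:
    def __init__(self):
        self.usedpools = {}    # size class -> pools that still have room
        self.freepools = []    # wholly empty pools, no size class
        self.pools_created = 0

    def malloc(self, size):
        pools = self.usedpools.setdefault(size, [])
        if not pools:
            # No partially used pool for this size: take an empty pool
            # if one exists, otherwise create a new one.
            if self.freepools:
                pool = self.freepools.pop()
            else:
                pool = Pool()
                self.pools_created += 1
            pool.size = size
            pools.append(pool)
        pool = pools[-1]
        pool.used += 1
        if pool.used == pool.capacity():
            pools.pop()        # full pools leave the usedpools list
        return pool

    def free(self, pool):
        if pool.used == pool.capacity():
            self.usedpools[pool.size].append(pool)   # full -> used
        pool.used -= 1
        if pool.used == 0:
            # used -> empty: the pool loses its size class.
            self.usedpools[pool.size].remove(pool)
            pool.size = None
            self.freepools.append(pool)


def reuse_demo():
    """Free every 200-byte block, then allocate 208-byte blocks."""
    alloc = ToyAllocator()
    blocks = [alloc.malloc(200) for _ in range(1000)]
    for pool in blocks:
        alloc.free(pool)
    before = alloc.pools_created
    for _ in range(950):       # 950 blocks of 208 bytes fit the freed pools
        alloc.malloc(208)
    return before, alloc.pools_created


def pinned_demo():
    """Keep one 72-byte block alive per pool, then ask for another size."""
    alloc = ToyAllocator()
    blocks = [alloc.malloc(72) for _ in range(1000)]
    pinned = set()
    for pool in blocks:
        if id(pool) in pinned:
            alloc.free(pool)
        else:
            pinned.add(id(pool))   # leave one block in use in this pool
    before = alloc.pools_created
    alloc.malloc(200)
    return before, alloc.pools_created


print(reuse_demo())
print(pinned_demo())
```

In the first run the pool count does not grow, because every pool became
empty and could change size class.  In the second run one live block per
pool pins each pool to its size class, so a request for a different size
needs a new pool.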

> If the objects are all of same size, or all larger than 256 bytes,
> this effect does not occur.

If they're larger than 256 bytes, then you see the reuse behavior of
the system malloc/free, about which virtually nothing can be said
that's true across all Python platforms.

From oliphant.travis at ieee.org  Thu Nov 24 17:30:55 2005
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Thu, 24 Nov 2005 09:30:55 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <20051124121113.GA9444@code1.codespeak.net>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>	<437BE7A8.5000503@ee.byu.edu>	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>	<437C54AA.9020203@ee.byu.edu>
	<43858481.5060202@v.loewis.de>	<dm42uu$i4m$1@sea.gmane.org>	<20051124113858.GA9262@code1.codespeak.net>
	<20051124121113.GA9444@code1.codespeak.net>
Message-ID: <dm4ps1$qds$1@sea.gmane.org>

Armin Rigo wrote:
> Hi,
> 
> Ok, here is the reason for the leak...
> 
> There is in scipy a type called 'int32_arrtype' which inherits from both
> another scipy type called 'signedinteger_arrtype', and from 'int'.
> Obscure!  This is not 100% officially allowed: you are inheriting from
> two C types.  You're living dangerously!

This is allowed because the two types have compatible binary layouts (in 
fact the signed integer type consists only of the PyObject_HEAD).

> 
> Now in this case it mostly works as expected, because the parent scipy
> type has no field at all, so it's mostly like inheriting from both
> 'object' and 'int' -- which is allowed, or would be if the bases were
> written in the opposite order.  But still, something confuses the
> fragile logic of typeobject.c.  (I'll leave this bit to scipy people to
> debug :-)
> 

This is definitely possible.  I've tripped up in this logic before.   I 
was beginning to suspect that it might have something to do with what is 
going on.

> The net result is that unless you force your own tp_free as in revision
> 1490, the type 'int32_arrtype' has tp_free set to int_free(), which is
> the normal tp_free of 'int' objects.  This causes all deallocated
> int32_arrtype instances to be added to the CPython free list of integers
> instead of being freed!

I'm not sure this is true.  It sounds plausible, but I will have to 
check.   Previously the tp_free should have been inherited as 
PyObject_Del for the int32_arrtype.  Unless the typeobject.c code copied 
the tp_free from the wrong base type, this shouldn't have been the case.

Thanks for the pointers.  It sounds like we're getting close.  Perhaps 
the problem is in typeobject.c ....


-Travis



From oliphant.travis at ieee.org  Thu Nov 24 18:02:59 2005
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Thu, 24 Nov 2005 10:02:59 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <20051124121113.GA9444@code1.codespeak.net>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>	<437BE7A8.5000503@ee.byu.edu>	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>	<437C54AA.9020203@ee.byu.edu>
	<43858481.5060202@v.loewis.de>	<dm42uu$i4m$1@sea.gmane.org>	<20051124113858.GA9262@code1.codespeak.net>
	<20051124121113.GA9444@code1.codespeak.net>
Message-ID: <dm4ro4$n0$1@sea.gmane.org>

Armin Rigo wrote:
> Hi,
> 
> Ok, here is the reason for the leak...
> 
> There is in scipy a type called 'int32_arrtype' which inherits from both
> another scipy type called 'signedinteger_arrtype', and from 'int'.
> Obscure!  This is not 100% officially allowed: you are inheriting from
> two C types.  You're living dangerously!
> 
> Now in this case it mostly works as expected, because the parent scipy
> type has no field at all, so it's mostly like inheriting from both
> 'object' and 'int' -- which is allowed, or would be if the bases were
> written in the opposite order.  But still, something confuses the
> fragile logic of typeobject.c.  (I'll leave this bit to scipy people to
> debug :-)
> 
> The net result is that unless you force your own tp_free as in revision
> 1490, the type 'int32_arrtype' has tp_free set to int_free(), which is
> the normal tp_free of 'int' objects.  This causes all deallocated
> int32_arrtype instances to be added to the CPython free list of integers
> instead of being freed!

I can confirm that indeed the int32_arrtype object gets the tp_free slot 
from its second parent (the python integer type) instead of its first 
parent (the new, empty signed integer type).  I just did a printf after 
PyType_Ready was called to see what the tp_free slot contained, and 
indeed it contained the wrong thing.

I suspect this may also be true of the float64_arrtype as well (which 
inherits from Python's float type).

What I don't understand is why the tp_free slot from the second base 
type got copied over into the tp_free slot of the child.  It should have 
received the tp_free slot of the first parent, right?

I'm still looking for why that would be the case.  I think, though, 
Armin has identified the real culprit of the problem.  I apologize for 
any consternation over the memory manager that may have taken place. 
This problem is obviously an issue of dual inheritance in C.

I understand this is not well tested code, but in principle it should 
work correctly, right?  I'll keep looking to see if I made a mistake in 
believing that the int32_arrtype should have inherited its tp_free slot 
from the first parent and not the second.

-Travis


From oliphant.travis at ieee.org  Thu Nov 24 18:17:43 2005
From: oliphant.travis at ieee.org (Travis E. Oliphant)
Date: Thu, 24 Nov 2005 10:17:43 -0700
Subject: [Python-Dev] Problems with mro for dual inheritance in C [Was:
 Problems with the Python Memory Manager]
In-Reply-To: <20051124121113.GA9444@code1.codespeak.net>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>	<437BE7A8.5000503@ee.byu.edu>	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>	<437C54AA.9020203@ee.byu.edu>
	<43858481.5060202@v.loewis.de>	<dm42uu$i4m$1@sea.gmane.org>	<20051124113858.GA9262@code1.codespeak.net>
	<20051124121113.GA9444@code1.codespeak.net>
Message-ID: <dm4sjp$3ov$1@sea.gmane.org>

Armin Rigo wrote:
> Hi,
> 
> Ok, here is the reason for the leak...
> 
> There is in scipy a type called 'int32_arrtype' which inherits from both
> another scipy type called 'signedinteger_arrtype', and from 'int'.
> Obscure!  This is not 100% officially allowed: you are inheriting from
> two C types.  You're living dangerously!
> 
> Now in this case it mostly works as expected, because the parent scipy
> type has no field at all, so it's mostly like inheriting from both
> 'object' and 'int' -- which is allowed, or would be if the bases were
> written in the opposite order.  But still, something confuses the
> fragile logic of typeobject.c.  (I'll leave this bit to scipy people to
> debug :-)

Well, I'm stumped on this.  Note the method resolution order for the new 
scalar array type (exactly as I would expect).   Why doesn't the int32 
type inherit its tp_free from the early types first?

a = zeros(10)
type(a[0]).mro()

[<type 'int32_arrtype'>, <type 'signedinteger_arrtype'>, <type 
'integer_arrtype'>,
<type 'numeric_arrtype'>, <type 'generic_arrtype'>, <type 'int'>, <type 
'object'>]





From nnorwitz at gmail.com  Thu Nov 24 20:34:37 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 24 Nov 2005 11:34:37 -0800
Subject: [Python-Dev] registering unicode codecs
Message-ID: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com>

While running regrtest with -R to find reference leaks I found a usage
issue.  When a codec is registered it is stored in the interpreter
state and cannot be removed.  Since it is stored as a list, if you
repeatedly add the same search function, you will get duplicates in the
list and they can't be removed.  This shows up as a reference leak
(which it really isn't) in test_unicode with this code modified from
test_codecs_errors:

import codecs
def search_function(encoding):
    def encode1(input, errors="strict"):
        return 42
    return (encode1, None, None, None)

codecs.register(search_function)

###

Should the search function be added to the search path if it is
already in there?  I don't understand a benefit of having duplicate
search functions.

Should users have access to the search path (through a
codecs.unregister())?  If so, should it search from the end of the
list to the beginning to remove an item?  That way the last entry
would be removed rather than the first.

n

From mal at egenix.com  Thu Nov 24 20:44:38 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 24 Nov 2005 20:44:38 +0100
Subject: [Python-Dev] registering unicode codecs
In-Reply-To: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com>
References: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com>
Message-ID: <43861826.4000705@egenix.com>

Neal Norwitz wrote:
> While running regrtest with -R to find reference leaks I found a usage
> issue.  When a codec is registered it is stored in the interpreter
> state and cannot be removed.  Since it is stored as a list, if you
> repeatedly add the same search function, you will get duplicates in the
> list and they can't be removed.  This shows up as a reference leak
> (which it really isn't) in test_unicode with this code modified from
> test_codecs_errors:
> 
> import codecs
> def search_function(encoding):
>     def encode1(input, errors="strict"):
>         return 42
>     return (encode1, None, None, None)
> 
> codecs.register(search_function)
> 
> ###
> 
> Should the search function be added to the search path if it is
> already in there?  I don't understand a benefit of having duplicate
> search functions.

Me neither :-) I never expected someone to register a search
function more than once, since there's no point in doing so.

> Should users have access to the search path (through a
> codecs.unregister())?  

Maybe, but why would you want to unregister a search function ?

> If so, should it search from the end of the
> list to the beginning to remove an item?  That way the last entry
> would be removed rather than the first.

I'd suggest raising an exception in case a user tries
to register a search function twice. Removal should be the
same as doing list.remove(), ie. remove the first (and
only) item in the list of search functions.
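
A sketch of these semantics in pure Python.  SearchPath is a hypothetical
stand-in for the interpreter's internal list of search functions, not part
of the codecs module; it only illustrates rejecting a duplicate
registration and removing an entry the way list.remove() would.

```python
class SearchPath:
    """Hypothetical model of the codec search path (illustration only)."""

    def __init__(self):
        self._functions = []

    def register(self, search_function):
        if not callable(search_function):
            raise TypeError("argument must be callable")
        if search_function in self._functions:
            # Proposed behaviour: refuse a second registration.
            raise ValueError("search function already registered")
        self._functions.append(search_function)

    def unregister(self, search_function):
        # Same behaviour as list.remove(): ValueError if it is absent.
        self._functions.remove(search_function)

    def lookup(self, encoding):
        # Ask each search function in registration order.
        for function in self._functions:
            result = function(encoding)
            if result is not None:
                return result
        raise LookupError("unknown encoding: %s" % encoding)


def search_function(encoding):
    # Toy search function that knows a single made-up encoding name.
    if encoding == "toy":
        return ("encode", "decode", "reader", "writer")
    return None


path = SearchPath()
path.register(search_function)
try:
    path.register(search_function)
except ValueError as exc:
    print("second registration rejected:", exc)
```

Because duplicates can never get into the list, there is at most one entry
to remove, so the search direction no longer matters.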

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 24 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From nnorwitz at gmail.com  Thu Nov 24 20:51:20 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 24 Nov 2005 11:51:20 -0800
Subject: [Python-Dev] registering unicode codecs
In-Reply-To: <43861826.4000705@egenix.com>
References: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com>
	<43861826.4000705@egenix.com>
Message-ID: <ee2a432c0511241151j6b028dbcm36b8032c978d456c@mail.gmail.com>

On 11/24/05, M.-A. Lemburg <mal at egenix.com> wrote:
>
> > Should users have access to the search path (through a
> > codecs.unregister())?
>
> Maybe, but why would you want to unregister a search function ?
>
> > If so, should it search from the end of the
> > list to the beginning to remove an item?  That way the last entry
> > would be removed rather than the first.
>
> I'd suggest raising an exception in case a user tries
> to register a search function twice.

This should take care of the testing problem.

> Removal should be the
> same as doing list.remove(), ie. remove the first (and
> only) item in the list of search functions.

Do you recommend adding an unregister()?  It's not necessary for this case.

n

From mal at egenix.com  Thu Nov 24 21:12:39 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 24 Nov 2005 21:12:39 +0100
Subject: [Python-Dev] registering unicode codecs
In-Reply-To: <ee2a432c0511241151j6b028dbcm36b8032c978d456c@mail.gmail.com>
References: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com>	
	<43861826.4000705@egenix.com>
	<ee2a432c0511241151j6b028dbcm36b8032c978d456c@mail.gmail.com>
Message-ID: <43861EB7.2030301@egenix.com>

Neal Norwitz wrote:
> On 11/24/05, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>>>Should users have access to the search path (through a
>>>codecs.unregister())?
>>
>>Maybe, but why would you want to unregister a search function ?
>>
>>
>>>If so, should it search from the end of the
>>>list to the beginning to remove an item?  That way the last entry
>>>would be removed rather than the first.
>>
>>I'd suggest raising an exception in case a user tries
>>to register a search function twice.
> 
> 
> This should take care of the testing problem.
> 
> 
>>Removal should be the
>>same as doing list.remove(), ie. remove the first (and
>>only) item in the list of search functions.
> 
> 
> Do you recommend adding an unregister()?  It's not necessary for this case.

Not really - I don't see much of a need for this; except
maybe if a codec package wants to replace another codec
package.

So far no-one has requested such a feature, so I'd say
we don't add .unregister() until a request for it pops up.

-- 
Marc-Andre Lemburg
eGenix.com


From arigo at tunes.org  Thu Nov 24 23:24:52 2005
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 24 Nov 2005 23:24:52 +0100
Subject: [Python-Dev] Problems with mro for dual inheritance in C [Was:
	Problems with the Python Memory Manager]
In-Reply-To: <dm4sjp$3ov$1@sea.gmane.org>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
	<437BE7A8.5000503@ee.byu.edu>
	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>
	<437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de>
	<dm42uu$i4m$1@sea.gmane.org>
	<20051124113858.GA9262@code1.codespeak.net>
	<20051124121113.GA9444@code1.codespeak.net>
	<dm4sjp$3ov$1@sea.gmane.org>
Message-ID: <20051124222452.GA14236@code1.codespeak.net>

Hi Travis,

On Thu, Nov 24, 2005 at 10:17:43AM -0700, Travis E. Oliphant wrote:
> Why doesn't the int32 
> type inherit its tp_free from the early types first?

In your case I suspect that the tp_free is inherited from the tp_base
which is probably 'int'.  I don't see how to "fix" typeobject.c, because
I'm not sure that there is a solution that would do the right thing in
all cases at this level.

I would suggest that you just force the tp_alloc/tp_free that you want
in your static types instead.  That's what occurs for example if you
build a similar inheritance hierarchy with classes defined in Python:
these classes are then 'heap types', so they always get the generic
tp_alloc/tp_free before PyType_Ready() has a chance to see them.
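
A small Python-level analogue, written for current Python 3 rather than
the 2005 interpreter.  The class names are made up for illustration and
are not the scipy types.  It shows that the first entry after the class in
the MRO and the type's __base__ (the Python-visible tp_base) can be
different types, which fits the suspicion above.

```python
class Generic(object):
    pass


class SignedInteger(Generic):
    pass


# Same shape as the scipy hierarchy: an "empty" parent first, int second.
class Int32(SignedInteger, int):
    pass


# Method lookup follows the MRO, which lists SignedInteger before int.
print([cls.__name__ for cls in Int32.__mro__])

# __base__ is the base that determines the instance layout.
print(Int32.__base__)
```

Because Int32 is created by a class statement it is a heap type, so this
only illustrates the MRO versus tp_base distinction; it does not reproduce
the tp_free problem seen with static types.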


Armin

From martin at v.loewis.de  Thu Nov 24 23:51:15 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 Nov 2005 23:51:15 +0100
Subject: [Python-Dev] SRE should release the GIL (was: no subject)
In-Reply-To: <E1EfHow-0002xd-Ar@apasphere.com>
References: <E1EfHow-0002xd-Ar@apasphere.com>
Message-ID: <438643E3.6030106@v.loewis.de>

Duncan Grisby wrote:
> Is there any fundamental reason why the re module cannot release the
> interpreter lock, for at least some of the time it is running?  The
> ideal situation for me would be if it could do most of its work with
> the lock released, since the software is running on a multi processor
> machine that could productively do other work while the re is being
> processed. Failing that, could it at least periodically release the
> lock to give other threads a chance to run?

Formally: no; it accesses a Python string/Python unicode object all
the time.

Now, since all the shared objects it accesses are immutable, likely
no harm would be done releasing the GIL. I think SRE was originally
also intended to operate on array.array objects; this would have
caused bigger problems. Not sure whether this is still an issue.

Regards,
Martin

From nnorwitz at gmail.com  Fri Nov 25 04:35:06 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 24 Nov 2005 19:35:06 -0800
Subject: [Python-Dev] reference leaks
Message-ID: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>

There are still a few reference leaks I've been able to identify.  I
didn't see an obvious solution to these (well, I saw one obvious
solution which crashed, so obviously I was wrong).

When running regrtest with -R here are the ref leaks reported:

test_codeccallbacks leaked [2, 2, 2, 2] references
test_compiler leaked [176, 242, 202, 248] references
test_generators leaked [254, 254, 254, 254] references
test_tcl leaked [35, 35, 35, 35] references
test_threading_local leaked [36, 36, 28, 36] references
test_urllib2 leaked [-130, 70, -120, 60] references

test_compiler and test_urllib2 are probably not real leaks, but data
being cached.  I'm not really sure whether test_tcl is a leak or not,
since there's a lot that goes on under the covers.  I didn't see
anything obvious in _tkinter.c.

I have no idea about test_threading_local.

I'm pretty certain test_codeccallbacks and test_generators are leaks. 
Here is code that I gleaned/modified from the tests and causes leaks
in the interpreter:

#### test_codeccallbacks

import codecs
def test_callbacks():
  def handler(exc):
    l = [u"<%d>" % ord(exc.object[pos]) for pos in xrange(exc.start, exc.end)]
    return (u"[%s]" % u"".join(l), exc.end)
  codecs.register_error("test.handler", handler)
  # the {} is necessary to cause the leak, {} can hold data too
  codecs.charmap_decode("abc", "test.handler", {})

test_callbacks()
# leak from PyUnicode_DecodeCharmap() each time test_callbacks() is called

#### test_generators

from itertools import tee

def fib():
  def yield_identity_forever(g):
    while 1:
      yield g
  def _fib():
    for i in yield_identity_forever(head):
      yield i
  head, tail, result = tee(_fib(), 3)
  return result

x = fib()
# x.next() leak from itertools.tee()

####

The itertools.tee() fix I thought was quite obvious:

+++ Modules/itertoolsmodule.c   (working copy)
@@ -356,7 +356,8 @@
 {
        if (tdo->nextlink == NULL)
                tdo->nextlink = teedataobject_new(tdo->it);
-       Py_INCREF(tdo->nextlink);
+       else
+               Py_INCREF(tdo->nextlink);
        return tdo->nextlink;
 }

However, this creates problems elsewhere.  I think test_heapq crashed
when I added this fix.  The patch also didn't fix all the leaks, just
a bunch of them.  So clearly there's more going on that I'm not
getting.

n

From oliphant at ee.byu.edu  Thu Nov 24 10:54:13 2005
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu, 24 Nov 2005 02:54:13 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <43858481.5060202@v.loewis.de>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>	<437BE7A8.5000503@ee.byu.edu>	<A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com>
	<437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de>
Message-ID: <43858DC5.2080607@ee.byu.edu>

Martin v. Löwis wrote:

> Travis Oliphant wrote:
>
>> So, I now believe that his code (plus the array scalar extension 
>> type) was actually exposing a real bug in the memory manager itself.  
>> In theory, the Python memory manager should have been able to re-use 
>> the memory for the array-scalar instances because they are always the 
>> same size.  In practice, the memory was apparently not being re-used 
>> but instead new blocks were being allocated to handle the load.
>
>
> That is really very hard to believe. Most people on this list would
> probably agree that obmalloc certainly *will* reuse deallocated memory
> if the next request is for the very same size (number of bytes) that
> the previously released object had.


Yes, I see that it does.  This became more clear as all the simple tests 
I tried failed to reproduce the problem (and I spent some time looking 
at the code and reading its comments).   I just can't figure out another 
explanation for why the problem went away when I went to using the 
system malloc other than some kind of corner-case in the Python memory 
allocator.

>
>> His code is quite complicated and it is difficult to replicate the 
>> problem.  
>
>
> That the code is complex would not so much be a problem: we often
> analyse complex code here. It is a problem that the code is not
> available, and it would be a problem if the problem was not
> reproducible even if you had the code (i.e. if the problem would
> sometimes occur, but not the next day when you ran it again).
>
The problem was definitely reproducible, both on his machine and on the 
two machines I tried to run it on.  Without fail, it rapidly consumed all 
available memory.

> So if you can, please post the code somewhere, and add a bugreport
> on sf.net/projects/python.
>
I'll try to do this at some point. 

I'll have to get permission from him for the actual Python code.  The 
extension modules he used are all publicly available (PyMC).  I 
changed the memory allocator in scipy --- which eliminated the problem 
--- so you'd have to check out an older version of the code from SVN to 
see the problem.

Thanks for the tips.

-Travis


From oliphant.travis at ieee.org  Thu Nov 24 11:08:11 2005
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Thu, 24 Nov 2005 03:08:11 -0700
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <438582B3.80204@v.loewis.de>
References: <20051116120346.A434.JCARLSON@uci.edu>
	<dlg5gt$q1g$1@sea.gmane.org>	<20051116145820.A43A.JCARLSON@uci.edu>
	<437BC524.2030105@ee.byu.edu> <438582B3.80204@v.loewis.de>
Message-ID: <4385910B.40503@ieee.org>

Martin v. Löwis wrote:

> Travis Oliphant wrote:
>
>> As verified by removing usage of the Python PyObject_MALLOC function, 
>> it was the Python memory manager that was performing poorly.   Even 
>> though the array-scalar objects were deleted, the memory manager 
>> would not re-use their memory for later object creation. Instead, the 
>> memory manager kept allocating new arenas to cover the load (when it 
>> should have been able to re-use the old memory that had been freed by 
>> the deleted objects--- again, I don't know enough about the memory 
>> manager to say why this happened).
>
>
> One way (I think the only way) this could happen is if:
> - the objects being allocated are all smaller than 256 bytes
> - when allocating new objects, the requested size was different
>   from any other size previously deallocated.


In one version of the code I had moved all objects from the Python 
memory manager to the system malloc *except* the array scalars.   The 
problem still remained, so I'm pretty sure these were the problem.    
The array scalars are all less than 256 bytes but they are always the 
same number of bytes. 

>
> So if you first allocate 1,000,000 objects of size 200, and then
> release them, and then allocate 1,000,000 objects of size 208,
> the memory is not reused.

That is useful information.   I don't think his code was doing that kind 
of thing, but it definitely provides something to check on.

Previously I was using the standard tp_alloc and tp_free methods (I was 
not setting them but just letting PyType_Ready fill those slots in with 
the default values).    When I changed these methods to ones that used 
system free and system malloc the problem went away.  That's why I 
attribute the issue to the Python memory manager.   Of course, it's 
always possible that I was doing something wrong, but I really did try 
to make sure I wasn't making a mistake.  I didn't do anything fancy with 
the Python memory allocator. 

The array scalars all subclass from each other in C, though.  I don't 
see how that could be relevant, but I could be missing something.

-Travis





From allison at shasta.stanford.edu  Thu Nov 24 17:44:27 2005
From: allison at shasta.stanford.edu (Dennis Allison)
Date: Thu, 24 Nov 2005 08:44:27 -0800 (PST)
Subject: [Python-Dev] Regular expressions
In-Reply-To: <E1EfIao-00030y-OX@apasphere.com>
Message-ID: <Pine.LNX.4.44.0511240834080.5028-100000@shasta.stanford.edu>


This is probably OT for [Python-dev]

I suspect that your problem is not the GIL but is due to something else.
Rather than dorking with the interpreter's threading, you probably would 
be better off rethinking your problem and finding a better way to 
accomplish your task.

On Thu, 24 Nov 2005, Duncan Grisby wrote:

> On Thursday 24 November, Donovan Baarda wrote:
> 
> > I don't know if this will help, but in my experience compiling re's
> > often takes longer than matching them... are you sure that it's the
> > match and not a compile that is taking a long time? Are you using
> > pre-compiled re's or are you dynamically generating strings and using
> > them?
> 
> It's definitely matching time. The res are all pre-compiled.
> 
> [...]
> > > A quick look at the code in _sre.c suggests that for most of the time,
> > > no Python objects are being manipulated, so the interpreter lock could
> > > be released. Has anyone tried to do that?
> > 
> > probably not... not many people would have several-minutes-to-match
> > re's.
> > 
> > I suspect it would be do-able... I suggest you put together a patch and
> > submit it on SF...
> 
> The thing that scares me about doing that is that there might be
> single-threadedness assumptions in the code that I don't spot. It's the
> kind of thing where a patch could appear to work fine, but then
> mysteriously fail due to some occasional race condition. Does anyone
> know if there is any global state in _sre that would prevent it
> being re-entered, or know for certain that there isn't?
> 
> Cheers,
> 
> Duncan.
> 
> 

-- 


From victor.stinner at haypocalc.com  Fri Nov 25 03:31:40 2005
From: victor.stinner at haypocalc.com (Victor STINNER)
Date: Fri, 25 Nov 2005 03:31:40 +0100
Subject: [Python-Dev] Bug bz2.BZ2File(...).seek(0,2) + patch
Message-ID: <1132885900.18774.5.camel@haypopc>

Hi,

I found a bug in bz2 python module. Example:
 import bz2
 f = bz2.BZ2File("test.bz2", "r")   # keep a reference to the file object
 f.seek(0, 2)                       # whence=2: seek relative to the end
 assert f.tell() != 0

Details and *patch* at:
http://sourceforge.net/tracker/index.php?func=detail&aid=1366000&group_id=5470&atid=105470
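
A self-contained way to check the expected behaviour, assuming a Python
whose bz2 module supports the with statement and seeking from the end (as
current Python 3 releases do).  The file name and the test data are made
up for illustration.

```python
import bz2
import os
import tempfile

data = b"hello world\n" * 1000   # arbitrary test payload

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.bz2")    # hypothetical file name
    with bz2.BZ2File(path, "w") as f:
        f.write(data)
    with bz2.BZ2File(path, "r") as f:
        f.seek(0, 2)     # whence=2: relative to the end of the uncompressed data
        end = f.tell()   # should be the uncompressed length

print(end == len(data))
```

After seeking to the end, tell() should report the uncompressed size
rather than 0.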

Please CC-me for all your answers.

Bye, Victor
-- 
Victor Stinner - student at the UTBM (Belfort, France)
http://www.haypocalc.com/wiki/Accueil

From fanghao at corp.netease.com  Fri Nov 25 08:32:15 2005
From: fanghao at corp.netease.com (Frank)
Date: Fri, 25 Nov 2005 15:32:15 +0800
Subject: [Python-Dev] (no subject)
Message-ID: <20051125074127.32E1C1E400B@bag.python.org>

hi,
	test mail list :)

	

Regards,

Frank
fanghao at corp.netease.com
2005-11-25

From fredrik at pythonware.com  Fri Nov 25 09:23:25 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 25 Nov 2005 09:23:25 +0100
Subject: [Python-Dev] SRE should release the GIL (was: no subject)
References: <E1EfHow-0002xd-Ar@apasphere.com> <438643E3.6030106@v.loewis.de>
Message-ID: <dm6hmk$23v$1@sea.gmane.org>

Martin v. Löwis wrote:

> Formally: no; it accesses a Python string/Python unicode object all
> the time.
>
> Now, since all the shared objects it accesses are immutable, likely
> no harm would be done releasing the GIL. I think SRE was originally
> also intended to operate on array.array objects; this would have
> caused bigger problems.

SRE can operate on anything that implements the buffer interface.
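
A short illustration in current Python 3, where a bytes pattern can be
matched against bytes-like objects such as bytearray and array.array.  The
sample data is made up.  It shows why immutability of the subject cannot
be assumed.

```python
import array
import re

pattern = re.compile(b"abc")

# Both subjects are mutable objects that export the buffer interface.
subjects = [bytearray(b"xxabcxx"), array.array("B", b"xxabcxx")]
spans = [pattern.search(subject).span() for subject in subjects]
print(spans)

# Nothing stops other code from changing the subject's contents, which
# is the hazard if a match could run while another thread holds the GIL.
subjects[0][2:5] = b"???"
print(pattern.search(subjects[0]))
```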

</F>




From arigo at tunes.org  Fri Nov 25 09:41:30 2005
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 25 Nov 2005 09:41:30 +0100
Subject: [Python-Dev] Problems with the Python Memory Manager
In-Reply-To: <437BE7A8.5000503@ee.byu.edu>
References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com>
	<437BE7A8.5000503@ee.byu.edu>
Message-ID: <20051125084130.GA18796@code1.codespeak.net>

Hi Jim,

You wrote:
> >(2)  Is he allocating new _types_, which I think don't get properly
> > collected.

(Off-topic) For reference, as far as I know new types are properly
freed.  There have been a number of bugs and lots of corner cases to fix,
but I know of no remaining one.  This assumes that the new types are
heap types allocated in some official way -- either by Python code or by
somehow calling type() from C.


A bientot,

Armin

From mwh at python.net  Fri Nov 25 09:57:00 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 25 Nov 2005 08:57:00 +0000
Subject: [Python-Dev] reference leaks
In-Reply-To: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>
	(Neal Norwitz's message of "Thu, 24 Nov 2005 19:35:06 -0800")
References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>
Message-ID: <2mhda1f6lf.fsf@starship.python.net>

Neal Norwitz <nnorwitz at gmail.com> writes:

> There are still a few reference leaks I've been able to identify.  I
> didn't see an obvious solution to these (well, I saw one obvious
> solution which crashed, so obviously I was wrong).
>
> When running regrtest with -R here are the ref leaks reported:
>
> test_codeccallbacks leaked [2, 2, 2, 2] references
> test_compiler leaked [176, 242, 202, 248] references
> test_generators leaked [254, 254, 254, 254] references
> test_tcl leaked [35, 35, 35, 35] references
> test_threading_local leaked [36, 36, 28, 36] references
> test_urllib2 leaked [-130, 70, -120, 60] references
>
> test_compiler and test_urllib2 are probably not real leaks, but data
> being cached.  I'm not really sure if test_tcl is a leak or not. 
> Since there's a lot that goes on under the covers.  I didn't see
> anything obvious in _tkinter.c.
>
> I have no idea about test_threading_local.

It's very odd, but probably not a leak.

> I'm pretty certain test_codeccallbacks and test_generators are leaks. 

Isn't test_codeccallbacks just the extra references you get from
registering an error handler?  test_generators is new, I think.

Cheers,
mwh

-- 
  Good? Bad? Strap him into the IETF-approved witch-dunking
  apparatus immediately!                        -- NTK now, 21/07/2000

From arigo at tunes.org  Fri Nov 25 09:59:55 2005
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 25 Nov 2005 09:59:55 +0100
Subject: [Python-Dev] reference leaks
In-Reply-To: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>
References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>
Message-ID: <20051125085955.GB18796@code1.codespeak.net>

Hi Neal,

On Thu, Nov 24, 2005 at 07:35:06PM -0800, Neal Norwitz wrote:
> The itertools.tee() fix I thought was quite obvious:
> 
> +++ Modules/itertoolsmodule.c   (working copy)
> @@ -356,7 +356,8 @@
>  {
>         if (tdo->nextlink == NULL)
>                 tdo->nextlink = teedataobject_new(tdo->it);
> -       Py_INCREF(tdo->nextlink);
> +       else
> +               Py_INCREF(tdo->nextlink);
>         return tdo->nextlink;
>  }

No, if this object is saved as a cache on 'tdo' then obviously it needs
to keep a reference on its own.  This reference will go away in
teedataobject_dealloc().

After debugging, the problem is a reference cycle: the teedataobject
'head' has a field 'it' pointing to the generator-iterator '_fib()',
which has a reference back to 'head'.  So what is missing is making
teedataobject GC-aware, which it currently isn't.
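
The shape of such a cycle can be sketched in pure Python. This is not the
itertools code: Head and walker are hypothetical stand-ins for the
teedataobject and the generator-iterator. The point is only that reference
counting alone never frees the pair, while the cycle collector does, because
Python-level instances and generators are GC-aware.

```python
import gc
import weakref


class Head(object):
    """Hypothetical stand-in for the tee data object: it owns an iterator."""

    def __init__(self):
        self.it = None


def walker(owner):
    # The generator's frame keeps a reference to `owner` as a local variable.
    yield owner


def cycle_survives_until_gc():
    """Return (alive_before_collect, alive_after_collect) for a cyclic pair."""
    was_enabled = gc.isenabled()
    gc.disable()  # make sure only the explicit collect() below can run
    try:
        head = Head()
        head.it = walker(head)  # head -> generator -> frame -> head
        probe = weakref.ref(head)
        del head  # drop the last outside reference
        alive_before = probe() is not None
        gc.collect()
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


result = cycle_survives_until_gc()
```

A C type that takes part in such a cycle without supporting the cycle
collector would stay in the "alive" state forever, which is what regrtest -R
reports as a leak.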

I suspect that there are other itertools types in the same situation.


A bientot,

Armin.

From walter at livinglogic.de  Fri Nov 25 10:27:55 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 25 Nov 2005 10:27:55 +0100
Subject: [Python-Dev] reference leaks
In-Reply-To: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>
References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>
Message-ID: <4386D91B.7030505@livinglogic.de>

Neal Norwitz wrote:

> [...]
> #### test_codeccallbacks
> 
> import codecs
> def test_callbacks():
>   def handler(exc):
>     l = [u"<%d>" % ord(exc.object[pos]) for pos in xrange(exc.start, exc.end)]
>     return (u"[%s]" % u"".join(l), exc.end)
>   codecs.register_error("test.handler", handler)
>   # the {} is necessary to cause the leak, {} can hold data too
>   codecs.charmap_decode("abc", "test.handler", {})
> 
> test_callbacks()
> # leak from PyUnicode_DecodeCharmap() each time test_callbacks() is called

Can you move the call to codecs.register_error() out of test_callbacks() 
and retry?

Bye,
    Walter Dörwald


From victor.stinner-linux at haypocalc.com  Fri Nov 25 12:55:23 2005
From: victor.stinner-linux at haypocalc.com (Victor STINNER)
Date: Fri, 25 Nov 2005 12:55:23 +0100
Subject: [Python-Dev] Bug bz2.BZ2File(...).seek(0,2) + patch
Message-ID: <1132919724.26613.4.camel@haypopc>

Hi,

I found a bug in the bz2 Python module. Example:
 import bz2
 f = bz2.BZ2File("test.bz2","r")
 f.seek(0,2)
 assert f.tell() != 0

Details and *patch* at:
http://sourceforge.net/tracker/index.php?func=detail&aid=1366000&group_id=5470&atid=105470

Please CC me on all your answers.

Bye, Victor
-- 
Victor Stinner - student at the UTBM (Belfort, France)
http://www.haypocalc.com/wiki/Accueil

From eric.noyau at gmail.com  Fri Nov 25 15:21:19 2005
From: eric.noyau at gmail.com (Eric Noyau)
Date: Fri, 25 Nov 2005 14:21:19 +0000
Subject: [Python-Dev] SRE should release the GIL (was: no subject)
In-Reply-To: <dm6hmk$23v$1@sea.gmane.org>
References: <E1EfHow-0002xd-Ar@apasphere.com> <438643E3.6030106@v.loewis.de>
	<dm6hmk$23v$1@sea.gmane.org>
Message-ID: <49e1c2960511250621t526bbc53p430a3144d1eafe5d@mail.gmail.com>

Hi all,

I've implemented a patch,  please visit bug 1366311 for details.

https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1366311&group_id=5470

This patch only releases the GIL when the engine performs a low-level search
*and* the object searched is a string or a unicode string. The GIL will
not be released for any other kind of object, as there is no guarantee that
the buffer stays unchanged during the run.

I've tested this with a couple of simple tests, and also by running the
application Duncan talked about. My testing indicates that everything works
as before with the added value that our application is still responsive even
when processing some of the more egregious regular expressions.

As it is my first foray into python module writing I'll welcome any feedback
you may have on the patch.

Regards,
-- Eric


On 11/25/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
>
> Martin v. Löwis wrote:
>
> > Formally: no; it accesses a Python string/Python unicode object all
> > the time.
> >
> > Now, since all the shared objects it accesses are immutable, likely
> > no harm would be done releasing the GIL. I think SRE was originally
> > also intended to operate on array.array objects; this would have
> > caused bigger problems.
>
> SRE can operate on anything that implements the buffer interface.
>
> </F>
>
>

From aahz at pythoncraft.com  Fri Nov 25 15:54:47 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 25 Nov 2005 06:54:47 -0800
Subject: [Python-Dev] Bug bz2.BZ2File(...).seek(0,2) + patch
In-Reply-To: <1132885900.18774.5.camel@haypopc>
References: <1132885900.18774.5.camel@haypopc>
Message-ID: <20051125145447.GA25513@panix.com>

On Fri, Nov 25, 2005, Victor STINNER wrote:
>
> I found a bug in bz2 python module. Example:
> 
> Details and *patch* at:
> http://sourceforge.net/tracker/index.php?func=detail&aid=1366000&group_id=5470&atid=105470

Thanks!  Particularly with the Thanksgiving weekend, you may not get any
other responses for a while.  Please be patient.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From nnorwitz at gmail.com  Fri Nov 25 19:02:56 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 25 Nov 2005 10:02:56 -0800
Subject: [Python-Dev] reference leaks
In-Reply-To: <4386D91B.7030505@livinglogic.de>
References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>
	<4386D91B.7030505@livinglogic.de>
Message-ID: <ee2a432c0511251002n438ca00eib1d7bdee53df30d7@mail.gmail.com>

On 11/25/05, Walter Dörwald <walter at livinglogic.de> wrote:
>
> Can you move the call to codecs.register_error() out of test_callbacks()
> and retry?

It then leaks 3 refs on each call to test_callbacks().

n
--

>>> import codecs
[24540 refs]
>>>
[24541 refs]
>>> def handler(exc):
...   l = [u"<%d>" % ord(exc.object[pos]) for pos in xrange(exc.start, exc.end)]
...   return (u"[%s]" % u"".join(l), exc.end)
...
[24575 refs]
>>> codecs.register_error("test.handler", handler)
[24579 refs]
>>>
[24579 refs]
>>> def test_callbacks():
...   # the {} is necessary to cause the leak
...   codecs.charmap_decode("abc", "test.handler", {})
...
[24604 refs]
>>> test_callbacks()
[24608 refs]
>>> test_callbacks()
[24611 refs]
>>> test_callbacks()
[24614 refs]
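
For readers on a current interpreter, the session above can be approximated
as follows. This is a sketch under assumptions: it is a Python 3 adaptation
(bytes input, range instead of xrange, no u"" prefixes), and it does not
measure reference counts, since the "[N refs]" output only exists in debug
builds. It only shows the handler registered once, outside test_callbacks().

```python
import codecs


def handler(exc):
    # exc.object is the bytes being decoded; indexing bytes yields ints.
    parts = ["<%d>" % exc.object[pos] for pos in range(exc.start, exc.end)]
    return ("[%s]" % "".join(parts), exc.end)


# Registered once at module level, as suggested in the thread.
codecs.register_error("test.handler", handler)


def test_callbacks():
    # The empty mapping makes every byte undefined, so the handler runs.
    return codecs.charmap_decode(b"abc", "test.handler", {})


text, consumed = test_callbacks()
```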

From jjl at pobox.com  Sat Nov 26 17:14:29 2005
From: jjl at pobox.com (John J Lee)
Date: Sat, 26 Nov 2005 16:14:29 +0000 (UTC)
Subject: [Python-Dev] urlparse brokenness
In-Reply-To: <20051123050455.9010E7FBF@place.org>
References: <20051123050455.9010E7FBF@place.org>
Message-ID: <Pine.LNX.4.58.0511261612330.6228@alice>

On Tue, 22 Nov 2005, Paul Jimenez wrote:

> It is my assertion that urlparse is currently broken.  Specifically, I
> think that urlparse breaks an abstraction boundary with ill effect.
[...]

I have some comments, but I can't see a patch on SF.  Did you post it?


John

From reinhold-birkenfeld-nospam at wolke7.net  Sat Nov 26 16:57:34 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 26 Nov 2005 16:57:34 +0100
Subject: [Python-Dev] Python 3
Message-ID: <dma473$pi$1@sea.gmane.org>

Hi,

don't know if this is known here, but it seems we have quite a long way to go:

http://kuerzer.de/python3

Reinhold <wink>


From jjl at pobox.com  Sat Nov 26 19:48:57 2005
From: jjl at pobox.com (John J Lee)
Date: Sat, 26 Nov 2005 18:48:57 +0000 (UTC)
Subject: [Python-Dev] ast status, memory leaks, etc
In-Reply-To: <dlvt41$cvl$1@sea.gmane.org>
References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com><ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com><438048B6.2030103@v.loewis.de><ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com>
	<9ef20ef30511221148g905deefo548a8fb3e68a08ae@mail.gmail.com>
	<dlvt41$cvl$1@sea.gmane.org>
Message-ID: <Pine.LNX.4.58.0511261845390.6228@alice>

On Tue, 22 Nov 2005, Fredrik Lundh wrote:
[...]
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Misc/README.valgrind?view=markup

The up-to-date version of that (from SVN instead of old CVS repository) is
here:

http://svn.python.org/view/python/trunk/Misc/README.valgrind?view=markup


John

From martin at v.loewis.de  Sat Nov 26 22:36:27 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 26 Nov 2005 22:36:27 +0100
Subject: [Python-Dev] CVS repository mostly closed now
Message-ID: <4388D55B.1070501@v.loewis.de>

I tried removing the CVS repository from SF; it turns
out that this operation is not supported. Instead, it
is only possible to remove it from the project page;
pserver and ssh access remain indefinitely, as does
viewcvs.

The recommended procedure is to place a file into
the repository indicating the repository has moved;
this is what I just did.

Regards,
Martin

From noamraph at gmail.com  Sun Nov 27 00:11:36 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 27 Nov 2005 01:11:36 +0200
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
Message-ID: <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>

Three weeks ago, I read this and thought, "well, you have two options
for a default comparison, one based on identity and one on value, both
are useful sometimes and Guido prefers identity, and it's OK." But
today I understood that I still think otherwise.

In two sentences: sometimes you wish to compare objects according to
"identity", and sometimes you wish to compare objects according to
"values". Identity-based comparison is done by the "is" operator;
Value-based comparison should be done by the == operator.

Let's take the car example, and expand it a bit. Let's say wheels have
attributes - say, diameter and manufacturer. Let's say those can't
change (which is reasonable), to make wheels hashable. There are two
ways to compare wheels: by value and by identity. Two wheels may have
the same value, that is, they have the same diameter and were created
by the same manufacturer. Two wheels may have the same identity, that
is, they are actually the same wheel.

We may want to compare wheels based on value, for example to make sure
that all the car's wheels fit together nicely: assert car.wheel1 ==
car.wheel2 == car.wheel3 == car.wheel4. We may want to compare wheels
based on identity, for example to make sure that we actually bought
four wheels in order to assemble the car: assert car.wheel1 is not
car.wheel2 and car.wheel3 is not car.wheel1 and car.wheel3 is not
car.wheel2...

We may want to associate values with wheels based on their values. For
example, it's reasonable to suppose that the price of every wheel of
the same model is the same. In that case, we'll write: price[wheel] =
25. We may want to associate values with wheels based on their
identities. For example, we may want to note that a specific wheel is
broken. For this, I'll first define a general class (I defined it
before in one of the discussions, that's because I believe it's
useful):

class Ref(object):
    """Wrap an object so that it compares and hashes by identity."""
    def __init__(self, obj):
        self._obj = obj
    def __call__(self):
        # Return the wrapped object.
        return self._obj
    def __eq__(self, other):
        return isinstance(other, Ref) and self._obj is other._obj
    def __hash__(self):
        return id(self._obj) ^ 0xBEEF

Now again, how will we say that a specific wheel is broken? Like this:

broken[Ref(wheel)] = True

Note that the Ref class also allows us to group wheels of the same
kind in a set, regardless of their __hash__ method.
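
To make the two kinds of association concrete, here is a minimal sketch. The
Wheel class, its attribute names and the sample values are assumptions made
up for illustration; Ref is the class defined above, repeated so that the
snippet runs on its own.

```python
class Ref(object):
    """Wrap an object so that it compares and hashes by identity."""

    def __init__(self, obj):
        self._obj = obj

    def __call__(self):
        return self._obj

    def __eq__(self, other):
        return isinstance(other, Ref) and self._obj is other._obj

    def __hash__(self):
        return id(self._obj) ^ 0xBEEF


class Wheel(object):
    """Hypothetical immutable wheel with value-based __eq__ and __hash__."""

    def __init__(self, diameter, manufacturer):
        self.diameter = diameter
        self.manufacturer = manufacturer

    def _key(self):
        return (self.diameter, self.manufacturer)

    def __eq__(self, other):
        return isinstance(other, Wheel) and self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


w1 = Wheel(16, "Acme")
w2 = Wheel(16, "Acme")  # same value, different object

price = {w1: 25}            # keyed by value: w2 finds the same entry
broken = {Ref(w1): True}    # keyed by identity: only w1 itself is broken
```

Here price[w2] finds the entry stored under w1, while Ref(w2) is not a key
of broken.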

I think that most objects, especially most user-defined objects, have
a *value*. I don't have an exact definition, but a hint is that two
objects that were created in the same way have the same value.
Sometimes we wish to compare objects based on their identity - in
those cases we use the "is" operator. Sometimes we wish to compare
objects based on their value - and that's what the == operator is for.
Sometimes we wish to use the value of objects as a dictionary key or
as a set member, and that's easy. Sometimes we wish to use the
identity of objects as a dictionary key or as a set member - and I
claim that we should do that by using the Ref class, whose *value* is
the object's *identity*, or by using a dict/set subclass, and not by
misusing the __hash__ and __eq__ methods.

I think that whenever value-based comparison is meaningful, the __eq__
and __hash__ should be value-based. Treating objects by identity
should be done explicitly, by the one who uses the objects, by using
the "is" operator or the Ref class. It should not be the job of the
object to decide which method (value or identity) is more useful - it
should allow the user to use both methods, by defining __eq__ and
__hash__ based on value.

Please give me examples which prove me wrong. I currently think that
the only objects for whom value-based comparison is not meaningful,
are objects which represent entities which are "outside" of the
process, or in other words, entities which are not "computational".
This includes files, sockets, possibly user-interface objects,
loggers, etc. I think that objects that represent purely "data", have
a "value" that they can be compared according to. Even wheels that
don't have any attributes are simply equal to other wheels, and not
equal to other objects. Since user-defined classes can interact with
the "environment" only through other objects or functions, it  is
reasonable to suggest that they should get a value-based equality
operator. Many times the value is defined by the __dict__ and
__slots__ members, so it seems to me a reasonable default.

I would greatly appreciate repliers that find a tiny bit of reason in
what I said (even if they don't agree), and not deny it all as a
complete load of rubbish.

Thanks,
Noam

From martin at v.loewis.de  Sun Nov 27 00:48:50 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 27 Nov 2005 00:48:50 +0100
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
References: <436E2C3E.7060807@zope.com>
	<436E6A0E.4070508@pobox.com>	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
Message-ID: <4388F462.1090808@v.loewis.de>

Noam Raphael wrote:
 > I would greatly appreciate repliers that find a tiny bit of reason in
 > what I said (even if they don't agree), and not deny it all as a
 > complete load of rubbish.

I don't understand what your message is. With this posting, did you
suggest that somebody does something specific? If so, who is that one,
and what should he do?

Anyway, a lot of your posting is what I thought was common knowledge;
and with some of it, I disagree.

> In two sentences: sometimes you wish to compare objects according to
> "identity", and sometimes you wish to compare objects according to
> "values". Identity-based comparison is done by the "is" operator;
> Value-based comparison should be done by the == operator.

Certainly.

> We may want to compare wheels based on value, for example to make sure
> that all the car's wheels fit together nicely: assert car.wheel1 ==
> car.wheel2 == car.wheel3 == car.wheel4.

I would never write it that way. This would suggest that the wheels
have to be "the same". However, this is certainly not true for wheels:
they have to be of the same make. Now, you write that wheels
only carry manufacturer and diameter. However, I would expect that
wheels grow additional attributes over time, like whether they are
left or right, and what their wear level is. So to write your property,
I would write

assert (car.wheel1.manufacturer_and_make() ==
        car.wheel2.manufacturer_and_make() ==
        car.wheel3.manufacturer_and_make() ==
        car.wheel4.manufacturer_and_make())

> We may want to associate values with wheels based on their values. For
> example, it's reasonable to suppose that the price of every wheel of
> the same model is the same. In that case, we'll write: price[wheel] =
> 25. 

Again, I would not write it this way. I would find

wheel.price()

most natural. If I have the notion of a price list, then I would
try to understand what the price list is keyed-by, e.g. model number:

price[wheel.model] = 25

> Now again, how will we say that a specific wheel is broken? Like this:
> 
> broken[Ref(wheel)] = True

If I want things to be keyed by identity, I would write

broken = IdentityDictionary()
...
broken[wheel] = True

although I would prefer to write

wheel.broken = True
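
IdentityDictionary is not an existing standard type. A minimal sketch of what
such a mapping could look like, assuming it only needs item assignment,
lookup and membership tests. The names IdentityDict and Wheel are
illustrative.

```python
class IdentityDict(object):
    """Hypothetical mapping that keys entries by object identity."""

    def __init__(self):
        # id(key) -> (key, value); storing the key keeps it alive, so its
        # id cannot be reused by another object while the entry exists.
        self._items = {}

    def __setitem__(self, key, value):
        self._items[id(key)] = (key, value)

    def __getitem__(self, key):
        try:
            return self._items[id(key)][1]
        except KeyError:
            raise KeyError(key)

    def __contains__(self, key):
        return id(key) in self._items

    def __len__(self):
        return len(self._items)


class Wheel(object):
    """Hypothetical wheel whose instances all compare equal by value."""

    def __eq__(self, other):
        return isinstance(other, Wheel)

    def __hash__(self):
        return hash(Wheel)


wheel = Wheel()
spare = Wheel()

broken = IdentityDict()
broken[wheel] = True
```

The caller writes broken[wheel] directly, and the choice of identity
semantics lives in the container rather than in a wrapper around each key.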

> I think that most objects, especially most user-defined objects, have
> a *value*. I don't have an exact definition, but a hint is that two
> objects that were created in the same way have the same value.

Here I disagree. Consider the wheel example. I would expect that
a wheel has a "wear level" or some such, and that this changes over
time, and that it belongs to the "value" of the wheel ("value"
being a synonym for "state"). As this changes over time, it is certainly
not that the object is created with that value.

Think of lists: what is their value? Are they created with it?

> Sometimes we wish to use the
> identity of objects as a dictionary key or as a set member - and I
> claim that we should do that by using the Ref class, whose *value* is
> the object's *identity*, or by using a dict/set subclass, and not by
> misusing the __hash__ and __eq__ methods.

I think we should use a specific type of dictionary then.

> I think that whenever value-based comparison is meaningful, the __eq__
> and __hash__ should be value-based. Treating objects by identity
> should be done explicitly, by the one who uses the objects, by using
> the "is" operator or the Ref class. It should not be the job of the
> object to decide which method (value or identity) is more useful - it
> should allow the user to use both methods, by defining __eq__ and
> __hash__ based on value.

If objects are compared for value equality, the object should decide
which part of its state goes into that comparison. It may be that
two objects compare equal even though their state is memberwise
different:

Rational(1,2) == Rational(5,10)
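
A sketch of such a class, assuming a hypothetical Rational that stores its
arguments unreduced. The object decides what equality means (cross
multiplication), and __hash__ has to agree with it, so it hashes the reduced
form.

```python
from math import gcd


class Rational(object):
    """Hypothetical rational number that keeps its arguments unreduced."""

    def __init__(self, numerator, denominator):
        if denominator == 0:
            raise ZeroDivisionError("denominator must not be zero")
        self.numerator = numerator
        self.denominator = denominator

    def __eq__(self, other):
        if not isinstance(other, Rational):
            return NotImplemented
        # a/b == c/d  <=>  a*d == c*b  (denominators are non-zero)
        return (self.numerator * other.denominator ==
                other.numerator * self.denominator)

    def __hash__(self):
        # Equal objects must hash equally, so hash the normalised form.
        n, d = self.numerator, self.denominator
        if d < 0:
            n, d = -n, -d
        g = gcd(n, d)
        return hash((n // g, d // g))


half = Rational(1, 2)
also_half = Rational(5, 10)
```

The two objects have different attribute values, yet they compare equal and
land in the same dictionary slot.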

> Please give me examples which prove me wrong. I currently think that
> the only objects for whom value-based comparison is not meaningful,
> are objects which represent entities which are "outside" of the
> process, or in other words, entities which are not "computational".

You mean, things of the real world, right? Like people, bank accounts,
and wheels.

Regards,
Martin

From pedronis at strakt.com  Sun Nov 27 01:13:28 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Sun, 27 Nov 2005 01:13:28 +0100
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
References: <436E2C3E.7060807@zope.com>
	<436E6A0E.4070508@pobox.com>	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
Message-ID: <4388FA28.5080800@strakt.com>

Noam Raphael wrote:
> Three weeks ago, I read this and thought, "well, you have two options
> for a default comparison, one based on identity and one on value, both
> are useful sometimes and Guido prefers identity, and it's OK." But
> today I understood that I still think otherwise.
> 

well, this still belongs to comp.lang.python.

> In two sentences: sometimes you wish to compare objects according to
> "identity", and sometimes you wish to compare objects according to
> "values". Identity-based comparison is done by the "is" operator;
> Value-based comparison should be done by the == operator.
> 
> Let's take the car example, and expand it a bit. Let's say wheels have
> attributes - say, diameter and manufacturer. Let's say those can't
> change (which is reasonable), to make wheels hashable. There are two
> ways to compare wheels: by value and by identity. Two wheels may have
> the same value, that is, they have the same diameter and were created
> by the same manufacturer. Two wheels may have the same identity, that
> is, they are actually the same wheel.
> 
> We may want to compare wheels based on value, for example to make sure
> that all the car's wheels fit together nicely: assert car.wheel1 ==
> car.wheel2 == car.wheel3 == car.wheel4. We may want to compare wheels
> based on identity, for example to make sure that we actually bought
> four wheels in order to assemble the car: assert car.wheel1 is not
> car.wheel2 and car.wheel3 is not car.wheel1 and car.wheel3 is not
> car.wheel2...
> 
> We may want to associate values with wheels based on their values. For
> example, it's reasonable to suppose that the price of every wheel of
> the same model is the same. In that case, we'll write: price[wheel] =
> 25. We may want to associate values with wheels based on their
> identities. For example, we may want to note that a specific wheel is
> broken. For this, I'll first define a general class (I defined it
> before in one of the discussions, that's because I believe it's
> useful):
> 
> class Ref(object):
>     def __init__(self, obj):
>         self._obj = obj
>     def __call__(self):
>         return self._obj
>     def __eq__(self, other):
>         return isinstance(other, Ref) and self._obj is other._obj
>     def __hash__(self):
>         return id(self._obj) ^ 0xBEEF
> 
> Now again, how will we say that a specific wheel is broken? Like this:
> 
> broken[Ref(wheel)] = True
> 
> Note that the Ref class also allows us to group wheels of the same
> kind in a set, regardless of their __hash__ method.
> 
> I think that most objects, especially most user-defined objects, have
> a *value*. I don't have an exact definition, but a hint is that two
> objects that were created in the same way have the same value.
> Sometimes we wish to compare objects based on their identity - in
> those cases we use the "is" operator. Sometimes we wish to compare
> objects based on their value - and that's what the == operator is for.
> Sometimes we wish to use the value of objects as a dictionary key or
> as a set member, and that's easy. Sometimes we wish to use the
> identity of objects as a dictionary key or as a set member - and I
> claim that we should do that by using the Ref class, whose *value* is
> the object's *identity*, or by using a dict/set subclass, and not by
> misusing the __hash__ and __eq__ methods.
> 
> I think that whenever value-based comparison is meaningful, the __eq__
> and __hash__ should be value-based. Treating objects by identity
> should be done explicitly, by the one who uses the objects, by using
> the "is" operator or the Ref class. It should not be the job of the
> object to decide which method (value or identity) is more useful - it
> should allow the user to use both methods, by defining __eq__ and
> __hash__ based on value.
> 
> Please give me examples which prove me wrong. I currently think that
> the only objects for whom value-based comparison is not meaningful,
> are objects which represent entities which are "outside" of the
> process, or in other words, entities which are not "computational".
> This includes files, sockets, possibly user-interface objects,
> loggers, etc. I think that objects that represent purely "data", have
> a "value" that they can be compared according to. Even wheels that
> don't have any attributes are simply equal to other wheels, and not
> equal to other objects. Since user-defined classes can interact with
> the "environment" only through other objects or functions, it  is
> reasonable to suggest that they should get a value-based equality
> operator. Many times the value is defined by the __dict__ and
> __slots__ members, so it seems to me a reasonable default.
> 
> I would greatly appreciate repliers that find a tiny bit of reason in
> what I said (even if they don't agree), and not deny it all as a
> complete load of rubbish.
> 

not if you think python-dev is a forum for such discussions
on OO thinking vs other paradigms.


> Thanks,
> Noam


From rhamph at gmail.com  Sun Nov 27 01:25:15 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 26 Nov 2005 17:25:15 -0700
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
Message-ID: <aac2c7cb0511261625p6cdefb6epce8fc1e30e99b1c7@mail.gmail.com>

On 11/26/05, Noam Raphael <noamraph at gmail.com> wrote:
> [...stuff about using Ref() for identity dictionaries...]

I too have thought along these lines, but I went one step further. 
There is an existing function that could be modified to produce Ref
objects: id().

Making id() into a type allows it to force unsignedness, incorporate a
method for easy printing, maintain a reference to the target so that
"id(x.foo) == id(x.bar)" doesn't risk reusing the same id, and the id
object would be the same size as an int object is today.  I don't see
any disadvantage, except perhaps code that assumes id() returns an
int.  That could be fixed by having id() subclass int for a few
versions while we transition, although that may require that we store
the pointer separately from the integer value.

id() would be usable in dicts as a value, behaving as Noam suggests
that Ref behave.  Kills two birds with one stone.
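
A rough sketch of the idea, under assumptions: the class name Id, the
_target attribute and the Thing example are made up here, and this pure
Python version subclasses int (the transitional variant), so it is larger
than a plain int rather than the same size.

```python
class Id(int):
    """Hypothetical id type: an int that keeps its target object alive."""

    def __new__(cls, obj):
        self = super(Id, cls).__new__(cls, id(obj))
        # Holding the target prevents its address from being reused while
        # this Id exists.
        self._target = obj
        return self

    def __repr__(self):
        return "Id(0x%x)" % int(self)


class Thing(object):
    def foo(self):
        pass

    def bar(self):
        pass


x = Thing()

# x.foo and x.bar create temporary bound-method objects. With plain id()
# the first one may be freed and its address reused by the second; here
# both stay alive inside their Id objects, so the values differ.
a = Id(x.foo)
b = Id(x.bar)
distinct = a != b

# Usable as a dictionary key with identity semantics.
seen = {Id(x): "x"}
```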

--
Adam Olsen, aka Rhamphoryncus

From ncoghlan at gmail.com  Sun Nov 27 03:09:37 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Nov 2005 12:09:37 +1000
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <aac2c7cb0511261625p6cdefb6epce8fc1e30e99b1c7@mail.gmail.com>
References: <436E2C3E.7060807@zope.com>
	<436E6A0E.4070508@pobox.com>	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
	<aac2c7cb0511261625p6cdefb6epce8fc1e30e99b1c7@mail.gmail.com>
Message-ID: <43891561.4080604@gmail.com>

Adam Olsen wrote:
> On 11/26/05, Noam Raphael <noamraph at gmail.com> wrote:
>> [...stuff about using Ref() for identity dictionaries...]
> 
> I too have thought along these lines, but I went one step further. 
> There is an existing function that could be modified to produce Ref
> objects: id().
> 
> Making id() into a type allows it to force unsignedness, incorporate a
> method for easy printing, maintain a reference to the target so that
> "id(x.foo) == id(x.bar)" doesn't risk reusing the same id, and the id
> object would be the same size as an int object is today.  I don't see
> any disadvantage, except perhaps code that assumes id() returns an
> int.  That could be fixed by having id() subclass int for a few
> versions while we transition, although that may require that we store
> the pointer separately from the integer value.
> 
> id() would be usable in dicts as a value, behaving as Noam suggests
> that Ref behave.  Kills two birds with one stone.

I've occasionally considered the concept of a "Ref" class - usually when I 
want to be able to access a value in multiple places, and have them all track 
rebinding operations. You can't do it perfectly (you need to rebind the 
attribute directly because objects aren't notified of name rebinding) but you 
can get pretty close (because objects *are* notified of augmented assignment).

However, re-using id() for this doesn't seem like the right approach.

Cheers,
Nick.

P.S. Yes, those musings were prompted at least in part by Paul Graham's 
ramblings ;) The sample version below obviously misses out all the slots it 
would actually need to delegate to get correct behaviour.

Py> class Ref(object):
...     def __init__(self, val):
...         self._val = val
...     def __str__(self):
...         return str(self._val)
...     def __repr__(self):
...         return "%s(%s)" % (type(self).__name__, repr(self._val))
...     def __iadd__(self, other):
...         self._val += other
...         return self
...
Py> n = Ref(1)
Py> i = n
Py> n += 2
Py> n
Ref(3)
Py> i
Ref(3)
Py> def make_accum(n):
...     def accum(i, n=Ref(n)):
...         n += i
...         return n._val
...     return accum
...
Py> acc = make_accum(3)
Py> acc(1)
4
Py> acc(5)
9


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From kbk at shore.net  Sun Nov 27 06:26:03 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sun, 27 Nov 2005 00:26:03 -0500 (EST)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200511270526.jAR5Q3Mh017757@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  372 open ( -7) /  2980 closed (+12) /  3352 total ( +5)
Bugs    :  908 open ( -2) /  5395 closed (+11) /  6303 total ( +9)
RFE     :  200 open ( +0) /   191 closed ( +0) /   391 total ( +0)

New / Reopened Patches
______________________

CodeContext - Improved text indentation  (2005-11-21)
       http://python.org/sf/1362975  opened by  Tal Einat

test_cmd_line expecting English error messages   (2005-11-23)
CLOSED http://python.org/sf/1364545  opened by  A.B., Khalid

Add reference for en/decode error types  (2005-11-23)
CLOSED http://python.org/sf/1364946  opened by  Wummel

[PATCH] mmap fails on AMD64  (2005-11-24)
       http://python.org/sf/1365916  opened by  Joe Wreschnig

Patches Closed
______________

zlib.crc32 doesn't handle 0xffffffff seed  (2005-11-07)
       http://python.org/sf/1350573  closed by  akuchling

xml.dom.minidom.Node.replaceChild(obj, x, x) removes child x  (2005-01-01)
       http://python.org/sf/1094164  closed by  akuchling

Patch for (Doc) #1255218  (2005-10-17)
       http://python.org/sf/1328526  closed by  birkenfeld

Patch for (Doc) #1261659  (2005-10-17)
       http://python.org/sf/1328566  closed by  birkenfeld

Patch for (Doc) #1357604  (2005-11-18)
       http://python.org/sf/1359879  closed by  birkenfeld

CallTip Modifications  (2005-05-11)
       http://python.org/sf/1200038  closed by  kbk

ensure lock is released if exception is raised  (2005-10-05)
       http://python.org/sf/1314396  closed by  bcannon

test_cmd_line expecting English error messages   (2005-11-23)
       http://python.org/sf/1364545  closed by  doerwalter

ToolTip.py: fix main() function  (2005-10-06)
       http://python.org/sf/1315161  closed by  kbk

Add reference for en/decode error types  (2005-11-23)
       http://python.org/sf/1364946  closed by  doerwalter

solaris 10 should not define _XOPEN_SOURCE_EXTENDED  (2005-06-27)
       http://python.org/sf/1227966  closed by  loewis

Solaris 10 fails to compile complexobject.c [FIX incl.]  (2005-02-05)
       http://python.org/sf/1116722  closed by  loewis

New / Reopened Bugs
___________________

textwrap.dedent() expands tabs  (2005-11-19)
       http://python.org/sf/1361643  opened by  Steven Bethard

Text.edit_modified() doesn't work  (2005-11-20)
       http://python.org/sf/1362475  opened by  Ron Provost

Problem with tapedevices and the tarfile module  (2005-11-21)
       http://python.org/sf/1362587  opened by  Henrik

spawnlp is missing  (2005-11-21)
       http://python.org/sf/1363104  opened by  Greg MacDonald

A possible thinko in the description of os/chmod  (2005-11-22)
CLOSED http://python.org/sf/1363712  opened by  Evgeny Roubinchtein

urllib cannot open data: urls  (2005-11-25)
CLOSED http://python.org/sf/1365984  opened by  Warren Butler

Bug bz2.BZ2File(...).seek(0,2)  (2005-11-25)
       http://python.org/sf/1366000  opened by  STINNER Victor

incorrect documentation for optparse  (2005-11-25)
       http://python.org/sf/1366250  opened by  Michael Dunn

SRE engine do not release the GIL  (2005-11-25)
       http://python.org/sf/1366311  opened by  Eric Noyau

inspect.getdoc fails on objs that use property for __doc__  (2005-11-26)
       http://python.org/sf/1367183  opened by  Drew Perttula

Bugs Closed
___________

A possible thinko in the description of os.chmod  (2005-11-22)
       http://python.org/sf/1363712  closed by  birkenfeld

docs need to discuss // and __future__.division  (2001-08-08)
       http://python.org/sf/449093  closed by  akuchling

Prefer configured browser over Mozilla and friends  (2005-11-17)
       http://python.org/sf/1359150  closed by  birkenfeld

Incorrect documentation of raw unidaq string literals  (2005-11-17)
       http://python.org/sf/1359053  closed by  birkenfeld

"appropriately decorated" is undefined in MultiFile.push doc  (2005-08-09)
       http://python.org/sf/1255218  closed by  birkenfeld

Tutorial doesn't cover * and ** function calls  (2005-08-17)
       http://python.org/sf/1261659  closed by  birkenfeld

os.path.makedirs DOES handle UNC paths  (2005-11-15)
       http://python.org/sf/1357604  closed by  birkenfeld

Exec Inside A Function  (2005-04-06)
       http://python.org/sf/1177811  closed by  birkenfeld

Py_BuildValue k format units don't work with big values  (2005-09-04)
       http://python.org/sf/1281408  closed by  birkenfeld

urllib cannot open data: urls  (2005-11-25)
       http://python.org/sf/1365984  closed by  birkenfeld

imaplib: parsing INTERNALDATE  (2003-03-06)
       http://python.org/sf/698706  closed by  birkenfeld


From noamraph at gmail.com  Sun Nov 27 20:04:25 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 27 Nov 2005 21:04:25 +0200
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <4388F462.1090808@v.loewis.de>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
	<4388F462.1090808@v.loewis.de>
Message-ID: <b348a0850511271104q387ece75sc75b186b96bd792f@mail.gmail.com>

On 11/27/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Noam Raphael wrote:
>  > I would greatly appreciate repliers that find a tiny bit of reason in
>  > what I said (even if they don't agree), and not deny it all as a
>  > complete load of rubbish.
>
> I don't understand what your message is. With this posting, did you
> suggest that somebody does something specific? If so, who is that one,
> and what should he do?

Perhaps I felt a bit attacked. It was probably my fault, and anyway, a
general message like this is not the proper way - I'm sorry.

>
> Anyway, a lot of your posting is what I thought was common knowledge;
> and with some of it, I disagree.

This is fine, of course.
> > We may want to compare wheels based on value, for example to make sure
> > that all the car's wheels fit together nicely: assert car.wheel1 ==
> > car.wheel2 == car.wheel3 == car.wheel4.
>
> I would never write it that way. This would suggest that the wheels
> have to be "the same". However, this is certainly not true for wheels:
> they have to be of the same make. Now, you write that wheels
> only carry manufacturer and diameter. However, I would expect that
> wheels grow additional attributes over time, like whether they are
> left or right, and what their wear level is. So to write your property,
> I would write
>
> car.wheel1.manufacturer_and_make() ==
> car.wheel2.manufacturer_and_make() ==
> car.wheel3.manufacturer_and_make() ==
> car.wheel4.manufacturer_and_make()
>
You may be right in the case of wheels. From time to time, in the real
(programming) world, I encounter objects that I wish to compare by
value - this is certainly the case for built-in objects, but is
sometimes the case for more complex objects.

> > We may want to associate values with wheels based on their values. For
> > example, it's reasonable to suppose that the price of every wheel of
> > the same model is the same. In that case, we'll write: price[wheel] =
> > 25.
>
> Again, I would not write it this way. I would find
>
> wheel.price()

Many times the objects are not yours to add attributes, or may have
__slots__ defined. The truth is that I prefer not to add attributes to
external objects even when it's possible.
>
> most natural. If I have the notion of a price list, then I would
> try to understand what the price list is keyed-by, e.g. model number:
>
> price[wheel.model] = 25
>
Sometimes there's no "key" - it's just the state of the object (what
if wheels don't have a model number?)

> > Now again, how will we say that a specific wheel is broken? Like this:
> >
> > broken[Ref(wheel)] = True
>
> If I want things to be keyed by identity, I would write
>
> broken = IdentityDictionary()
> ...
> broken[wheel] = True
>
> although I would prefer to write
>
> wheel.broken = True
>
I personally prefer the first method, but the second one is ok too.
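
The two spellings quoted above can be sketched as follows. This is only
a rough illustration: Ref, IdentityDictionary and Wheel are hypothetical
classes written for this example, not existing library types, and the
Wheel class assumes that a wheel's value is just its diameter.

```python
class Ref(object):
    """Hypothetical wrapper whose *value* is the identity of its target."""
    def __init__(self, obj):
        self.obj = obj  # hold a reference so id(obj) cannot be reused
    def __eq__(self, other):
        return isinstance(other, Ref) and self.obj is other.obj
    def __ne__(self, other):
        return not self == other
    def __hash__(self):
        return id(self.obj)

class IdentityDictionary(object):
    """Hypothetical mapping keyed by object identity, built on Ref."""
    def __init__(self):
        self._data = {}
    def __setitem__(self, key, value):
        self._data[Ref(key)] = value
    def __getitem__(self, key):
        return self._data[Ref(key)]
    def __contains__(self, key):
        return Ref(key) in self._data
    def __len__(self):
        return len(self._data)

class Wheel(object):
    """Example class with value-based __eq__ and __hash__ (assumption)."""
    def __init__(self, diameter):
        self.diameter = diameter
    def __eq__(self, other):
        return isinstance(other, Wheel) and self.diameter == other.diameter
    def __hash__(self):
        return hash(self.diameter)

w1, w2 = Wheel(15), Wheel(15)

# Explicit identity-based lookup, first spelling: wrap the key in Ref.
broken_by_ref = {Ref(w1): True}

# Second spelling: let the container do the wrapping.
broken = IdentityDictionary()
broken[w1] = True
```

Here w1 == w2 holds by value, yet only w1 is recorded as broken in
either container, so the choice between value and identity is made by
the code doing the lookup rather than by the Wheel class.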

> > I think that most objects, especially most user-defined objects, have
> > a *value*. I don't have an exact definition, but a hint is that two
> > objects that were created in the same way have the same value.
>
> Here I disagree. Consider the wheel example. I would expect that
> a wheel has a "wear level" or some such, and that this changes over
> time, and that it belongs to the "value" of the wheel ("value"
> being synonym to "state"). As this changes over time, it is certainly
> not that the object is created with that value.
>
> Think of lists: what is their value? Are they created with it?
>
My tongue failed me. I meant: created in the same way = have gone
through the same series of actions. That is:
a = []; a.append(5); a.extend([2,1]); a.pop()
b = []; b.append(5); b.extend([2,1]); b.pop()
a == b

> > Sometimes we wish to use the
> > identity of objects as a dictionary key or as a set member - and I
> > claim that we should do that by using the Ref class, whose *value* is
> > the object's *identity*, or by using a dict/set subclass, and not by
> > misusing the __hash__ and __eq__ methods.
>
> I think we should use a specific type of dictionary then.
That's OK too. My point was that the one who uses the objects should
explicitly specify whether he means value-based or identity-based
lookup. This means that if an object has a "value", it should not make
__eq__ and __hash__ be identity-based just to make identity-based
lookup easier and implicit.
>
> > I think that whenever value-based comparison is meaningful, the __eq__
> > and __hash__ should be value-based. Treating objects by identity
> > should be done explicitly, by the one who uses the objects, by using
> > the "is" operator or the Ref class. It should not be the job of the
> > object to decide which method (value or identity) is more useful - it
> > should allow the user to use both methods, by defining __eq__ and
> > __hash__ based on value.
>
> If objects are compared for value equality, the object should decide
> which part of its state goes into that comparison. It may be that
> two objects compare equal even though their state is memberwise
> different:
>
> Rational(1,2) == Rational(5,10)
>
I completely agree. Indeed, the "value of an object" is often
not "the value of all its attributes".
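
As a rough sketch of the Rational example quoted above: the class below
is hypothetical (it is not a library type), deliberately does not
normalize its fields, and assumes a nonzero denominator. It only shows
that two objects can compare equal while their attributes differ, and
that __hash__ then has to follow the value rather than the attributes.

```python
from fractions import Fraction

class Rational(object):
    """Hypothetical unnormalized fraction; assumes den != 0."""
    def __init__(self, num, den):
        self.num = num
        self.den = den
    def __eq__(self, other):
        if not isinstance(other, Rational):
            return NotImplemented
        # Cross-multiply: a/b == c/d  iff  a*d == c*b
        return self.num * other.den == other.num * self.den
    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result
    def __hash__(self):
        # Equal objects must hash equally, so hash the reduced value,
        # not the (num, den) pair as stored.
        return hash(Fraction(self.num, self.den))

half = Rational(1, 2)
also_half = Rational(5, 10)
```

Hashing the stored (num, den) pair instead would break dictionary
lookup, because half == also_half would hold while their hashes
differed.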

> > Please give me examples which prove me wrong. I currently think that
> > the only objects for whom value-based comparison is not meaningful,
> > are objects which represent entities which are "outside" of the
> > process, or in other words, entities which are not "computational".
>
> You mean, things of the real world, right? Like people, bank accounts,
> and wheels.

No, I meant real programming examples. My theory is that most
user-defined classes have a "value", and those that don't are related
to I/O, in some sort of a broad definition of the term. I may be
wrong, so I ask for counter-examples.

Thanks for your reply,
Noam

From noamraph at gmail.com  Sun Nov 27 20:14:15 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 27 Nov 2005 21:14:15 +0200
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <4388FA28.5080800@strakt.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
	<4388FA28.5080800@strakt.com>
Message-ID: <b348a0850511271114g1193090fwa5cff444d2fb8b02@mail.gmail.com>

On 11/27/05, Samuele Pedroni <pedronis at strakt.com> wrote:
> well, this still belongs to comp.lang.python.
...
> not if you think python-dev is a forum for such discussions
> on OO thinking vs other paradigms.

Perhaps my style made it look like a discussion on OO thinking vs
other paradigms, but my conclusion is exactly about the issue of this
thread -
Jim suggested dropping default __hash__ and __eq__ for Python 3K. Guido
decided not to, because it's useful to use them for identity-based
comparison and lookup. I say that I disagree, because I think that
__hash__ and __eq__ should be used for value-based comparison and
lookup, and because if the user of the object does explicit
identity-based comparison/lookup, it doesn't matter to him whether
__hash__ and __eq__ are defined or not. I also suggested, in a way,
that it's OK to define a default value-based __eq__ method.

Noam

From arigo at tunes.org  Sun Nov 27 21:00:38 2005
From: arigo at tunes.org (Armin Rigo)
Date: Sun, 27 Nov 2005 21:00:38 +0100
Subject: [Python-Dev] For Python 3k, drop default/implicit hash,
	and comparison
In-Reply-To: <b348a0850511271104q387ece75sc75b186b96bd792f@mail.gmail.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
	<4388F462.1090808@v.loewis.de>
	<b348a0850511271104q387ece75sc75b186b96bd792f@mail.gmail.com>
Message-ID: <20051127200038.GA7033@code1.codespeak.net>

Hi Noam,

On Sun, Nov 27, 2005 at 09:04:25PM +0200, Noam Raphael wrote:
> No, I meant real programming examples. My theory is that most
> user-defined classes have a "value", and those that don't are related
> to I/O, in some sort of a broad definition of the term. I may be
> wrong, so I ask for counter-examples.

In the source code base of PyPy, trying to count only what we really
wrote and not external tools, I found 19 classes defining __eq__ on a
total of 1413.  There must be close to zero classes that have anything
to do with I/O in there.  If anything, this proves that the default
comparison for classes is absolutely fine and nothing needs to be fixed
in the Python language.

Please move this discussion outside python-dev.


Armin

From guido at python.org  Mon Nov 28 03:24:12 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 27 Nov 2005 18:24:12 -0800
Subject: [Python-Dev] urlparse brokenness
In-Reply-To: <20051123050455.9010E7FBF@place.org>
References: <20051123050455.9010E7FBF@place.org>
Message-ID: <ca471dc20511271824k1e227bdeo594559904b9894fe@mail.gmail.com>

On 11/22/05, Paul Jimenez <pj at place.org> wrote:
>
> It is my assertion that urlparse is currently broken.  Specifically, I
> think that urlparse breaks an abstraction boundary with ill effect.

IIRC I did it this way because the RFC about parsing urls specifically
prescribed it had to be done this way. Maybe there's a newer RFC with
different rules?

> In writing a mailclient, I wished to allow my users to specify their
> imap server as a url, such as 'imap://user:password at host:port/'. Which
> worked fine. I then thought that the natural extension to support
> configuration of imapssl would be 'imaps://user:password at host:port/'....
> which failed - user:password at host:port got parsed as the *path* of
> the URL instead of the network location. It turns out that urlparse
> keeps a table of url schemes that 'use netloc'... that is to say,
> that have a 'user:password at host:port' part to their URL. I think this
> 'special knowledge' about particular schemes 1) breaks an abstraction
> boundary by having a function whose charter is to pull apart a
> particularly-formatted string behave differently based on the meaning of
> the string instead of the structure of it

I disagree. You have to know what the scheme means before you can
parse the rest -- there is (by design!) no standard parsing for
anything that follows the scheme and the colon. I don't even think
that you can trust that if the colon is followed by two slashes that
what follows is a netloc for all schemes.

But if there's an RFC that says otherwise I'll gladly concede;
urlparse's main goal in life is to be RFC compliant. Is your opinion
based on an RFC?

> and 2) fails to be extensible
> or forward compatible due to hardcoded 'magic' strings - if schemes were
> somehow 'registerable' as 'netloc using' or not, then this objection
> might be nullified, but the previous objection would still stand.

I think it is reasonable to propose an extension whereby one can
register a parser (or parsing flags like uses_netloc) for a specific
scheme, presuming there won't be conflicting registrations (which
should only happen if two independently developed libraries have a
different use for the same scheme -- a failure of standardization).

> So I propose that urlsplit, the main offender, be replaced with something
> that looks like:
>
> def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')):

Since you don't present your new code in diff format, could you
explain in English how what it does differs from the original? Or
perhaps you could present some unit tests (doctest would be ideal)
showing the desired behavior of the proposed code (I understand from
later posts that it may have some bugs). (For example, why add the
default parameter?)

> Note that I'm not sold on the _parse_cache, but I'm assuming it was there
> for a reason so I'm leaving that functionality as-is.

There's also a special case for http; given that the code is rather
general and hence slow, it makes sense that it attempts some
optimizations, and removing these might cause a nasty surprise for
some users.

> If this isn't the right forum for this discussion, or the right place to
> submit code, please let me know.

Please do submit patches to SF if you want them to be discussed.

> Also, please cc: me directly on responses
> as I'm not subscribed to the firehose that is python-dev.

ACK.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mike at skew.org  Mon Nov 28 06:07:08 2005
From: mike at skew.org (Mike Brown)
Date: Sun, 27 Nov 2005 22:07:08 -0700 (MST)
Subject: [Python-Dev] urlparse brokenness
In-Reply-To: <ca471dc20511271824k1e227bdeo594559904b9894fe@mail.gmail.com>
Message-ID: <200511280507.jAS578np069306@chilled.skew.org>

Guido van Rossum wrote:
> IIRC I did it this way because the RFC about parsing urls specifically
> prescribed it had to be done this way.

That was true as of RFC 1808 (1995-1998), although the grammar actually 
allowed for a more generic interpretation. 

Such an interpretation was suggested in RFC 2396 (1998-2004) via a regular 
expression for parsing URI 'references' (a formal abstraction introduced in 
2396) into 5 components (not six, since 'params' were moved into 'path'
and eventually became an option on every path segment, not just the end
of the path). The 5 components are:

  scheme, authority (formerly netloc), path, query, fragment.

Parsing could result in some components being undefined, which is distinct 
from being empty (e.g., 'mailto:foo at bar?' would have an undefined authority 
and fragment, and a defined, but empty, query).

RFC 3986 / STD 66 (2005-) did not change the regular expression, but makes 
several references to these '5 major components' of a URI, and says that these 
components are scheme-independent; parsers that operate at the generic syntax
level "can parse any URI reference into its major components. Once the scheme
is determined, further scheme-specific parsing can be performed on the
components."

> You have to know what the scheme means before you can
> parse the rest -- there is (by design!) no standard parsing for
> anything that follows the scheme and the colon.

Not since 1998, IMHO. It was implicit, at least since RFC 2396, that all URI 
references can be interpreted as having the 5 components; it was made explicit 
in RFC 3986 / STD 66.

> I don't even think
> that you can trust that if the colon is followed by two slashes that
> what follows is a netloc for all schemes.

You can.

> But if there's an RFC that says otherwise I'll gladly concede;
> urlparse's main goal in life is to be RFC compliant.

Its intent seems to be to split a URI into its major components, which are now 
by definition scheme-independent (and have been, implicitly, for a long time), 
so the function shouldn't distinguish between schemes.

Do you want to keep returning that 6-tuple, or can we make it return a 
5-tuple? If we keep returning 'params' for backward compatibility, then that 
means the 'path' we are returning is not the 'path' that people would expect 
(they'll have to concatenate path+params to get what the generic syntax calls 
a 'path' nowadays). It's also deceptive because params are now allowed on all 
path segments, and the current function only takes them from the last segment.

Also for backward compatibility, should an absent component continue to 
manifest in the result as an empty string? I think a compliant parser should 
make a distinction between absent and empty (it could make a difference, in 
theory).

If a regular expression were used for parsing, it would produce None for 
absent components and empty-string for empty ones. I implemented it this
way in 4Suite's Ft.Lib.Uri and it works nicely.
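
A minimal sketch of that regex-based approach, using the regular
expression given in RFC 3986, Appendix B. The helper name split_uri_ref
is made up for this illustration; this is not 4Suite's code and not a
proposed urlparse API. The examples use a literal "@", which the list
archive renders as " at ".

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri_ref(uri):
    """Hypothetical helper: split a URI reference into the 5 components.

    Returns (scheme, authority, path, query, fragment). A component
    that is absent is None; one that is present but empty is ''.
    The path is always defined, though it may be empty.
    """
    m = URI_RE.match(uri)  # every part is optional, so this always matches
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)

# Absent authority and fragment, but a defined and empty query.
mailto_parts = split_uri_ref("mailto:foo@bar?")

# An unregistered scheme still gets its authority split out.
imaps_parts = split_uri_ref("imaps://user:password@host:993/")
```

Because the expression never looks at the scheme's meaning, 'imaps' is
handled exactly like 'imap' or 'http', with no table of schemes
involved.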

Mike

From ncoghlan at iinet.net.au  Mon Nov 28 12:26:53 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon, 28 Nov 2005 21:26:53 +1000
Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in
	PEP 343
Message-ID: <438AE97D.2050600@iinet.net.au>

Given the current semantics of PEP 343 and the following class:

   class null_context(object):
     def __context__(self):
         return self
     def __enter__(self):
         return self
     def __exit__(self, *exc_info):
         pass

Mistakenly writing:

    with null_context:
        # Oops, passed the class instead of an instance

Would give a less than meaningful error message:

     TypeError: unbound method __context__() must be called with null_context 
instance as first argument (got nothing instead)

It's the usual metaclass problem with invoking a slot (or slot equivalent) via 
"obj.__slot__()" rather than via "type(obj).__slot__(obj)" the way the 
underlying C code does.

I think we need to fix the proposed semantics so that they access the slots 
via the type, rather than directly through the instance. Otherwise the slots 
for the with statement will behave strangely when compared to the slots for 
other magic methods.
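
The difference between the two lookups can be sketched in plain Python.
The call_special helper below is hypothetical and is not the expansion
PEP 343 specifies; it only illustrates the "type(obj).__slot__(obj)"
rule, using __enter__ as the example slot.

```python
class null_context(object):
    def __enter__(self):
        return self
    def __exit__(self, *exc_info):
        pass

def call_special(obj, name, *args):
    """Hypothetical helper: look a special method up on type(obj).

    Instance attributes (and, for a class object, the class's own
    methods) are ignored, which is how the C-level slots behave.
    """
    method = getattr(type(obj), name, None)
    if method is None:
        raise TypeError("%r object does not support %s"
                        % (type(obj).__name__, name))
    return method(obj, *args)

ctx = null_context()
entered = call_special(ctx, "__enter__")  # found on type(ctx): works

try:
    # type(null_context) is 'type', which defines no __enter__, so the
    # mistake is reported against the class object itself.
    call_special(null_context, "__enter__")
except TypeError as exc:
    message = str(exc)
```

With this rule the error names the object that was actually passed,
instead of complaining about a missing first argument to a method.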

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Mon Nov 28 15:53:35 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 06:53:35 -0800
Subject: [Python-Dev] urlparse brokenness
In-Reply-To: <200511280507.jAS578np069306@chilled.skew.org>
References: <ca471dc20511271824k1e227bdeo594559904b9894fe@mail.gmail.com>
	<200511280507.jAS578np069306@chilled.skew.org>
Message-ID: <ca471dc20511280653r7520fc6av10fcda4ff217958c@mail.gmail.com>

OK, you've convinced me. But for backwards compatibility (until Python
3000), a new API should be designed. We can't change the old API in an
incompatible way. Please submit complete code + docs to SF. (If you
think this requires much design work, a PEP may be in order but I
think that given the new RFCs it's probably straightforward enough to
not require that.)

--Guido

On 11/27/05, Mike Brown <mike at skew.org> wrote:
> Guido van Rossum wrote:
> > IIRC I did it this way because the RFC about parsing urls specifically
> > prescribed it had to be done this way.
>
> That was true as of RFC 1808 (1995-1998), although the grammar actually
> allowed for a more generic interpretation.
>
> Such an interpretation was suggested in RFC 2396 (1998-2004) via a regular
> expression for parsing URI 'references' (a formal abstraction introduced in
> 2396) into 5 components (not six, since 'params' were moved into 'path'
> and eventually became an option on every path segment, not just the end
> of the path). The 5 components are:
>
>   scheme, authority (formerly netloc), path, query, fragment.
>
> Parsing could result in some components being undefined, which is distinct
> from being empty (e.g., 'mailto:foo at bar?' would have an undefined authority
> and fragment, and a defined, but empty, query).
>
> RFC 3986 / STD 66 (2005-) did not change the regular expression, but makes
> several references to these '5 major components' of a URI, and says that these
> components are scheme-independent; parsers that operate at the generic syntax
> level "can parse any URI reference into its major components. Once the scheme
> is determined, further scheme-specific parsing can be performed on the
> components."
>
> > You have to know what the scheme means before you can
> > parse the rest -- there is (by design!) no standard parsing for
> > anything that follows the scheme and the colon.
>
> Not since 1998, IMHO. It was implicit, at least since RFC 2396, that all URI
> references can be interpreted as having the 5 components; it was made explicit
> in RFC 3986 / STD 66.
>
> > I don't even think
> > that you can trust that if the colon is followed by two slashes that
> > what follows is a netloc for all schemes.
>
> You can.
>
> > But if there's an RFC that says otherwise I'll gladly concede;
> > urlparse's main goal in life is to be RFC compliant.
>
> Its intent seems to be to split a URI into its major components, which are now
> by definition scheme-independent (and have been, implicitly, for a long time),
> so the function shouldn't distinguish between schemes.
>
> Do you want to keep returning that 6-tuple, or can we make it return a
> 5-tuple? If we keep returning 'params' for backward compatibility, then that
> means the 'path' we are returning is not the 'path' that people would expect
> (they'll have to concatenate path+params to get what the generic syntax calls
> a 'path' nowadays). It's also deceptive because params are now allowed on all
> path segments, and the current function only takes them from the last segment.
>
> Also for backward compatibility, should an absent component continue to
> manifest in the result as an empty string? I think a compliant parser should
> make a distinction between absent and empty (it could make a difference, in
> theory).
>
> If a regular expression were used for parsing, it would produce None for
> absent components and empty-string for empty ones. I implemented it this
> way in 4Suite's Ft.Lib.Uri and it works nicely.
>
> Mike
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Nov 28 17:24:08 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 08:24:08 -0800
Subject: [Python-Dev] Metaclass problem in the "with" statement
	semantics in PEP 343
In-Reply-To: <438AE97D.2050600@iinet.net.au>
References: <438AE97D.2050600@iinet.net.au>
Message-ID: <ca471dc20511280824y6af50950y93f70f9c19bfe0d9@mail.gmail.com>

On 11/28/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> Given the current semantics of PEP 343 and the following class:
>
>    class null_context(object):
>      def __context__(self):
>          return self
>      def __enter__(self):
>          return self
>      def __exit__(self, *exc_info):
>          pass
>
> Mistakenly writing:
>
>     with null_context:
>         # Oops, passed the class instead of an instance
>
> Would give a less than meaningful error message:
>
>      TypeError: unbound method __context__() must be called with null_context
> instance as first argument (got nothing instead)
>
> It's the usual metaclass problem with invoking a slot (or slot equivalent) via
> "obj.__slot__()" rather than via "type(obj).__slot__(obj)" the way the
> underlying C code does.
>
> I think we need to fix the proposed semantics so that they access the slots
> via the type, rather than directly through the instance. Otherwise the slots
> for the with statement will behave strangely when compared to the slots for
> other magic methods.

Maybe it's because I'm just an old fart, but I can't make myself care
about this. The code is broken. You get an error message. It even has
the correct exception (TypeError). In this particular case the error
message isn't that great -- well, the same is true in many other cases
(like whenever the invocation is a method call from Python code).

That most built-in operations produce a different error message
doesn't mean we have to make *all* built-in operations use the same
approach. I fail to see the value of the consistency you're calling
for.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Nov 28 17:45:52 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 08:45:52 -0800
Subject: [Python-Dev] (no subject)
In-Reply-To: <E1EfHow-0002xd-Ar@apasphere.com>
References: <E1EfHow-0002xd-Ar@apasphere.com>
Message-ID: <ca471dc20511280845k3c73a7ccj381b9013b3651871@mail.gmail.com>

On 11/24/05, Duncan Grisby <duncan-pythondev at grisby.org> wrote:
> Hi,
>
> I posted this to comp.lang.python, but got no response, so I thought I
> would consult the wise people here...
>
> I have encountered a problem with the re module. I have a
> multi-threaded program that does lots of regular expression searching,
> with some relatively complex regular expressions. Occasionally, events
> can conspire to mean that the re search takes minutes. That's bad
> enough in and of itself, but the real problem is that the re engine
> does not release the interpreter lock while it is running. All the
> other threads are therefore blocked for the entire time it takes to do
> the regular expression search.

Rather than trying to fight the GIL, I suggest that you let a regex
expert look at your regex(es) and the input that causes the long
running times. As Fredrik suggested, certain patterns are just
inefficient but can be rewritten more efficiently. There are plenty of
regex experts on c.l.py.

Unless you have a multi-CPU box, the performance of your app isn't
going to improve by releasing the GIL -- it only affects the
responsiveness of other threads.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Nov 28 18:11:13 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 09:11:13 -0800
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <43816CE2.2020808@v.loewis.de>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net>
	<25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>
	<43816CE2.2020808@v.loewis.de>
Message-ID: <ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com>

On 11/20/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> decker at dacafe.com wrote:
> > The local python community here in Sydney indicated that python.org is
> > only upset when groups port the source to 'obscure' systems and *don't*
> > submit patches... It is possible that I was misinformed.
>
> I never heard such concerns. I personally wouldn't notice if somebody
> ported Python, and did not feed back the patches.

I guess that I'm the source of that sentiment.

My reason for wanting people to contribute ports back is that if they
don't, the port is more likely to stick on some ancient version of
Python (e.g. I believe Nokia is still at 2.2.2). Then, assuming the
port remains popular, its users are going to pressure developers of
general Python packages to provide support for old versions of Python.

While I agree that maintaining port-specific code is a pain whenever
Python is upgraded, I still think that accepting patches for
odd-platform ports is the better alternative. Even if the patches
deteriorate as Python evolves, they should still (in principle) make a
re-port easier.

Perhaps the following compromise can be made: the PSF accepts patches
from reputable platform maintainers. (Of course, like all
contributions, they must be of high quality and not break anything,
etc., before they are accepted.) If such patches cause problems with
later Python versions, the PSF won't maintain them, but instead invite
the original contributors (or other developers who are interested in
that particular port) to fix them. If there is insufficient response,
or if it comes too late given the PSF release schedule, the PSF
developers may decide to break or remove support for the affected
platform.

There's a subtle balance between keeping too much old cruft and being
too aggressive in removing cruft that still serves a purpose for
someone. I bet that we've erred in both directions at times.

> Sometimes, people ask "there is this and that port, why isn't it
> integrated", to which the answer is in most cases "because authors
> didn't contribute". This is not being upset - it is merely a fact.
> This port (djgpp) is the first one in a long time (IIRC) where
> anybody proposed rejecting it.
>
> > I am not sure about the future myself. DJGPP 2.04 has been parked at beta
> > for two years now. It might be fair to say that the *general* DJGPP
> > developer base has shrunk a little bit. But the PythonD userbase has
> > actually grown since the first release three years ago. For the time
> > being, people get very angry when the servers go down here :-)
>
> It's not that much availability of the platform I worry about, but the
> commitment of the Python porter. We need somebody to forward bug
> reports to, and somebody to intervene if incompatible changes are made.
> This person would also indicate that the platform is no longer
> available, and hence the port can be removed.

It sounds like Ben Decker is for the time being volunteering to
provide patches and to maintain them. (I hope I'm reading you right,
Ben.) I'm +1 on accepting his patches, *provided* as always they pass
muster in terms of general Python development standards. (Jeff Epler's
comments should be taken to heart.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From duncan-pythondev at grisby.org  Mon Nov 28 18:59:57 2005
From: duncan-pythondev at grisby.org (Duncan Grisby)
Date: Mon, 28 Nov 2005 17:59:57 +0000
Subject: [Python-Dev] SRE should release the GIL (was: no subject)
In-Reply-To: Message from Guido van Rossum <guido@python.org> of "Mon,
	28 Nov 2005 08:45:52 PST."
	<ca471dc20511280845k3c73a7ccj381b9013b3651871@mail.gmail.com> 
Message-ID: <E1EgnIE-0004TN-3W@apasphere.com>

On Monday 28 November, Guido van Rossum wrote:

> On 11/24/05, Duncan Grisby <duncan-pythondev at grisby.org> wrote:

> > I have encountered a problem with the re module. I have a
> > multi-threaded program that does lots of regular expression searching,
> > with some relatively complex regular expressions. Occasionally, events
> > can conspire to mean that the re search takes minutes. That's bad
> > enough in and of itself, but the real problem is that the re engine
> > does not release the interpreter lock while it is running. All the
> > other threads are therefore blocked for the entire time it takes to do
> > the regular expression search.
> 
> Rather than trying to fight the GIL, I suggest that you let a regex
> expert look at your regex(es) and the input that causes the long
> running times. As Fredrik suggested, certain patterns are just
> inefficient but can be rewritten more efficiently. There are plenty of
> regex experts on c.l.py.

Part of the problem is certainly inefficient regexes, and we have
improved things to some extent by changing some of them. Unfortunately,
the regexes come from user input, so we can't be certain that our users
aren't going to do stupid things. It's not too bad if a stupid regex
slows things down for a bit, but it is bad if it causes the whole
application to freeze for minutes at a time.

> Unless you have a multi-CPU box, the performance of your app isn't
> going to improve by releasing the GIL -- it only affects the
> responsiveness of other threads.

We do have a multi-CPU box. Even with good regexes, regex matching takes
up a significant proportion of the time spent processing in our
application, so being able to release the GIL will hopefully increase
performance overall as well as increasing responsiveness.

We are currently testing our application with the patch to sre that Eric
posted. Once we get on to some performance tests, we'll post the results
of whether releasing the GIL does make a measurable difference for us.

Cheers,

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --

From martin at v.loewis.de  Mon Nov 28 20:51:27 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 28 Nov 2005 20:51:27 +0100
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>	
	<437FA1D8.7060600@v.loewis.de>
	<20051120150850.GA27838@unpythonic.net>	
	<25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>	
	<43816CE2.2020808@v.loewis.de>
	<ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com>
Message-ID: <438B5FBF.7050604@v.loewis.de>

Guido van Rossum wrote:
> Perhaps the following compromise can be made: the PSF accepts patches
> from reputable platform maintainers. (Of course, like all
> contributions, they must be of high quality and not break anything,
> etc., before they are accepted.) If such patches cause problems with
> later Python versions, the PSF won't maintain them, but instead invite
> the original contributors (or other developers who are interested in
> that particular port) to fix them. If there is insufficient response,
> or if it comes too late given the PSF release schedule, the PSF
> developers may decide to break or remove support for the affected
> platform.

This is indeed the compromise I was after. If the contributors indicate
that they will maintain it for some time (which happened in this case),
then I can happily accept any port (and did indeed in the past).

In the specific case, there is an additional twist that we deliberately
removed DOS support some time ago, and listed that as officially removed
in a PEP. I understand that djgpp somehow isn't quite the same as DOS,
although I don't understand the differences (anymore).

But if it's fine with you, it is fine with me.

Regards,
Martin

From amk at amk.ca  Mon Nov 28 20:56:46 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Mon, 28 Nov 2005 14:56:46 -0500
Subject: [Python-Dev] Bug day this Sunday?
Message-ID: <20051128195646.GA21584@rogue.amk.ca>

Is anyone interested in joining a Python bug day this Sunday?

A useful task might be to prepare for the python-core sprint at PyCon
by going through the bug and patch managers, and listing bugs/patches
that would be good candidates for working on at PyCon.

We'd meet in the usual location: #python-dev on irc.freenode.net, from
roughly 9AM to 3PM Eastern (2PM to 8PM UTC) on Sunday Dec. 4.

--amk

From guido at python.org  Mon Nov 28 21:07:37 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 12:07:37 -0800
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD
	modifications
In-Reply-To: <438B5FBF.7050604@v.loewis.de>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>
	<437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net>
	<25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>
	<43816CE2.2020808@v.loewis.de>
	<ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com>
	<438B5FBF.7050604@v.loewis.de>
Message-ID: <ca471dc20511281207i1bb3dabpa0693014d818a4a8@mail.gmail.com>

On 11/28/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Guido van Rossum wrote:
> > Perhaps the following compromise can be made: the PSF accepts patches
> > from reputable platform maintainers. (Of course, like all
> > contributions, they must be of high quality and not break anything,
> > etc., before they are accepted.) If such patches cause problems with
> > later Python versions, the PSF won't maintain them, but instead invite
> > the original contributors (or other developers who are interested in
> > that particular port) to fix them. If there is insufficient response,
> > or if it comes too late given the PSF release schedule, the PSF
> > developers may decide to break or remove support for the affected
> > platform.
>
> This is indeed the compromise I was after. If the contributors indicate
> that they will maintain it for some time (which happened in this case),
> then I can happily accept any port (and did indeed in the past).
>
> In the specific case, there is an additional twist that we deliberately
> removed DOS support some time ago, and listed that as officially removed
> in a PEP. I understand that djgpp somehow isn't quite the same as DOS,
> although I don't understand the differences (anymore).
>
> But if it's fine with you, it is fine with me.

Thanks. :-) I say, the more platforms the merrier.

I don't recall why DOS support was removed (PEP 11 doesn't say) but I
presume it was just because nobody volunteered to maintain it, not
because we have a particular dislike for DOS. So now that we have a
volunteer let's deal with his patches without prejudice.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Nov 28 21:13:09 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 12:13:09 -0800
Subject: [Python-Dev] Proposed additional keyword argument in logging
	calls
In-Reply-To: <001a01c5ef77$d7682300$0200a8c0@alpha>
References: <001a01c5ef77$d7682300$0200a8c0@alpha>
Message-ID: <ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com>

On 11/22/05, Vinay Sajip <vinay_sajip at red-dove.com> wrote:
> On numerous occasions, requests have been made for the ability to easily add
> user-defined data to logging events. For example, a multi-threaded server
> application may want to output specific information to a particular server
> thread (e.g. the identity of the client, specific protocol options for the
> client connection, etc.)
>
> This is currently possible, but you have to subclass the Logger class and
> override its makeRecord method to put custom attributes in the LogRecord.
> These can then be output using a customised format string containing e.g.
> "%(foo)s %(bar)d". The approach is usable but requires more work than
> necessary.
>
> I'd like to propose a simpler way of achieving the same result, which
> requires use of an additional optional keyword argument in logging calls.
> The signature of the (internal) Logger._log method would change from
>
>   def _log(self, level, msg, args, exc_info=None)
>
> to
>
>   def _log(self, level, msg, args, exc_info=None, extra_info=None)
>
> The extra_info argument will be passed to Logger.makeRecord, whose signature
> will change from
>
>   def makeRecord(self, name, level, fn, lno, msg, args, exc_info):
>
> to
>
>   def makeRecord(self, name, level, fn, lno, msg, args, exc_info,
> extra_info)
>
> makeRecord will, after doing what it does now, use the extra_info argument
> as follows:
>
> If type(extra_info) != types.DictType, it will be ignored.
>
> Otherwise, any entries in extra_info whose keys are not already in the
> LogRecord's __dict__ will be added to the LogRecord's __dict__.
>
> Can anyone see any problems with this approach? If not, I propose to post
> the approach on python-list and then if there are no strong objections,
> check it in to the trunk. (Since it could break existing code, I'm assuming
> (please correct me if I'm wrong) that it shouldn't go into the
> release24-maint branch.)

This looks like a good clean solution to me. I agree with Paul Moore's
suggestion that if extra_info is not None you should just go ahead and
use it as a dict and let the errors propagate.

What's the rationale for not letting it override existing fields?
(There may be a good one, I just don't see it without turning on my
thinking cap, which would cost extra. :-)

Perhaps it makes sense to call it 'extra' instead of 'extra_info'?

As a new feature it should definitely not go into 2.4; but I don't see
how it could break existing code.
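
As a sketch of how the proposal reads from the caller's side, using
the 'extra' spelling suggested above (which is also how the keyword is
spelled in later releases of the logging module). The logger name,
field names, and address below are made up for the example.

```python
import io
import logging

# Capture output in memory so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# The format string refers to the user-defined attributes by name.
handler.setFormatter(logging.Formatter('%(clientip)s %(user)s %(message)s'))

logger = logging.getLogger('extra_demo')  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

# The dict entries become attributes of the LogRecord for this call.
logger.info('protocol problem',
            extra={'clientip': '192.0.2.1', 'user': 'alice'})

output = stream.getvalue()
```

No Logger subclass or makeRecord override is needed; the per-call
dict is the only addition.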

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Nov 28 21:14:41 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 12:14:41 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <dll2v3$78g$1@sea.gmane.org>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
Message-ID: <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>

On 11/18/05, Neil Schemenauer <nas at arctrix.com> wrote:
> Perhaps we should use the memory management technique that the rest
> of Python uses: reference counting.  I don't see why the AST
> structures couldn't be PyObjects.

Me neither. Adding yet another memory allocation scheme to Python's
already staggering number of memory allocation strategies sounds like
a bad idea.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Nov 28 21:16:21 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 12:16:21 -0800
Subject: [Python-Dev] something is wrong with test___all__
In-Reply-To: <dm03lq$41u$1@sea.gmane.org>
References: <dm03lq$41u$1@sea.gmane.org>
Message-ID: <ca471dc20511281216p36548ba6l6779da343d14e805@mail.gmail.com>

Has this been handled yet? If not, perhaps showing the good and bad
bytecode here would help trigger someone's brain into understanding
the problem.
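
A minimal sketch of one way to produce such a comparison, assuming the
two code objects are at hand. The two source strings are stand-ins,
not the actual test_colorsys code, and listing() is a hypothetical
helper.

```python
import difflib
import dis


def listing(source):
    """Return one 'OPNAME argument' line per bytecode instruction."""
    code = compile(source, '<demo>', 'exec')
    return ['%s %r' % (ins.opname, ins.argval)
            for ins in dis.get_instructions(code)]


# Stand-ins for the "good" and "bad" versions of the same module.
good = listing('x = 1 + 2\n')
bad = listing('x = 1 - 2\n')

# A unified diff keeps the report short when only a few opcodes differ.
diff = list(difflib.unified_diff(good, bad, 'good', 'bad', lineterm=''))
```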

On 11/22/05, Reinhold Birkenfeld <reinhold-birkenfeld-nospam at wolke7.net> wrote:
> Hi,
>
> on my machine, "make test" hangs at test_colorsys.
>
> Careful investigation shows that when the bytecode is freshly generated
> by "make all" (precisely in test___all__) the .pyc file is different from what a
> direct call to "regrtest.py test_colorsys" produces.
>
> Curiously, a call to "regrtest.py test___all__" instead of "make test" produces
> the correct bytecode.
>
> I can only suspect some AST bug here.
>
> Reinhold
>
> --
> Mail address is perfectly valid!


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Mon Nov 28 21:19:38 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 28 Nov 2005 21:19:38 +0100
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036:
	PythonD	modifications
In-Reply-To: <ca471dc20511281207i1bb3dabpa0693014d818a4a8@mail.gmail.com>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com>	<437FA1D8.7060600@v.loewis.de>
	<20051120150850.GA27838@unpythonic.net>	<25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>	<43816CE2.2020808@v.loewis.de>	<ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com>	<438B5FBF.7050604@v.loewis.de>
	<ca471dc20511281207i1bb3dabpa0693014d818a4a8@mail.gmail.com>
Message-ID: <438B665A.1090002@v.loewis.de>

Guido van Rossum wrote:
 > I don't recall why DOS support was removed (PEP 11 doesn't say)

The PEP was actually created after the removal, so you added (or
asked me to add) this entry:

     Name:             MS-DOS, MS-Windows 3.x
     Unsupported in:   Python 2.0
     Code removed in:  Python 2.1

Regards,
Martin

From jeremy at alum.mit.edu  Mon Nov 28 21:47:07 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 28 Nov 2005 15:47:07 -0500
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
Message-ID: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>

On 11/28/05, Guido van Rossum <guido at python.org> wrote:
> On 11/18/05, Neil Schemenauer <nas at arctrix.com> wrote:
> > Perhaps we should use the memory management technique that the rest
> > of Python uses: reference counting.  I don't see why the AST
> > structures couldn't be PyObjects.
>
> Me neither. Adding yet another memory allocation scheme to Python's
> already staggering number of memory allocation strategies sounds like
> a bad idea.

The reason this thread started was the complaint that reference
counting in the compiler is really difficult.  Almost every line of
code can lead to an error exit.  The code becomes quite cluttered when
it uses reference counting.  Right now, the AST is created with
malloc/free, but that makes it hard to free the ast at the right time.
 It would be fairly complex to convert the ast nodes to pyobjects. 
They're just simple discriminated unions right now.  If they were
allocated from an arena, the entire arena could be freed when the
compilation pass ends.

Jeremy

From guido at python.org  Mon Nov 28 22:15:58 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 13:15:58 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
Message-ID: <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>

On 11/28/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> On 11/28/05, Guido van Rossum <guido at python.org> wrote:
> > On 11/18/05, Neil Schemenauer <nas at arctrix.com> wrote:
> > > Perhaps we should use the memory management technique that the rest
> > > of Python uses: reference counting.  I don't see why the AST
> > > structures couldn't be PyObjects.
> >
> > Me neither. Adding yet another memory allocation scheme to Python's
> > already staggering number of memory allocation strategies sounds like
> > a bad idea.
>
> The reason this thread started was the complaint that reference
> counting in the compiler is really difficult.  Almost every line of
> code can lead to an error exit.

Sorry, I forgot that (I've been off-line for a week of quality time
with Orlijn, and am now digging myself out from under several hundred
emails :-).

> The code becomes quite cluttered when
> it uses reference counting.  Right now, the AST is created with
> malloc/free, but that makes it hard to free the ast at the right time.

Would fixing the code to add free() calls in all the error exits make
it more or less cluttered than using reference counting?

>  It would be fairly complex to convert the ast nodes to pyobjects.
> They're just simple discriminated unions right now.

Are they all the same size?

> If they were
> allocated from an arena, the entire arena could be freed when the
> compilation pass ends.

Then I don't understand why there was discussion of alloca() earlier
on -- surely the lifetime of a node should not be limited by the stack
frame that allocated it?

I'm not in principle against having an arena for this purpose, but I
worry that this will make it really hard to provide a Python API for
the AST, which has already been requested and whose feasibility
(unless I'm mistaken) also was touted as an argument for switching to
the AST compiler in the first place. I hope we'll never have to deal
with an API like the parser module provides...

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jeremy at alum.mit.edu  Mon Nov 28 22:23:00 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 28 Nov 2005 16:23:00 -0500
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
Message-ID: <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>

On 11/28/05, Guido van Rossum <guido at python.org> wrote:
> > The code becomes quite cluttered when
> > it uses reference counting.  Right now, the AST is created with
> > malloc/free, but that makes it hard to free the ast at the right time.
>
> Would fixing the code to add free() calls in all the error exits make
> it more or less cluttered than using reference counting?

If we had an arena API, we'd only need to call free on the arena at
top-level entry points.  If an error occurs deep inside the compiler,
the arena will still get cleaned up by calling free at the top.
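
A rough sketch of that shape, written in Python purely for
illustration (the real thing would be C, and Python's own memory
management makes the "free" here symbolic). Arena, Node, build and
compile_entry are all hypothetical names, not an existing API.

```python
class Node:
    def __init__(self, depth):
        self.depth = depth
        self.child = None


class Arena:
    """Hypothetical arena: it owns every node allocated through it."""

    def __init__(self):
        self._nodes = []

    def new_node(self, depth):
        node = Node(depth)
        self._nodes.append(node)
        return node

    def live(self):
        return len(self._nodes)

    def free(self):
        # One call releases everything, however the nodes are linked.
        self._nodes.clear()


def build(arena, depth):
    # Stand-in for deeply nested compiler code with no cleanup of its own.
    node = arena.new_node(depth)
    if depth == 0:
        raise SyntaxError('simulated error deep inside the compiler')
    node.child = build(arena, depth - 1)
    return node


def compile_entry(arena):
    # Top-level entry point: the only place that frees.
    try:
        return build(arena, 3)
    finally:
        arena.free()


arena = Arena()
try:
    compile_entry(arena)
    failed = False
except SyntaxError:
    failed = True
```

The error path in build() contains no cleanup code at all; the
nodes allocated before the failure are released by the single
free() at the entry point.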

> >  It would be fairly complex to convert the ast nodes to pyobjects.
> > They're just simple discriminated unions right now.
>
> Are they all the same size?

No.  Each type is a different size and there are actually a lot of
types -- statements, expressions, arguments, slices, &c.  All the
objects of one type are the same size.

> > If they were
> > allocated from an arena, the entire arena could be freed when the
> > compilation pass ends.
>
> Then I don't understand why there was discussion of alloca() earlier
> on -- surely the lifetime of a node should not be limited by the stack
> frame that allocated it?

Actually this is a pretty good limit, because all these data
structures are temporaries used by the compiler.  Once compilation has
finished, there's no need for the AST or the compiler state.

> I'm not in principle against having an arena for this purpose, but I
> worry that this will make it really hard to provide a Python API for
> the AST, which has already been requested and whose feasibility
> (unless I'm mistaken) also was touted as an argument for switching to
> the AST compiler in the first place. I hope we'll never have to deal
> with an API like the parser module provides...

My preference would be to have the ast shared by value.  We generate
code to serialize it to and from a byte stream and share that between
Python and C.  It is less efficient, but it is also very simple.

Jeremy

From martin at v.loewis.de  Mon Nov 28 22:37:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 28 Nov 2005 22:37:05 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>	<437B2075.1000102@gmail.com>
	<dlf7ak$ckg$1@sea.gmane.org>	<dll2v3$78g$1@sea.gmane.org>	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
Message-ID: <438B7881.8020200@v.loewis.de>

Jeremy Hylton wrote:
 > The reason this thread started was the complaint that reference
 > counting in the compiler is really difficult.  Almost every line of
 > code can lead to an error exit.  The code becomes quite cluttered when
 > it uses reference counting.  Right now, the AST is created with
 > malloc/free, but that makes it hard to free the ast at the right time.
 >  It would be fairly complex to convert the ast nodes to pyobjects.
 > They're just simple discriminated unions right now.  If they were
 > allocated from an arena, the entire arena could be freed when the
 > compilation pass ends.

I haven't looked at the AST code at all so far, but my experience
with gcc is that such an approach is fundamentally flawed: you
would always have memory that ought to survive the parsing, so
you will have to copy it out of the arena. This will either lead
to dangling pointers, or garbage memory. So in gcc, they eventually
moved to a full garbage collector (after several iterations).

Reference counting has the advantage that you can always DECREF
at the end of the function. So if you put all local variables
at the beginning of the function, and all DECREFs at the end,
getting clean memory management should be doable, IMO. Plus,
contributors would be familiar with the scheme in place.

I don't know if details have already been proposed, but I would
update asdl to generate a hierarchy of classes: i.e.

class mod(object):pass

class Module(mod):
   def __init__(self, body):
     self.body = body # List of stmt

#...

class Expression(mod):
   def __init__(self, body):
     self.body = body # expr

# ...
class Raise(stmt):
   def __init__(self, dest, values, nl):
      self.dest = dest       # expr or None
      self.values = values   # List of expr
      self.nl = nl           # bool (True or False)

There would be convenience functions, like

   PyObject *mod_Module(PyObject* body);
   enum mod_kind mod_kind(PyObject* mod);
   // Module, Interactive, Expression, or mod_INVALID
   PyObject *mod_Expression_body(PyObject*);
   //...
   PyObject *stmt_Raise_dest(PyObject*);

(whether the accessors return new or borrowed reference
  could be debated; plain C struct accesses would also
  be possible)

Regards,
Martin

From nas at arctrix.com  Mon Nov 28 22:46:05 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 28 Nov 2005 14:46:05 -0700
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
References: <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
Message-ID: <20051128214605.GB26230@mems-exchange.org>

On Mon, Nov 28, 2005 at 03:47:07PM -0500, Jeremy Hylton wrote:
> The reason this thread started was the complaint that reference
> counting in the compiler is really difficult.

I don't think that's exactly right.  The problem is that the AST
compiler mixes its own memory management strategy with reference
counting and the result doesn't quite work.  The AST compiler mainly
keeps track of memory via containment: for example, if B is an
attribute of A then B gets freed when A gets freed.  That works fine
as long as B is never shared.  My memory of the problems is a little
fuzzy.  Maybe Neal Norwitz can explain it better.
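
To make the sharing hazard concrete, here is a toy sketch in Python
(not the compiler's actual code; Node and free_tree are made up).
Freeing by containment visits a shared child once per parent, which
in C would be a double free.

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.freed = False


def free_tree(node, log):
    """Containment-style free: freeing a node frees all it contains."""
    for child in node.children:
        free_tree(child, log)
    if node.freed:
        # In C this would be a second free() of the same memory.
        log.append('double free of ' + node.name)
    node.freed = True


shared = Node('B')
a1 = Node('A1', [shared])
a2 = Node('A2', [shared])   # B is now contained in two parents
root = Node('root', [a1, a2])

problems = []
free_tree(root, problems)
```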

  Neil

From guido at python.org  Mon Nov 28 22:46:31 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 13:46:31 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
Message-ID: <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>

[Guido]
> > Then I don't understand why there was discussion of alloca() earlier
> > on -- surely the lifetime of a node should not be limited by the stack
> > frame that allocated it?

[Jeremy]
> Actually this is a pretty good limit, because all these data
> structures are temporaries used by the compiler.  Once compilation has
> finished, there's no need for the AST or the compiler state.

Are you really saying that there is one function which is called only
once (per compilation) which allocates *all* the AST nodes? That's the
only situation where I'd see alloca() working -- unless your alloca()
doesn't allocate memory on the stack. I was somehow assuming that the
tree would be built piecemeal by parser callbacks or some such
mechanism. There's still a stack frame whose lifetime limits the AST
lifetime, but it is not usually the current stackframe when a new node
is allocated, so alloca() can't be used.

I guess I don't understand the AST compiler code enough to participate
in this discussion. Or perhaps we are agreeing violently?

> > I'm not in principle against having an arena for this purpose, but I
> > worry that this will make it really hard to provide a Python API for
> > the AST, which has already been requested and whose feasibility
> > (unless I'm mistaken) also was touted as an argument for switching to
> > the AST compiler in the first place. I hope we'll never have to deal
> > with an API like the parser module provides...
>
> My preference would be to have the ast shared by value.  We generate
> code to serialize it to and from a byte stream and share that between
> Python and C.  It is less efficient, but it is also very simple.

So there would still be a Python-objects version of the AST but the
compiler itself doesn't use it.

At least by-value makes sense to me -- if you're making tree
transformations you don't want accidental sharing to cause unexpected
side effects.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Mon Nov 28 22:59:04 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 28 Nov 2005 13:59:04 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
Message-ID: <bbaeab100511281359r16a8fc63k5fd300447a35e7ce@mail.gmail.com>

On 11/28/05, Guido van Rossum <guido at python.org> wrote:
> [Guido]
> > > Then I don't understand why there was discussion of alloca() earlier
> > > on -- surely the lifetime of a node should not be limited by the stack
> > > frame that allocated it?
>
> [Jeremy]
> > Actually this is a pretty good limit, because all these data
> > structures are temporaries used by the compiler.  Once compilation has
> > finished, there's no need for the AST or the compiler state.
>
> Are you really saying that there is one function which is called only
> once (per compilation) which allocates *all* the AST nodes?

Nope, there isn't for everything.  It's just that some are temporary
to internal functions and thus can stand to be freed later (unless my
memory is really shot).  Otherwise it is piece-meal.  There is the
main data structure such as the compiler struct and the top-level node
for the AST, but otherwise everything (currently) is allocated as
needed.

> That's the
> only situation where I'd see alloca() working -- unless your alloca()
> doesn't allocate memory on the stack. I was somehow assuming that the
> tree would be built piecemeal by parser callbacks or some such
> mechanism. There's still a stack frame whose lifetime limits the AST
> lifetime, but it is not usually the current stackframe when a new node
> is allocated, so alloca() can't be used.
>
> I guess I don't understand the AST compiler code enough to participate
> in this discussion. Or perhaps we are agreeing violently?
>

I don't think your knowledge of the codebase precludes your
participation.  Actually, I think it makes it even more important,
since if some scheme is devised that is not easily explained, it is
really going to limit who can help out with maintenance and
enhancements on the compiler.

> > > I'm not in principle against having an arena for this purpose, but I
> > > worry that this will make it really hard to provide a Python API for
> > > the AST, which has already been requested and whose feasibility
> > > (unless I'm mistaken) also was touted as an argument for switching to
> > > the AST compiler in the first place. I hope we'll never have to deal
> > > with an API like the parser module provides...
> >
> > My preference would be to have the ast shared by value.  We generate
> > code to serialize it to and from a byte stream and share that between
> > Python and C.  It is less efficient, but it is also very simple.
>
> So there would still be a Python-objects version of the AST but the
> compiler itself doesn't use it.
>

Yep.  The idea was to return a PyString formatted a la the parser
module, where it is just a bunch of nested items in a Scheme-like
format.  There would then be Python or C code that would generate a
Python object representation from that.  Then, when you were finished
tweaking the structure, you would write it back out as a PyString and
then recreate the internal representation.  That makes it
pass-by-value since you pass the serialized PyString version across
the C-Python boundary.
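
A rough sketch of that round trip, under loud assumptions: tnode and the
tnode_*() helpers are invented for illustration, the "(name kid kid)" text
format is only a guess at what a Scheme-like dump could look like, and
nothing here is the format the AST code actually uses. The point is only
that the tree rebuilt from the string shares no memory with the original:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_KIDS 4

/* Hypothetical tree node standing in for an AST node. */
typedef struct tnode {
    char name[16];
    int nkids;
    struct tnode *kids[MAX_KIDS];
} tnode;

static tnode *
tnode_new(const char *name)
{
    tnode *t = calloc(1, sizeof *t);
    if (t != NULL)
        strncpy(t->name, name, sizeof t->name - 1);
    return t;
}

static void
tnode_free(tnode *t)
{
    int i;
    if (t == NULL)
        return;
    for (i = 0; i < t->nkids; i++)
        tnode_free(t->kids[i]);
    free(t);
}

/* Write "(name kid kid ...)" into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
static int
tnode_dump(const tnode *t, char *buf, size_t size)
{
    int i, n, used;
    n = snprintf(buf, size, "(%s", t->name);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = n;
    for (i = 0; i < t->nkids; i++) {
        if ((size_t)used + 1 >= size)
            return -1;
        buf[used++] = ' ';
        n = tnode_dump(t->kids[i], buf + used, size - (size_t)used);
        if (n < 0)
            return -1;
        used += n;
    }
    if ((size_t)used + 1 >= size)
        return -1;
    buf[used++] = ')';
    buf[used] = '\0';
    return used;
}

/* Parse one "(name kid ...)" form starting at *sp and advance *sp
 * past it.  Returns a freshly allocated tree, or NULL on error. */
static tnode *
tnode_load(const char **sp)
{
    const char *s = *sp;
    size_t len;
    tnode *t;

    if (*s != '(')
        return NULL;
    s++;
    len = strcspn(s, " ()");
    if (len == 0 || len >= sizeof t->name)
        return NULL;
    t = calloc(1, sizeof *t);
    if (t == NULL)
        return NULL;
    memcpy(t->name, s, len);
    s += len;
    while (*s == ' ') {
        s++;
        if (t->nkids == MAX_KIDS)
            goto error;
        t->kids[t->nkids] = tnode_load(&s);
        if (t->kids[t->nkids] == NULL)
            goto error;
        t->nkids++;
    }
    if (*s != ')')
        goto error;
    *sp = s + 1;
    return t;

error:
    tnode_free(t);
    return NULL;
}
```

Because only the string crosses the boundary, changes made to the copy on
one side cannot be seen on the other until it is serialized again.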

> At least by-value makes sense to me -- if you're making tree
> transformations you don't want accidental sharing to cause unexpected
> side effects.
>

Yeah, that could be bad.  =)

-Brett

From walter at livinglogic.de  Mon Nov 28 23:13:58 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon, 28 Nov 2005 23:13:58 +0100
Subject: [Python-Dev] reference leaks
In-Reply-To: <ee2a432c0511251002n438ca00eib1d7bdee53df30d7@mail.gmail.com>
References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com>	
	<4386D91B.7030505@livinglogic.de>
	<ee2a432c0511251002n438ca00eib1d7bdee53df30d7@mail.gmail.com>
Message-ID: <438B8126.7090502@livinglogic.de>

Neal Norwitz wrote:

> On 11/25/05, Walter Dörwald <walter at livinglogic.de> wrote:
>> Can you move the call to codecs.register_error() out of test_callbacks()
>> and retry?
> 
> It then leaks 3 refs on each call to test_callbacks().

This should be fixed now in r41555 and r41556.

Bye,
    Walter Dörwald


From nnorwitz at gmail.com  Mon Nov 28 23:58:24 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 28 Nov 2005 14:58:24 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
Message-ID: <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>

On 11/28/05, Guido van Rossum <guido at python.org> wrote:
>
> I guess I don't understand the AST compiler code enough to participate
> in this discussion.

I hope everyone will chime in here.  This is important to improve and
learn from others.

Let me try to describe the current situation with a small amount of
code.  Hopefully it will give some idea of the larger problems.

This is an entire function from Python/ast.c.  It demonstrates the
issues fairly clearly.  It contains at least one memory leak.  It uses
asdl_seqs, which are barely more than somewhat dynamic arrays.
Sequences do not know what type they hold, so there needs to be
different dealloc functions to free them properly (asdl_*_seq_free()).
The ast_for_*() functions allocate memory, so in case of an error, the
memory will need to be freed.  Most of this memory is internal to the
AST code.  However, there are some identifiers (PyStrings) that must
be DECREF'ed.  See below for the memory leak.

static stmt_ty
ast_for_funcdef(struct compiling *c, const node *n)
{
    /* funcdef: 'def' [decorators] NAME parameters ':' suite */
    identifier name = NULL;
    arguments_ty args = NULL;
    asdl_seq *body = NULL;
    asdl_seq *decorator_seq = NULL;
    int name_i;

    REQ(n, funcdef);

    if (NCH(n) == 6) { /* decorators are present */
	decorator_seq = ast_for_decorators(c, CHILD(n, 0));
	if (!decorator_seq)
	    goto error;
	name_i = 2;
    }
    else {
	name_i = 1;
    }

    name = NEW_IDENTIFIER(CHILD(n, name_i));
    if (!name)
	goto error;
    else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
	ast_error(CHILD(n, name_i), "assignment to None");
	goto error;
    }
    args = ast_for_arguments(c, CHILD(n, name_i + 1));
    if (!args)
	goto error;
    body = ast_for_suite(c, CHILD(n, name_i + 3));
    if (!body)
	goto error;

    return FunctionDef(name, args, body, decorator_seq, LINENO(n));

error:
    asdl_stmt_seq_free(body);
    asdl_expr_seq_free(decorator_seq);
    free_arguments(args);
    Py_XDECREF(name);
    return NULL;
}

The memory leak occurs when FunctionDef fails.  name, args, body, and
decorator_seq are all local and would not be freed.  The simple
variables can be freed in each "constructor" like FunctionDef(), but
the sequences cannot unless they keep the info about which type they
hold.  That would help quite a bit, but I'm not sure it's the
right/best solution.
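
A minimal sketch of the "sequences keep the info about which type they
hold" idea, assuming a made-up typed_seq type (the real asdl_seq carries
no such field). Each sequence stores a pointer to the function that frees
one element, so a single generic deallocator can serve every sequence and
a constructor could clean up its arguments on failure:

```c
#include <stdlib.h>

/* Hypothetical variant of asdl_seq that remembers how to free its
 * elements.  Invented for illustration only. */
typedef struct {
    int size;
    void (*free_elem)(void *);
    void *elements[];
} typed_seq;

static typed_seq *
typed_seq_new(int size, void (*free_elem)(void *))
{
    typed_seq *seq;
    int i;
    if (size < 0)
        return NULL;
    seq = malloc(sizeof *seq + (size_t)size * sizeof(void *));
    if (seq == NULL)
        return NULL;
    seq->size = size;
    seq->free_elem = free_elem;
    for (i = 0; i < size; i++)
        seq->elements[i] = NULL;
    return seq;
}

/* One deallocator in place of asdl_stmt_seq_free(),
 * asdl_expr_seq_free() and friends.  Accepts NULL and skips
 * slots that were never filled in. */
static void
typed_seq_free(typed_seq *seq)
{
    int i;
    if (seq == NULL)
        return;
    for (i = 0; i < seq->size; i++) {
        if (seq->elements[i] != NULL)
            seq->free_elem(seq->elements[i]);
    }
    free(seq);
}

/* Demonstration-only element deallocator that counts its calls. */
static int freed_count = 0;

static void
counting_free(void *p)
{
    freed_count++;
    free(p);
}
```

The cost is one extra pointer per sequence; whether that beats arenas or
PyObjects is exactly the open question.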

Hope this helps explain a bit.  Please speak up with how this can be
improved.  Gotta run.

n

From greg.ewing at canterbury.ac.nz  Tue Nov 29 00:55:17 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 29 Nov 2005 12:55:17 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu>
	<e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com>
	<ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com>
	<bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
Message-ID: <438B98E5.5010209@canterbury.ac.nz>

Jeremy Hylton wrote:

> Almost every line of
> code can lead to an error exit.  The code becomes quite cluttered when
> it uses reference counting.

I don't see why very many more error exits should become
possible just by introducing refcounting. Errors are possible
whenever you allocate something, however you do it, so you
need error checks on all your allocations in any case.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Nov 29 01:11:11 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 29 Nov 2005 13:11:11 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
Message-ID: <438B9C9F.6000802@canterbury.ac.nz>

Neal Norwitz wrote:

> This is an entire function from Python/ast.c.

> Sequences do not know what type they hold, so there needs to be
> different dealloc functions to free them properly (asdl_*_seq_free()).

Well, that's one complication that would go away if
the nodes were PyObjects.

> The memory leak occurs when FunctionDef fails.  name, args, body, and
> decorator_seq are all local and would not be freed.  The simple
> variables can be freed in each "constructor" like FunctionDef(), but
> the sequences cannot unless they keep the info about which type they
> hold.

If FunctionDef's reference semantics are defined so
that it steals references to its arguments, then here
is how the same function would look with PyObject
AST nodes, as far as I can see:

  static PyObject *
  ast_for_funcdef(struct compiling *c, const node *n)
  {
      /* funcdef: 'def' [decorators] NAME parameters ':' suite */
      PyObject *name = NULL;
      PyObject *args = NULL;
      PyObject *body = NULL;
      PyObject *decorator_seq = NULL;
      int name_i;

      REQ(n, funcdef);

      if (NCH(n) == 6) { /* decorators are present */
          decorator_seq = ast_for_decorators(c, CHILD(n, 0));
          if (!decorator_seq)
              goto error;
          name_i = 2;
      }
      else {
          name_i = 1;
      }

      name = NEW_IDENTIFIER(CHILD(n, name_i));
      if (!name)
          goto error;
      else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
          ast_error(CHILD(n, name_i), "assignment to None");
          goto error;
      }
      args = ast_for_arguments(c, CHILD(n, name_i + 1));
      if (!args)
          goto error;
      body = ast_for_suite(c, CHILD(n, name_i + 3));
      if (!body)
          goto error;

      return FunctionDef(name, args, body, decorator_seq, LINENO(n));

  error:
      Py_XDECREF(body);
      Py_XDECREF(decorator_seq);
      Py_XDECREF(args);
      Py_XDECREF(name);
      return NULL;
  }

The only things I've changed are turning some type
declarations into PyObject * and replacing the
deallocation functions at the end with Py_XDECREF!

Maybe there are other functions where it would not
be so straightforward, but if this really is a
typical AST function, switching to PyObjects looks
like it wouldn't be difficult at all, and would
actually make some things simpler.
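
For readers unfamiliar with "stealing" semantics, a self-contained sketch
follows. It uses a toy refcounted obj instead of PyObject, and
toy_FunctionDef() is a hypothetical constructor, not the generated one;
the fail flag only simulates an allocation error. The constructor
consumes its arguments on success and on failure alike, which is what
makes a bare "return FunctionDef(...)" leak-free:

```c
#include <stdlib.h>

/* Toy refcounted object; a stand-in for PyObject, not the real thing. */
typedef struct {
    int refcnt;
} obj;

static int live_objects = 0;    /* lets a test detect leaks */

static obj *
obj_new(void)
{
    obj *o = malloc(sizeof *o);
    if (o != NULL) {
        o->refcnt = 1;
        live_objects++;
    }
    return o;
}

/* Like Py_XDECREF: tolerates NULL, frees on the last reference. */
static void
obj_xdecref(obj *o)
{
    if (o != NULL && --o->refcnt == 0) {
        live_objects--;
        free(o);
    }
}

typedef struct {
    obj *name;
    obj *body;
} toy_funcdef;

/* Steals name and body: the caller's references are consumed whether
 * or not construction succeeds. */
static toy_funcdef *
toy_FunctionDef(obj *name, obj *body, int fail)
{
    toy_funcdef *f = fail ? NULL : malloc(sizeof *f);
    if (f == NULL) {
        obj_xdecref(name);      /* failure: still consume them */
        obj_xdecref(body);
        return NULL;
    }
    f->name = name;             /* success: take them over */
    f->body = body;
    return f;
}

static void
toy_funcdef_free(toy_funcdef *f)
{
    if (f == NULL)
        return;
    obj_xdecref(f->name);
    obj_xdecref(f->body);
    free(f);
}
```

The caller must not touch name or body after the call. The C API already
has calls documented as stealing a reference, such as PyTuple_SetItem()
and PyList_SetItem(), so the convention is not new, just easy to get
wrong.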

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Nov 29 01:13:29 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 29 Nov 2005 13:13:29 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>
	<437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
Message-ID: <438B9D29.2020403@canterbury.ac.nz>

Here's a somewhat radical idea:

Why not write the parser and bytecode compiler in Python?

A .pyc could be bootstrapped from it and frozen into
the executable.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From martin at v.loewis.de  Tue Nov 29 01:21:38 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 29 Nov 2005 01:21:38 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu>	<437B2075.1000102@gmail.com>
	<dlf7ak$ckg$1@sea.gmane.org>	<dll2v3$78g$1@sea.gmane.org>	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
Message-ID: <438B9F12.3060607@v.loewis.de>

Neal Norwitz wrote:
> Hope this helps explain a bit.  Please speak up with how this can be
> improved.  Gotta run.

I would rewrite it as

static PyObject*
ast_for_funcdef(struct compiling *c, const node *n)
{
     /* funcdef: [decorators] 'def' NAME parameters ':' suite */
     PyObject *name = NULL;
     PyObject *args = NULL;
     PyObject *body = NULL;
     PyObject *decorator_seq = NULL;
     PyObject *result = NULL;
     int name_i;

     REQ(n, funcdef);

     if (NCH(n) == 6) { /* decorators are present */
         decorator_seq = ast_for_decorators(c, CHILD(n, 0));
         if (!decorator_seq)
             goto error;
         name_i = 2;
     }
     else {
         name_i = 1;
     }

     name = NEW_IDENTIFIER(CHILD(n, name_i));
     if (!name)
         goto error;
     else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
         ast_error(CHILD(n, name_i), "assignment to None");
         goto error;
     }
     args = ast_for_arguments(c, CHILD(n, name_i + 1));
     if (!args)
         goto error;
     body = ast_for_suite(c, CHILD(n, name_i + 3));
     if (!body)
         goto error;

     result = FunctionDef(name, args, body, decorator_seq, LINENO(n));

error:
     Py_XDECREF(name);
     Py_XDECREF(args);
     Py_XDECREF(body);
     Py_XDECREF(decorator_seq);
     return result;
}

The convention would be that ast_for_* returns new references, which
have to be released regardless of success or failure. FunctionDef
would duplicate all of its parameter references if it succeeds,
and leave them untouched if it fails.

One could develop a checker that verifies that:
a) all PyObject* local variables are initialized to NULL, and
b) all such variables are Py_XDECREF'ed after the error label.
c) result is initialized to NULL, and returned.
Then, "goto error" at any point in the code would be correct
(assuming an exception had been set prior to the goto).

No special release function for the body or the decorators
would be necessary - they would be plain Python lists.
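
The convention can be exercised in isolation with a toy model. In the
sketch below, obj is a made-up refcounted type standing in for PyObject,
toy_FunctionDef() and build_funcdef() are hypothetical, and fail_at only
simulates a failure at a given step. The constructor takes its own
references on success and leaves its arguments untouched on failure, so
the single exit path is correct for every outcome:

```c
#include <stdlib.h>

/* Toy refcounted object; a stand-in for PyObject, not the real thing. */
typedef struct {
    int refcnt;
} obj;

static int live_objects = 0;    /* lets a test detect leaks */

static obj *
obj_new(void)
{
    obj *o = malloc(sizeof *o);
    if (o != NULL) {
        o->refcnt = 1;
        live_objects++;
    }
    return o;
}

static void
obj_incref(obj *o)
{
    o->refcnt++;
}

/* Like Py_XDECREF: tolerates NULL, frees on the last reference. */
static void
obj_xdecref(obj *o)
{
    if (o != NULL && --o->refcnt == 0) {
        live_objects--;
        free(o);
    }
}

typedef struct {
    obj *name;
    obj *body;
} toy_funcdef;

/* Takes new references on success; touches nothing on failure. */
static toy_funcdef *
toy_FunctionDef(obj *name, obj *body, int fail)
{
    toy_funcdef *f = fail ? NULL : malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    obj_incref(name);
    obj_incref(body);
    f->name = name;
    f->body = body;
    return f;
}

static void
toy_funcdef_free(toy_funcdef *f)
{
    if (f == NULL)
        return;
    obj_xdecref(f->name);
    obj_xdecref(f->body);
    free(f);
}

/* Rules (a) to (c): locals start as NULL, every local is released
 * after the label, and result starts as NULL and is returned. */
static toy_funcdef *
build_funcdef(int fail_at)
{
    obj *name = NULL;
    obj *body = NULL;
    toy_funcdef *result = NULL;

    name = obj_new();
    if (name == NULL || fail_at == 1)
        goto error;
    body = obj_new();
    if (body == NULL || fail_at == 2)
        goto error;
    result = toy_FunctionDef(name, body, fail_at == 3);

error:
    obj_xdecref(name);
    obj_xdecref(body);
    return result;
}
```

Whichever step fails, the live-object count returns to zero; on success
the result holds the only remaining reference to each child.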

Regards,
Martin

From bcannon at gmail.com  Tue Nov 29 01:29:09 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 28 Nov 2005 16:29:09 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438B9D29.2020403@canterbury.ac.nz>
References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9D29.2020403@canterbury.ac.nz>
Message-ID: <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com>

On 11/28/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Here's a somewhat radical idea:
>
> Why not write the parser and bytecode compiler in Python?
>
> A .pyc could be bootstrapped from it and frozen into
> the executable.
>

Is there a specific reason you are leaving out the AST, Greg, or do
you count that as part of the bytecode compiler (I think of that as
the AST->bytecode step handled by Python/compile.c)?

While ease of maintenance would be fantastic and would probably lead
to much more language experimentation if more of the core parts of
Python were written in Python, I would worry about performance.  While
generating bytecode is not necessarily an every-time thing, I know
Guido has said he doesn't like punishing the performance of small
scripts in the name of large-scale apps (the reason interpreter
startup time has always been an issue), and small scripts tend not to
have a .pyc file.

-Brett

From hyeshik at gmail.com  Tue Nov 29 02:14:56 2005
From: hyeshik at gmail.com (=?EUC-KR?B?wOXH/b3E?=)
Date: Tue, 29 Nov 2005 10:14:56 +0900
Subject: [Python-Dev] CVS repository mostly closed now
In-Reply-To: <4388D55B.1070501@v.loewis.de>
References: <4388D55B.1070501@v.loewis.de>
Message-ID: <4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com>

On 11/27/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> I tried removing the CVS repository from SF; it turns
> out that this operation is not supported. Instead, it
> is only possible to remove it from the project page;
> pserver and ssh access remain indefinitely, as does
> viewcvs.

There's a hacky trick to remove them:
 put  rm -rf $CVSROOT/src into CVSROOT/loginfo
and remove the line then and commit again. :)


Hye-Shik

From fdrake at acm.org  Tue Nov 29 02:32:08 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 28 Nov 2005 20:32:08 -0500
Subject: [Python-Dev] CVS repository mostly closed now
In-Reply-To: <4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com>
References: <4388D55B.1070501@v.loewis.de>
	<4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com>
Message-ID: <200511282032.09373.fdrake@acm.org>

On Monday 28 November 2005 20:14, 장혜식 wrote:
 > There's a hacky trick to remove them:
 >  put  rm -rf $CVSROOT/src into CVSROOT/loginfo
 > and remove the line then and commit again. :)

Wow, that is tricky!  Glad it wasn't me who thought of this one.  :-)


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From greg.ewing at canterbury.ac.nz  Tue Nov 29 07:31:49 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 29 Nov 2005 19:31:49 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9D29.2020403@canterbury.ac.nz>
	<bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com>
Message-ID: <438BF5D5.6090000@canterbury.ac.nz>

Brett Cannon wrote:

> Is there a specific reason you are leaving out the AST, Greg, or do
> you count that as part of the bytecode compiler

No, I consider it part of the parser. My mental model
of parsing & compiling in the presence of a parse tree
is like this:

   [source] -> scanner -> [tokens]
     -> parser -> [AST] -> code_generator -> [code]

The fact that there still seems to be another kind of
parse tree in between the scanner and the AST generator
is an oddity which I hope will eventually disappear.

> I know
> Guido has said he doesn't like punishing the performance of small
> scripts in the name of large-scale apps

To me, that's an argument in favour of always generating
a .pyc, even for scripts.

Greg

From martin at v.loewis.de  Tue Nov 29 08:14:20 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 29 Nov 2005 08:14:20 +0100
Subject: [Python-Dev] CVS repository mostly closed now
In-Reply-To: <4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com>
References: <4388D55B.1070501@v.loewis.de>
	<4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com>
Message-ID: <438BFFCC.1010005@v.loewis.de>

장혜식 wrote:
> There's a hacky trick to remove them:
>  put  rm -rf $CVSROOT/src into CVSROOT/loginfo
> and remove the line then and commit again. :)

Sure :-) SF makes a big fuss as to how good a service
this is: open source will never go away. I tend to
agree, somewhat. For historical reasons, it is surely
nice to be able to browse the CVS repository (in particular
if you need to correlate CVS revision numbers and svn
revision numbers); also, people can take any time they
want to convert CVS sandboxes.

So instead of hacking them, I thought we better comply.
With the mechanics in place, anybody should notice
we switched to subversion (but I will write something
on c.l.p.a, anyway).

Regards,
Martin

P.S. Sorry for not getting your name right in the To:
field; that's thunderbird.

From nnorwitz at gmail.com  Tue Nov 29 08:24:25 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 28 Nov 2005 23:24:25 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438B9F12.3060607@v.loewis.de>
References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
Message-ID: <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>

On 11/28/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Neal Norwitz wrote:
> > Hope this helps explain a bit.  Please speak up with how this can be
> > improved.  Gotta run.
>
> I would rewrite it as

[code snipped]

For those watching, Greg's and Martin's versions were almost the same.
However, Greg's version left in the memory leak, while Martin fixed it
by letting the result fall through.  Martin added some helpful rules
about dealing with the memory.  Martin also gets bonus points for
talking about developing a checker. :-)

In both cases, their modified code is similar to the existing AST
code, but all deallocation is done with Py_[X]DECREFs rather than a
type specific deallocator.  Definitely nicer than the current
situation.  It's also the same as the rest of the python code.

With arenas the code would presumably look something like this:

static stmt_ty
ast_for_funcdef(struct compiling *c, const node *n)
{
    /* funcdef: 'def' [decorators] NAME parameters ':' suite */
    identifier name;
    arguments_ty args;
    asdl_seq *body;
    asdl_seq *decorator_seq = NULL;
    int name_i;

    REQ(n, funcdef);

    if (NCH(n) == 6) { /* decorators are present */
        decorator_seq = ast_for_decorators(c, CHILD(n, 0));
        if (!decorator_seq)
            return NULL;
        name_i = 2;
    }
    else {
        name_i = 1;
    }

    name = NEW_IDENTIFIER(CHILD(n, name_i));
    if (!name)
        return NULL;
    Py_AST_Register(name);
    if (!strcmp(STR(CHILD(n, name_i)), "None")) {
        ast_error(CHILD(n, name_i), "assignment to None");
        return NULL;
    }
    args = ast_for_arguments(c, CHILD(n, name_i + 1));
    body = ast_for_suite(c, CHILD(n, name_i + 3));
    if (!args || !body)
        return NULL;

    return FunctionDef(name, args, body, decorator_seq, LINENO(n));
}

All the goto's become return NULLs.  After allocating a PyObject, it
would need to be registered (ie, the mythical Py_AST_Register(name)). 
This is easier than using all PyObjects in that when an error occurs,
there's nothing to think about, just return.  Only optional values
(like decorator_seq) need to be initialized.  It's harder in that one
must remember to register any PyObject so it can be Py_DECREFed at the
end.  Since the arena is allocated in big hunk(s), it would presumably
be faster than using PyObjects since there would be less memory
allocation (and fragmentation).  It should be possible to get rid of
some of the conditionals too (I joined body and args above).
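
Since Py_AST_Register() is mythical, here is an equally hypothetical
sketch of what such an arena could look like. Every name below is
invented, and a real arena would chain further blocks instead of
refusing requests once the first one is full. Raw node memory is bump
allocated from one block, and externally owned objects are recorded
together with a release callback (standing in for Py_DECREF) that runs
when the arena is freed:

```c
#include <stddef.h>
#include <stdlib.h>

#define ARENA_BLOCK_SIZE 4096
#define ARENA_MAX_REGISTERED 64

typedef void (*release_fn)(void *);

/* Hypothetical arena: one block for node memory plus a table of
 * registered objects to release at the end. */
typedef struct {
    char *block;
    size_t used;
    void *objects[ARENA_MAX_REGISTERED];
    release_fn release[ARENA_MAX_REGISTERED];
    int nobjects;
} arena;

static arena *
arena_new(void)
{
    arena *a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    a->block = malloc(ARENA_BLOCK_SIZE);
    if (a->block == NULL) {
        free(a);
        return NULL;
    }
    a->used = 0;
    a->nobjects = 0;
    return a;
}

/* Bump allocation: nothing is ever freed individually. */
static void *
arena_alloc(arena *a, size_t size)
{
    void *p;
    if (size > ARENA_BLOCK_SIZE)
        return NULL;
    size = (size + 15) & ~(size_t)15;   /* keep 16-byte alignment */
    if (size > ARENA_BLOCK_SIZE - a->used)
        return NULL;
    p = a->block + a->used;
    a->used += size;
    return p;
}

/* Remember an object that must be released with the arena.
 * Returns 0 on success, -1 if the table is full. */
static int
arena_register(arena *a, void *object, release_fn release)
{
    if (a->nobjects == ARENA_MAX_REGISTERED)
        return -1;
    a->objects[a->nobjects] = object;
    a->release[a->nobjects] = release;
    a->nobjects++;
    return 0;
}

/* Release every registered object, then drop all node memory at once. */
static void
arena_free(arena *a)
{
    int i;
    if (a == NULL)
        return;
    for (i = 0; i < a->nobjects; i++)
        a->release[i](a->objects[i]);
    free(a->block);
    free(a);
}

/* Demonstration-only release callback that counts its calls. */
static int released_count = 0;

static void
counting_release(void *object)
{
    released_count++;
    free(object);
}
```

With this shape, an error path really is just "return NULL": the caller
that owns the arena frees it once, whether compilation succeeded or not.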

Using all PyObjects has another benefit that may have been mentioned
elsewhere, ie that the rest of Python uses the same techniques for
handling deallocation.

I'm not really advocating any particular approach.  I *think* arenas
would be easiest, but it's not a clear winner.  I think Martin's note
about GCC using GC is interesting.  AFAIK GCC is a lot more complex
than the Python code, so I'm not sure it's 100% relevant.  OTOH, we
need to weigh that experience.

n

From martin at v.loewis.de  Tue Nov 29 08:33:03 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 29 Nov 2005 08:33:03 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org>	
	<dll2v3$78g$1@sea.gmane.org>	
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>	
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>	
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>	
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>	
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>	
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>	
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
Message-ID: <438C042F.2050502@v.loewis.de>

Neal Norwitz wrote:
> For those watching, Greg's and Martin's versions were almost the same.
> However, Greg's version left in the memory leak, while Martin fixed it
> by letting the result fall through.

Actually, Greg said (correctly) that his version also fixes the
leak: he assumed that FunctionDef would *consume* the references
being passed (whether it is successful or not).

I don't think this is a good convention, though.

Regards,
Martin

From ncoghlan at gmail.com  Tue Nov 29 11:48:31 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 29 Nov 2005 20:48:31 +1000
Subject: [Python-Dev] Metaclass problem in the "with" statement
 semantics in PEP 343
In-Reply-To: <ca471dc20511280824y6af50950y93f70f9c19bfe0d9@mail.gmail.com>
References: <438AE97D.2050600@iinet.net.au>
	<ca471dc20511280824y6af50950y93f70f9c19bfe0d9@mail.gmail.com>
Message-ID: <438C31FF.5040302@gmail.com>

Guido van Rossum wrote:
> On 11/28/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
>> I think we need to fix the proposed semantics so that they access the slots
>> via the type, rather than directly through the instance. Otherwise the slots
>> for the with statement will behave strangely when compared to the slots for
>> other magic methods.
> 
> Maybe it's because I'm just an old fart, but I can't make myself care
> about this. The code is broken. You get an error message. It even has
> the correct exception (TypeError). In this particular case the error
> message isn't that great -- well, the same is true in many other cases
> (like whenever the invocation is a method call from Python code).

I'm not particularly worried about the error message - as you say, it even has 
the right type. Or at least one of the two right types ;)

> That most built-in operations produce a different error message
> doesn't mean we have to make *all* built-in operations use the same
> approach. I fail to see the value of the consistency you're calling
> for.

The bit that more concerns me is the behavioural discrepancy that comes from 
having a piece of syntax that looks in the instance dictionary. No other 
Python syntax is affected by the instance attributes - if the object doesn't 
have the right type, you're out of luck.

Sticking an __iter__ method on an instance doesn't turn an object into an 
iterator, but with the current semantics, doing the same thing with 
__context__ *will* give you a manageable context.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From fredrik at pythonware.com  Tue Nov 29 09:29:37 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 29 Nov 2005 09:29:37 +0100
Subject: [Python-Dev] Memory management in the AST parser & compiler
References: <4379AAD7.2050506@iinet.net.au><6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu><e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com><ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com><bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com><13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu><437B2075.1000102@gmail.com>
	<dlf7ak$ckg$1@sea.gmane.org><dll2v3$78g$1@sea.gmane.org><ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
Message-ID: <dmh3hm$rvp$1@sea.gmane.org>

Jeremy Hylton wrote:

> > Me neither. Adding yet another memory allocation scheme to Python's
> > already staggering number of memory allocation strategies sounds like
> > a bad idea.
>
> The reason this thread started was the complaint that reference
> counting in the compiler is really difficult.  Almost every line of
> code can lead to an error exit.  The code becomes quite cluttered when
> it uses reference counting.  Right now, the AST is created with
> malloc/free, but that makes it hard to free the ast at the right time.
>  It would be fairly complex to convert the ast nodes to pyobjects.
> They're just simple discriminated unions right now.  If they were
> allocated from an arena, the entire arena could be freed when the
> compilation pass ends.

if you're using PyObject's for everything, you can use a list object as the
arena.  just append every "transient" value to the arena list, and a single
DECREF will get rid of it all.  if you want to copy something out from the
arena, just INCREF the object and it's yours.

(for performance reasons, it might be a good idea to add a _PyList_APPEND
helper that works like app1 but steals the value reference; e.g.

PyObject*
_PyList_APPEND(PyListObject *self, PyObject *v)
{
    int n;
    if (!v)
        return v;
    n = PyList_GET_SIZE(self);
    if (n == INT_MAX) {
        PyErr_SetString(PyExc_OverflowError,
        "cannot add more objects to list");
        return NULL;
    }
    if (list_resize(self, n+1) == -1)
        return NULL;
    PyList_SET_ITEM(self, n, v);
    return v;
}

which can be called as

    obj = _PyList_APPEND(c->arena, AST_Foobar_New(...));
    if (!obj)
        return NULL;

</F>




From ncoghlan at gmail.com  Tue Nov 29 13:59:52 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 29 Nov 2005 22:59:52 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<dlf7ak$ckg$1@sea.gmane.org>	<dll2v3$78g$1@sea.gmane.org>	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
Message-ID: <438C50C8.9040005@gmail.com>

Neal Norwitz wrote:
> On 11/28/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Neal Norwitz wrote:
>>> Hope this helps explain a bit.  Please speak up with how this can be
>>> improved.  Gotta run.
>> I would rewrite it as
> 
> [code snipped]
> 
> For those watching, Greg's and Martin's version were almost the same. 
> However, Greg's version left in the memory leak, while Martin fixed it
> by letting the result fall through.  Martin added some helpful rules
> about dealing with the memory.  Martin also gets bonus points for
> talking about developing a checker. :-)
> 
> In both cases, their modified code is similar to the existing AST
> code, but all deallocation is done with Py_[X]DECREFs rather than a
> type specific deallocator.  Definitely nicer than the current
> situation.  It's also the same as the rest of the python code.

When working on the CST->AST parser, there were only a few things I found to 
be seriously painful about the memory management:

   1. Remembering which free_* variant to call for AST nodes
   2. Remembering which asdl_seq_*_free variant to call for ASDL sequences (it 
was worse when the variant I wanted didn't exist, since this was done with 
functions rather than preprocessor macros)
   3. Remembering to transpose free_* and *_free between freeing a single node 
and freeing a sequence.
   4. Remembering whether or not a given cleanup function could cope with 
NULL's or not
   5. The fact that there wasn't a consistent "goto error" exception-alike 
mechanism in use

(I had a Spanish Inquisition-esque experience writing that list ;)

Simply switching to PyObjects would solve the first four problems: everything 
becomes a Py_XDECREF.

Declaring that none of the AST node creation methods steal references would be 
consistent with most of the existing C API (e.g. PySequence_SetItem, 
PySequence_Tuple, PySequence_List), and has nice properties if we handle AST 
nodes as borrowed references from a PyList used as the arena, as Fredrik 
suggested.

If the top level function refrains from putting the top level node in the 
arena, then it will all "just work" - any objects will be deleted only if both 
the PyList arena AND the top-level node object are DECREF'ed. The top-level 
function only has to obey two simple rules:
   1. Always DECREF the arena list
   2. On success, INCREF the top-level node BEFORE DECREF'ing the arena list 
(otherwise Step 1 kills everything. . .)
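
The ownership idea behind those two rules can be sketched in pure Python. This is only a model, not the proposed C code: Node, new_node and build are hypothetical names, and the immediate cleanup relies on CPython's reference counting.

```python
import weakref


class Node(object):
    """Hypothetical stand-in for an AST node."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def new_node(arena, node):
    # The arena takes a reference to every node it is handed.
    arena.append(node)
    return node


def build(fail=False):
    arena = []
    leaf = new_node(arena, Node("leaf"))
    scratch = new_node(arena, Node("scratch"))   # never reaches the result
    probes = [weakref.ref(leaf), weakref.ref(scratch)]
    # The top-level node is deliberately kept out of the arena.
    top = None if fail else Node("top", [leaf])
    # Returning drops the arena and the locals (rule 1); returning 'top'
    # is what keeps it, and whatever it references, alive (rule 2).
    return top, probes


top, probes = build()
failed_top, failed_probes = build(fail=True)
```

On success only the nodes reachable from the top-level node survive; on failure nothing does, with no per-node cleanup at the call sites.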

To make the code a little more self-documenting, Fredrik's _PyList_APPEND 
could be called "new_ast_node" and accept the compiling struct directly:

PyObject*
new_ast_node(struct compiling *c, PyObject *ast_node)
{
     int idx;
     if (!ast_node)
         return ast_node;
     idx = PyList_GET_SIZE(c->arena);
     if (idx == INT_MAX) {
         PyErr_SetString(PyExc_OverflowError,
         "cannot add more objects to arena");
         return NULL;
     }
     if (list_resize(c->arena, idx+1) == -1)
         return NULL;
     PyList_SET_ITEM(c->arena, idx, ast_node);
     return ast_node;
}

We'd also need to modify the helper macro for identifiers:

#define NEW_IDENTIFIER(c, n) \
   new_ast_node(c, PyString_InternFromString(STR(n)))

Then the function is only borrowing the arena's reference, and doesn't need to 
decref anything:

static PyObject*
ast_for_funcdef(struct compiling *c, const node *n)
{
      /* funcdef: [decorators] 'def' NAME parameters ':' suite */
      PyObject *name = NULL;
      PyObject *args = NULL;
      PyObject *body = NULL;
      PyObject *decorator_seq = NULL;
      int name_i;

      REQ(n, funcdef);

      if (NCH(n) == 6) { /* decorators are present */
          decorator_seq = ast_for_decorators(c, CHILD(n, 0));
          if (!decorator_seq)
              return NULL;
          name_i = 2;
      }
      else {
          name_i = 1;
      }

      name = NEW_IDENTIFIER(c, CHILD(n, name_i));
      if (!name)
          return NULL;
      else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
          ast_error(CHILD(n, name_i), "assignment to None");
          return NULL;
      }
      args = ast_for_arguments(c, CHILD(n, name_i + 1));
      if (!args)
          return NULL;
      body = ast_for_suite(c, CHILD(n, name_i + 3));
      if (!body)
          return NULL;

      return new_ast_node(c,
        FunctionDef(name, args, body, decorator_seq, LINENO(n)));
}

No need for a checker, because there isn't anything special to do at the call 
sites: each AST node can take care of putting *itself* in the arena.

And as the identifier example shows, this even works for the non-AST leaf 
nodes that are some other kind of PyObject.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From mcherm at mcherm.com  Tue Nov 29 14:28:51 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue, 29 Nov 2005 05:28:51 -0800
Subject: [Python-Dev] Metaclass problem in the "with"
	statement	semantics in PEP 343
Message-ID: <20051129052851.l8aezron2rc0ksck@login.werra.lunarpages.com>

Nick writes:
> I think we need to fix the proposed semantics so that they access the slots
> via the type, rather than directly through the instance. Otherwise the slots
> for the with statement will behave strangely when compared to the slots for
> other magic methods.

Guido writes:
> I can't make myself care
> about this. The code is broken. You get an error message.

Nick writes:
> The bit that more concerns me is the behavioural discrepancy that comes from
> having a piece of syntax that looks in the instance dictionary. No other
> Python syntax is affected by the instance attributes - if the object doesn't
> have the right type, you're out of luck.
>
> Sticking an __iter__ method on an instance doesn't turn an object into an
> iterator, but with the current semantics, doing the same thing with
> __context__ *will* give you a manageable context.

If I'm understanding the situation here correctly, I'd like to chime in
on Nick's side. I'm unconcerned about the bit of code that uses or misuses
Context objects... I'm more concerned about the bit of the manual that
describes (in simple words that "fit your brain") how attribute/method
resolution works in Python.

Right now, we say that there's one rule for all *normal* attributes and
methods, and a slightly different rule for all double-underbar methods.
(I'd summarize the rules here, but they're just sufficiently complex that
I'm sure I'd make a mistake and wind up having people correct my mistake.
Suffice to say that the difference between normal and double-underbar
lookup has to do with checking (or not checking) the instance dictionary.)

With the current state of the code, we'd need to say that there's one
rule for all *normal* attributes and a slightly different rule for all
double-underbar methods except for __context__ which is just like a normal
attribute. That feels too big for my brain -- what on earth is so special
about __context__ that it has to be different from all other
double-underbar methods? If it were __init__ that had to be an exception,
I'd understand, but __context__?

-- Michael Chermside


From guido at python.org  Tue Nov 29 16:15:26 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Nov 2005 07:15:26 -0800
Subject: [Python-Dev] Metaclass problem in the "with" statement
	semantics in PEP 343
In-Reply-To: <438C31FF.5040302@gmail.com>
References: <438AE97D.2050600@iinet.net.au>
	<ca471dc20511280824y6af50950y93f70f9c19bfe0d9@mail.gmail.com>
	<438C31FF.5040302@gmail.com>
Message-ID: <ca471dc20511290715g1740938ch5d02189de8f3c2a9@mail.gmail.com>

On 11/29/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The bit that more concerns me is the behavioural discrepancy that comes from
> having a piece of syntax that looks in the instance dictionary. No other
> Python syntax is affected by the instance attributes - if the object doesn't
> have the right type, you're out of luck.

I'm not sure I buy that. Surely there are plenty of other places that
call PyObject_GetAttr(). Classic classes still let you put an __add__
attribute in the instance dict to make it addable (though admittedly
this is a weak argument since it'll go away in Py3k).

> Sticking an __iter__ method on an instance doesn't turn an object into an
> iterator, but with the current semantics, doing the same thing with
> __context__ *will* give you a manageable context.

This is all a very gray area. Before Python 2.2 most of the built-in
operations *did* call PyObject_GetAttr(). I added the slots mostly as
a speed-up, and the change in semantics was a side-effect of that.

And I'm still not sure why you care -- apart from the error case, it's
not going to affect anybody's code -- you should never use __xyzzy__
names except as documented since their undocumented use can change.
(So yes I'm keeping the door open for turning __context__ into a slot
later.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Nov 29 16:17:56 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Nov 2005 07:17:56 -0800
Subject: [Python-Dev] Metaclass problem in the "with" statement
	semantics in PEP 343
In-Reply-To: <20051129052851.l8aezron2rc0ksck@login.werra.lunarpages.com>
References: <20051129052851.l8aezron2rc0ksck@login.werra.lunarpages.com>
Message-ID: <ca471dc20511290717o568cda19nf4f1899adff2c6f5@mail.gmail.com>

On 11/29/05, Michael Chermside <mcherm at mcherm.com> wrote:
> Right now, we say that there's one rule for all *normal* attributes and
> methods, and a slightly different rule for all double-underbar methods.

But it's not normal vs. __xyzzy__. A specific set of slots (including
next, but excluding things like __doc__) get special treatment. The
rest don't. All I'm saying is that I don't care to give __context__
this special treatment.
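
A small sketch of that distinction, assuming a new-style class (Box is a made-up name): __len__ is backed by a type slot, while __doc__ is looked up like any other attribute.

```python
class Box(object):
    """class docstring"""
    def __len__(self):
        return 1


b = Box()
b.__len__ = lambda: 99              # instance attribute, ignored by len()
b.__doc__ = "instance docstring"    # plain attribute, honoured on lookup

slot_result = len(b)                # goes through the slot on type(b)
explicit_result = b.__len__()       # ordinary lookup finds the instance value
doc_result = b.__doc__              # ordinary lookup again
```

Here len(b) still uses Box.__len__, while b.__len__() and b.__doc__ both see the instance dictionary.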

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Nov 29 16:27:41 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Nov 2005 07:27:41 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438BF5D5.6090000@canterbury.ac.nz>
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9D29.2020403@canterbury.ac.nz>
	<bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com>
	<438BF5D5.6090000@canterbury.ac.nz>
Message-ID: <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com>

On 11/28/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> [...] My mental model
> of parsing & compiling in the presence of a parse tree
> is like this:
>
>    [source] -> scanner -> [tokens]
>      -> parser -> [AST] -> code_generator -> [code]
>
> The fact that there still seems to be another kind of
> parse tree in between the scanner and the AST generator
> is an oddity which I hope will eventually disappear.

Have a look at http://python.org/sf/1337696 -- a reimplementation of
pgen in Python that I did for Elemental and am contributing to the
PSF. It customizes the tree generation callback so as to let you
produce any style of AST you like.

> > I know
> > Guido has said he doesn't like punishing the performance of small
> > scripts in the name of large-scale apps
>
> To me, that's an argument in favour of always generating
> a .pyc, even for scripts.

I'm not sure I follow the connection. But I wouldn't mind if someone
contributed code that did this. :)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From goodger at python.org  Tue Nov 29 15:59:39 2005
From: goodger at python.org (David Goodger)
Date: Tue, 29 Nov 2005 09:59:39 -0500
Subject: [Python-Dev] CVS repository mostly closed now
In-Reply-To: <4388D55B.1070501@v.loewis.de>
References: <4388D55B.1070501@v.loewis.de>
Message-ID: <438C6CDB.7070805@python.org>

You can also remove CVS write privileges from project members.
It's a good way to prevent accidental checkins.

--
David Goodger <http://python.net/~goodger>

From nnorwitz at gmail.com  Tue Nov 29 19:29:07 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 29 Nov 2005 10:29:07 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438C50C8.9040005@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
Message-ID: <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>

On 11/29/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
> When working on the CST->AST parser, there were only a few things I found to
> be seriously painful about the memory management:
>
>    1. Remembering which free_* variant to call for AST nodes
>    2. Remembering which asdl_seq_*_free variant to call for ASDL sequences (it
> was worse when the variant I wanted didn't exist, since this was done with
> functions rather than preprocessor macros)
>    3. Remembering to transpose free_* and *_free between freeing a single node
> and freeing a sequence.
>    4. Remembering whether or not a given cleanup function could cope with
> NULL's or not
>    5. The fact that there wasn't a consistent "goto error" exception-alike
> mechanism in use
>
> (I had a Spanish Inquisition-esque experience writing that list ;)

:-)  I agree all those are existing issues.  #3 could be easily fixed.
 #4 I think all cleanup functions can deal with NULLs now.  #5
probably ought to be fixed in favor of using gotos.

> Simply switching to PyObjects would solve the first four problems: everything
> becomes a Py_XDECREF.

I'm mostly convinced that using PyObjects would be a good thing. 
However, making the change isn't free as all the types need to be
created and this is likely quite a bit of code.  I'd like to hear what
Jeremy thinks about this.

Is anyone interested in creating a patch along these lines (even a
partial patch) to see the benefits?

n

From nnorwitz at gmail.com  Tue Nov 29 19:17:20 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 29 Nov 2005 10:17:20 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438C042F.2050502@v.loewis.de>
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C042F.2050502@v.loewis.de>
Message-ID: <ee2a432c0511291017r6c99c115y722e67fbea7e5cee@mail.gmail.com>

On 11/28/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Neal Norwitz wrote:
> > For those watching, Greg's and Martin's version were almost the same.
> > However, Greg's version left in the memory leak, while Martin fixed it
> > by letting the result fall through.
>
> Actually, Greg said (correctly) that his version also fixes the
> leak: he assumed that FunctionDef would *consume* the references
> being passed (whether it is successful or not).

Ah right, I forgot about that.  Thanks for correcting me (sorry Greg).
 Jeremy and I had talked about this before.  I keep resisting this
solution, though I'm not sure why.

n

From edloper at gradient.cis.upenn.edu  Tue Nov 29 21:27:51 2005
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Tue, 29 Nov 2005 15:27:51 -0500
Subject: [Python-Dev] Metaclass problem in the "with" statement
	semantics in PEP 343
In-Reply-To: <mailman.8173.1133288956.18700.python-dev@python.org>
References: <mailman.8173.1133288956.18700.python-dev@python.org>
Message-ID: <f6c9c084e0f75619d461c871954c3900@gradient.cis.upenn.edu>

Michael Chermside wrote:
>> Right now, we say that there's one rule for all *normal* attributes 
>> and
>> methods, and a slightly different rule for all double-underbar 
>> methods.

Guido responded:
> But it's not normal vs. __xyzzy__. A specific set of slots (including
> next, but excluding things like __doc__) get special treatment. The
> rest don't. All I'm saying is that I don't care to give __context__
> this special treatment.

Perhaps we should officially document that the effect on special 
methods of overriding a class attribute with an instance attribute is 
undefined, for some given set of attributes?  (I would say all 
double-underbar methods, but it sounds like the list needs to also 
include next().)

Otherwise, it seems like people might write code that relies on the 
current behavior, which will then break if we eg turn __context__ into 
a slot.  (It sounds like you want to reserve the right to change this.) 
  Well, of course, people may rely on the current behavior anyway, but 
at least they'll have been warned. :)

-Edward


From bcannon at gmail.com  Tue Nov 29 23:03:00 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 29 Nov 2005 14:03:00 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9D29.2020403@canterbury.ac.nz>
	<bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com>
	<438BF5D5.6090000@canterbury.ac.nz>
	<ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com>
Message-ID: <bbaeab100511291403t6402c613j96c6fb283fb1368@mail.gmail.com>

On 11/29/05, Guido van Rossum <guido at python.org> wrote:
> On 11/28/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > [...] My mental model
> > of parsing & compiling in the presence of a parse tree
> > is like this:
> >
> >    [source] -> scanner -> [tokens]
> >      -> parser -> [AST] -> code_generator -> [code]
> >
> > The fact that there still seems to be another kind of
> > parse tree in between the scanner and the AST generator
> > is an oddity which I hope will eventually disappear.
>
> Have a look at http://python.org/sf/1337696 -- a reimplementation of
> pgen in Python that I did for Elemental and am contributing to the
> PSF. It customizes the tree generation callback so as to let you
> produce any style of AST you like.
>
> > > I know
> > > Guido has said he doesn't like punishing the performance of small
> > > scripts in the name of large-scale apps
> >
> > To me, that's an argument in favour of always generating
> > a .pyc, even for scripts.
>
> I'm not sure I follow the connection.

Greg was proposing having parser, AST, and bytecode compilation all be
written in Python and frozen into the executable instead of it being
all C code.  I said that would be slower and would punish single file
scripts that don't get a .pyc generated for them because they would
need to have the file compiled every execution.  Greg said that is
just a good argument for  having *any* file, imported or passed in on
the command line, to have a .pyc generated when possible.

> But I wouldn't mind if someone
> contributed code that did this. :)
>

=)  Shouldn't be that complicated (but I don't have time for it right
now so it isn't dead simple either  =).

-Brett

From bcannon at gmail.com  Tue Nov 29 23:05:21 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 29 Nov 2005 14:05:21 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
Message-ID: <bbaeab100511291405v4061a5ben5ac014e1178f336b@mail.gmail.com>

On 11/29/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 11/29/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> >
> > When working on the CST->AST parser, there were only a few things I found to
> > be seriously painful about the memory management:
> >
> >    1. Remembering which free_* variant to call for AST nodes
> >    2. Remembering which asdl_seq_*_free variant to call for ASDL sequences (it
> > was worse when the variant I wanted didn't exist, since this was done with
> > functions rather than preprocessor macros)
> >    3. Remembering to transpose free_* and *_free between freeing a single node
> > and freeing a sequence.
> >    4. Remembering whether or not a given cleanup function could cope with
> > NULL's or not
> >    5. The fact that there wasn't a consistent "goto error" exception-alike
> > mechanism in use
> >
> > (I had a Spanish Inquisition-esque experience writing that list ;)
>
> :-)  I agree all those are existing issues.  #3 could be easily fixed.
>  #4 I think all cleanup functions can deal with NULLs now.  #5
> probably ought to be fixed in favor of using gotos.
>
> > Simply switching to PyObjects would solve the first four problems: everything
> > becomes a Py_XDECREF.
>
> I'm mostly convinced that using PyObjects would be a good thing.
> However, making the change isn't free as all the types need to be
> created and this is likely quite a bit of code.  I'd like to hear what
> Jeremy thinks about this.
>
> Is anyone interested in creating a patch along these lines (even a
> partial patch) to see the benefits?
>

Or should perhaps a branch be made since Subversion makes it so cheap
and this allows multiple people to work on it?

-Brett

From greg.ewing at canterbury.ac.nz  Tue Nov 29 23:15:16 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 Nov 2005 11:15:16 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
Message-ID: <438CD2F4.4090702@canterbury.ac.nz>

Neal Norwitz wrote:

> For those watching, Greg's and Martin's version were almost the same. 
> However, Greg's version left in the memory leak, while Martin fixed it
> by letting the result fall through.

I addressed the memory leak by stipulating that FunctionDef
should steal references to its arguments (whether it
succeeds or not).

However, while that trick works in this particular case, it
wouldn't be so helpful in more complicated situations, so
Martin's version is probably a better model to follow.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Nov 29 23:32:06 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 Nov 2005 11:32:06 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438C50C8.9040005@gmail.com>
References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org>
	<dll2v3$78g$1@sea.gmane.org>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
Message-ID: <438CD6E6.4030504@canterbury.ac.nz>

Nick Coghlan wrote:

> Declaring that none of the AST node creation methods steal references would be 
> consistent with most of the existing C API (e.g. PySequence_SetItem, 
> PySequence_Tuple, PySequence_List),

Agreed, although the rest of your proposal (while
admirably cunning) requires that ast-building functions
effectively return borrowed references, which is not
usual.

That's not to say it shouldn't be done, but it does
differ from the usual conventions, and that would need
to be kept in mind.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Nov 29 23:43:45 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 Nov 2005 11:43:45 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9D29.2020403@canterbury.ac.nz>
	<bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com>
	<438BF5D5.6090000@canterbury.ac.nz>
	<ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com>
Message-ID: <438CD9A1.4050202@canterbury.ac.nz>

Guido van Rossum wrote:

>>To me, that's an argument in favour of always generating
>>a .pyc, even for scripts.
> 
> I'm not sure I follow the connection.

You were saying that if the parser and compiler were
slow, it would slow down single-file scripts that
didn't have a .pyc (or at least that's what I thought
you were saying). If a .pyc were always generated,
this problem would not arise.
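
As a rough illustration only, this is what caching the bytecode of a standalone script could look like from user code, using the standard py_compile module. The file names are made up, and this is not a claim about how the interpreter itself would do it.

```python
import os
import py_compile
import tempfile

# Create a throwaway single-file script (hypothetical contents).
workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "script.py")
with open(script, "w") as f:
    f.write("print('hello')\n")

# Compile it explicitly and store the bytecode next to the source,
# so a later run would not have to parse and compile again.
cached = py_compile.compile(script, cfile=script + "c", doraise=True)
pyc_exists = os.path.exists(cached)
```

The cfile argument is given explicitly here so that the .pyc lands beside the script rather than in a cache directory.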

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Nov 29 23:49:00 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 Nov 2005 11:49:00 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511291017r6c99c115y722e67fbea7e5cee@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C042F.2050502@v.loewis.de>
	<ee2a432c0511291017r6c99c115y722e67fbea7e5cee@mail.gmail.com>
Message-ID: <438CDADC.1090806@canterbury.ac.nz>

Neal Norwitz wrote:
> On 11/28/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> > he assumed that FunctionDef would *consume* the references
> > being passed (whether it is successful or not).
> 
> I keep resisting this solution, though I'm not sure why.

One reason for not liking it is that it only works well
when you only call one such function from a given function.
If there are two, you have to worry about not reaching the
second one due to the first one failing, in which case
you need to decref the second one's args yourself.

In the long run it's probably best to stick to the
conventional conventions, which are there for a reason --
they work!

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Nov 29 23:52:21 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 Nov 2005 11:52:21 +1300
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
Message-ID: <438CDBA5.9050207@canterbury.ac.nz>

Neal Norwitz wrote:

> I'm mostly convinced that using PyObjects would be a good thing. 
> However, making the change isn't free as all the types need to be
> created and this is likely quite a bit of code.

Since they're all so similar, perhaps they could be
auto-generated by a fairly simple script?

(I'm being very careful not to suggest using Pyrex
for this, as I can appreciate the desire not to make
such a fundamental part of the core dependent on it!)

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From vinay_sajip at yahoo.co.uk  Tue Nov 29 23:49:48 2005
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Tue, 29 Nov 2005 22:49:48 +0000 (UTC)
Subject: [Python-Dev] Proposed additional keyword argument in logging calls
References: <001a01c5ef77$d7682300$0200a8c0@alpha>
	<ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com>
Message-ID: <loom.20051129T234808-309@post.gmane.org>

Guido van Rossum <guido <at> python.org> writes:
> This looks like a good clean solution to me. I agree with Paul Moore's
> suggestion that if extra_info is not None you should just go ahead and
> use it as a dict and let the errors propagate.

OK.

> What's the rationale for not letting it override existing fields?
> (There may be a good one, I just don't see it without turning on my
> thinking cap, which would cost extra.)

The existing fields which could be overwritten are ones which have been computed
by the logging package itself:

name            Name of the logger
levelno         Numeric logging level for the message (DEBUG, INFO,
                WARNING, ERROR, CRITICAL)
levelname       Text logging level for the message ("DEBUG", "INFO",
                "WARNING", "ERROR", "CRITICAL")
msg             The message passed in the logging call
args            The additional args passed in the logging call
exc_info        Exception information (from sys.exc_info())
exc_text        Exception text (cached for use by multiple handlers)
pathname        Full pathname of the source file where the logging call
                was issued (if available)
filename        Filename portion of pathname
module          Module (name portion of filename)
lineno          Source line number where the logging call was issued
                (if available)
created         Time when the LogRecord was created (time.time()
                return value)
msecs           Millisecond portion of the creation time
relativeCreated Time in milliseconds when the LogRecord was created,
                relative to the time the logging module was loaded
                (typically at application startup time)
thread          Thread ID (if available)
process         Process ID (if available)
message         The result of record.getMessage(), computed just as
                the record is emitted

I couldn't think of a good reason why it should be possible to overwrite these
values with values from a user-supplied dictionary, other than to spoof log
entries in some way. The intention is to stop a user accidentally overwriting
one of the above attributes.

But thinking about "Errors should never pass silently", I propose that an
exception (KeyError seems most appropriate, though here it would be because a
key was present rather than absent) be thrown if one of the above attribute
names is supplied as a key in the user-supplied dict.
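
A minimal sketch of that check, assuming a hypothetical helper
merge_extra() and a plain dict standing in for the LogRecord's
attributes (neither name is part of the logging package; only the
reserved names come from the list above):

```python
# Hypothetical sketch of the proposed check, not the logging package's code.
RESERVED = frozenset([
    "name", "levelno", "levelname", "msg", "args", "exc_info", "exc_text",
    "pathname", "filename", "module", "lineno", "created", "msecs",
    "relativeCreated", "thread", "process", "message",
])


def merge_extra(record_attrs, extra=None):
    """Copy user-supplied keys into record_attrs, refusing reserved names."""
    if extra is not None:
        # Use extra as a dict and let any other errors propagate.
        for key in extra:
            if key in RESERVED:
                # KeyError because the key is present, not because it is absent.
                raise KeyError("extra may not overwrite %r" % (key,))
            record_attrs[key] = extra[key]
    return record_attrs
```

With this, a key such as 'clientip' is simply added, while a key such as
'lineno' is rejected loudly instead of silently replacing a computed value.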

> Perhaps it makes sense to call it 'extra' instead of 'extra_info'?

Fine - 'extra' it will be.

> As a new feature it should definitely not go into 2.4; but I don't see
> how it could break existing code.
>

OK - thanks for the feedback.

Regards,

Vinay Sajip



From skip at pobox.com  Wed Nov 30 00:53:47 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 29 Nov 2005 17:53:47 -0600
Subject: [Python-Dev] Proposed additional keyword argument in logging calls
In-Reply-To: <loom.20051129T234808-309@post.gmane.org>
References: <001a01c5ef77$d7682300$0200a8c0@alpha>
	<ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com>
	<loom.20051129T234808-309@post.gmane.org>
Message-ID: <17292.59915.267228.293830@montanaro.dyndns.org>


    Vinay> I couldn't think of a good reason why it should be possible to
    Vinay> overwrite these values with values from a user-supplied
    Vinay> dictionary, other than to spoof log entries in some way. 

If the user doesn't need those values and can provide cheap substitutes,
perhaps their computation can be avoided.  I did that recently by inlining
only the parts of logging.LogRecord.__init__ in a subclass and avoided
calling logging.LogRecord.__init__ altogether.  It generated lots of
instance variables we never use and just slowed things down.

Skip

From guido at python.org  Wed Nov 30 05:19:20 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Nov 2005 20:19:20 -0800
Subject: [Python-Dev] Proposed additional keyword argument in logging
	calls
In-Reply-To: <loom.20051129T234808-309@post.gmane.org>
References: <001a01c5ef77$d7682300$0200a8c0@alpha>
	<ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com>
	<loom.20051129T234808-309@post.gmane.org>
Message-ID: <ca471dc20511292019o39a863e9p2a4e030ee3eb6ee8@mail.gmail.com>

On 11/29/05, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:
> But thinking about "Errors should never pass silently", I propose that an
> exception (KeyError seems most appropriate, though here it would be because a
> key was present rather than absent) be thrown if one of the above attribute
> names is supplied as a key in the user-supplied dict.

+1

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Wed Nov 30 10:42:20 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 30 Nov 2005 19:42:20 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438CDBA5.9050207@canterbury.ac.nz>
References: <4379AAD7.2050506@iinet.net.au>	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>	<438B9F12.3060607@v.loewis.de>	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>	<438C50C8.9040005@gmail.com>	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
	<438CDBA5.9050207@canterbury.ac.nz>
Message-ID: <438D73FC.4090009@gmail.com>

Greg Ewing wrote:
> Neal Norwitz wrote:
> 
>> I'm mostly convinced that using PyObjects would be a good thing. 
>> However, making the change isn't free as all the types need to be
>> created and this is likely quite a bit of code.
> 
> Since they're all so similar, perhaps they could be
> auto-generated by a fairly simple script?
> 
> (I'm being very careful not to suggest using Pyrex
> for this, as I can appreciate the desire not to make
> such a fundamental part of the core dependent on it!)

The ast C structs are already auto-generated by a Python script (asdl_c.py, to 
be precise). The trick is to make that script generate full PyObjects rather 
than the simple C structures that it generates now.

I believe Jeremy wrote that early in the life of the AST branch, so it's worth 
waiting for his advice on how to go about modifying it.

asdl_seq can disappear entirely: we can just use a PyList instead.

The second step is to then modify ast.c to use the new structures. A branch 
probably wouldn't help much with initial development (this is a "break the 
world, check in when stuff compiles again" kind of change, which is hard to 
split amongst multiple people), but I think it would be of benefit when 
reviewing the change before moving it back to the trunk.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Wed Nov 30 10:51:26 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 30 Nov 2005 19:51:26 +1000
Subject: [Python-Dev] Metaclass problem in the "with"
 statement	semantics in PEP 343
In-Reply-To: <f6c9c084e0f75619d461c871954c3900@gradient.cis.upenn.edu>
References: <mailman.8173.1133288956.18700.python-dev@python.org>
	<f6c9c084e0f75619d461c871954c3900@gradient.cis.upenn.edu>
Message-ID: <438D761E.2040602@gmail.com>

Edward Loper wrote:
> Otherwise, it seems like people might write code that relies on the 
> current behavior, which will then break if we eg turn __context__ into 
> a slot.  (It sounds like you want to reserve the right to change this.) 
>   Well, of course, people may rely on the current behavior anyway, but 
> at least they'll have been warned. :)

Yep - I thought "the instance dictionary has no effect" was an actual rule, 
but it turns out the rules are slightly looser than that (specifically, the 
effect of having a slot name in the instance dictionary is 
undefined).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From mwh at python.net  Wed Nov 30 11:02:05 2005
From: mwh at python.net (Michael Hudson)
Date: Wed, 30 Nov 2005 10:02:05 +0000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438CD9A1.4050202@canterbury.ac.nz> (Greg Ewing's message of
	"Wed, 30 Nov 2005 11:43:45 +1300")
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9D29.2020403@canterbury.ac.nz>
	<bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com>
	<438BF5D5.6090000@canterbury.ac.nz>
	<ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com>
	<438CD9A1.4050202@canterbury.ac.nz>
Message-ID: <2m1x0ycv36.fsf@starship.python.net>

Greg Ewing <greg.ewing at canterbury.ac.nz> writes:

> Guido van Rossum wrote:
>
>>>To me, that's an argument in favour of always generating
>>>a .pyc, even for scripts.
>> 
>> I'm not sure I follow the connection.
>
> You were saying that if the parser and compiler were
> slow, it would slow down single-file scripts that
> didn't have a .pyc (or at least that's what I thought
> you were saying). If a .pyc were always generated,
> this problem would not arise.

Well, the current stdlib compiler is unacceptably slow, no question.
I don't want "make install" to take as long as "regrtest -u all
test_compiler", or make test to take nearly that long in all cases.

Cheers,
mwh

-- 
58. Fools ignore complexity. Pragmatists suffer it. Some can avoid
    it. Geniuses remove it.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html

From krumms at gmail.com  Wed Nov 30 12:58:40 2005
From: krumms at gmail.com (Thomas Lee)
Date: Wed, 30 Nov 2005 21:58:40 +1000
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438D73FC.4090009@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>	<438B9F12.3060607@v.loewis.de>	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>	<438C50C8.9040005@gmail.com>	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>	<438CDBA5.9050207@canterbury.ac.nz>
	<438D73FC.4090009@gmail.com>
Message-ID: <438D93F0.3000005@gmail.com>

Nick Coghlan wrote:

>Greg Ewing wrote:
>
>>Neal Norwitz wrote:
>>
>>>I'm mostly convinced that using PyObjects would be a good thing. 
>>>However, making the change isn't free as all the types need to be
>>>created and this is likely quite a bit of code.
>>
>>Since they're all so similar, perhaps they could be
>>auto-generated by a fairly simple script?
>>
>>(I'm being very careful not to suggest using Pyrex
>>for this, as I can appreciate the desire not to make
>>such a fundamental part of the core dependent on it!)
>
>The ast C structs are already auto-generated by a Python script (asdl_c.py, to 
>be precise). The trick is to make that script generate full PyObjects rather 
>than the simple C structures that it generates now.

I was actually trying this approach last night. I'm back to it this 
evening, working with the ast-objects branch. I'll push a patch tonight 
with whatever I get done.

Quick semi-related question: where are the marshal_* functions called? 
They're all static in Python-ast.c and don't seem to be actually called 
anywhere. Can we ditch them?

>The second step is to then modify ast.c to use the new structures. A branch 
>probably wouldn't help much with initial development (this is a "break the 
>world, check in when stuff compiles again" kind of change, which is hard to 
>split amongst multiple people), but I think it would be of benefit when 
>reviewing the change before moving it back to the trunk.

Based on my (limited) experience and your approach, compile.c may also 
need to be modified a little (this should be pretty trivial).

Cheers,
Tom

From amk at amk.ca  Wed Nov 30 14:52:18 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Wed, 30 Nov 2005 08:52:18 -0500
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438D73FC.4090009@gmail.com>
References: <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
	<438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com>
Message-ID: <20051130135218.GA23728@rogue.amk.ca>

On Wed, Nov 30, 2005 at 07:42:20PM +1000, Nick Coghlan wrote:
> The second step is to then modify ast.c to use the new structures. A branch 
> probably wouldn't help much with initial development (this is a "break the 
> world, check in when stuff compiles again" kind of change, which is hard to 
> split amongst multiple people), ...

There is a bug day scheduled for this Sunday, so maybe the AST
developers could meet to coordinate this change.

--amk


From theller at python.net  Wed Nov 30 10:04:07 2005
From: theller at python.net (Thomas Heller)
Date: Wed, 30 Nov 2005 10:04:07 +0100
Subject: [Python-Dev] Proposed additional keyword argument in logging calls
References: <001a01c5ef77$d7682300$0200a8c0@alpha>
	<ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com>
	<loom.20051129T234808-309@post.gmane.org>
Message-ID: <64qaik1k.fsf@python.net>

Vinay Sajip <vinay_sajip at yahoo.co.uk> writes:

> The existing fields which could be overwritten are ones which have been computed
> by the logging package itself:
>
> name            Name of the logger
> levelno         Numeric logging level for the message (DEBUG, INFO,
>                 WARNING, ERROR, CRITICAL)
[and so on].

Shouldn't this list be documented?  Or is it?

Thomas


From jimjjewett at gmail.com  Wed Nov 30 18:39:58 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 30 Nov 2005 12:39:58 -0500
Subject: [Python-Dev] Proposed additional keyword argument in logging calls
Message-ID: <fb6fbf560511300939i2e31eb16la2a3fb15bb688053@mail.gmail.com>

> I couldn't think of a good reason why it should be possible to overwrite these
> values with values from a user-supplied dictionary, other than to spoof log
> entries in some way. The intention is to stop a user accidentally overwriting
> one of the above attributes.

This makes sense, but is it worth the time to check on each logging call?

-jJ

From nnorwitz at gmail.com  Wed Nov 30 19:21:27 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 30 Nov 2005 10:21:27 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <438D93F0.3000005@gmail.com>
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
	<438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com>
	<438D93F0.3000005@gmail.com>
Message-ID: <ee2a432c0511301021m2e72d710r173f085b84cc2f4@mail.gmail.com>

On 11/30/05, Thomas Lee <krumms at gmail.com> wrote:
>
> Quick semi-related question: where are the marshal_* functions called?
> They're all static in Python-ast.c and don't seem to be actually called
> anywhere. Can we ditch them?

I *think* they are not necessary.  My guess is that they were there
for marshalling the AST to disk, though I'm not sure why we would want
to do that.  It could also be that they were meant as the way the AST
would be marshalled to PyObjects and exported.

Unless you hear otherwise from Jeremy, I would probably remove them.

I can check your patch into the branch so others can get an idea and
hopefully provide comments.

n

From nas at arctrix.com  Wed Nov 30 18:24:49 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 30 Nov 2005 17:24:49 +0000 (UTC)
Subject: [Python-Dev] Memory management in the AST parser & compiler
References: <4379AAD7.2050506@iinet.net.au>
	<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com>
	<e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com>
	<ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com>
	<e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com>
	<ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
	<438CDBA5.9050207@canterbury.ac.nz>
	<438D73FC.4090009@gmail.com> <438D93F0.3000005@gmail.com>
Message-ID: <dmkn90$27q$1@sea.gmane.org>

Thomas Lee <krumms at gmail.com> wrote:
> Quick semi-related question: where are the marshal_* functions called? 
> They're all static in Python-ast.c and don't seem to be actually called 
> anywhere. Can we ditch them?

They are intended to be used to make the AST available to Python
code.  It would be nice if they could be retained but nothing will
break (AFAIK) if they are ditched.

  Neil


From mfb at lotusland.dyndns.org  Wed Nov 30 19:40:26 2005
From: mfb at lotusland.dyndns.org (Matthew F. Barnes)
Date: Wed, 30 Nov 2005 12:40:26 -0600
Subject: [Python-Dev] Short-circuiting iterators
Message-ID: <1133376026.19766.31.camel@localhost.localdomain>

Hello,

I've not had much luck in searching for a discussion on this in the
Python-Dev archives, so bear with me.

I had an idea this morning for a simple extension to Python's iterator
protocol that would allow the user to force an iterator to raise
StopIteration on the next call to next().  My thought was to add a new
method to iterators called stop().

In my situation it would be useful as a control-flow mechanism, but I
imagine there are many other use cases for it:


    generator = some_generator_function()

    for x in generator:
        ... deeply ...
            ... nested ...
                ... control-flow ...

                    if satisfaction_condition:
                        # Terminates the for-loop, but
                        # finishes the current iteration
                        generator.stop()

        ... more stuff ...


I'm curious if anything like this has been proposed in the past.  If so,
could someone kindly point me to any relevant mailing list threads?

Matthew Barnes

From nnorwitz at gmail.com  Wed Nov 30 19:54:45 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 30 Nov 2005 10:54:45 -0800
Subject: [Python-Dev] Memory management in the AST parser & compiler
In-Reply-To: <dmkn90$27q$1@sea.gmane.org>
References: <4379AAD7.2050506@iinet.net.au>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
	<438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com>
	<438D93F0.3000005@gmail.com> <dmkn90$27q$1@sea.gmane.org>
Message-ID: <ee2a432c0511301054u7bae50f9i22c2e2749b2f9969@mail.gmail.com>

On 11/30/05, Neil Schemenauer <nas at arctrix.com> wrote:
> Thomas Lee <krumms at gmail.com> wrote:
> > Quick semi-related question: where are the marshal_* functions called?
> > They're all static in Python-ast.c and don't seem to be actually called
> > anywhere. Can we ditch them?
>
> They are intended to be used to make the AST available to Python
> code.  It would be nice if they could be retained but nothing will
> break (AFAIK) if they are ditched.

If everything is a PyObject, wouldn't they be redundant?

n

From aleaxit at gmail.com  Wed Nov 30 19:57:51 2005
From: aleaxit at gmail.com (Alex Martelli)
Date: Wed, 30 Nov 2005 10:57:51 -0800
Subject: [Python-Dev] Short-circuiting iterators
In-Reply-To: <1133376026.19766.31.camel@localhost.localdomain>
References: <1133376026.19766.31.camel@localhost.localdomain>
Message-ID: <e8a0972d0511301057i35e1a4cei42ba02529859ceb2@mail.gmail.com>

On 11/30/05, Matthew F. Barnes <mfb at lotusland.dyndns.org> wrote:
   ...
> I'm curious if anything like this has been proposed in the past.  If so,
> could someone kindly point me to any relevant mailing list threads?

PEP 342, already accepted and found at
http://python.org/peps/pep-0342.html , covers related functionality
(as well as many other points).


Alex

From mfb at lotusland.dyndns.org  Wed Nov 30 20:16:25 2005
From: mfb at lotusland.dyndns.org (Matthew F. Barnes)
Date: Wed, 30 Nov 2005 13:16:25 -0600
Subject: [Python-Dev] Short-circuiting iterators
In-Reply-To: <e8a0972d0511301057i35e1a4cei42ba02529859ceb2@mail.gmail.com>
References: <1133376026.19766.31.camel@localhost.localdomain>
	<e8a0972d0511301057i35e1a4cei42ba02529859ceb2@mail.gmail.com>
Message-ID: <1133378185.19766.39.camel@localhost.localdomain>

On Wed, 2005-11-30 at 10:57 -0800, Alex Martelli wrote:
> PEP 342, already accepted and found at
> http://python.org/peps/pep-0342.html , covers related functionality
> (as well as many other points).

Thanks Alex, I'll take another look at that PEP.  The first time I tried
to read it my brain started to sizzle.

I happened to use a generator-iterator in my example, but my thought was
that the extension could be applied to iterators in general, including
sequence-iterators.

Matthew Barnes

From edloper at gradient.cis.upenn.edu  Wed Nov 30 20:36:54 2005
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Wed, 30 Nov 2005 14:36:54 -0500
Subject: [Python-Dev] Short-circuiting iterators
In-Reply-To: <mailman.8427.1133378193.18700.python-dev@python.org>
References: <mailman.8427.1133378193.18700.python-dev@python.org>
Message-ID: <4a6398cac1420f3b957ec1fd449e439f@gradient.cis.upenn.edu>

> I had an idea this morning for a simple extension to Python's iterator
> protocol that would allow the user to force an iterator to raise
> StopIteration on the next call to next().  My thought was to add a new
> method to iterators called stop().

There's no need to change the iterator protocol for your example use 
case; you could just define a simple iterator-wrapper:

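class InterruptableIterator:
     stopped = False
     def __init__(self, iterable):
         # Wrap an existing iterable; iter() also accepts generators.
         self.iter = iter(iterable)
     def __iter__(self):
         # Needed so the wrapper itself can be used in a for-loop.
         return self
     def next(self):
         if self.stopped:
             raise StopIteration('iterator stopped.')
         return self.iter.next()
     def stop(self):
         # Takes effect on the following call to next().
         self.stopped = True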

And then just replace:
>     generator = some_generator_function()
with:
     generator = InterruptableIterator(some_generator_function())
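
For readers on a current interpreter, here is a sketch of the same idea
in Python 3 spelling (__next__ instead of next). It is an adaptation
under that assumption, not the code as posted; demo() and its names are
hypothetical and only illustrate that the iteration in progress still
finishes after stop() is called:

```python
class InterruptableIterator:
    """Wrap any iterable; after stop(), the next fetch raises StopIteration."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._stopped = False

    def __iter__(self):
        # Lets the wrapper be used directly in a for-loop.
        return self

    def __next__(self):
        if self._stopped:
            raise StopIteration
        return next(self._it)

    def stop(self):
        # Does not interrupt the current loop body, only the next fetch.
        self._stopped = True


def demo():
    """Hypothetical usage: stop from inside the loop body."""
    seen = []
    numbers = InterruptableIterator(range(10))
    for x in numbers:
        if x == 3:
            numbers.stop()   # the loop ends once this pass completes
        seen.append(x)       # still runs for x == 3
    return seen
```

Since the wrapper only relies on iter() and next(), it works for
sequence-iterators as well as generator-iterators.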

-Edward


From reinhold-birkenfeld-nospam at wolke7.net  Wed Nov 30 20:51:09 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Wed, 30 Nov 2005 20:51:09 +0100
Subject: [Python-Dev] something is wrong with test___all__
In-Reply-To: <ca471dc20511281216p36548ba6l6779da343d14e805@mail.gmail.com>
References: <dm03lq$41u$1@sea.gmane.org>
	<ca471dc20511281216p36548ba6l6779da343d14e805@mail.gmail.com>
Message-ID: <dmkvre$5jc$1@sea.gmane.org>

Guido van Rossum wrote:
> Has this been handled yet? If not, perhaps showing the good and bad
> bytecode here would help trigger someone's brain into understanding
> the problem.

I've created a tracker item at www.python.org/sf/1370322.

Reinhold

-- 
Mail address is perfectly valid!


From mfb at lotusland.dyndns.org  Wed Nov 30 20:52:03 2005
From: mfb at lotusland.dyndns.org (Matthew F. Barnes)
Date: Wed, 30 Nov 2005 13:52:03 -0600
Subject: [Python-Dev] Short-circuiting iterators
In-Reply-To: <4a6398cac1420f3b957ec1fd449e439f@gradient.cis.upenn.edu>
References: <mailman.8427.1133378193.18700.python-dev@python.org>
	<4a6398cac1420f3b957ec1fd449e439f@gradient.cis.upenn.edu>
Message-ID: <1133380323.19766.45.camel@localhost.localdomain>

On Wed, 2005-11-30 at 14:36 -0500, Edward Loper wrote:
> There's no need to change the iterator protocol for your example use 
> case; you could just define a simple iterator-wrapper:

Good point.  Perhaps it would be a useful addition to the itertools
module then?

        itertools.interruptable(iterable)

Matthew Barnes

From nas at arctrix.com  Wed Nov 30 20:49:53 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 30 Nov 2005 19:49:53 +0000 (UTC)
Subject: [Python-Dev] Memory management in the AST parser & compiler
References: <4379AAD7.2050506@iinet.net.au>
	<ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com>
	<438B9F12.3060607@v.loewis.de>
	<ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com>
	<438C50C8.9040005@gmail.com>
	<ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com>
	<438CDBA5.9050207@canterbury.ac.nz>
	<438D73FC.4090009@gmail.com> <438D93F0.3000005@gmail.com>
	<dmkn90$27q$1@sea.gmane.org>
	<ee2a432c0511301054u7bae50f9i22c2e2749b2f9969@mail.gmail.com>
Message-ID: <dmkvp1$4f3$1@sea.gmane.org>

Neal Norwitz <nnorwitz at gmail.com> wrote:
> If everything is a PyObject, wouldn't [the marshal functions] be
> redundant?

You could be right.  Spending time to keep them working is probably
wasted effort.

  Neil


From barry at python.org  Wed Nov 30 22:24:01 2005
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Nov 2005 16:24:01 -0500
Subject: [Python-Dev] Standalone email package in the sandbox
Message-ID: <1133385841.23988.10.camel@geddy.wooz.org>

Unless there are any objections, I'd like to create a space in the
sandbox for the standalone email package miscellany.  This currently
lives in the mimelib project's hidden CVS on SF, but that seems pretty
silly.  

Basically I'm just going to add the test script, setup.py, generated
html docs and a few additional unit tests, along with svn:external refs
to pull in Lib/email from the appropriate Python svn tree.  This way,
I'll be able to create standalone email packages from the sandbox (which
I need to do because I plan on fixing a few outstanding email bugs).

-Barry
