From guido at python.org Tue Nov 1 00:36:09 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 31 Oct 2005 16:36:09 -0700 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <43654858.9020108@v.loewis.de> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> <17252.50390.256221.4882@montanaro.dyndns.org> <17252.59653.792906.582288@montanaro.dyndns.org> <43654858.9020108@v.loewis.de> Message-ID: <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com> Help! What's the magic to get $Revision$ and $Date$ to be expanded upon checkin? Comparing pep-0352.txt and pep-0343.txt, I noticed that the latter has the svn revision and date in the headers, while the former still has Brett's original revision 1.5 and a date somewhere in June. I tried to fix this by rewriting the fields as $Revision$ and $Date$ but that doesn't seem to make a difference. Googling for this is a bit tricky because Google collapses $Revision and Revision, which makes any query for svn and $Revision rather non-specific. :-( It's also not yet in our Wiki. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Tue Nov 1 00:48:44 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Oct 2005 18:48:44 -0500 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> <17252.50390.256221.4882@montanaro.dyndns.org> <17252.59653.792906.582288@montanaro.dyndns.org> <43654858.9020108@v.loewis.de> <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com> Message-ID: <1f7befae0510311548v34da0695jc38e0a5c831256c8@mail.gmail.com> [Guido] > Help! > > What's the magic to get $Revision$ and $Date$ to be expanded upon > checkin? Comparing pep-0352.txt and pep-0343.txt, I noticed that the > latter has the svn revision and date in the headers, while the former > still has Brett's original revision 1.5 and a date somewhere in June. > I tried to fix this by rewriting the fields as $Revision$ and $Date$ > but that doesn't seem to make a difference. > > Googling for this is a bit tricky because Google collapses $Revision > and Revision, which makes any query for svn and $Revision rather > non-specific. :-( It's also not yet in our Wiki. You have to set the `svn:keywords` property on each file for which you want these kinds of expansions: http://svnbook.red-bean.com/en/1.0/ch07s02.html#svn-ch-7-sect-2.3.4 Use svn propedit svn:keywords path/to/file to set that property to what you want. 
Looking at your examples, C:\Code>svn proplist -v http://svn.python.org/projects/peps/trunk/pep-0343.txt Properties on 'http://svn.python.org/projects/peps/trunk/pep-0343.txt': svn:keywords : Author Date Id Revision svn:eol-style : native So that has svn:keywords set, and expansion occurs. OTOH, C:\Code>svn proplist -v http://svn.python.org/projects/peps/trunk/pep-0352.txt Nada -- that one doesn't even have svn:eol-style set. See http://wiki.python.org/moin/CvsToSvn section "File Modes" for how to convince SVN to automatically set the properties you want on new files you commit (unfortunately, each developer has to do this in their own SVN config file). From pinard at iro.umontreal.ca Tue Nov 1 00:50:00 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Mon, 31 Oct 2005 18:50:00 -0500 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> <17252.50390.256221.4882@montanaro.dyndns.org> <17252.59653.792906.582288@montanaro.dyndns.org> <43654858.9020108@v.loewis.de> <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com> Message-ID: <20051031235000.GA14812@alcyon.progiciels-bpi.ca> [Guido van Rossum] >What's the magic to get $Revision$ and $Date$ to be expanded upon >checkin? Expansion does not occur on checkin, but on checkout, and even then, only in your copy -- that one you see (the internal Subversion copy is untouched). You have to edit a property for the file where you want substitutions. That property is named "svn:keywords" and its value decides which kind of substitution you want to allow. This is all theory for me, I never used them. 
-- François Pinard http://pinard.progiciels-bpi.ca From gherron at islandtraining.com Tue Nov 1 00:54:46 2005 From: gherron at islandtraining.com (Gary Herron) Date: Mon, 31 Oct 2005 15:54:46 -0800 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> <17252.50390.256221.4882@montanaro.dyndns.org> <17252.59653.792906.582288@montanaro.dyndns.org> <43654858.9020108@v.loewis.de> <ca471dc20510311536g406db798o6249ab8108813c6f@mail.gmail.com> Message-ID: <4366AEC6.2000803@islandtraining.com> Guido van Rossum wrote: >Help! > >What's the magic to get $Revision$ and $Date$ to be expanded upon >checkin? Comparing pep-0352.txt and pep-0343.txt, I noticed that the >latter has the svn revision and date in the headers, while the former >still has Brett's original revision 1.5 and a date somewhere in June. >I tried to fix this by rewriting the fields as $Revision$ and $Date$ >but that doesn't seem to make a difference. > >Googling for this is a bit tricky because Google collapses $Revision >and Revision, which makes any query for svn and $Revision rather >non-specific. :-( It's also not yet in our Wiki. > > It's an svn property associated with the file. The property name is svn:keywords, and the value is a space-separated list of keywords you'd like to have substituted. Like this: svn propset svn:keywords "Date Revision" ...file list...
The list of keywords it will handle is LastChangedDate (or Date) LastChangedRevision (or Revision or Rev) LastChangedBy (or Author) HeadURL (or URL) Id Gary Herron From greg.ewing at canterbury.ac.nz Tue Nov 1 02:03:25 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 01 Nov 2005 14:03:25 +1300 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <20051031022554.GA20255@alcyon.progiciels-bpi.ca> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> <43638BC0.40108@v.loewis.de> <20051031022554.GA20255@alcyon.progiciels-bpi.ca> Message-ID: <4366BEDD.9020100@canterbury.ac.nz> François Pinard wrote: > All development is done in house by French people. All documentation, > external or internal, comments, identifier and function names, > everything is in French. There's nothing stopping you from creating your own Frenchified version of Python that lets you use all the characters you want, for your own in-house use. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Nov 1 02:24:11 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 01 Nov 2005 14:24:11 +1300 Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com> References: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com> Message-ID: <4366C3BB.3010407@canterbury.ac.nz> Martin Blais wrote: > I'm always--literally every time-- looking for a more functional form, > something that would be like this: > > # apply dirname() 3 times on its results, initializing with p > ... = repapply(dirname, 3, p) Maybe ** should be defined for functions so that you could do things like up3levels = dirname ** 3 -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From pinard at iro.umontreal.ca Tue Nov 1 03:51:15 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Mon, 31 Oct 2005 21:51:15 -0500 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <4366BEDD.9020100@canterbury.ac.nz> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> <43638BC0.40108@v.loewis.de> <20051031022554.GA20255@alcyon.progiciels-bpi.ca> <4366BEDD.9020100@canterbury.ac.nz> Message-ID: <20051101025115.GA18573@alcyon.progiciels-bpi.ca> [Greg Ewing] >> All development is done in house by French people. All documentation, >> external or internal, comments, identifier and function names, >> everything is in French. > There's nothing stopping you from creating your own Frenchified > version of Python that lets you use all the characters you want, for > your own in-house use. No doubt that we, you and me and everybody, could all have our own little version of Python. 
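Greg's `dirname ** 3` idea can be sketched with a small wrapper class. This is purely illustrative: `Composable` is a hypothetical name, and `**` on plain functions is not actually supported by Python.

```python
import os.path

class Composable:
    """Hypothetical wrapper making f ** n mean 'apply f n times'."""
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __pow__(self, n):
        def repeated(value):
            # Apply the wrapped function n times in sequence.
            for _ in range(n):
                value = self.func(value)
            return value
        return Composable(repeated)

dirname = Composable(os.path.dirname)
up3levels = dirname ** 3
print(up3levels("/a/b/c/d/e"))  # -> /a/b
```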
:-) To tell all the truth, the very topic of your suggestion has already been discussed in-house, and the decision has been to stick to Python mainstream. We could not justify to our administration that we start modifying our sources, in such a way that we would have to invest in maintenance each time a new Python version appears, forever. On the other hand, we may reasonably guess that many people in this world would love being as comfortable as possible using Python, while naming identifiers naturally. It is not so unreasonable that we keep some _hope_ that Guido will soon choose to help us all, not only me. -- François Pinard http://pinard.progiciels-bpi.ca From amk at amk.ca Tue Nov 1 15:35:05 2005 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 1 Nov 2005 09:35:05 -0500 Subject: [Python-Dev] python-dev sprint at PyCon Message-ID: <20051101143505.GE14719@rogue.amk.ca> Every PyCon has featured a python-dev sprint. For the past few years, hacking on the AST branch has been a tradition, but we'll have to come up with something new for this year's conference (in Dallas, Texas; sprints will be Monday Feb. 27 through Thursday March 2). According to Anthony's release plan, a first alpha of 2.5 would be released in March, hence after PyCon and the sprints. We should discuss possible tasks for a python-dev sprint. What could we do? When the discussion is over, someone should update the wiki page with whatever tasks are suggested: <http://wiki.python.org/moin/PyCon2006/Sprints>.
--amk From dave at boost-consulting.com Tue Nov 1 17:25:23 2005 From: dave at boost-consulting.com (David Abrahams) Date: Tue, 01 Nov 2005 11:25:23 -0500 Subject: [Python-Dev] [C++-sig] GCC version compatibility References: <42CDA654.2080106@v.loewis.de> <uu0j6p7z1.fsf@boost-consulting.com> <20050708072807.GC3581@lap200.cdc.informatik.tu-darmstadt.de> <u8y0hl45u.fsf@boost-consulting.com> <42CEF948.3010908@v.loewis.de> <20050709102010.GA3836@lap200.cdc.informatik.tu-darmstadt.de> <42D0D215.9000708@v.loewis.de> <20050710125458.GA3587@lap200.cdc.informatik.tu-darmstadt.de> <42D15DB2.3020300@v.loewis.de> <20050716101357.GC3607@lap200.cdc.informatik.tu-darmstadt.de> <20051012120917.GA11058@lap200.cdc.informatik.tu-darmstadt.de> Message-ID: <u64rc49os.fsf@boost-consulting.com> Christoph Ludwig <cludwig at cdc.informatik.tu-darmstadt.de> writes: > Hi, > > this is to continue a discussion started back in July by a posting by > Dave Abrahams <url:http://thread.gmane.org/gmane.comp.python.devel/69651> > regarding the compiler (C vs. C++) used to compile python's main() and to link > the executable. > > > On Sat, Jul 16, 2005 at 12:13:58PM +0200, Christoph Ludwig wrote: >> On Sun, Jul 10, 2005 at 07:41:06PM +0200, "Martin v. Löwis" wrote: >> > Maybe. For Python 2.4, feel free to contribute a more complex test. For >> > Python 2.5, I would prefer if the entire code around ccpython.cc was >> > removed. >> >> I submitted patch #1239112 that implements the test involving two TUs for >> Python 2.4. I plan to work on a more comprehensive patch for Python 2.5 but >> that will take some time. > > > I finally had the spare time to look into this problem again and submitted > patch #1324762. The proposed patch implements the following: I just wanted to write to encourage some Python developers to look at (and accept!) Christoph's patch. This is really crucial for smooth interoperability between C++ and Python.
Thank you, Dave -- Dave Abrahams Boost Consulting www.boost-consulting.com From pje at telecommunity.com Tue Nov 1 18:16:52 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 01 Nov 2005 12:16:52 -0500 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <20051101143505.GE14719@rogue.amk.ca> Message-ID: <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com> At 09:35 AM 11/1/2005 -0500, A.M. Kuchling wrote: >Every PyCon has featured a python-dev sprint. For the past few years, >hacking on the AST branch has been a tradition, but we'll have to come >up with something new for this year's conference (in Dallas Texas; >sprints will be Monday Feb. 27 through Thursday March 2). > >According to Anthony's release plan, a first alpha of 2.5 would be >released in March, hence after PyCon and the sprints. We should >discuss possible tasks for a python-dev sprint. What could we do? * PEP 343 implementation ('with:') * PEP 308 implementation ('x if y else z') * A bytes type Or perhaps some of the things that have been waiting for the AST branch to be finished, i.e.: * One of the "global variable speedup" PEPs * Guido's instance variable speedup idea (LOAD_SELF_IVAR and STORE_SELF_IVAR, see http://mail.python.org/pipermail/python-dev/2002-February/019854.html) From guido at python.org Tue Nov 1 18:22:16 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 1 Nov 2005 10:22:16 -0700 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com> References: <20051101143505.GE14719@rogue.amk.ca> <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com> Message-ID: <ca471dc20511010922g2f463d7en5d9bc8dbc5a26c92@mail.gmail.com> On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote: > At 09:35 AM 11/1/2005 -0500, A.M. Kuchling wrote: > >Every PyCon has featured a python-dev sprint. 
For the past few years, > >hacking on the AST branch has been a tradition, but we'll have to come > >up with something new for this year's conference (in Dallas Texas; > >sprints will be Monday Feb. 27 through Thursday March 2). > > > >According to Anthony's release plan, a first alpha of 2.5 would be > >released in March, hence after PyCon and the sprints. We should > >discuss possible tasks for a python-dev sprint. What could we do? > > * PEP 343 implementation ('with:') > * PEP 308 implementation ('x if y else z') > * A bytes type * PEP 328 - absolute/relative import * PEP 341 - unifying try/except and try/finally (I believe this was accepted; it's still marked Open in PEP 0) > Or perhaps some of the things that have been waiting for the AST branch to > be finished, i.e.: > > * One of the "global variable speedup" PEPs > * Guido's instance variable speedup idea (LOAD_SELF_IVAR and > STORE_SELF_IVAR, see > http://mail.python.org/pipermail/python-dev/2002-February/019854.html) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Tue Nov 1 18:59:26 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 1 Nov 2005 09:59:26 -0800 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <ca471dc20511010922g2f463d7en5d9bc8dbc5a26c92@mail.gmail.com> References: <20051101143505.GE14719@rogue.amk.ca> <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com> <ca471dc20511010922g2f463d7en5d9bc8dbc5a26c92@mail.gmail.com> Message-ID: <ee2a432c0511010959h7348679endf7ecc4bdf12d7a9@mail.gmail.com> On 11/1/05, Guido van Rossum <guido at python.org> wrote: > On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote: > > At 09:35 AM 11/1/2005 -0500, A.M. Kuchling wrote: > > >Every PyCon has featured a python-dev sprint. For the past few years, > > >hacking on the AST branch has been a tradition, but we'll have to come > > >up with something new for this year's conference (in Dallas Texas; > > >sprints will be Monday Feb. 
27 through Thursday March 2). > > > > > >According to Anthony's release plan, a first alpha of 2.5 would be > > >released in March, hence after PyCon and the sprints. We should > > >discuss possible tasks for a python-dev sprint. What could we do? I added the 4 PEPs mentioned and a few more ideas here: http://wiki.python.org/moin/PyCon2006/Sprints/PythonCore n From pje at telecommunity.com Tue Nov 1 19:02:09 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 01 Nov 2005 13:02:09 -0500 Subject: [Python-Dev] python-dev sprint at PyCon Message-ID: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> At 10:22 AM 11/1/2005 -0700, Guido van Rossum wrote: >* PEP 328 - absolute/relative import I assume that references to 2.4 in that PEP should be changed to 2.5, and so on. It also appears to me that the PEP doesn't record the issue brought up by some people about the current absolute/relative ambiguity being useful for packaging purposes. i.e., being able to nest third-party packages such that they end up seeing their dependencies, even though they're not installed at the "root" package level. For example, I have a package that needs Python 2.4's version of pyexpat, and I need it to run in 2.3, but I can't really overwrite the 2.3 pyexpat, so I just build a backported pyexpat and drop it in the package, so that the code importing it just ends up with the right thing. Of course, that specific example is okay since 2.3 isn't going to somehow grow absolute importing. :) But I think people brought up other examples besides that, it's just the one that I personally know I've done. 
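The nesting trick Phillip describes can also be done explicitly, without relying on relative-import ambiguity: try the bundled, package-local copy first and fall back to whatever the normal import machinery finds. A sketch, using the modern `importlib` API rather than the 2.3-era machinery; the helper name and the `mypackage` layout are hypothetical:

```python
import importlib

def import_with_fallback(name, bundled_package=None):
    """Prefer a copy bundled inside our own package (e.g. a backported
    pyexpat dropped into mypackage/); fall back to the normal import."""
    if bundled_package is not None:
        try:
            return importlib.import_module(bundled_package + "." + name)
        except ImportError:
            pass  # no bundled copy here; use the installed one
    return importlib.import_module(name)

# With no bundled copy available, this resolves to the stdlib module.
xml_parser = import_with_fallback("pyexpat", bundled_package="mypackage")
print(xml_parser.__name__)  # -> pyexpat
```

The point is that the preference is spelled out in code instead of depending on whether relative imports shadow absolute ones.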
From guido at python.org Tue Nov 1 19:14:46 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 1 Nov 2005 11:14:46 -0700 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> Message-ID: <ca471dc20511011014o721c0d88w9244915e368a1a6c@mail.gmail.com> On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote: > At 10:22 AM 11/1/2005 -0700, Guido van Rossum wrote: > >* PEP 328 - absolute/relative import > > I assume that references to 2.4 in that PEP should be changed to 2.5, and > so on. For the part that hasn't been implemented yet, yes. > It also appears to me that the PEP doesn't record the issue brought up by > some people about the current absolute/relative ambiguity being useful for > packaging purposes. i.e., being able to nest third-party packages such > that they end up seeing their dependencies, even though they're not > installed at the "root" package level. > > For example, I have a package that needs Python 2.4's version of pyexpat, > and I need it to run in 2.3, but I can't really overwrite the 2.3 pyexpat, > so I just build a backported pyexpat and drop it in the package, so that > the code importing it just ends up with the right thing. > > Of course, that specific example is okay since 2.3 isn't going to somehow > grow absolute importing. :) But I think people brought up other examples > besides that, it's just the one that I personally know I've done. I guess this ought to be recorded. :-( The issue has been beaten to death and my position remains firm: rather than playing namespace games, consistent renaming is the right thing to do here. This becomes a trivial source edit, which beats the problems of debugging things when it doesn't work out as expected (which is very common due to the endless subtleties of loading multiple versions of the same code). 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Tue Nov 1 19:28:12 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 01 Nov 2005 13:28:12 -0500 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <ca471dc20511011014o721c0d88w9244915e368a1a6c@mail.gmail.co m> References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com> At 11:14 AM 11/1/2005 -0700, Guido van Rossum wrote: >I guess this ought to be recorded. :-( > >The issue has been beaten to death and my position remains firm: >rather than playing namespace games, consistent renaming is the right >thing to do here. This becomes a trivial source edit, Well, it's not trivial if you're (in my case) trying to support 2.3 and 2.4 with the same code base. It'd be nice to have some other advice to offer people besides, "go edit your code". Of course, if the feature hadn't already existed, I suppose a PEP to add it would have been shot down, so it's a reasonable decision. >which beats the >problems of debugging things when it doesn't work out as expected >(which is very common due to the endless subtleties of loading >multiple versions of the same code). Yeah, Bob Ippolito and I batted around a few ideas about how to implement simultaneous multi-version imports for Python Eggs, some of which relied on the relative/absolute ambiguity, but I think the main subtleties have to do with dynamic imports (including pickling) and the use of __name__. Of course, since we never actually implemented it, I don't know what other subtleties could potentially exist. Python Eggs currently allow you to install multiple versions of a package, but at runtime you can only import one of them, and you get a runtime VersionConflict exception if two eggs' version criteria are incompatible. 
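The egg behaviour Phillip describes (many versions installed side by side, only one importable per process, a runtime error on incompatible requirements) can be modelled with a toy registry. All names below are illustrative, not the actual setuptools API:

```python
class VersionConflict(Exception):
    """Raised when two requirements for the same distribution disagree."""

class WorkingSet:
    """Toy model: the first requirement activates a version for this
    process; a later, different requirement for the same name fails."""
    def __init__(self, installed):
        self.installed = installed  # name -> set of installed versions
        self.active = {}            # name -> version active in this process
    def require(self, name, version):
        if name in self.active:
            if self.active[name] != version:
                raise VersionConflict(
                    "%s %s is active, but %s was requested"
                    % (name, self.active[name], version))
        elif version not in self.installed.get(name, ()):
            raise LookupError("%s %s is not installed" % (name, version))
        else:
            self.active[name] = version
        return self.active[name]

ws = WorkingSet({"pyexpat-backport": {"1.0", "2.0"}})
ws.require("pyexpat-backport", "1.0")  # activates 1.0
ws.require("pyexpat-backport", "1.0")  # fine: same version again
```

A second `require` for version "2.0" would now raise `VersionConflict`, which is the runtime behaviour described above.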
From nnorwitz at gmail.com Tue Nov 1 19:34:29 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 1 Nov 2005 10:34:29 -0800 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com> References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com> Message-ID: <ee2a432c0511011034g678f93dbvca06cc44c0c643b7@mail.gmail.com> On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote: > At 11:14 AM 11/1/2005 -0700, Guido van Rossum wrote: > >I guess this ought to be recorded. :-( > > > >The issue has been beaten to death and my position remains firm: > >rather than playing namespace games, consistent renaming is the right > >thing to do here. This becomes a trivial source edit, > > Well, it's not trivial if you're (in my case) trying to support 2.3 and 2.4 > with the same code base. > > It'd be nice to have some other advice to offer people besides, "go edit > your code". Of course, if the feature hadn't already existed, I suppose a > PEP to add it would have been shot down, so it's a reasonable decision. Why can't you add your version's directory to sys.path before importing pyexpat? n From jcarlson at uci.edu Tue Nov 1 19:48:46 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 01 Nov 2005 10:48:46 -0800 Subject: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol) In-Reply-To: <b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com> References: <20051031120205.3A0C.JCARLSON@uci.edu> <b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com> Message-ID: <20051101104731.0389.JCARLSON@uci.edu> Noam Raphael <noamraph at gmail.com> wrote: > On 10/31/05, Josiah Carlson <jcarlson at uci.edu> wrote: > > > About the users-changing-my-internal-data issue: > ... 
> > You can have a printout before it dies: > > "I'm crashing your program because something attempted to modify a data > > structure (here's the traceback), and you were told not to." > > > > Then again, you can even raise an exception when people try to change > > the object, as imdict does, as tuples do, etc. > > Both solutions would solve the problem, but would require me to wrap > the built-in set with something which doesn't allow changes. This is a > lot of work - but it's quite similar to what my solution would > actually do, in a single built-in function. I am an advocate for PEP 351. However, I am against your proposed implementation/variant of PEP 351 because I don't believe it adds enough to warrant the additional complication and overhead necessary for every object (even tuples would need to get a .frozen_cache member). Give me a recursive freeze from PEP 351 (which handles objects that are duplicated, but errors out on circular references), and I'll be happy. > > > You suggest two ways for solving the problem. The first is by copying > > > my mutable objects to immutable copies: > > > > And by caching those results, then invalidating them when they are > > updated by your application. This is the same as what you would like to > > do, except that I do not rely on copy-on-write semantics, which aren't > > any faster than freeze+cache by your application. > > This isn't correct - freezing a set won't require a single copy to be > performed, as long as the frozen copy isn't saved after the original > is changed. Copy+cache always requires one copy. You are wrong, and you even say you are wrong..."freezing a set doesn't require a COPY, IF the frozen COPY isn't saved after the original is CHANGED". Creating an immutable set IS CREATING A COPY, so it ALSO copies, and you admit as much, but then say the equivalent of "copying isn't copying because I say so".
> > In any case, whether you choose to use freeze, or use a different API, > > this particular problem is solvable without copy-on-write semantics. > > Right. But I think that a significant simplification of the API is a > nice bonus for my solution. And about those copy-on-write semantics - > it should be proven how complex they are. Remember that we are talking > about frozen-copy-on-write, which I think would simplify matters > considerably - for example, there are at most two instances sharing > the same data, since the frozen copy can be returned again and again. I think that adding an additional attribute to literally every single object to handle the caching of 'frozen' objects, as well as a list to every object to handle callbacks which should be called on object mutation, along with a _call_stuff_when_mutated() method that handles these callback calls, IN ADDITION TO the __freeze__ method which is necessary to support this, is a little much, AND IS CERTAINLY NOT A SIMPLIFICATION! Let us pause for a second and consider: Original PEP proposed 1 new method: __freeze__, which could be implemented as a subclass of the original object (now), and integrated into the original classes as time goes on. One could /register/ __freeze__ functions/methods a'la Pickle, at which point objects wouldn't even need a native freeze method. Your suggestion offers 2 new methods along with 2 new instance variables. Let's see, a callback handler, __freeze__, the cache, and the callback list. Doesn't that seem a little excessive to you to support freezing? It does to me. If Guido were to offer your implementation of freeze, or no freeze at all, I would opt for no freeze, as implementing your freeze on user-defined classes would be a pain in the ass, not to mention implementing them in C code would be more than I would care to do, and more than I would ask any of the core developers to work on. 
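The recursive freeze Josiah says he would be happy with (shared sub-objects frozen only once, circular references rejected) fits in a short function. This is a sketch of the PEP 351 idea, not the PEP's actual reference implementation:

```python
def freeze(obj, _memo=None, _active=None):
    """Recursive freeze sketch: a memo ensures duplicated sub-objects are
    frozen once and shared; a cycle raises ValueError."""
    if _memo is None:
        _memo, _active = {}, set()
    oid = id(obj)
    if oid in _active:
        # We re-entered an object we are still freezing: a cycle.
        raise ValueError("cannot freeze a circular structure")
    if oid in _memo:
        return _memo[oid]
    if isinstance(obj, (int, float, complex, str, bytes, type(None))):
        result = obj  # already immutable
    elif isinstance(obj, (list, tuple)):
        _active.add(oid)
        result = tuple(freeze(x, _memo, _active) for x in obj)
        _active.remove(oid)
    elif isinstance(obj, (set, frozenset)):
        _active.add(oid)
        result = frozenset(freeze(x, _memo, _active) for x in obj)
        _active.remove(oid)
    elif isinstance(obj, dict):
        _active.add(oid)
        # Frozen form of a mapping: a tuple of frozen (key, value) pairs.
        result = tuple((freeze(k, _memo, _active), freeze(v, _memo, _active))
                       for k, v in obj.items())
        _active.remove(oid)
    else:
        raise TypeError("don't know how to freeze %r" % type(obj).__name__)
    _memo[oid] = result
    return result
```

Note that the memo also gives the structure sharing Josiah mentions: freezing `[b, b]` produces a tuple whose two elements are the same frozen object.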
> > Even without validation, there are examples that force a high number of > > calls, which are not O(1), amortized or otherwise. > > > [Snap - a very interesting example] > > > > Now, the actual time analysis on repeated freezings and such gets ugly. > > There are actually O(k) objects, which take up O(k**2) space. When you > > modify object b[i][j] (which has just been frozen), you get O(k) > > callbacks, and when you call freeze(b), it actually results in O(k**2) > > time to re-copy the O(k**2) pointers to the O(k) objects. It should be > > obvious that this IS NOT AMORTIZABLE to original object creation time. > > > That's absolutely right. My amortized analysis is correct only if you > limit yourself to cases in which the original object doesn't change > after a frozen() call was made. In that case, it's ok to count the > O(k**2) copy with the O(k**2) object creation, because it's made only > once. But here's the crucial observation which you are missing. You yourself have stated that in both your table and graph examples you want your application to continue to modify values while the user can't manipulate them. So even in your own use-cases, you are going to be modifying objects after they have been frozen, and even then it won't be fast! I believe that in general, people who are freezing things are going to want to be changing the original objects - hence the use of mutables to begin with - maybe for situations like yours where you don't want users mutating returns, whatever. If after they have frozen the object, they don't want to be changing the original objects, then they are probably going to be tossing out the original mutable and using the immutable created with freeze anyways (mutate your object until you get it right, then freeze it and use that so that no one can alter your data, not even yourself), so I think that caching is something that the /user/ should be doing, NOT Python.
The simple implementation (not copy-on-write) leads us to a simple matter of documenting, "Freeze is 'stateless'; every call to freeze returns a new object, regardless of modifications (or lack thereof) between freeze calls." Remember: "Simple is better than complex." > Why it's ok to analyze only that limited case? I am suggesting a > change in Python: that every object you would like be mutable, and > would support the frozen() protocol. When you evaluate my suggestion, > you need to take a program, and measure its performance in the current > Python and in a Python which implements my suggestion. This means that > the program should work also on the current Python. In that case, my > assumption is true - you won't change objects after you have frozen > them, simply because these objects (strings which are used as dict > keys, for example) can't be changed at all in the current Python > implementation! Not everything can/should become mutable. Integers should never become mutable, as tuples should never become mutable, as strings/unicode should never become mutable...wait, aren't we getting to the point that everything which is currently immutable shouldn't become mutable? Indeed. I don't believe that any currently immutable object should be able to become mutable in order to satisfy /anyone's/ desire for mutable /anything/. In starting to bring up benchmarks you are falling into the trap of needing to /have/ a benchmark (I have too), for which there are very few, if any, current use-cases. Without having or needing a benchmark, I'll state quite clearly where your proposed copy-on-write would beat out the naive 'create a new copy on every call to freeze': 1. If objects after they are frozen are never modified, copy on write will be faster. 2. 
If original objects are modified after they are frozen, then the naive implementation will be as fast if not faster in general, due to far lower overhead, but may be slower in corner cases where some nested structure is unchanged, and some shallow bit has changed: x = [[], NEVER_CHANGED_MUTABLE_NESTED_STRUCTURE] y = freeze(x) x[0].append(1) z = freeze(x) Further, discussing benchmarks on use-cases, for which there are few (if any) previously existing uses, is like saying "let's race cars" back in 1850; it's a bit premature. Then there is this other example: x = [1,2,3] y = freeze(x) The flat version of freeze in the PEP right now handles this case. I can change x all I want, yet I have a frozen y which stays unchanged. This is what I would want, and I would imagine it is what others would want too. In fact, this is precisely the use-case you offered for your table and graph examples, so your expression of a sentiment of "users aren't going to be changing the object after it has been frozen" is, by definition, wrong: you do it yourself! > I will write it in another way: I am proposing a change that will make > Python objects, including strings, mutable, and gives you other > advantages as well. I claim that it won't make existing Python > programs run slower in O() terms. It would allow you to do many things > that you can't do today; some of them would be fast, like editing a > string, and some of them would be less fast - for example, repeatedly > changing an object and freezing it. Your claim on running time only works if the original isn't changed after it is frozen And I don't like making everything mutable, it's a "solution looking for a problem", or a "tail wagging the dog" idea. There is no good reason to make everything mutable, and I challenge you to come up with a valid one that isn't already covered by the existing standard library or extension modules. 
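The "flat" freeze behaviour described above, including the nested-structure corner case, can be pinned down in a few lines. This is only a sketch of the semantics under discussion (the name `freeze` comes from PEP 351, but this is not its reference implementation):

```python
def freeze(obj):
    # Flat (non-recursive) freeze: one level of conversion only.
    if isinstance(obj, list):
        return tuple(obj)
    if isinstance(obj, set):
        return frozenset(obj)
    return obj

x = [1, 2, 3]
y = freeze(x)
x.append(4)
# y is still (1, 2, 3): later changes to x cannot touch it.

# The corner case from the thread: a flat freeze of a nested structure
# still shares the inner mutable list, so later mutations show through
# and the result is not even hashable.
a = [[], 'never_changed']
b = freeze(a)
a[0].append(1)
# b[0] is the same inner list object, so b[0] == [1] now.
```

The first half is the use-case the flat version handles well; the second half is exactly why the thread keeps circling back to recursive freezing.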
There is no need to bring strings into this conversation as there are string variants which are already mutable: array.array('c', ...), StringIO, mmap, take your pick! And some future Python (perhaps 2.5) will support a 'bytes' object, which is essentially an mmap which doesn't need to be backed by a file. > I think that the performance penalty may be rather small - remember > that in programs which do not change strings, there would never be a > need to copy the string data at all. And since I think that usually > most of the dict lookups are for method or function names, there would > almost never be a need to construct a new object on dict lookup, > because you search for the same names again and again, and a new > object is created only on the first frozen() call. You might even gain > performance, because s += x would be faster. You really don't know how Python internals work. The slow part of s += x on strings in Python 2.4 is the memory reallocation and occasional data copy (it has been tuned like crazy by Raymond in 2.4, see _PyString_Resize in stringobject.c). Unless you severely over-allocated your strings, this WOULD NOT BE SPED UP BY MUTABLE STRINGS. Further, identifiers/names (obj, obj.attr, obj.attr1.attr2, ...) are already created during compile-time, and are 'interned'. That is, if you have an object that you call 'foo', there gets to be a single "foo" string, which is referenced by pointer by any code in that module which references the 'foo' object to the single, always unchanging "foo" string. And because the string has already been hashed, it has a cached hash value, and lookups in dictionaries are already fast due to a check for pointer equivalency before comparing contents. Mutable strings CANNOT be faster than this method. > > You have clarified it, but it is still wrong. I stand by 'it is not > > easy to get right', and would further claim, "I doubt it is possible to > > make it fast." 
> > It would not be very easy to implement, of course, but I hope that it > won't be very hard either, since the basic idea is quite simple. Do > you still doubt the possibility of making it fast, given my (correct) > definition of fast? I would claim that your definition is limited. Yours would be fast if objects never changed after they are frozen, which is counter to your own use-cases. This suggests that your definition is in fact incorrect, and you fail to see your own inconsistency. > And if it's possible (which I think it is), it would allow us to get > rid of inconvenient immutable objects, and it would let us put > everything into a set. Isn't that nice? No, it sounds like a solution looking for a problem. I see no need to make strings, floats, ints, tuples, etc. mutable, and I think that you will have very little luck in getting core Python developer support for any attempt to make them mutable. If you make such a suggestion, I would offer that you create a new PEP, because this discussion has gone beyond PEP 351, and has wandered into the realm of "What other kinds of objects would be interesting to have in a Python-like system?" I'll summarize your claims: 1. copy-on-write is a simplification 2. everything being mutable would add to Python 3. copy-on-write is fast 4. people won't be mutating objects after they are frozen I'll counter your claims: 1. 2 methods and 2 instance variables on ALL OBJECTS is not a simplification. 2. a = b = 1; a += 1; If all objects were to become mutable, then a == b, despite what Python and every other sane language would tell you, and dct[a] would stop working (you would have to use c = freeze(a);dct[c], or dct[x] would need to automatically call freeze and only ever reference the result, significantly slowing down ALL dictionary references). 3. only if you NEVER MUTATE an object after it has been frozen 4. /you/ mutate original objects after they are frozen ALSO: 5. 
You fail to realize that if all objects were to become mutable, then one COULDN'T implement frozen, because the frozen objects THEMSELVES would be mutable. I'm going to bow out of this discussion for a few reasons, not the least of which being that I've spent too much time on this subject, and that I think it is quite clear that your proposal is dead, whether I had anything to do with it or not. - Josiah From pje at telecommunity.com Tue Nov 1 19:50:00 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 01 Nov 2005 13:50:00 -0500 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <ee2a432c0511011034g678f93dbvca06cc44c0c643b7@mail.gmail.com> References: <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com> <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051101134754.0380cf68@mail.telecommunity.com> At 10:34 AM 11/1/2005 -0800, Neal Norwitz wrote: >Why can't you add your version's directory to sys.path before importing >pyexpat? With library code that can be imported in any order, there is no such thing as "before". Anyway, Guido has pronounced on this already, so it's moot. From guido at python.org Tue Nov 1 20:39:39 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 1 Nov 2005 12:39:39 -0700 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com> References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> <5.1.1.6.0.20051101132151.02fe9708@mail.telecommunity.com> Message-ID: <ca471dc20511011139h2076b250mfa144d60f57a0fcb@mail.gmail.com> On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote: > At 11:14 AM 11/1/2005 -0700, Guido van Rossum wrote: > >I guess this ought to be recorded. :-( > > > >The issue has been beaten to death and my position remains firm: > >rather than playing namespace games, consistent renaming is the right > >thing to do here. 
This becomes a trivial source edit, > > Well, it's not trivial if you're (in my case) trying to support 2.3 and 2.4 > with the same code base. You should just bite the bullet and make a privatized copy of the package(s) on which you depend part of your own distributions. > It'd be nice to have some other advice to offer people besides, "go edit > your code". Of course, if the feature hadn't already existed, I suppose a > PEP to add it would have been shot down, so it's a reasonable decision. I agree it would be nice if we could do something about deep version issues. But it's hard, and using the absolute/relative ambiguity isn't a solution but a nasty hack. I don't have a solution either except copying code (which IMO is a *fine* solution in most cases as long as copyright issues don't prevent you). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Tue Nov 1 21:14:32 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 01 Nov 2005 15:14:32 -0500 Subject: [Python-Dev] a different kind of reduce... In-Reply-To: <4366C3BB.3010407@canterbury.ac.nz> Message-ID: <001301c5df20$df865f00$153dc797@oemcomputer> [Martin Blais] > > I'm always--literally every time-- looking for a more functional form, > > something that would be like this: > > > > # apply dirname() 3 times on its results, initializing with p > > ... = repapply(dirname, 3, p) [Greg Ewing] > Maybe ** should be defined for functions so that you > could do things like > > up3levels = dirname ** 3 Hmm, using the function's own namespace is an interesting idea. 
It might also be a good place to put other functionals: results = f.map(data) newf = f.partial(somearg) Raymond From dberlin at dberlin.org Tue Nov 1 21:15:08 2005 From: dberlin at dberlin.org (Daniel Berlin) Date: Tue, 01 Nov 2005 15:15:08 -0500 Subject: [Python-Dev] svn checksum error In-Reply-To: <17253.28294.538932.570903@montanaro.dyndns.org> References: <17252.59531.252751.768301@montanaro.dyndns.org> <43654CA7.8030200@v.loewis.de> <17253.28294.538932.570903@montanaro.dyndns.org> Message-ID: <1130876108.7280.35.camel@IBM-82ZWS052TEN.watson.ibm.com> On Sun, 2005-10-30 at 19:08 -0600, skip at pobox.com wrote: > Martin> The natural question then is: what operating system, what > Martin> subversion version are you using? > > Sorry, wasn't thinking in terms of svn bugs. I was anticipating some sort > of obvious pilot error. I am on Mac OSX 10.3.9, running svn 1.1.3 I built > from source back in the May timeframe. Should I upgrade to 1.2.3 as a > matter of course? > > Fredrik> "welcome to the wonderful world of subversion error messages" > ... > Fredrik> deleting the offending directory and doing "svn up" is the > Fredrik> easiest way to fix this. > > Thanks. I zapped Objects. The next svn up complained about Misc. The next > about Lib. After that, the next svn up ran to completion. > > Skip You didn't happen to try to update a checked out copy from a repo that had an older cvs2svn conversion to the one produced by the final conversion, did you? Cause that will cause these errors too. --Dan From jeremy at alum.mit.edu Tue Nov 1 21:23:05 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 1 Nov 2005 15:23:05 -0500 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com> References: <20051101143505.GE14719@rogue.amk.ca> <5.1.1.6.0.20051101121245.020559e8@mail.telecommunity.com> Message-ID: <e8bf7a530511011223x996d960oc029a5e18590c94b@mail.gmail.com> On 11/1/05, Phillip J. 
Eby <pje at telecommunity.com> wrote: > At 09:35 AM 11/1/2005 -0500, A.M. Kuchling wrote: > >Every PyCon has featured a python-dev sprint. For the past few years, > >hacking on the AST branch has been a tradition, but we'll have to come > >up with something new for this year's conference (in Dallas Texas; > >sprints will be Monday Feb. 27 through Thursday March 2). > > > >According to Anthony's release plan, a first alpha of 2.5 would be > >released in March, hence after PyCon and the sprints. We should > >discuss possible tasks for a python-dev sprint. What could we do? > > * PEP 343 implementation ('with:') > * PEP 308 implementation ('x if y else z') > * A bytes type > > Or perhaps some of the things that have been waiting for the AST branch to > be finished, i.e.: > > * One of the "global variable speedup" PEPs > * Guido's instance variable speedup idea (LOAD_SELF_IVAR and > STORE_SELF_IVAR, see > http://mail.python.org/pipermail/python-dev/2002-February/019854.html) I hope to attend the sprints this year, so i'd be around to help people get started and answer questions. With luck, I'll also be giving a technical presentation on the work at the main conference. Jeremy From reinhold-birkenfeld-nospam at wolke7.net Tue Nov 1 21:27:23 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue, 01 Nov 2005 21:27:23 +0100 Subject: [Python-Dev] a different kind of reduce... In-Reply-To: <001301c5df20$df865f00$153dc797@oemcomputer> References: <4366C3BB.3010407@canterbury.ac.nz> <001301c5df20$df865f00$153dc797@oemcomputer> Message-ID: <dk8j3b$rd0$1@sea.gmane.org> Raymond Hettinger wrote: > [Martin Blais] >> > I'm always--literally every time-- looking for a more functional > form, >> > something that would be like this: >> > >> > # apply dirname() 3 times on its results, initializing with p >> > ... 
= repapply(dirname, 3, p) > > [Greg Ewing] >> Maybe ** should be defined for functions so that you >> could do things like >> >> up3levels = dirname ** 3 > > Hmm, using the function's own namespace is an interesting idea. It > might also be a good place to put other functionals: > > results = f.map(data) > newf = f.partial(somearg) And we have solved the "map, filter and reduce are going away! Let's all weep together" problem with one strike! Reinhold -- Mail address is perfectly valid! From noamraph at gmail.com Tue Nov 1 21:49:59 2005 From: noamraph at gmail.com (Noam Raphael) Date: Tue, 1 Nov 2005 22:49:59 +0200 Subject: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol) In-Reply-To: <20051101104731.0389.JCARLSON@uci.edu> References: <20051031120205.3A0C.JCARLSON@uci.edu> <b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com> <20051101104731.0389.JCARLSON@uci.edu> Message-ID: <b348a0850511011249j2d7a0645v8489746be4986d84@mail.gmail.com> On 11/1/05, Josiah Carlson <jcarlson at uci.edu> wrote: ... > > I am an advocate for PEP 351. However, I am against your proposed > implementation/variant of PEP 351 because I don't believe it adds enough > to warrant the additional complication and overhead necessary for every > object (even tuples would need to get a .frozen_cache member). > > Give me a recursive freeze from PEP 351 (which handles objects that are > duplicated, but errors out on circular references), and I'll be happy. > That's fine - but it doesn't mean that I must be happy with it. > ... > > > > This isn't correct - freezing a set won't require a single copy to be > > performed, as long as the frozen copy isn't saved after the original > > is changed. Copy+cache always requires one copy. > > You are wrong, and you even say you are wrong..."freezing a set doesn't > require a COPY, IF the frozen COPY isn't saved after the original is > CHANGED". 
Creating an immutable set IS CREATING A COPY, so it ALSO > copies, and you admit as much, but then say the equivalent of "copying > isn't copying because I say so". No, I am not wrong. I am just using misleading terms. I will call a "frozen copy" a "frozen image". Here it goes: "freezing a set doesn't require a COPY, IF the frozen IMAGE isn't saved after the original is CHANGED". I suggest that there would be a way to create a frozenset without COPYING an O(n) amount of MEMORY. When a frozen set is created by a call frozen(x), it would not copy all the data, but would rather reference the existing data, which was created by the non-frozen set. Only if the original set changes, when there's a frozen set referencing the data, the MEMORY would be actually copied. I call it a "frozen copy" because it behaves as a frozen copy, even though not all the memory is being copied. When you call the COPY function in the COPY module with a string, it doesn't really copy memory - the same string is returned. When you copy a file inside subversion, it doesn't actually copy all the data associated with it, but does something smarter, which takes O(1). The point is, for the user, it's a copy. Whether or not memory is actually being copied, is an implementation detail. > ... > > I think that adding an additional attribute to literally every single > object to handle the caching of 'frozen' objects, as well as a list to > every object to handle callbacks which should be called on object > mutation, along with a _call_stuff_when_mutated() method that handles > these callback calls, IN ADDITION TO the __freeze__ method which is > necessary to support this, is a little much, AND IS CERTAINLY NOT A > SIMPLIFICATION! I don't agree. You don't need to add a list to every object, since you can store all those relations in one place, with a standard function for registering them. Anyway, code written in Python (which is the language we are discussing) WON'T BE COMPLICATED! 
The frozen mechanism, along with two new protocols (__frozen__ and __changed__), would be added automatically! The internal state of a class written in Python can be automatically frozen, since it's basically a dict. Now let's see if it's a simplification: 1. No Python code would have to be made more complicated because of the change. 2. There would be no need to find workarounds, like cStringIO, for the fact that strings and tuples are immutable. 3. You would be able to put any kind of object into a set, or use it as a dict key. 4. Classes (like the graph example) would be able to give users things without having to make a choice between risking their users with strange bugs, making a complicated interface, making very inefficient methods, and writing complicated wrapper classes. I will ask you: Is this a complication? The answer is: it requires a significant change of the CPython implementation. But about the Python language: it's definitely a simplification. > > Let us pause for a second and consider: > Original PEP proposed 1 new method: __freeze__, which could be > implemented as a subclass of the original object (now), and integrated > into the original classes as time goes on. One could /register/ > __freeze__ functions/methods a'la Pickle, at which point objects > wouldn't even need a native freeze method. > > Your suggestion offers 2 new methods along with 2 new instance variables. > Let's see, a callback handler, __freeze__, the cache, and the callback > list. Doesn't that seem a little excessive to you to support freezing? > It does to me. If Guido were to offer your implementation of freeze, or > no freeze at all, I would opt for no freeze, as implementing your freeze > on user-defined classes would be a pain in the ass, not to mention > implementing them in C code would be more than I would care to do, and > more than I would ask any of the core developers to work on. 
> As I said above: this suggestion would certainly require more change in the Python implementation than your suggestion. But the Python language would gain a lot more. Implementing my frozen on user-defined classes would not be a pain in the ass, because it will require no work at all - the Python implementation would provide it automatically. The fact that it can be done automatically for user-defined classes raises a hope in me that it can be made not too complicated for classes written in C. > ... > > But here's the crucial observation which you are missing. You yourself > have stated that in both your table and graph examples you want your > application to continue to modify values while the user can't manipulate > them. So even in your own use-cases, you are going to be modifying > objects after they have been frozen, and even then it won't be fast! No. In the table example, the table would never change the object themselves - it may only calculate new values, and drop the references to the old ones. This is definitely a case of not changing the value after it has been frozen. In the graph example, it is true that the set would be changed after it's frozen, but it is expected that the frozen copy would not exist by the time the change happens - think about the x is graph.neighbours(y) example. There is actually no reason for keeping them, besides for tracking the history of the graph - which would require a copy anyway. The frozen() implementation of objects which do not reference non-frozen objects, such as sets, really doesn't copy any memory when it's called, and will never cause a memory copy if there are no living frozen copies of the object while the object changes. > > I believe that in general, people who are freezing things are going to > want to be changing the original objects - hence the use of mutables to > begin with - maybe for situations like yours where you don't want users > mutating returns, whatever. 
If after they have frozen the object, they > don't want to be changing the original objects, then they are probably > going to be tossing out the original mutable and using the immutable > created with freeze anyways (mutate your object until you get it right, > then freeze it and use that so that no one can alter your data, not even > yourself), so I think that caching is something that the /user/ should > be doing, NOT Python. I don't agree. The table and the graph are examples. The common use patterns I see regarding frozen() are: 1. Don't use frozen() at all. Think about strings becoming mutable. Most strings which are changed would never be frozen. When you are using a list, how many times do you make a frozen copy of it? (The answer is zero, of course, you can't. You can't use it as a dict key, or as a member of a set. This is just to show you that not freezing mutable objects is a common thing.) 2. Create the object using more operations than constructing it, and then don't change it, possibly making a frozen copy of it. The table is an example: functions given by the user create objects, in whatever way they choose, and then the table doesn't need to change them, and needs to create a frozen copy. It's a very reasonable use case: I would say that the less common case is that you can create an object using only the constructor. Many times you make a tuple out of a list that you've created just for that purpose. It's not intuitive! > > The simple implementation (not copy-on-write) leads us to a simple > matter of documenting, "Freeze is 'stateless'; every call to freeze > returns a new object, regardless of modifications (or lack thereof) > between freeze calls." > > Remember: "Simple is better than complex." Firstly, you are talking about implementation. Secondly, sometimes things are too simple, and lead to complex workarounds. > ... > > Not everything can/should become mutable. 
Integers should never become > mutable, as tuples should never become mutable, as strings/unicode > should never become mutable...wait, aren't we getting to the point that > everything which is currently immutable shouldn't become mutable? > Indeed. I don't believe that any currently immutable object should be > able to become mutable in order to satisfy /anyone's/ desire for mutable > /anything/. Integers should never become mutable - right. There should be no mutable ints in Python. Tuples and strings should never become mutable - wrong. Strings created by the user should be mutable - those immutable strings are a Python anomaly. All I was saying was that sometimes, the Python implementation would want to use immutable strings. So will users, sometimes. There is a need for a mutable string, and a need for an immutable string, and a need for an efficient conversion between the two. That's all. > > > In starting to bring up benchmarks you are falling into the trap of > needing to /have/ a benchmark (I have too), for which there are very few, > if any, current use-cases. > No, I don't. There are a lot of use cases. As I said, I suggest a change to the Python language, which would give you many benefits. When suggesting such a change, it should be verified that the performance of existing Python programs won't be harmed, which I did. What might be done as well, is to compare my suggestion to yours: > Without having or needing a benchmark, I'll state quite clearly where > your proposed copy-on-write would beat out the naive 'create a new copy > on every call to freeze': > 1. If objects after they are frozen are never modified, copy on write > will be faster. > 2. 
If original objects are modified after they are frozen, then the > naive implementation will be as fast if not faster in general, due to > far lower overhead, but may be slower in corner cases where some nested > structure is unchanged, and some shallow bit has changed: As I said, many objects are never modified after they are frozen. ***This includes all the strings which are used in current Python programs as dict keys*** - I suggest that strings would become mutable by default. This means that whenever you use a string as a dict key, a call to frozen() is done by the dict. It's obvious that the string won't change after it is frozen. Now, my suggestion is faster in its order of complexity than yours. In some cases, yours is faster by a constant, which I claim that would be quite small in real use cases. > > x = [[], NEVER_CHANGED_MUTABLE_NESTED_STRUCTURE] > y = freeze(x) > x[0].append(1) > z = freeze(x) > This is one of the cases in which the change in order of complexity is significant. > Further, discussing benchmarks on use-cases, for which there are few (if > any) previously existing uses, is like saying "let's race cars" back in > 1850; it's a bit premature. I don't agree. That's why we're discussing it. > > > Then there is this other example: > > x = [1,2,3] > y = freeze(x) > > The flat version of freeze in the PEP right now handles this case. I > can change x all I want, yet I have a frozen y which stays unchanged. > This is what I would want, and I would imagine it is what others would > want too. In fact, this is precisely the use-case you offered for your > table and graph examples, so your expression of a sentiment of "users > aren't going to be changing the object after it has been frozen" is, by > definition, wrong: you do it yourself! 
Okay, so may I add another use case in which frozen() is fast: If an object which only holds references to frozen objects is changed after a frozen copy of it has been made, and the frozen copy is discarded before the change is made, frozen() would still take O(1). This is the case with the graph. > > > > I will write it in another way: I am proposing a change that will make > > Python objects, including strings, mutable, and gives you other > > advantages as well. I claim that it won't make existing Python > > programs run slower in O() terms. It would allow you to do many things > > that you can't do today; some of them would be fast, like editing a > > string, and some of them would be less fast - for example, repeatedly > > changing an object and freezing it. > > Your claim on running time only works if the original isn't changed > after it is frozen But they won't, in existing Python programs, so my claim: "it won't make existing Python programs run slower in O() terms" is absolutely correct! > > And I don't like making everything mutable, it's a "solution looking for > a problem", or a "tail wagging the dog" idea. There is no good reason > to make everything mutable, and I challenge you to come up with a valid > one that isn't already covered by the existing standard library or > extension modules. > > There is no need to bring strings into this conversation as there are > string variants which are already mutable: array.array('c', ...), > StringIO, mmap, take your pick! And some future Python (perhaps 2.5) > will support a 'bytes' object, which is essentially an mmap which > doesn't need to be backed by a file. My two examples don't have a satisfactory solution currently. All this variety actually proves my point: There is "more than one way to do it" because these are all workarounds! If strings were mutable, I wouldn't have to learn about all these nice modules. And if that's not enough, here's a simple use case which isn't covered by all those modules. 
Say I store my byte arrays using array.array, or mmap. What if I want to make a set of those, in order to check if a certain byte sequence was already encountered? I CAN'T. I have to do another workaround, which will probably be complicated and inefficient, to convert my byte array into a string. Everything is possible, if you are willing to work hard enough. I am suggesting to simplify things. ... > > You really don't know how Python internals work. > > The slow part of s += x on strings in Python 2.4 is the memory > reallocation and occasional data copy (it has been tuned like crazy by > Raymond in 2.4, see _PyString_Resize in stringobject.c). Unless you > severely over-allocated your strings, this WOULD NOT BE SPED UP BY > MUTABLE STRINGS. The fact that it has been tuned like crazy, and that it had to wait for Python 2.4, is just showing us that we are talking about *another* workaround. And please remember that this optimization was announced not to be counted on, in case you want your code to work efficiently on other Python implementations. In that case (which would just grow more common in the future), you would have to get back to the other workarounds, like cStringIO. > > Further, identifiers/names (obj, obj.attr, obj.attr1.attr2, ...) are > already created during compile-time, and are 'interned'. That is, if > you have an object that you call 'foo', there gets to be a single "foo" > string, which is referenced by pointer by any code in that module which > references the 'foo' object to the single, always unchanging "foo" > string. And because the string has already been hashed, it has a cached > hash value, and lookups in dictionaries are already fast due to a check > for pointer equivalency before comparing contents. Mutable strings > CANNOT be faster than this method. Right - they can't be faster than this method. But they can be virtually AS FAST. Store frozen strings as identifiers/names, and you can continue to use exactly the same method you described. 
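The interning machinery described in the quoted text can be observed from Python directly. In the 2.x line under discussion it is the builtin `intern()`; the sketch below uses its later home, `sys.intern`. `intern()` returns the canonical copy of a string, so two equal interned strings are one object and a dict lookup on them can short-circuit on identity before ever comparing characters:

```python
from sys import intern  # the builtin intern() in the Python 2.x of this thread

a = intern('neighbours')
b = intern(''.join(['neigh', 'bours']))  # built at runtime, then interned

# Equal interned strings are the same object: comparison is a pointer check.
assert a == b
assert a is b

# An equal string built at runtime without interning need not be the same
# object, so a lookup on it falls back to comparing the cached hash and
# then the contents.
c = ''.join(['neigh', 'bours'])
assert c == a
```

Because the interned object also caches its hash, name lookups in namespace dicts usually cost a pointer comparison, which is the bar any mutable-string design would have to clear.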
> > ... > > I would claim that your definition is limited. Yours would be fast if > objects never changed after they are frozen, which is counter to your > own use-cases. This suggests that your definition is in fact incorrect, > and you fail to see your own inconsistency. > I have answered this above: It is not counter to my use cases, and it's a very good assumption, as it is true in many examples, including all current Python programs. > > > And if it's possible (which I think it is), it would allow us to get > > rid of inconvenient immutable objects, and it would let us put > > everything into a set. Isn't that nice? > > No, it sounds like a solution looking for a problem. I see no need to > make strings, floats, ints, tuples, etc. mutable, and I think that you > will have very little luck in getting core Python developer support for > any attempt to make them mutable. Concerning ints, floats, complexes, and any other object with a constant memory use, I agree. Concerning other objects, I disagree. I think that it would simplify things considerably, and that many things that we are used to are actually workarounds. > > If you make such a suggestion, I would offer that you create a new PEP, > because this discussion has gone beyond PEP 351, and has wandered into > the realm of "What other kinds of objects would be interesting to have > in a Python-like system?" > That is a good suggestion, and I have already started to write one. It takes me a long time, but I hope I will manage. > > > I'll summarize your claims: > 1. copy-on-write is a simplification > 2. everything being mutable would add to Python > 3. copy-on-write is fast > 4. people won't be mutating objects after they are frozen > > I'll counter your claims: I'll counter-counter them: > 1. 2 methods and 2 instance variables on ALL OBJECTS is not a > simplification. It is. This is basically an implementation detail, Python code would never be complicated. > 2. 
a = b = 1; a += 1; If all objects were to become mutable, then a ==
> b, despite what Python and every other sane language would tell you, and
> dct[a] would stop working (you would have to use c = freeze(a);dct[c],
> or dct[x] would need to automatically call freeze and only ever
> reference the result, significantly slowing down ALL dictionary
> references).

This might be the point that I didn't stress enough. Dict *would* call freeze, and this is why more work is needed to make sure it is a quick operation. I have proven that it is quick in O() terms, and I claimed that it can be made quick in actual terms.

> 3. only if you NEVER MUTATE an object after it has been frozen

...or if the frozen copy is killed before the change, for many types of objects.

> 4. /you/ mutate original objects after they are frozen

Yes I do, but see 3.

>
> ALSO:
> 5. You fail to realize that if all objects were to become mutable, then
> one COULDN'T implement frozen, because the frozen objects THEMSELVES
> would be mutable.

Really, you are taking me too literally. All objects COULD become mutable, as long as we supply a frozen version for each type. This excludes any objects which we don't want to become mutable, including ints, and including frozen objects themselves.

>
> I'm going to bow out of this discussion for a few reasons, not the least
> of which being that I've spent too much time on this subject, and that I
> think it is quite clear that your proposal is dead, whether I had
> anything to do with it or not.
>
> - Josiah

That's fine. I only ask that you read my answer, think about it a little, and just tell me with a yes or a no whether you still consider it dead. I think that I have answered all your questions, and I hope that at least others will be convinced by them, and that in the end my suggestion will be accepted.

Others who read this - please respond if you think there's something to my suggestion! Thanks for your involvement. I hope it will at least help me explain my idea better.
Noam

From noamraph at gmail.com Tue Nov 1 21:55:14 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 22:55:14 +0200
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <dk8j3b$rd0$1@sea.gmane.org>
References: <4366C3BB.3010407@canterbury.ac.nz> <001301c5df20$df865f00$153dc797@oemcomputer> <dk8j3b$rd0$1@sea.gmane.org>
Message-ID: <b348a0850511011255t7683b34bk9fd90cf4a99c4fb6@mail.gmail.com>

On 11/1/05, Reinhold Birkenfeld <reinhold-birkenfeld-nospam at wolke7.net> wrote:
> > Hmm, using the function's own namespace is an interesting idea. It
> > might also be a good place to put other functionals:
> >
> > results = f.map(data)
> > newf = f.partial(somearg)
>
> And we have solved the "map, filter and reduce are going away! Let's all
> weep together" problem with one strike!
>
> Reinhold

I have no problems with map and filter going away. About reduce - please remember that you need to add this method to any callable, including every type (I mean the constructor). I am not sure it is a good trade for throwing away one builtin, which is a perfectly reasonable function.

Noam

From pedronis at strakt.com Tue Nov 1 22:00:20 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Tue, 01 Nov 2005 22:00:20 +0100
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <dk8j3b$rd0$1@sea.gmane.org>
References: <4366C3BB.3010407@canterbury.ac.nz> <001301c5df20$df865f00$153dc797@oemcomputer> <dk8j3b$rd0$1@sea.gmane.org>
Message-ID: <4367D764.2090609@strakt.com>

Reinhold Birkenfeld wrote:
> Raymond Hettinger wrote:
>
>>[Martin Blais]
>>
>>>>I'm always--literally every time-- looking for a more functional
>>form,
>>>>something that would be like this:
>>>>
>>>> # apply dirname() 3 times on its results, initializing with p
>>>> ...
= repapply(dirname, 3, p)
>>
>>[Greg Ewing]
>>
>>>Maybe ** should be defined for functions so that you
>>>could do things like
>>>
>>> up3levels = dirname ** 3
>>
>>Hmm, using the function's own namespace is an interesting idea. It
>>might also be a good place to put other functionals:
>>
>> results = f.map(data)
>> newf = f.partial(somearg)
>
> And we have solved the "map, filter and reduce are going away! Let's all
> weep together" problem with one strike!

not really, those right now work with any callable,

>>> class C:
...     def __call__(self, x):
...         return 2*x
...
>>> map(C(), [1,2,3])
[2, 4, 6]

that's why attaching functionality as methods is not always the best solution.

regards.

From skip at pobox.com Tue Nov 1 21:58:37 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 1 Nov 2005 14:58:37 -0600
Subject: [Python-Dev] python-dev sprint at PyCon
In-Reply-To: <20051101143505.GE14719@rogue.amk.ca>
References: <20051101143505.GE14719@rogue.amk.ca>
Message-ID: <17255.55037.609312.773649@montanaro.dyndns.org>

amk> Every PyCon has featured a python-dev sprint. For the past few
amk> years, hacking on the AST branch has been a tradition, but we'll
amk> have to come up with something new for this year's conference...

This is just a comment from the peanut gallery, as it's highly unlikely I'll be in attendance, but why not continue with the AST theme? Instead of working on the AST branch, you could start to propagate the AST representation around. For example, you could use the new AST code to improve/extend/rewrite the optimization steps the compiler currently performs. Another alternative would be to rewrite Pychecker (or Pychecker 2) to operate from the AST representation.
Skip

From tdelaney at avaya.com Tue Nov 1 21:59:07 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 2 Nov 2005 07:59:07 +1100
Subject: [Python-Dev] apparent ruminations on mutable immutables (was:PEP 351, the freeze protocol)
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB75B@au3010avexu1.global.avaya.com>

Noam,

There's a simple solution to all this - write a competing PEP. One of the two competing PEPs may be accepted.

FWIW, I'm +1 on PEP 351 in general, and -1 on what you've proposed. PEP 351 is simple to explain, simple to implement and leaves things under the control of the developer. I think there are still some issues to be resolved, but the basic premise is exactly what I would want of a freeze protocol.

Tim Delaney

From jcarlson at uci.edu Tue Nov 1 22:03:56 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 01 Nov 2005 13:03:56 -0800
Subject: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)
In-Reply-To: <b348a0850511011249j2d7a0645v8489746be4986d84@mail.gmail.com>
References: <20051101104731.0389.JCARLSON@uci.edu> <b348a0850511011249j2d7a0645v8489746be4986d84@mail.gmail.com>
Message-ID: <20051101125918.0396.JCARLSON@uci.edu>

> That's fine. I only ask that you read my answer, think about it a little,
> and just tell me with a yes or a no whether you still consider it dead. I
> think that I have answered all your questions, and I hope that at
> least others will be convinced by them, and that in the end my
> suggestion will be accepted.

I still consider it dead. "If the implementation is hard to explain, it's a bad idea."

Also, not all user-defined classes have a __dict__, and not all user-defined classes can have arbitrary attributes added to them.

>>> class foo(object):
...     __slots__ = ['lst']
...     def __init__(self):
...         self.lst = []
...
>>> a = foo()
>>> a.bar = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'foo' object has no attribute 'bar'
>>>

- Josiah

From tdelaney at avaya.com Tue Nov 1 22:02:04 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 2 Nov 2005 08:02:04 +1100
Subject: [Python-Dev] a different kind of reduce...
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB75C@au3010avexu1.global.avaya.com>

Reinhold Birkenfeld wrote:
> And we have solved the "map, filter and reduce are going away! Let's
> all weep together" problem with one strike!

I'm not sure if you're wildly enthusiastic, or very sarcastic. I'm not sure which I should be either ...

The thought does appeal to me - especially func.partial(args). I don't see any advantage to func.map(args) over func(*args), and it loses functionality in comparison with map(func, args) (passing the function as a separate reference).

Tim Delaney

From mal at egenix.com Tue Nov 1 22:11:52 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 01 Nov 2005 22:11:52 +0100
Subject: [Python-Dev] PEP 328 - absolute imports (python-dev sprint at PyCon)
In-Reply-To: <ca471dc20511011014o721c0d88w9244915e368a1a6c@mail.gmail.com>
References: <5.1.1.6.0.20051101130208.02047018@mail.telecommunity.com> <ca471dc20511011014o721c0d88w9244915e368a1a6c@mail.gmail.com>
Message-ID: <4367DA18.6070502@egenix.com>

Guido van Rossum wrote:
> On 11/1/05, Phillip J. Eby <pje at telecommunity.com> wrote:
>
>>At 10:22 AM 11/1/2005 -0700, Guido van Rossum wrote:
>>
>>>* PEP 328 - absolute/relative import
>>
>>I assume that references to 2.4 in that PEP should be changed to 2.5, and
>>so on.
>
>
> For the part that hasn't been implemented yet, yes.
>
>
>>It also appears to me that the PEP doesn't record the issue brought up by
>>some people about the current absolute/relative ambiguity being useful for
>>packaging purposes. i.e., being able to nest third-party packages such
>>that they end up seeing their dependencies, even though they're not
>>installed at the "root" package level.
>> >>For example, I have a package that needs Python 2.4's version of pyexpat, >>and I need it to run in 2.3, but I can't really overwrite the 2.3 pyexpat, >>so I just build a backported pyexpat and drop it in the package, so that >>the code importing it just ends up with the right thing. >> >>Of course, that specific example is okay since 2.3 isn't going to somehow >>grow absolute importing. :) But I think people brought up other examples >>besides that, it's just the one that I personally know I've done. > > > I guess this ought to be recorded. :-( > > The issue has been beaten to death and my position remains firm: > rather than playing namespace games, consistent renaming is the right > thing to do here. This becomes a trivial source edit, which beats the > problems of debugging things when it doesn't work out as expected > (which is very common due to the endless subtleties of loading > multiple versions of the same code). Just for reference, may I remind you of this thread last year: http://mail.python.org/pipermail/python-dev/2004-September/048695.html The PEP's timeline should be updated accordingly. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 01 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From noamraph at gmail.com Tue Nov 1 22:20:34 2005 From: noamraph at gmail.com (Noam Raphael) Date: Tue, 1 Nov 2005 23:20:34 +0200 Subject: [Python-Dev] apparent ruminations on mutable immutables (was:PEP 351, the freeze protocol) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB75B@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB75B@au3010avexu1.global.avaya.com> Message-ID: <b348a0850511011320h5e799ad6q43aef3d1da88508c@mail.gmail.com> On 11/1/05, Delaney, Timothy (Tim) <tdelaney at avaya.com> wrote: > Noam, > > There's a simple solution to all this - write a competing PEP. One of > the two competing PEPs may be accepted. I will. It may take some time, though. > > FWIW, I'm +1 on PEP 351 in general, and -1 on what you've proposed. > > PEP 351 is simple to explain, simple to implement and leaves things > under the control of the developer. I think there are still some issues > to be resolved, but the basic premise is exactly what I would want of a > freeze protocol. > > Tim Delaney It is true that PEP 351 is simpler. The problem is, that thanks to PEP 351 I have found a fundamental place in which the current Python design is not optimal. It is not easy to fix it, because 1) it would require a significant change to the current implementation, and 2) people are so used to the current design that it is hard to convince them that it's flawed. The fact that discussing the design is long doesn't mean that the result, for the Python programmer, would be complicated. They won't - my suggestion will cause almost no backward-compatibility problems. Think about it - it clearly means that my suggestion simply can't make Python programming *more* complicated. Please consider new-style classes. I'm sure they required a great deal of discussion, but they are simple to use -- and they are a good thing. And I think that my suggestion would make things easier, more than the new-style-classes change did. 
Features of new-style classes are an advanced topic. The questions, "why can't I change my strings?", "why do you need both a tuple and a list?" and maybe "why can't I add my list to a set?", are fundamental ones, which would not be asked at all if my suggestion were accepted.

Noam

From jcarlson at uci.edu Tue Nov 1 22:29:29 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 01 Nov 2005 13:29:29 -0800
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB75C@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB75C@au3010avexu1.global.avaya.com>
Message-ID: <20051101131830.0399.JCARLSON@uci.edu>

"Delaney, Timothy (Tim)" <tdelaney at avaya.com> wrote:
>
> Reinhold Birkenfeld wrote:
>
> > And we have solved the "map, filter and reduce are going away! Let's
> > all weep together" problem with one strike!
>
> I'm not sure if you're wildly enthusiastic, or very sarcastic.
>
> I'm not sure which I should be either ...
>
> The thought does appeal to me - especially func.partial(args). I don't
> see any advantage to func.map(args) over func(*args), and it loses
> functionality in comparison with map(func, args) (passing the function
> as a separate reference).

I was under the impression that:

    fcn.<old builtin name>(...)

would perform equivalently to the way

    <old builtin name>(fcn, ...)

does now. So all of the following would be equivalent...

    func.map(args)
    map(func, args)
    [func(i) for i in args]

Me, I still use map, so seeing it as fcn.map(...) instead of map(fcn,...) sounds good to me...though it does have the ugly rub of suggesting that None.map/filter should exist, which I'm not really happy about.

In regards to the instance __call__ method, it seems reasonable to require users to implement their own map/filter/reduce call.
- Josiah From noamraph at gmail.com Tue Nov 1 22:30:48 2005 From: noamraph at gmail.com (Noam Raphael) Date: Tue, 1 Nov 2005 23:30:48 +0200 Subject: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol) In-Reply-To: <20051101125918.0396.JCARLSON@uci.edu> References: <20051101104731.0389.JCARLSON@uci.edu> <b348a0850511011249j2d7a0645v8489746be4986d84@mail.gmail.com> <20051101125918.0396.JCARLSON@uci.edu> Message-ID: <b348a0850511011330g3b4c4edr9469940650d88b9c@mail.gmail.com> On 11/1/05, Josiah Carlson <jcarlson at uci.edu> wrote: ... > > I still consider it dead. > "If the implementation is hard to explain, it's a bad idea." It is sometimes true, but not always. It may mean two other things: 1. The one trying to explain is not talented enough. 2. The implementation is really not very simple. A hash table, used so widely in Python, is really not a simple idea, and it's not that easy to explain. > > Also, not all user-defined classes have a __dict__, and not all > user-defined classes can have arbitrary attributes added to them. > > c>>> class foo(object): > ... __slots__ = ['lst'] > ... def __init__(self): > ... self.lst = [] > ... > >>> a = foo() > >>> a.bar = 1 > Traceback (most recent call last): > File "<stdin>", line 1, in ? > AttributeError: 'foo' object has no attribute 'bar' > >>> It doesn't matter. It only means that the implementation would have to make frozen copies also of __slots__ items, when freezing a user-defined class. I am afraid that this question proves that I didn't convey my idea to you. If you like, please forgive my inability to explain it clearly, and try again to understand my idea, by going over what I wrote again, and thinking on it. You can also wait for the PEP that I intend to write. And you can also forget about it, if you don't want to bother with it - you've already helped a lot. 
Noam From guido at python.org Tue Nov 1 22:40:40 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 1 Nov 2005 14:40:40 -0700 Subject: [Python-Dev] a different kind of reduce... In-Reply-To: <001301c5df20$df865f00$153dc797@oemcomputer> References: <4366C3BB.3010407@canterbury.ac.nz> <001301c5df20$df865f00$153dc797@oemcomputer> Message-ID: <ca471dc20511011340y7db2a86dn18ea361ecc7fafbd@mail.gmail.com> > [Greg Ewing] > > Maybe ** should be defined for functions so that you > > could do things like > > > > up3levels = dirname ** 3 [Raymond Hettinger] > Hmm, using the function's own namespace is an interesting idea. It > might also be a good place to put other functionals: > > results = f.map(data) > newf = f.partial(somearg) Sorry to rain on everybody's parade, but I don't think so. There are many different types of callables. This stuff would only work if they all implemented the same API. That's unlikely to happen. A module with functions to implement the various functional operations has much more potential. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From s.percivall at chello.se Wed Nov 2 00:14:20 2005 From: s.percivall at chello.se (Simon Percivall) Date: Wed, 2 Nov 2005 00:14:20 +0100 Subject: [Python-Dev] a different kind of reduce... In-Reply-To: <ca471dc20511011340y7db2a86dn18ea361ecc7fafbd@mail.gmail.com> References: <4366C3BB.3010407@canterbury.ac.nz> <001301c5df20$df865f00$153dc797@oemcomputer> <ca471dc20511011340y7db2a86dn18ea361ecc7fafbd@mail.gmail.com> Message-ID: <6FF0C116-C55B-43CC-AA21-2A7D72E09545@chello.se> On 1 nov 2005, at 22.40, Guido van Rossum wrote: >> [Greg Ewing] >>> Maybe ** should be defined for functions so that you >>> could do things like >>> >>> up3levels = dirname ** 3 > > [Raymond Hettinger] >> Hmm, using the function's own namespace is an interesting idea. 
It >> might also be a good place to put other functionals: >> >> results = f.map(data) >> newf = f.partial(somearg) > > Sorry to rain on everybody's parade, but I don't think so. There are > many different types of callables. This stuff would only work if they > all implemented the same API. That's unlikely to happen. A module with > functions to implement the various functional operations has much more > potential. Perhaps then a decorator that uses these functions? //Simon From noamraph at gmail.com Wed Nov 2 02:21:38 2005 From: noamraph at gmail.com (Noam Raphael) Date: Wed, 2 Nov 2005 03:21:38 +0200 Subject: [Python-Dev] Why should the default hash(x) == id(x)? Message-ID: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com> Hello, While writing my PEP about unifying mutable and immutable, I came upon this: Is there a reason why the default __hash__ method returns the id of the objects? It is consistent with the default __eq__ behaviour, which is the same as "is", but: 1. It can easily become inconsistent, if someone implements __eq__ and doesn't implement __hash__. 2. It is confusing: even if someone doesn't implement __eq__, he may see that it is suitable as a key to a dict, and expect it to be found by other objects with the same "value". 3. If someone does want to associate values with objects, he can explicitly use id: dct[id(x)] = 3. This seems to better explain what he wants. Now, I just thought of a possible answer: "because he wants to store in his dict both normal objects and objects of his user-defined type, which turn out to be not equal to any other object." This leads me to another question: why should the default __eq__ method be the same as "is"? If someone wants to check if two objects are the same object, that's what the "is" operator is for. Why not make the default __eq__ really compare the objects, that is, their dicts and their slot-members? I would be happy to get answers. 
Noam From nnorwitz at gmail.com Wed Nov 2 03:23:23 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 1 Nov 2005 18:23:23 -0800 Subject: [Python-Dev] python-dev sprint at PyCon In-Reply-To: <17255.55037.609312.773649@montanaro.dyndns.org> References: <20051101143505.GE14719@rogue.amk.ca> <17255.55037.609312.773649@montanaro.dyndns.org> Message-ID: <ee2a432c0511011823h4d45f1d1pb703c284f331c52e@mail.gmail.com> On 11/1/05, skip at pobox.com <skip at pobox.com> wrote: > > This is just a comment from the peanut gallery, as it's highly unlikely I'll > be in attendance, but why not continue with the AST theme? Instead of > working on the AST branch, you could start to propagate the AST > representation around. For example, you could use the new AST code to > improve/extend/rewrite the optimization steps the compiler currently > performs. Another alternative would be to rewrite Pychecker (or Pychecker > 2) to operate from the AST representation. That's an excellent suggestion. I think I will borrow the time machine and add it to the wiki. :-) It's up on the wiki. Brett also added an item for the peephole optimizer. Everyone should add whatever they think are good ideas, even if they don't plan to attend the sprints. n From mcherm at mcherm.com Wed Nov 2 14:48:55 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 02 Nov 2005 05:48:55 -0800 Subject: [Python-Dev] apparent ruminations on mutable immutables (was:PEP 351, the freeze protocol) Message-ID: <20051102054855.st4vtcvrorogggc8@login.werra.lunarpages.com> Josiah Carlson writes: > If you make such a suggestion, I would offer that you create a new PEP, > because this discussion has gone beyond PEP 351, and has wandered into > the realm of "What other kinds of objects would be interesting to have > in a Python-like system?" Noam Raphael replies: > That is a good suggestion, and I have already started to write one. It > takes me a long time, but I hope I will manage. My thanks to both of you... 
following this conversation has been an educational experience. Just for the record, I wanted to chime in with my own opinion formed after following the full interchange.

I think Noam's proposal is very interesting. I like the idea of allowing both "frozen" (i.e., immutable) and mutable treatments for the same object. I think that C++'s version of this concept (the "const" modifier) has, on balance, been only a very limited success. I find myself convinced by Noam's claims that many common use patterns either (1) only use mutables, or (2) only use immutables, or (3) only use immutable copies temporarily and avoid mutating while doing so. Any such use patterns (particularly use (3)) would benefit from the presence of an efficient method for creating an immutable copy of a mutable object which avoids the copy where possible.

However... it seems to me that what is being described here is not Python. Python is a wonderful language, but it has certain characteristics, like extremely dynamic behavior and close integration with underlying system methods (C in CPython, Java in Jython, etc) that seem to me to make this particular feature a poor fit. That's OK... not all languages need to be Python! I would encourage you (Noam) to go ahead and explore this idea of yours. You might wind up building a new language from scratch (in which case I strongly encourage you to borrow _syntax_ from Python -- its syntax is more usable than that of any other language I know of). Or perhaps you will prefer to take CPython and make minor modifications. This kind of experimentation is allowed (open source) and even encouraged... consider Christian Tismer's Stackless -- a widely admired variant of CPython which is unlikely to ever become part of the core, but is nevertheless an important part of the vibrant Python community.
You might even be interested in starting, instead, with PyPy -- a large project which has as its main goal producing an implementation of Python which is easy to modify so as to support just this kind of experimentation. You are also welcome to submit a PEP for modifying Python (presumably CPython, Jython, Iron Python, and all other implementations). However, I think such a PEP would be rejected. Building your own thing that works well with Python would NOT be rejected. The idea is interesting, and it _may_ be sound; only an actual implementation could prove this either way.

-- Michael Chermside

From mcherm at mcherm.com Wed Nov 2 18:39:44 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 02 Nov 2005 09:39:44 -0800
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
Message-ID: <20051102093944.8jhktwq4e98g4444@login.werra.lunarpages.com>

Noam Raphael writes:
> Is there a reason why the default __hash__ method returns the id of the objects?
>
> It is consistent with the default __eq__ behaviour, which is the same
> as "is", but:
>
> 1. It can easily become inconsistent, if someone implements __eq__ and
> doesn't implement __hash__.
> 2. It is confusing: even if someone doesn't implement __eq__, he may
> see that it is suitable as a key to a dict, and expect it to be found
> by other objects with the same "value".
> 3. If someone does want to associate values with objects, he can
> explicitly use id:
> dct[id(x)] = 3. This seems to better explain what he wants.

Your first criticism is valid... it's too bad that there isn't a magical __hash__ function that automatically derives its behavior from __eq__. To your second point, I would tell this user to read the requirements. And your third point isn't a criticism, just an alternative.

But to answer your question, the reason that the default __hash__ returns the ID in CPython is just that this works.
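That default behaviour can be sketched in a few lines (hedged: in today's CPython the default hash is derived from the object's address rather than literally equal to id(), but the observable effect is the same - distinct objects are distinct dictionary keys):

```python
class Plain:
    pass  # no __eq__ or __hash__: the defaults apply

a, b = Plain(), Plain()
assert a != b                # default __eq__ is identity comparison
assert hash(a) != hash(b)    # address-derived hashes distinguish live objects
d = {a: 'first', b: 'second'}
assert d[a] == 'first' and d[b] == 'second'
```

Both defaults are O(1) and need no knowledge of the object's contents, which is the "this works" being referred to.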
In Jython, I believe that the VM provides a native hash method, and __hash__ uses that instead of returning ID. Actually, it not only works, it's also FAST (which is important... many algorithms prefer that __hash__ be O(1)). I can't imagine what you would propose instead. Keep in mind that the requirements are that __hash__ must return a value which distinguishes the object. So, for instance, two mutable objects with identical values MUST (probably) return different __hash__ values as they are distinct objects.

> This leads me to another question: why should the default __eq__
> method be the same as "is"?

Another excellent question. The answer is that this is the desired behavior of the language. Two user-defined object references are considered equal if and only if (1) they are two references to the same object, or (2) the user who designed it has specified a way to compare objects (implemented __eq__) and it returns a True value.

> Why not make the default __eq__ really compare the objects, that is,
> their dicts and their slot-members?

Short answer: not the desired behavior. Longer answer: there are three common patterns in object design. There are "value" objects, which should be considered equal if all fields are equal. There are "identity" objects which are considered equal only when they are the same object. And then there are (somewhat less common) "value" objects in which a few fields don't count -- they may be used for caching a pre-computed result for example. The default __eq__ behavior has to cater to one of these -- clearly either "value" objects or "identity" objects. Guido chose to cater to "identity" objects believing that they are actually more common in most situations. A beneficial side-effect is that the default behavior of __eq__ is QUITE simple to explain, and if the implementation is easy to explain then it may be a good idea.
-- Michael Chermside

From jcarlson at uci.edu Wed Nov 2 18:46:09 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 02 Nov 2005 09:46:09 -0800
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
Message-ID: <20051102092422.F283.JCARLSON@uci.edu>

Noam Raphael <noamraph at gmail.com> wrote:
>
> Hello,
>
> While writing my PEP about unifying mutable and immutable, I came upon this:
>
> Is there a reason why the default __hash__ method returns the id of the objects?

A quick search in the list archives via the Google search "site:mail.python.org object __hash__" says that Guido wanted to remove the default __hash__ method for object in Python 2.4, but that never actually happened.

http://www.python.org/sf/660098
http://mail.python.org/pipermail/python-dev/2003-December/041375.html

There may be more code which relies on the default behavior now, but fixing such things is easy.

> Now, I just thought of a possible answer: "because he wants to store
> in his dict both normal objects and objects of his user-defined type,
> which turn out to be not equal to any other object."

Which is a use-case, but a use-case which isn't always useful. Great for singleton/default arguments that no one should ever pass, not quite so good when you need the /original key/ (no copies) in order to get at a value in a dictionary - but that could be something that someone wants.

> This leads me to another question: why should the default __eq__
> method be the same as "is"? If someone wants to check if two objects
> are the same object, that's what the "is" operator is for. Why not
> make the default __eq__ really compare the objects, that is, their
> dicts and their slot-members?

Using 'is' makes sense when the default hash is id (and actually in certain other cases as well).
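As a side note on point 1 of the original question (implementing __eq__ without __hash__): that particular trap was eventually closed. In today's Python 3, a class that defines __eq__ and not __hash__ gets __hash__ set to None and becomes unhashable, rather than silently keeping the id-based hash. A sketch of the modern behaviour:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)
    # no __hash__ defined: Python 3 sets it to None

p = Point(1, 2)
try:
    d = {p: 'value'}
    unhashable = False
except TypeError:  # "unhashable type: 'Point'"
    unhashable = True
assert unhashable
```

In the Python 2.4 under discussion here, this class would silently have kept the id-based default hash, which is exactly the inconsistency being complained about.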
Actually comparing the contents of an object is certainly not desirable with the default hash, and probably not desirable in the general case because equality doesn't always depend on /all/ attributes of extension objects.

Explicit is better than implicit.
In the face of ambiguity, refuse the temptation to guess.

I believe the current behavior of __eq__ is more desirable than comparing contents, as this may result in undesirable behavior (recursive compares on large nested objects are now slow, which used to be fast because default methods wouldn't cause a recursive comparison at all).

As for removing the default __hash__ for objects, I'm actually hovering around a -0, if only because it is sometimes useful to generate unique keys for dictionaries (which can be done right now with object() ), and I acknowledge that it would be easy to subclass and use that instead.

- Josiah

From noamraph at gmail.com Wed Nov 2 20:04:50 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 2 Nov 2005 21:04:50 +0200
Subject: [Python-Dev] apparent ruminations on mutable immutables (was:PEP 351, the freeze protocol)
In-Reply-To: <20051102054855.st4vtcvrorogggc8@login.werra.lunarpages.com>
References: <20051102054855.st4vtcvrorogggc8@login.werra.lunarpages.com>
Message-ID: <b348a0850511021104s13298755nbe71fd877388ae26@mail.gmail.com>

Thank you for your encouraging words! I am currently working on a PEP. I am sure that writing it is a good idea, and that it would help with explaining this idea both to others and to myself. What I already wrote makes me think that it can be accomplished with no really large changes to the language - only six built-in types are affected, and there is no reason why existing code, both in C and in Python, would stop working. I hope others would be interested in the idea too, when I finish writing the PEP draft, so it would be discussed.
Trying the idea with PyPy is a really nice idea - it seems that it would be much simpler to implement, and I'm sure that learning PyPy would be interesting.

Thanks again, and I would really like to hear your comments when I post the PEP draft,
Noam

From runehol at ping.uio.no Wed Nov 2 20:18:52 2005
From: runehol at ping.uio.no (Rune Holm)
Date: Wed, 02 Nov 2005 20:18:52 +0100
Subject: [Python-Dev] Optimizations on the AST representation
Message-ID: <4369111C.5060508@ping.uio.no>

Hi,

I'm a Norwegian applied mathematics student with an interest in compilers, and I've been a long-time Python user and a python-dev lurker for some time. I'm very happy that you've integrated the AST branch into mainline, but I noticed that the AST compiler does not perform much optimization yet, so I thought I'd take a crack at it. I just submitted the following patches:

http://www.python.org/sf/1346214
http://www.python.org/sf/1346238

which add better dead code elimination and constant folding of the AST representation to Python. The constant folding patch adds two new files, Include/optimize.h and Python/optimize.c, which include a general visitor interface abstracted from existing visitor code for easy optimization pass creation. The code is in new files in order to make room for more AST optimization passes, and since Python/compile.c is already quite crowded with byte code generation and bytecode optimization. If desired, this patch could be changed to add code to Python/compile.c instead.

Further work: A limited form of type inference (e.g. as a synthesized attribute) could be very useful for further optimizations. Since Python allows operator overloading, it isn't safe to perform strength reductions on expressions with operands of unknown type, as there is no way to know if algebraic identities will hold.
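The point about operator overloading is easy to demonstrate: a rewrite such as x*2 => x+x is only valid when the operand's type guarantees the algebraic identity, and an arbitrary Python class does not. A small sketch (the class is hypothetical, purely for illustration):

```python
class Weird:
    # A type for which the usual arithmetic identities do not hold.
    def __mul__(self, n):
        return 'mul'
    def __add__(self, other):
        return 'add'

x = Weird()
# An optimizer that blindly rewrote x*2 as x+x would change the result:
assert x * 2 == 'mul'
assert x + x == 'add'
assert x * 2 != x + x
```

This is why the proposal below restricts such rewrites to expressions already inferred to be of a built-in numeric type.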
However, if we can infer from the context that expressions have the type of int, float or long, many optimizations become possible, for instance:

x**2 => x*x
x*2 => x+x
x*0 => 0
x*1 => x
4*x + 5*x => 9*x

(this last optimization actually requires common subexpression elimination for the general case, but simple cases can be performed without it) and so forth. Another interesting optimization that can potentially bring a lot of additional speed is hoisting of loop invariants, since calling Python methods involves allocating and creating a method-wrapper object. An informal test shows that optimizing

lst = []
for i in range(10):
    lst.append(i+1)

into

lst = []
tmp = lst.append
for i in range(10):
    tmp(i+1)

will yield a 10% speed increase. This operation is of course not safe with arbitrary types, but with the above type inference, we could perform this operation if the object is of a type that disallows attribute assignment, for instance lists, tuples, strings and unicode strings. Regards, Rune Holm From noamraph at gmail.com Wed Nov 2 20:26:52 2005 From: noamraph at gmail.com (Noam Raphael) Date: Wed, 2 Nov 2005 21:26:52 +0200 Subject: [Python-Dev] Why should the default hash(x) == id(x)? In-Reply-To: <20051102092422.F283.JCARLSON@uci.edu> References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com> <20051102092422.F283.JCARLSON@uci.edu> Message-ID: <b348a0850511021126p31a12e15n7a3c22eb1b69026b@mail.gmail.com> On 11/2/05, Josiah Carlson <jcarlson at uci.edu> wrote: ... > > A quick search in the list archives via google search > "site:mail.python.org object __hash__" > Says that Guido wanted to remove the default __hash__ method for object > in Python 2.4, but that never actually happened. > > http://www.python.org/sf/660098 > http://mail.python.org/pipermail/python-dev/2003-December/041375.html > > There may be more code which relies on the default behavior now, but > fixing such things is easy. > Cool!
If Guido also thinks that it should be gone, who am I to argue... (Seriously, I am in favor of removing it. I really think that it is confusing.) And if backwards-compatibility is a problem: you can, in Python 2.5, show a warning when the default __hash__ method is being called, saying that it is going to disappear in Python 2.6. [Snip - I will open a new thread about the equality operator] > As for removing the default __hash__ for objects, I'm actually hovering > around a -0, if only because it is sometimes useful to generate unique > keys for dictionaries (which can be done right now with object() ), and > I acknowledge that it would be easy to subclass and use that instead. > I can suggest a new class that will help you in the cases where you do want a dict of identities:

class ref(object):
    def __init__(self, obj):
        self._obj = obj
    def __call__(self):
        return self._obj
    def __eq__(self, other):
        return self._obj is other._obj
    def __hash__(self):
        return hash(id(self._obj))

It has the advantage over using ids as keys that it keeps a reference to the object, so the object won't be garbage-collected. It lets you make a dict of object identities just as easily as before, in a more explicit and less error-prone way. Perhaps it should become a builtin? Noam From rmunn at pobox.com Wed Nov 2 20:46:05 2005 From: rmunn at pobox.com (Robin Munn) Date: Wed, 02 Nov 2005 13:46:05 -0600 Subject: [Python-Dev] Problems with revision 4077 of new SVN repository Message-ID: <4369177D.3020000@pobox.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm trying to mirror the brand-new Python SVN repository with SVK, to better be able to track both the trunk and the various branches. Since I'm not a Python developer and don't have svn+ssh access, I'm doing so over http.
The process fails when trying to fetch revision 4077, with the following error message:

"RA layer request failed: REPORT request failed on 'projects/!svn/bc/41373/python': The REPORT request returned invalid XML in the response: XML parse error at line 7: not well-formed (invalid token) (/projects/!svn/bc/41373/python)"

The thread at http://svn.haxx.se/dev/archive-2004-07/0793.shtml suggests that the problem may lie in the commit message for revision 4077: if it has a character in the 0x01-0x1f range (which are invalid XML), then Subversion methods like http: will fail to retrieve it, while methods like file: will succeed. I haven't tried svn+ssh: since I don't have an SSH key on the server. Trying "svn log -r 4077 http://svn.python.org/projects/python/" also fails:

subversion/libsvn_ra_dav/util.c:780: (apr_err=175002)
svn: REPORT request failed on '/projects/!svn/bc/4077/python'
subversion/libsvn_ra_dav/util.c:760: (apr_err=175002)
svn: The REPORT request returned invalid XML in the response: XML parse error at line 7: not well-formed (invalid token) (/projects/!svn/bc/4077/python)

When I visit http://svn.python.org/view/python/?rev=4077, I can see the offending log message. Sure enough, there's a 0x1b character in it, between the space after "Added" and the "h" immediately before the word "Moved". This problem can be fixed by someone with root permissions on the SVN server logging in and running the following:

echo "New commit message goes here" > new-message.txt
svnadmin setlog --bypass-hooks -r 4077 /path/to/repos new-message.txt

If there are other, similar problems later in the SVN repository, I was unable to find them because the SVK mirror process consistently halts at revision 4077. If revision 4077 is fixed and I turn up other log problems, I'll report them as well.
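A scan for further log messages with this problem can be sketched roughly as follows. The helper name and the brute-force per-revision loop are illustrative, not Robin's actual script; the regex is the essential part — tab, LF and CR are legal in XML 1.0, while the other C0 control characters are not:

```python
import re
import subprocess

# Control characters that are invalid in XML 1.0 (tab, LF and CR are legal).
INVALID_XML_CTRL = re.compile('[\x01-\x08\x0b\x0c\x0e-\x1f]')

def find_bad_log_messages(repo_url, first_rev, last_rev):
    """Hypothetical helper: collect revisions whose commit message
    contains a control character that breaks XML-based svn access."""
    bad = []
    for rev in range(first_rev, last_rev + 1):
        log = subprocess.run(['svn', 'log', '-r', str(rev), repo_url],
                             capture_output=True, text=True).stdout
        if INVALID_XML_CTRL.search(log):
            bad.append(rev)
    return bad
```

Note that such a scan would have to use a file: or svn+ssh: URL, since the http: access method is exactly what the bad characters break.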
- -- Robin Munn rmunn at pobox.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (Darwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDaRd46OLMk9ZJcBQRApjAAJ9K3Y5z1q4TulqwVjmZTZb9ZgY31ACcD8RI fNFmGL2U4XaIKa2n6UUyxEA= =tEbq -----END PGP SIGNATURE----- From noamraph at gmail.com Wed Nov 2 21:36:54 2005 From: noamraph at gmail.com (Noam Raphael) Date: Wed, 2 Nov 2005 22:36:54 +0200 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? Message-ID: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com> I think it should. (I copy here messages from the thread about the default hash method.) On 11/2/05, Michael Chermside <mcherm at mcherm.com> wrote: > > Why not make the default __eq__ really compare the objects, that is, > > their dicts and their slot-members? > > Short answer: not the desired behavior. Longer answer: there are > three common patterns in object design. There are "value" objects, > which should be considered equal if all fields are equal. There are > "identity" objects which are considered equal only when they are > the same object. And then there are (somewhat less common) "value" > objects in which a few fields don't count -- they may be used for > caching a pre-computed result for example. The default __eq__ > behavior has to cater to one of these -- clearly either "value" > objects or "identity" objects. Guido chose to cater to "identity" > objects believing that they are actually more common in most > situations. A beneficial side-effect is that the default behavior > of __eq__ is QUITE simple to explain, and if the implementation is > easy to explain then it may be a good idea. > This is a very nice observation. I wish to explain why I think that the default __eq__ should compare values, not identities. 1. If you want to compare identities, you can always use "is". 
There is currently no easy way to compare your user-defined classes by value, in case they happen to be "value objects", in Michael's terminology - you have to compare every single member. (Comparing the __dict__ attributes is ugly, and will not always work.) If the default were to compare the objects by value, and they happen to be "identity objects", you can always do:

def __eq__(self, other):
    return self is other

2. I believe that, contrary to what Michael said, "value objects" are more common than "identity objects", at least when talking about user-defined classes, and especially when talking about simple user-defined classes, where the defaults are most important, since the writer wouldn't care to define all the appropriate protocols. (this was a long sentence) Can you give examples of common "identity objects"? I believe that they usually deal with some input/output, that is, with things that interact with the environment (files, for example). I believe almost all "algorithmic" classes are "value objects". And I think that usually, comparison based on value will give the correct result for "identity objects" too, since if they do I/O, they will usually hold a reference to an I/O object, like a file, which is an "identity object" by itself. This means that the comparison will compare those objects, and return false, since the I/O objects they hold are not the same one. 3. I think that value-based comparison is also quite easy to explain: user-defined classes combine functions with a data structure. In Python, the "data structure" is simply member names which reference other objects. The default, value-based comparison checks whether two objects have the same member names, and whether they reference equal (by value) objects; if so, it returns True. I think that explaining this is not harder than explaining the current dict comparison.
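The rule described in point 3 can be sketched in a few lines (an illustrative completion, not code from the thread; the function name is made up):

```python
def value_eq(a, b):
    """Sketch of default value-based equality: same type, same member
    names, and every corresponding member compares equal by value."""
    if a is b:                      # identity still implies equality
        return True
    if type(a) is not type(b):
        return False
    d_a = getattr(a, '__dict__', {})
    d_b = getattr(b, '__dict__', {})
    if set(d_a) != set(d_b):        # same member names
        return False
    return all(d_a[name] == d_b[name] for name in d_a)

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
```

Two Point(1, 2) instances then compare equal under value_eq even though they are distinct objects, while Point(1, 2) and Point(1, 3) do not.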
Now, for Josiah's reply: On 11/2/05, Josiah Carlson <jcarlson at uci.edu> wrote: > > This leads me to another question: why should the default __eq__ > > method be the same as "is"? If someone wants to check if two objects > > are the same object, that's what the "is" operator is for. Why not > > make the default __eq__ really compare the objects, that is, their > > dicts and their slot-members? > > Using 'is' makes sense when the default hash is id (and actually in > certain other cases as well). Actually comparing the contents of an > object is certainly not desirable with the default hash, and probably > not desirable in the general case because equality doesn't always > depend on /all/ attributes of extension objects. > > Explicit is better than implicit. > In the face of ambiguity, refuse the temptation to guess. > I hope that the default hash would stop being id, as Josiah showed that Guido decided, so let's not discuss it. Now, about the good point that sometimes the state doesn't depend on all the attributes. Right. But the current default doesn't compare them well either - you have no escape from writing an equality operator yourself. And I think this is not the common case. I think that the meaning of "in the face of ambiguity, refuse the temptation to guess" is that you should not write code that changes its behaviour according to what the user will do, based on your guess as to what he meant. This is not the case - the value-based comparison is strictly defined. It may just not be what the user would want - and in most cases, I think it will be. "Explicit is better than implicit" says only "better". Identity-based comparison is just as implicit as value-based comparison.
(I want to add that there is a simple way to support value-based comparison when some members don't count, by writing a metaclass that will check if your class has a member like __non_state_members__ = ["_calculated_hash", "debug_member"] and, if so, would not compare those members in the default equality-testing method. I would say that this can even be made the behavior of the default type.) > I believe the current behavior of __eq__ is more desirable than > comparing contents, as this may result in undesirable behavior > (recursive compares on large nested objects are now slow, which used to > be fast because default methods wouldn't cause a recursive comparison at > all). But if the default method doesn't do what you want, it doesn't matter how fast it is. Remember that it's very easy to make recursive comparisons, by comparing lists for example, and it hasn't disturbed anyone. To summarize, I think that value-based equality testing would usually be what you want, and currently implementing it is a bit of a pain. Concerning backwards-compatibility: show a warning in Python 2.5 when the default equality test is used, and change it in Python 2.6. Comments, please! Thanks, Noam From noamraph at gmail.com Wed Nov 2 22:11:25 2005 From: noamraph at gmail.com (Noam Raphael) Date: Wed, 2 Nov 2005 23:11:25 +0200 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com> References: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com> Message-ID: <b348a0850511021311t456b8f6bg61763a5ea20497af@mail.gmail.com> I've looked for classes in my /usr/lib/python2.4 directory.
I won't go over all the 7346 classes that were found there, but let's see:

"identity objects" that will continue to work because they contain other "identity objects"
========================
SocketServer, and everything which inherits from it (like HTTPServer)
Queue
csv (contains _csv objects)

"value objects" that would probably gain a meaningful equality operator
============================================
StringIO
ConfigParser
markupbase, HTMLParser
HexBin, BinHex
cgi.FieldStorage
AST Nodes

others
======
Cookie - inherits its __eq__ method from dict.

I'll stop here. I was not strictly scientific, because I chose classes whose purpose I thought I could guess easily, and perhaps discarded classes that didn't look interesting to me. But I didn't have any bad intention when choosing the classes. I have seen no class whose equality operator the change would damage. I have seen quite a lot of classes which didn't define an equality operator, and for which a value-based comparison would be the right way to compare them. I'm getting more convinced in my opinion. Noam From noamraph at gmail.com Wed Nov 2 22:18:58 2005 From: noamraph at gmail.com (Noam Raphael) Date: Wed, 2 Nov 2005 23:18:58 +0200 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <001101c5dff0$462fa5c0$153dc797@oemcomputer> References: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com> <001101c5dff0$462fa5c0$153dc797@oemcomputer> Message-ID: <b348a0850511021318s1fbdd0a5u4605deb4464b0dc8@mail.gmail.com> On 11/2/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote: > > Should the default equality operator compare values instead of > identities? > > No. Look back into last year's python-dev postings where we agreed that > identity would always imply equality. There were a number of practical > reasons. Also, there are a number of places in CPython where that > assumption is implicit.
> Perhaps you've meant something else, or I didn't understand? Identity implying equality holds in value-based comparison too. If the default __eq__ operator compared by value, I would expect it to do something like:

def __eq__(self, other):
    if self is other:
        return True
    if type(self) is not type(other):
        return False

(then compare the __dict__ and any __slots__, and if they are all ==, return True.) Noam From martin at v.loewis.de Wed Nov 2 23:10:12 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 02 Nov 2005 23:10:12 +0100 Subject: [Python-Dev] Problems with revision 4077 of new SVN repository In-Reply-To: <4369177D.3020000@pobox.com> References: <4369177D.3020000@pobox.com> Message-ID: <43693944.3090803@v.loewis.de> Robin Munn wrote:

> echo "New commit message goes here" > new-message.txt
> svnadmin setlog --bypass-hooks -r 4077 /path/to/repos new-message.txt

Thanks for pointing that out, and for giving those instructions. I now corrected the log message. Regards, Martin From rmunn at pobox.com Thu Nov 3 00:14:50 2005 From: rmunn at pobox.com (Robin Munn) Date: Wed, 02 Nov 2005 17:14:50 -0600 Subject: [Python-Dev] Problems with revision 4077 of new SVN repository In-Reply-To: <43693944.3090803@v.loewis.de> References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de> Message-ID: <4369486A.8090107@pobox.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Martin v. Löwis wrote:

> Robin Munn wrote:
>
>> echo "New commit message goes here" > new-message.txt
>> svnadmin setlog --bypass-hooks -r 4077 /path/to/repos new-message.txt
>
> Thanks for pointing that out, and for giving those instructions.
> I now corrected the log message.

Revision 4077 is fine now. However, the same problem exists in revision 4284, which has a 0x01 character before the word "add".
Same solution:

echo "New commit message goes here" > new-message.txt
svnadmin setlog --bypass-hooks -r 4284 /path/to/repos new-message.txt

If there are two errors of the same type within about 200 revisions, there may be more. I'm currently running "svn log" on every revision in the Python SVN repository to see if I find any more errors of this type, so that I don't have to hunt them down one-by-one by rerunning SVK. I'll post my findings when I'm done. - -- Robin Munn rmunn at pobox.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (Darwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDaUho6OLMk9ZJcBQRAg5eAJ9cJTPKX69DhXJyoT/cDV5GmZlC3QCfRj/E wCix8IYU8xbh5/Ibnpa+kg4= =+jLR -----END PGP SIGNATURE----- From greg.ewing at canterbury.ac.nz Thu Nov 3 01:39:44 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 03 Nov 2005 13:39:44 +1300 Subject: [Python-Dev] Why should the default hash(x) == id(x)? In-Reply-To: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com> References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com> Message-ID: <43695C50.5070600@canterbury.ac.nz> Noam Raphael wrote: > 3. If someone does want to associate values with objects, he can > explicitly use id: > dct[id(x)] = 3. This is fragile. Once all references to x are dropped, it is possible for another object to be created having the same id that x used to have. The dict now unintentionally references the new object. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From jcarlson at uci.edu Thu Nov 3 02:16:40 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 02 Nov 2005 17:16:40 -0800 Subject: [Python-Dev] Should the default equality operator compare values instead of identities?
In-Reply-To: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com> References: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com> Message-ID: <20051102125437.F290.JCARLSON@uci.edu> Noam Raphael <noamraph at gmail.com> wrote: > On 11/2/05, Josiah Carlson <jcarlson at uci.edu> wrote: > > I believe the current behavior of __eq__ is more desirable than > > comparing contents, as this may result in undesirable behavior > > (recursive compares on large nested objects are now slow, which used to > > be fast because default methods wouldn't cause a recursive comparison at > > all). > > But if the default method doesn't do what you want, it doesn't matter > how fast it is. Remember that it's very easy to make recursive > comparisons, by comparing lists for example, and it hasn't disturbed > anyone. Right, but lists (dicts, tuples, etc.) are defined as containers, and their comparison operation is defined on their contents. Objects are not defined as containers in the general case, so defining comparisons based on their contents (as opposed to identity) is just one of two possible assumptions to be made. I personally like the current behavior, and I see no /compelling/ reason to change it. You obviously feel so compelled for the behavior to change that you are willing to express your desires. How about you do something more productive and produce a patch which implements the changes you want, verify that it passes the tests in the standard library, and then post it on SourceForge. If someone is similarly compelled and agrees with you (so far I've not seen any public support for your proposal from any of the core developers), the discussion will restart, and it will be decided (not by you or me). > To summarize, I think that value-based equality testing would usually > be what you want, and currently implementing it is a bit of a pain. Actually, implementing value-based equality testing, when you have a finite set of values you want to test, is quite easy.
def __eq__(self, other):
    for i in self.__cmp_eq__:
        if getattr(self, i) != getattr(other, i):
            return False
    return True

With a simple metaclass that discovers all of those values automatically, and/or your own protocol for exclusion, you are done. Remember, not all 5-line functions should become builtin/default behavior, and this implementation shows that it is not a significant burden for you (or anyone else) to implement this in your own custom library. - Josiah P.S. One thing that you should remember is that even if your patch is accepted, and even if this is desirable, Python 2.5 is supposed to be released sometime next year (spring/summer?), and because it is a backwards-incompatible change, it would need at least 2.6-2.7 before it becomes the default behavior without a __future__ import, which is another 3-4 years down the line. I understand you are passionate, really I do (you should see some of my proposals), but by the time these things get around to getting into mainline Python, there are high odds that you probably won't care about them much anymore (I've come to feel that way myself about many of my proposals), and I think it is a good idea to attempt to balance - when it comes to Python - "Now is better than never." and "Although never is often better than *right* now." Removing __hash__, changing __eq__, and trying to get in copy-on-write freezing (which is really copy-and-cache freezing) all read to me like "We gotta do this now!", which certainly isn't helping the proposal. From rmunn at pobox.com Thu Nov 3 03:13:33 2005 From: rmunn at pobox.com (Robin Munn) Date: Wed, 02 Nov 2005 20:13:33 -0600 Subject: [Python-Dev] Problems with revision 4077 of new SVN repository In-Reply-To: <4369486A.8090107@pobox.com> References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de> <4369486A.8090107@pobox.com> Message-ID: <4369724D.8060001@pobox.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Robin Munn wrote: > Revision 4077 is fine now.
However, the same problem exists in revision
> 4284, which has a 0x01 character before the word "add". Same solution:
>
> echo "New commit message goes here" > new-message.txt
> svnadmin setlog --bypass-hooks -r 4284 /path/to/repos new-message.txt
>
> If there are two errors of the same type within about 200 revisions,
> there may be more. I'm currently running "svn log" on every revision in
> the Python SVN repository to see if I find any more errors of this type,
> so that I don't have to hunt them down one-by-one by rerunning SVK. I'll
> post my findings when I'm done.

My script is up to revision 17500 with no further problems found; I now believe that 4077 and 4284 were isolated cases. Once 4284 is fixed, it should now be possible to SVK-mirror the entire repository. - -- Robin Munn rmunn at pobox.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (Darwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDaXJF6OLMk9ZJcBQRAtZpAJ9iE1SlRJiQQOdIuBFuvjmQG3gshACgl9/A vbsGD0bX3NCirQC5qtxdLYo= =sgk/ -----END PGP SIGNATURE----- From martin at v.loewis.de Thu Nov 3 08:57:30 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 03 Nov 2005 08:57:30 +0100 Subject: [Python-Dev] Problems with revision 4077 of new SVN repository In-Reply-To: <4369724D.8060001@pobox.com> References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de> <4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com> Message-ID: <4369C2EA.6030407@v.loewis.de> Robin Munn wrote:

>> Revision 4077 is fine now. However, the same problem exists in revision
>> 4284, which has a 0x01 character before the word "add". Same solution:

I now have fixed that as well.
Regards, Martin From rmunn at pobox.com Thu Nov 3 09:07:43 2005 From: rmunn at pobox.com (Robin Munn) Date: Thu, 03 Nov 2005 02:07:43 -0600 Subject: [Python-Dev] Problems with revision 4077 of new SVN repository In-Reply-To: <4369C2EA.6030407@v.loewis.de> References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de> <4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com> <4369C2EA.6030407@v.loewis.de> Message-ID: <4369C54F.3050803@pobox.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Martin v. Löwis wrote: > Robin Munn wrote: > >>> Revision 4077 is fine now. However, the same problem exists in revision >>> 4284, which has a 0x01 character before the word "add". Same solution: > > > I now have fixed that as well. > > Regards, > Martin And my script just finished running, with no further errors of this type found. So doing an SVK mirror of the repository should work now, barring any further surprises. I'm starting the SVK sync now; we'll see what happens. Thanks for fixing these! - -- Robin Munn rmunn at pobox.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (Darwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDacVN6OLMk9ZJcBQRApUbAJ9+Ly5vPr8HRmoRbwJ3po4IWe8PBwCePTdm XNx8HGqPvs7fwahHuJSogMw= =a6Nc -----END PGP SIGNATURE----- From mwh at python.net Thu Nov 3 13:48:06 2005 From: mwh at python.net (Michael Hudson) Date: Thu, 03 Nov 2005 12:48:06 +0000 Subject: [Python-Dev] PyPy 0.8.0 is released! Message-ID: <2mbr113njt.fsf@starship.python.net> pypy-0.8.0: Translatable compiler/parser and some more speed ============================================================== The PyPy development team has been busy working and we've now packaged our latest improvements, completed work and new experiments as version 0.8.0, our third public release. The highlights of this third release of PyPy are: - Translatable parser and AST compiler.
PyPy now integrates its own compiler based on Python's own 'compiler' package but with a number of fixes and code simplifications in order to get it translated with the rest of PyPy. This makes using the translated pypy interactively much more pleasant, as compilation is considerably faster than in 0.7.0. - Some speed enhancements. Translated PyPy is now about 10 times faster than 0.7 but still 10-20 times slower than CPython on pystones and other benchmarks. At the same time, language compliancy has been slightly increased compared to 0.7 which had already reached major CPython compliancy goals. - Some experimental features are now translatable. Since 0.6.0, PyPy shipped with an experimental Object Space (the part of PyPy implementing Python object operations and manipulation) implementing lazily computed objects, the "Thunk" object space. With 0.8.0 this object space can also be translated, preserving its feature additions. What is PyPy (about)? ------------------------------------------------ PyPy is a MIT-licensed research-oriented reimplementation of Python written in Python itself, flexible and easy to experiment with. It translates itself to lower-level languages. Our goals are to target a large variety of platforms, small and large, by providing a compilation toolsuite that can produce custom Python versions. Platform, memory and threading models are to become aspects of the translation process - as opposed to encoding low-level details into a language implementation itself. Eventually, dynamic optimization techniques - implemented as another translation aspect - should become robust against language changes. Note that PyPy is mainly a research and development project and does not by itself focus on getting a production-ready Python implementation, although we do hope and expect it to become a viable contender in that area sometime next year. PyPy is partially funded as a research project under the European Union's IST programme. Where to start?
----------------------------- Getting started: http://codespeak.net/pypy/dist/pypy/doc/getting-started.html PyPy Documentation: http://codespeak.net/pypy/dist/pypy/doc/ PyPy Homepage: http://codespeak.net/pypy/ The interpreter and object model implementations shipped with the 0.8 version can run on their own and implement the core language features of Python as of CPython 2.4. However, we still do not recommend using PyPy for anything else than for education, playing or research purposes. Ongoing work and near term goals --------------------------------- At the last sprint in Paris we started exploring the new directions of our work, in terms of extending and optimising PyPy further. We started to scratch the surface of Just-In-Time compiler related work, which we still expect will be the major source of our future speed improvements, and some successful work has been done on the support needed for stackless-like features. This release also includes snapshots, in preliminary or embryonic form, of the following interesting but not yet completed sub-projects: - The OOtyper, a RTyper variation for higher-level backends (Squeak, ...) - A JavaScript backend - A limited (PPC) assembler backend (this is related to the JIT) - some bits for a socket module PyPy has been developed during approximately 16 coding sprints across Europe and the US. It continues to be a very dynamically and incrementally evolving project with many of these one-week workshops to follow. PyPy has been a community effort from the start and it would not have got that far without the coding and feedback support from numerous people. Please feel free to give feedback and raise questions.
contact points: http://codespeak.net/pypy/dist/pypy/doc/contact.html have fun, the pypy team, (Armin Rigo, Samuele Pedroni, Holger Krekel, Christian Tismer, Carl Friedrich Bolz, Michael Hudson, and many others: http://codespeak.net/pypy/dist/pypy/doc/contributor.html) PyPy development and activities happen as an open source project and with the support of a consortium partially funded by a two year European Union IST research grant. The full partners of that consortium are: Heinrich-Heine University (Germany), AB Strakt (Sweden) merlinux GmbH (Germany), tismerysoft GmbH (Germany) Logilab Paris (France), DFKI GmbH (Germany) ChangeMaker (Sweden), Impara (Germany) From theller at python.net Thu Nov 3 21:01:35 2005 From: theller at python.net (Thomas Heller) Date: Thu, 03 Nov 2005 21:01:35 +0100 Subject: [Python-Dev] PYTHON_API_VERSION Message-ID: <br11xzz4.fsf@python.net> Shouldn't PYTHON_API_VERSION be different between 2.3 and 2.4? It is 1012 in both versions. I tried to detect whether PyTuple_Pack is supported, which was added in 2.4. Or is this only to detect changed APIs, and not added APIs? Thomas From Jack.Jansen at cwi.nl Thu Nov 3 22:29:37 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu, 3 Nov 2005 22:29:37 +0100 Subject: [Python-Dev] Proposal: can we have a python-dev-announce mailing list? Message-ID: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> As people may have noticed (or possibly not:-) I've been rather inactive on python-dev the last year or so, due to being completely inundated with other work. Too bad that I've missed all the interesting discussions on Python 3000, but I'm bound to catch up some time later this year:-). BUT: what I also missed are all the important announcements, such as new releases, the switch to svn, and a couple more (I think). I know I would be much helped with a moderated python-dev-announce mailing list, which would be only low-volume, time-critical announcements for people developing Python.
Even during times when I am actively following python-dev it would be handy to have important announcements coming in in a separate mailbox instead of buried under design discussions and such... -- Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From phd at mail2.phd.pp.ru Thu Nov 3 22:36:59 2005 From: phd at mail2.phd.pp.ru (Oleg Broytmann) Date: Fri, 4 Nov 2005 00:36:59 +0300 Subject: [Python-Dev] Proposal: can we have a python-dev-announce mailing list? In-Reply-To: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> Message-ID: <20051103213659.GA26132@phd.pp.ru> On Thu, Nov 03, 2005 at 10:29:37PM +0100, Jack Jansen wrote: > I know I would be much helped with a moderated python-dev-announce > mailing list, which would be only low-volume http://www.google.com/search?q=python-dev+summary+site%3Amail.python.org Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From jcarlson at uci.edu Thu Nov 3 22:52:25 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 03 Nov 2005 13:52:25 -0800 Subject: [Python-Dev] Proposal: can we have a python-dev-announce mailing list? In-Reply-To: <20051103213659.GA26132@phd.pp.ru> References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> <20051103213659.GA26132@phd.pp.ru> Message-ID: <20051103134856.BFB2.JCARLSON@uci.edu> Even when they are on the ball, the summaries generally occur one week after the discussion/execution happens. That doesn't do much for the 'time-critical' aspect which, I would imagine, is about as important as the 'low-volume' aspect.
- Josiah Oleg Broytmann <phd at oper.phd.pp.ru> wrote: > > On Thu, Nov 03, 2005 at 10:29:37PM +0100, Jack Jansen wrote: > > I know I would be much helped with a moderated python-dev-announce > > mailing list, which would be only low-volume > > http://www.google.com/search?q=python-dev+summary+site%3Amail.python.org > > Oleg. > -- > Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru > Programmers don't die, they just GOSUB without RETURN. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jcarlson%40uci.edu From Jack.Jansen at cwi.nl Thu Nov 3 22:51:14 2005 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Thu, 3 Nov 2005 22:51:14 +0100 Subject: [Python-Dev] Proposal: can we have a python-dev-announce mailing list? In-Reply-To: <20051103213659.GA26132@phd.pp.ru> References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> <20051103213659.GA26132@phd.pp.ru> Message-ID: <5630A610-FB3B-4359-8E86-39CBF074CF0D@cwi.nl> On 3-nov-2005, at 22:36, Oleg Broytmann wrote: > On Thu, Nov 03, 2005 at 10:29:37PM +0100, Jack Jansen wrote: > >> I know I would be much helped with a moderated python-dev-announce >> mailing list, which would be only low-volume >> > > http://www.google.com/search?q=python-dev+summary+site% > 3Amail.python.org Hmm. I wouldn't mind if it was push instead of pull, I wouldn't mind if it was in the right order, and I wouldn't mind if it was more concise:-) But: I'll just wait to see whether more people chime in that they'd like this, or that I'm alone... -- Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From skip at pobox.com Thu Nov 3 22:55:23 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 3 Nov 2005 15:55:23 -0600 Subject: [Python-Dev] Proposal: can we have a python-dev-announce mailing list?
In-Reply-To: <20051103213659.GA26132@phd.pp.ru> References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> <20051103213659.GA26132@phd.pp.ru> Message-ID: <17258.34635.582411.34526@montanaro.dyndns.org> >> I know I would be much helped with a moderated python-dev-announce >> mailing list, which would be only low-volume Oleg> http://www.google.com/search?q=python-dev+summary+site%3Amail.python.org That works up to a point; however, the python-dev summaries only come out once every couple of weeks, so probably aren't going to catch important stuff that comes and goes with less than a two-week lifespan. Alerts that machines are going down for maintenance fall into this category. Also, I think the cvs->svn switch probably didn't take more than a few days once the ball got rolling. I think Martin announced the demise of the SF repository around 20 October, with a cutover date of 26 October. Skip From martin at v.loewis.de Thu Nov 3 23:08:55 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 03 Nov 2005 23:08:55 +0100 Subject: [Python-Dev] PYTHON_API_VERSION In-Reply-To: <br11xzz4.fsf@python.net> References: <br11xzz4.fsf@python.net> Message-ID: <436A8A77.4040306@v.loewis.de> Thomas Heller wrote: > Shouldn't PYTHON_API_VERSION be different between 2.3 and 2.4? > It is 1012 in both versions. > > I tried to detect whether PyTuple_Pack is supported, which was added in > 2.4. Or is this only to detect changed apis, and not added apis? It's meant to detect changes that can break existing binary modules. In most cases, this would be changed structs. Whether such changes happened between 2.3 and 2.4, I don't know. If you want to ask whether a certain function is present, either use autoconf, or check for the Python (hex) version where it was first introduced.
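[A sketch of the version check described here: at the C level the equivalent test is the preprocessor condition `#if PY_VERSION_HEX >= 0x02040000`; from Python the same value is exposed as `sys.hexversion`. The 2.4 threshold for PyTuple_Pack is taken from the question above; the hex layout is 0xMMmmppSN (major, minor, micro, release level/serial).]

```python
# Detect PyTuple_Pack by the version in which it first appeared (2.4),
# rather than by PYTHON_API_VERSION. sys.hexversion mirrors the
# C-level PY_VERSION_HEX constant.
import sys

HAVE_PYTUPLE_PACK = sys.hexversion >= 0x02040000

print(HAVE_PYTUPLE_PACK)
```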
Regards, Martin From martin at v.loewis.de Thu Nov 3 23:16:42 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 03 Nov 2005 23:16:42 +0100 Subject: [Python-Dev] Proposal: can we have a python-dev-announce mailing list? In-Reply-To: <5630A610-FB3B-4359-8E86-39CBF074CF0D@cwi.nl> References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> <20051103213659.GA26132@phd.pp.ru> <5630A610-FB3B-4359-8E86-39CBF074CF0D@cwi.nl> Message-ID: <436A8C4A.8090908@v.loewis.de> Jack Jansen wrote: > Hmm. I wouldn't mind if it was push in stead of pull, I wouldn't mind > if it was in the right order, and I wouldn't mind if itwas more > concise:-) > > But: I'll just wait to see whether more people chime in that they'd > like this, or that I'm alone... I'm -1 on such a list. If it existed, people could complain "why wasn't this announced properly". So the "blame" would be on people who failed to give proper notice, instead of on the people who did not care enough to follow the entire discussion. More specifically, I'm sure I would have forgotten to post about the svn switchover to python-dev-announce, just as I failed to post to comp.lang.python.announce. This is all volunteer work. Regards, Martin From t-meyer at ihug.co.nz Fri Nov 4 00:41:12 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Fri, 4 Nov 2005 12:41:12 +1300 Subject: [Python-Dev] Proposal: can we have a python-dev-announce mailing list? In-Reply-To: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> References: <4407AF2E-9F9F-4D75-B890-052438D20468@cwi.nl> Message-ID: <27F1C5EB-9459-4F1F-A43F-9898941D66CF@ihug.co.nz> > I know I would be much helped with a moderated python-dev-announce > mailing list, which would be only low-volume, time-critical > announcements for people developing Python. Even during times when I > am actively following python-dev it would be handy to have important > announcements coming in in a separate mailbox in stead of buried > under design discussions and such... 
Firstly, my apologies for the current delay in summaries, which exacerbates this problem (although others are right when they say that things sometimes happen too fast even for on-time summaries). A while back there was talk about a mailing list for PEP changes and the solution was instead to use the "topic" feature of mailman, essentially creating a subset-mailing-list. Would something like this be feasible for this? (I don't really know enough about how the topic feature can be used to know if it is workable or not). I presume that this would still need some sort of action from the poster (e.g. including a tag somewhere), but it would probably be easier for people to remember to do that than cross-post to another list entirely. =Tony.Meyer From rmunn at pobox.com Fri Nov 4 01:17:32 2005 From: rmunn at pobox.com (Robin Munn) Date: Thu, 03 Nov 2005 18:17:32 -0600 Subject: [Python-Dev] No more problems with new SVN repository In-Reply-To: <4369C54F.3050803@pobox.com> References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de> <4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com> <4369C2EA.6030407@v.loewis.de> <4369C54F.3050803@pobox.com> Message-ID: <436AA89C.6050401@pobox.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Robin Munn wrote: > So doing an SVK mirror of the repository should work now, barring > any further surprises. I'm starting the SVK sync now; we'll see what > happens. Confirmed; the SVK mirror took about 18 hours, but it completed successfully with no further problems. Again, thanks for fixing the issues so quickly.
- -- Robin Munn rmunn at pobox.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (Darwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDaqiZ6OLMk9ZJcBQRAjGuAJwLmbrxBgrHYUb/7LOvjq89GfKrWACghGgn pvuMT5edAfMw3OAoZf5mJiw= =2i88 -----END PGP SIGNATURE----- From guido at python.org Fri Nov 4 01:21:15 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 3 Nov 2005 16:21:15 -0800 Subject: [Python-Dev] No more problems with new SVN repository In-Reply-To: <436AA89C.6050401@pobox.com> References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de> <4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com> <4369C2EA.6030407@v.loewis.de> <4369C54F.3050803@pobox.com> <436AA89C.6050401@pobox.com> Message-ID: <ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com> I have a question after this exhilarating exchange. Is there a way to prevent this kind of thing in the future, e.g. by removing or rejecting change log messages with characters that are considered invalid in XML? (Or should perhaps the fix be to suppress or quote these characters somehow in XML?) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Nov 4 07:31:03 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 04 Nov 2005 07:31:03 +0100 Subject: [Python-Dev] No more problems with new SVN repository In-Reply-To: <ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com> References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de> <4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com> <4369C2EA.6030407@v.loewis.de> <4369C54F.3050803@pobox.com> <436AA89C.6050401@pobox.com> <ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com> Message-ID: <436B0027.6010808@v.loewis.de> Guido van Rossum wrote: > I have a question after this exhilarating exchange. > > Is there a way to prevent this kind of thing in the future, e.g. 
by > removing or rejecting change log messages with characters that are > considered invalid in XML? I don't think it can happen again. Without testing, I would hope subversion rejects log messages that contain "random" control characters (if it doesn't, I should report that as a bug). The characters are in there because of the CVS conversion (that might be a bug in cvs2svn, which should have replaced them perhaps). It only happened in very old log messages - so perhaps even CVS doesn't allow them anymore. In XML 1.0, there is a lot of confusion about including control characters in text. In XML 1.1 it was clarified that you can include them, but only through character references. So in the future, subversion might be able to transmit such log messages in well-formed webdav. Regards, Martin From fredrik at pythonware.com Fri Nov 4 10:05:40 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 4 Nov 2005 10:05:40 +0100 Subject: [Python-Dev] Adding examples to PEP 263 Message-ID: <dkf895$4p9$1@sea.gmane.org> the runtime warning you get when you use non-ascii characters in python source code points the poor user to this page: http://www.python.org/peps/pep-0263.html which tells the user to add a # -*- coding: <encoding name> -*- to the source, and then provides a more detailed syntax description as a RE pattern. to help people that didn't grow up with emacs, and don't speak fluent RE, and/or prefer to skim documentation, it would be quite helpful if the page also contained a few examples; e.g. # -*- coding: utf-8 -*- # -*- coding: iso-8859-1 -*- can anyone with SVN write access perhaps add this? (I'd probably add a note to the top of the page for anyone who arrives there via a Python error message, which summarizes the pep and provides an example or two; abstracts and rationales are nice, but if you're just a plain user, a "do this; here's how it works; further discussion below" style is a bit more practical...)
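[A complete minimal file of the kind asked for above — an illustration of my own devising, not text from the PEP: the declaration must sit on line 1 or 2 and must match the encoding the file is actually saved in.]

```python
# -*- coding: utf-8 -*-
# Minimal PEP 263 example: with the declaration above, non-ASCII
# string literals are legal in the source (the file itself must
# really be saved as UTF-8).
greeting = "naïve café"
print(greeting)
```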
</F> From mal at egenix.com Fri Nov 4 10:27:43 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 04 Nov 2005 10:27:43 +0100 Subject: [Python-Dev] Adding examples to PEP 263 In-Reply-To: <dkf895$4p9$1@sea.gmane.org> References: <dkf895$4p9$1@sea.gmane.org> Message-ID: <436B298F.3050803@egenix.com> Fredrik Lundh wrote: > the runtime warning you get when you use non-ascii characters in > python source code points the poor user to this page: > > http://www.python.org/peps/pep-0263.html > > which tells the user to add a > > # -*- coding: <encoding name> -*- > > to the source, and then provides a more detailed syntax description > as a RE pattern. to help people that didn't grow up with emacs, and > don't speak fluent RE, and/or prefer to skim documentation, it would > be a quite helpful if the page also contained a few examples; e.g. > > # -*- coding: utf-8 -*- > # -*- coding: iso-8859-1 -*- > > can anyone with SVN write access perhaps add this? Good point. I'll add some examples. > (I'd probably add a note to the top of the page for anyone who arrives > there via a Python error message, which summarizes the pep and provides > an example or two; abstracts and rationales are nice, but if you're just a > plain user, a "do this; here's how it works; further discussion below" style > is a bit more practical...) The PEP isn't all that long, so I don't think a summary would help. However, we might want to point the user to a different URL in the error message, e.g. a Wiki page with more user-friendly content. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 04 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ 2005-10-17: Released mxODBC.Zope.DA 1.0.9 http://zope.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From noamraph at gmail.com Fri Nov 4 13:02:31 2005 From: noamraph at gmail.com (Noam Raphael) Date: Fri, 4 Nov 2005 14:02:31 +0200 Subject: [Python-Dev] Why should the default hash(x) == id(x)? In-Reply-To: <43695C50.5070600@canterbury.ac.nz> References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com> <43695C50.5070600@canterbury.ac.nz> Message-ID: <b348a0850511040402v1edadf2ehc962166c329538df@mail.gmail.com> On 11/3/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > > 3. If someone does want to associate values with objects, he can > > explicitly use id: > > dct[id(x)] = 3. > > This is fragile. Once all references to x are dropped, > it is possible for another object to be created having > the same id that x used to have. The dict now > unintentionally references the new object. > You are right. Please see the simple "ref" class that I wrote in my previous post, which solves this problem. Noam From steve at holdenweb.com Fri Nov 4 13:03:44 2005 From: steve at holdenweb.com (Steve Holden) Date: Fri, 04 Nov 2005 12:03:44 +0000 Subject: [Python-Dev] Adding examples to PEP 263 In-Reply-To: <436B298F.3050803@egenix.com> References: <dkf895$4p9$1@sea.gmane.org> <436B298F.3050803@egenix.com> Message-ID: <dkfin0$1qa$2@sea.gmane.org> M.-A. Lemburg wrote: > Fredrik Lundh wrote: > >>the runtime warning you get when you use non-ascii characters in >>python source code points the poor user to this page: >> >> http://www.python.org/peps/pep-0263.html >> >>which tells the user to add a >> >> # -*- coding: <encoding name> -*- >> >>to the source, and then provides a more detailed syntax description >>as a RE pattern. 
to help people that didn't grow up with emacs, and >>don't speak fluent RE, and/or prefer to skim documentation, it would >>be a quite helpful if the page also contained a few examples; e.g. >> >># -*- coding: utf-8 -*- >># -*- coding: iso-8859-1 -*- >> >>can anyone with SVN write access perhaps add this? > > > Good point. I'll add some examples. > > >>(I'd probably add a note to the top of the page for anyone who arrives >>there via a Python error message, which summarizes the pep and provides >>an example or two; abstracts and rationales are nice, but if you're just a >>plain user, a "do this; here's how it works; further discussion below" style >>is a bit more practical...) > > > The PEP isn't all that long, so I don't think a summary would > help. However, we might want to point the user to a different > URL in the error message, e.g. a Wiki page with more user-friendly > content. > Under NO circumstances should a Wiki page be used as the destination for a link in a runtime error message. If the page happens to be spammed when the user follows the link they'll wonder why the error message is pointing to a page full of links to hot babes, or whatever. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From pinard at iro.umontreal.ca Fri Nov 4 16:32:24 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Fri, 4 Nov 2005 10:32:24 -0500 Subject: [Python-Dev] No more problems with new SVN repository In-Reply-To: <ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com> References: <4369177D.3020000@pobox.com> <43693944.3090803@v.loewis.de> <4369486A.8090107@pobox.com> <4369724D.8060001@pobox.com> <4369C2EA.6030407@v.loewis.de> <4369C54F.3050803@pobox.com> <436AA89C.6050401@pobox.com> <ca471dc20511031621s53fd3a5fha0e56b924974babb@mail.gmail.com> Message-ID: <20051104153224.GA22469@alcyon.progiciels-bpi.ca> [Guido van Rossum] > Is there a way to prevent this kind of thing in the future, e.g. by > removing or rejecting change log messages with characters that are > considered invalid in XML? Suppose TOP is the top of the Subversion repository. The easiest way is providing a TOP/hooks/pre-commit script. If the script exits with non-zero status, usually with some clear diagnostic on stderr, the whole commit aborts, and the diagnostic is shown to the committing user. The tricky part is getting the tentative log message from within the script. This is done by popening "svnlook log -t ARG2 ARG1", where ARG1 and ARG2 are arguments given to the pre-commit script. -- François Pinard http://pinard.progiciels-bpi.ca From dave at boost-consulting.com Fri Nov 4 21:09:39 2005 From: dave at boost-consulting.com (David Abrahams) Date: Fri, 04 Nov 2005 15:09:39 -0500 Subject: [Python-Dev] Plea to distribute debugging lib Message-ID: <uek5wdvjw.fsf@boost-consulting.com> For years, Boost.Python has been doing some hacks to work around the fact that a Windows Python distro doesn't include the debug build of the library. http://www.boost.org/libs/python/doc/building.html#variants explains.
We wanted to make it reasonably convenient for Windows developers (and our distributed testers) to work with a debug build of the Boost.Python library and of their own code. Having to download the Python source and build the debug DLL was deemed unacceptable. Well, those hacks have run out of road. VC++8 now detects that some of its headers have been #included with _DEBUG and some without, and it will refuse to build anything when it does. We have several new hacks to work around that detection, and I think we _might_ be able to get away with them for one more release. But it's really time to do it right. MS is recommending that we (Boost) start distributing a debug build of the Python DLL with Boost, but Boost really seems like the wrong place to host such a thing. Is there any way Python.org can make a debug build more accessible? Thanks, Dave -- Dave Abrahams Boost Consulting www.boost-consulting.com From python at discworld.dyndns.org Fri Nov 4 21:28:25 2005 From: python at discworld.dyndns.org (Charles Cazabon) Date: Fri, 4 Nov 2005 14:28:25 -0600 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <uek5wdvjw.fsf@boost-consulting.com> References: <uek5wdvjw.fsf@boost-consulting.com> Message-ID: <20051104202824.GA19678@discworld.dyndns.org> David Abrahams <dave at boost-consulting.com> wrote: > > For years, Boost.Python has been doing some hacks to work around the fact > that a Windows Python distro doesn't include the debug build of the library. [...] > Having to download the Python source and build the debug DLL was deemed > unacceptable. I'm curious: why was this "deemed unacceptable"? Python's license is about as liberal as it gets, and the code is almost startlingly easy to compile -- easier than any other similarly-sized codebase I've had to work with. 
Charles -- ----------------------------------------------------------------------- Charles Cazabon <python at discworld.dyndns.org> GPL'ed software available at: http://pyropus.ca/software/ ----------------------------------------------------------------------- From guido at python.org Fri Nov 4 21:33:44 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 4 Nov 2005 12:33:44 -0800 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <20051104202824.GA19678@discworld.dyndns.org> References: <uek5wdvjw.fsf@boost-consulting.com> <20051104202824.GA19678@discworld.dyndns.org> Message-ID: <ca471dc20511041233i25b3c33wbfa3e3ad4a356702@mail.gmail.com> I vaguely recall that there were problems with distributing the debug version of the MS runtime. Anyway, why can't you do this yourself for all Boost users? It's all volunteer time, you know... --Guido On 11/4/05, Charles Cazabon <python at discworld.dyndns.org> wrote: > David Abrahams <dave at boost-consulting.com> wrote: > > > > For years, Boost.Python has been doing some hacks to work around the fact > > that a Windows Python distro doesn't include the debug build of the library. > [...] > > Having to download the Python source and build the debug DLL was deemed > > unacceptable. > > I'm curious: why was this "deemed unacceptable"? Python's license is about as > liberal as it gets, and the code is almost startlingly easy to compile -- > easier than any other similarly-sized codebase I've had to work with. 
> > Charles > -- > ----------------------------------------------------------------------- > Charles Cazabon <python at discworld.dyndns.org> > GPL'ed software available at: http://pyropus.ca/software/ > ----------------------------------------------------------------------- > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Fri Nov 4 21:37:37 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 4 Nov 2005 15:37:37 -0500 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <uek5wdvjw.fsf@boost-consulting.com> References: <uek5wdvjw.fsf@boost-consulting.com> Message-ID: <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> [David Abrahams] > For years, Boost.Python has been doing some hacks to work around the > fact that a Windows Python distro doesn't include the debug build of > the library. > ... > MS is recommending that we (Boost) start distributing a debug build of the > Python DLL with Boost, but Boost really seems like the wrong place to host > such a thing. Is there any way Python.org can make a debug build more > accessible? Possibly. I don't do this anymore (this == build the Python Windows installers), but I used to. For some time I also made available a zip file containing various debug-build bits, captured at the time the official installer was built. We didn't (and I'm sure we still don't) want to include them in the main installer, because they bloat its size for something most users truly do not want. I got sick of building the debug zip file, and stopped doing that too. No two users wanted the same set of stuff in it, so it grew to contain the union of everything everyone wanted, and then people complained that it was "too big". 
This is one of the few times in your Uncle Timmy's life that he said "so screw it -- do it yourself, you whiny baby whiners with your incessant baby whining you " ;-) Based on that sure-to-be universal reaction from anyone who signs up for this, I'd say the best thing you could do to help it along is to define precisely (a) what an acceptable distribution format is; and, (b) what exactly it should contain. That, and being nice to Martin, would go a long way. From theller at python.net Fri Nov 4 21:47:40 2005 From: theller at python.net (Thomas Heller) Date: Fri, 04 Nov 2005 21:47:40 +0100 Subject: [Python-Dev] Plea to distribute debugging lib References: <uek5wdvjw.fsf@boost-consulting.com> <20051104202824.GA19678@discworld.dyndns.org> <ca471dc20511041233i25b3c33wbfa3e3ad4a356702@mail.gmail.com> Message-ID: <irv86syb.fsf@python.net> Guido van Rossum <guido at python.org> writes: > I vaguely recall that there were problems with distributing the debug > version of the MS runtime. Right: the debug runtime dlls are not distributable. > Anyway, why can't you do this yourself for all Boost users? It's all > volunteer time, you know... Doesn't any boost user need a C compiler anyway, so it should not really be a problem to compile Python? Anyway, AFAIK, the activestate distribution contains Python debug dlls.
Thomas From dave at boost-consulting.com Fri Nov 4 21:58:11 2005 From: dave at boost-consulting.com (David Abrahams) Date: Fri, 04 Nov 2005 15:58:11 -0500 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> (Tim Peters's message of "Fri, 4 Nov 2005 15:37:37 -0500") References: <uek5wdvjw.fsf@boost-consulting.com> <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> Message-ID: <u8xw4dtb0.fsf@boost-consulting.com> Tim Peters <tim.peters at gmail.com> writes: > [David Abrahams] >> For years, Boost.Python has been doing some hacks to work around the >> fact that a Windows Python distro doesn't include the debug build of >> the library. >> ... >> MS is recommending that we (Boost) start distributing a debug build of the >> Python DLL with Boost, but Boost really seems like the wrong place to host >> such a thing. Is there any way Python.org can make a debug build more >> accessible? > > Possibly. I don't do this anymore (this == build the Python Windows > installers), but I used to. For some time I also made available a zip > file containing various debug-build bits, captured at the time the > official installer was built. We didn't (and I'm sure we still don't) > want to include them in the main installer, because they bloat its > size for something most users truly do not want. > > I got sick of building the debug zip file, and stopped doing that too. > No two users wanted the same set of stuff in it, so it grew to > contain the union of everything everyone wanted, and then people > complained that it was "too big". 
This is one of the few times in > your Uncle Timmy's life that he said "so screw it -- do it yourself, > you whiny baby whiners with your incessant baby whining you " ;-) > > Based on that sure-to-be universal reaction from anyone who signs up > for this, I'd say the best thing you could do to help it along is to > define precisely (a) what an acceptable distribution format is; and, > (b) what exactly it should contain. Who knows what the whiny babies will accept? That said, I think people would be happy with a .zip file containing whatever is built by selecting the debug build in the VS project and asking it to build everything. (**) > That, and being nice to Martin, I'm always as nice as Davidly possible to Martin! > would go a long way. My fingers and toes are crossed. Thanks! (**) If you could build the ability to download the debugging binaries into the regular installer, that would be the shiznit, but I don't dare ask for it. ;-) -- Dave Abrahams Boost Consulting www.boost-consulting.com From dave at boost-consulting.com Fri Nov 4 23:25:55 2005 From: dave at boost-consulting.com (David Abrahams) Date: Fri, 04 Nov 2005 17:25:55 -0500 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <436BD111.5080808@rubikon.pl> (Bronek Kozicki's message of "Fri, 04 Nov 2005 21:22:25 +0000") References: <uek5wdvjw.fsf@boost-consulting.com> <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> <u8xw4dtb0.fsf@boost-consulting.com> <436BD111.5080808@rubikon.pl> Message-ID: <ufyqccaoc.fsf@boost-consulting.com> Bronek Kozicki <brok at rubikon.pl> writes: > David Abrahams wrote: >> Who knows what the whiny babies will accept? That said, I think >> people would be happy with a .zip file containing whatever is built by >> selecting the debug build in the VS project and asking it to build >> everything. 
(**) > > Just to clarify - what we are asking for is library built with _DEBUG > and no BOOST_DEBUG_PYTHON, that is the one compatible with default > Python distribution. Bronek, I know you're trying to help, but I'm sure that's not making anything clearer for these people. They don't know anything about BOOST_DEBUG_PYTHON and would never have cause to define it. -- Dave Abrahams Boost Consulting www.boost-consulting.com From martin at v.loewis.de Fri Nov 4 23:29:56 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 04 Nov 2005 23:29:56 +0100 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <u8xw4dtb0.fsf@boost-consulting.com> References: <uek5wdvjw.fsf@boost-consulting.com> <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> <u8xw4dtb0.fsf@boost-consulting.com> Message-ID: <436BE0E4.6@v.loewis.de> David Abrahams wrote: > Who knows what the whiny babies will accept? That said, I think > people would be happy with a .zip file containing whatever is built by > selecting the debug build in the VS project and asking it to build > everything. (**) I would go a step further than Tim: Send me (*) a patch to msi.py (which is used to build the distribution) that picks up the files and packages them in the desired way, and I will include the files it outputs in the official distribution. This is how the libpython24.a got in (and this is also the way in which it will get out again). In the patch, preferably state whom to contact for the specific feature, as I won't be able to answer questions about it. I don't have a personal need for the feature (I do have debug builds myself, and it takes only 10 minutes or so to create them), so I won't even have a way to test whether the feature works correctly. 
Regards, Martin (*) that is, sf.net/projects/python From eyal.lotem at gmail.com Fri Nov 4 23:33:29 2005 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sat, 5 Nov 2005 00:33:29 +0200 Subject: [Python-Dev] Class decorators vs metaclasses Message-ID: <b64f365b0511041433m773361d9x202d57ac83534aa8@mail.gmail.com> I have a few claims, some unrelated, and some built on top of each other. I would like to hear your responses as to which are convincing, which aren't, and why. I think that if these claims are true, Python 3000 should change quite a bit. A. Metaclass code is black magic and few understand how it works, while decorator code is mostly understandable, even by non-gurus. B. One of Decorators' most powerful features is that they can be mixed-and-matched, which makes them very powerful for many purposes, while metaclasses are exclusive, and only one can be used. This is especially problematic as some classes may assume their subclasses must use their respective metaclasses. This means classdecorators are strictly more powerful than metaclasses, without cumbersome conversions between metaclass mechanisms and decorator mechanisms. C. Interesting uses of classdecorators are allowing super-calling without redundantly specifying the name of your class, or your superclass. D. Python seems to be incrementally adding power to the core language with these features, which is great, but it also causes significant overlapping of language features, which I believe is something to avoid when possible. If metaclasses are replaced with class decorators, then suddenly inheritance becomes a redundant feature. E. If inheritance is a redundant feature, it can be removed and an "inherit" class decorator can be used. This could also reduce all the __mro__ clutter from the language along with other complexities, into alternate implementations of the inherit classdecorator.
From martin at v.loewis.de Fri Nov 4 23:44:53 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 04 Nov 2005 23:44:53 +0100 Subject: [Python-Dev] Plea to distribute debugging lib In-Reply-To: <ufyqccaoc.fsf@boost-consulting.com> References: <uek5wdvjw.fsf@boost-consulting.com> <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> <u8xw4dtb0.fsf@boost-consulting.com> <436BD111.5080808@rubikon.pl> <ufyqccaoc.fsf@boost-consulting.com> Message-ID: <436BE465.4000100@v.loewis.de> David Abrahams wrote: >>Just to clarify - what we are asking for is library built with _DEBUG >>and no BOOST_DEBUG_PYTHON, that is the one compatible with default >>Python distribution. > > > I know you're trying to help, but I'm sure that's not making anything > clearer for these people. They don't know anything about > BOOST_DEBUG_PYTHON and would never have cause to define it. > Actually, I'm truly puzzled. Why would a library that has _DEBUG defined be compatible with the standard distribution? Doesn't _DEBUG cause linkage with msvcr71d.dll? In addition (more correctly: for that reason), the debug build causes python2x_d.dll to be build, instead of python2x.dll, which definitely is incompatible with the standard DLL. It not only uses a different C library; it also causes Py_DEBUG to be defined, which in turn creates a different memory layout for PyObject. So in the end, I would assume you are requesting what you call a debug-python, i.e. one that (in your system) *has* BOOST_DEBUG_PYTHON defined. 
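[A run-time symptom of the Py_DEBUG difference described above, sketched as an aside: debug builds expose `sys.gettotalrefcount`, which release builds lack, so its presence is a common — if informal — way to tell which kind of interpreter is running.]

```python
# Informal check for a Py_DEBUG interpreter (e.g. one linked against
# python2x_d.dll on Windows): only debug builds define
# sys.gettotalrefcount. This is an observable side effect of the
# build, not an official API for the purpose.
import sys

def is_debug_build():
    return hasattr(sys, "gettotalrefcount")

print(is_debug_build())
```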
Regards, Martin

From aleaxit at gmail.com Sat Nov 5 00:02:42 2005
From: aleaxit at gmail.com (Alex Martelli)
Date: Fri, 4 Nov 2005 15:02:42 -0800
Subject: [Python-Dev] Class decorators vs metaclasses
In-Reply-To: <b64f365b0511041433m773361d9x202d57ac83534aa8@mail.gmail.com>
References: <b64f365b0511041433m773361d9x202d57ac83534aa8@mail.gmail.com>
Message-ID: <e8a0972d0511041502i5de78d12ua23b1eb79970e120@mail.gmail.com>

On 11/4/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> I have a few claims, some unrelated, and some built on top of each
> other. I would like to hear your responses as to which are
> convincing, which aren't, and why. I think that if these claims are
> true, Python 3000 should change quite a bit.
>
> A. Metaclass code is black magic and few understand how it works,
> while decorator code is mostly understandable, even by non-gurus.

I disagree. I've held many presentations and classes on both subjects, and while people may INITIALLY feel like metaclasses are black magic, as soon as I've explained it the fear dissipates. It all boils down to understanding that:

    class Name(Ba, Ses): <<body>>

means

    Name = suitable_metaclass('Name', (Ba, Ses), <<dict-built-by-body>>)

which isn't any harder than understanding that

    @foo(bar)
    def baz(args): ...

means

    def baz(args): ...
    baz = foo(bar)(baz)

> B. One of decorators' most powerful features is that they can be
> mixed and matched, which makes them very powerful for many purposes,
> while metaclasses are exclusive, and only one can be used. This is

Wrong. You can mix as many metaclasses as you wish, as long as they're properly coded for multiple inheritance (using super, etc) -- just inherit from them all. This is reasonably easy to automate (see the last recipe in the 2nd ed of the Python Cookbook), too.

> especially problematic as some classes may assume their subclasses
> must use their respective metaclasses.
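Alex's two desugarings can be checked directly. A minimal sketch (modern Python 3 syntax rather than the Python 2 of the thread; `Base`, `foo`, and `baz` are made-up names for illustration):

```python
class Base:
    pass

# The class statement...
class Name(Base):
    x = 1

# ...is sugar for calling the (suitable) metaclass directly:
Name2 = type('Name', (Base,), {'x': 1})

assert Name.__name__ == Name2.__name__
assert Name.__bases__ == Name2.__bases__ == (Base,)
assert Name.x == Name2.x == 1

# Likewise, @foo(bar) above a def is sugar for rebinding afterwards:
def foo(bar):
    def deco(func):
        def wrapped(*args):
            return bar + func(*args)
        return wrapped
    return deco

@foo(10)
def baz(x):
    return x

def baz2(x):
    return x
baz2 = foo(10)(baz2)   # the spelled-out form of the @ syntax

assert baz(1) == baz2(1) == 11
```

Both pairs end up indistinguishable, which is the equivalence Alex is appealing to.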
> This means class decorators are strictly more powerful than
> metaclasses, without cumbersome conversions between metaclass
> mechanisms and decorator mechanisms.

The assertion that class decorators are strictly more powerful than custom metaclasses is simply false. How would you design class decorator XXX so that

    @XXX
    class Foo: ...

allows 'print Foo' to emit 'this is beautiful class Foo', for example? The str(Foo) implicit in print calls type(Foo).__str__(Foo), so you do need a custom type(Foo) -- which is all that is meant by "a custom metaclass"... a custom type whose instances are classes, that's all.

> C. Interesting uses of class decorators are allowing super-calling
> without redundantly specifying the name of your class, or your
> superclass.

Can you give an example?

> D. Python seems to be incrementally adding power to the core language
> with these features, which is great, but it also causes significant
> overlapping of language features, which I believe is something to
> avoid when possible. If metaclasses are replaced with class
> decorators, then suddenly inheritance becomes a redundant feature.

And how do you customize what "print Foo" emits, as above?

> E. If inheritance is a redundant feature, it can be removed and an
> "inherit" class decorator can be used. This could also reduce all the
> __mro__ clutter from the language along with other complexities, into
> alternate implementations of the inherit class decorator.

How do you propose to get exactly the same effects as inheritance (affect every attribute lookup on a class and its instances) without inheritance? Essentially, inheritance is automated delegation obtained by having getattr(foo, 'bar') look through a chain of objects (essentially the __mro__) until an attribute named 'bar' is found in one of those objects, plus a few minor but useful side effects, e.g. on isinstance and issubclass, and the catching of exceptions in try/except statements.
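Alex's 'print Foo' point can be made concrete with a short sketch (Python 3 `metaclass=` spelling; the thread's Python 2 would use `__metaclass__`):

```python
# What "print Foo" shows is controlled by type(Foo).__str__,
# so only a custom metaclass can change it.

class BeautifulMeta(type):
    def __str__(cls):
        return 'this is beautiful class %s' % cls.__name__

class Foo(metaclass=BeautifulMeta):
    pass

print(Foo)   # this is beautiful class Foo

# A class decorator runs *after* the class object is built and can only
# replace or mutate Foo; str(Foo) still dispatches to type(Foo).__str__,
# which a decorator cannot intercept without itself changing type(Foo).
assert str(Foo) == 'this is beautiful class Foo'
```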
How would any mechanism allowing all of these uses NOT be inheritance?

Alex

From dave at boost-consulting.com Sat Nov 5 00:04:29 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 04 Nov 2005 18:04:29 -0500
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <436BE0E4.6@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6wis's?= message of "Fri, 04 Nov 2005 23:29:56 +0100")
References: <uek5wdvjw.fsf@boost-consulting.com> <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> <u8xw4dtb0.fsf@boost-consulting.com> <436BE0E4.6@v.loewis.de>
Message-ID: <uy844aubm.fsf@boost-consulting.com>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> David Abrahams wrote:
>> Who knows what the whiny babies will accept? That said, I think
>> people would be happy with a .zip file containing whatever is built by
>> selecting the debug build in the VS project and asking it to build
>> everything. (**)
>
> I would go a step further than Tim: Send me (*) a patch to msi.py (which
> is used to build the distribution) that picks up the files and packages
> them in the desired way, and I will include the files it outputs
> in the official distribution. This is how the libpython24.a got in
> (and this is also the way in which it will get out again).

Not to look a gift horse in the mouth, but won't that cause the problem that Tim was worried about, i.e. a bloated Python installer?

> In the patch, preferably state whom to contact for the specific feature,
> as I won't be able to answer questions about it.
>
> I don't have a personal need for the feature (I do have debug builds
> myself, and it takes only 10 minutes or so to create them),

I know, me too. It's easy enough once you get started building Python. I just think it's too big a hump for many people.

> so I won't even have a way to test whether the feature works
> correctly.
>
> Regards,
> Martin
>
> (*) that is, sf.net/projects/python

I s'pose that means, "put it in the patches tracker."
grateful-ly y'rs,

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From dave at boost-consulting.com Sat Nov 5 00:13:39 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Fri, 04 Nov 2005 18:13:39 -0500
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <436BE465.4000100@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?= =?iso-8859-1?Q?wis's?= message of "Fri, 04 Nov 2005 23:44:53 +0100")
References: <uek5wdvjw.fsf@boost-consulting.com> <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> <u8xw4dtb0.fsf@boost-consulting.com> <436BD111.5080808@rubikon.pl> <ufyqccaoc.fsf@boost-consulting.com> <436BE465.4000100@v.loewis.de>
Message-ID: <ur79watwc.fsf@boost-consulting.com>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> David Abrahams wrote:
>>> Just to clarify - what we are asking for is a library built with
>>> _DEBUG and no BOOST_DEBUG_PYTHON, that is the one compatible with the
>>> default Python distribution.
>>
>> I know you're trying to help, but I'm sure that's not making anything
>> clearer for these people. They don't know anything about
>> BOOST_DEBUG_PYTHON and would never have cause to define it.
>
> Actually, I'm truly puzzled.

I was afraid this would happen. Really, you're better off ignoring Bronek's message.

> Why would a library that has _DEBUG defined
> be compatible with the standard distribution? Doesn't _DEBUG cause
> linkage with msvcr71d.dll?

Unless you do the hacks that I mentioned in my opening message. Read http://www.boost.org/libs/python/doc/building.html#variants for details.

> In addition (more correctly: for that reason), the debug build causes
> python2x_d.dll to be built, instead of python2x.dll, which definitely
> is incompatible with the standard DLL. It not only uses a different
> C library; it also causes Py_DEBUG to be defined, which in turn creates
> a different memory layout for PyObject.

Exactly.

> So in the end, I would assume you are requesting what you call a
> debug-python, i.e.
> one that (in your system) *has*
> BOOST_DEBUG_PYTHON defined.

What I am requesting is the good old python2x_d.dll and any associated extension modules that get built as part of the Python distro, so I can stop doing the hack, drop BOOST_DEBUG_PYTHON, and tell people to use _DEBUG in the usual way.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From martin at v.loewis.de Sat Nov 5 00:21:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 05 Nov 2005 00:21:05 +0100
Subject: [Python-Dev] Plea to distribute debugging lib
In-Reply-To: <uy844aubm.fsf@boost-consulting.com>
References: <uek5wdvjw.fsf@boost-consulting.com> <1f7befae0511041237j2156306fhe5b90053a7e027f8@mail.gmail.com> <u8xw4dtb0.fsf@boost-consulting.com> <436BE0E4.6@v.loewis.de> <uy844aubm.fsf@boost-consulting.com>
Message-ID: <436BECE1.3090202@v.loewis.de>

David Abrahams wrote:
>> I would go a step further than Tim: Send me (*) a patch to msi.py (which
>> is used to build the distribution) that picks up the files and packages
>> them in the desired way, and I will include the files it outputs
>> in the official distribution. This is how the libpython24.a got in
>> (and this is also the way in which it will get out again).
>
> Not to look a gift horse in the mouth, but won't that cause the
> problem that Tim was worried about, i.e. a bloated Python installer?

Not if done properly: it would, of course, *not* add the desired files into the msi file, but create a separate file. It is pure Python code, and called msi.py because that's its main function. It does several other things, though (such as creating a .cab file and a .a file); it could well create another zip file.

As to how it would work: preferably by invoking the Python zip library, but invoking external programs to package up everything might be acceptable as well (assuming I'm told what these tools are, and assuming it falls back to doing nothing if the tools are not available).
The separate file would have a name similar to the MSI file, so that the debug file has the same version number as the MSI file.

> I s'pose that means, "put it in the patches tracker."

Exactly.

Regards,
Martin

From eyal.lotem at gmail.com Sat Nov 5 12:27:55 2005
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Sat, 5 Nov 2005 13:27:55 +0200
Subject: [Python-Dev] Class decorators vs metaclasses
In-Reply-To: <e8a0972d0511041502i5de78d12ua23b1eb79970e120@mail.gmail.com>
References: <b64f365b0511041433m773361d9x202d57ac83534aa8@mail.gmail.com> <e8a0972d0511041502i5de78d12ua23b1eb79970e120@mail.gmail.com>
Message-ID: <b64f365b0511050327g40eaebecp90f0a60c6440b660@mail.gmail.com>

On 11/5/05, Alex Martelli <aleaxit at gmail.com> wrote:
> On 11/4/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> > I have a few claims, some unrelated, and some built on top of each
> > other. I would like to hear your responses as to which are
> > convincing, which aren't, and why. I think that if these claims are
> > true, Python 3000 should change quite a bit.
> >
> > A. Metaclass code is black magic and few understand how it works,
> > while decorator code is mostly understandable, even by non-gurus.
>
> I disagree. I've held many presentations and classes on both
> subjects, and while people may INITIALLY feel like metaclasses are
> black magic, as soon as I've explained it the fear dissipates. It all
> boils down to understanding that:
>
>     class Name(Ba, Ses): <<body>>
>
> means
>
>     Name = suitable_metaclass('Name', (Ba, Ses), <<dict-built-by-body>>)
>
> which isn't any harder than understanding that
>
>     @foo(bar)
>     def baz(args): ...
>
> means
>
>     def baz(args): ...
>     baz = foo(bar)(baz)

I disagree again. My experience is that metaclass code is very hard to understand. Especially when it starts doing non-trivial things, such as using a base metaclass class that is parametrized by metaclass attributes in its subclasses.
Lookups of attributes in the base metaclass methods are mind-boggling (is it searching them in the base metaclass, the subclass, the instance [which is the class]?). The same code would be much easier to understand with class decorators.

> > B. One of decorators' most powerful features is that they can be
> > mixed and matched, which makes them very powerful for many purposes,
> > while metaclasses are exclusive, and only one can be used. This is
>
> Wrong. You can mix as many metaclasses as you wish, as long as
> they're properly coded for multiple inheritance (using super, etc) --
> just inherit from them all. This is reasonably easy to automate (see
> the last recipe in the 2nd ed of the Python Cookbook), too.

Multiple inheritance is an awful way to mix class functionalities, though. Let's take a simpler example. Most UT frameworks use a TestCase base class they inherit from to implement setUp, tearDown, and then inherit from it again to implement the test itself. I argue this is a weak approach, because then mixing/matching setups is difficult. You would argue this is not the case, because of the ability to multiply-inherit from test cases, but how easy is the equivalent of:

@with_setup('blah')
@with_other_setup('bleh')
def my_test():
    # the blah setup and bleh other setup are up and usable here,
    # and will be "torn down" at the end of this test

The equivalent of this requires a lot more work and violates DRY. Creating a specific function to multiply inherit from TestCases is a possible solution, but it is much more conceptually complex, and needs to be reimplemented in the next scenario (metaclasses, for example).

> > especially problematic as some classes may assume their subclasses
> > must use their respective metaclasses. This means class decorators are
> > strictly more powerful than metaclasses, without cumbersome
> > conversions between metaclass mechanisms and decorator mechanisms.
> The assertion that class decorators are strictly more powerful than
> custom metaclasses is simply false. How would you design
> class decorator XXX so that
>
>     @XXX
>     class Foo: ...
>
> allows 'print Foo' to emit 'this is beautiful class Foo', for example?
> The str(Foo) implicit in print calls type(Foo).__str__(Foo), so you
> do need a custom type(Foo) -- which is all that is meant by "a custom
> metaclass"... a custom type whose instances are classes, that's all.

I would argue that this is not such a useful feature, as in that case you can simply use a factory object instead of a class. If this feature remains, that's fine, but the fact that it allows for a weak form of "decoration" of classes should not kill the concept of class decorators. The only reason for using metaclasses rather than factory objects, in my experience, was that references to class objects are considered different from references to factories (by pickle and deepcopy, and maybe others), and that can be a useful feature. This feature can be implemented by more readable means, though.

> > C. Interesting uses of class decorators are allowing super-calling
> > without redundantly specifying the name of your class, or your
> > superclass.
>
> Can you give an example?

@anotherclassdecorator
@supercallerclass
class MyClass(object):
    @supercaller
    def my_method(self, supcaller, x, y, z):
        ...
        result = supcaller.my_method(x, y, z)
        ...

It could be nice to remove the need for decorating the class, and only decorate the methods, but the method decorators get a function object, not a method object, so it's more difficult (perhaps portably impossible?) to do this.

Note that "__metaclass__ = superclasscaller" could also work, but then combining "anotherclassdecorator" would require a lot more code at worst, or a complex mechanism to combine metaclasses via multiple inheritance at best.

> > D.
> > Python seems to be incrementally adding power to the core language
> > with these features, which is great, but it also causes significant
> > overlapping of language features, which I believe is something to
> > avoid when possible. If metaclasses are replaced with class
> > decorators, then suddenly inheritance becomes a redundant feature.
>
> And how do you customize what "print Foo" emits, as above?

As I said, "Foo" can be a factory object rather than a class object.

> > E. If inheritance is a redundant feature, it can be removed and an
> > "inherit" class decorator can be used. This could also reduce all the
> > __mro__ clutter from the language along with other complexities, into
> > alternate implementations of the inherit class decorator.
>
> How do you propose to get exactly the same effects as inheritance
> (affect every attribute lookup on a class and its instances) without
> inheritance? Essentially, inheritance is automated delegation
> obtained by having getattr(foo, 'bar') look through a chain of objects
> (essentially the __mro__) until an attribute named 'bar' is found in
> one of those objects, plus a few minor but useful side effects, e.g.
> on isinstance and issubclass, and the catching of exceptions in
> try/except statements. How would any mechanism allowing all of these
> uses NOT be inheritance?

One possibility is to copy the superclass attributes into subclasses. Another is to allow the class decorator to specify getattr/setattr/delattr's implementation without modifying the metaclass [admittedly this is a difficult/problematic solution]. In any case, the inheritance class decorator could specify special attributes in the class (it can remain compatible with __bases__) for isinstance/try to work.

I agree that implementing inheritance this way is problematic [I'm convinced], but don't let that determine the fate of class decorators in general, which are more useful than metaclasses in many (most?) scenarios.
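Eyal's hypothetical `supercaller`/`supercallerclass` names from earlier in the thread can be given a working sketch (Python 3; these helpers are invented for illustration, not an existing API): the class decorator rewraps marked methods so they receive a bound super proxy without the method ever naming the class or its superclass.

```python
import functools

def supercaller(method):
    # just mark the method; the class decorator does the rewrapping
    method._wants_super = True
    return method

def supercallerclass(cls):
    # hypothetical sketch: hand each marked method a bound proxy to the
    # superclass, so neither class name is repeated in the method body
    for name, attr in list(vars(cls).items()):
        if getattr(attr, '_wants_super', False):
            def make_wrapper(method):
                @functools.wraps(method)
                def wrapper(self, *args, **kwargs):
                    sup = super(cls, self)   # closes over the decorated class
                    return method(self, sup, *args, **kwargs)
                return wrapper
            setattr(cls, name, make_wrapper(attr))
    return cls

class Base:
    def my_method(self, x):
        return x + 1

@supercallerclass
class MyClass(Base):
    @supercaller
    def my_method(self, supcaller, x):
        return supcaller.my_method(x) * 10   # super call, no class names

print(MyClass().my_method(4))   # 50: Base gives 5, then * 10
```

This is exactly the kind of behavior that needs no custom metaclass, which is the half of the argument both sides seem to agree on.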
From pinard at iro.umontreal.ca Sat Nov 5 17:29:49 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Sat, 5 Nov 2005 11:29:49 -0500
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <ca471dc20510311124s1c3aeffeya879056477ea515d@mail.gmail.com>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com> <007d01c5dc00$738da2e0$b62dc797@oemcomputer> <bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com> <4362DD15.4080606@gmail.com> <bbaeab100510282037m5bad1f67kb4d5cb7171ac163b@mail.gmail.com> <ca471dc20510311124s1c3aeffeya879056477ea515d@mail.gmail.com>
Message-ID: <20051105162949.GA8992@phenix.sram.qc.ca>

[Guido van Rossum]
> I've made a final pass over PEP 352, mostly fixing the __str__,
> __unicode__ and __repr__ methods to behave more reasonably. I'm all
> for accepting it now. Does anybody see any last-minute show-stopping
> problems with it?

I did not follow the thread, so maybe I'm out of order, be kind with me.

After having read PEP 352, it is not crystal clear whether in:

    try:
        ...
    except:
        ...

the "except:" will mean "except BaseException:" or "except Exception:". I would expect the first, but the text beginning the section titled "Exception Hierarchy Changes" suggests it could mean the second, without really stating it.

Let me argue that "except BaseException:" is preferable. First, because there is no reason to load a bare "except:" by anything but a very simple and clean meaning, like the real base of the exception hierarchy. Second, as a bare "except:" is not considered good practice on average, it would be counter-productive trying to figure out ways to make it more frequently _usable_.
-- 
François Pinard http://pinard.progiciels-bpi.ca

From guido at python.org Sat Nov 5 18:46:54 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 5 Nov 2005 09:46:54 -0800
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <20051105162949.GA8992@phenix.sram.qc.ca>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com> <007d01c5dc00$738da2e0$b62dc797@oemcomputer> <bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com> <4362DD15.4080606@gmail.com> <bbaeab100510282037m5bad1f67kb4d5cb7171ac163b@mail.gmail.com> <ca471dc20510311124s1c3aeffeya879056477ea515d@mail.gmail.com> <20051105162949.GA8992@phenix.sram.qc.ca>
Message-ID: <ca471dc20511050946k38da87d7o1e676df61a9b9a78@mail.gmail.com>

> [Guido van Rossum]
> > I've made a final pass over PEP 352, mostly fixing the __str__,
> > __unicode__ and __repr__ methods to behave more reasonably. I'm all
> > for accepting it now. Does anybody see any last-minute show-stopping
> > problems with it?

[François]
> I did not follow the thread, so maybe I'm out of order, be kind with me.
>
> After having read PEP 352, it is not crystal clear whether in:
>
>     try:
>         ...
>     except:
>         ...
>
> the "except:" will mean "except BaseException:" or "except Exception:".
> I would expect the first, but the text beginning the section titled
> "Exception Hierarchy Changes" suggests it could mean the second, without
> really stating it.

This is probably a leftover from PEP 348, which did have a change for bare 'except:' in mind. PEP 352 doesn't propose to change its meaning, and if there are words that suggest this, they should be removed. Until Python 3.0, it will not change its meaning from what it is now; this is because until then, it is still *possible* (though it will become deprecated behavior) to raise string exceptions or classes that don't inherit from BaseException.

> Let me argue that "except BaseException:" is preferable.
First, because > there is no reason to load a bare "except:" by anything but a very > simple and clean meaning, like the real base of the exception hierarchy. > Second, as a bare "except:" is not considered good practice on average, > it would be counter-productive trying to figure out ways to make it more > frequently _usable_. What bare 'except:' will mean in Python 3.0, and whether it is even allowed at all, is up for discussion -- it will have to be a new PEP. Personally, I think bare 'except:' should be removed from the language in Python 3.0, so that all except clauses are explicit in what they catch and there isn't any confusion over whether KeyboardInterrupt, SystemExit etc. are included or not. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From noamraph at gmail.com Sat Nov 5 20:05:28 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sat, 5 Nov 2005 21:05:28 +0200 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <20051102125437.F290.JCARLSON@uci.edu> References: <b348a0850511021236u66c94838pb7bb9e27f1314c3d@mail.gmail.com> <20051102125437.F290.JCARLSON@uci.edu> Message-ID: <b348a0850511051105k44906fbepdd0258ba2435bd2e@mail.gmail.com> On 11/3/05, Josiah Carlson <jcarlson at uci.edu> wrote: ... > > Right, but lists (dicts, tuples, etc.) are defined as containers, and > their comparison operation is defined on their contents. Objects are > not defined as containers in the general case, so defining comparisons > based on their contents (as opposed to identity) is just one of the two > assumptions to be made. > > I personally like the current behavior, and I see no /compelling/ reason > to change it. You obviously feel so compelled for the behavior to > change that you are willing to express your desires. 
How about you do something more productive and produce a patch which implements the changes you want, verify that it passes tests in the standard library, then post it on sourceforge. If someone is similarly compelled and agrees with you (so far I've not seen any public support for your proposal by any of the core developers), the discussion will restart, and it will be decided (not by you or me).

Thanks for the advice - I will try to do as you suggest.

> > To summarize, I think that value-based equality testing would usually
> > be what you want, and currently implementing it is a bit of a pain.
>
> Actually, implementing value-based equality testing, when you have a
> finite set of values you want to test, is quite easy.
>
> def __eq__(self, other):
>     for i in self.__cmp_eq__:
>         if getattr(self, i) != getattr(other, i):
>             return False
>     return True
>
> With a simple metaclass that discovers all of those values automatically,
> and/or your own protocol for exclusion, and you are done. Remember, not
> all 5-line functions should become builtin/default behavior, and this
> implementation shows that it is not a significant burden for you (or
> anyone else) to implement this in your own custom library.

You are right that not all 5-line functions should become builtin/default behaviour. However, I personally think that this one should, since:

1. It doesn't add complexity, or a new builtin.
2. Those five lines don't include the metaclass code, which will probably take more than five lines and won't be trivial.
3. It will make other objects behave better, not only mine - other classes will get a meaningful comparison operator, for free.

> P.S.
> One thing that you should remember is that even if your patch is
> accepted, and even if this is desirable, Python 2.5 is supposed to be
> released sometime next year (spring/summer?), and because it is a
> backwards incompatible change, would need at least 2.6-2.7 before it
> becomes the default behavior without a __future__ import, which is
> another 3-4 years down the line.

I hope that the warning can go in by Python 2.5, so the change (which I think will cause relatively few backwards incompatibility problems) can go in by Python 2.6, which I think is less than 2 years down the line.

> I understand you are passionate, really I do (you should see some of my
> proposals), but by the time these things get around to getting into
> mainline Python, there are high odds that you probably won't care about
> them much anymore (I've come to feel that way myself about many of my
> proposals), and I think it is a good idea to attempt to balance - when
> it comes to Python - "Now is better than never." and "Although never is
> often better than *right* now."
>
> Removing __hash__, changing __eq__, and trying to get in copy-on-write
> freezing (which is really copy-and-cache freezing), all read to me like
> "We gotta do this now!", which certainly isn't helping the proposal.

Thanks - I should really calm down a bit. I will try to go "safe and slowly", and I hope that at the end I will succeed in making my own small contribution to Python.

Noam

From jcarlson at uci.edu Sat Nov 5 21:30:17 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 05 Nov 2005 12:30:17 -0800
Subject: [Python-Dev] Should the default equality operator compare values instead of identities?
In-Reply-To: <b348a0850511051105k44906fbepdd0258ba2435bd2e@mail.gmail.com>
References: <20051102125437.F290.JCARLSON@uci.edu> <b348a0850511051105k44906fbepdd0258ba2435bd2e@mail.gmail.com>
Message-ID: <20051105115816.BFE3.JCARLSON@uci.edu>

Noam Raphael <noamraph at gmail.com> wrote:
> On 11/3/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > > To summarize, I think that value-based equality testing would usually
> > > be what you want, and currently implementing it is a bit of a pain.
> >
> > Actually, implementing value-based equality testing, when you have a
> > finite set of values you want to test, is quite easy.
> >
> > def __eq__(self, other):
> >     for i in self.__cmp_eq__:
> >         if getattr(self, i) != getattr(other, i):
> >             return False
> >     return True
> >
> > With a simple metaclass that discovers all of those values automatically,
> > and/or your own protocol for exclusion, and you are done. Remember, not
> > all 5-line functions should become builtin/default behavior, and this
> > implementation shows that it is not a significant burden for you (or
> > anyone else) to implement this in your own custom library.
>
> You are right that not all 5-line functions should become
> builtin/default behaviour. However, I personally think that this one
> should, since:
> 1. It doesn't add complexity, or a new builtin.

It changes default behavior (which I specified as a portion of my statement, which you quote). And you are wrong, it adds complexity to the implementation of both class instantiation and the default comparison mechanism. The former, I believe, you will find more difficult to patch than the comparison, though if you have not yet had adventures in writing C extension modules, modifying the default class instantiation may be the deal breaker for you (I personally would have no idea where to start).

> 2. Those five lines don't include the metaclass code, which will
> probably take more than five lines and won't be trivial.
class eqMetaclass(type):
    def __new__(cls, name, bases, dct):
        if '__cmp_include__' in dct:
            include = dict.fromkeys(dct['__cmp_include__'])
        else:
            include = dict.fromkeys(dct.keys())

        for i in dct.get('__cmp_exclude__', ()):
            _ = include.pop(i, None)

        dct['__cmp_eq__'] = include.keys()
        return type.__new__(cls, name, bases, dct)

It took 10 lines of code, and was trivial (except for not-included multi-metaclass support code, which is being discussed in another thread).

Oh, I suppose I should modify that __eq__ definition to be smarter about comparison...

def __eq__(self, other):
    if not hasattr(other, '__cmp_eq__'):
        return False
    if dict.fromkeys(self.__cmp_eq__) != \
       dict.fromkeys(other.__cmp_eq__):
        return False
    for i in self.__cmp_eq__:
        if getattr(self, i) != getattr(other, i):
            return False
    return True

Wow, 20 lines of support code, how could one ever expect users to write that? ;)

> 3. It will make other objects behave better, not only mine - other
> classes will get a meaningful comparison operator, for free.

You are asserting that the comparison previously wasn't "meaningful". It has a meaning, though it may not be exactly what you wanted it to be, which is why Python allows users to define __eq__ operators to be exactly what they want, and which is why I don't find your uses compelling.

> > P.S. One thing that you should remember is that even if your patch is
> > accepted, and even if this is desirable, Python 2.5 is supposed to be
> > released sometime next year (spring/summer?), and because it is a
> > backwards incompatible change, would need at least 2.6-2.7 before it
> > becomes the default behavior without a __future__ import, which is
> > another 3-4 years down the line.
>
> I hope that the warning can go in by Python 2.5, so the change (which
> I think will cause relatively few backwards incompatibility problems)
> can go in by Python 2.6, which I think is less than 2 years down the
> line.
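Josiah's two sketches can be combined into a runnable rendering (Python 3 spelling; the `__cmp_include__`/`__cmp_exclude__`/`__cmp_eq__` names follow his message, while the `ValueEq` demo class is invented for illustration):

```python
class EqMeta(type):
    """Record which attribute names take part in equality testing."""
    def __new__(mcs, name, bases, dct):
        # honor an explicit include list, else take non-dunder class keys
        include = list(dct.get('__cmp_include__',
                               [k for k in dct if not k.startswith('__')]))
        for k in dct.get('__cmp_exclude__', ()):
            if k in include:
                include.remove(k)
        dct['__cmp_eq__'] = include
        return type.__new__(mcs, name, bases, dct)

class ValueEq(metaclass=EqMeta):
    __cmp_include__ = ('x', 'y')   # 'note' deliberately left out

    def __init__(self, x, y, note=''):
        self.x, self.y, self.note = x, y, note

    def __eq__(self, other):
        # value-based equality over the recorded attribute names
        if not hasattr(other, '__cmp_eq__'):
            return NotImplemented
        if set(self.__cmp_eq__) != set(other.__cmp_eq__):
            return False
        return all(getattr(self, k) == getattr(other, k)
                   for k in self.__cmp_eq__)

assert ValueEq(1, 2, 'first') == ValueEq(1, 2, 'second')   # note ignored
assert ValueEq(1, 2) != ValueEq(1, 3)
```

As in Josiah's version, comparing the attribute-name sets rather than the types lets two independently written classes compare equal, the property Noam appreciates below.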
As per historical release schedules (available in PEP form at www.python.org/peps), alpha 1 to final generally takes 6 months. It then takes at least a year before the alpha 1 of the following version is to be released. Being that 2.4 final was released November 2004, and we've not seen an alpha for 2.5 yet, we are at least 6 months (according to history) from 2.5 final, and at least 2 years from 2.6 final.

From what I have previously learned from others in python-dev, the warnings machinery is slow, so one is to be wary of using warnings unless absolutely necessary. Regardless of whether it is absolutely necessary, it would be 2 years at least before the feature would actually make it into Python and become default behavior, IF it were desirable default behavior.

> Thanks - I should really calm down a bit. I will try to go "safe and
> slowly", and I hope that at the end I will succeed in making my own
> small contribution to Python.

You should also realize that you can make contributions to Python without changing the language or the implementation of the language. Read and review patches, help with bug reports, hang out on python-list and attempt to help the hundreds (if not thousands) of users who are asking for help, try to help new users in python-tutor, etc. If you have an idea for a language change, offer it up on python-list first (I've forgotten to do this more often than I would like to admit), and if it generally has more "cool" than "ick", then bring it back here.

- Josiah

From martin at v.loewis.de Sat Nov 5 22:41:21 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 05 Nov 2005 22:41:21 +0100
Subject: [Python-Dev] Why should the default hash(x) == id(x)?
In-Reply-To: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com>
Message-ID: <436D2701.6080400@v.loewis.de>

Noam Raphael wrote:
> Is there a reason why the default __hash__ method returns the id of the objects?

You are asking "why" questions of the kind that are best answered as "why not". IOW, you are saying that the current behaviour is bad, but you are not proposing any alternative behaviour. There are many alternatives possible, and they are presumably all worse than the current implementation.

To give an example: "why does hash() return id()"? Answer: The alternative would be that hash() always returns 0 unless implemented otherwise. This would cause serious performance issues for people using the objects as dictionary keys. If they don't do that, it doesn't matter what hash() returns.

> This leads me to another question: why should the default __eq__
> method be the same as "is"?

Because the alternative would be to always return "False". This would be confusing, because it would cause "x == x" to give False. More generally, I claim that the current behaviour is better than *any* alternative. To refute this claim, you would have to come up with an alternative first.

Regards,
Martin

From noamraph at gmail.com Sun Nov 6 00:00:16 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 6 Nov 2005 01:00:16 +0200
Subject: [Python-Dev] Should the default equality operator compare values instead of identities?
In-Reply-To: <20051105115816.BFE3.JCARLSON@uci.edu>
References: <20051102125437.F290.JCARLSON@uci.edu> <b348a0850511051105k44906fbepdd0258ba2435bd2e@mail.gmail.com> <20051105115816.BFE3.JCARLSON@uci.edu>
Message-ID: <b348a0850511051500w5205c608u13768da5156cd58b@mail.gmail.com>

On 11/5/05, Josiah Carlson <jcarlson at uci.edu> wrote:
...
> > 1. It doesn't add complexity, or a new builtin.
> > It changes default behavior (which I specified as a portion of my > statement, which you quote). > > And you are wrong, it adds complexity to the implementation of both > class instantiation and the default comparison mechanism. The former, I > believe, you will find more difficult to patch than the comparison, > though if you have not yet had adventures in that which is writing C > extension modules, modifying the default class instantiation may be > the deal breaker for you (I personally would have no idea where to start). Sorry, I meant complexity to the Python user - it won't require him to learn more in order to write programs in Python. > > class eqMetaclass(type): > def __new__(cls, name, bases, dct): > if '__cmp_include__' in dct: > include = dict.fromkeys(dct['__cmp_include__']) > else: > include = dict.fromkeys(dct.keys()) > > for i in dct.get('__cmp_exclude__', ()): > _ = include.pop(i, None) > > dct['__cmp_eq__'] = include.keys() > return type.__new__(cls, name, bases, dct) > > It took 10 lines of code, and was trivial (except for not-included > multi-metaclass support code, which is being discussed in another thread). > > Oh, I suppose I should modify that __eq__ definition to be smarter about > comparison... > > def __eq__(self, other): > if not hasattr(other, '__cmp_eq__'): > return False > if dict.fromkeys(self.__cmp_eq__) != \ > dict.fromkeys(other.__cmp_eq__): > return False > for i in self.__cmp_eq__: > if getattr(self, i) != getattr(other, i): > return False > return True Thanks for the implementation. It would be very useful in order to explain my suggestion. It's nice that it compares only attributes, not types. It makes it possible for two people to write classes that can be equal to one another. > > Wow, 20 lines of support code, how could one ever expect users to write > that? ;) This might mean that implementing it in C, once I find the right place, won't be too difficult.
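A self-contained, runnable rendering of the metaclass sketch quoted above (with `dct.keys()` called, a default supplied in case `__cmp_exclude__` is absent, and modern `metaclass=` syntax; the `Point` class and its attributes are a hypothetical example, not from the thread):

```python
def value_eq(self, other):
    # Value-based equality: same attribute-name set, same attribute values.
    if not hasattr(other, '__cmp_eq__'):
        return False
    if dict.fromkeys(self.__cmp_eq__) != dict.fromkeys(other.__cmp_eq__):
        return False
    for i in self.__cmp_eq__:
        if getattr(self, i) != getattr(other, i):
            return False
    return True

class eqMetaclass(type):
    def __new__(cls, name, bases, dct):
        if '__cmp_include__' in dct:
            include = dict.fromkeys(dct['__cmp_include__'])
        else:
            include = dict.fromkeys(dct.keys())      # keys() must be called
        for i in dct.get('__cmp_exclude__', ()):     # default: nothing excluded
            include.pop(i, None)
        dct['__cmp_eq__'] = list(include.keys())
        dct['__eq__'] = value_eq                     # install the comparison
        return type.__new__(cls, name, bases, dct)

class Point(metaclass=eqMetaclass):
    __cmp_include__ = ('x', 'y')   # instance attributes to compare
    def __init__(self, x, y):
        self.x = x
        self.y = y
```

With this, two distinct `Point(1, 2)` instances compare equal, while the default identity-based `__eq__` being debated in the thread would say they differ.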
And I think that for most users it will be harder than it was for you, and there are some subtleties in those lines. > > > > 3. It will make other objects behave better, not only mine - other > > classes will get a meaningful comparison operator, for free. > > You are claiming that the comparison previously wasn't "meaningful". It has a > meaning, though it may not be exactly what you wanted it to be, which is > why Python allows users to define __eq__ operators to be exactly what > they want, and which is why I don't find your uses compelling. > I think that value-based equality testing is a better default, since in more cases it does what you want it to, and since in those cases they won't have to write those 20 lines, or download them from somewhere. > ... > > From what I have previously learned from others in python-dev, the > warnings machinery is slow, so one is to be wary of using warnings > unless absolutely necessary. Regardless of it being absolutely necessary, > it would be 2 years at least before the feature would actually make it > into Python and become default behavior, IF it were desirable default > behavior. All right. I hope that those warnings will be ok - it's yet to be seen. And about those 2 years - better later than never. ... > > You should also realize that you can make contributions to Python > without changing the language or the implementation of the language. > Read and review patches, help with bug reports, hang out on python-list > and attempt to help the hundreds (if not thousands) of users who are > asking for help, try to help new users in python-tutor, etc. I confess that I don't do these a lot. I can say that I from time to time teach beginners Python, and that where I work I help a lot of other people with Python. > If you > have an idea for a language change, offer it up on python-list first > (I've forgotten to do this more often than I would like to admit), and > if it generally has more "cool" than "ick", then bring it back here.
> I will. Thanks again. Noam From jcarlson at uci.edu Sun Nov 6 00:40:34 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 05 Nov 2005 15:40:34 -0800 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <b348a0850511051500w5205c608u13768da5156cd58b@mail.gmail.com> References: <20051105115816.BFE3.JCARLSON@uci.edu> <b348a0850511051500w5205c608u13768da5156cd58b@mail.gmail.com> Message-ID: <20051105151436.BFF7.JCARLSON@uci.edu> Noam Raphael <noamraph at gmail.com> wrote: > > On 11/5/05, Josiah Carlson <jcarlson at uci.edu> wrote: > ... > > > 1. It doesn't add complexity, or a new builtin. > > > > It changes default behavior (which I specified as a portion of my > > statement, which you quote). > > > > And you are wrong, it adds complexity to the implementation of both > > class instantiation and the default comparison mechanism. The former, I > > believe, you will find more difficult to patch than the comparison, > > though if you have not yet had adventures in that which is writing C > > extension modules, modifying the default class instantiation may be > > the deal breaker for you (I personally would have no idea where to start). > > Sorry, I meant complexity to the Python user - it won't require him to > learn more in order to write programs in Python. Ahh, but it does add complexity. Along with knowing __doc__, __slots__, __metaclass__, __init__, __new__, __cmp__, __eq__, ..., __str__, __repr__, __getitem__, __setitem__, __delitem__, __getattr__, __setattr__, __delattr__, ... The user must also know what __cmp_include__ and __cmp_exclude__ mean in order to understand code which uses them, and they must understand that exclude entries overwrite include entries. > > Wow, 20 lines of support code, how could one ever expect users to write > > that? ;) > > This might mean that implementing it in C, once I find the right > place, won't be too difficult.
> > And I think that for most users it will be harder than it was for you, > and there are some subtleties in those lines. So put it in the Python Cookbook: http://aspn.activestate.com/ASPN/Cookbook/Python > > > 3. It will make other objects behave better, not only mine - other > > > classes will get a meaningful comparison operator, for free. > > > > You are claiming that the comparison previously wasn't "meaningful". It has a > > meaning, though it may not be exactly what you wanted it to be, which is > > why Python allows users to define __eq__ operators to be exactly what > > they want, and which is why I don't find your uses compelling. > > > I think that value-based equality testing is a better default, since > in more cases it does what you want it to, and since in those cases > they won't have to write those 20 lines, or download them from > somewhere. You are making a value judgement on what people want to happen with default Python. Until others state that they want such an operation as a default, I'm going to consider this particular argument relatively unfounded. > > From what I have previously learned from others in python-dev, the > > warnings machinery is slow, so one is to be wary of using warnings > > unless absolutely necessary. Regardless of it being absolutely necessary, > > it would be 2 years at least before the feature would actually make it > > into Python and become default behavior, IF it were desirable default > > behavior. > > All right. I hope that those warnings will be ok - it's yet to be > seen. And about those 2 years - better later than never. It won't be OK. Every comparison using the default operator will incur a speed penalty while it checks the (pure Python) warning machinery to determine if the warning has been issued yet. This alone makes the transition require a __future__ import.
- Josiah From noamraph at gmail.com Sun Nov 6 01:02:36 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 6 Nov 2005 02:02:36 +0200 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <20051105151436.BFF7.JCARLSON@uci.edu> References: <20051105115816.BFE3.JCARLSON@uci.edu> <b348a0850511051500w5205c608u13768da5156cd58b@mail.gmail.com> <20051105151436.BFF7.JCARLSON@uci.edu> Message-ID: <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com> On 11/6/05, Josiah Carlson <jcarlson at uci.edu> wrote: ... > > > > Sorry, I meant complexity to the Python user - it won't require him to > > learn more in order to write programs in Python. > > Ahh, but it does add complexity. Along with knowing __doc__, __slots__, > __metaclass__, __init__, __new__, __cmp__, __eq__, ..., __str__, > __repr__, __getitem__, __setitem__, __delitem__, __getattr__, > __setattr__, __delattr__, ... > > > The user must also know what __cmp_include__ and __cmp_exclude__ mean > in order to understand code which uses them, and they must understand > that exclude entries overwrite include entries. > You are right. But that's Python - I think that nobody knows all the exact details of what all these do. You look in the documentation. It is a complication - but it's of the type that I can live with, if there's a reason. > > > > Wow, 20 lines of support code, how could one ever expect users to write > > > that? ;) > > > > This might mean that implementing it in C, once I find the right > > place, won't be too difficult. > > > > And I think that for most users it will be harder than it was for you, > > and there are some subtleties in those lines. > > So put it in the Python Cookbook: > http://aspn.activestate.com/ASPN/Cookbook/Python > A good idea. > > > > > 3. It will make other objects behave better, not only mine - other > > > > classes will get a meaningful comparison operator, for free.
> > > > > > You are claiming that the comparison previously wasn't "meaningful". It has a > > > meaning, though it may not be exactly what you wanted it to be, which is > > > why Python allows users to define __eq__ operators to be exactly what > > > they want, and which is why I don't find your uses compelling. > > > > > I think that value-based equality testing is a better default, since > > in more cases it does what you want it to, and since in those cases > > they won't have to write those 20 lines, or download them from > > somewhere. > > You are making a value judgement on what people want to happen with > default Python. Until others state that they want such an operation as a > default, I'm going to consider this particular argument relatively > unfounded. > All right. I will try to collect more examples for my proposal. > > > > From what I have previously learned from others in python-dev, the > > > warnings machinery is slow, so one is to be wary of using warnings > > > unless absolutely necessary. Regardless of it being absolutely necessary, > > > it would be 2 years at least before the feature would actually make it > > > into Python and become default behavior, IF it were desirable default > > > behavior. > > > > All right. I hope that those warnings will be ok - it's yet to be > > seen. And about those 2 years - better later than never. > > It won't be OK. Every comparison using the default operator will incur > a speed penalty while it checks the (pure Python) warning machinery to > determine if the warning has been issued yet. This alone makes the > transition require a __future__ import. > How will the __future__ statement help? I think that the warning is still needed, so that people using code that may stop working will know about it. I see that they can add a __future__ import and see if it still works, but it will catch far fewer problems, because usually code would be run without the __future__ import.
If it really slows down things, it seems to me that the only solution is to optimize the warning module... Noam From noamraph at gmail.com Sun Nov 6 01:03:22 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 6 Nov 2005 02:03:22 +0200 Subject: [Python-Dev] Why should the default hash(x) == id(x)? In-Reply-To: <436D2701.6080400@v.loewis.de> References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com> <436D2701.6080400@v.loewis.de> Message-ID: <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com> On 11/5/05, "Martin v. Löwis" <martin at v.loewis.de> wrote: > More generally, I claim that the current behaviour is better than > *any* alternative. To refute this claim, you would have to come > up with an alternative first. > The alternative is to drop the __hash__ method of user-defined classes (as Guido already decided to do), and to make the default __eq__ method compare the two objects' __dict__ and slot members. See the thread about default equality operator - Josiah Carlson posted there a metaclass implementing this equality operator. Noam From jcarlson at uci.edu Sun Nov 6 01:19:49 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 05 Nov 2005 16:19:49 -0800 Subject: [Python-Dev] Why should the default hash(x) == id(x)? In-Reply-To: <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com> References: <436D2701.6080400@v.loewis.de> <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com> Message-ID: <20051105161846.C004.JCARLSON@uci.edu> Noam Raphael <noamraph at gmail.com> wrote: > > On 11/5/05, "Martin v. Löwis" <martin at v.loewis.de> wrote: > > More generally, I claim that the current behaviour is better than > > *any* alternative. To refute this claim, you would have to come > > up with an alternative first. > > > The alternative is to drop the __hash__ method of user-defined classes > (as Guido already decided to do), and to make the default __eq__ > method compare the two objects' __dict__ and slot members.
> > See the thread about default equality operator - Josiah Carlson posted > there a metaclass implementing this equality operator. The existence of a simple equality operator and metaclass is actually a strike against changing the default behavior for equality. - Josiah From pedronis at strakt.com Sun Nov 6 01:29:18 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Sun, 06 Nov 2005 01:29:18 +0100 Subject: [Python-Dev] Why should the default hash(x) == id(x)? In-Reply-To: <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com> References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com> <436D2701.6080400@v.loewis.de> <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com> Message-ID: <436D4E5E.7000301@strakt.com> Noam Raphael wrote: > On 11/5/05, "Martin v. Löwis" <martin at v.loewis.de> wrote: > >>More generally, I claim that the current behaviour is better than >>*any* alternative. To refute this claim, you would have to come >>up with an alternative first. >> > > The alternative is to drop the __hash__ method of user-defined classes > (as Guido already decided to do), and to make the default __eq__ > method compare the two objects' __dict__ and slot members. > No: whether an object has a __hash__ and what the default hashing is are different issues. Also, all this discussion should have started and lived on comp.lang.python, and this is as good a point as any to rectify that. > See the thread about default equality operator - Josiah Carlson posted > there a metaclass implementing this equality operator.
> > Noam > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/pedronis%40strakt.com From jcarlson at uci.edu Sun Nov 6 01:30:52 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 05 Nov 2005 16:30:52 -0800 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com> References: <20051105151436.BFF7.JCARLSON@uci.edu> <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com> Message-ID: <20051105162001.C007.JCARLSON@uci.edu> Noam Raphael <noamraph at gmail.com> wrote: > > On 11/6/05, Josiah Carlson <jcarlson at uci.edu> wrote: > ... > > > > > > Sorry, I meant complexity to the Python user - it won't require him to > > > learn more in order to write programs in Python. > You are right. But that's Python - I think that nobody knows all the > exact details of what all these do. You look in the documentation. It > is a complication - but it's of the type that I can live with, if > there's a reason. Regardless of whether people check the documentation, it does add complexity to Python. > > > All right. I hope that those warnings will be ok - it's yet to be > > > seen. And about those 2 years - better later than never. > > > > It won't be OK. Every comparison using the default operator will incur > > a speed penalty while it checks the (pure Python) warning machinery to > > determine if the warning has been issued yet. This alone makes the > > transition require a __future__ import. > > > How will the __future__ statement help? I think that the warning is > still needed, so that people using code that may stop working will > know about it.
I see that they can add a __future__ import and see if > it still works, but it will catch far fewer problems, because usually > code would be run without the __future__ import. What has been common is to use __future__ along with a note in the release notes specifying the changes between 2.x and 2.x-1. The precise mechanisms when using __future__ vary from import to import, though this one could signal the change of a single variable as to which code path to use. > If it really slows down things, it seems to me that the only solution > is to optimize the warning module... Possible solutions to the possible problem of default __eq__ behavior: 1. It is not a problem, leave it alone. 2. Use __future__. 3. Use warnings, and deal with it being slow. 4. Make warnings a C module and expose it to CPython internals. You are claiming that there is such a need to fix __eq__ that one would NEED to change the warnings module so that the __eq__ fix can be fast. Again, implement this, post it to sourceforge, and someone will decide. - Josiah From martin at v.loewis.de Sun Nov 6 11:08:22 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 06 Nov 2005 11:08:22 +0100 Subject: [Python-Dev] Why should the default hash(x) == id(x)? In-Reply-To: <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com> References: <b348a0850511011721ve1c3817vd5f61b644257e855@mail.gmail.com> <436D2701.6080400@v.loewis.de> <b348a0850511051603r6f218a46wa5ff0abf006a1fc3@mail.gmail.com> Message-ID: <436DD616.3080304@v.loewis.de> Noam Raphael wrote: > The alternative is to drop the __hash__ method of user-defined classes > (as Guido already decided to do), and to make the default __eq__ > method compare the two objects' __dict__ and slot members. The question then is what hash(x) would do. It seems that you expect it then somehow not to return a value.
However, under this patch, the fallback implementation (use pointer as the hash) would be used, which would preserve hash(x)==id(x). > See the thread about default equality operator - Josiah Carlson posted > there a metaclass implementing this equality operator. This will likely cause a lot of breakage. Objects will compare equal even though they conceptually are not, and even though they did not compare equal in previous Python versions. Regards, Martin From jim at zope.com Sun Nov 6 17:15:58 2005 From: jim at zope.com (Jim Fulton) Date: Sun, 06 Nov 2005 11:15:58 -0500 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison Message-ID: <436E2C3E.7060807@zope.com> The recent discussion about what the default hash and equality comparisons should do makes me want to chime in. IMO, the provision of defaults for hash, eq and other comparisons was a mistake. I'm especially sensitive to this because I do a lot of work with persistent data that outlives program execution. For such objects, memory address is meaningless. In particular, the default ordering of objects based on address has caused a great deal of pain to people who store data in persistent BTrees. Oddly, what I've read in these threads seems to be arguing about which implicit method is best. The answer, IMO, is to not do this implicitly at all. If programmers want their objects to be hashable, comparable, or orderable, then they should implement operators explicitly. There could even be a handy, but *optional*, base class that provides these operators based on ids. This would be too big a change for Python 2 but, IMO, should definitely be made for Python 3k. I doubt any change in the default definition of these operations is practical for Python 2. Too many people rely on them, usually without really realizing it. Let's plan to stop guessing how to do hash and comparison. Explicit is better than implicit. :) Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido at python.org Sun Nov 6 20:47:20 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 6 Nov 2005 11:47:20 -0800 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <436E2C3E.7060807@zope.com> References: <436E2C3E.7060807@zope.com> Message-ID: <ca471dc20511061147p2e0ae9dbt83b6e52dbbd7e69b@mail.gmail.com> On 11/6/05, Jim Fulton <jim at zope.com> wrote: > IMO, the provision of defaults for hash, eq and other comparisons > was a mistake. I agree with you for 66%. Default hash and inequalities were a mistake. But I wouldn't want to do without a default ==/!= implementation (and of course it should be defined so that an object is only equal to itself). In fact, the original hash() was clever enough to complain when __eq__ (or __cmp__) was overridden but __hash__ wasn't; but this got lost by accident for new-style classes when I added a default __hash__ to the new universal base class (object). But I think the original default hash() isn't particularly useful, so I think it's better to just not be hashable unless __hash__ is defined explicitly. > I'm especially sensitive to this because I do a lot > of work with persistent data that outlives program execution. For such > objects, memory address is meaningless. In particular, the default > ordering of objects based on address has caused a great deal of pain > to people who store data in persistent BTrees. This argues against the inequalities (<, <=, >, >=) and I agree. > Oddly, what I've read in these threads seems to be arguing about > which implicit method is best. The answer, IMO, is to not do this > implicitly at all. If programmers want their objects to be > hashable, comparable, or orderable, then they should implement operators > explicitly. There could even be a handy, but *optional*, base class that > provides these operators based on ids.
I don't like that final suggestion. Before you know it, a meme develops telling newbies that all classes should inherit from that "optional" base class, and then later it's impossible to remove it because you can't tell whether it's actually needed or not. > This would be too big a change for Python 2 but, IMO, should definitely > be made for Python 3k. I doubt any change in the default definition > of these operations is practical for Python 2. Too many people rely on > them, usually without really realizing it. Agreed. > Let's plan to stop guessing how to do hash and comparison. > > Explicit is better than implicit. :) Except that I really don't think that there's anything wrong with a default __eq__ that uses object identity. As Martin pointed out, it's just too weird that an object wouldn't be considered equal to itself. It's the default __hash__ and __cmp__ that mess things up. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at zope.com Sun Nov 6 21:13:23 2005 From: jim at zope.com (Jim Fulton) Date: Sun, 06 Nov 2005 15:13:23 -0500 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <ca471dc20511061147p2e0ae9dbt83b6e52dbbd7e69b@mail.gmail.com> References: <436E2C3E.7060807@zope.com> <ca471dc20511061147p2e0ae9dbt83b6e52dbbd7e69b@mail.gmail.com> Message-ID: <436E63E3.7040307@zope.com> Guido van Rossum wrote: > On 11/6/05, Jim Fulton <jim at zope.com> wrote: > ... > Except that I really don't think that there's anything wrong with a > default __eq__ that uses object identity. As Martin pointed out, it's > just too weird that an object wouldn't be considered equal to itself. > It's the default __hash__ and __cmp__ that mess things up. Good point. I agree. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jrw at pobox.com Sun Nov 6 21:39:42 2005 From: jrw at pobox.com (John Williams) Date: Sun, 06 Nov 2005 14:39:42 -0600 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <436E2C3E.7060807@zope.com> References: <436E2C3E.7060807@zope.com> Message-ID: <436E6A0E.4070508@pobox.com> (This is kind of on a tangent to the original discussion, but I don't want to create yet another subject line about object comparisons.) Lately I've found that virtually all my implementations of __cmp__, __hash__, etc. can be factored into this form inspired by the "key" parameter to the built-in sorting functions: class MyClass: def __key(self): # Return a tuple of attributes to compare. return (self.foo, self.bar, ...) def __cmp__(self, that): return cmp(self.__key(), that.__key()) def __hash__(self): return hash(self.__key()) I wonder if it wouldn't make sense to formalize this pattern with a magic __key__ method such that a class with a __key__ method would behave as if it had inherited the definitions of __cmp__ and __hash__ above. This scheme would eliminate the tedium of keeping the __hash__ method in sync with the __cmp__/__eq__ method, and writing a __key__ method would involve writing less code than a naive __eq__ method, since each attribute name only needs to be mentioned once instead of appearing on either side of a "==" expression. On the other hand, this idea doesn't work in all situations (for instance, I don't think you could define the default __cmp__/__hash__ semantics in terms of __key__), it would only eliminate two one-line methods for each class, and it would further complicate the "==" operator (__key__, falling back to __eq__, falling back to __cmp__, falling back to object identity--ouch!) If anyone thinks this is a good idea I'll investigate how many places in the standard library this pattern would apply.
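The key pattern above is written for classic cmp-based comparison; the same factoring carries over to rich comparison methods. A sketch with `__eq__`/`__lt__`/`__hash__` all delegating to one key method (the `Point` class and its attributes are illustrative assumptions, not from the thread):

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def _key(self):
        # Single definition of the object's "value",
        # shared by equality, ordering, and hashing.
        return (self.x, self.y)

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Must agree with __eq__: equal keys yield equal hashes.
        return hash(self._key())
```

Because hashing and equality both go through `_key`, equal-valued points collide onto one dictionary entry and `sorted` orders them by `(x, y)`, with no risk of the two methods drifting out of sync.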
--jw From guido at python.org Sun Nov 6 21:58:57 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 6 Nov 2005 12:58:57 -0800 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <436E6A0E.4070508@pobox.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> Message-ID: <ca471dc20511061258q636689c0se9e45b0f503e1299@mail.gmail.com> On 11/6/05, John Williams <jrw at pobox.com> wrote: > (This is kind of on a tangent to the original discussion, but I don't > want to create yet another subject line about object comparisons.) > > Lately I've found that virtually all my implementations of __cmp__, > __hash__, etc. can be factored into this form inspired by the "key" > parameter to the built-in sorting functions: > > class MyClass: > > def __key(self): > # Return a tuple of attributes to compare. > return (self.foo, self.bar, ...) > > def __cmp__(self, that): > return cmp(self.__key(), that.__key()) > > def __hash__(self): > return hash(self.__key()) The main way this breaks down is when comparing objects of different types. While most comparisons typically are defined in terms of comparisons on simpler or contained objects, two objects of different types that happen to have the same "key" shouldn't necessarily be considered equal. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Sun Nov 6 22:22:31 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 06 Nov 2005 16:22:31 -0500 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <ca471dc20511061258q636689c0se9e45b0f503e1299@mail.gmail.com> References: <436E6A0E.4070508@pobox.com> <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> Message-ID: <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote: >The main way this breaks down is when comparing objects of different >types.
While most comparisons typically are defined in terms of >comparisons on simpler or contained objects, two objects of different >types that happen to have the same "key" shouldn't necessarily be >considered equal. When I use this pattern, I often just include the object's type in the key. (I call it the 'hashcmp' value, but otherwise it's the same pattern.) From guido at python.org Sun Nov 6 22:29:27 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 6 Nov 2005 13:29:27 -0800 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> Message-ID: <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.com> On 11/6/05, Phillip J. Eby <pje at telecommunity.com> wrote: > At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote: > >The main way this breaks down is when comparing objects of different > >types. While most comparisons typically are defined in terms of > >comparisons on simpler or contained objects, two objects of different > >types that happen to have the same "key" shouldn't necessarily be > >considered equal. > > When I use this pattern, I often just include the object's type in the > key. (I call it the 'hashcmp' value, but otherwise it's the same pattern.) But how do you make that work with subclassing? (I'm guessing your answer is that you don't. 
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From josh at janrain.com Sun Nov 6 22:57:34 2005 From: josh at janrain.com (Josh Hoyt) Date: Sun, 6 Nov 2005 13:57:34 -0800 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.com> Message-ID: <34714aad0511061357x2dc3765y2cdca412dec4e432@mail.gmail.com> On 11/6/05, Guido van Rossum <guido at python.org> wrote: > On 11/6/05, Phillip J. Eby <pje at telecommunity.com> wrote: > > When I use this pattern, I often just include the object's type in the > > key. (I call it the 'hashcmp' value, but otherwise it's the same pattern.) > > But how do you make that work with subclassing? (I'm guessing your > answer is that you don't. :-) If there is a well-defined desired behaviour for comparisons in the face of subclassing (which I'm not sure there is) then that behaviour could become part of the definition of how __key__ works. Since __key__ would be for clarity of intent and convenience of implementation, adding default behaviour for the most common case seems like it would be a good idea. My initial thought was that all subclasses of the class where __key__ was defined would compare as equal if they return the same value. More precisely, if two objects have the same __key__ method, and it returns the same value, then they are equal. That does not solve the __cmp__ problem, unless the __key__ function is used as part of the ordering. For example: def getKey(obj): key = getattr(obj.__class__, '__key__') return (id(key), key(obj)) An obvious drawback is that if __key__ is overridden, then the subclass where it is overridden and all further subclasses will no longer have equality to the superclass.
I think that this is probably OK, except that it may be occasionally surprising. Josh From jcarlson at uci.edu Mon Nov 7 00:12:36 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 06 Nov 2005 15:12:36 -0800 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <436E6A0E.4070508@pobox.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> Message-ID: <20051106144700.C01C.JCARLSON@uci.edu> John Williams <jrw at pobox.com> wrote: > > (This is kind of on a tangent to the original discussion, but I don't > want to create yet another subject line about object comparisons.) > > Lately I've found that virtually all my implementations of __cmp__, > __hash__, etc. can be factored into this form inspired by the "key" > parameter to the built-in sorting functions: > > class MyClass: > > def __key(self): > # Return a tuple of attributes to compare. > return (self.foo, self.bar, ...) > > def __cmp__(self, that): > return cmp(self.__key(), that.__key()) > > def __hash__(self): > return hash(self.__key()) > > I wonder if it wouldn't make sense to formalize this pattern with a > magic __key__ method such that a class with a __key__ method would > behave as if it had interited the definitions of __cmp__ and __hash__ above. You probably already realize this, but I thought I would point out the obvious. Given a suitably modified MyClass... >>> x = {} >>> a = MyClass() >>> a.a = 8 >>> x[a] = a >>> a.a = 9 >>> x[a] = a >>> >>> x {<__main__.MyClass instance at 0x007E0A08>: <__main__.MyClass instance at 0x007E 0A08>, <__main__.MyClass instance at 0x007E0A08>: <__main__.MyClass instance at 0x007E0A08>} Of course everyone is saying "Josiah, people shouldn't be doing that"; but they will. 
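The failure mode Josiah is demonstrating can be reproduced with a self-contained sketch (a hypothetical MyClass with value-based hashing, filling in the "suitably modified" class from his session):

```python
# A value-hashed class whose key attribute is mutated after the object
# has already been used as a dict key.
class MyClass:
    def __init__(self, a):
        self.a = a
    def __eq__(self, other):
        return isinstance(other, MyClass) and self.a == other.a
    def __hash__(self):
        return hash(self.a)

x = {}
a = MyClass(8)
x[a] = a
a.a = 9        # mutate the attribute __hash__ depends on
x[a] = a       # hashes to a different slot: a second entry appears

assert len(x) == 2            # one object, two dict entries
assert all(k is a for k in x) # ... and both keys are the same object
```

The dict caches the hash computed at insertion time, so after the mutation the same object occupies two slots and the original entry can no longer be found by lookup.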
Given a mechanism to offer hash-by-value, a large number of users will think that it will work for what they want, regardless of the fact that in order for it to really work, those attributes must be read-only by semantics or access mechanisms. Not everyone who uses Python understands fully the concepts of mutability and immutability, and very few will realize that the attributes returned by __key() need to be immutable aspects of the instance of that class (you can perform at most one assignment to the attribute during its lifetime, and that assignment must occur before any hash calls). Call me a pessimist, but I don't believe that using magical key methods will be helpful for understanding or using Python. - Josiah From pje at telecommunity.com Mon Nov 7 01:12:21 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 06 Nov 2005 19:12:21 -0500 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.co m> References: <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> At 01:29 PM 11/6/2005 -0800, Guido van Rossum wrote: >On 11/6/05, Phillip J. Eby <pje at telecommunity.com> wrote: > > At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote: > > >The main way this breaks down is when comparing objects of different > > >types. While most comparisons typically are defined in terms of > > >comparisons on simpler or contained objects, two objects of different > > >types that happen to have the same "key" shouldn't necessarily be > > >considered equal. > > > > When I use this pattern, I often just include the object's type in the > > key. (I call it the 'hashcmp' value, but otherwise it's the same pattern.) > >But how do you make that work with subclassing? 
(I'm guessing your >answer is that you don't. :-) By either changing the subclass __init__ to initialize it with a different hashcmp value, or by redefining the method that computes it. From pje at telecommunity.com Mon Nov 7 01:15:01 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 06 Nov 2005 19:15:01 -0500 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> References: <ca471dc20511061329t46078897wdc02dd86e43d133d@mail.gmail.co m> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> At 07:12 PM 11/6/2005 -0500, Phillip J. Eby wrote: >At 01:29 PM 11/6/2005 -0800, Guido van Rossum wrote: > >On 11/6/05, Phillip J. Eby <pje at telecommunity.com> wrote: > > > At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote: > > > >The main way this breaks down is when comparing objects of different > > > >types. While most comparisons typically are defined in terms of > > > >comparisons on simpler or contained objects, two objects of different > > > >types that happen to have the same "key" shouldn't necessarily be > > > >considered equal. > > > > > > When I use this pattern, I often just include the object's type in the > > > key. (I call it the 'hashcmp' value, but otherwise it's the same > pattern.) > > > >But how do you make that work with subclassing? (I'm guessing your > >answer is that you don't. :-) > >By either changing the subclass __init__ to initialize it with a different >hashcmp value, or by redefining the method that computes it. Scratch that. I realized 2 seconds after hitting "Send" that you meant the case where you want to compare instances with a common parent type. And the answer is, I can't recall having needed to. 
(Which is probably why it took me so long to realize what you meant.) From fakeaddress at nowhere.org Thu Nov 3 01:55:01 2005 From: fakeaddress at nowhere.org (Bryan Olson) Date: Wed, 02 Nov 2005 16:55:01 -0800 Subject: [Python-Dev] PEP submission broken? Message-ID: <43695FE5.1080803@nowhere.org> Though I tried to submit a (pre-) PEP in the proper form through the proper channels, it has disappeared into the ether. In building a class that supports Python's slicing interface, http://groups.google.com/group/comp.lang.python/msg/8f35464483aa7d7b I encountered a Python bug, which, upon further discussion, seemed to be a combination of a wart and a documentation error. http://groups.google.com/group/comp.lang.python/browse_frm/thread/402d770b6f503c27 I submitted the bug report via SourceForge; the resolution was to document the actual behavior. Next I worked out what behavior I think would eliminate the wart, wrote it up as a pre-PEP, and sent it peps at python.org on 27 Aug of this year. I promptly received an automated response from Barry Warsaw, saying, in part, "I get so much email that I can't promise a personal response." I gathered that he is a PEP editor. I did not infer from his reply that PEP's are simply ignored, but this automated reply was the only response I ever received. I subscribed to the Python-dev list, and watched, and waited; nothing on my concern appeared. One response on the comp.lang.python newsgroup noted that a popular extention module would have difficulty maintaining consistency with my proposed PEP. My proposal does not break how the extension currently works, but still, that's a valid point. There are variations which do not have that problem, and I think I can see a course that will serve the entire Python community. 
From what I can tell, We need to address fixing the PEP process before there is any point in working on PEP's, -- --Bryan From ncoghlan at gmail.com Mon Nov 7 10:32:57 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 07 Nov 2005 19:32:57 +1000 Subject: [Python-Dev] PEP submission broken? In-Reply-To: <43695FE5.1080803@nowhere.org> References: <43695FE5.1080803@nowhere.org> Message-ID: <436F1F49.90606@gmail.com> Bryan Olson wrote: > From what I can tell, We need to address fixing the > PEP process before there is any point in working on PEP's, I think this is a somewhat fair point (although perhaps a bit overstated) - David and Barry can be busy IRL, which can certainly slow down the process of PEP submission. PEP 328 hung in limbo for a while on that basis (I'm going to have to look into if and how PEP 328 relates to Python eggs one of these days. . .). Would it be worth having a PEP category on the RFE tracker, rather than submitting pre-PEP's directly to the PEP editors? The process still wouldn't be perfect, but it would widen the pool of people that can bring a pre-PEP to the attention of python-dev. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From bjourne at gmail.com Mon Nov 7 14:06:11 2005 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Mon, 7 Nov 2005 14:06:11 +0100 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <20051105162001.C007.JCARLSON@uci.edu> References: <20051105151436.BFF7.JCARLSON@uci.edu> <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com> <20051105162001.C007.JCARLSON@uci.edu> Message-ID: <740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com> How would the value equality operator deal with recursive objects? 
class Foo: def __init__(self): self.foo = self Seems to me that it would take atleast some special-casing to get Foo() == Foo() to evalute to True in this case... -- mvh Bj?rn From guido at python.org Mon Nov 7 18:10:15 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Nov 2005 09:10:15 -0800 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> Message-ID: <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com> Two more thoughts in this thread. (1) The "key" idiom is a great pattern but I don't think it would work well to make it a standard language API. (2) I'm changing my mind about the default hash(). The original default hash() (which would raise TypeError if __eq__ was overridden but __hash__ was not) is actually quite useful in some situations. Basically, simplifying a bit, there are two types of objects: those that represent *values* and those that do not. For value-ish objects, overriding __eq__ is common and then __hash__ needs to be overridden in order to get the proper dict and set behavior. In a sense, __eq__ defines an "equivalence class" in the mathematical sense. But in many applications I've used objects for which object identity is important. Let me construct a hypothetical example: suppose we represent a car and its parts as objects. Let's say each wheel is an object. Each wheel is unique and we don't have equivalency classes for them. However, it would be useful to construct sets of wheels (e.g. the set of wheels currently on my car that have never had a flat tire). Python sets use hashing just like dicts. 
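The car-parts example works out of the box with today's identity defaults (Wheel here is a hypothetical illustration, not an API):

```python
# Identity objects: no __eq__/__hash__ overrides, so the defaults
# (object identity) apply, which is exactly what a set of unique
# physical parts needs.
class Wheel:
    def __init__(self, position):
        self.position = position
        self.had_flat = False

wheels = [Wheel(p) for p in ('FL', 'FR', 'RL', 'RR')]
wheels[1].had_flat = True

never_flat = {w for w in wheels if not w.had_flat}
assert len(never_flat) == 3
# Distinct wheels never collapse, even with identical attribute values:
assert Wheel('FL') != Wheel('FL')
```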
The original hash() and __eq__ implementation would work exactly right for this purpose, and it seems silly to have to add it to every object type that could possibly be used as a set member (especially since this means that if a third party library creates objects for you that don't implement __hash__ you'd have a hard time of adding it). In short, I agree something's broken, but the fix should not be to remove the default __hash__ and __eq__ altogether. Instead, the default __hash__ should be made smarter (and perhaps the only way to do this is to build the smarts into hash() again). I do agree that __cmp__, __gt__ etc. should be left undefined by default. All of this is Python 3000 only. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Mon Nov 7 21:49:21 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 7 Nov 2005 12:49:21 -0800 Subject: [Python-Dev] cross-compiling Message-ID: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> We've been having some issues and discussions at work about cross compiling. There are various people that have tried (are) cross compiling python. Right now the support kinda sucks due to a couple of reasons. First, distutils is required to build all the modules. This means that python must be built twice. Once for the target machine and once for the host machine. The host machine is really not desired since it's only purpose is to run distutils. I don't know the history of why distutils is used. I haven't had much of an issue with it since I've never needed to cross compile. What are the issues with not requiring python to be built on the host machine (ie, not using distutils)? Second, in configure we try to run little programs (AC_TRY_RUN) to determine what to set. I don't know of any good alternative but to force those to be defined manually for cross-compiled environments. Any suggestions here? 
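One standard autoconf mechanism for this is to pre-seed the answers configure would otherwise discover by running test programs, via a cache file. This is only a sketch: the cache-variable names and the target triplet below are illustrative, not necessarily the ones Python's configure actually uses.

```shell
# Sketch: supply the AC_TRY_RUN answers up front so configure never
# needs to execute target binaries. Variable names are illustrative.
cat > config.cache <<'EOF'
ac_cv_sizeof_long=${ac_cv_sizeof_long=4}
ac_cv_file_dev_ptmx=${ac_cv_file_dev_ptmx=no}
EOF

# Then configure with an explicit build/host pair and the cache file
# (a cross toolchain is assumed to be on PATH):
#   ./configure --build=i686-pc-linux-gnu --host=arm-linux \
#               --cache-file=config.cache
```

Normal native builds are untouched: without --cache-file, configure keeps running its test programs as before.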
I'm thinking we can skip the the AC_TRY_RUNs if host != target and we pickup the answers to those from a user supplied file. I'm *not* suggesting that normal builds see any change in behaviour. Nothing will change for most developers. ie, ./configure ; make ; ./python will continue to work the same. I only want to make it possible to cross compile python by building it only on the target platform. n PS. I would be interested to hear from others who are doing cross compiling and know more about it than me. From guido at python.org Mon Nov 7 22:04:53 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Nov 2005 13:04:53 -0800 Subject: [Python-Dev] cross-compiling In-Reply-To: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> Message-ID: <ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com> I know some folks have successfully used cross-compilation before. But this was in a distant past. There was some support for it in the configure script; surely you're using that? I believe it lets you specify defaults for the TRY_RUN macros. But it's probably very primitive. About using distutils to build the extensions, this is because some extensions require quite a bit of logic to determine the build commands (e.g. look at BSDDB or Tkinter). There was a pre-distutils way of building extensions using Modules/Setup* but this required extensive manual editing if tools weren't in the place where they were expected, and they never were. I don't have time to look into this further right now, but I hope I will in the future. Keep me posted! --Guido On 11/7/05, Neal Norwitz <nnorwitz at gmail.com> wrote: > We've been having some issues and discussions at work about cross > compiling. There are various people that have tried (are) cross > compiling python. Right now the support kinda sucks due to a couple > of reasons. > > First, distutils is required to build all the modules. 
This means > that python must be built twice. Once for the target machine and once > for the host machine. The host machine is really not desired since > it's only purpose is to run distutils. I don't know the history of > why distutils is used. I haven't had much of an issue with it since > I've never needed to cross compile. What are the issues with not > requiring python to be built on the host machine (ie, not using > distutils)? > > Second, in configure we try to run little programs (AC_TRY_RUN) to > determine what to set. I don't know of any good alternative but to > force those to be defined manually for cross-compiled environments. > Any suggestions here? I'm thinking we can skip the the AC_TRY_RUNs > if host != target and we pickup the answers to those from a user > supplied file. > > I'm *not* suggesting that normal builds see any change in behaviour. > Nothing will change for most developers. ie, ./configure ; make ; > ./python will continue to work the same. I only want to make it > possible to cross compile python by building it only on the target > platform. > > n > > PS. I would be interested to hear from others who are doing cross > compiling and know more about it than me. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Mon Nov 7 22:38:18 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 7 Nov 2005 16:38:18 -0500 Subject: [Python-Dev] cross-compiling In-Reply-To: <ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com> References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> <ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com> Message-ID: <e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com> On 11/7/05, Guido van Rossum <guido at python.org> wrote: > About using distutils to build the extensions, this is because some > extensions require quite a bit of logic to determine the build > commands (e.g. look at BSDDB or Tkinter). There was a pre-distutils > way of building extensions using Modules/Setup* but this required > extensive manual editing if tools weren't in the place where they were > expected, and they never were. I think part of the problem is that setup.py has a bunch of heuristics that are intended to do the right thing without user intervention.
If, on the other hand, the user wants to intervene, because "the right thing" is wrong for cross-compiling, you are kind of stuck. I don't think there is an obvious way to select the extension modules to build and the C libraries for them to link against. Jeremy From bcannon at gmail.com Mon Nov 7 22:41:33 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 7 Nov 2005 13:41:33 -0800 Subject: [Python-Dev] cross-compiling In-Reply-To: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> Message-ID: <bbaeab100511071341w49c8a593u12b1d0ab68ca1110@mail.gmail.com> On 11/7/05, Neal Norwitz <nnorwitz at gmail.com> wrote: > We've been having some issues and discussions at work about cross > compiling. There are various people that have tried (are) cross > compiling python. Right now the support kinda sucks due to a couple > of reasons. This might make a good sprint topic. Maybe your employer might be willing to get some people to come to hack on this? I know I wouldn't mind seeing the whole build process cleaned up. It works well enough, but I think some things could stand to be updated (speaking from experience of adding EXTRA_CFLAGS to the build process), such as setup.py being made more modular. -Brett From barry at python.org Mon Nov 7 23:05:38 2005 From: barry at python.org (Barry Warsaw) Date: Mon, 07 Nov 2005 17:05:38 -0500 Subject: [Python-Dev] cross-compiling In-Reply-To: <e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com> References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> <ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com> <e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com> Message-ID: <1131401138.4926.38.camel@geddy.wooz.org> On Mon, 2005-11-07 at 16:38, Jeremy Hylton wrote: > I think part of the problem is that setup.py has a bunch of heuristics > that are intended to do the right thing without user intervention. 
> If, on the other hand, the user wants to intervene, because "the right > thing" is wrong for cross-compiling, you are kind of stuck. I don't > think there is an obvious way to select the extension modules to build > and the C libraries for them to link against. This relates to an issue we've had to workaround with the distutils based module builds in Python. For some of the modules, we want the auto-detection code to find versions of dependent libraries in locations other than the "standard" system locations. I don't think there's a good way to convince the various setup.py scripts to look elsewhere for things, short of modifying the code. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051107/69363bd3/attachment.pgp From martin at v.loewis.de Mon Nov 7 23:34:26 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 07 Nov 2005 23:34:26 +0100 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com> References: <20051105151436.BFF7.JCARLSON@uci.edu> <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com> <20051105162001.C007.JCARLSON@uci.edu> <740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com> Message-ID: <436FD672.5070807@v.loewis.de> BJ?rn Lindqvist wrote: > How would the value equality operator deal with recursive objects? > > class Foo: > def __init__(self): > self.foo = self > > Seems to me that it would take atleast some special-casing to get > Foo() == Foo() to evalute to True in this case... 
This is sort-of supported today: >>> a=[] >>> a.append(a) >>> b=[] >>> b.append(b) >>> a == b True Regards, Martin From martin at v.loewis.de Mon Nov 7 23:38:34 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 07 Nov 2005 23:38:34 +0100 Subject: [Python-Dev] cross-compiling In-Reply-To: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> Message-ID: <436FD76A.3020401@v.loewis.de> Neal Norwitz wrote: > First, distutils is required to build all the modules. As Guido already suggests, this assertion is false. In a cross-compilation environment, I would try to avoid distutils, and indeed, the build process to do so is still supported. > Second, in configure we try to run little programs (AC_TRY_RUN) to > determine what to set. I don't know of any good alternative but to > force those to be defined manually for cross-compiled environments. > Any suggestions here? I'm thinking we can skip the the AC_TRY_RUNs > if host != target and we pickup the answers to those from a user > supplied file. You shouldn't be required to do that. Instead, just edit pyconfig.h manually, to match the target. autoconf is designed to support that. It would help if Makefile was target-independent (only host-dependent). Not sure whether this is the case. 
Regards, Martin From martin at v.loewis.de Mon Nov 7 23:39:18 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 07 Nov 2005 23:39:18 +0100 Subject: [Python-Dev] cross-compiling In-Reply-To: <e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com> References: <ee2a432c0511071249r41964175p166537722d4c51b0@mail.gmail.com> <ca471dc20511071304h5ce50ddar522e2ca216e44a4a@mail.gmail.com> <e8bf7a530511071338j4a02ee19y535f935d1c15979b@mail.gmail.com> Message-ID: <436FD796.3010208@v.loewis.de> Jeremy Hylton wrote: > I think part of the problem is that setup.py has a bunch of heuristics > that are intended to do the right thing without user intervention. > If, on the other hand, the user wants to intervene, because "the right > thing" is wrong for cross-compiling, you are kind of stuck. I don't > think there is an obvious way to select the extension modules to build > and the C libraries for them to link against. Of course there is: Modules/Setup. Regards, Martin From mwh at python.net Tue Nov 8 00:02:12 2005 From: mwh at python.net (Michael Hudson) Date: Mon, 07 Nov 2005 23:02:12 +0000 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? In-Reply-To: <436FD672.5070807@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Mon, 07 Nov 2005 23:34:26 +0100") References: <20051105151436.BFF7.JCARLSON@uci.edu> <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com> <20051105162001.C007.JCARLSON@uci.edu> <740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com> <436FD672.5070807@v.loewis.de> Message-ID: <2mbr0w12q3.fsf@starship.python.net> "Martin v. L?wis" <martin at v.loewis.de> writes: > BJ?rn Lindqvist wrote: >> How would the value equality operator deal with recursive objects? >> >> class Foo: >> def __init__(self): >> self.foo = self >> >> Seems to me that it would take atleast some special-casing to get >> Foo() == Foo() to evalute to True in this case... 
> > This is sort-of supported today: > > >>> a=[] > >>> a.append(a) > >>> b=[] > >>> b.append(b) > >>> a == b > True Uh, I think this changed in Python 2.4: >>> a = [] >>> a.append(a) >>> b = [] >>> b.append(b) >>> a == b Traceback (most recent call last): File "<stdin>", line 1, in ? RuntimeError: maximum recursion depth exceeded in cmp Cheers, mwh -- First of all, email me your AOL password as a security measure. You may find that won't be able to connect to the 'net for a while. This is normal. The next thing to do is turn your computer upside down and shake it to reboot it. -- Darren Tucker, asr From kbk at shore.net Tue Nov 8 03:29:49 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Mon, 7 Nov 2005 21:29:49 -0500 (EST) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200511080229.jA82TnHG017341@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 365 open ( +5) / 2961 closed ( +5) / 3326 total (+10) Bugs : 904 open (+11) / 5367 closed (+14) / 6271 total (+25) RFE : 200 open ( +1) / 189 closed ( +0) / 389 total ( +1) New / Reopened Patches ______________________ new function: os.path.relpath (2005-10-27) http://python.org/sf/1339796 opened by Richard Barran commands.getstatusoutput() (2005-11-02) http://python.org/sf/1346211 opened by Dawa Lama Better dead code elimination for the AST compiler (2005-11-02) http://python.org/sf/1346214 opened by Rune Holm A constant folding optimization pass for the AST (2005-11-02) http://python.org/sf/1346238 opened by Rune Holm Remove inconsistent behavior between import and zipimport (2005-11-03) http://python.org/sf/1346572 opened by Osvaldo Santana Neto Patch f. bug 495682 cannot handle http_proxy with user:pass@ (2005-11-05) CLOSED http://python.org/sf/1349117 opened by Johannes Nicolai Patch f. 
bug 495682 cannot handle http_proxy with user:pass@ (2005-11-05) http://python.org/sf/1349118 opened by Johannes Nicolai [PATCH] 100x optimization for ngettext (2005-11-06) http://python.org/sf/1349274 opened by Joe Wreschnig Fix for signal related abort in Visual Studio 2005 (2005-11-07) http://python.org/sf/1350409 opened by Adal Chiriliuc Redundant connect() call in logging.handlers.SysLogHandler (2005-11-07) http://python.org/sf/1350658 opened by Ken Lalonde Patches Closed ______________ tarfile.py: fix for bug #1336623 (2005-10-26) http://python.org/sf/1338314 closed by nnorwitz Python 2.4.2 doesn't build with "--without-threads" (2005-10-22) http://python.org/sf/1335054 closed by nnorwitz Speedup PyUnicode_DecodeCharmap (2005-10-05) http://python.org/sf/1313939 closed by lemburg Allow use of non-latin1 chars in interactive shell (2005-10-21) http://python.org/sf/1333679 closed by loewis Patch f. bug 495682 cannot handle http_proxy with user:pass@ (2005-11-05) http://python.org/sf/1349117 closed by birkenfeld New / Reopened Bugs ___________________ CVS webbrowser.py (1.40) bugs (2005-10-27) CLOSED http://python.org/sf/1339806 opened by Greg Couch TAB SETTINGS DONT WORK (win) (2005-10-27) http://python.org/sf/1339883 opened by reson5 time.strptime() with bad % code throws bad exception (2005-10-27) CLOSED http://python.org/sf/1340337 opened by Steve R. Hastings mmap does not accept length as 0 (2005-10-28) http://python.org/sf/1341031 opened by liturgist "\n" is incorrectly represented (2005-10-30) CLOSED http://python.org/sf/1341934 opened by Pavel Tkinter.Menu.delete doesn't delete command of entry (2005-10-30) http://python.org/sf/1342811 opened by Sverker Nilsson Broken docs for os.removedirs (2005-10-31) http://python.org/sf/1343671 opened by David K?gedal UNIX mmap leaks file descriptors (2005-11-01) CLOSED http://python.org/sf/1344508 opened by Erwin S. 
Andreasen colorsys tests, bug in frange (2005-11-01) CLOSED http://python.org/sf/1345263 opened by Rune Holm Python 2.4 and 2.3.5 won't build on OpenBSD 3.7 (2005-11-01) http://python.org/sf/1345313 opened by Dan doc typo (2005-11-02) CLOSED http://python.org/sf/1346026 opened by Keith Briggs Segfaults from unaligned loads in floatobject.c (2005-11-02) http://python.org/sf/1346144 opened by Rune Holm Missing import in documentation (2005-11-03) CLOSED http://python.org/sf/1346395 opened by Aggelos Orfanakos selectmodule.c calls PyInt_AsLong without error checking (2005-11-03) CLOSED http://python.org/sf/1346533 opened by Luke _subprocess.c calls PyInt_AsLong without error checking (2005-11-03) http://python.org/sf/1346547 opened by Luke httplib simply ignores CONTINUE (2005-11-03) http://python.org/sf/1346874 opened by Mike Looijmans FeedParser does not comply with RFC2822 (2005-11-04) http://python.org/sf/1347874 opened by Julian Phillips pydoc seems to run some scripts! (2005-11-04) http://python.org/sf/1348477 opened by Olly Betts email.Generators does not separates headers with "\r\n" (2005-11-05) http://python.org/sf/1349106 opened by Manlio Perillo xmlrpclib does not use http proxy (2005-11-06) http://python.org/sf/1349316 opened by greatred urllib.urlencode provides two features in one param (2005-11-06) http://python.org/sf/1349732 opened by Ori Avtalion urllib2 blocked from news.google.com (2005-11-07) CLOSED http://python.org/sf/1349977 opened by Michael Hoisie built-in method .__cmp__ (2005-11-07) http://python.org/sf/1350060 opened by Armin Rigo "setdlopenflags" leads to crash upon "import" (2005-11-07) http://python.org/sf/1350188 opened by John Pye CVS migration not in www.python.org docs (2005-11-07) http://python.org/sf/1350498 opened by Jim Jewett zlib.crc32 doesn't handle 0xffffffff seed (2005-11-07) http://python.org/sf/1350573 opened by Danny Yoo Bugs Closed ___________ CVS webbrowser.py (1.40) bugs (2005-10-27) http://python.org/sf/1339806 
deleted by gregcouch Memory keeping (2005-10-26) http://python.org/sf/1338264 closed by tim_one tarfile can't extract some tar archives.. (2005-10-24) http://python.org/sf/1336623 closed by nnorwitz time.strptime() with bad % code throws bad exception (2005-10-27) http://python.org/sf/1340337 closed by bcannon _socket module not build under cygwin (2005-09-22) http://python.org/sf/1298709 closed by jlt63 "\n" is incorrectly represented (2005-10-30) http://python.org/sf/1341934 closed by perky pydoc HTTP reload failure (2001-04-21) http://python.org/sf/417833 closed by ping UNIX mmap leaks file descriptors (2005-10-31) http://python.org/sf/1344508 closed by nnorwitz colorsys tests, bug in frange (2005-11-01) http://python.org/sf/1345263 closed by nnorwitz Python.h should include system headers properly [POSIX] (2005-10-25) http://python.org/sf/1337400 closed by loewis doc typo (2005-11-02) http://python.org/sf/1346026 closed by nnorwitz Missing import in documentation (2005-11-02) http://python.org/sf/1346395 closed by bcannon selectmodule.c calls PyInt_AsLong without error checking (2005-11-02) http://python.org/sf/1346533 closed by nnorwitz pydoc ignores $PAGER if TERM='dumb' (2002-12-09) http://python.org/sf/651124 closed by ping urllib2 blocked from news.google.com (2005-11-06) http://python.org/sf/1349977 closed by bcannon New / Reopened RFE __________________ please support the free visual studio sdk compiler (2005-11-04) http://python.org/sf/1348719 opened by David McNab From ronaldoussoren at mac.com Tue Nov 8 08:37:16 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Tue, 8 Nov 2005 08:37:16 +0100 Subject: [Python-Dev] Should the default equality operator compare values instead of identities? 
In-Reply-To: <436FD672.5070807@v.loewis.de>
References: <20051105151436.BFF7.JCARLSON@uci.edu> <b348a0850511051602u4db5e332mdbc3dcecbe95b170@mail.gmail.com> <20051105162001.C007.JCARLSON@uci.edu> <740c3aec0511070506v6833686ag4894270034e01559@mail.gmail.com> <436FD672.5070807@v.loewis.de>
Message-ID: <994C0B10-7382-4492-9EDE-1AE47BF7FA32@mac.com>

On 7-nov-2005, at 23:34, Martin v. Löwis wrote:

> Björn Lindqvist wrote:
>> How would the value equality operator deal with recursive objects?
>>
>> class Foo:
>>     def __init__(self):
>>         self.foo = self
>>
>> Seems to me that it would take at least some special-casing to get
>> Foo() == Foo() to evaluate to True in this case...
>
> This is sort-of supported today:

But only for lists ;-)

>>> a = {}
>>> a[1] = a
>>>
>>> b = {}
>>> b[1] = b
>>>
>>> a == b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
RuntimeError: maximum recursion depth exceeded in cmp
>>>

>>>> a=[]
>>>> a.append(a)
>>>> b=[]
>>>> b.append(b)
>>>> a == b
> True
>
> Regards,
> Martin
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com

From winlinchu at yahoo.it Tue Nov 8 10:16:06 2005
From: winlinchu at yahoo.it (winlinchu)
Date: Tue, 8 Nov 2005 10:16:06 +0100 (CET)
Subject: [Python-Dev] Unifying decimal numbers.
Message-ID: <20051108091607.13272.qmail@web26010.mail.ukl.yahoo.com>

Now, Python unified ints and long ints. For Python 3k, a "Decimal" type (yes, Decimal, the library module!) could be introduced in place of the actual float object. Of course, the Decimal type would be rewritten in C.

Thanks.

___________________________________
Yahoo!
Messenger: free calls worldwide
http://it.messenger.yahoo.com

From support at intercable.ru Tue Nov 8 08:59:42 2005
From: support at intercable.ru (Technical Support of Intercable Co)
Date: Tue, 08 Nov 2005 10:59:42 +0300
Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison
Message-ID: <43705AEE.2050600@intercable.ru>

Why can't 'identity' objects define:

    def __getKey__(self):
        return Key(self, id(self))

Then they would act as usual, while a value object can define

    def __getKey__(self):
        return Key(self, self.i, self.j, self.a[1])

(Key is an abstraction to handle subclassing.)

Of course, there should be a way to handle comparison outside the class hierarchy (I think). Today one can write:

>>> class Boo(object):
...     def __init__(self, s=""):
...         self.s = s
...     def __hash__(self):
...         return hash(self.s)
...     def __cmp__(self, other):
...         if type(self) == type(other):
...             return cmp(self.s, other.s)
...         if type(other) == str:
...             return cmp(self.s, other)
...
>>> a = {}
>>> a['s'] = 1
>>> a[Boo('s')]
1
>>> a[Boo('z')] = 2
>>> a['z']
2

It is confusing and hardly useful, but possible. Excuse my english.

From jcarlson at uci.edu Tue Nov 8 19:26:16 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 08 Nov 2005 10:26:16 -0800
Subject: [Python-Dev] Unifying decimal numbers.
In-Reply-To: <20051108091607.13272.qmail@web26010.mail.ukl.yahoo.com>
References: <20051108091607.13272.qmail@web26010.mail.ukl.yahoo.com>
Message-ID: <20051108101410.C050.JCARLSON@uci.edu>

winlinchu <winlinchu at yahoo.it> wrote:
> Now, Python unified ints and long ints. For Python 3k, a "Decimal"
> type (yes, Decimal, the library module!) could be introduced in place
> of the actual float object. Of course, the Decimal type would be
> rewritten in C.

There is code which relies on standard IEEE 754 floating point math (speed, behavior, rounding, etc.) that would break with the replacement of floats with decimals.
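Concretely, the behavioral difference such code relies on can be seen in a couple of lines (a sketch using the pure-Python decimal module the thread is discussing):

```python
from decimal import Decimal

# IEEE 754 binary floats cannot represent 0.1 exactly, so this
# well-known identity fails for floats...
assert 0.1 + 0.2 != 0.3

# ...but holds for Decimal, which stores base-10 digits exactly.
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
```

Code written against IEEE 754 semantics (rounding behavior, bit-level representation, C interoperability) depends on the first result, so swapping the types changes its answers silently.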
Further, even if it were to be converted to C, it would likely be hundreds of times slower than the processor-native float operations. This discussion has been had before; searching Google for "site:mail.python.org decimal replace float python" turns up such discussions. For example:

http://mail.python.org/pipermail/python-dev/2005-June/054416.html

 - Josiah

From mdehoon at c2b2.columbia.edu Tue Nov 8 20:46:48 2005
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Tue, 08 Nov 2005 14:46:48 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
Message-ID: <437100A7.5050907@c2b2.columbia.edu>

Dear Pythoneers,

I use Python heavily for scientific computing, originally for computational physics and nowadays for computational biology. One of the extension modules I support is pygist, a module for scientific visualization. For this (and similar) packages to work, it is important to have an event loop in Python. Currently, event loops are available in Python via PyOS_InputHook, a pointer to a user-defined function that is called when Python is idle (waiting for user input). However, an event loop using PyOS_InputHook has some inherent limitations, so I am thinking about how to improve event loop support in Python.

As an example, consider the current implementation of Tkinter. What's nice about it is that events as well as user-typed Python commands are handled without having to call mainloop() explicitly (except on some platforms): "import Tkinter; Tkinter.Tk()" causes a Tk window to pop up that remains responsive throughout. It works as follows (using Tkinter as an example; pygist works essentially the same):

1) Importing Tkinter causes PyOS_InputHook to be set to the EventHook function in _tkinter.c.
2) Before Python calls fgets to read the next Python command typed by the user, it checks PyOS_InputHook and calls it if it is not NULL.
3) The EventHook function in _tkinter runs the following loop:
   - Check if user input is present; if so, exit the loop
   - Handle a Tcl/Tk event, if present
   - Sleep for 20 milliseconds
4) Once the EventHook function returns, Python continues to read the next user command. After executing the command, return to 2).

However, this implementation has the following problems:

1) Essentially, the event loop is a busy-wait loop with a 20 ms sleep in between. An event loop using select() (or equivalent on Windows) will give better performance.

2) Since this event loop runs inside Tkinter, there is no way for other extension modules to get their messages handled. Hence, we cannot have more than one extension module that needs an event loop. As an example, it would be nice to have a Tkinter GUI to steer a simulation and a (non-Tk) graphics output window to visualize the simulation.

3) Whereas PyOS_InputHook is called when Python is waiting for user input, it is not called when Python is waiting for anything else, for example one thread waiting for another. For example, IDLE uses two threads, one handling the GUI and one handling the user commands. When the second thread is waiting for the first thread (when waiting for user input to become available), PyOS_InputHook is not being called, and no Tkinter events are being handled. Hence, "import Tkinter; Tkinter.Tk()" does nothing when executed from an IDLE window. This means that our scientific visualization software can only be run from Python started from the command line, whereas many users (especially on Windows) will want to use IDLE.

Now the problem I'm facing is that because of its integration with Tcl, this cannot be fixed easily with Tkinter as the GUI toolkit for Python. If the events to be handled were purely graphical events (move a window, repaint a window, etc.), there would be no harm in handling these events when waiting for e.g. another thread.
With Tkinter, however, we cannot enter EventHook while waiting for another thread:
a) because EventHook only returns if user input is available (it doesn't wait for threads);
b) because EventHook also runs Tcl/Tk commands, and we wouldn't want to run some Tcl commands in some thread while waiting for another thread.

Therefore, as far as I can tell, there is no way to set up a true event loop in Python that works nicely with Tkinter, and there is no way to have an event loop function from IDLE. So I'd like to ask the following questions:

1) Did I miss something? Is there some way to get an event loop with Tkinter?
2) Will Tkinter always be the standard toolkit for Python, or are there plans to replace it at some point?

I realize that Tkinter has been an important part of Python for some time now, and I don't expect it to be ditched just because of my event loop problems. At the same time, event loop support could use some improvement, so I'd like to call your attention to this issue. Tcl actually has event loops implemented quite nicely, and may serve as an example of how event loops may work in Python.

--Michiel.

-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

From janssen at parc.com Tue Nov 8 21:32:59 2005
From: janssen at parc.com (Bill Janssen)
Date: Tue, 8 Nov 2005 12:32:59 PST
Subject: [Python-Dev] Unifying decimal numbers.
In-Reply-To: Your message of "Tue, 08 Nov 2005 10:26:16 PST." <20051108101410.C050.JCARLSON@uci.edu>
Message-ID: <05Nov8.123259pst."58633"@synergy1.parc.xerox.com>

Might be more interesting to think about replacing ints and Decimal with an implicit-denominator rational type. In the HTTP-NG typing proposal, we called this a "fixed-point" type. See Section 4.5.1 of

http://www.w3.org/Protocols/HTTP-NG/1998/08/draft-frystyk-httpng-arch-00.txt

for details.
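A rough Python sketch of the implicit-denominator idea (the class and helper names here are invented for illustration; the actual type system is in the HTTP-NG document above):

```python
from fractions import Fraction

class FixedPoint:
    """An integer count of 1/denominator units (implicit-denominator rational)."""

    def __init__(self, units, denominator):
        self.units = int(units)
        self.denominator = denominator

    @classmethod
    def from_value(cls, value, denominator):
        # Round the value to the nearest representable unit.
        return cls(round(value * denominator), denominator)

    def __add__(self, other):
        if self.denominator != other.denominator:
            raise TypeError("mixed denominators need explicit conversion")
        return FixedPoint(self.units + other.units, self.denominator)

    def value(self):
        return Fraction(self.units, self.denominator)

def dollars(v):
    # A denominator of 100 makes one unit equal to a whole cent.
    return FixedPoint.from_value(v, 100)

total = dollars(1.10) + dollars(2.20)   # stored as 330 units of 1/100
```

With a denominator of 1 this reduces to plain integers; other denominators give exact dollars, dozens, or thirds without binary-float drift.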
The current notion of "int" would be defined as a specific kind of fixed-point type (a denominator of 1), but other fixed-point types such as dollars (denominator of 100) or dozens (denominator of 1/12) could also be defined. The nice thing about type systems like this is that they can accurately describe non-binary values, like 1/3.

Bill

From martin at v.loewis.de Tue Nov 8 21:37:41 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 08 Nov 2005 21:37:41 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <437100A7.5050907@c2b2.columbia.edu>
References: <437100A7.5050907@c2b2.columbia.edu>
Message-ID: <43710C95.30209@v.loewis.de>

Michiel Jan Laurens de Hoon wrote:
> 1) Did I miss something? Is there some way to get an event loop with
> Tkinter?

Yes, and yes. You are missing multi-threading, which is the widely used approach to doing things simultaneously in a single process. In one thread, user interaction can occur; in another, computation. If you need non-blocking interaction between the threads, use queues, or other global variables. If you have other event sources, deal with them in separate threads.

Yes, it is possible to get event loops with Tkinter. At least on Unix, you can install a file handler into the Tk event loop (through createfilehandler), which gives you callbacks whenever there is some activity on the files. Furthermore, it is possible to turn the event loop around, by doing dooneevent explicitly.

In principle, it would also be possible to expose Tcl events and notifications in Tkinter (i.e. the Tcl_CreateEventSource/Tcl_WaitForEvent family of APIs). If you think this would help in your case, then contributions are welcome.

> 2) Will Tkinter always be the standard toolkit for Python, or are there
> plans to replace it at some point?

Python does not evolve along a grand master plan. Instead, individual contributors propose specific modifications, e.g. through PEPs.
I personally have no plan to replace Tkinter.

Regards,
Martin

From goodger at python.org Wed Nov 9 00:54:41 2005
From: goodger at python.org (David Goodger)
Date: Tue, 08 Nov 2005 18:54:41 -0500
Subject: [Python-Dev] PEP submission broken?
In-Reply-To: <43695FE5.1080803__36597.7150541314$1131334275$gmane$org@nowhere.org>
References: <43695FE5.1080803__36597.7150541314$1131334275$gmane$org@nowhere.org>
Message-ID: <43713AC1.5060803@python.org>

[Bryan Olson]
> Though I tried to submit a (pre-) PEP in the proper form through the
> proper channels, it has disappeared into the ether.

Indeed, it has; I can't find it in my mailbox. Could you re-send the latest text? I'll review it right away.

> From what I can tell, we need to address fixing the
> PEP process before there is any point in working on PEP's,

Email is imperfect; just send it again. And "fakeaddress at nowhere.org" doesn't help ;-)

-- 
David Goodger <http://python.net/~goodger>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 253 bytes
Desc: OpenPGP digital signature
Url: http://mail.python.org/pipermail/python-dev/attachments/20051108/c039f37b/signature.pgp

From goodger at python.org Wed Nov 9 00:55:35 2005
From: goodger at python.org (David Goodger)
Date: Tue, 08 Nov 2005 18:55:35 -0500
Subject: [Python-Dev] PEP submission broken?
In-Reply-To: <436F1F49.90606@gmail.com>
References: <43695FE5.1080803@nowhere.org> <436F1F49.90606@gmail.com>
Message-ID: <43713AF7.1080600@python.org>

[Nick Coghlan]
> Would it be worth having a PEP category on the RFE tracker, rather than
> submitting pre-PEP's directly to the PEP editors?

Couldn't hurt.

-- 
David Goodger <http://python.net/~goodger>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 253 bytes
Desc: OpenPGP digital signature
Url: http://mail.python.org/pipermail/python-dev/attachments/20051108/c555115d/signature.pgp

From osantana at gmail.com Wed Nov 9 03:33:47 2005
From: osantana at gmail.com (Osvaldo Santana Neto)
Date: Tue, 8 Nov 2005 23:33:47 -0300
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
Message-ID: <20051109023347.GA15823@localhost.localdomain>

Hi,

I'm working on the Python[1] port for the Maemo Platform[2] and I've found an inconsistent behavior in the zipimport and import hooks with '.pyc' and '.pyo' files. The shell session below shows this problem using a 'module_c.pyc', 'module_o.pyo' and 'modules.zip' (with module_c and module_o inside):

$ ls
module_c.pyc  module_o.pyo  modules.zip

$ python
>>> import module_c
>>> import module_o
ImportError: No module named module_o

$ python -O
>>> import module_c
ImportError: No module named module_c
>>> import module_o

$ rm *.pyc *.pyo
$ PYTHONPATH=modules.zip python
>>> import module_c
module_c
>>> import module_o
module_o

$ PYTHONPATH=modules.zip python -O
>>> import module_c
module_c
>>> import module_o
module_o

I've created a patch suggestion to remove this inconsistency[3] (*I* think the zipimport behaviour is better).

[1] http://pymaemo.sf.net/
[2] http://www.maemo.org/
[3] http://python.org/sf/1346572

-- 
Osvaldo Santana Neto (aCiDBaSe)
icq, url = (11287184, "http://www.pythonbrasil.com.br")

From guido at python.org Wed Nov 9 04:14:51 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 Nov 2005 19:14:51 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <20051109023347.GA15823@localhost.localdomain>
References: <20051109023347.GA15823@localhost.localdomain>
Message-ID: <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>

You didn't show us what's in the zip file. Can you show a zipinfo output?
My intention with import was always that without -O, *.pyo files are entirely ignored; and with -O, *.pyc files are entirely ignored.

It sounds like you're saying that you want to change this so that .pyc and .pyo are always honored (with .pyc preferred if -O is not present and .pyo preferred if -O is present). I'm not sure that I like that better. If that's how zipimport works, I think it's broken!

--Guido

On 11/8/05, Osvaldo Santana Neto <osantana at gmail.com> wrote:
> Hi,
>
> I'm working on the Python[1] port for the Maemo Platform[2] and I've found
> an inconsistent behavior in the zipimport and import hooks with '.pyc' and
> '.pyo' files. The shell session below shows this problem using a
> 'module_c.pyc', 'module_o.pyo' and 'modules.zip' (with module_c and
> module_o inside):
>
> $ ls
> module_c.pyc  module_o.pyo  modules.zip
>
> $ python
> >>> import module_c
> >>> import module_o
> ImportError: No module named module_o
>
> $ python -O
> >>> import module_c
> ImportError: No module named module_c
> >>> import module_o
>
> $ rm *.pyc *.pyo
> $ PYTHONPATH=modules.zip python
> >>> import module_c
> module_c
> >>> import module_o
> module_o
>
> $ PYTHONPATH=modules.zip python -O
> >>> import module_c
> module_c
> >>> import module_o
> module_o
>
> I've created a patch suggestion to remove this inconsistency[3] (*I* think
> the zipimport behaviour is better).
>
> [1] http://pymaemo.sf.net/
> [2] http://www.maemo.org/
> [3] http://python.org/sf/1346572
>
> -- 
> Osvaldo Santana Neto (aCiDBaSe)
> icq, url = (11287184, "http://www.pythonbrasil.com.br")
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gjc at inescporto.pt Wed Nov 9 12:40:25 2005
From: gjc at inescporto.pt (Gustavo J. A. M.
Carneiro)
Date: Wed, 09 Nov 2005 11:40:25 +0000
Subject: [Python-Dev] Weak references: dereference notification
Message-ID: <1131536425.9130.10.camel@localhost>

Hello,

I have come across a situation where I find the current weak references interface for extension types insufficient.

Currently you only have a tp_weaklistoffset slot, pointing to a PyObject with weak references. However, in my case[1] I _really_ need to be notified when a weak reference is dereferenced. What happens now is that, when you call a weakref object, a simple Py_INCREF is done on the referenced object. It would be easy to implement a new slot to contain a function that should be called when a weak reference is dereferenced. Or, alternatively, a slot or class attribute that indicates an alternative type that should be used to create weak references: instead of the builtin weakref object, a subtype of it, so you can override tp_call.

Does this sound acceptable?

Regards.

[1] http://bugzilla.gnome.org/show_bug.cgi?id=320428

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.

From osantana at gmail.com Wed Nov 9 15:33:02 2005
From: osantana at gmail.com (Osvaldo Santana)
Date: Wed, 9 Nov 2005 12:33:02 -0200
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com>
Message-ID: <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>

On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> You didn't show us what's in the zip file. Can you show a zipinfo output?
$ zipinfo modules.zip
Archive:  modules.zip   426 bytes   2 files
-rw-r--r--  2.3 unx      109 bx defN 31-Oct-05 14:49 module_o.pyo
-rw-r--r--  2.3 unx      109 bx defN 31-Oct-05 14:48 module_c.pyc
2 files, 218 bytes uncompressed, 136 bytes compressed:  37.6%

> My intention with import was always that without -O, *.pyo files are
> entirely ignored; and with -O, *.pyc files are entirely ignored.
>
> It sounds like you're saying that you want to change this so that .pyc
> and .pyo are always honored (with .pyc preferred if -O is not present
> and .pyo preferred if -O is present). I'm not sure that I like that
> better. If that's how zipimport works, I think it's broken!

Yes, this is how zipimport works and I think this is good in cases where a third-party binary module/package is available only with .pyo files and others only with .pyc files (without .py source files, of course).

I know we can rename the files, but is this a good solution? Well, I don't have a strong opinion about the solution adopted and I would really like to see other alternatives and opinions.

Thanks,
Osvaldo

-- 
Osvaldo Santana Neto (aCiDBaSe)
icq, url = (11287184, "http://www.pythonbrasil.com.br")

From guido at python.org Wed Nov 9 16:39:29 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 07:39:29 -0800
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com>
Message-ID: <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>

Maybe it makes more sense to deprecate .pyo altogether and instead have a post-load optimizer optimize .pyc files according to the current optimization settings?

Unless others are interested in this nothing will happen.
I've never heard of a third party making their code available only as .pyo, so the use case for changing things isn't very strong. In fact the only use cases I know for not making .py available are in situations where a proprietary "canned" application is distributed to end users who have no intention or need to ever add to the code.

--Guido

On 11/9/05, Osvaldo Santana <osantana at gmail.com> wrote:
> On 11/9/05, Guido van Rossum <guido at python.org> wrote:
> > You didn't show us what's in the zip file. Can you show a zipinfo output?
>
> $ zipinfo modules.zip
> Archive:  modules.zip   426 bytes   2 files
> -rw-r--r--  2.3 unx      109 bx defN 31-Oct-05 14:49 module_o.pyo
> -rw-r--r--  2.3 unx      109 bx defN 31-Oct-05 14:48 module_c.pyc
> 2 files, 218 bytes uncompressed, 136 bytes compressed:  37.6%
>
> > My intention with import was always that without -O, *.pyo files are
> > entirely ignored; and with -O, *.pyc files are entirely ignored.
> >
> > It sounds like you're saying that you want to change this so that .pyc
> > and .pyo are always honored (with .pyc preferred if -O is not present
> > and .pyo preferred if -O is present). I'm not sure that I like that
> > better. If that's how zipimport works, I think it's broken!
>
> Yes, this is how zipimport works and I think this is good in cases
> where a third-party binary module/package is available only with .pyo
> files and others only with .pyc files (without .py source files, of
> course).
>
> I know we can rename the files, but is this a good solution? Well, I
> don't have a strong opinion about the solution adopted and I would really
> like to see other alternatives and opinions.
>
> Thanks,
> Osvaldo
>
> -- 
> Osvaldo Santana Neto (aCiDBaSe)
> icq, url = (11287184, "http://www.pythonbrasil.com.br")
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim at zope.com Wed Nov 9 17:50:44 2005
From: jim at zope.com (Jim Fulton)
Date: Wed, 09 Nov 2005 11:50:44 -0500
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131536425.9130.10.camel@localhost>
References: <1131536425.9130.10.camel@localhost>
Message-ID: <437228E4.4070800@zope.com>

Gustavo J. A. M. Carneiro wrote:
> Hello,
>
> I have come across a situation where I find the current weak
> references interface for extension types insufficient.
>
> Currently you only have a tp_weaklistoffset slot, pointing to a
> PyObject with weak references. However, in my case[1] I _really_ need
> to be notified when a weak reference is dereferenced. What happens now
> is that, when you call a weakref object, a simple Py_INCREF is done on
> the referenced object. It would be easy to implement a new slot to
> contain a function that should be called when a weak reference is
> dereferenced. Or, alternatively, a slot or class attribute that
> indicates an alternative type that should be used to create weak
> references: instead of the builtin weakref object, a subtype of it, so
> you can override tp_call.
>
> Does this sound acceptable?

Since you can now (as of 2.4) subclass the weakref.ref class, you should be able to do this yourself in Python. See for example, weakref.KeyedRef.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From gjc at inescporto.pt Wed Nov 9 18:14:59 2005
From: gjc at inescporto.pt (Gustavo J. A. M.
Carneiro)
Date: Wed, 09 Nov 2005 17:14:59 +0000
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <437228E4.4070800@zope.com>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com>
Message-ID: <1131556500.9130.18.camel@localhost>

Qua, 2005-11-09 às 11:50 -0500, Jim Fulton escreveu:
> Gustavo J. A. M. Carneiro wrote:
> > Hello,
> >
> > I have come across a situation where I find the current weak
> > references interface for extension types insufficient.
> >
> > Currently you only have a tp_weaklistoffset slot, pointing to a
> > PyObject with weak references. However, in my case[1] I _really_ need
> > to be notified when a weak reference is dereferenced. What happens now
> > is that, when you call a weakref object, a simple Py_INCREF is done on
> > the referenced object. It would be easy to implement a new slot to
> > contain a function that should be called when a weak reference is
> > dereferenced. Or, alternatively, a slot or class attribute that
> > indicates an alternative type that should be used to create weak
> > references: instead of the builtin weakref object, a subtype of it, so
> > you can override tp_call.
> >
> > Does this sound acceptable?
>
> Since you can now (as of 2.4) subclass the weakref.ref class, you should be able to
> do this yourself in Python. See for example, weakref.KeyedRef.

I know I can subclass it, but it doesn't change anything. If people keep writing code like weakref.ref(myobj) instead of myweakref(myobj), it still won't work.

I wouldn't want to have to teach users of the library that they need to use an alternative type; that seldom works.

Now, if there was a place in the type that contained information like

  "for creating weak references of instances of this type, use this
   weakref class"

and weakref.ref was smart enough to look up this type and use it, only _then_ it could work.

Thanks,

-- 
Gustavo J. A. M.
Carneiro <gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.

From guido at python.org Wed Nov 9 18:23:34 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Nov 2005 09:23:34 -0800
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131556500.9130.18.camel@localhost>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost>
Message-ID: <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>

> > Gustavo J. A. M. Carneiro wrote:
> > > I have come across a situation where I find the current weak
> > > references interface for extension types insufficient.
> > >
> > > Currently you only have a tp_weaklistoffset slot, pointing to a
> > > PyObject with weak references. However, in my case[1] I _really_ need
> > > to be notified when a weak reference is dereferenced.

I find reading through the bug discussion a bit difficult to understand your use case. Could you explain it here? If you can't explain it you certainly won't get your problem solved! :-)

> > > What happens now
> > > is that, when you call a weakref object, a simple Py_INCREF is done on
> > > the referenced object. It would be easy to implement a new slot to
> > > contain a function that should be called when a weak reference is
> > > dereferenced. Or, alternatively, a slot or class attribute that
> > > indicates an alternative type that should be used to create weak
> > > references: instead of the builtin weakref object, a subtype of it, so
> > > you can override tp_call.
> > >
> > > Does this sound acceptable?

[Jim Fulton]
> > Since you can now (as of 2.4) subclass the weakref.ref class, you should be able to
> > do this yourself in Python. See for example, weakref.KeyedRef.

> I know I can subclass it, but it doesn't change anything. If people
> keep writing code like weakref.ref(myobj) instead of myweakref(myobj),
> it still won't work.
>
> I wouldn't want to have to teach users of the library that they need
> to use an alternative type; that seldom works.
>
> Now, if there was a place in the type that contained information like
>
> "for creating weak references of instances of this type, use this
> weakref class"
>
> and weakref.ref was smart enough to look up this type and use it, only
> _then_ it could work.

Looks like what you're looking for is a customizable factory function.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nico at tekNico.net Wed Nov 9 17:24:01 2005
From: nico at tekNico.net (Nicola Larosa)
Date: Wed, 09 Nov 2005 17:24:01 +0100
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com>
Message-ID: <dkt7r2$amm$1@sea.gmane.org>

> Maybe it makes more sense to deprecate .pyo altogether and instead
> have a post-load optimizer optimize .pyc files according to the
> current optimization settings?

That would not be enough, because it would leave the docstrings in the .pyc files.

> Unless others are interested in this nothing will happen.

The status quo is good enough, for "normal" imports. If zipimport works differently, well, that's not nice.

> I've never heard of a third party making their code available only as
> .pyo,

*cough* Ahem, here we are (the firm I work for).

> so the use case for changing things isn't very strong. In fact
> the only use cases I know for not making .py available are in
> situations where a proprietary "canned" application is distributed to
> end users who have no intention or need to ever add to the code.

Well, exactly.
:-)

-- 
Nicola Larosa - nico at tekNico.net

No inventions have really significantly eased the cognitive
difficulty of writing scalable concurrent applications and it is
unlikely that any will in the near term. [...] Most of all, threads
do not help, in fact, they make the problem worse in many cases.
 -- G. Lefkowitz, August 2005

From gjc at inescporto.pt Wed Nov 9 18:52:19 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Wed, 09 Nov 2005 17:52:19 +0000
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com>
Message-ID: <1131558739.9130.40.camel@localhost>

Qua, 2005-11-09 às 09:23 -0800, Guido van Rossum escreveu:
> > > Gustavo J. A. M. Carneiro wrote:
> > > > I have come across a situation where I find the current weak
> > > > references interface for extension types insufficient.
> > > >
> > > > Currently you only have a tp_weaklistoffset slot, pointing to a
> > > > PyObject with weak references. However, in my case[1] I _really_ need
> > > > to be notified when a weak reference is dereferenced.
>
> I find reading through the bug discussion a bit difficult to
> understand your use case. Could you explain it here? If you can't
> explain it you certainly won't get your problem solved! :-)

This is a typical PyObject-wrapping-a-C-object (GObject) problem. Both the PyObject and the GObject have independent reference counts. For each GObject there is at most one PyObject wrapper.

When the refcount on the wrapper drops to zero, tp_dealloc is called. In tp_dealloc, and if the GObject refcount is > 1, I do something slightly evil: I 'resurrect' the PyObject (calling PyObject_Init), create a weak reference to the GObject, and drop the "strong" reference. I call this a 'hibernation state'.

Now the problem.
Suppose the user had a weak ref to the PyObject: 1- At a certain point in time, when the wrapper is in hibernation state, the user calls the weak ref; 2- It gets a PyObject that contains a weak reference to the GObject; 3- Now suppose whatever was holding the GObject ref drops its reference, which was the last one, and the GObject dies; 4- Now the user does something with the PyObject obtained through the weakref -> invalid memory access. The cause of the problem is that between steps 2 and 3 the wrapper needs to change the weak reference to the GObject to a strong one. Unfortunately, I don't get any notification that step 2 happened. BTW, I fixed this problem in the meantime with a bit more slightly evil code: I override tp_call of the standard weakref type :-P [...] > > and weakref.ref was smart enough to look up this type and use it, only > > _then_ it could work. > > Looks like what you're looking for is a customizable factory function. Sure, if weakref.ref could be such a factory, and could take "advice" on what type of weakref to use for each class. Regards. -- Gustavo J. A. M. Carneiro <gjc at inescporto.pt> <gustavo at users.sourceforge.net> The universe is always one step beyond logic. 
From osantana at gmail.com Wed Nov 9 19:15:04 2005 From: osantana at gmail.com (Osvaldo Santana) Date: Wed, 9 Nov 2005 16:15:04 -0200 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> Message-ID: <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com> On 11/9/05, Guido van Rossum <guido at python.org> wrote: > Maybe it makes more sense to deprecate .pyo altogether and instead > have a post-load optimizer optimize .pyc files according to the > current optimization settings? I agree with this idea, but we have to think about docstrings (like Nicola said in his e-mail). Maybe we want to create a different, optimization-independent option to remove docstrings from modules? > Unless others are interested in this nothing will happen. > > I've never heard of a third party making their code available only as > .pyo, so the use case for changing things isn't very strong. In fact > the only use cases I know for not making .py available are in > situations where a proprietary "canned" application is distributed to > end users who have no intention or need to ever add to the code. I have another use case: I'm porting Python to the Maemo Platform and I want to reduce the size of modules. The .pyo files (-OO) are smaller than .pyc files (mainly because of docstring removal), and we started to use this optimization flag to compile our Python distribution. In this case we want to force developers to call the Python interpreter with the -O flag, set PYTHONOPTIMIZE, or apply my patch :) to make this more transparent. I've noticed this inconsistency when we stopped using zipimport in our Python For Maemo distribution. 
We've decided to stop using zipimport because the device (Nokia 770) uses a compressed filesystem. Some friends (mainly Gustavo Barbieri) helped me create the suggested patch after some discussion on our PythonBrasil mailing list. Thanks, Osvaldo -- Osvaldo Santana Neto (aCiDBaSe) icq, url = (11287184, "http://www.pythonbrasil.com.br") From guido at python.org Wed Nov 9 20:32:54 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Nov 2005 11:32:54 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com> Message-ID: <ca471dc20511091132v4a5df88fy835da4ef092be053@mail.gmail.com> On 11/9/05, Osvaldo Santana <osantana at gmail.com> wrote: > I've noticed this inconsistency when we stopped using zipimport in our > Python For Maemo distribution. We've decided to stop using zipimport > because the device (Nokia 770) uses a compressed filesystem. I won't comment further on the brainstorm that's going on (this is becoming a topic for c.l.py), but I think you are misunderstanding the point of zipimport. It's not done (usually) for the compression but for the index. Finding a name in the zipfile index is much more efficient than doing a directory search; and the zip index can be cached. 
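[Editor's note: the index-based lookup Guido describes is easy to see from Python itself; putting an archive on sys.path makes zipimport read the zip's central directory once and cache an importer for it. A minimal self-contained sketch, with invented file and module names:]

```python
import os
import sys
import tempfile
import zipfile

# Build an archive containing one module, then import it through zipimport.
tmp = tempfile.mkdtemp()
archive = os.path.join(tmp, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("greet.py", "def hello():\n    return 'hello from the zip'\n")

sys.path.insert(0, archive)
import greet                        # served straight out of the archive

print(greet.hello())                # -> hello from the zip

# The importer (holding the cached index) is remembered per path entry,
# so later imports never re-scan the archive.
print(archive in sys.path_importer_cache)   # -> True
```

Every subsequent import consults that cached index instead of hitting the filesystem, which is the efficiency point being made above.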
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ronaldoussoren at mac.com Wed Nov 9 20:40:02 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed, 9 Nov 2005 20:40:02 +0100 Subject: [Python-Dev] Weak references: dereference notification In-Reply-To: <1131558739.9130.40.camel@localhost> References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com> <1131558739.9130.40.camel@localhost> Message-ID: <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> On 9-nov-2005, at 18:52, Gustavo J. A. M. Carneiro wrote: > Qua, 2005-11-09 às 09:23 -0800, Guido van Rossum escreveu: >>>> Gustavo J. A. M. Carneiro wrote: >>>>> I have come across a situation where I find the current weak >>>>> references interface for extension types insufficient. >>>>> >>>>> Currently you only have a tp_weaklistoffset slot, pointing to a >>>>> PyObject with weak references. However, in my case[1] I >>>>> _really_ need >>>>> to be notified when a weak reference is dereferenced. >> >> I find reading through the bug discussion a bit difficult to >> understand your use case. Could you explain it here? If you can't >> explain it you certainly won't get your problem solved! :-) > > This is a typical PyObject wrapping C object (GObject) problem. > Both > PyObject and GObject have independent reference counts. For each > GObject there is at most one PyObject wrapper. > > When the refcount on the wrapper drops to zero, tp_dealloc is > called. > In tp_dealloc, and if the GObject refcount is > 1, I do something > slightly evil: I 'resurrect' the PyObject (calling PyObject_Init), > create > a weak reference to the GObject, and drop the "strong" reference. I > call this a 'hibernation state'. Why do you do that? The only reasons I can think of are that you hope to gain some speed from this or that you want to support weak references to the GObject. 
For what it's worth, in PyObjC we don't support weak references to the underlying Objective-C object and delete the proxy object when it is garbage collected. Objective-C also has reference counts; we increase that in the constructor for the proxy object and decrease it again in the destructor. Ronald From pje at telecommunity.com Wed Nov 9 20:48:25 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 09 Nov 2005 14:48:25 -0500 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511091132v4a5df88fy835da4ef092be053@mail.gmail.com> References: <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com> <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com> Message-ID: <5.1.1.6.0.20051109144523.01f4a6a8@mail.telecommunity.com> At 11:32 AM 11/9/2005 -0800, Guido van Rossum wrote: >On 11/9/05, Osvaldo Santana <osantana at gmail.com> wrote: > > I've noticed this inconsistency when we stopped using zipimport in our > > Python For Maemo distribution. We've decided to stop using zipimport > > because the device (Nokia 770) uses a compressed filesystem. >I won't comment further on the brainstorm that's going on (this is >becoming a topic for c.l.py) but I think you are misunderstanding the >point of zipimport. It's not done (usually) for the compression but >for the index. Finding a name in the zipfile index is much more >efficient than doing a directory search; and the zip index can be >cached. zipimport also helps distribution convenience - a large and elaborate package can be distributed in a single zipfile (such as is built by setuptools' "bdist_egg" command) and simply placed on PYTHONPATH or directly on sys.path. 
And tools like py2exe can also append all an application's modules to an executable file in zipped form. From barbieri at gmail.com Wed Nov 9 22:12:38 2005 From: barbieri at gmail.com (Gustavo Sverzut Barbieri) Date: Wed, 9 Nov 2005 19:12:38 -0200 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511091132v4a5df88fy835da4ef092be053@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <b674ca220511091015s6d0a36bcj9ec1bd04dff93559@mail.gmail.com> <ca471dc20511091132v4a5df88fy835da4ef092be053@mail.gmail.com> Message-ID: <9ef20ef30511091312lcaa1caetbe0c4bade802738a@mail.gmail.com> On 11/9/05, Guido van Rossum <guido at python.org> wrote: > On 11/9/05, Osvaldo Santana <osantana at gmail.com> wrote: > > I've noticed this inconsistency when we stopped using zipimport in our > > Python For Maemo distribution. We've decided to stop using zipimport > > because the device (Nokia 770) uses a compressed filesystem. > > I won't comment further on the brainstorm that's going on (this is > becoming a topic for c.l.py) but I think you are misunderstanding the > point of zipimport. It's not done (usually) for the compression but > for the index. Finding a name in the zipfile index is much more > efficient than doing a directory search; and the zip index can be > cached. Anyway, not loading .pyo when no .pyc or .py is available is a drawback, especially on Unices that have scripts starting with "#!/usr/bin/python" or "#!/usr/bin/env python" while the system has only .pyo files (for a bunch of reasons; in this case, small disk space). 
-- Gustavo Sverzut Barbieri -------------------------------------- Computer Engineer 2001 - UNICAMP Mobile: +55 (19) 9165 8010 Phone: +1 (347) 624 6296 @ sip.stanaphone.com Jabber: gsbarbieri at jabber.org ICQ#: 17249123 MSN: barbieri at gmail.com GPG: 0xB640E1A2 @ wwwkeys.pgp.net From janssen at parc.com Wed Nov 9 22:22:35 2005 From: janssen at parc.com (Bill Janssen) Date: Wed, 9 Nov 2005 13:22:35 PST Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: Your message of "Wed, 09 Nov 2005 11:48:25 PST." <5.1.1.6.0.20051109144523.01f4a6a8@mail.telecommunity.com> Message-ID: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com> It's a shame that 1) there's no equivalent of "java -jar", i.e., "python -z FILE.ZIP", and 2) the use of zipfiles is so poorly documented. Bill From bob at redivi.com Wed Nov 9 22:38:33 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed, 9 Nov 2005 13:38:33 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com> References: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com> Message-ID: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote: > It's a shame that > > 1) there's no equivalent of "java -jar", i.e., "python -z > FILE.ZIP", and This should work on a few platforms: env PYTHONPATH=FILE.zip python -m some_module_in_the_zip -bob From theller at python.net Wed Nov 9 22:48:07 2005 From: theller at python.net (Thomas Heller) Date: Wed, 09 Nov 2005 22:48:07 +0100 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks References: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com> <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> Message-ID: <1x1ppk6g.fsf@python.net> Bob Ippolito <bob at redivi.com> writes: > On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote: > >> It's a shame that >> >> 1) there's no equivalent of "java -jar", i.e., "python -z >> FILE.ZIP", and > > This 
should work on a few platforms: > env PYTHONPATH=FILE.zip python -m some_module_in_the_zip It should, yes - but it doesn't: -m doesn't work with zipimport. Thomas From bob at redivi.com Wed Nov 9 22:55:04 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed, 9 Nov 2005 13:55:04 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <1x1ppk6g.fsf@python.net> References: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com> <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> <1x1ppk6g.fsf@python.net> Message-ID: <EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com> On Nov 9, 2005, at 1:48 PM, Thomas Heller wrote: > Bob Ippolito <bob at redivi.com> writes: > >> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote: >> >>> It's a shame that >>> >>> 1) there's no equivalent of "java -jar", i.e., "python -z >>> FILE.ZIP", and >> >> This should work on a few platforms: >> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip > > It should, yes - but it doesn't: -m doesn't work with zipimport. That's dumb, someone should fix that. Is there a bug filed? -bob From ncoghlan at gmail.com Wed Nov 9 22:58:44 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Nov 2005 07:58:44 +1000 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> References: <05Nov9.132241pst."58633"@synergy1.parc.xerox.com> <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> Message-ID: <43727114.4030107@gmail.com> Bob Ippolito wrote: > On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote: > >> It's a shame that >> >> 1) there's no equivalent of "java -jar", i.e., "python -z >> FILE.ZIP", and > > This should work on a few platforms: > env PYTHONPATH=FILE.zip python -m some_module_in_the_zip Really? I wrote the '-m' code, and I wouldn't expect that to work anywhere because 'execfile' and the C equivalent that -m relies on expect a real file. 
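[Editor's note: Nick's diagnosis can be demonstrated directly. The C startup code wants a real file to execfile(), but the importer protocol can already hand back a code object, which is all a Python-level fallback needs; Python 2.5 later shipped essentially this as runpy.run_module (PEP 338). A hedged sketch with invented file names:]

```python
import os
import tempfile
import zipfile
import zipimport

# A zipped 'application' with a single top-level module.
tmp = tempfile.mkdtemp()
archive = os.path.join(tmp, "app.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("main.py", "result = 6 * 7\n")

# What a Python fallback for -m can do that execfile() cannot:
# ask the importer for the compiled code and run it in a fresh
# namespace posing as __main__.
importer = zipimport.zipimporter(archive)
code = importer.get_code("main")
namespace = {"__name__": "__main__"}
exec(code, namespace)

print(namespace["result"])   # -> 42
```

No real file is ever needed: the code object comes straight out of the archive via the importer, exactly the indirection the C implementation of -m lacked.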
PEP 338 goes some way towards fixing that by having a Python fallback to find and execute the module if the current C code fails. If we had execmodule as a Python function, it would make it much easier to add support for compiling and executing the target module directly, rather than indirecting through the file-system-dependent execfile. In theory this could be done in C, but execmodule is fairly long even written in Python. I'm actually fairly sure it *could* be written in C, but I think doing so would be horribly tedious (and not as useful in the long run). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From gjc at inescporto.pt Wed Nov 9 23:44:38 2005 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Wed, 09 Nov 2005 22:44:38 +0000 Subject: [Python-Dev] Weak references: dereference notification In-Reply-To: <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com> <1131558739.9130.40.camel@localhost> <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> Message-ID: <1131576278.8540.14.camel@localhost.localdomain> On Wed, 2005-11-09 at 20:40 +0100, Ronald Oussoren wrote: > On 9-nov-2005, at 18:52, Gustavo J. A. M. Carneiro wrote: > > > Qua, 2005-11-09 às 09:23 -0800, Guido van Rossum escreveu: > >>>> Gustavo J. A. M. Carneiro wrote: > >>>>> I have come across a situation where I find the current weak > >>>>> references interface for extension types insufficient. > >>>>> > >>>>> Currently you only have a tp_weaklistoffset slot, pointing to a > >>>>> PyObject with weak references. However, in my case[1] I > >>>>> _really_ need > >>>>> to be notified when a weak reference is dereferenced. 
> >> > >> I find reading through the bug discussion a bit difficult to > >> understand your use case. Could you explain it here? If you can't > >> explain it you certainly won't get your problem solved! :-) > > > > This is a typical PyObject wrapping C object (GObject) problem. > > Both > > PyObject and GObject have independent reference counts. For each > > GObject there is at most one PyObject wrapper. > > > > When the refcount on the wrapper drops to zero, tp_dealloc is > > called. > > In tp_dealloc, and if the GObject refcount is > 1, I do something > > slightly evil: I 'resurrect' the PyObject (calling PyObject_Init), > > create > > a weak reference to the GObject, and drop the "strong" reference. I > > call this a 'hibernation state'. > > Why do you do that? The only reasons I can think of are that you hope > to gain > some speed from this or that you want to support weak references to > the GObject. We want to support weak references to GObjects. Mainly because that support has always been there and we don't want/can't break API. And it does have some uses... > > For what it's worth, in PyObjC we don't support weak references to the > underlying > Objective-C object and delete the proxy object when it is garbage > collected. > Objective-C also has reference counts; we increase that in the > constructor for > the proxy object and decrease it again in the destructor. OK, but what if it is a subclass of a builtin type, with instance variables? What if the PyObject is GC'ed but the ObjC object remains alive, and later you get a new reference to it? Do you create a new PyObject wrapper for it? What happened to the instance variables? Our goal in wrapping GObject is that, once a Python wrapper for a GObject instance is created, it never dies until the GObject dies too. At the same time, once the Python wrapper loses all references, it should stop keeping the GObject alive. 
What happens currently, which is what I'm trying to change, is that there is a reference loop between PyObject and GObject, so that deallocation only happens with the help of the cyclic GC. But relying on the GC for _everything_ causes annoying problems: 1- The GC runs only once in a while, not soon enough if, e.g., you have an image object of several megabytes; 2- It makes it hard to debug reference counting bugs, as the symptom only appears when the GC runs, far away from the code that caused the problem in the first place; 3- Generally the GC has a lot more work, since every PyGTK object needs it, and a GUI app can have lots of PyGTK objects. Regards. -- Gustavo J. A. M. Carneiro <gjc at inescporto.pt> <gustavo at users.sourceforge.net> The universe is always one step beyond logic From p.f.moore at gmail.com Wed Nov 9 23:56:13 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 9 Nov 2005 22:56:13 +0000 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com> References: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> <1x1ppk6g.fsf@python.net> <EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com> Message-ID: <79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com> On 11/9/05, Bob Ippolito <bob at redivi.com> wrote: > > On Nov 9, 2005, at 1:48 PM, Thomas Heller wrote: > > > Bob Ippolito <bob at redivi.com> writes: > > > >> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote: > >> > >>> It's a shame that > >>> > >>> 1) there's no equivalent of "java -jar", i.e., "python -z > >>> FILE.ZIP", and > >> > >> This should work on a few platforms: > >> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip > > > > It should, yes - but it doesn't: -m doesn't work with zipimport. > > That's dumb, someone should fix that. Is there a bug filed? I did, a while ago. http://www.python.org/sf/1250389 Paul. 
From bcannon at gmail.com Thu Nov 10 00:05:13 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 9 Nov 2005 15:05:13 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> Message-ID: <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> On 11/9/05, Guido van Rossum <guido at python.org> wrote: > Maybe it makes more sense to deprecate .pyo altogether and instead > have a post-load optimizer optimize .pyc files according to the > current optimization settings? > But I thought part of the point of .pyo files was that they left out docstrings and thus had a smaller footprint? Plus I wouldn't be surprised if we started to move away from bytecode optimization and instead tried to do more AST transformations which would remove possible post-load optimizations. I would have no issue with removing .pyo files and having .pyc files just be as optimized as the current settings dictate and leave it at that. Could have some metadata listing what optimizations occurred, but do we really need to have a specific way to denote if bytecode has been optimized? Binary files compiled from C don't note what -O optimization they were compiled with. If someone distributes optimized .pyc files, chances are they are going to have a specific compile step with py_compile and they will know what optimizations they are using. 
-Brett From foom at fuhm.net Thu Nov 10 00:15:02 2005 From: foom at fuhm.net (James Y Knight) Date: Wed, 9 Nov 2005 18:15:02 -0500 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> Message-ID: <4132F2DC-0981-49F0-8DF4-B20FF840290D@fuhm.net> On Nov 9, 2005, at 6:05 PM, Brett Cannon wrote: > I would have no issue with removing .pyo files and have .pyc files > just be as optimized as they the current settings are and leave it at > that. Could have some metadata listing what optimizations occurred, > but do we really need to have a specific way to denote if bytecode has > been optimized? Binary files compiled from C don't note what -O > optimization they were compiled with. If someone distributes > optimized .pyc files chances are they are going to have a specific > compile step with py_compile and they will know what optimizations > they are using. > This sounds quite sensible. The only thing I'd add is that iff there is a .py file of the same name, and the current optimization settings are different from those in the .pyc file, python should recompile the .py file. 
James From guido at python.org Thu Nov 10 00:25:03 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Nov 2005 15:25:03 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> Message-ID: <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> On 11/9/05, Brett Cannon <bcannon at gmail.com> wrote: > On 11/9/05, Guido van Rossum <guido at python.org> wrote: > > Maybe it makes more sense to deprecate .pyo altogether and instead > > have a post-load optimizer optimize .pyc files according to the > > current optimization settings? > > But I thought part of the point of .pyo files was that they left out > docstrings and thus had a smaller footprint? Very few people care about the smaller footprint (although one piped up here). > Plus I wouldn't be > surprised if we started to move away from bytecode optimization and > instead tried to do more AST transformations which would remove > possible post-load optimizations. > > I would have no issue with removing .pyo files and have .pyc files > just be as optimized as they the current settings are and leave it at > that. Could have some metadata listing what optimizations occurred, > but do we really need to have a specific way to denote if bytecode has > been optimized? Binary files compiled from C don't note what -O > optimization they were compiled with. If someone distributes > optimized .pyc files chances are they are going to have a specific > compile step with py_compile and they will know what optimizations > they are using. 
Currently, .pyo files have some important semantic differences with .pyc files; -O doesn't remove docstrings (that's -OO) but it does remove asserts. I wouldn't want to accidentally use a .pyc file without asserts compiled in unless the .py file wasn't around. For application distribution, the following probably would work: - instead of .pyo files, we use .pyc files - the .pyc file records whether optimizations were applied, whether asserts are compiled, and whether docstrings are retained - if the compiler finds a .pyc that is inconsistent with the current command line, it ignores it and rewrites it (if it is writable) just as if the .py file were newer However, this would be a major pain for the standard library and other shared code -- there it's really nice to have a cache for each of the optimization levels since usually regular users can't write the .py[co] files there, meaning very slow always-recompilation if the standard .pyc files aren't of the right level, causing unacceptable start-up times. The only solutions I can think of that use a single file actually *increase* the file size by having unoptimized and optimized code side-by-side, or some way to quickly skip the assertions -- the -OO option is a special case that probably needs to be done differently anyway and only for final distribution. 
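[Editor's note: for the record, this is roughly where CPython eventually landed. PEP 488 (Python 3.5) dropped .pyo files entirely and encoded the optimization level in the cache file's name, so differently-optimized bytecode caches coexist without ambiguity and without recompilation:]

```python
import importlib.util

# One cache file per optimization level, distinguished by an opt- tag
# in the name (version tag and path separator vary by interpreter/OS):
for level in ("", "1", "2"):
    print(importlib.util.cache_from_source("mod.py", optimization=level))
# e.g. __pycache__/mod.cpython-312.pyc
#      __pycache__/mod.cpython-312.opt-1.pyc
#      __pycache__/mod.cpython-312.opt-2.pyc
```

This sidesteps the "single file holding several optimization levels" problem described above: the levels simply get separate, unambiguous cache files.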
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Thu Nov 10 01:04:15 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 9 Nov 2005 16:04:15 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> Message-ID: <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com> On 11/9/05, Guido van Rossum <guido at python.org> wrote: > On 11/9/05, Brett Cannon <bcannon at gmail.com> wrote: > > Plus I wouldn't be > > surprised if we started to move away from bytecode optimization and > > instead tried to do more AST transformations which would remove > > possible post-load optimizations. > > > > I would have no issue with removing .pyo files and have .pyc files > > just be as optimized as they the current settings are and leave it at > > that. Could have some metadata listing what optimizations occurred, > > but do we really need to have a specific way to denote if bytecode has > > been optimized? Binary files compiled from C don't note what -O > > optimization they were compiled with. If someone distributes > > optimized .pyc files chances are they are going to have a specific > > compile step with py_compile and they will know what optimizations > > they are using. > > Currently, .pyo files have some important semantic differences with > .pyc files; -O doesn't remove docstrings (that's -OO) but it does > remove asserts. I wouldn't want to accidentally use a .pyc file > without asserts compiled in unless the .py file wasn't around. 
> > For application distribution, the following probably would work: > > - instead of .pyo files, we use .pyc files > - the .pyc file records whether optimizations were applied, whether > asserts are compiled, and whether docstrings are retained > - if the compiler finds a .pyc that is inconsistent with the current > command line, it ignores it and rewrites it (if it is writable) just > as if the .py file were newer > > However, this would be a major pain for the standard library and other > shared code -- there it's really nice to have a cache for each of the > optimization levels since usually regular users can't write the > .py[co] files there, meaning very slow always-recompilation if the > standard .pyc files aren't of the right level, causing unacceptable > start-up times. > What if PEP 304 came into being? Then people would have a place to have the shared code's recompiled version stored and thus avoid the overhead from repeated use. > The only solutions I can think of that use a single file actually > *increase* the file size by having unoptimized and optimized code > side-by-side, or some way to quickly skip the assertions -- the -OO > option is a special case that probably needs to be done differently > anyway and only for final distribution. > One option would be to introduce an ASSERTION bytecode that has an argument specifying the amount of bytecode for the assertion. The eval loop can then just ignore the bytecode if assertions are being evaluated and fall through to the bytecode for the assertions (and thus be the equivalent of NOP) or use the argument to jump forward that number of bytes in the bytecode and completely skip over the assertion (and thus be just like a JUMP_FORWARD). Either way assertions become slightly more costly but it should be very minimal. -Brett From pje at telecommunity.com Thu Nov 10 01:16:11 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed, 09 Nov 2005 19:16:11 -0500 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> References: <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> Message-ID: <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> At 03:25 PM 11/9/2005 -0800, Guido van Rossum wrote: >The only solutions I can think of that use a single file actually >*increase* the file size by having unoptimized and optimized code >side-by-side, or some way to quickly skip the assertions -- the -OO >option is a special case that probably needs to be done differently >anyway and only for final distribution. We could have a "JUMP_IF_NOT_DEBUG" opcode to skip over asserts and "if __debug__" blocks. Then under -O we could either patch this to a plain jump, or compact the bytecode to remove the jumped-over part(s). By the way, while we're on this subject, can we make the optimization options be part of the compile() interface? Right now the distutils has to actually exec another Python process whenever you want to compile code with a different optimization level than what's currently in effect, whereas if it could pass the desired level to compile(), this wouldn't be necessary. 
From guido at python.org Thu Nov 10 01:33:00 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Nov 2005 16:33:00 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> Message-ID: <ca471dc20511091633m4b7869b7jc3bd847436f452ab@mail.gmail.com> On 11/9/05, Phillip J. Eby <pje at telecommunity.com> wrote: > At 03:25 PM 11/9/2005 -0800, Guido van Rossum wrote: > >The only solutions I can think of that use a single file actually > >*increase* the file size by having unoptimized and optimized code > >side-by-side, or some way to quickly skip the assertions -- the -OO > >option is a special case that probably needs to be done differently > >anyway and only for final distribution. > > We could have a "JUMP_IF_NOT_DEBUG" opcode to skip over asserts and "if > __debug__" blocks. Then under -O we could either patch this to a plain > jump, or compact the bytecode to remove the jumped-over part(s). That sounds very reasonable. > By the way, while we're on this subject, can we make the optimization > options be part of the compile() interface? Right now the distutils has to > actually exec another Python process whenever you want to compile > code with > a different optimization level than what's currently in effect, whereas if > it could pass the desired level to compile(), this wouldn't be necessary. Makes sense to me; we need a patch of course. 
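[Editor's note: this particular wish was eventually granted. Since Python 3.2, compile() accepts an optimize argument (-1 = the interpreter's current setting, 0 = no optimization, 1 = like -O, 2 = like -OO), so distutils-style tools no longer need a subprocess per optimization level:]

```python
src = "def f():\n    assert False, 'stripped under -O'\n    return 'ok'\n"

ns0, ns1 = {}, {}
exec(compile(src, "<demo>", "exec", optimize=0), ns0)  # asserts kept
exec(compile(src, "<demo>", "exec", optimize=1), ns1)  # asserts removed

try:
    ns0["f"]()
    kept = False
except AssertionError:
    kept = True

print(kept)          # -> True: the assert fired at optimize=0
print(ns1["f"]())    # -> ok: the assert was compiled out at optimize=1
```

Note that both code objects come from the same interpreter invocation; the optimization level is per-compile, not per-process.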
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Nov 10 01:35:14 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Nov 2005 16:35:14 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com> Message-ID: <ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com> [Guido] > > However, this would be a major pain for the standard library and other > > shared code -- there it's really nice to have a cache for each of the > > optimization levels since usually regular users can't write the > > .py[co] files there, meaning very slow always-recompilation if the > > standard .pyc files aren't of the right level, causing unacceptable > > start-up times. [Brett] > What if PEP 304 came into being? Then people would have a place to > have the shared code's recompiled version stored and thus avoid the > overhead from repeated use. Still sounds suboptimal for the standard library; IMO it should "just work". > > The only solutions I can think of that use a single file actually > > *increase* the file size by having unoptimized and optimized code > > side-by-side, or some way to quickly skip the assertions -- the -OO > > option is a special case that probably needs to be done differently > > anyway and only for final distribution. > > One option would be to introduce an ASSERTION bytecode that has an > argument specifying the amount of bytecode for the assertion. 
The > eval loop can then just ignore the bytecode if assertions are being > evaluated and fall through to the bytecode for the assertions (and > thus be the equivalent of NOP) or use the argument to jump forward > that number of bytes in the bytecode and completely skip over the > assertion (and thus be just like a JUMP_FORWARD). Either way > assertions become slightly more costly but it should be very minimal. I like Phillip's suggestion -- no new opcode, just a conditional jump that can be easily optimized out. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Thu Nov 10 01:57:07 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 9 Nov 2005 16:57:07 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com> <ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com> Message-ID: <bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com> On 11/9/05, Guido van Rossum <guido at python.org> wrote: > [Guido] > > > However, this would be a major pain for the standard library and other > > > shared code -- there it's really nice to have a cache for each of the > > > optimization levels since usually regular users can't write the > > > .py[co] files there, meaning very slow always-recompilation if the > > > standard .pyc files aren't of the right level, causing unacceptable > > > start-up times. > [Brett] > > What if PEP 304 came into being? 
Then people would have a place to > > have the shared code's recompiled version stored and thus avoid the > > overhead from repeated use. > > Still sounds suboptimal for the standard library; IMO it should "just work". > Fair enough. > > > The only solutions I can think of that use a single file actually > > > *increase* the file size by having unoptimized and optimized code > > > side-by-side, or some way to quickly skip the assertions -- the -OO > > > option is a special case that probably needs to be done differently > > > anyway and only for final distribution. > > > > One option would be to introduce an ASSERTION bytecode that has an > > argument specifying the amount of bytecode for the assertion. The > > eval loop can then just ignore the bytecode if assertions are being > > evaluated and fall through to the bytecode for the assertions (and > > thus be the equivalent of NOP) or use the argument to jump forward > > that number of bytes in the bytecode and completely skip over the > > assertion (and thus be just like a JUMP_FORWARD). Either way > > assertions become slightly more costly but it should be very minimal. > > I like Phillip's suggestion -- no new opcode, just a conditional jump > that can be easily optimized out. Huh? But Phillip is suggesting a new opcode that is essentially the same as my proposal but naming it differently and saying the bytecode should get changed directly instead of having the eval loop handle the semantic differences based on whether -O is being used. 
-Brett From greg.ewing at canterbury.ac.nz Thu Nov 10 01:57:43 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Nov 2005 13:57:43 +1300 Subject: [Python-Dev] Weak references: dereference notification In-Reply-To: <1131576278.8540.14.camel@localhost.localdomain> References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com> <1131558739.9130.40.camel@localhost> <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> <1131576278.8540.14.camel@localhost.localdomain> Message-ID: <43729B07.6010907@canterbury.ac.nz> Gustavo J. A. M. Carneiro wrote: > OK, but what if it is a subclass of a builtin type, with instance > variables? What if the PyObject is GC'ed but the ObjC object remains > alive, and later you get a new reference to it? Do you create a new > PyObject wrapper for it? What happened to the instance variables? Your proposed scheme appears to involve destroying and then re-initialising the Python wrapper. Isn't that going to wipe out any instance variables it may have had? Also, it seems to me that as soon as the refcount on the wrapper drops to zero, any weak references to it will be broken. Or does your resurrection code intervene before that happens? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Nov 10 02:01:57 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Nov 2005 17:01:57 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com> <ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com> <bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com> Message-ID: <ca471dc20511091701x36cc9061x142ad6afc2aeb853@mail.gmail.com> > > I like Phillip's suggestion -- no new opcode, just a conditional jump > > that can be easily optimized out. > > Huh? But Phillip is suggesting a new opcode that is essentially the > same as my proposal but naming it differently and saying the bytecode > should get changed directly instead of having the eval loop handle the > semantic differences based on whether -O is being used. Sorry. Looking back they look pretty much the same to me. Somehow I glanced over Phillip's code and thought he was proposing to use a regular JUMP_IF opcode with the special __debug__ variable (which would be a 3rd possibility, good if we had backwards compatibility requirements for bytecode -- which we don't, fortunately :-). 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From mdehoon at c2b2.columbia.edu Thu Nov 10 02:04:43 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Wed, 09 Nov 2005 20:04:43 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <43710C95.30209@v.loewis.de> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> Message-ID: <43729CAB.5070106@c2b2.columbia.edu> Martin v. Löwis wrote: > Michiel Jan Laurens de Hoon wrote: > >> 2) Will Tkinter always be the standard toolkit for Python, or are >> there plans to replace it at some point? > > > Python does not evolve along a grand master plan. Instead, individual > contributors propose specific modifications, e.g. through PEPs. At this point, I can't propose a specific modification yet because I don't know the reasoning that went behind the original choice of Tk as the default GUI toolkit for Python (and hence, I don't know if those reasons are still valid today). I can see one disadvantage (using Tk limits our options to run an event loop for other Python extensions), and I am trying to find out why Tk was deemed more appropriate than other GUI toolkits anyway. So let me rephrase the question: What is the advantage of Tk in comparison to other GUI toolkits? Is it Mac availability? More advanced widget set? Installation is easier? Portability? Switching to a different GUI toolkit would break too much existing code? I think that having the answer to this will stimulate further development of alternative GUI toolkits, which may give some future Python version a toolkit at least as good as Tk, and one that doesn't interfere with Python's event loop capabilities. --Michiel. 
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From bcannon at gmail.com Thu Nov 10 02:49:39 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 9 Nov 2005 17:49:39 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511091701x36cc9061x142ad6afc2aeb853@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com> <ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com> <bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com> <ca471dc20511091701x36cc9061x142ad6afc2aeb853@mail.gmail.com> Message-ID: <bbaeab100511091749k3f8feb0fue5146474bb4d0deb@mail.gmail.com> On 11/9/05, Guido van Rossum <guido at python.org> wrote: > > > I like Phillip's suggestion -- no new opcode, just a conditional jump > > > that can be easily optimized out. > > > > Huh? But Phillip is suggesting a new opcode that is essentially the > > same as my proposal but naming it differently and saying the bytecode > > should get changed directly instead of having the eval loop handle the > > semantic differences based on whether -O is being used. > > Sorry. No problem. Figured you just misread mine. > Looking back they look pretty much the same to me. Somehow I > glanced over Phillip's code and thought he was proposing to use a > regular JUMP_IF opcode with the special __debug__ variable (which > would be a 3rd possibility, good if we had backwards compatibility > requirements for bytecode -- which we don't, fortunately :-). > Fortunately. 
=) So does this mean you like the idea? Should this all move forward somehow? -Brett From guido at python.org Thu Nov 10 02:51:19 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Nov 2005 17:51:19 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <bbaeab100511091749k3f8feb0fue5146474bb4d0deb@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <ca471dc20511091525g11986fb8pf7e2a4ba9a21f5c0@mail.gmail.com> <bbaeab100511091604j732cfc86k170e782e0233f638@mail.gmail.com> <ca471dc20511091635p586d127cpa923926b3ac65639@mail.gmail.com> <bbaeab100511091657t5377f05dl111b4b701b551d4a@mail.gmail.com> <ca471dc20511091701x36cc9061x142ad6afc2aeb853@mail.gmail.com> <bbaeab100511091749k3f8feb0fue5146474bb4d0deb@mail.gmail.com> Message-ID: <ca471dc20511091751t18052057j9cc7a4e9da627b5d@mail.gmail.com> On 11/9/05, Brett Cannon <bcannon at gmail.com> wrote: > On 11/9/05, Guido van Rossum <guido at python.org> wrote: > > > > I like Phillip's suggestion -- no new opcode, just a conditional jump > > > > that can be easily optimized out. > > > > > > Huh? But Phillip is suggesting a new opcode that is essentially the > > > same as my proposal but naming it differently and saying the bytecode > > > should get changed directly instead of having the eval loop handle the > > > semantic differences based on whether -O is being used. > > > > Sorry. > > No problem. Figured you just misread mine. > > > Looking back they look pretty much the same to me. Somehow I > > glanced over Phillip's code and thought he was proposing to use a > > regular JUMP_IF opcode with the special __debug__ variable (which > > would be a 3rd possibility, good if we had backwards compatibility > > requirements for bytecode -- which we don't, fortunately :-). 
> > > > Fortunately. =) > > So does this mean you like the idea? Should this all move forward somehow? I guess so. :-) It will need someone thinking really hard about all the use cases, edge cases, etc., implementation details, and writing up a PEP. Feel like volunteering? You might squeeze Phillip as a co-author. He's a really good one. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From Scott.Daniels at Acm.Org Thu Nov 10 03:41:59 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed, 09 Nov 2005 18:41:59 -0800 Subject: [Python-Dev] int(string) (was: DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16) In-Reply-To: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> Message-ID: <4372B377.6050806@Acm.Org> Tim Peters wrote: > ... > Someone want a finite project that would _really_ help their Uncle > Timmy in his slow-motion crusade to get Python on the list of "solved > it!" languages for each problem on that magnificent site? ... > Turns out it's _not_ input speed that's the problem here, and not even > mainly the speed of integer mod: the bulk of the time is spent in > int(string).... OK, I got an idea about how to do this fast. I started with Python code, and I now have C code that should beat int(string) always while getting a lot of speed making long values. The working tables can be used to do the reverse transformation (int or long to string in some base) with a little finesse, but I haven't done that yet in C. The code is pretty sprawly now (a lot left in for instrumentation and testing pieces), but can eventually get smaller. I gave myself time to do this as a birthday present to myself. It may take a while to build a patch, but perhaps you can let me know how much speedup you get using this code. 
If you build this module, I'd suggest using "from to_int import chomp" to get a function that works like int (producing a long when needed and so on). > If you can even track all the levels of C function calls that ends up > invoking <wink>, you find yourself in PyOS_strtoul(), which is a > nifty all-purpose routine that accepts inputs in bases 2 thru 36, can > auto-detect base, and does platform-independent overflow checking at > the cost of a division per digit. All those are features, but it > makes for sloooow conversion. OK, this code doesn't deal with unicode at all. The key observations are: A) to figure out the base, you pretty much need to get to the first digit; getting to the first non-zero digit is not that much worse. B) If you know the length of a string of digits (starting at the first non-zero digit) and the base, you know approximately how many bits the result will have. You can do a single allocation if you are building a long. You can tell if you need to test for overflow in building an int; there is one length per base where you must. So the question becomes, is it worth taking two passes at the digits? Well, it sure looks like it to me, but I haven't timed one or two-character integers. I do longs in "megadigits" -- the largest set of digits that fits safely in SHIFT bits, so they have no need for overflow checks. For further excitement, you can use a similar technique to go from the number of bits to the string length. That should make for a fast int/long-to-string conversion in any of 36 (or more, really) bases. I pass all of your mentioned test cases (including the one from a later message). I'm pretty much out of time for this project at the moment, but encouraging words would help me steal some time to finish. 
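Observation (B) above — that digit count and base bound the result's size — is cheap to compute. A few lines of Python sketch it (the helper name `estimate_bits` is illustrative, not part of Scott's to_int module):

```python
import math

def estimate_bits(ndigits: int, base: int) -> int:
    # A number with `ndigits` digits in `base` is < base**ndigits,
    # so its bit length is at most ceil(ndigits * log2(base)).
    return math.ceil(ndigits * math.log2(base))

# 10 decimal digits never need more than 34 bits...
print(estimate_bits(10, 10))        # 34
print((10 ** 10 - 1).bit_length())  # 34
# ...and 8 hex digits never need more than 32.
print(estimate_bits(8, 16))         # 32
```

That one estimate is what allows a single allocation for a long, and tells you the one digit-count per base at which an int might overflow.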
For anyone wanting to look at the code, or try it themselves: Installer: http://members.dsl-only.net/~daniels/dist/to_int-0.10.win32-py2.4.exe Just the 2.4 dll: http://members.dsl-only.net/~daniels/dist/to_int-0.10.win32.zip Sources: http://members.dsl-only.net/~daniels/dist/to_int-0.10.zip --Scott David Daniels Scott.Daniels at Acm.Org From greg.ewing at canterbury.ac.nz Thu Nov 10 04:02:04 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Nov 2005 16:02:04 +1300 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <43729CAB.5070106@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> Message-ID: <4372B82C.9010800@canterbury.ac.nz> Michiel Jan Laurens de Hoon wrote: > At this point, I can't propose a specific modification yet because I > don't know the reasoning that went behind the original choice of Tk as > the default GUI toolkit for Python Probably because at the time it was really the only cross-platform GUI toolkit that worked about equally well (or equally badly, depending on your point of view) on all the major platforms. I'm not sure the event-loop situation would be much different with another one, anyway. From what I've seen of GUI toolkits, they all have their own form of event loop, and they all provide some way of hooking other things into it (as does Tkinter), but whichever one you're using, it likes to be in charge. Code which blocks reading from standard input doesn't fit very well into any of them. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From exarkun at divmod.com Thu Nov 10 04:08:52 2005 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Wed, 9 Nov 2005 22:08:52 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4372B82C.9010800@canterbury.ac.nz> Message-ID: <20051110030852.10365.1719239053.divmod.quotient.6042@ohm> On Thu, 10 Nov 2005 16:02:04 +1300, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: >Michiel Jan Laurens de Hoon wrote: > >> At this point, I can't propose a specific modification yet because I >> don't know the reasoning that went behind the original choice of Tk as >> the default GUI toolkit for Python > >Probably because at the time it was really the >only cross-platform GUI toolkit that worked >about equally well (or equally badly, depending >on your point of view) on all the major >platforms. > >I'm not sure the event-loop situation would be >much different with another one, anyway. From what >I've seen of GUI toolkits, they all have their own >form of event loop, and they all provide some way >of hooking other things into it (as does Tkinter), >but whichever one you're using, it likes to be in >charge. Code which blocks reading from standard >input doesn't fit very well into any of them. > Of course, the problem could be approached from the other direction: the blocking reads could be replaced with something else... Jean-Paul From janssen at parc.com Thu Nov 10 05:00:44 2005 From: janssen at parc.com (Bill Janssen) Date: Wed, 9 Nov 2005 20:00:44 PST Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: Your message of "Wed, 09 Nov 2005 13:38:33 PST." <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> Message-ID: <05Nov9.200052pst."58633"@synergy1.parc.xerox.com> > This should work on a few platforms: > env PYTHONPATH=FILE.zip python -m some_module_in_the_zip Yeah, that's not bad, but I hate setting PYTHONPATH. 
I was thinking more along the line of python -z ZIPFILE where python would look at the ZIPFILE to see if there's a top-level module called "__init__", and if so, load it. That would allow existing PYTHONPATH settings to still be used if the user cares. Bill From falcon at intercable.ru Wed Nov 9 08:24:04 2005 From: falcon at intercable.ru (Sokolov Yura) Date: Wed, 09 Nov 2005 10:24:04 +0300 Subject: [Python-Dev] Unifying decimal numbers. Message-ID: <4371A414.2020400@intercable.ru> Excuse my English. I think we could just segregate tokens for decimal and real float and make them interoperable. Motivation: Most of us work with business databases - all "floats" are really decimals, algebraic operations should work without float inconsistency, and those operations are rare so speed is not important. But some of us use floats for speed in scientific and multimedia programs. With from __future__ import Decimal we could: a) interpret regular float constants as decimal b) interpret float constants with suffix 'f' as float (like 1.5f 345.2e-5f etc) c) result of operation with decimal operands should be decimal >>> 1.0/3.0 0.33333333333333333 d) result of operation with float operands should be float >>> 1.0f/3.0f 0.33333333333333331f e) result of operation with decimal and float should be float (decimal converts into float and the operation is performed) >>> 1.0f/3.0 0.33333333333333331f >>> 1.0/3.0f 0.33333333333333331f From bcannon at gmail.com Thu Nov 10 06:14:14 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 9 Nov 2005 21:14:14 -0800 Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions Message-ID: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> I just finished fleshing out the dev FAQ (http://www.python.org/dev/devfaq.html) with questions covering what someone might need to know for regular usage. If anyone thinks I didn't cover something I should have, let me know. 
-Brett From stephen at xemacs.org Thu Nov 10 06:19:42 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 10 Nov 2005 14:19:42 +0900 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <43729CAB.5070106@c2b2.columbia.edu> (Michiel Jan Laurens de Hoon's message of "Wed, 09 Nov 2005 20:04:43 -0500") References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> Message-ID: <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Michiel" == Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> writes: Michiel> What is the advantage of Tk in comparison to other GUI Michiel> toolkits? IMO, Tk's _advantage_ is that it's there already. As a standard component, it works well for typical simple GUI applications (thus satisfying "batteries included" IMO), and it's self-contained. So I would say it's at _no disadvantage_ to other toolkits. Alternatives like PyGtk and wxWidgets are easily available and provide some degree of cross-platform support for those who need something more/different. Is there some reason why you can't require users to install a toolkit more suited to your application's needs? -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
From mdehoon at c2b2.columbia.edu Thu Nov 10 06:27:22 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Thu, 10 Nov 2005 00:27:22 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4372B82C.9010800@canterbury.ac.nz> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <4372B82C.9010800@canterbury.ac.nz> Message-ID: <4372DA3A.8010206@c2b2.columbia.edu> Greg Ewing wrote: >I'm not sure the event-loop situation would be >much different with another one, anyway. From what >I've seen of GUI toolkits, they all have their own >form of event loop, and they all provide some way >of hooking other things into it (as does Tkinter), >but whichever one you're using, it likes to be in >charge. > It's not because it likes to be in charge, it's because there's no other way to do it in Python. In our scientific visualization software, we also have our own event loop. I'd much rather let a Python event loop handle our messages. Not only would it save us time programming (setting up an event loop in an extension module that passes control back to Python when needed is tricky), it would also give better performance, it would work with IDLE (which an event loop in an extension module cannot as explained in my previous post), and it would let different extension modules live happily together all using the same event loop. Tkinter is a special case among GUI toolkits because it is married to Tcl. It doesn't just need to handle its GUI events, it also needs to run the Tcl interpreter in between. Which is why Tkinter needs to be in charge of the event loop. For other GUI toolkits, I don't see a reason why they'd need their own event loop. > Code which blocks reading from standard >input doesn't fit very well into any of them. > > Actually, this is not difficult to accomplish. 
For example, try Tcl's wish on Linux: It will pop up a (responsive) graphics window but continue to read Tcl commands from the terminal. This is done via a call to select (on Unix) or MsgWaitForMultipleObjects (on Windows). Both of these can listen for terminal input and GUI events at the same time. --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From mdehoon at c2b2.columbia.edu Thu Nov 10 06:40:47 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Thu, 10 Nov 2005 00:40:47 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <4372DD5F.70203@c2b2.columbia.edu> Stephen J. Turnbull wrote: > Michiel> What is the advantage of Tk in comparison to other GUI > Michiel> toolkits? > >IMO, Tk's _advantage_ is that it's there already. As a standard >component, it works well for typical simple GUI applications (thus >satisfying "batteries included" IMO), and it's self-contained. So I >would say it's at _no disadvantage_ to other toolkits. > >Alternatives like PyGtk and wxWidgets are easily available and provide >some degree of cross-platform support for those who need something >more/different. > >Is there some reason why you can't require users to install a toolkit >more suited to your application's needs? > > My application doesn't need a toolkit at all. My problem is that because of Tkinter being the standard Python toolkit, we cannot have a decent event loop in Python. So this is the disadvantage I see in Tkinter. --Michiel. 
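The select-based approach Michiel describes is easy to sketch on Unix, where one select() call can watch terminal input and a GUI connection at the same time (two pipes stand in for the two event sources here; on Windows, select() only handles sockets, hence MsgWaitForMultipleObjects):

```python
import os
import select

tty_r, tty_w = os.pipe()  # stands in for terminal input
gui_r, gui_w = os.pipe()  # stands in for the GUI event source

os.write(gui_w, b"expose-event")  # a "GUI event" arrives

# One call watches both descriptors; whichever is ready gets serviced,
# so GUI events don't block the prompt and vice versa.
ready, _, _ = select.select([tty_r, gui_r], [], [], 1.0)
for fd in ready:
    print(fd == gui_r, os.read(fd, 64))

for fd in (tty_r, tty_w, gui_r, gui_w):
    os.close(fd)
```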
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From ronaldoussoren at mac.com Thu Nov 10 08:06:02 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 10 Nov 2005 08:06:02 +0100 Subject: [Python-Dev] Weak references: dereference notification In-Reply-To: <1131576278.8540.14.camel@localhost.localdomain> References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com> <1131558739.9130.40.camel@localhost> <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> <1131576278.8540.14.camel@localhost.localdomain> Message-ID: <49FAFACC-3892-49EA-9154-1AC43E533179@mac.com> On 9-nov-2005, at 23:44, Gustavo J. A. M. Carneiro wrote: > On Wed, 2005-11-09 at 20:40 +0100, Ronald Oussoren wrote: >> On 9-nov-2005, at 18:52, Gustavo J. A. M. Carneiro wrote: >> >>> Qua, 2005-11-09 ?s 09:23 -0800, Guido van Rossum escreveu: >>>>>> Gustavo J. A. M. Carneiro wrote: >>>>>>> I have come across a situation where I find the current weak >>>>>>> references interface for extension types insufficient. >>>>>>> >>>>>>> Currently you only have a tp_weaklistoffset slot, pointing >>>>>>> to a >>>>>>> PyObject with weak references. However, in my case[1] I >>>>>>> _really_ need >>>>>>> to be notified when a weak reference is dereferenced. >>>> >>>> I find reading through the bug discussion a bit difficult to >>>> understand your use case. Could you explain it here? If you can't >>>> explain it you certainly won't get your problem solved! :-) >>> >>> This is a typical PyObject wrapping C object (GObject) problem. >>> Both >>> PyObject and GObject have independent reference counts. For each >>> GObject there is at most one PyObject wrapper. >>> >>> When the refcount on the wrapper drops to zero, tp_dealloc is >>> called. 
>>> In tp_dealloc, and if the GObject refcount is > 1, I do something >>> slightly evil: I 'resurrect' the PyObject (calling PyObject_Init), >>> create >>> a weak reference to the GObject, and drop the "strong" reference. I >>> call this a 'hibernation state'. >> >> Why do you do that? The only reasons I can think of are that you hope >> to gain >> some speed from this or that you want to support weak references to >> the GObject. > > We want to support weak references to GObjects. Mainly because that > support has always been there and we don't want/can't break API. > And it > does have some uses... > >> >> For what it's worth, in PyObjC we don't support weak references to the >> underlying >> Objective-C object and delete the proxy object when it is garbage >> collected. >> Objective-C also has reference counts, we increase that in the >> constructor for >> the proxy object and decrease it again in the destructor. > > OK, but what if it is a subclass of a builtin type, with instance > variables? What if the PyObject is GC'ed but the ObjC object remains > alive, and later you get a new reference to it? Do you create a new > PyObject wrapper for it? What happened to the instance variables? Our main goal is that there is at most one Python wrapper for an Objective-C object alive at any one time. And likewise there is at most one Objective-C wrapper for a python object. If a PyObject is GC'ed and the ObjC object remains alive you will get a new PyObject when a reference to the ObjC object passes into python space again. That is no problem because the proxy object contains no state other than the pointer to the ObjC object. ObjC's runtime might be more flexible than that of GObject. If you create a subclass of an ObjC class the PyObjC runtime will create a real ObjC class for you and all object state, including Python instance variables, is stored on the ObjC side. 
> > Our goal in wrapping GObject is that, once a Python wrapper for a > GObject instance is created, it never dies until the GObject dies too. > At the same time, once the python wrapper loses all references, it > should stop keeping the GObject alive. I tried that too, but ran into some very ugly issues and decided that weak references are not important enough for that. There's also the problem that this will keep the python proxy alive even when it is not needed anymore, which gives significant overhead if you traverse a large data structure. What I don't quite understand is how you know that your python wrapper is the last reference to the GObject and your wrapper should not be forcefully kept alive. > > What happens currently, which is what I'm trying to change, is that > there is a reference loop between PyObject and GObject, so that > deallocation only happens with the help of the cyclic GC. But relying > on the GC for _everything_ causes annoying problems: At one time I used Python's reference counts as the reference count of the Objective-C object (ObjC's reference count management is done through method calls and can therefore be overridden in subclasses). That did work, but getting the semantics completely correct turned the code into a mess. Our current solution is much more satisfying. > > 1- The GC runs only once in a while, not soon enough if eg. you > have an > image object with several megabytes; > > 2- It makes it hard to debug reference counting bugs, as the symptom > only appears when the GC runs, far away from the code that caused the > problem in the first place; > > 3- Generally the GC has a lot more work, since every PyGTK object > needs > it, and a GUI app can have lots of PyGTK objects. > > Regards. > > -- > Gustavo J. A. M. 
Carneiro > <gjc at inescporto.pt> <gustavo at users.sourceforge.net> > The universe is always one step beyond logic > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ > ronaldoussoren%40mac.com From martin at v.loewis.de Thu Nov 10 08:15:00 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 10 Nov 2005 08:15:00 +0100 Subject: [Python-Dev] Weak references: dereference notification In-Reply-To: <1131576278.8540.14.camel@localhost.localdomain> References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com> <1131558739.9130.40.camel@localhost> <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> <1131576278.8540.14.camel@localhost.localdomain> Message-ID: <4372F374.4060709@v.loewis.de> Gustavo J. A. M. Carneiro wrote: > OK, but what if it is a subclass of a builtin type, with instance > variables? What if the PyObject is GC'ed but the ObjC object remains > alive, and later you get a new reference to it? Do you create a new > PyObject wrapper for it? What happened to the instance variables? Normally, wrappers don't have state. But if you do have state, this is how it could work: 1. Make two Python objects, PyState and PyWrapper (actually, PyState doesn't need to be a Python object) PyState holds the instance variables, and PyWrapper just holds a pointer to a GObject. 2. When a Python reference to a GObject is created for the first time, create both a PyState and a PyWrapper. Have the GObject point to the PyState, and the PyWrapper to the GObject. Have the PyState weakly reference the PyWrapper. 3. When the refcount to the PyWrapper drops to zero, discard it. 4. 
When somebody asks for the data in the PyWrapper, go to the GObject, then to the PyState, and return the data from there. 5. When somebody wants a reference to a GObject which already has a PyState, check the weak reference to find out whether there is a PyWrapper already. If yes, return it; if not, create a new one (and weakly reference it). 6. When the GObject is discarded, drop the PyState as well. This has the following properties: 1. There are no cyclic references for wrapping GObjects. 2. Weakly-referencing wrappers is supported; if there are no strong Python references to the wrapper, the wrapper goes away, and, potentially, the GObject as well. 3. The PyState object lives as long as the GObject. 4. Using "is" for GObjects/PyWrappers "works": there is at most one PyWrapper per GObject at any time. 5. id() of a GObject may change over time, if the wrapper becomes unreferenced and then recreated. Regards, Martin From martin at v.loewis.de Thu Nov 10 08:26:51 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 10 Nov 2005 08:26:51 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <43729CAB.5070106@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> Message-ID: <4372F63B.70301@v.loewis.de> Michiel Jan Laurens de Hoon wrote: > At this point, I can't propose a specific modification yet because I > don't know the reasoning that went behind the original choice of Tk as > the default GUI toolkit for Python (and hence, I don't know if those > reasons are still valid today). I don't know, either, but I guess that it was the only option as a cross-platform GUI at the time when it was created. > I can see one disadvantage (using Tk > limits our options to run an event loop for other Python extensions), > and I am trying to find out why Tk was deemed more appropriate than > other GUI toolkits anyway. 
I don't think this is a disadvantage: my guess is that other GUI toolkits share the very same problems. So even though it looks like a limitation of Tkinter, it really is a fundamental limitation, and Tk is not any worse than the competitors. Also, I firmly believe that whatever your event processing requirements are, there is a solution that meets all your end-user needs. That solution would fail the requirement to be easy to implement for you (IOW, it may take some work). > So let me rephrase the question: What is the advantage of Tk in > comparison to other GUI toolkits? It comes bundled with Python. If this sounds circular: It is. Whatever the historical reasons for original inclusion were, I'm sure they are not that important anymore. Today, what matters is that an actual implementation of a GUI toolkit integration is actively being maintained, in the Python core. This is something that is not the case for any other GUI toolkit. If you think it would be easy to change: it isn't. Somebody would have to step forward and say "I will maintain it for the next 10 years". Removal of Tkinter would meet strong resistance, so a new toolkit would have to be maintained in addition to Tkinter. Nobody has stepped forward making such an offer. For Tkinter, it's different: because it *already* is part of Python, various maintainers fix problems as they find them, and contributors contribute improvements. > Is it Mac availability? More advanced > widget set? Installation is easier? Portability? These are all important, yes. But other GUI toolkits likely have the same properties. > Switching to a > different GUI toolkit would break too much existing code? Most definitely, yes. Switching would not be an option at all. Another GUI toolkit would have to be an addition, not a replacement. 
> I think that > having the answer to this will stimulate further development of > alternative GUI toolkits, which may give some future Python version a > toolkit at least as good as Tk, and one that doesn't interfere with > Python's event loop capabilities. I personally don't think so. The task is just too huge for volunteers to commit to. Regards, Martin From Scott.Daniels at Acm.Org Thu Nov 10 08:28:11 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed, 09 Nov 2005 23:28:11 -0800 Subject: [Python-Dev] to_int -- oops, one step missing for use. In-Reply-To: <4372B377.6050806@Acm.Org> References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <4372B377.6050806@Acm.Org> Message-ID: <4372F68B.5050106@Acm.Org> Well, wouldn't you know it. I get the code right and mess up the directions. Scott David Daniels wrote: > if you build this module, I'd suggest using > "from to_int import chomp" to get a function that works like int > (producing a long when needed and so on). Well, actually it is a bit more than that. "from to_int import chomp, _flag; _flag(1)" This sets a flag to suppress the return of the length along with the value from chomp. --Scott David Daniels Scott.Daniels at Acm.Org From martin at v.loewis.de Thu Nov 10 08:30:51 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 10 Nov 2005 08:30:51 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4372DA3A.8010206@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <4372B82C.9010800@canterbury.ac.nz> <4372DA3A.8010206@c2b2.columbia.edu> Message-ID: <4372F72B.9060501@v.loewis.de> Michiel Jan Laurens de Hoon wrote: > It's not because it likes to be in charge, it's because there's no other > way to do it in Python. As I said: this is simply not true. > Tkinter is a special case among GUI toolkits because it is married to > Tcl. 
It doesn't just need to handle its GUI events, it also needs to run > the Tcl interpreter in between. That statement is somewhat deceiving: there isn't much interpreter to run, really. > Which is why Tkinter needs to be in > charge of the event loop. For other GUI toolkits, I don't see a reason > why they'd need their own event loop. They need to fetch events from the operating system level, and dispatch them to the widgets. Regards, Martin From fredrik at pythonware.com Thu Nov 10 09:04:24 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 10 Nov 2005 09:04:24 +0100 Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> Message-ID: <dkuuu8$8i3$1@sea.gmane.org> Brett Cannon wrote: >I just finished fleshing out the dev FAQ > (http://www.python.org/dev/devfaq.html) with questions covering what > someone might need to know for regular usage. If anyone thinks I > didn't cover something I should have, let me know. SVK! 
</F> From ncoghlan at gmail.com Thu Nov 10 10:11:50 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Nov 2005 19:11:50 +1000 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com> References: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> <1x1ppk6g.fsf@python.net> <EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com> <79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com> Message-ID: <43730ED6.1010807@gmail.com> Paul Moore wrote: > On 11/9/05, Bob Ippolito <bob at redivi.com> wrote: >> On Nov 9, 2005, at 1:48 PM, Thomas Heller wrote: >> >>> Bob Ippolito <bob at redivi.com> writes: >>> >>>> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote: >>>> >>>>> It's a shame that >>>>> >>>>> 1) there's no equivalent of "java -jar", i.e., "python -z >>>>> FILE.ZIP", and >>>> This should work on a few platforms: >>>> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip >>> It should, yes - but it doesn't: -m doesn't work with zipimport. >> That's dumb, someone should fix that. Is there a bug filed? > > I did, a while ago. http://www.python.org/sf/1250389 Please consider looking at and commenting on PEP 328 - I got zero feedback when I wrote it, and basically assumed no-one else was bothered by the -m switch's fairly significant limitations (it went in close to the first Python 2.4 alpha release, so we wanted to keep it simple). The PEP and the associated patch currently only cover lifting the limitation against executing modules inside packages, but it should be possible to extend it to cover executing modules inside zip files as well (as you say, increasing use of eggs will only make the current limitations more annoying). That discussion should probably happen on c.l.p, though. cc' me if you start one, and I can keep an eye on it through Google (I won't have time to participate actively, unfortunately :() Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Nov 10 10:15:14 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Nov 2005 19:15:14 +1000 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <43730ED6.1010807@gmail.com> References: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> <1x1ppk6g.fsf@python.net> <EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com> <79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com> <43730ED6.1010807@gmail.com> Message-ID: <43730FA2.6070404@gmail.com> Nick Coghlan wrote: > Please consider looking at and commenting on PEP 328 - I got zero feedback > when I wrote it, and basically assumed no-one else was bothered by the -m > switch's fairly significant limitations (it went in close to the first Python > 2.4 alpha release, so we wanted to keep it simple). Oops, that should be PEP 3*3*8. PEP 328 is something completely different. That'll teach me to post without checking the PEP number ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Nov 10 10:21:55 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Nov 2005 19:21:55 +1000 Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions In-Reply-To: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> Message-ID: <43731133.9000904@gmail.com> Brett Cannon wrote: > I just finished fleshing out the dev FAQ > (http://www.python.org/dev/devfaq.html) with questions covering what > someone might need to know for regular usage. If anyone thinks I > didn't cover something I should have, let me know. 
Should the section "Developing on Windows" disappear now? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Nov 10 10:43:43 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Nov 2005 19:43:43 +1000 Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions In-Reply-To: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> Message-ID: <4373164F.8070606@gmail.com> Brett Cannon wrote: > I just finished fleshing out the dev FAQ > (http://www.python.org/dev/devfaq.html) with questions covering what > someone might need to know for regular usage. If anyone thinks I > didn't cover something I should have, let me know. For question 1.2.10, I believe you also want: [miscellany] enable-auto-props = yes so that "svn add" works properly. Question 1.4.1 should cover the use of "svn diff" instead of "cvs diff" to make the patch. On that note, we need to update the patch submission guidelines to point to SVN instead of CVS (those guidelines also still say context diffs are preferred to unified diffs, which I believe is no longer true). Cheers, Nick. 
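For reference, the ~/.subversion/config fragment Nick is describing would look roughly like the following (the auto-props patterns are illustrative, not the project's actual settings):

```ini
[miscellany]
enable-auto-props = yes

[auto-props]
# make "svn add" attach the keyword-expansion property automatically
*.txt = svn:keywords=Revision Date
*.py = svn:keywords=Revision Date
```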
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From p.f.moore at gmail.com Thu Nov 10 11:23:48 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 10 Nov 2005 10:23:48 +0000 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <43730ED6.1010807@gmail.com> References: <A0F78CD8-1F2C-4201-B92B-1707AA822DF0@redivi.com> <1x1ppk6g.fsf@python.net> <EDEA56AC-BB60-496D-8A3E-1FBD68F40D44@redivi.com> <79990c6b0511091456y329f1c5ey53b7428e59c97bc7@mail.gmail.com> <43730ED6.1010807@gmail.com> Message-ID: <79990c6b0511100223l20d66617ka681223cf2fb7c0@mail.gmail.com> On 11/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote: > Paul Moore wrote: > > On 11/9/05, Bob Ippolito <bob at redivi.com> wrote: > >> On Nov 9, 2005, at 1:48 PM, Thomas Heller wrote: > >> > >>> Bob Ippolito <bob at redivi.com> writes: > >>> > >>>> On Nov 9, 2005, at 1:22 PM, Bill Janssen wrote: > >>>> > >>>>> It's a shame that > >>>>> > >>>>> 1) there's no equivalent of "java -jar", i.e., "python -z > >>>>> FILE.ZIP", and > >>>> This should work on a few platforms: > >>>> env PYTHONPATH=FILE.zip python -m some_module_in_the_zip > >>> It should, yes - but it doesn't: -m doesn't work with zipimport. > >> That's dumb, someone should fix that. Is there a bug filed? > > > > I did, a while ago. http://www.python.org/sf/1250389 > > Please consider looking at and commenting on PEP 328 - I got zero feedback > when I wrote it, and basically assumed no-one else was bothered by the -m > switch's fairly significant limitations (it went in close to the first Python > 2.4 alpha release, so we wanted to keep it simple). 
> > The PEP and the associated patch currently only cover lifting the limitation > against executing modules inside packages, but it should be possible to extend > it to cover executing modules inside zip files as well (as you say, increasing > use of eggs will only make the current limitations more annoying). > > That discussion should probably happen on c.l.p, though. cc' me if you start > one, and I can keep an eye on it through Google (I won't have time to > participate actively, unfortunately :() I didn't respond simply because it seemed obvious that this should go in, and I expected no debate. I assumed the only reason it didn't go into 2.4 was because the issue came up too close to the release. Teach me to assume, I guess... FWIW, I'm +1 on PEP 338. Paul. From gjc at inescporto.pt Thu Nov 10 13:04:23 2005 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Thu, 10 Nov 2005 12:04:23 +0000 Subject: [Python-Dev] Weak references: dereference notification In-Reply-To: <4372F374.4060709@v.loewis.de> References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com> <1131558739.9130.40.camel@localhost> <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> <1131576278.8540.14.camel@localhost.localdomain> <4372F374.4060709@v.loewis.de> Message-ID: <1131624263.4292.16.camel@localhost> Qui, 2005-11-10 às 08:15 +0100, "Martin v. Löwis" escreveu: > Gustavo J. A. M. Carneiro wrote: > > OK, but what if it is a subclass of a builtin type, with instance > > variables? What if the PyObject is GC'ed but the ObjC object remains > > alive, and later you get a new reference to it? Do you create a new > > PyObject wrapper for it? What happened to the instance variables? > > Normally, wrappers don't have state. But if you do have state, this > is how it could work: > > 1. 
Make two Python objects, PyState and PyWrapper (actually, > PyState doesn't need to be a Python object) > PyState holds the instance variables, and PyWrapper just > holds a pointer to a GObject. > 2. When a Python reference to a GObject is created for the > first time, create both a PyState and a PyWrapper. Have > the GObject point to the PyState, and the PyWrapper to > the GObject. Have the PyState weakly reference the > PyWrapper. > 3. When the refcount to the PyWrapper drops to zero, discard it. > 4. When somebody asks for the data in the PyWrapper, > go to the GObject, then to the PyState, and return the > data from there. > 5. When somebody wants a reference to a GObject which already > has a PyState, check the weak reference to find out > whether there is a PyWrapper already. If yes, return it; > if not, create a new one (and weakly reference it). > 6. When the GObject is discarded, drop the PyState as well. > > This has the following properties: > 1. There are no cyclic references for wrapping GObjects. > 2. Weakly-referencing wrappers is supported; if there > are no strong Python references to the wrapper, > the wrapper goes away, and, potentially, the GObject > as well. > 3. The PyState object lives as long as the GObject. > 4. Using "is" for GObjects/PyWrappers "works": there is > at most one PyWrapper per GObject at any time. > 5. id() of a GObject may change over time, if the wrapper > becomes unreferenced and then recreated. This was my first approach, actually, in patch 4.1 in [1]. Only your property 2 above drove me to try a different approach -- the weakrefs may become invalid while the GObject may still be alive. That's a bit "surprising". Of course, if I could override weakref.ref() for GObject wrapper types, even that could be worked around... ;-) Thanks, [1] http://bugzilla.gnome.org/show_bug.cgi?id=320428 -- Gustavo J. A. M. Carneiro <gjc at inescporto.pt> <gustavo at users.sourceforge.net> The universe is always one step beyond logic. 
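Martin's six-step scheme is easy to prototype in pure Python. The sketch below models only the reference bookkeeping; the names are invented for illustration, and `GObject` here is a plain stand-in class, not the real C object:

```python
import weakref

class GObject:
    """Stand-in for the reference-counted C-level object."""
    def __init__(self):
        self.pystate = None            # attached on first wrap (step 2)

class PyState:
    """Holds the instance variables; lives as long as the GObject."""
    def __init__(self):
        self.data = {}
        self.wrapper_ref = lambda: None   # starts out as a "dead" weakref

class PyWrapper:
    """Transient Python-side handle; holds only a pointer to the GObject."""
    def __init__(self, gobject):
        self.gobject = gobject

def wrap(gobject):
    # Steps 2 and 5: reuse the live wrapper if there is one, otherwise
    # create a new wrapper and weakly reference it from the PyState.
    if gobject.pystate is None:
        gobject.pystate = PyState()
    wrapper = gobject.pystate.wrapper_ref()
    if wrapper is None:
        wrapper = PyWrapper(gobject)
        gobject.pystate.wrapper_ref = weakref.ref(wrapper)
    return wrapper

def get_data(wrapper):
    # Step 4: wrapper -> GObject -> PyState -> data.
    return wrapper.gobject.pystate.data
```

While any wrapper is alive, wrap() keeps handing back the same object, so "is" works; once the last strong reference is dropped, a later wrap() builds a fresh wrapper whose data (the PyState) survived — which is also why id() can change over time, exactly as property 5 says.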
From gjc at inescporto.pt Thu Nov 10 13:12:58 2005 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Thu, 10 Nov 2005 12:12:58 +0000 Subject: [Python-Dev] Weak references: dereference notification In-Reply-To: <43729B07.6010907@canterbury.ac.nz> References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com> <1131558739.9130.40.camel@localhost> <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> <1131576278.8540.14.camel@localhost.localdomain> <43729B07.6010907@canterbury.ac.nz> Message-ID: <1131624778.4292.22.camel@localhost> Qui, 2005-11-10 às 13:57 +1300, Greg Ewing escreveu: > Gustavo J. A. M. Carneiro wrote: > > > OK, but what if it is a subclass of a builtin type, with instance > > variables? What if the PyObject is GC'ed but the ObjC object remains > > alive, and later you get a new reference to it? Do you create a new > > PyObject wrapper for it? What happened to the instance variables? > > Your proposed scheme appears to involve destroying and > then re-initialising the Python wrapper. Isn't that > going to wipe out any instance variables it may > have had? The object isn't really destroyed. Simply ob_refcnt drops to zero, then tp_dealloc is called, which is supposed to destroy it. But since I wrote tp_dealloc, I choose not to destroy it, and revive it by calling PyObject_Init(), which makes ob_refcnt == 1 again, among other things. > > Also, it seems to me that as soon as the refcount on > the wrapper drops to zero, any weak references to it > will be broken. Or does your resurrection code > intervene before that happens? Yes, I intervene before that happens. Regards. -- Gustavo J. A. M. Carneiro <gjc at inescporto.pt> <gustavo at users.sourceforge.net> The universe is always one step beyond logic. 
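The revive-in-tp_dealloc trick has a pure-Python analogue that makes the mechanics easier to see: a `__del__` method (standing in for tp_dealloc) that stores a new strong reference, so the object survives its refcount reaching zero. The names below are invented for illustration:

```python
class Hibernating:
    """Sketch of the resurrection trick at the Python level."""

    _hibernating = []   # stand-in for the hibernation bookkeeping

    def __del__(self):
        # The refcount hit zero and CPython is about to destroy us;
        # creating a new strong reference revives the object instead,
        # much as the pygtk tp_dealloc re-initialises the PyObject.
        Hibernating._hibernating.append(self)
```

In CPython, dropping the last reference runs `__del__` immediately, and the object lives on in the `_hibernating` list instead of being freed. (Modern CPython will not run `__del__` a second time for the same object, so this particular trick only works once per object.)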
From abo at minkirri.apana.org.au Thu Nov 10 14:47:00 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Thu, 10 Nov 2005 13:47:00 +0000 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4372DD5F.70203@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> Message-ID: <1131630420.12077.44.camel@warna.corp.google.com> On Thu, 2005-11-10 at 00:40 -0500, Michiel Jan Laurens de Hoon wrote: > Stephen J. Turnbull wrote: > > > Michiel> What is the advantage of Tk in comparison to other GUI > > Michiel> toolkits? [...] > My application doesn't need a toolkit at all. My problem is that because > of Tkinter being the standard Python toolkit, we cannot have a decent > event loop in Python. So this is the disadvantage I see in Tkinter. [...] I'm kind of surprised no-one has mentioned Twisted in this thread. Twisted is an async-framework that I believe has support for using a variety of different event-loops, including Tkinter and wxWidgets, as well as its own. It has been heavily re-factored many times, so if you want to see the current Python "state of the art" way of doing this, I'd be having a look at what they are doing. 
-- Donovan Baarda <abo at minkirri.apana.org.au> http://minkirri.apana.org.au/~abo/ From guido at python.org Thu Nov 10 17:50:40 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 10 Nov 2005 08:50:40 -0800 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4372DD5F.70203@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> Message-ID: <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> On 11/9/05, Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> wrote: > My application doesn't need a toolkit at all. My problem is that because > of Tkinter being the standard Python toolkit, we cannot have a decent > event loop in Python. So this is the disadvantage I see in Tkinter. That's a non-sequitur if I ever saw one. Who gave you that idea? There is no connection. (If there's *any* reason for Python not having a standard event loop it's probably because I've never needed one.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mdehoon at c2b2.columbia.edu Thu Nov 10 18:16:36 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Thu, 10 Nov 2005 12:16:36 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4372F72B.9060501@v.loewis.de> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <4372B82C.9010800@canterbury.ac.nz> <4372DA3A.8010206@c2b2.columbia.edu> <4372F72B.9060501@v.loewis.de> Message-ID: <43738074.2030508@c2b2.columbia.edu> Martin v. L?wis wrote: > Michiel Jan Laurens de Hoon wrote: > >> It's not because it likes to be in charge, it's because there's no >> other way to do it in Python. > > As I said: this is simply not true. 
You are right in the sense that it is possible to get events handled using the solutions you proposed before (sorry for not responding to those earlier). But I don't believe that these are very good solutions: > You are missing multi-threading, which is the widely used > approach to doing things simultaneously in a single process. In one > thread, user interaction can occur; in another, computation. If you need > non-blocking interaction between the threads, use queues, or other > global variables. If you have other event sources, deal with them > in separate threads. The problem with threading (apart from potential portability problems) is that Python doesn't let us know when it's idle. This would cause excessive repainting (I can give you an explicit example if you're interested). But there is another solution with threads: Can we let Tkinter run in a separate thread instead? > Yes, it is possible to get event loops with Tkinter. At least on Unix, > you can install a file handler into the Tk event loop (through > createfilehandler), which gives you callbacks whenever there is some > activity on the files. This works, but only if Tkinter is installed, and even then it will give poor performance due to the busy-loop with 20 ms sleep in between in Tkinter. Furthermore, this will not work with IDLE, because the Python thread that handles user commands never enters the Tkinter event loop, even if we import Tkinter. AFAIK, there is no easy solution to this. > Furthermore, it is possible to turn the event loop around, by doing > dooneevent explicitly. Here, the problem is that we don't know *when* to call dooneevent, so we'd have to do a busy-loop and sleep in between. >> Tkinter is a special case among GUI toolkits because it is married to >> Tcl. It doesn't just need to handle its GUI events, it also needs to >> run the Tcl interpreter in between. > > That statement is somewhat deceiving: there isn't much interpreter to > run, really. 
I may be wrong here, but I'd think that it would be dangerous to run Tkinter's event loop when one thread is waiting for another (as happens in IDLE). >> Which is why Tkinter needs to be in charge of the event loop. For >> other GUI toolkits, I don't see a reason why they'd need their own >> event loop. > > They need to fetch events from the operating system level, and dispatch > them to the widgets. This is a perfect task for an event loop located in Python, instead of in an extension module. I could write a prototype event loop for Python to demonstrate how this would work. Sorry if I'm sounding negative, but we've actually considered many different things to get the event loop working for our scientific visualization software, and we were never able to come up with a satisfactory scheme within the current Python framework. Other packages have run into the same problem (e.g. matplotlib, which now recommends using the interactive ipython instead of regular python; the python extension for the Rasmol protein viewer is another). --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From pje at telecommunity.com Thu Nov 10 18:46:34 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 10 Nov 2005 12:46:34 -0500 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511091633m4b7869b7jc3bd847436f452ab@mail.gmail.co m> References: <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> At 04:33 PM 11/9/2005 -0800, Guido van Rossum wrote: >On 11/9/05, Phillip J. Eby <pje at telecommunity.com> wrote: > > By the way, while we're on this subject, can we make the optimization > > options be part of the compile() interface? Right now the distutils has to > > actually exec another Python process whenever you want to compile > > code with > > a different optimization level than what's currently in effect, whereas if > > it could pass the desired level to compile(), this wouldn't be necessary. > >Makes sense to me; we need a patch of course. But before we can do that, it's not clear to me if it should be part of the existing "flags" argument, or whether it should be separate. Similarly, whether it's just going to be a level or an optimization bitmask in its own right might be relevant too. For the current use case, obviously, a level argument suffices, with 'None' meaning "whatever the command-line level was" for backward compatibility. And I guess we could go with that for now easily enough, I'd just like to know whether any of the AST or optimization mavens had anything they were planning in the immediate future that might affect how the API addition should be structured. 
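For concreteness, the shape being proposed might look like the sketch below. The `optimize` keyword did not exist in the compile() of this discussion, so treat it as an illustration of the proposed API rather than a description of an existing one:

```python
# Sketch of the proposed interface: pass the optimization level straight
# to compile() instead of exec'ing a second interpreter with -O/-OO.
# (The `optimize` keyword shows the proposed shape; it was not part of
# compile() in the CPython of this discussion.)

source = '''
def f():
    """docstring"""
    assert True
    return f.__doc__
'''

# Level 0: keep asserts and docstrings, whatever the interpreter's own
# command-line level was.
unoptimized = compile(source, "<example>", "exec", optimize=0)

# Level 2: strip asserts and docstrings (the -OO behaviour) in-process,
# no subprocess needed.
stripped = compile(source, "<example>", "exec", optimize=2)

ns = {}
exec(unoptimized, ns)
print(ns["f"]())        # -> docstring

ns = {}
exec(stripped, ns)
print(ns["f"]())        # -> None
```

With something like this, distutils' byte_compile() could build .pyo files directly instead of spawning `python -O` for each batch of files.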
From guido at python.org Thu Nov 10 18:53:28 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 10 Nov 2005 09:53:28 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> Message-ID: <ca471dc20511100953l2f1f2748s1d721782cb12c53c@mail.gmail.com> On 11/10/05, Phillip J. Eby <pje at telecommunity.com> wrote: > At 04:33 PM 11/9/2005 -0800, Guido van Rossum wrote: > >On 11/9/05, Phillip J. Eby <pje at telecommunity.com> wrote: > > > By the way, while we're on this subject, can we make the optimization > > > options be part of the compile() interface? Right now the distutils has to > > > actually exec another Python process whenever you want to compile > > > code with > > > a different optimization level than what's currently in effect, whereas if > > > it could pass the desired level to compile(), this wouldn't be necessary. > > > >Makes sense to me; we need a patch of course. > > But before we can do that, it's not clear to me if it should be part of the > existing "flags" argument, or whether it should be separate. Similarly, > whether it's just going to be a level or an optimization bitmask in its own > right might be relevant too. > > For the current use case, obviously, a level argument suffices, with 'None' > meaning "whatever the command-line level was" for backward > compatibility. 
And I guess we could go with that for now easily enough, > I'd just like to know whether any of the AST or optimization mavens had > anything they were planning in the immediate future that might affect how > the API addition should be structured. I'm not a big user of this API, please design as you see fit. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Thu Nov 10 18:55:19 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 10 Nov 2005 12:55:19 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <1131630420.12077.44.camel@warna.corp.google.com> References: <4372DD5F.70203@c2b2.columbia.edu> <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> Message-ID: <5.1.1.6.0.20051110125301.02b2d318@mail.telecommunity.com> At 01:47 PM 11/10/2005 +0000, Donovan Baarda wrote: >Twisted is an async-framework that I believe has support for using a >variety of different event-loops, including Tkinter and wxWidgets, as >well as it's own. Technically, it just gives Tkinter a chance to run every so often; you specifically *can't* use Tkinter's event loop. Instead, you run the Twisted event loop after telling it that you'd like Tkinter to be kept in the loop, as it were. But Twisted is definitely worth looking at for this sort of thing. It's the nearest thing to a "standard Python event loop" that exists, apart from the asyncore stuff in the stdlib (which doesn't have any GUI support AFAIK). 
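Phillip's description of how Twisted keeps Tkinter alive — drain the toolkit's pending events, do your own work, sleep briefly — can be sketched generically. In this sketch, dooneevent() and work() are hypothetical stand-ins for the toolkit's non-blocking event call (e.g. Tkinter's dooneevent with the DONT_WAIT flag) and the framework's own processing:

```python
import time

def interleave(dooneevent, work, iterations, idle_sleep=0.02):
    """Alternate between a toolkit's event queue and our own work.

    dooneevent() should handle one pending toolkit event and return
    True, or return False when the queue is empty (the shape of
    Tkinter's dooneevent when called with DONT_WAIT).
    """
    for _ in range(iterations):
        while dooneevent():       # drain pending GUI events, non-blocking
            pass
        work()                    # the framework's own event sources
        time.sleep(idle_sleep)    # avoid a hot busy-loop when idle
```

This is the "busy loop polling" shape: the toolkit never gets to block, so nothing stalls, but every pass costs a sleep plus a handful of "nothing to do" calls.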
From mdehoon at c2b2.columbia.edu Thu Nov 10 19:07:17 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Thu, 10 Nov 2005 13:07:17 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> Message-ID: <43738C55.60509@c2b2.columbia.edu> Guido van Rossum wrote: >On 11/9/05, Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> wrote: > > >>My application doesn't need a toolkit at all. My problem is that because >>of Tkinter being the standard Python toolkit, we cannot have a decent >>event loop in Python. So this is the disadvantage I see in Tkinter. >> >> > >That's a non-sequitur if I ever saw one. Who gave you that idea? There is no connection. > I have come to this conclusion after several years of maintaining a scientific plotting package and trying to set up an event loop for it. Whereas there are some solutions that more or less work, none of them work very well, and the solutions that we found tend to break. Other visualization packages are struggling with the same problem. I'm trying the best I can to explain in my other posts why I feel that Tkinter is the underlying reason, and why it would be difficult to solve. >(If there's *any* reason for Python not having a standard event loop >it's probably because I've never needed one.) > It's probably because we have gotten away with piggy-backing on Tcl's event loop for so long. --Michiel. 
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From mcherm at mcherm.com Thu Nov 10 19:12:03 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Thu, 10 Nov 2005 10:12:03 -0800 Subject: [Python-Dev] (no subject) Message-ID: <20051110101203.2v6dz00ya8ogs08o@login.werra.lunarpages.com> Sokolov Yura writes: > Excuse my English No problem. Your command of English probably exceeds my command of any other language. > I think, we could just segregate tokens for decimal and real float and > make them interoperable. > Most of us works with business databases - all "floats" are really > decimals, algebraic operations > should work without float inconsistency and those operations rare so > speed is not important. > But some of us use floats for speed in scientific and multimedia programs. I'm not sure why you say "most" (have you seen some surveys of Python programmers that I haven't seen?), but I think we all agree that there are Python users who rarely have a need for machine floats, and others who badly need them. I'll take your specific suggestions out of order: > with "from __future__ import Decimal" we could: > c) result of operation with decimal operands should be decimal > >>> 1.0/3.0 > 0.33333333333333333 This already works. > d) result of operation with float operands should be float > >>> 1.0f/3.0f > 0.33333333333333331f This already works. > e) result of operation with decimal and float should be float (decimal > converts into float and operation perfomed) > >>> 1.0f/3.0 > 0.33333333333333331f > >>> 1.0/3.0f > 0.33333333333333331f Mixing Decimal and float is nearly ALWAYS a user error. Doing it correctly requires significant expertise in the peculiarities of floating point representations. So Python protects the user by throwing exceptions when attempts are made to mix Decimal and floats. 
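The protection described above is easy to demonstrate (a minimal sketch; the exception raised for mixed arithmetic is a TypeError):

```python
from decimal import Decimal

# Decimal with Decimal: works, with base-10 semantics.
third = Decimal("1.0") / Decimal("3.0")   # 0.3333...

# Decimal mixed with float: rejected rather than silently coerced.
try:
    result = Decimal("1.0") + 0.5
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

The expert workaround is an explicit conversion — Decimal(str(x)) or float(d) — which makes the intended precision loss visible in the code.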
This is the desired behavior (and the experts already know how to work around it on the RARE occasions when they need to). > a) interpret regular float constants as decimal > b) interpret float constants with suffix 'f' as float (like 1.5f > 345.2e-5f etc) There are two different ideas here, which I will separate. The first is a proposal that there be a way to provide Decimal literals. The second proposal is that ###.### be the literal for Decimals and that ###.###f be the literal for floats. I'm in favor of the first idea. Decimals are useful enough that it would be a good idea to provide some sort of literal for their use. This is well worth a PEP. But if we DO agree that we ought to have literals for both floats and Decimals, then we also need to decide which gets the coveted "unadorned decimal literal" (ie, ###.###). Performance argues in favor of floats (they run *MUCH* faster). Usability (particularly for beginners) argues in favor of Decimals (they sometimes still have surprising behavior, but less often than with binary floats). And backward compatibility argues in favor of floats. Myself, I'm an "expert" user (at least to this extent) and I could easily handle either choice. If others felt like me, then it's likely that the backward compatibility argument and the need to fight the pervasive meme that "Python is slow" will win the day. -- Michael Chermside From tzot at mediconsa.com Thu Nov 10 20:29:45 2005 From: tzot at mediconsa.com (Christos Georgiou) Date: Thu, 10 Nov 2005 21:29:45 +0200 Subject: [Python-Dev] Building Python with Visual C++ 2005 Express Edition Message-ID: <dl073b$aro$1@sea.gmane.org> I didn't see any mention of this product in the Python-Dev list, so I thought to let you know. http://msdn.microsoft.com/vstudio/express/visualc/download/ There is also a link for a CD image (.img) file to download. I am downloading now, so I don't know yet whether Python compiles with it without any problems. 
So if anyone has previous experience, please reply. PS This page ( http://msdn.microsoft.com/vstudio/express/support/faq/default.aspx#pricing ) says that if you download it until Nov 7, 2006, it's a gift --the Microsoft VC++ compiler for free (perhaps a cut-down version). Bits from the FAQ: http://msdn.microsoft.com/vstudio/express/support/faq/default.aspx 4. Can be used even for commercial products without licensing restrictions 40. It includes the optimizing compiler (without stuff like Profile Guided Optimizations) 41. Builds both native and managed applications (you 99% need to download the SDK too) 42. No MFC or ATL included From bcannon at gmail.com Thu Nov 10 20:36:51 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 10 Nov 2005 11:36:51 -0800 Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions In-Reply-To: <43731133.9000904@gmail.com> References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> <43731133.9000904@gmail.com> Message-ID: <bbaeab100511101136q56ae01a2t29379079a933fc3b@mail.gmail.com> On 11/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote: > Brett Cannon wrote: > > I just finished fleshing out the dev FAQ > > (http://www.python.org/dev/devfaq.html) with questions covering what > > someone might need to know for regular usage. If anyone thinks I > > didn't cover something I should have, let me know. > > Should the section "Developing on Windows" disappear now? > Well, the whole dev doc section needs cleaning up and that includes the dev FAQ. I was planning on doing this at some point; might as well start talking about it now. In my mind, the steps in each of the major things to do (bugs and patches) needs better docs. With that fleshed out, Intro to Development can act as an overview of the process. This should, together with the dev FAQ, cover what someone needs to do dev work. The question is how to structure the bug/patch guidelines. 
There are two options; dev FAQ entries much like the svn section or a more classic layout of the info. Both would have a bulleted list of the steps necessary for a bug/patch. The question is whether the information is presented in paragraphs of text following the bulleted list or as a list of questions. What do people prefer? -Brett From martin at v.loewis.de Thu Nov 10 20:40:04 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 10 Nov 2005 20:40:04 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <43738074.2030508@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <4372B82C.9010800@canterbury.ac.nz> <4372DA3A.8010206@c2b2.columbia.edu> <4372F72B.9060501@v.loewis.de> <43738074.2030508@c2b2.columbia.edu> Message-ID: <4373A214.6060201@v.loewis.de> Michiel Jan Laurens de Hoon wrote: >>You are missing multi-threading, which is the widely used >>approach to doing things simultaneously in a single process. > > The problem with threading (apart from potential portability problems) > is that Python doesn't let us know when it's idle. This would cause > excessive repainting (I can give you an explicit example if you're > interested). I don't understand how these are connected: why do you need to know when Python is idle for multi-threaded applications, and why does not knowing that it is idle cause massive repainting? Not sure whether an explicit example would help, though; one would probably need to understand a lot of details of your application. Giving a simplified version of the example might help (which would do 'print "Repainting"' instead of actually repainting). > But there is another solution with threads: Can we let Tkinter run in a > separate thread instead? Yes, you can. Actually, Tkinter *always* runs in a separate thread (separate from all other threads). 
> This works, but only if Tkinter is installed, and even then it will give > poor performance due to the busy-loop with 20 ms sleep in between in > Tkinter. Furthermore, this will not work with IDLE, because the Python > thread that handles user commands never enters the Tkinter event loop, > even if we import Tkinter. AFAIK, there is no easy solution to this. Here I'm losing track. What is the "this" that has no easy solution? Why do you need a callback when Python is idle in the first place? > I may be wrong here, but I'd think that it would be dangerous to run > Tkinter's event loop when one thread is waiting for another (as happens > in IDLE). I don't understand. Threads don't wait for each other. Threads wait for events (which might be generated by some other thread, of course). However, there is no problem running the Tkinter event loop when some unrelated thread is blocked. > Sorry if I'm sounding negative, but we've actually considered many > different things to get the event loop working for our scientific > visualization software, and we were never able to come up with a > satisfactory scheme within the current Python framework. I really don't see what the problem is. Why does the visualization framework care that Tkinter is around? What are the events that the visualization framework needs to process? 
Regards, Martin From martin at v.loewis.de Thu Nov 10 20:44:00 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 10 Nov 2005 20:44:00 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <43738C55.60509@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> Message-ID: <4373A300.3080501@v.loewis.de> Michiel Jan Laurens de Hoon wrote: > I have come to this conclusion after several years of maintaining a > scientific plotting package and trying to set up an event loop for > it. Whereas there are some solutions that more or less work, none of > them work very well, and the solutions that we found tend to break. > Other visualization packages are struggling with the same problem. As you can see, the problem is not familiar to anybody reading python-dev. > I'm trying the best I can to explain in my other posts why I feel > that Tkinter is the underlying reason, and why it would be difficult > to solve. Before trying to explain the reason, please try to explain the problem first. What is it *really* that you want to do which you feel you currently can't do? Regards, Martin From martin at v.loewis.de Thu Nov 10 20:47:04 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 10 Nov 2005 20:47:04 +0100 Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions In-Reply-To: <43731133.9000904@gmail.com> References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> <43731133.9000904@gmail.com> Message-ID: <4373A3B8.3090402@v.loewis.de> Nick Coghlan wrote: > Should the section "Developing on Windows" disappear now? I think so, yes (along with the document it refers to). 
Regards, Martin From martin at v.loewis.de Thu Nov 10 20:52:16 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 10 Nov 2005 20:52:16 +0100 Subject: [Python-Dev] Building Python with Visual C++ 2005 Express Edition In-Reply-To: <dl073b$aro$1@sea.gmane.org> References: <dl073b$aro$1@sea.gmane.org> Message-ID: <4373A4F0.7010202@v.loewis.de> Christos Georgiou wrote: > I didn't see any mention of this product in the Python-Dev list, so I > thought to let you know. > > http://msdn.microsoft.com/vstudio/express/visualc/download/ > > There is also a link for a CD image (.img) file to download. > > I am downloading now, so I don't know yet whether Python compiles with it > without any problems. So if anyone has previous experience, please reply. I don't have previous experience, but I think it likely shares the issues that VS.NET 2005 has with the current code: 1. the project files are for VS.NET 2003. In theory, conversion to the new format is supported, but I don't know whether this conversion works flawlessly. 2. MS broke ISO C conformance in VS.NET 2005 in a way that affects Python's signal handling. There is a patch on SF which addresses the issue, but that hasn't been checked in yet. Regards, Martin From bcannon at gmail.com Thu Nov 10 22:38:50 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 10 Nov 2005 13:38:50 -0800 Subject: [Python-Dev] dev FAQ updated with day-to-day svn questions In-Reply-To: <4373164F.8070606@gmail.com> References: <bbaeab100511092114y73e5f525ubf5011fae39eab01@mail.gmail.com> <4373164F.8070606@gmail.com> Message-ID: <bbaeab100511101338v59de0200o3c28f150958457c0@mail.gmail.com> On 11/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote: > Brett Cannon wrote: > > I just finished fleshing out the dev FAQ > > (http://www.python.org/dev/devfaq.html) with questions covering what > > someone might need to know for regular usage. If anyone thinks I > > didn't cover something I should have, let me know. 
> > For question 1.2.10, I believe you also want: > > [miscellany] > enable-auto-props = yes > > so that "svn add" works properly. Added. Missed that I had it in my personal config. =) > > Question 1.4.1 should cover the use of "svn diff" instead of "cvs diff" to > make the patch. > Changed. > On that note, we need to update the patch submission guidelines to point to > SVN instead of CVS (those guidelines also still say context diffs are > preferred to unified diffs, which I believe is no longer true). > Fixed and fixed. -Brett From mhammond at skippinet.com.au Thu Nov 10 22:49:20 2005 From: mhammond at skippinet.com.au (Mark Hammond) Date: Fri, 11 Nov 2005 08:49:20 +1100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <43738C55.60509@c2b2.columbia.edu> Message-ID: <DAELJHBGPBHPJKEBGGLNIEBBICAD.mhammond@skippinet.com.au> Michiel wrote: > Guido van Rossum wrote: > > >On 11/9/05, Michiel Jan Laurens de Hoon > <mdehoon at c2b2.columbia.edu> wrote: > > > > > >>My application doesn't need a toolkit at all. My problem is that because > >>of Tkinter being the standard Python toolkit, we cannot have a decent > >>event loop in Python. So this is the disadvantage I see in Tkinter. > >> > >> > > > >That's a non-sequitur if I ever saw one. Who gave you that idea? > There is no connection. > > > I have come to this conclusion after several years of maintaining > a scientific plotting package and trying to set up an event loop > for it. Whereas there are some solutions that more or less work, > none of them work very well, and the solutions that we found tend > to break. Other visualization packages are struggling with the > same problem. I'm trying the best I can to explain in my other > posts why I feel that Tkinter is the underlying reason, and why > it would be difficult to solve. 
I believe this problem all boils down to this paragraph from the first mail on this topic: : Currently, event loops are available in Python via PyOS_InputHook, a : pointer to a user-defined function that is called when Python is idle : (waiting for user input). However, an event loop using PyOS_InputHook : has some inherent limitations, so I am thinking about how to improve : event loop support in Python. Either we have an unusual definition of "event loop" (as many many other toolkits have implemented event loops without PyOS_InputHook), or the requirement is for an event loop that plays nicely with the "interactive loop" in Python.exe. Assuming the latter, I would suggest simply not trying to do that! Look at the "code" module for a way you can create your own interactive loop that plays nicely with your event loop (rather than trying to do it the other way around). Otherwise, I suggest you get very specific about what this event loop should do. From a previous mail in this thread (an exchange between you and Martin): > >> Which is why Tkinter needs to be in charge of the event loop. For > >> other GUI toolkits, I don't see a reason why they'd need their own > >> event loop. > > > > They need to fetch events from the operating system level, and dispatch > > them to the widgets. > This is a perfect task for an event loop located in Python, instead of > in an extension module. I believe the point Martin was trying to make is that we have 2 "unknown" quantities here - the "operating system" and the "widgets". Each OS delivers raw GUI events differently, and each GUI framework consumes and generates events differently. I can't see what a single event loop would look like. Even on Windows there is no single, standard "event loop" construct - MFC and VB apps both have custom message loops. 
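The "code" module approach looks roughly like this: drive the read-eval loop yourself with code.InteractiveConsole, and pump the GUI's events at the point where the console would otherwise block for input. A hypothetical sketch — pump_events is a stand-in for whatever the toolkit actually provides:

```python
import code

class EventLoopConsole(code.InteractiveConsole):
    """An interactive loop that yields to a GUI toolkit around input."""

    def __init__(self, pump_events, locals=None):
        code.InteractiveConsole.__init__(self, locals)
        self.pump_events = pump_events  # hypothetical toolkit hook

    def raw_input(self, prompt=""):
        # A real implementation would pump until stdin becomes readable
        # (e.g. with select()); here we pump once before blocking.
        self.pump_events()
        return input(prompt)

# The underlying interpreter machinery is usable on its own:
interp = code.InteractiveInterpreter({"__name__": "__console__"})
needs_more = interp.runsource("x = 40 + 2")   # complete statement: runs it
incomplete = interp.runsource("if x:")        # incomplete: asks for more
```

runsource() returns True when it needs more input and False once the source has been compiled and run, which is essentially all the plumbing a custom interactive loop needs.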
Mozilla XUL applications (which are very close to being able to be written in Python <wink>) have an event loop that could not possibly be expressed in Python - but they do expose a way to *call* their standard event loop (which is quite a different thing - you are asking to *implement* it.) > I could write a prototype event loop for Python > to demonstrate how this would work. I think that would be the best way forward - this may all simply be one big misunderstanding <wink>. The next step after that would be to find even one person who currently uses an event-loop based app, and for whom your event loop would work. Mark. From ulrich.berning at desys.de Fri Nov 11 14:32:21 2005 From: ulrich.berning at desys.de (Ulrich Berning) Date: Fri, 11 Nov 2005 14:32:21 +0100 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> References: <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> Message-ID: <43749D65.4040001@desys.de> Phillip J. Eby schrieb: >At 04:33 PM 11/9/2005 -0800, Guido van Rossum wrote: > > >>On 11/9/05, Phillip J. Eby <pje at telecommunity.com> wrote: >> >> >>>By the way, while we're on this subject, can we make the optimization >>>options be part of the compile() interface? Right now the distutils has to >>>actually exec another Python process whenever you want to compile >>>code with >>>a different optimization level than what's currently in effect, whereas if >>>it could pass the desired level to compile(), this wouldn't be necessary. 
>>> >>> >>Makes sense to me; we need a patch of course. >> >> > >But before we can do that, it's not clear to me if it should be part of the >existing "flags" argument, or whether it should be separate. Similarly, >whether it's just going to be a level or an optimization bitmask in its own >right might be relevant too. > >For the current use case, obviously, a level argument suffices, with 'None' >meaning "whatever the command-line level was" for backward >compatibility. And I guess we could go with that for now easily enough, >I'd just like to know whether any of the AST or optimization mavens had >anything they were planning in the immediate future that might affect how >the API addition should be structured. > > > I'm using a totally different approach for the above problem. I have implemented two functions in the sys module that make the startup flags accessible at runtime. This also solves some other problems I had, as you will see in the examples below: The first function makes most of the flags readable (I have omitted the flags that are documented as deprecated in the code): sys.getrunflag(name) -> integer Return one of the interpreter run flags. Possible names are 'Optimize', 'Verbose', 'Interactive', 'IgnoreEnvironment', 'Debug', 'DivisionWarning', 'NoSite', 'NoZipImport', 'UseClassExceptions', 'Unicode', 'Frozen', 'Tabcheck'. getrunflag('Optimize') for example returns the current value of Py_OptimizeFlag. The second function makes a few flags writable: sys.setrunflag(name, value) -> integer Set an interpreter run flag. The only flags that can be changed at runtime are Py_VerboseFlag ('Verbose') and Py_OptimizeFlag ('Optimize'). Returns the previous value of the flag. As you can see, I have also introduced the new flag Py_NoZipImport that can be activated with -Z at startup. 
This bypasses the activation of zipimport and is very handy if you edit modules stored in the filesystem that are normally imported from a zip archive and you want to test your modifications. With this flag, there is no need to delete, rename or update the zip archive or to modify sys.path to ensure that your changed modules are imported from the filesystem and not from the zip archive. And here are a few usable examples for the new functions: 1.) You have an application that does a huge amount of imports and some of them are mysterious, so you want to track them in verbose mode. You could start python with -v or -vv, but then you get hundreds or thousands of lines of output. Instead, you can do the following: import sys import ... import ... 
---- And now back to the original subject: I have done nearly the same changes, that Osvaldo provided with his patch and I would highly appreciate if this patch goes into the next release. The main reason why I changed the import behavior was pythonservice.exe from the win32 extensions. pythonservice.exe imports the module that contains the service class, but because pythonservice.exe doesn't run in optimized mode, it will only import a .py or a .pyc file, not a .pyo file. Because we always generate bytecode with -OO at distribution time, we either had to change the behavior of pythonservice.exe or change the import behavior of Python. It is essential for us to remove assertions and docstrings in our commercial Python applications at distribution time, because assertions are meaningful only at development time and docstrings may contain implementation details, that our customers should never see (this makes reverse engineering a little bit harder, but not impossible). Another important reason is, that the number of files in the standard library is reduced dramatically. I can have .py and .pyc files in lib/pythonX.Y containing assertions and docstrings and put optimized code in lib/pythonXY.zip. Then, if I need docstrings, I just bypass zipimport as described above with -Z. On the customer site, we only need to install the zip archive (our customers are not developers, just end users; they may not even recognize that we use Python for application development). Guido, if it was intentional to separate slightly different generated bytecode into different files and if you have good reasons for doing this, why have I never seen a .pyoo file :-) For instance, nobody would give the output of a C compiler a different extension when different compiler flags are used. I would appreciate to see the generation of .pyo files completely removed in the next release. Just create them as .pyc files, no matter if -O or -OO is used or not. 
At runtime, Python should just ignore (jump over?) assert statements when started with -O or ignore assert statements and docstrings when started with -OO if they are in the .pyc file. ---- There are two reasons, why I haven't started a discussion about those issues earlier on this list: 1.) I have done those changes (and a lot of other minor and major changes) in Python 2.3.x at a time when Python 2.4.x came up and we still use Python 2.3.5. I just wanted to wait until I have the next release of our Python runtime environment (RTE) ready that will contain the most recent Python version. 2.) In the last months, my time was limited, because I first had to stabilize the current RTE (developers and customers were waiting on it). A few notes about this Python runtime environment: The RTE that I maintain contains everything (mainly, but not only Python) that is needed to run our applications on the customer site. The applications are distributed as frozen binaries either containing the whole application or only the frozen main script together with an additional zip archive holding the application specific modules and packages. Tools like py2exe or cx_Freeze follow a different approach: they always package a kind of runtime environment together with the application. There is nothing bad with this approach, but if you provide more than one application, you waste resources and it may be harder to maintain (bugfixes, updates, etc.). Our RTE currently runs on AIX, HP-UX, IRIX, Windows and Linux, SunOS/Solaris will follow in the near future. It gets installed together with the application(s) on the customer site if it is not already there in the appropriate version. It is vendor specific (more in the sense of a provider, not necessarily in the sense of a seller) and shouldn't interfere with any other software already installed on the target site. 
This RTE is the result of years of experience of commercial application development with Python (we started with Python-1.3.x a long long time ago). My guideline is: Application developers and/or customers should not worry about the requirements needed to get the applications running on the target site. Developers should spend all their time for application development and customers can expect a complete installation with only a very small set of system requirements/dependencies. In general, I would like to discuss those things that increase Python's usability especially in a commercial environment and things that make Python more consistent across platforms (e.g. unifying the filesystem layout on Windows and UNIX/Linux), but I don't know if this is the right mailing list. Ulli From jimjjewett at gmail.com Fri Nov 11 15:27:13 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 11 Nov 2005 09:27:13 -0500 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks Message-ID: <fb6fbf560511110627w7435754v9175076eb256932f@mail.gmail.com> Ulrich Berning schrieb: [He already has a patch that does much of what is being discussed] > I have also introduced the new flag Py_NoZipImport that > can be activated with -Z at startup. This bypasses the > activation of zipimport I think -Z could be confusing; I would expect it to work more like the recent suggestion that it name a specific Zip file to use as the only (or at least the first) source of modules. I do see that the switch is handy; I'm just suggesting a different name, such as -nozip or -skip file.zip. -jJ From jimjjewett at gmail.com Fri Nov 11 16:32:42 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 11 Nov 2005 10:32:42 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter - Summary attempt Message-ID: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com> There has been enough misunderstanding in this thread that the summarizers are going to have trouble. 
So I'm posting this draft in hopes of clarification; please correct me. (1) There is some pre-discussion attached to patches 1049855 and 1252236. Martin Loewis and Michiel de Hoon agreed that the fixes were fragile, and that a larger change should be discussed on python-dev. (2) Michiel writes visualization software; he (and others, such as the writers of matplotlib) has trouble creating a good event loop, because the GUI toolkit (especially Tkinter?) wants its own event loop to be in charge. Note that this isn't the first time this sort of problem has come up; usually it is phrased in terms of a problem with Tix, or not being able to run turtle while in IDLE. Event loops by their very nature are infinite loops; once they start, everything else is out of luck unless it gets triggered by an event or is already started. (3) Donovan Baarda suggested looking at Twisted for state of the art in event loop integration. Unfortunately, as Phillip Eby notes, it works by not using the Tkinter event loop. It decides for itself when to call dooneevent. (4) Michiel doesn't actually need Tkinter (or any other GUI framework?) for his own project, but he has to play nice with it because his users expect to be able to use other tools -- particularly IDLE -- while running his software. (5) It is possible to run Tkinter's dooneevent version as part of your own event loop (as Twisted does), but you can't really listen for its events, so you end up with a busy loop polling, and stepping into lots of "I have nothing to do" functions for every client event loop. You can use Tkinter's loop, but once it goes to sleep waiting for input, everything sort of stalls out for a while, and even non-Tkinter events get queued instead of processed. (6) Mark Hammond suggests that it might be easier to replace the interactive portions of python based on the "code" module. matplotlib suggests using ipython instead of standard python for similar reasons. 
If that is really the simplest answer (and telling users which IDE to use is acceptable), then ... I think Michiel has a point. (7) One option might be to always start Tk in a new thread, rather than letting it take over the main thread. There was some concern (see patch 1049855) that Tkinter doesn't -- and shouldn't -- require threading. My thoughts are that some of the biggest problems with the event loop (waiting on a mutex) won't happen in non-threaded python, and that even dummy_thread would be an improvement over the current state (by forcing the event loop to start last). I may well be missing something, but obviously I'm not sure what that is. -jJ From guido at python.org Fri Nov 11 17:15:18 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 11 Nov 2005 08:15:18 -0800 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <43749D65.4040001@desys.de> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> <43749D65.4040001@desys.de> Message-ID: <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com> On 11/11/05, Ulrich Berning <ulrich.berning at desys.de> wrote: > Guido, if it was intentional to separate slightly different generated > bytecode into different files and if you have good reasons for doing > this, why have I never seen a .pyoo file :-) Because -OO was an afterthought and not implemented by me. > For instance, nobody would give the output of a C compiler a different > extension when different compiler flags are used. But the usage is completely different. With C you explicitly manage when compilation happens. With Python you don't. 
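The implicit-compilation point is easy to see concretely: assert statements are compiled away under -O, so the bytecode produced from the same source is flag-dependent, and a cache that ignored the flag would hand back the wrong code. A minimal sketch (using the standard subprocess module to spawn the interpreter with and without -O):

```python
import subprocess
import sys

# The same source compiles differently under -O: 'assert' statements
# are dropped entirely, so cached bytecode depends on the flag.
src = "assert False, 'assertions are enabled'"

plain = subprocess.call([sys.executable, "-c", src])
optimized = subprocess.call([sys.executable, "-O", "-c", src])

print(plain)      # non-zero exit: the assertion fired
print(optimized)  # 0: the assertion was compiled away
```

Reusing bytecode cached from the -O run for a plain run would silently disable the assertion, which is exactly the surprise described here.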
When you first run your program with -O but it crashes, and then you run it again without -O to enable assertions, you would be very unhappy if the bytecode cached in a .pyo file would be reused! > I would appreciate to see the generation of .pyo files completely > removed in the next release. You seem to forget the realities of backwards compatibility. While there are ways to cache bytecode without having multiple extensions, we probably can't do that until Python 3.0. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mdehoon at c2b2.columbia.edu Fri Nov 11 17:58:31 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Fri, 11 Nov 2005 11:58:31 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter - Summary attempt In-Reply-To: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com> References: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com> Message-ID: <4374CDB7.2020001@c2b2.columbia.edu> I think this is an excellent summary of the discussion so far. Probably clearer than my own posts. Thanks, Jim! --Michiel. Jim Jewett wrote: >There has been enough misunderstanding in this thread >that the summarizers are going to have trouble. So I'm >posting this draft in hopes of clarification; please correct >me. > >(1) There is some pre-discussion attached to patches >1049855 and 1252236. Martin Loewis and Michiel >de Hoon agreed that the fixes were fragile, and that a >larger change should be discussed on python-dev. > >(2) Michiel writes visualization software; he (and >others, such as the writers of matplotlib) has trouble >creating a good event loop, because the GUI toolkit >(especially Tkinter?) wants its own event loop to be in >charge. > >Note that this isn't the first time this sort of problem has >come up; usually it is phrased in terms of a problem with >Tix, or not being able to run turtle while in IDLE. 
> >Event loops by their very nature are infinite loops; >once they start, everything else is out of luck unless it >gets triggered by an event or is already started. > >(3) Donovan Baarda suggested looking at Twisted for >state of the art in event loop integration. Unfortunately, >as Phillip Eby notes, it works by not using the Tkinter >event loop. It decides for itself when to call dooneevent. > >(4) Michiel doesn't actually need Tkinter (or any other GUI >framework?) for his own project, but he has to play nice >with it because his users expect to be able to use other >tools -- particularly IDLE -- while running his software. > >(5) It is possible to run Tkinter's dooneevent version >as part of your own event loop (as Twisted does), but >you can't really listen for its events, so you end up with >a busy loop polling, and stepping into lots of "I have >nothing to do" functions for every client eventloop. > >You can use Tkinter's loop, but once it goes to sleep >waiting for input, everything sort of stalls out for a while, >and even non-Tkinter events get queued instead of >processed. > >(6) Mark Hammond suggests that it might be easier to >replace the interactive portions of python based on the >"code" module. matplotlib suggests using ipython >instead of standard python for similar reasons. > >If that is really the simplest answer (and telling users >which IDE to use is acceptable), then ... I think Michiel >has a point. > >(7) One option might be to always start Tk in a new >thread, rather than letting it take over the main thread. > >There was some concern (see patch 1049855) that >Tkinter doesn't -- and shouldn't -- require threading. > >My thoughts are that some of the biggest problems >with the event loop (waiting on a mutex) won't happen >in non-threaded python, and that even dummy_thread >would be an improvement over the current state (by >forcing the event loop to start last). I may well be >missing something, but obviously I'm not sure what that is. 
> >-jJ >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/mdehoon%40c2b2.columbia.edu > > -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From mdehoon at c2b2.columbia.edu Fri Nov 11 18:56:57 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Fri, 11 Nov 2005 12:56:57 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4373A300.3080501@v.loewis.de> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> Message-ID: <4374DB69.2080804@c2b2.columbia.edu> Martin v. L?wis wrote: >Before trying to explain the reason, please try to explain the >problem first. What is it *really* that you want to do which >you feel you currently can't do? > > Probably I should have started the discussion with this; sorry if I confused everybody. But here it is: I have an extension module for scientific visualization. This extension module opens one or more windows, in which plots can be made. Something similar to the plotting capabilities of Matlab. For the graphics windows to remain responsive, I need to make sure that its events get handled. So I need an event loop. At the same time, the user can enter new Python commands, which also need to be handled. To achieve this, I make use of PyOS_InputHook, a pointer to a function which gets called just before going into fgets to read the next Python command. I use PyOS_InputHook to enter an event loop inside my extension module. 
This event loop handles the window events, and returns as soon as a new Python command is available at stdin, at which point we continue to fgets as usual. While this approach more or less works, there are two problems that I have run into: 1) What if the user decides to import Tkinter next? Tkinter notices that PyOS_InputHook is already set, and does not reset it to its own event loop. Hence, Tkinter's events are not handled. Similarly, if a user imports Tkinter before my extension module, I don't reset PyOS_InputHook, so Tkinter's events are handled but not mine. If I were to reset PyOS_InputHook to my extension module's event loop, then my events get handled but not Tkinter's. 2) On Windows, many users will use IDLE to run Python. IDLE uses two Python threads, one for the GUI and one for the user's Python commands. Each has its own PyOS_InputHook. If I import my extension module (or Tkinter, for that matter), the user-Python's PyOS_InputHook gets set to the corresponding event loop function. So far so good. However, PyOS_InputHook doesn't actually get called: Between Python commands, the GUI-Python is waiting for the user to type something, and the user-Python is waiting for the GUI-Python. While the user-Python is waiting for the GUI-Python thread, no call is made to PyOS_InputHook, therefore we don't enter an event loop, and no events get handled. Hence, neither my extension module nor Tkinter work when run from IDLE. --Michiel. 
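For readers unfamiliar with the mechanism Michiel describes: PyOS_InputHook is a C-level function pointer, but it can be inspected and experimentally swapped from pure Python via ctypes (shipped with Python 2.5 and later; a third-party package before that). A sketch — `pump_events` here is a stand-in for a real toolkit pump, and since installing a hook this way is exactly the clobbering the thread is about, it restores the previous value:

```python
import ctypes

# PyOS_InputHook is an exported C data symbol: a pointer to an
# int (*)(void) that the interpreter calls while waiting for input.
HOOKFUNC = ctypes.PYFUNCTYPE(ctypes.c_int)

def pump_events():
    # Stand-in body: a real hook would run one iteration of the
    # toolkit's event loop and return once stdin has data.
    return 0

hook = HOOKFUNC(pump_events)  # keep a reference so it is not collected
slot = ctypes.c_void_p.in_dll(ctypes.pythonapi, "PyOS_InputHook")
previous = slot.value         # whatever hook (if any) was installed
slot.value = ctypes.cast(hook, ctypes.c_void_p).value

# ... at each interactive prompt the interpreter now calls pump_events ...

slot.value = previous         # restore, so other users are not clobbered
print("hook installed and restored")
```

This is the same first-come-first-served slot that Tkinter claims, which is why two extensions cannot both have their events serviced this way.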
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032

From fredrik at pythonware.com Fri Nov 11 19:14:57 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 11 Nov 2005 19:14:57 +0100
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter - Summary attempt
References: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com>
Message-ID: <dl2n2s$a5e$1@sea.gmane.org>

Jim Jewett wrote:

> (6) Mark Hammond suggests that it might be easier to
> replace the interactive portions of python based on the
> "code" module. matplotlib suggests using ipython
> instead of standard python for similar reasons.
>
> If that is really the simplest answer (and telling users
> which IDE to use is acceptable), then ... I think Michiel
> has a point.

really? Python comes with a module that makes it trivial to get a fully working interpreter console under any kind of UI toolkit with very little effort, and you think that proves that we need to reengineer the CPython interpreter to support arbitrary event loops so you don't have to use that module?

as usual, you make absolutely no sense whatsoever.

(or was "..." short for "CPython's interactive mode should use the code module" ?)

</F>

From jimjjewett at gmail.com Fri Nov 11 21:44:34 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 11 Nov 2005 15:44:34 -0500
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
Message-ID: <fb6fbf560511111244v6bf66043n27674c23670f42e9@mail.gmail.com>

>> (6) Mark Hammond suggests that it might be easier to
>> replace the interactive portions of python based on the
>> "code" module. matplotlib suggests using ipython
>> instead of standard python for similar reasons.
>> If that is really the simplest answer (and telling users
>> which IDE to use is acceptable), then ... I think Michiel
>> has a point.

Fredrik Lundh wrote:

> really?
Python comes with a module that makes it trivial to get > a fully working interpreter console ... Using an event loop (or an external GUI) should not require forking the entire interactive mode, no matter how trivial that fork is. The subtle differences between interactive mode and IDLE already cause occasional problems; the same would be true of code.interact() if it were more widely used. Part of Michiel's pain is that users want to make their own decisions on whether to use IDLE or emacs or vt100, and they want to mix and match toolkits. They already run into unexpected freezes because of the event loop conflicts. If every extension writer also relied on their own subclasses of the interactive mode, users would be in for far more unpleasant surprises. The right answer might be to run each event loop in a separate thread. The answer might be to have a python event loop that re-dispatches single events to the other frameworks. The answer might be a way to chain PyOS_InputHook functions like atexit does for shutdown functions. The answer might be something else entirely. But I'm pretty sure that the right answer does not involve adding an additional layer of potential incompatibility. -jJ From fredrik at pythonware.com Fri Nov 11 22:34:19 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 11 Nov 2005 22:34:19 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter References: <fb6fbf560511111244v6bf66043n27674c23670f42e9@mail.gmail.com> Message-ID: <dl32ot$dkm$1@sea.gmane.org> Jim Jewett wrote: > > really? Python comes with a module that makes it trivial to get > > a fully working interpreter console ... > > Using an event loop (or an external GUI) should not require > forking the entire interactive mode, no matter how trivial that > fork is. repeating a bogus argument doesn't make it any better. 
</F> From skip at pobox.com Sat Nov 12 03:32:57 2005 From: skip at pobox.com (skip@pobox.com) Date: Fri, 11 Nov 2005 20:32:57 -0600 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4374DB69.2080804@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> Message-ID: <17269.21593.575449.78938@montanaro.dyndns.org> Michiel> 1) What if the user decides to import Tkinter next? Tkinter Michiel> notices that PyOS_InputHook is already set, and does not Michiel> reset it to its own event loop. Hence, Tkinter's events are Michiel> not handled. Similarly, if a user imports Tkinter before my Michiel> extension module, I don't reset PyOS_InputHook, so Tkinter's Michiel> events are handled but not mine. If I were to reset Michiel> PyOS_InputHook to my extension module's event loop, then my Michiel> events get handled but not Tkinter's. This sounds sort of like the situation that existed with sys.exitfunc before the creation of the atexit module. Can't we develop an API similar to that so that many different event-loop-wanting packages can play nice together? (Then again, maybe I'm just being too simpleminded.) Skip From avi at argo.co.il Thu Nov 10 10:26:05 2005 From: avi at argo.co.il (Avi Kivity) Date: Thu, 10 Nov 2005 11:26:05 +0200 Subject: [Python-Dev] indented longstrings? Message-ID: <4373122D.30507@argo.co.il> Python's longstring facility is very useful, but unhappily breaks indentation. 
I find myself writing code like

    msg = ('From: %s\r\n'
           + 'To: %s\r\n'
           + 'Subject: Host failure report for %s\r\n'
           + 'Date: %s\r\n'
           + '\r\n'
           + '%s\r\n') % (fr, ', '.join(to), host, time.ctime(), err)
    mail.sendmail(fr, to, msg)

instead of

    msg = ('''From: %s
    To: %s
    Subject: Host failure report for %s
    Date: %s

    %s
    ''') % (fr, ', '.join(to), host, time.ctime(), err)
    mail.sendmail(fr, to, msg)

while wishing for a

    msg = i'''From: %s
              To: %s
              Subject: Host failure report for %s
              Date: %s

              %s
              ''' % (fr, ', '.join(to), host, time.ctime(), err)
    mail.sendmail(fr, to, msg.replace('\n', '\r\n'))

isn't it so much prettier?

(((an indented longstring, i''' ... ''' behaves like a regular longstring except that indentation on the lines following the beginning of the longstring is stripped up to the first character position of the longstring on the first line. non-blanks before that character position are a syntax error)))

Avi

From falcon at intercable.ru Thu Nov 10 21:48:14 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Thu, 10 Nov 2005 23:48:14 +0300
Subject: [Python-Dev] (no subject)
Message-ID: <4373B20E.3060002@intercable.ru>

>
>
>Mixing Decimal and float is nearly ALWAYS a user error. Doing it correctly
>requires significant expertise in the peculiarities of floating point
>representations.
>

So I think the user should declare floats explicitly (###.###f) - he will fall into float space only if he wishes it.

>So Python protects the user by throwing exceptions when
>attempts are made to mix Decimal and floats.

I hate it. I want to get a float when I wish to get a float; in that case I would like to write #f. I want to stay with decimals by default (and I want decimals written in C). But it is just the opinion of a young, inexperienced/unpractised man. Excuse my English.

From skip at pobox.com Sat Nov 12 03:56:32 2005
From: skip at pobox.com (skip@pobox.com)
Date: Fri, 11 Nov 2005 20:56:32 -0600
Subject: [Python-Dev] indented longstrings?
In-Reply-To: <4373122D.30507@argo.co.il>
References: <4373122D.30507@argo.co.il>
Message-ID: <17269.23008.424606.403292@montanaro.dyndns.org>

    Avi> Python's longstring facility is very useful, but unhappily breaks
    Avi> indentation. I find myself writing code like

    Avi>     msg = ('From: %s\r\n'
    Avi>            + 'To: %s\r\n'
    Avi>            + 'Subject: Host failure report for %s\r\n'
    Avi>            + 'Date: %s\r\n'
    Avi>            + '\r\n'
    Avi>            + '%s\r\n') % (fr, ', '.join(to), host, time.ctime(), err)
    Avi>     mail.sendmail(fr, to, msg)

This really belongs on comp.lang.python, at least until you've exhausted the existing possibilities and found them lacking. However, try implicit string concatenation:

    msg = ('From: %s\r\n'
           'To: %s\r\n'
           'Subject: Host failure report for %s\r\n'
           'Date: %s\r\n'
           '\r\n'
           '%s\r\n') % (fr, ', '.join(to), host, time.ctime(), err)

or

    msg = ('''\
From: %s
To: %s
Subject: Host failure report for %s
Date: %s

%s
''') % (fr, ', '.join(to), host, time.ctime(), err)

or (untested)

    import re

    def istring(s):
        return re.sub(r"(\r?\n)\s+", r"\1", s)

    msg = """From: %s
    To: %s
    Subject: Host failure report for %s
    Date: %s

    %s
    """
    msg = istring(msg) % (fr, ', '.join(to), host, time.ctime(), err)

Skip

From greg.ewing at canterbury.ac.nz Sat Nov 12 04:12:58 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 Nov 2005 16:12:58 +1300
Subject: [Python-Dev] Weak references: dereference notification
In-Reply-To: <1131624778.4292.22.camel@localhost>
References: <1131536425.9130.10.camel@localhost> <437228E4.4070800@zope.com> <1131556500.9130.18.camel@localhost> <ca471dc20511090923u4ae0d00evf85c2cc8a123a1b5@mail.gmail.com> <1131558739.9130.40.camel@localhost> <9E82C8B1-8A32-457D-827A-F0135EB9F8D3@mac.com> <1131576278.8540.14.camel@localhost.localdomain> <43729B07.6010907@canterbury.ac.nz> <1131624778.4292.22.camel@localhost>
Message-ID: <43755DBA.8070709@canterbury.ac.nz>

Gustavo J. A. M. Carneiro wrote:

> The object isn't really destroyed. Simply ob_refcnt drops to zero,
> then tp_dealloc is called, which is supposed to destroy it.
But since I
> wrote tp_dealloc, I choose not to destroy it,

Be aware that a C subclass of your wrapper that overrides tp_dealloc is going to have its tp_dealloc called before yours, and will therefore be partly destroyed before you get control.

Greg

From bob at redivi.com Sat Nov 12 04:31:34 2005
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 11 Nov 2005 19:31:34 -0800
Subject: [Python-Dev] indented longstrings?
In-Reply-To: <4373122D.30507@argo.co.il>
References: <4373122D.30507@argo.co.il>
Message-ID: <2BF318ED-4FB9-4E7B-A3E2-1AA131E63B16@redivi.com>

On Nov 10, 2005, at 1:26 AM, Avi Kivity wrote:

> Python's longstring facility is very useful, but unhappily breaks
> indentation. I find myself writing code like

http://docs.python.org/lib/module-textwrap.html

-bob

From s.joaopaulo at gmail.com Sat Nov 12 05:23:42 2005
From: s.joaopaulo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Paulo_Silva?=)
Date: Sat, 12 Nov 2005 04:23:42 +0000
Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks
In-Reply-To: <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com>
References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> <43749D65.4040001@desys.de> <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com>
Message-ID: <787073ca0511112023l29794930n@mail.gmail.com>

Hi (first post here, note that English is not my native language),

One thing we shouldn't forget is that Osvaldo is porting Python to a platform that does not have much disk space. He needs Python modules with just the essentials.
I like ideas like the __debug__ opcode, but in Osvaldo's use case there are limits to how much a Python module's size can grow. What is the problem with the interpreter looking for both .pyc and .pyo? I believe zipimport's way of looking up modules is more useful.

--
Até mais..
João Paulo da Silva
LinuxUser #355914
ICQ: 265770691 | Jabber: joaopinga at jabber.org

PS: Guido, sorry the PVT...

From greg.ewing at canterbury.ac.nz Sat Nov 12 05:33:31 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 Nov 2005 17:33:31 +1300
Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter
In-Reply-To: <17269.21593.575449.78938@montanaro.dyndns.org>
References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org>
Message-ID: <4375709B.4010009@canterbury.ac.nz>

skip at pobox.com wrote:

> This sounds sort of like the situation that existed with sys.exitfunc before
> the creation of the atexit module. Can't we develop an API similar to that
> so that many different event-loop-wanting packages can play nice together?

I can't see how that would help. If the different hooks know nothing about each other, there's no way for one to know when to give up control to the next one in the chain.
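One way to reconcile the atexit-style registry idea with this objection is to require that registered hooks never block: a single master hook owns the input wait and simply polls each registrant in turn. A hypothetical sketch — none of these names exist in CPython:

```python
# Hypothetical registry in the spirit of atexit; nothing here is an
# existing CPython API. One master hook owns the input wait, and every
# registered toolkit pump must poll and return immediately.
_input_hooks = []

def register_input_hook(pump):
    """Register a non-blocking 'handle pending events' callable."""
    _input_hooks.append(pump)

def master_input_hook():
    # Installed once (e.g. as PyOS_InputHook) and called repeatedly
    # while the interpreter waits for the next command.
    for pump in list(_input_hooks):
        pump()
    return 0

# Demo: two "toolkits" cooperate without knowing about each other.
calls = []
register_input_hook(lambda: calls.append("tkinter"))
register_input_hook(lambda: calls.append("mytoolkit"))
master_input_hook()
print(calls)  # ['tkinter', 'mytoolkit']
```

The cost remains the busy-polling described earlier in the thread: without a blocking select() over every toolkit's event sources, the master hook has to spin.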
Greg From greg.ewing at canterbury.ac.nz Sat Nov 12 05:39:57 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 12 Nov 2005 17:39:57 +1300 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4374DB69.2080804@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> Message-ID: <4375721D.6040907@canterbury.ac.nz> Michiel Jan Laurens de Hoon wrote: > I have an extension module for scientific visualization. This extension > module opens one or more windows, in which plots can be made. What sort of windows are these? Are you using an existing GUI toolkit, or rolling your own? > For the graphics windows to remain responsive, I need to make sure that > its events get handled. So I need an event loop. How about running your event loop in a separate thread? Greg From skip at pobox.com Sat Nov 12 06:15:03 2005 From: skip at pobox.com (skip@pobox.com) Date: Fri, 11 Nov 2005 23:15:03 -0600 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4375709B.4010009@canterbury.ac.nz> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> Message-ID: <17269.31319.806622.939477@montanaro.dyndns.org> >> This sounds sort of like the situation that existed with sys.exitfunc >> before the creation of the atexit module. 
Can't we develop an API
>> similar to that so that many different event-loop-wanting packages
>> can play nice together?

    Greg> I can't see how that would help. If the different hooks know
    Greg> nothing about each other, there's no way for one to know when to
    Greg> give up control to the next one in the chain.

If I have a Gtk app I have to feed other (socket, callback) pairs to it. It takes care of adding it to the select() call. Python could dictate that the way to play ball is for other packages (Tkinter, PyGtk, wxPython, etc) to feed Python the (socket, callback) pair. Then you have a uniform way to control event-driven applications. Today, a package like Michiel's has no idea what sort of event loop it will encounter. If Python provided the event loop API it would be the same no matter what widget set happened to be used.

The sticking point is probably that a number of such packages presume they will always provide the main event loop and have no way to feed their sockets to another event loop controller. That might present some hurdles for the various package writers/Python wrappers.

Skip

From skip at pobox.com Sat Nov 12 14:21:59 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 12 Nov 2005 07:21:59 -0600
Subject: [Python-Dev] Mapping cvs version numbers to svn revisions?
Message-ID: <17269.60535.243900.974801@montanaro.dyndns.org>

In a bug report I filed, Neal Norwitz referred me to an earlier, fixed, bug report from before the cvs-to-svn switch. The file versions were thus cvs version numbers instead of svn revisions. Is it possible to map from cvs version number to svn? In this particular situation I can fairly easily infer the revision number because I know Neal made the change and roughly where in the given file(s) he was making changes, but I doubt that would always be true. I guess, did cvs2svn save that mapping somewhere?
Thx,

Skip

From p.f.moore at gmail.com Sat Nov 12 16:20:35 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 12 Nov 2005 15:20:35 +0000
Subject: [Python-Dev] Building Python with Visual C++ 2005 Express Edition
In-Reply-To: <4373A4F0.7010202@v.loewis.de>
References: <dl073b$aro$1@sea.gmane.org> <4373A4F0.7010202@v.loewis.de>
Message-ID: <79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com>

On 11/10/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Christos Georgiou wrote:
> > I didn't see any mention of this product in the Python-Dev list, so I
> > thought to let you know.
> >
> > http://msdn.microsoft.com/vstudio/express/visualc/download/
> >
> > There is also a link for a CD image (.img) file to download.
> >
> > I am downloading now, so I don't know yet whether Python compiles with it
> > without any problems. So if anyone has previous experience, please reply.
>
> I don't have previous experience, but I think it likely shares the
> issues that VS.NET 2005 has with the current code:
> 1. the project files are for VS.NET 2003. In theory, conversion to
> the new format is supported, but I don't know whether this conversion
> works flawlessly.
> 2. MS broke ISO C conformance in VS.NET 2005 in a way that affects
> Python's signal handling. There is a patch on SF which addresses
> the issue, but that hasn't been checked in yet.

FWIW, I downloaded Visual C++ 2005 Express edition, and the latest platform SDK, and had a go at building Python trunk. I just followed the build instructions from PCBuild\readme.txt as best I could - of the optional packages, I only got zlib to work. The issues with the other modules may or may not be serious - for example, bzip2 is now at version 1.0.3, and the old source isn't available. Just renaming the directory didn't work, but I didn't bother investigating further. I applied the patch you mentioned, but otherwise left everything unchanged.
The project file conversions seemed to go fine, and the debug builds were OK, although the deprecation warnings for all the "insecure" CRT functions was a pain. It might be worth adding _CRT_SECURE_NO_DEPRECATE to the project defines somehow. I then ran the test suite, which mostly worked. Results: 235 tests OK. 5 tests failed: test_asynchat test_cookie test_grammar test_mmap test_profile 58 tests skipped: test__locale test_aepack test_al test_applesingle test_bsddb185 test_bsddb3 test_cd test_cl test_cmd_line test_code test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp test_codecmaps_kr test_codecmaps_tw test_coding test_commands test_crypt test_curses test_dbm test_dl test_fcntl test_float test_fork1 test_functional test_gdbm test_gl test_grp test_hashlib test_hashlib_speed test_imgfile test_ioctl test_largefile test_linuxaudiodev test_macfs test_macostools test_mhlib test_nis test_normalization test_openpty test_ossaudiodev test_plistlib test_poll test_posix test_pty test_pwd test_resource test_scriptpackages test_signal test_socket_ssl test_socketserver test_sunaudiodev test_threadsignals test_timeout test_timing test_urllib2net test_urllibnet test_xdrlib 7 skips unexpected on win32: test_hashlib test_cmd_line test_xdrlib test_code test_float test_coding test_functional I'm not sure what to make of the "unexpected" skips... The output for the failed tests was: test_asynchat test test_asynchat produced unexpected output: ********************************************************************** *** lines 2-3 of actual output doesn't appear in expected output after line 1: + Connected + Received: 'hello world' ********************************************************************** test_cookie test test_cookie produced unexpected output: ********************************************************************** *** mismatch between lines 3-4 of expected output and lines 3-4 of actual output: - Set-Cookie: chips=ahoy + Set-Cookie: chips=ahoy; ? 
+ - Set-Cookie: vienna=finger + Set-Cookie: vienna=finger; ? + *** mismatch between line 6 of expected output and line 6 of actual output: - Set-Cookie: chips=ahoy + Set-Cookie: chips=ahoy; ? + *** mismatch between line 8 of expected output and line 8 of actual output: - Set-Cookie: vienna=finger + Set-Cookie: vienna=finger; ? + *** mismatch between line 10 of expected output and line 10 of actual output: - Set-Cookie: keebler="E=mc2; L=\"Loves\"; fudge=\012;" + Set-Cookie: keebler="E=mc2; L=\"Loves\"; fudge=\012;"; ? + *** mismatch between line 12 of expected output and line 12 of actual output: - Set-Cookie: keebler="E=mc2; L=\"Loves\"; fudge=\012;" + Set-Cookie: keebler="E=mc2; L=\"Loves\"; fudge=\012;"; ? + *** mismatch between line 14 of expected output and line 14 of actual output: - Set-Cookie: keebler=E=mc2 + Set-Cookie: keebler=E=mc2; ? + *** mismatch between lines 16-17 of expected output and lines 16-17 of actual output: - Set-Cookie: keebler=E=mc2 + Set-Cookie: keebler=E=mc2; ? + - Set-Cookie: Customer="WILE_E_COYOTE"; Path=/acme + Set-Cookie: Customer="WILE_E_COYOTE"; Path=/acme; ? + *** mismatch between line 19 of expected output and line 19 of actual output: - <script type="text/javascript"> + <SCRIPT LANGUAGE="JavaScript"> *** mismatch between line 21 of expected output and line 21 of actual output: - document.cookie = "Customer="WILE_E_COYOTE"; Path=/acme; Version=1"; ? - + document.cookie = "Customer="WILE_E_COYOTE"; Path=/acme; Version=1;" ? + *** mismatch between line 26 of expected output and line 26 of actual output: - <script type="text/javascript"> + <SCRIPT LANGUAGE="JavaScript"> *** mismatch between line 28 of expected output and line 28 of actual output: - document.cookie = "Customer="WILE_E_COYOTE"; Path=/acme"; ? - + document.cookie = "Customer="WILE_E_COYOTE"; Path=/acme;" ? 
+ ********************************************************************** test_grammar test test_grammar produced unexpected output: ********************************************************************** *** line 37 of expected output missing: - yield_stmt ********************************************************************** test_mmap test test_mmap produced unexpected output: ********************************************************************** *** lines 34-35 of expected output missing: - Ensuring that passing 0 as map length sets map size to current file size. - Ensuring that passing 0 as map length sets map size to current file size. ********************************************************************** test_profile test test_profile produced unexpected output: ********************************************************************** *** mismatch between line 10 of expected output and line 10 of actual output: - 1 0.000 0.000 1.000 1.000 <string>:1(<module>) ? ^^^^^^^^ + 1 0.000 0.000 1.000 1.000 <string>:1(?) ? ^ ********************************************************************** I don't have time to investigate much further, and not all of these look to be VC 2005 Express issues (for example, the test_profile and test_cookie errors look like code issues rather than compiler ones), but I don't have an alternative compiler to check. I hope this is of some use - it would be brilliant if VC 2005 Express could be a supported build environment. (Of course, MS have updated the CRT again, so binaries built with VC 2005 Express aren't binary compatible with extensions built for the standard release... :-( ) Regards, Paul. 
From martin at v.loewis.de Sat Nov 12 19:02:33 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 12 Nov 2005 19:02:33 +0100 Subject: [Python-Dev] Building Python with Visual C++ 2005 Express Edition In-Reply-To: <79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com> References: <dl073b$aro$1@sea.gmane.org> <4373A4F0.7010202@v.loewis.de> <79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com> Message-ID: <43762E39.3020005@v.loewis.de> Paul Moore wrote: > I hope this is of some use - it would be brilliant if VC 2005 Express > could be a supported build environment. (Of course, MS have updated > the CRT again, so binaries built with VC 2005 Express aren't binary > compatible with extensions built for the standard release... :-( ) It is not really practical to support two build environments fully; as MS changed the format of the project files again, one would have to maintain two sets of project files (actually, it would be three sets, as we keep the VC6 files as well). So really having the VS2005 files in subversion isn't an option; trying to make conversion go smoothly all the time certainly is a desirable goal. Using VS2005 for official builds would only be an option with the next major release (2.5), and I personally don't see that happening: AFAICT, it is not as much of a change as VS2003 was (i.e. for Python, nothing is gained AFAICT); also, I'm getting the impression that VS2005 has too many bugs (*) to be useful, so I recommend skipping that release completely, and going to VS2006 (or whenever that is released). Regards, Martin (*) besides the really sad changes in the CRT which break ISO C compliance, pre-release versions of the IDE were really unstable. That might have improved for the release, of course. In addition, I'm aware of various problems with .NET 2.0; something that doesn't affect Python too much, though. 
From martin at v.loewis.de Sat Nov 12 19:10:08 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 12 Nov 2005 19:10:08 +0100 Subject: [Python-Dev] Mapping cvs version numbers to svn revisions? In-Reply-To: <17269.60535.243900.974801@montanaro.dyndns.org> References: <17269.60535.243900.974801@montanaro.dyndns.org> Message-ID: <43763000.1000500@v.loewis.de> skip at pobox.com wrote: > In a bug report I filed Neal Norwitz referred me to an earlier, fixed, bug > report from before the cvs-to-svn switch. The file versions were thus cvs > version numbers instead of svn revisions. Is it possible to map from cvs > version number to svn? It would have been possible in the process of using cvs2svn, which could have generated subversion properties to collect the CVS revision numbers. I decided against doing so, as this will become less important over time, and I was uncertain if we would still have to carry those properties around on the trunk forever. I also expected that in most cases, it should be easy to find the relationship from the commit messages. Also, nobody requested that feature in the test installation. If somebody wants to come up with something (e.g. 
rerunning the conversion, only to create some kind of mapping file): the tarball that was used to do the conversion is at http://svn.python.org/snapshots/python-cvsroot-final.tar.bz2 Regards, Martin From noamraph at gmail.com Sat Nov 12 20:06:59 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sat, 12 Nov 2005 21:06:59 +0200 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <17269.31319.806622.939477@montanaro.dyndns.org> References: <437100A7.5050907@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> Message-ID: <b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com> On 11/12/05, skip at pobox.com <skip at pobox.com> wrote: > If I have a Gtk app I have to feed other (socket, callback) pairs to it. It > takes care of adding it to the select() call. Python could dictate that the > way to play ball is for other packages (Tkinter, PyGtk, wxPython, etc) to > feed Python the (socket, callback) pair. Then you have a uniform way to > control event-driven applications. Today, a package like Michiel's has no > idea what sort of event loop it will encounter. If Python provided the > event loop API it would be the same no matter what widget set happened to be > used. > > The sticking point is probably that a number of such packages presume they > will always provide the main event loop and have no way to feed their > sockets to another event loop controller. That might present some hurdles > for the various package writers/Python wrappers. > I think that in order to solve Michiels' problem, there's no need for something like that, since probably neither of the "loops" are listening to sockets. 
Currently, Tkinter sets PyOS_InputHook to call its "dooneevent" repeatedly while Python code isn't being executed. It turns out to work excellently. All that is needed to make Tkinter and Michiels' code run together is a way to say "add this callback to the input hook" instead of the current "replace the current input hook with this callback". Then, when the interpreter is idle, it will call all the registered callbacks, one at a time, and everyone would be happy. To make this work with IDLE, or other interactive shells written in Python, you need to expose a function which will run all the registered callbacks. Then IDLE can call that function repeatedly when it's idle, and you'll get the same behaviour you have in the regular interactive shell. Specifically for IDLE, I know where that place is - since there's no way to generally invoke the input hook, I wrote a patch that calls _tkinter.dooneevent(_tkinter.DONT_WAIT) in the right place, and it works fine. Concerning threads - please don't. The "do one event at a time while the interpreter is idle" method works fine. Most programs aren't designed to be thread-safe, and since Tkinter does many callbacks to Python functions, you'll get unexpected behaviour if it's on another thread. I hope I made myself clear. This solution is simple, and works whenever a "do one event" function is available. Have a good week, Noam From martin at v.loewis.de Sat Nov 12 20:17:25 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 12 Nov 2005 20:17:25 +0100 Subject: [Python-Dev] Checking working copy consistency Message-ID: <43763FC5.8070307@v.loewis.de> Hi Skip, I made a script that runs through a subversion sandbox and checks whether all md5sums are correct. Please run that on your working copy to see whether there are still any inconsistent files. Regards, Martin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: svncheck.py Type: text/x-python Size: 649 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051112/ab1cb332/svncheck.py From fperez.net at gmail.com Sat Nov 12 20:46:10 2005 From: fperez.net at gmail.com (Fernando Perez) Date: Sat, 12 Nov 2005 12:46:10 -0700 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter - Summary attempt References: <fb6fbf560511110732t3dd0e530v4ffb1fec8acce3a0@mail.gmail.com> Message-ID: <dl5gq9$scn$1@sea.gmane.org> Jim Jewett wrote: > (6) Mark Hammond suggests that it might be easier to > replace the interactive portions of python based on the > "code" module. matplotlib suggests using ipython > instead of standard python for similar reasons. > > If that is really the simplest answer (and telling users > which IDE to use is acceptable), then ... I think Michiel > has a point. I don't claim to understand all the low-level details of this discussion, by a very long shot. But as the author of ipython, at least I'll mention what ipython does to help in this problem. Whether that is a satisfactory solution for everyone or not, I won't get into. For starters, ipython is an extension of the code.InteractiveConsole class, even though by now I've changed so much that I could probably just stop using any inheritance at all. But this is just to put ipython in the context of the stdlib. When I started using matplotlib, I wanted to be able to run my code interactively and get good plotting, as much as I used to have with Gnuplot before (IPython ships with extended Gnuplot support beyond what the default Gnuplot.py module provides). With help from John Hunter (matplotlib - mpl for short - author), we were able to add support for ipython to happily coexist with matplotlib when either the GTK or the WX backends were used. mpl can plot to Tk, GTK, WX, Qt and FLTK; Tk worked out of the box (because of the Tkinter event loop integration in Python), and with our hacks we got GTK and WX to work. 
Earlier this year, with the help of a few very knowledgeable Qt developers, we extended the same ideas to add support for Qt as well. As part of this effort, ipython can generically (meaning, outside of matplotlib) support interactive non-blocking control of WX, GTK and Qt apps; you get that by starting it with ipython -wthread/-gthread/-qthread The details of how this works are slightly different for each toolkit, but the overall approach is the same for all. We just register with each toolkit's idle/timer system a callback to execute pending code which is waiting in what is essentially a one-entry queue. I have a private branch where I'm adding similar support for OpenGL windows using the GLUT idle function, though it's not ready for release yet. So far this has worked quite well. If anyone wants to see the details, the relevant code is here: http://projects.scipy.org/ipython/ipython/file/ipython/trunk/IPython/Shell.py It may not be perfect, and it may well be the wrong approach. If so, I'll be glad to learn how to do it better: I know very little about threading and I got this to work more or less by stumbling in the dark. In particular, one thing that definitely does NOT work is mixing TWO GUI toolkits together. There is a hack (the -tk option) to try to allow mixing of ONE of Qt/WX/GTK with Tk, but it has only ever worked on Debian, and we don't really understand why. I think it's some obscure combination of how the low-level threading support for many different libraries is compiled in Debian. As far as using IDLE/Emacs/whatever (I use Xemacs personally for my own editing), our approach has been to simply tell people that the _interactive shell_ should be ipython always. They can use anything they want to edit their code with, but they should execute their scripts with ipython. ipython has a %run command which allows code execution with a ton of extra control, so the work cycle with ipython is more or less: 1. 
open the editor you like to use with your foo.py code. Hack on foo.py 2. whenever you wish to test your code, save foo.py 3. switch to the ipython window, and type 'run foo'. Play with the results interactively (the foo namespace updates the interactive one after completion). 4. rinse, repeat. In the matplotlib/scipy mailing lists we've more or less settled on 'this is what we support. If you don't like it, go write your own'. It may not be perfect, but it works reasonably for us (I use this system 10 hours a day in scientific production work, and so does John, so we do eat our own dog food). Given that ipython is trivial to install (it's pure python code with no extra dependencies under *nix and very few under win32), and that it provides so much additional functionality on top of the default interactive interpreter, we've had no complaints so far. OK, I hope this information is useful to some of you. Feel free to contact me if you have any questions (I will monitor the thread, but I follow py-dev on gmane, so I do miss things sometimes). Cheers, f From noamraph at gmail.com Sat Nov 12 20:52:32 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sat, 12 Nov 2005 21:52:32 +0200 Subject: [Python-Dev] str.dedent In-Reply-To: <ca471dc2050914161070f1f425@mail.gmail.com> References: <dga72k$cah$1@sea.gmane.org> <ca471dc2050914161070f1f425@mail.gmail.com> Message-ID: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> Following Avi's suggestion, can I raise this thread up again? I think that Reinhold's .dedent() method can be a good idea after all. The idea is to add a method called "dedent" to strings. It would do exactly what the current textwrap.dedent function does. 
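For reference, the behaviour under discussion is small enough to sketch in pure Python. This is a simplified illustration of what textwrap.dedent does, not the actual stdlib code:

```python
def dedent(s):
    """Remove the common leading whitespace from every line of s,
    ignoring lines that contain nothing but whitespace."""
    lines = s.expandtabs().split('\n')
    margin = None  # width of the smallest indent seen so far
    for line in lines:
        stripped = line.lstrip()
        if not stripped:
            continue  # whitespace-only lines don't set the margin
        indent = len(line) - len(stripped)
        margin = indent if margin is None else min(margin, indent)
    if not margin:
        return '\n'.join(lines)
    # Strip the common margin; leave whitespace-only lines alone.
    return '\n'.join(line[margin:] if line.strip() else line
                     for line in lines)
```

So, for example, `dedent('    line1\n      line2')` gives `'line1\n  line2'`: the four-space margin common to both lines is removed, and the extra two spaces on the second line survive.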
The motivation is to be able to write multilined strings easily without damaging the visual indentation of the source code, like this: def foo(): msg = '''\ From: %s To: %s\r\n' Subject: Host failure report for %s Date: %s %s '''.dedent() % (fr, ', '.join(to), host, time.ctime(), err) Writing multilined strings without spaces in the beginning of lines makes functions harder to read, since although the Python parser is happy with it, it breaks the visual indentation. On 9/15/05, Guido van Rossum <guido at python.org> wrote: > From the sound of it, it's probably not worth endowing every string > object with this method and hardcoding its implementation forever in C > code. There are so many corner cases and variations on the > functionality of "dedenting" a block that it's better to keep it as > Python source code. I've looked at the textwrap.dedent() function, and it's really simple and well defined: Given a string s, take s.expandtabs().split('\n'). Take the minimal number of whitespace chars at the beginning of each line (not counting lines with nothing but whitespaces), and remove it from each line. This means that the Python source code is simple, and there would be no problems to write it in C. On 9/15/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote: > > -1 > > Let it continue to live in textwrap where the existing pure python code > adequately serves all string-like objects. It's not worth losing the > duck typing by attaching new methods to str, unicode, UserString, and > everything else aspiring to be string-like. > > String methods should be limited to generic string manipulations. > String applications should be in other namespaces. That is why we don't > have str.md5(), str.crc32(), str.ziplib(), etc. > > Also, I don't want to encourage dedenting as a way of life --- programs > using it often are likely to be doing things the hard way. 
> I think that the difference between "dedent" and "md5", "crc32" and such is the fact that making "dedent" a method helps writing code that is easier to read. Strings already have a lot of methods which don't make code clearer the way "dedent" will, such as center, capitalize, expandtabs, and many others. I think that given these, there's no reason not to add "dedent" as a string method. Noam From raymond.hettinger at verizon.net Sat Nov 12 21:18:02 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 12 Nov 2005 15:18:02 -0500 Subject: [Python-Dev] str.dedent In-Reply-To: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> Message-ID: <000001c5e7c6$2f959440$2523c797@oemcomputer> > The motivation > is to be able to write multilined strings easily without damaging the > visual indentation of the source code That is somewhat misleading. We already have that ability. What is being proposed is moving existing code to a different namespace. So the motivation is really something like: I want to write s = s.dedent() because it is too painful to write s = textwrap.dedent(s) Raymond From tcdelaney at optusnet.com.au Sat Nov 12 21:32:01 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Sun, 13 Nov 2005 07:32:01 +1100 Subject: [Python-Dev] Building Python with Visual C++ 2005 ExpressEdition References: <dl073b$aro$1@sea.gmane.org> <4373A4F0.7010202@v.loewis.de><79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com> <43762E39.3020005@v.loewis.de> Message-ID: <001401c5e7c8$2333cd00$0201a8c0@ryoko> "Martin v. Löwis" wrote: > Using VS2005 for official builds would only be an option with the > next major release (2.5), and I personally don't see that happening: > AFAICT, it is not that much of a change as VS2003 was (i.e. 
for > Python, nothing is gained AFAICT); also, I'm getting the impression > that VS2005 has too many bugs (*) to be useful, so I recommend to > skip that release completely, and go then to VS2006 (or whenever > that is release). With Microsoft changing the CRT all the time, I think I'd much prefer seeing effort going towards MinGW becoming the official Windows build platform. There was a considerable amount of angst with the 2.4 release that can be blamed solely on the CRT change (and hence different DLLs to link to). And with them deprecating ISO standard functions ... Tim Delaney From Scott.Daniels at Acm.Org Sat Nov 12 22:49:03 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat, 12 Nov 2005 13:49:03 -0800 Subject: [Python-Dev] to_int -- oops, one step missing for use. In-Reply-To: <4372F68B.5050106@Acm.Org> References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <4372B377.6050806@Acm.Org> <4372F68B.5050106@Acm.Org> Message-ID: <dl5o07$d6g$1@sea.gmane.org> OK, Tim and I corresponded off-list (after that Aahz graciously suggested direct mail). The code I have is still stand-alone, but I get good speeds: 60% - 70% of the speed of int(string, base). It will take a little bit to figure out how it best belongs in the Python sources, so don't look for anything for a couple of weeks. Also, I'd appreciate someone testing the code on a 64-bit machine. Essentially all I'll need is that person to do a build and then run the tests. Unfortunately the posted module tests only cover the 32-bit long cases, so what I need is another test tried on a 64-bit long machine (that uses 64-bit longs in Python). So, if you have a Python installation where sys.maxint == (1 << 63) - 1 is True, and you'd like to help, here's what I need. 
If you already have the zip, retrieve: http://members.dsl-only.net/~daniels/dist/test_hi_powers.py If you don't already have the zip, retrieve: http://members.dsl-only.net/~daniels/dist/to_int-0.10.zip (I just added the test_hi_powers.py to the tests in the zip) Unpack the zip, do the build: $ python setup_x.py build copy the built module into the test directory, cd to that dir, and run test_hi_powers.py. Let me know if the tests pass or fail. Thanks. --Scott David Daniels Scott.Daniels at Acm.Org From martin at v.loewis.de Sat Nov 12 22:53:28 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 12 Nov 2005 22:53:28 +0100 Subject: [Python-Dev] Building Python with Visual C++ 2005 ExpressEdition In-Reply-To: <001401c5e7c8$2333cd00$0201a8c0@ryoko> References: <dl073b$aro$1@sea.gmane.org> <4373A4F0.7010202@v.loewis.de><79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com> <43762E39.3020005@v.loewis.de> <001401c5e7c8$2333cd00$0201a8c0@ryoko> Message-ID: <43766458.4040803@v.loewis.de> Tim Delaney wrote: > With Microsoft changing the CRT all the time, I think I'd much prefer seeing > effort going towards MinGW becoming the official Windows build platform. > There was a considerable amount of angst with the 2.4 release that can be > blamed solely on the CRT change (and hence different DLLs to link to). And > with them deprecating ISO standard functions ... The problem (for me, at least) is that VC is so much more convenient to work with. That said, I would personally use what other people contribute (and perhaps only invoke the build process for the actual packaging). So for this to happen, somebody would have to step forward and volunteer as the "windows port maintainer" for the coming years; starting with the changes to the build process. This may be more tricky than it sounds at first: a strategy for building the libraries that we include (such as gzip, openssl, Tcl/Tk) would be needed as well. 
Plus, that person would have to defend the decision to drop VC (just as I am in the position of defending the switch to VS 2003). Regards, Martin From noamraph at gmail.com Sat Nov 12 23:24:08 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 13 Nov 2005 00:24:08 +0200 Subject: [Python-Dev] str.dedent In-Reply-To: <000001c5e7c6$2f959440$2523c797@oemcomputer> References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <000001c5e7c6$2f959440$2523c797@oemcomputer> Message-ID: <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> On 11/12/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote: > > The motivation > > is to be able to write multilined strings easily without damaging the > > visual indentation of the source code > > That is somewhat misleading. We already have that ability. What is > being proposed is moving existing code to a different namespace. So the > motivation is really something like: > > I want to write > s = s.dedent() > because it is too painful to write > s = textwrap.dedent(s) > Sorry, I didn't mean to mislead. I wrote "easily" - I guess using the current textwrap.dedent isn't really hard, but still, writing: import textwrap ... r = some_func(textwrap.dedent('''\ line1 line2''')) Seems harder to me than simply r = some_func('''\ line1 line2'''.dedent()) This example brings up another reason why "dedent" as a method is a good idea: It is a common convention to indent things according to the last opening bracket. "dedent" as a function makes the indentation grow by at least 7 characters, and by 16 characters if you don't do "from textwrap import dedent". Another reason to make it a method is that I think it focuses attention on the string, which comes first, instead of on the "textwrap.dedent", which is only there to make the code look nicer. 
And, a last reason: making dedent a built-in method makes it a more "official" way of doing things, and I think that this way of writing a multilined string inside an indented block is really the best way to do it. Noam From skip at pobox.com Sun Nov 13 00:45:40 2005 From: skip at pobox.com (skip@pobox.com) Date: Sat, 12 Nov 2005 17:45:40 -0600 Subject: [Python-Dev] Checking working copy consistency In-Reply-To: <43763FC5.8070307@v.loewis.de> References: <43763FC5.8070307@v.loewis.de> Message-ID: <17270.32420.240635.71017@montanaro.dyndns.org> Martin> I made a script that runs through a subversion sandbox and Martin> checks whether all md5sums are correct. Please run that on your Martin> working copy to see whether there are still any inconsistent Martin> files. Thanks Martin. I got no complaints (trunk, release23-maint, release24-maint, peps), though see my next message... Skip From skip at pobox.com Sun Nov 13 00:48:17 2005 From: skip at pobox.com (skip@pobox.com) Date: Sat, 12 Nov 2005 17:48:17 -0600 Subject: [Python-Dev] Is some magic required to check out new files from svn? Message-ID: <17270.32577.193894.694593@montanaro.dyndns.org> Is there some magic required to check out new files from the repository? I'm trying to build on the trunk and am getting compilation errors about code.h not being found. If I remember correctly, this is a new file brought over from the ast branch. Using cvs I would have executed something like "cvs up -dPA ." if I found I was missing something (usually a new directory) and wanted to make sure I was in sync with the trunk. I read the developer's FAQ and the output of "svn up --help". Executing "svn up" or "svn info" tells me I'm already at rev 41430, which is the latest rev, right? Creating a fresh build subdirectory followed by configure and make gives me this error: ../Objects/frameobject.c:6:18: code.h: No such file or directory Sure enough, I have no code.h in my Include directory. 
Before I wipe out Include and svn up again is there any debugging I can do for someone smarter in the ways of Subversion than me? Regarding my checksum problems (which are not appearing at the moment), Martin asked for 1. what specific revision you had checked out (svn info) 2. what the recorded checksum is (see .svn/entries) 3. what the commited-rev is 4. what the actual checksum is on the file on disk (.svn/text-base/filename.base) 5. whether or not the checksums svn reports match the ones you determined yourself. I don't think #2, #4 or #5 apply here. According to .svn/entries I have: <entry committed-rev="41430" name="" committed-date="2005-11-12T15:55:04.419664Z" url="svn+ssh://pythondev at svn.python.org/python/trunk" last-author="fredrik.lundh" kind="dir" uuid="6015fed2-1504-0410-9fe1-9d1591cc4771" prop-time="2005-11-12T18:00:07.000000Z" revision="41430"/> Here's "svn info" output: Path: . URL: svn+ssh://pythondev at svn.python.org/python/trunk Repository UUID: 6015fed2-1504-0410-9fe1-9d1591cc4771 Revision: 41430 Node Kind: directory Schedule: normal Last Changed Author: fredrik.lundh Last Changed Rev: 41430 Last Changed Date: 2005-11-12 09:55:04 -0600 (Sat, 12 Nov 2005) Properties Last Updated: 2005-11-12 12:00:07 -0600 (Sat, 12 Nov 2005) I was running 1.2.0. I just downloaded and built 1.2.3. It made no difference. This is getting kinda frustrating. I haven't got a lot of confidence in Subversion at this point. 
Skip From greg.ewing at canterbury.ac.nz Sun Nov 13 00:40:23 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 13 Nov 2005 12:40:23 +1300 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <17269.31319.806622.939477@montanaro.dyndns.org> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> Message-ID: <43767D67.7070307@canterbury.ac.nz> skip at pobox.com wrote: > Python could dictate that the > way to play ball is for other packages (Tkinter, PyGtk, wxPython, etc) to > feed Python the (socket, callback) pair. Then you have a uniform way to > control event-driven applications. Certainly, if all other event-driven packages are willing to change their ways, they can be made to work together. There's not much that can be done with them the way they are, however. Also, putting the main event loop in Python then gives Python itself a privileged position that it shouldn't necessarily have. Ultimately I think there needs to be an event dispatching mechanism provided by the OS, that is universally used by all packages that want events. With the proliferation of event-driven systems these days, it's becoming as fundamental a requirement as file I/O and deserves serious OS support, I think. 
Greg From greg.ewing at canterbury.ac.nz Sun Nov 13 00:50:00 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 13 Nov 2005 12:50:00 +1300 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com> References: <437100A7.5050907@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com> Message-ID: <43767FA8.7090209@canterbury.ac.nz> Noam Raphael wrote: > All that is needed to make Tkinter and Michiels' > code run together is a way to say "add this callback to the input > hook" instead of the current "replace the current input hook with this > callback". Then, when the interpreter is idle, it will call all the > registered callbacks, one at a time, and everyone would be happy. Except for those who don't like busy waiting. 
Greg From noamraph at gmail.com Sun Nov 13 01:30:37 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 13 Nov 2005 02:30:37 +0200 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <43767FA8.7090209@canterbury.ac.nz> References: <437100A7.5050907@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com> <43767FA8.7090209@canterbury.ac.nz> Message-ID: <b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com> On 11/13/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > Noam Raphael wrote: > > > All that is needed to make Tkinter and Michiels' > > code run together is a way to say "add this callback to the input > > hook" instead of the current "replace the current input hook with this > > callback". Then, when the interpreter is idle, it will call all the > > registered callbacks, one at a time, and everyone would be happy. > > Except for those who don't like busy waiting. > I'm not sure I understand what you meant. If you meant that it will work slowly - a lot of people (including me) are using Tkinter without a mainloop from the interactive shell, and don't feel the difference. It uses exactly the method I described. Noam From jepler at unpythonic.net Sun Nov 13 01:35:09 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Sat, 12 Nov 2005 18:35:09 -0600 Subject: [Python-Dev] to_int -- oops, one step missing for use. 
In-Reply-To: <dl5o07$d6g$1@sea.gmane.org> References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <4372B377.6050806@Acm.Org> <4372F68B.5050106@Acm.Org> <dl5o07$d6g$1@sea.gmane.org> Message-ID: <20051113003509.GC27610@unpythonic.net> $ python2.4 -c 'import sys; print sys.maxint, sys.maxint == (1<<63) - 1' 9223372036854775807 True $ python2.4 test_hi_powers.py Test 0.2 of to_int 0.16 ...................................................................... ---------------------------------------------------------------------- Ran 70 tests in 0.006s OK -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051112/d41bbe5a/attachment.pgp From ianb at colorstudy.com Sun Nov 13 03:00:42 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 12 Nov 2005 20:00:42 -0600 Subject: [Python-Dev] str.dedent In-Reply-To: <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <000001c5e7c6$2f959440$2523c797@oemcomputer> <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> Message-ID: <43769E4A.5040408@colorstudy.com> Noam Raphael wrote: > Sorry, I didn't mean to mislead. I wrote "easily" - I guess using the > current textwrap.dedent isn't really hard, but still, writing: > > import textwrap > ... > > r = some_func(textwrap.dedent('''\ > line1 > line2''')) > > Seems harder to me than simply > > r = some_func('''\ > line1 > line2'''.dedent()) I think a better argument for this is that dedenting a literal string is more of a syntactic operation than a functional one. You don't think "oh, I bet I'll need to do some dedenting on line 200 of this module, I better import textwrap". Instead you start writing a long string literal once you get to line 200. 
You can do it a few ways: some_func("line1\nline2") some_func("line1\n" "line2") some_func("""\ line1 line2""") # If nice whitespace would be pretty but not required: some_func(""" line1 line2""") I often do that last one with HTML and SQL. In practice textwrap.dedent() isn't one of the ways you are likely to write this statement. At least I've never done it that way (and I hit the issue often), and I don't think I've seen code that has used that in this circumstance. Additionally I don't think textwrapping has anything particular to do with dedenting, except perhaps that both functions were required when that module was added. I guess I just find the import cruft at the top of my files a little annoying, and managing them rather tedious, so saying that you should import textwrap because it makes a statement deep in the file look a little prettier is unrealistic. At the same time, the forms that don't use it are rather ugly or sloppy. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From mhammond at skippinet.com.au Sun Nov 13 04:15:25 2005 From: mhammond at skippinet.com.au (Mark Hammond) Date: Sun, 13 Nov 2005 14:15:25 +1100 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <43749D65.4040001@desys.de> Message-ID: <DAELJHBGPBHPJKEBGGLNCEEKIDAD.mhammond@skippinet.com.au> > release. The main reason why I changed the import behavior was > pythonservice.exe from the win32 extensions. pythonservice.exe imports > the module that contains the service class, but because > pythonservice.exe doesn't run in optimized mode, it will only import a > .py or a .pyc file, not a .pyo file. Because we always generate bytecode > with -OO at distribution time, we either had to change the behavior of > pythonservice.exe or change the import behavior of Python. 
While ignoring the question of how Python should in the future handle optimizations, I think it safe to state that pythonservice.exe should have the same basic functionality and operation in this regard as python.exe does. It doesn't sound too difficult to modify pythonservice to accept -O flags, and to modify the service installation process to allow this flag to be specified. I'd certainly welcome any such patches. Although getting off-topic for this list, note that for recent pywin32 releases, it is possible to host a service using python.exe directly, and this is the technique py2exe uses to host service executables. It would take a little more work to set things up to work like that, but that's probably not too unreasonable for a custom application with specialized distribution requirements. Using python.exe obviously means you get full access to the command-line facilities it provides. So while I believe your idea for getting and setting these flags sounds reasonable, and also believe that at face value the zipimport semantics appear sane, I'm not sure we should use a weakness in a Python tool to justify a change to Python itself. Mark From nico at tekNico.net Sun Nov 13 10:05:58 2005 From: nico at tekNico.net (Nicola Larosa) Date: Sun, 13 Nov 2005 10:05:58 +0100 Subject: [Python-Dev] OT pet peeve (was: Re: str.dedent) In-Reply-To: <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <000001c5e7c6$2f959440$2523c797@oemcomputer> <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> Message-ID: <dl6vlo$nl2$1@sea.gmane.org> > Sorry, I didn't mean to mislead. I wrote "easily" - I guess using the > current textwrap.dedent isn't really hard, but still, writing: > > import textwrap > ....
> > r = some_func(textwrap.dedent('''\ > line1 > line2''')) > > Seems harder to me than simply > > r = some_func('''\ > line1 > line2'''.dedent()) > > This example brings up another reason why "dedent" as a method is a > good idea: It is a common convention to indent things according to the > last opening bracket. "dedent" as a function makes the indentation > grow by at least 7 characters, and by 16 characters if you don't do > "from textwrap import dedent". It's a common convention, but a rather ugly one. It makes it harder to break lines at 78-80 chars, and to use long enough identifiers. I find it more useful to go straight to the next line, indenting the usual four spaces (and also separating nested stuff): r = some_func( textwrap.dedent( '''\ line1 line2''')) This style uses up more vertical space, but I find it also gives code a clearer overall shape. -- Nicola Larosa - nico at tekNico.net Use of threads can be very deceptive. [...] in almost all cases they also make debugging, testing, and maintenance vastly more difficult and sometimes impossible. http://java.sun.com/products/jfc/tsc/articles/threads/threads1.html#why From fredrik at pythonware.com Sun Nov 13 10:11:23 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 13 Nov 2005 10:11:23 +0100 Subject: [Python-Dev] Is some magic required to check out new files from svn? References: <17270.32577.193894.694593@montanaro.dyndns.org> Message-ID: <dl6vvl$o9c$1@sea.gmane.org> skip at pobox.com wrote: > I read the developer's FAQ and the output of "svn up --help". Executing > "svn up" or "svn info" tells me I'm already at rev 41430, which is the > latest rev, right? Creating a fresh build subdirectory followed by > configure and make gives me this error: > > ../Objects/frameobject.c:6:18: code.h: No such file or directory > > Sure enough, I have no code.h in my Include directory. what does svn status Include/code.h say? if it says ! Include/code.h what happens if you do svn revert Include/code.h ?
doing a full svn status and looking for ! entries will tell you if more files are missing. </F> From martin at v.loewis.de Sun Nov 13 10:35:15 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 13 Nov 2005 10:35:15 +0100 Subject: [Python-Dev] Is some magic required to check out new files from svn? In-Reply-To: <17270.32577.193894.694593@montanaro.dyndns.org> References: <17270.32577.193894.694593@montanaro.dyndns.org> Message-ID: <437708D3.8010406@v.loewis.de> skip at pobox.com wrote: > Is there some magic required to check out new files from the repository? > I'm trying to build on the trunk and am getting compilation errors about > code.h not being found. If I remember correctly, this is a new file brought > over from the ast branch. Using cvs I would have executed something like > "cvs up -dPA ." if I found I was missing something (usually a new directory) > and wanted to make sure I was in sync with the trunk. code.h should live in Include. It was originally committed to CVS, so it is in the subversion repository from day one; it should always have been there since you started using subversion. Do you have code.h mentioned in Include/.svn/entries? > This is getting kinda frustrating. I haven't got a lot of confidence in > Subversion at this point. I can understand that. However, you should get confidence from the fact that nobody else is seeing these problems :-) I recommend to use pre-built binaries, e.g. the ones from http://metissian.com/projects/macosx/subversion/ I would also recommend to throw away the sandbox completely and check it out from scratch. Please report whether this gives you code.h.
Regards, Martin From krumms at gmail.com Sun Nov 13 11:34:44 2005 From: krumms at gmail.com (Thomas Lee) Date: Sun, 13 Nov 2005 20:34:44 +1000 Subject: [Python-Dev] Implementation of PEP 341 Message-ID: <437716C4.8050309@gmail.com> Hi all, I've been using Python for a few years and, as of a few days ago, finally decided to put the effort into contributing code back to the project. I'm attempting to implement PEP 341 (unification of try/except and try/finally) against HEAD. However, this being my first attempt at a change to the syntax there's been a bit of a learning curve. I've modified Grammar/Grammar to use the new try_stmt grammar, updated Parser/Python.asdl to accept a stmt* finalbody for TryExcept instances and modified Python/ast.c to handle the changes to Python.asdl - generating an AST for the finalbody. All that remains as far as I can see is to modify Python/compile.c to generate the necessary code and update Modules/parsermodule.c to accommodate the changes to the grammar. (If anybody has further input as to what needs to be done here, I'm all ears!) The difficulty I'm having is in Python/compile.c: currently there are two functions which generate the code for the two existing try_stmt paths. compiler_try_finally doesn't need any changes as far as I can see. compiler_try_except, however, now needs to generate code to handle TryExcept.finalbody (which I added to Parser/Python.asdl). This sounds easy enough, but the following is causing me difficulty: /* BEGIN */ ADDOP_JREL(c, SETUP_EXCEPT, except); compiler_use_next_block(c, body); if (!compiler_push_fblock(c, EXCEPT, body)) return 0; VISIT_SEQ(c, stmt, s->v.TryExcept.body); ADDOP(c, POP_BLOCK); compiler_pop_fblock(c, EXCEPT, body); /* END */ A couple of things confuse me here: 1. What's the purpose of the push_fblock/pop_fblock calls? 2. Do I need to add "ADDOP_JREL(c, SETUP_FINALLY, end);" before/after SETUP_EXCEPT? Or will this conflict with the SETUP_EXCEPT op?
I don't know enough about the internals of SETUP_EXCEPT/SETUP_FINALLY to know what to do here. Also, in compiler_try_finally we see this code: /* BEGIN */ ADDOP_JREL(c, SETUP_FINALLY, end); compiler_use_next_block(c, body); if (!compiler_push_fblock(c, FINALLY_TRY, body)) return 0; VISIT_SEQ(c, stmt, s->v.TryFinally.body); ADDOP(c, POP_BLOCK); compiler_pop_fblock(c, FINALLY_TRY, body); ADDOP_O(c, LOAD_CONST, Py_None, consts); /* END */ Why the LOAD_CONST Py_None? Does this serve any purpose? some sort of weird pseudo return value? Or does it have a semantic purpose that I'll have to reproduce in compiler_try_except? Cheers, and thanks for any help you can provide :) Tom From ncoghlan at gmail.com Sun Nov 13 13:27:26 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 13 Nov 2005 22:27:26 +1000 Subject: [Python-Dev] Implementation of PEP 341 In-Reply-To: <437716C4.8050309@gmail.com> References: <437716C4.8050309@gmail.com> Message-ID: <4377312E.2000002@gmail.com> Thomas Lee wrote: > Hi all, > > I've been using Python for a few years and, as of a few days ago, > finally decided to put the effort into contributing code back to the > project. > > I'm attempting to implement PEP 341 (unification of try/except and > try/finally) against HEAD. However, this being my first attempt at a > change to the syntax there's been a bit of a learning curve. Thanks for having a go at this. > I've modified Grammar/Grammer to use the new try_stmt grammar, updated > Parser/Python.asdl to accept a stmt* finalbody for TryExcept instances > and modified Python/ast.c to handle the changes to Python.asdl - > generating an AST for the finalbody. 
Consider leaving the AST definition alone, and simply changing the frontend parser to process: try: BLOCK1 except: BLOCK2 finally: BLOCK3 almost precisely as if it were written: try: try: BLOCK1 except: BLOCK2 finally: BLOCK3 That is, generate a TryExcept inside a TryFinally at the AST level, rather than trying to give TryExcept the ability to handle a finally block directly. Specifically, if you've determined that a finally clause is present in the extended statement in Python/ast.c, do something like: inner_seq = asdl_seq_new(1) asdl_seq_SET(inner_seq, 0, TryExcept(body_seq, handlers, else_seq, LINENO(n))) return TryFinally(inner_seq, finally_seq, LINENO(n)) body_seq and else_seq actually have meaningful names like suite_seq1 and suite_seq2 in the current code ;) Semantics-wise, this is exactly the behaviour we want, and making it pure syntactic sugar means the backend doesn't need to care about the new syntax at all. It also significantly lessens the risk of the change causing any problems in the compilation of normal try-except blocks. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From skip at pobox.com Sun Nov 13 14:08:15 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 13 Nov 2005 07:08:15 -0600 Subject: [Python-Dev] Is some magic required to check out new files from svn? In-Reply-To: <dl6vvl$o9c$1@sea.gmane.org> References: <17270.32577.193894.694593@montanaro.dyndns.org> <dl6vvl$o9c$1@sea.gmane.org> Message-ID: <17271.15039.201796.513101@montanaro.dyndns.org> >> ../Objects/frameobject.c:6:18: code.h: No such file or directory >> >> Sure enough, I have no code.h in my Include directory. Fredrik> what does Fredrik> svn status Include/code.h Fredrik> say? if it says It reports nothing. Fredrik> doing a full Fredrik> svn status Fredrik> and looking for ! entries will tell you if more files are missing.
The full svn status output is % svn status ! . ! Python Just for the heck of it, I tried "svn revert Include/code.h" and got Skipped 'Include/code.h' code.h is not mentioned in Include/.svn/entries. Skip From jjl at pobox.com Sun Nov 13 14:23:56 2005 From: jjl at pobox.com (John J Lee) Date: Sun, 13 Nov 2005 13:23:56 +0000 (UTC) Subject: [Python-Dev] Is some magic required to check out new files from svn? In-Reply-To: <17270.32577.193894.694593@montanaro.dyndns.org> References: <17270.32577.193894.694593@montanaro.dyndns.org> Message-ID: <Pine.LNX.4.58.0511131320190.6217@alice> On Sat, 12 Nov 2005 skip at pobox.com wrote: [...] > Before I wipe out Include and svn up again is there any debugging I can do > for someone smarter in the ways of Subversion than me? Regarding my [...] Output of the svnversion command? That shows switched and locally modified files, etc. I'm not an svn guru, but I find that command useful, especially to point out when I switched some deep directory then forgot about it. John From skip at pobox.com Sun Nov 13 14:27:29 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 13 Nov 2005 07:27:29 -0600 Subject: [Python-Dev] Is some magic required to check out new files from svn? In-Reply-To: <437708D3.8010406@v.loewis.de> References: <17270.32577.193894.694593@montanaro.dyndns.org> <437708D3.8010406@v.loewis.de> Message-ID: <17271.16193.325664.851527@montanaro.dyndns.org> Martin> code.h should live in Include. It was originally committed to Martin> CVS, so it is in the subversion repository from day one; it Martin> should always have been there since you started using Martin> subversion. Sorry, I had some strange idea it was new with the ast branch. Martin> Do you have code.h mentioned in Include/.svn/entries? Nope. Martin> I recommend to use pre-built binaries, e.g. the ones from Martin> http://metissian.com/projects/macosx/subversion/ That was where I got the 1.2.0 version I was having trouble with originally. I built 1.2.3 from source. 
I'll give the prebuilt 1.2.3 a try. Martin> I would also recommend to throw away the sandbox completely and Martin> check it out from scratch. Please report whether this gives you Martin> code.h. Yes, it does (still with my built-from-source 1.2.3). Skip From skip at pobox.com Sun Nov 13 14:33:55 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 13 Nov 2005 07:33:55 -0600 Subject: [Python-Dev] Is some magic required to check out new files from svn? In-Reply-To: <Pine.LNX.4.58.0511131320190.6217@alice> References: <17270.32577.193894.694593@montanaro.dyndns.org> <Pine.LNX.4.58.0511131320190.6217@alice> Message-ID: <17271.16579.87469.834712@montanaro.dyndns.org> John> Output of the svnversion command? That shows switched and locally John> modified files, etc. John> I'm not an svn guru, but I find that command useful, especially to John> point out when I switched some deep directory then forgot about John> it. Thanks, I'll remember it for next time... Skip From martin at v.loewis.de Sun Nov 13 14:40:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 13 Nov 2005 14:40:06 +0100 Subject: [Python-Dev] Is some magic required to check out new files from svn? In-Reply-To: <17271.16193.325664.851527@montanaro.dyndns.org> References: <17270.32577.193894.694593@montanaro.dyndns.org> <437708D3.8010406@v.loewis.de> <17271.16193.325664.851527@montanaro.dyndns.org> Message-ID: <43774236.3040700@v.loewis.de> skip at pobox.com wrote: > Martin> code.h should live in Include. It was originally committed to > Martin> CVS, so it is in the subversion repository from day one; it > Martin> should always have been there since you started using > Martin> subversion. > > Sorry, I had some strange idea it was new with the ast branch. It was, yes. However, the conversion to subversion happened after the ast branch was checked in. > Martin> I would also recommend to throw away the sandbox completely and > Martin> check it out from scratch. 
Please report whether this gives you > Martin> code.h. > > Yes, it does (still with my built-from-source 1.2.3). Ok. I am now convinced (also because of the other information you reported) that you indeed had continued to use one of the test conversion repositories from before the switchover. That would explain all the problems you see. Regards, Martin From martin at v.loewis.de Sun Nov 13 14:50:20 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 13 Nov 2005 14:50:20 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4374DB69.2080804@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> Message-ID: <4377449C.5080206@v.loewis.de> Michiel Jan Laurens de Hoon wrote: > I have an extension module for scientific visualization. This extension > module opens one or more windows, in which plots can be made. Something > similar to the plotting capabilities of Matlab. > > For the graphics windows to remain responsive, I need to make sure that > its events get handled. So I need an event loop. At the same time, the > user can enter new Python commands, which also need to be handled. My recommendation: create a thread for the graphics window, which runs the event loop of the graphics window. That way, you are completely independent of any other event loops that may happen. It is also independent of the operating system (as long as the thread module is available). 
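Martin's recommendation above (give the graphics window its own thread that owns its event loop) can be sketched in pure Python. Everything below is illustrative, not the actual extension module under discussion: a real toolkit would replace the queue with its own event source, and the module names are the modern `threading`/`queue` spellings.

```python
import queue
import threading

class EventLoopThread:
    """Minimal model of a window event loop running in its own thread,
    so it never depends on PyOS_InputHook or on the interpreter being idle."""

    def __init__(self):
        self.events = queue.Queue()
        self.handled = []
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def _run(self):
        # Blocks while idle; a real toolkit would dispatch to window
        # procedures here instead of appending to a list.
        while True:
            event = self.events.get()
            if event is None:  # sentinel: shut the loop down
                break
            self.handled.append(event)

    def post(self, event):
        self.events.put(event)

    def stop(self):
        self.events.put(None)
        self.thread.join()

loop = EventLoopThread()
loop.post("redraw")
loop.post("resize")
loop.stop()
print(loop.handled)
```

Because the loop blocks on the queue, it costs nothing while idle, which is the independence from the interactive prompt that the thread is after.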
Regards, Martin From martin at v.loewis.de Sun Nov 13 14:54:48 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 13 Nov 2005 14:54:48 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <DAELJHBGPBHPJKEBGGLNIEBBICAD.mhammond@skippinet.com.au> References: <DAELJHBGPBHPJKEBGGLNIEBBICAD.mhammond@skippinet.com.au> Message-ID: <437745A8.60104@v.loewis.de> Mark Hammond wrote: > : Currently, event loops are available in Python via PyOS_InputHook, a > : pointer to a user-defined function that is called when Python is idle > : (waiting for user input). However, an event loop using PyOS_InputHook > : has some inherent limitations, so I am thinking about how to improve > : event loop support in Python. > > Either we have an unusual definition of "event loop" (as many many other > toolkits have implemented event loops without PyOS_InputHook), or the > requirement is for an event loop that plays nicely with the "interactive > loop" in Python.exe. I would guess there is an unusual definition of an "event loop". It is probably that inside the hook, a "process_some_events()" function is invoked, which loops until some event queue is empty; this is not the usual infinite-loop-until-user-terminates-program. For this to work, you need a guarantee that the hook is invoked frequently. Again, I still think running the loop (as a true event loop) in a separate thread would probably solve the problem. Regards, Martin From kozlovsky at mail.spbnit.ru Sun Nov 13 14:57:40 2005 From: kozlovsky at mail.spbnit.ru (Alexander Kozlovsky) Date: Sun, 13 Nov 2005 16:57:40 +0300 Subject: [Python-Dev] str.dedent In-Reply-To: <000001c5e7c6$2f959440$2523c797@oemcomputer> References: <000001c5e7c6$2f959440$2523c797@oemcomputer> Message-ID: <5610377368.20051113165740@mail.spbnit.ru> Raymond Hettinger wrote: > That is somewhat misleading. We already have that ability. 
What is > being proposed is moving existing code to a different namespace. So the > motivation is really something like: > > I want to write > s = s.dedent() > because it is too painful to write > s = textwrap.dedent(s) From a technical point of view, there is nothing wrong with placing this functionality in textwrap. But from a usability point of view, using textwrap.dedent is like importing some stuff for doing string concatenation or integer addition. In the textwrap module this function is placed in the section "Loosely (!) related functionality". When a Python beginner tries to find the "Pythonic" way of dealing with dedenting (and she knows, in Python "there should be one -- and preferably only one -- obvious way to do it"), it is very unlikely that she will think "Which module may contain standard string dedenting? Yes, of course textwrap! I'm sure I'll find the necessary function there!" > String methods should be limited to generic string manipulations. > String applications should be in other namespaces. That is why we don't > have str.md5(), str.crc32(), str.ziplib(), etc. I think dedenting must be classified as "generic string manipulations". The need for string dedenting results from meaningful indentation and widespread use of text editors with folding support. Multiline strings without leading whitespace break correct folding in some editors. Best regards, Alexander mailto:kozlovsky at mail.spbnit.ru From ncoghlan at gmail.com Sun Nov 13 15:36:18 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 14 Nov 2005 00:36:18 +1000 Subject: [Python-Dev] Is some magic required to check out new files from svn? In-Reply-To: <43774236.3040700@v.loewis.de> References: <17270.32577.193894.694593@montanaro.dyndns.org> <437708D3.8010406@v.loewis.de> <17271.16193.325664.851527@montanaro.dyndns.org> <43774236.3040700@v.loewis.de> Message-ID: <43774F62.6090502@gmail.com> Martin v.
Löwis wrote: > skip at pobox.com wrote: >> Martin> I would also recommend to throw away the sandbox completely and >> Martin> check it out from scratch. Please report whether this gives you >> Martin> code.h. >> >> Yes, it does (still with my built-from-source 1.2.3). > > Ok. I am now convinced (also because of the other information you > reported) that you indeed had continued to use one of the test > conversion repositories from before the switchover. That would explain > all the problems you see. FWIW, I haven't been following Skip's subversion woes closely, but the behaviour he reported seems to match the symptoms I got when I tried to update my test sandbox after the official changeover (I blew the sandbox away completely as soon as I got checksum errors, though, so I didn't see any of the later strangeness). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Sun Nov 13 15:45:16 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 14 Nov 2005 00:45:16 +1000 Subject: [Python-Dev] Implementation of PEP 341 In-Reply-To: <4377486B.1090400@gmail.com> References: <437716C4.8050309@gmail.com> <4377312E.2000002@gmail.com> <4377486B.1090400@gmail.com> Message-ID: <4377517C.9000808@gmail.com> Thomas Lee wrote: > Implemented as you suggested and tested. I'll submit the patch to the > tracker on sourceforge shortly. Are you guys still after contextual > diffs as per the developer pages, or is an svn diff the preferred way to > submit patches now? svn diff should be fine. Although I thought Brett had actually updated those pages after the move to svn. . . > Thanks very much for all your help, Nick. It was extremely informative. I think we can chalk up a respectable win for the AST-based compiler - the trick I suggested wouldn't really have been practical without the AST layer between the parser and the compiler.
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From mal at egenix.com Sun Nov 13 18:43:54 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 13 Nov 2005 18:43:54 +0100 Subject: [Python-Dev] str.dedent In-Reply-To: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> References: <dga72k$cah$1@sea.gmane.org> <ca471dc2050914161070f1f425@mail.gmail.com> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> Message-ID: <43777B5A.6030602@egenix.com> Noam Raphael wrote: > Following Avi's suggestion, can I raise this thread up again? I think > that Reinhold's .dedent() method can be a good idea after all. > > The idea is to add a method called "dedent" to strings. It would do > exactly what the current textwrap.dedent function does. You are missing a point here: string methods were introduced to make switching from plain 8-bit strings to Unicode easier. As such they are only needed in cases where an algorithm has to work on the resp. internals differently or where direct access to the internals makes a huge difference in terms of performance. In your use case, the algorithm is independent of the data type internals and can be defined solely by using existing string method APIs. > The motivation > is to be able to write multilined strings easily without damaging the > visual indentation of the source code, like this: > > def foo(): > msg = '''\ > From: %s > To: %s\r\n' > Subject: Host failure report for %s > Date: %s > > %s > '''.dedent() % (fr, ', '.join(to), host, time.ctime(), err) > > Writing multilined strings without spaces in the beginning of lines > makes functions harder to read, since although the Python parser is > happy with it, it breaks the visual indentation. This is really a minor compiler/parser issue and not one which warrants adding another string method.
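For concreteness, the existing spelling the thread keeps coming back to is shown below. The function name and message text are made up for illustration, but `textwrap.dedent` is the real function under discussion: it strips whatever leading whitespace is common to all non-blank lines.

```python
import textwrap

def report(host):
    # Today's spelling: the literal is indented to match the surrounding
    # code, and textwrap.dedent() removes the common leading whitespace.
    msg = textwrap.dedent('''\
        Subject: Host failure report for %s

        details follow
        ''') % host
    return msg

print(report("example.org"))
```

The proposed `str.dedent` method would let the `'''...'''.dedent()` spelling replace the `textwrap.dedent('''...''')` one, with no import; the transformation performed is identical.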
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 13 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2005-10-17: Released mxODBC.Zope.DA 1.0.9 http://zope.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From solipsis at pitrou.net Sun Nov 13 18:51:47 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 13 Nov 2005 18:51:47 +0100 Subject: [Python-Dev] str.dedent In-Reply-To: <43777B5A.6030602@egenix.com> References: <dga72k$cah$1@sea.gmane.org> <ca471dc2050914161070f1f425@mail.gmail.com> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <43777B5A.6030602@egenix.com> Message-ID: <1131904308.5684.8.camel@fsol> > You are missing a point here: string methods were introduced > to make switching from plain 8-bit strings to Unicode easier. Is it the only purpose ? I agree with the OP that using string methods is much nicer and more convenient than having to import separate modules. Especially, it is nice to just type help(str) in the interactive prompt and get the list of supported methods. Also, these methods are living in the namespace of the supported objects. It feels very natural, and goes hand in hand with Python's object-oriented nature. (just my 2 cents - I am not arguing for or against the specific case of dedent, by the way) Regards Antoine. From nnorwitz at gmail.com Sun Nov 13 20:41:57 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 13 Nov 2005 11:41:57 -0800 Subject: [Python-Dev] ast status, memory leaks, etc Message-ID: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> There's still more clean up work to go, but the current AST is hopefully much closer to the behaviour before it was checked in. 
There are still a few small memory leaks. After running the test suite, the total references were around 380k (down from over 1,000k). I'm not sure exactly what the total refs were just before AST was checked in, but I believe it was over 340k. So there are likely some more ref leaks that should be investigated. It would be good to know the exact number before AST was checked in and now, minus any new tests. There is one memory reference error in test_coding: Invalid read of size 1 at 0x41304E: tok_nextc (tokenizer.c:876) by 0x413874: PyTokenizer_Get (tokenizer.c:1099) by 0x411962: parsetok (parsetok.c:124) by 0x498D1F: PyParser_ASTFromFile (pythonrun.c:1292) by 0x48D79A: load_source_module (import.c:777) by 0x48E90F: load_module (import.c:1665) by 0x48ED61: import_submodule (import.c:2259) by 0x48EF60: load_next (import.c:2079) by 0x48F44D: import_module_ex (import.c:1921) by 0x48F715: PyImport_ImportModuleEx (import.c:1955) by 0x46D090: builtin___import__ (bltinmodule.c:44) Address 0x1863E8F6 is 2 bytes before a block of size 8192 free'd at 0x11B1BA8A: free (vg_replace_malloc.c:235) by 0x4127DB: decoding_fgets (tokenizer.c:167) by 0x412F1F: tok_nextc (tokenizer.c:823) by 0x413874: PyTokenizer_Get (tokenizer.c:1099) by 0x411962: parsetok (parsetok.c:124) by 0x498D1F: PyParser_ASTFromFile (pythonrun.c:1292) by 0x48D79A: load_source_module (import.c:777) by 0x48E90F: load_module (import.c:1665) by 0x48ED61: import_submodule (import.c:2259) by 0x48EF60: load_next (import.c:2079) by 0x48F44D: import_module_ex (import.c:1921) by 0x48F715: PyImport_ImportModuleEx (import.c:1955) by 0x46D090: builtin___import__ (bltinmodule.c:44) I had a patch for this somewhere, I'll try to find it. However, I only fixed this exact error, there was another path that could still be problematic. Most of the memory leaks show up when we are forking in: test_fork1 test_pty test_subprocess Here's what I have so far. There are probably some more. 
It would be great if someone could try to find and fix these leaks. n -- 16 bytes in 1 blocks are definitely lost in loss record 25 of 599 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x4CA102: alias (Python-ast.c:1066) by 0x4CD918: alias_for_import_name (ast.c:2199) by 0x4D0C4E: ast_for_stmt (ast.c:2244) by 0x4D15E3: PyAST_FromNode (ast.c:234) by 0x499078: Py_CompileStringFlags (pythonrun.c:1275) by 0x46D6DF: builtin_compile (bltinmodule.c:457) 56 bytes in 1 blocks are definitely lost in loss record 87 of 599 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x4C9C92: Name (Python-ast.c:860) by 0x4CE4BA: ast_for_expr (ast.c:1222) by 0x4D1021: ast_for_stmt (ast.c:1900) by 0x4D15E3: PyAST_FromNode (ast.c:234) by 0x499078: Py_CompileStringFlags (pythonrun.c:1275) by 0x46D6DF: builtin_compile (bltinmodule.c:457) 112 bytes in 2 blocks are definitely lost in loss record 198 of 674 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x4C9C92: Name (Python-ast.c:860) by 0x4CE4BA: ast_for_expr (ast.c:1222) by 0x4D1021: ast_for_stmt (ast.c:1900) by 0x4D16D5: PyAST_FromNode (ast.c:275) by 0x499078: Py_CompileStringFlags (pythonrun.c:1275) by 0x46D6DF: builtin_compile (bltinmodule.c:457) 56 bytes in 1 blocks are definitely lost in loss record 89 of 599 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x4C9C92: Name (Python-ast.c:860) by 0x4CF3AF: ast_for_arguments (ast.c:650) by 0x4D1BFF: ast_for_funcdef (ast.c:830) by 0x4D15E3: PyAST_FromNode (ast.c:234) by 0x499161: PyRun_StringFlags (pythonrun.c:1275) by 0x47B1B2: PyEval_EvalFrameEx (ceval.c:4221) by 0x47CCCC: PyEval_EvalCodeEx (ceval.c:2739) by 0x47ABCC: PyEval_EvalFrameEx (ceval.c:3657) by 0x47CCCC: PyEval_EvalCodeEx (ceval.c:2739) by 0x4C27F8: function_call (funcobject.c:550) 112 bytes in 2 blocks are definitely lost in loss record 189 of 651 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x4C9C92: Name (Python-ast.c:860) by 0x4CE4BA: ast_for_expr (ast.c:1222) by 0x4D02F7: ast_for_stmt (ast.c:2028) by 
0x4D16D5: PyAST_FromNode (ast.c:275) by 0x499078: Py_CompileStringFlags (pythonrun.c:1275) by 0x46D6DF: builtin_compile (bltinmodule.c:457) 56 bytes in 1 blocks are definitely lost in loss record 118 of 651 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x4C9A41: Num (Python-ast.c:751) by 0x4CE578: ast_for_expr (ast.c:1237) by 0x4CF4ED: ast_for_arguments (ast.c:629) by 0x4D1BFF: ast_for_funcdef (ast.c:830) by 0x4D15E3: PyAST_FromNode (ast.c:234) by 0x499161: PyRun_StringFlags (pythonrun.c:1275) by 0x47B1B2: PyEval_EvalFrameEx (ceval.c:4221) by 0x47CCCC: PyEval_EvalCodeEx (ceval.c:2739) by 0x47ABCC: PyEval_EvalFrameEx (ceval.c:3657) by 0x47CCCC: PyEval_EvalCodeEx (ceval.c:2739) by 0x4C27F8: function_call (funcobject.c:550) 112 (56 direct, 56 indirect) bytes in 1 blocks are definitely lost in loss record 185 of 651 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x4C97CA: GeneratorExp (Python-ast.c:648) by 0x4CEE4F: ast_for_expr (ast.c:1251) by 0x4D1021: ast_for_stmt (ast.c:1900) by 0x4D16D5: PyAST_FromNode (ast.c:275) by 0x499078: Py_CompileStringFlags (pythonrun.c:1275) by 0x46D6DF: builtin_compile (bltinmodule.c:457) 1024 bytes in 1 blocks are definitely lost in loss record 441 of 651 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x43F8C4: PyObject_Malloc (obmalloc.c:500) by 0x4B808F: PyNode_AddChild (node.c:95) by 0x4B8386: PyParser_AddToken (parser.c:126) by 0x411944: parsetok (parsetok.c:165) by 0x499062: Py_CompileStringFlags (pythonrun.c:1271) by 0x46D6DF: builtin_compile (bltinmodule.c:457) From nyamatongwe at gmail.com Sun Nov 13 23:02:48 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Mon, 14 Nov 2005 09:02:48 +1100 Subject: [Python-Dev] Building Python with Visual C++ 2005 ExpressEdition In-Reply-To: <43766458.4040803@v.loewis.de> References: <dl073b$aro$1@sea.gmane.org> <4373A4F0.7010202@v.loewis.de> <79990c6b0511120720w2c8b318do3a41051ba4eb0c6b@mail.gmail.com> <43762E39.3020005@v.loewis.de> <001401c5e7c8$2333cd00$0201a8c0@ryoko> 
<43766458.4040803@v.loewis.de> Message-ID: <50862ebd0511131402q768b97d3g8593859178cf7e16@mail.gmail.com> Martin v. Löwis: > The problem (for me, at least) is that VC is so much more convenient to > work with. In my experience Visual C++ has always produced faster, more compact code than MinGW. While this may not be true with current releases, I'd want to ensure that the normal Python download for Windows didn't become slower. Visual C++ 2005 includes profile-guided optimization (although this is not included in the Express Edition), and it would be interesting to see how much of a difference this makes. Microsoft was willing to give some copies of VS to Python developers before, so I expect they'd be willing to give some copies of VS Professional or Team System. Tim Delaney: > There was a considerable amount of angst with the 2.4 release that can be > blamed solely on the CRT change (and hence different DLLs to link to). And > with them deprecating ISO standard functions ... One solution to the CRT change is to drop direct linking of modules to the CRT and vector them through the core DLL. The core PythonXX.DLL would expose an array of functions (malloc, strdup, getcwd, ...) that would be called by all modules indirectly. Then it no longer matters which compiler, or which compiler version, you build extension modules with. It's quite a lot of work to do this, as each CRT call site needs to change, or a well-thought-through macro scheme must be developed. Paul Moore: > The project file conversions seemed to go fine, and the debug builds > were OK, although the deprecation warnings for all the "insecure" CRT > functions were a pain. It might be worth adding > _CRT_SECURE_NO_DEPRECATE to the project defines somehow. I haven't tried to build Python with VC++ 2005 yet, but other code has also required _CRT_NONSTDC_NO_DEPRECATE for some of the file system calls. 
Neil From bcannon at gmail.com Mon Nov 14 00:40:47 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sun, 13 Nov 2005 15:40:47 -0800 Subject: [Python-Dev] Implementation of PEP 341 In-Reply-To: <4377517C.9000808@gmail.com> References: <437716C4.8050309@gmail.com> <4377312E.2000002@gmail.com> <4377486B.1090400@gmail.com> <4377517C.9000808@gmail.com> Message-ID: <bbaeab100511131540y46cef4e6yf2496aa4f24fbec8@mail.gmail.com> On 11/13/05, Nick Coghlan <ncoghlan at gmail.com> wrote: > Thomas Lee wrote: > > Implemented as you suggested and tested. I'll submit the patch to the > > tracker on sourceforge shortly. Are you guys still after contextual > > diffs as per the developer pages, or is an svn diff the preferred way to > > submit patches now? > > svn diff should be fine. Although I thought Brett had actually updated those > pages after the move to svn. . . > I did. But the docs just need to be revamped. But I can't start on that work until people tell me if they prefer FAQ-style (question listing all steps and then a question covering each step) or essay-style (bulleted list and then a definition/paragraph on each step) for bug/patch guidelines. > > Thanks very much for all your help, Nick. It was extremely informative. > > I think we can chalk up a respectable win for the AST-based compiler - the > trick I suggested wouldn't really have been practical without the AST layer > between the parser and the compiler. > Yeah, this is a total win for the AST compiler. I would not have wanted to attempt this with the old CST compiler. 
-Brett From mdehoon at c2b2.columbia.edu Mon Nov 14 01:25:34 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Sun, 13 Nov 2005 19:25:34 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4373A214.6060201@v.loewis.de> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <4372B82C.9010800@canterbury.ac.nz> <4372DA3A.8010206@c2b2.columbia.edu> <4372F72B.9060501@v.loewis.de> <43738074.2030508@c2b2.columbia.edu> <4373A214.6060201@v.loewis.de> Message-ID: <4377D97E.9060507@c2b2.columbia.edu> Martin v. Löwis wrote: >Michiel Jan Laurens de Hoon wrote: > > >>The problem with threading (apart from potential portability problems) >>is that Python doesn't let us know when it's idle. This would cause >>excessive repainting (I can give you an explicit example if you're >>interested). >> >> >I don't understand how these are connected: why do you need to know >when Python is idle for multi-threaded applications, and why does not >knowing that it is idle cause massive repainting? > >Not sure whether an explicit example would help, though; one would >probably need to understand a lot of details of your application. Giving >a simplified version of the example might help (which would do 'print >"Repainting"' instead of actually repainting). > > As an example, consider a function plot(y,x) that plots a graph of y as a function of x. If I use threading, and Python doesn't let us know when it's idle, then the plot function needs to invalidate the window to trigger repainting. Otherwise, the event loop doesn't realize that there is something new to plot. Now if I want to draw two graphs: def f(): x = arange(1000)*0.01 y = sin(x) plot(y,x) plot(2*y,x) and I execute f(), then after the first plot(y,x), I get a graph of y vs. x with x between 0 and 10 and y between -1 and 1. After the second plot, the y-axis runs from -2 to 2, and we need to draw (y,x) as well as (2*y,x). 
So the first repainting was in vain. If, however, Python contains an event loop that takes care of events as well as Python commands, redrawing won't happen until Python has executed all plot commands -- so no repainting in vain here. I agree with you though that threads are a good solution for extension modules for which a standard event loop is not suitable, and for which graphics performance is not essential -- such as Tkinter (see my next post). --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From mdehoon at c2b2.columbia.edu Mon Nov 14 01:51:32 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Sun, 13 Nov 2005 19:51:32 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4373A214.6060201@v.loewis.de> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <4372B82C.9010800@canterbury.ac.nz> <4372DA3A.8010206@c2b2.columbia.edu> <4372F72B.9060501@v.loewis.de> <43738074.2030508@c2b2.columbia.edu> <4373A214.6060201@v.loewis.de> Message-ID: <4377DF94.2090003@c2b2.columbia.edu> Martin v. Löwis wrote: >Michiel Jan Laurens de Hoon wrote: > > >>But there is another solution with threads: Can we let Tkinter run in a >>separate thread instead? >> >> > >Yes, you can. Actually, Tkinter *always* runs in a separate thread >(separate from all other threads). > > Are you sure? If Tkinter is running in a separate thread, then why does it need PyOS_InputHook? Maybe I'm misunderstanding the code in _tkinter.c, but it appears that the call to Tcl_DoOneEvent and the main interpreter (the one that reads the user commands from stdin) are in the same thread. 
Anyway, if we can run Tkinter's event loop in a thread separate from the main interpreter, then we can avoid all interference with other event loops, and also improve Tkinter's behavior itself: 1) Since this event loop doesn't need to check stdin any more, we can avoid the busy-wait-sleep loop by calling Tcl_DoOneEvent without the TCL_DONT_WAIT flag, and hence get better performance. 2) With the event loop in a separate thread, we can use Tkinter from IDLE also. --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From mdehoon at c2b2.columbia.edu Mon Nov 14 02:04:55 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Sun, 13 Nov 2005 20:04:55 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4375721D.6040907@canterbury.ac.nz> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <4375721D.6040907@canterbury.ac.nz> Message-ID: <4377E2B7.60309@c2b2.columbia.edu> Greg Ewing wrote: >Michiel Jan Laurens de Hoon wrote: > > >>I have an extension module for scientific visualization. This extension >>module opens one or more windows, in which plots can be made. >> >> > >What sort of windows are these? Are you using an existing >GUI toolkit, or rolling your own? > > Rolling my own. There's not much GUI to my window, basically it's just a window where I draw stuff. >>For the graphics windows to remain responsive, I need to make sure that >>its events get handled. So I need an event loop. >> >> >How about running your event loop in a separate thread? 
> > I agree that this works for some extension modules, but not very well for extension modules for which graphical performance is critical (see my reply to Martin). Secondly, I think that by thinking this through, we can come up with a suitable event loop framework for Python (probably similar to what Skip is proposing) that works without having to resort to threads. So we give users a choice: use the event loop if possible or preferable, and use a thread otherwise. --Michiel.. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From greg.ewing at canterbury.ac.nz Mon Nov 14 02:07:35 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 14 Nov 2005 14:07:35 +1300 Subject: [Python-Dev] str.dedent In-Reply-To: <43769E4A.5040408@colorstudy.com> References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <000001c5e7c6$2f959440$2523c797@oemcomputer> <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> <43769E4A.5040408@colorstudy.com> Message-ID: <4377E357.5010808@canterbury.ac.nz> Ian Bicking wrote: > I think a better argument for this is that dedenting a literal string is > more of a syntactic operation than a functional one. You don't think > "oh, I bet I'll need to do some dedenting on line 200 of this module, I > better import textwrap". And regardless of the need to import, there's a feeling that it's something that ought to be done at compile time, or even parse time. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From mdehoon at c2b2.columbia.edu Mon Nov 14 02:20:07 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Sun, 13 Nov 2005 20:20:07 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <17269.31319.806622.939477@montanaro.dyndns.org> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> Message-ID: <4377E647.7080708@c2b2.columbia.edu> skip at pobox.com wrote: >If I have a Gtk app I have to feed other (socket, callback) pairs to it. It >takes care of adding it to the select() call. Python could dictate that the >way to play ball is for other packages (Tkinter, PyGtk, wxPython, etc) to >feed Python the (socket, callback) pair. Then you have a uniform way to >control event-driven applications. Today, a package like Michiel's has no >idea what sort of event loop it will encounter. If Python provided the >event loop API it would be the same no matter what widget set happened to be >used. > > This is essentially how Tcl does it (and, btw, is what Tkinter currently uses): Tcl has the functions Tcl_CreateFileHandler/Tcl_DeleteFileHandler, which allow a user to add a file descriptor to the list of file descriptors to select() on, and to specify a callback function to be called when the file descriptor is signaled. A similar API in Python would give users a clean way to hook into the event loop, independent of which other packages are hooked into the event loop. 
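A rough sketch, in modern Python, of what such a registration API might look like. The `EventLoop` class and its method names are hypothetical (modeled loosely on the Tcl functions named above); only the underlying `select.select()` call is real:

```python
import select
import socket

class EventLoop:
    """Toy dispatcher in the spirit of Tcl_CreateFileHandler:
    packages register (socket, callback) pairs, and a single
    select() call multiplexes all of them."""

    def __init__(self):
        self.handlers = {}  # fileno -> (socket, callback)

    def create_file_handler(self, sock, callback):
        self.handlers[sock.fileno()] = (sock, callback)

    def delete_file_handler(self, sock):
        self.handlers.pop(sock.fileno(), None)

    def run_once(self, timeout=0.1):
        """One iteration of the loop; an interpreter could call
        this whenever it is idle at the interactive prompt."""
        socks = [s for s, _ in self.handlers.values()]
        if not socks:
            return
        ready, _, _ = select.select(socks, [], [], timeout)
        for sock in ready:
            _, callback = self.handlers[sock.fileno()]
            callback(sock)

# Demonstration with a connected socket pair standing in for a
# GUI toolkit's display connection.
received = []
a, b = socket.socketpair()
loop = EventLoop()
loop.create_file_handler(b, lambda s: received.append(s.recv(16)))
a.send(b"redraw")
loop.run_once()
```

A toolkit would register its display-server connection once, and every registered package would then be serviced by the same select() call, regardless of which other packages are present.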
>The sticking point is probably that a number of such packages presume they >will always provide the main event loop and have no way to feed their >sockets to another event loop controller. That might present some hurdles >for the various package writers/Python wrappers. > > This may not be such a serious problem. Being able to hook into Python's event loop is important only if users want to be able to use the extension module in interactive mode. For an extension module such as PyGtk, the developers may decide that PyGtk is likely to be run in non-interactive mode only, for which the PyGtk mainloop is sufficient. Having an event loop API in Python won't hurt them. --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From greg.ewing at canterbury.ac.nz Mon Nov 14 02:24:16 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 14 Nov 2005 14:24:16 +1300 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4377E2B7.60309@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <4375721D.6040907@canterbury.ac.nz> <4377E2B7.60309@c2b2.columbia.edu> Message-ID: <4377E740.70904@canterbury.ac.nz> Michiel Jan Laurens de Hoon wrote: > Greg Ewing wrote: > > > How about running your event loop in a separate thread? > > I agree that this works for some extension modules, but not very well > for extension modules for which graphical performance is critical I don't understand. If the main thread is idle, your thread should get all the time it wants. 
I'd actually expect this to give better interactive response, since you aren't doing busy-wait pauses all the time -- the thread can wake up as soon as an event arrives for it. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From mdehoon at c2b2.columbia.edu Mon Nov 14 02:25:18 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Sun, 13 Nov 2005 20:25:18 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com> References: <437100A7.5050907@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com> <43767FA8.7090209@canterbury.ac.nz> <b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com> Message-ID: <4377E77E.5030000@c2b2.columbia.edu> Noam Raphael wrote: >On 11/13/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > > >>Noam Raphael wrote: >> >> >>>All that is needed to make Tkinter and Michiels' >>>code run together is a way to say "add this callback to the input >>>hook" instead of the current "replace the current input hook with this >>>callback". Then, when the interpreter is idle, it will call all the >>>registered callbacks, one at a time, and everyone would be happy. >>> >>> >>Except for those who don't like busy waiting. >> >> >I'm not sure I understand what you meant. 
If you meant that it will >work slowly - a lot of people (including me) are using Tkinter without >a mainloop from the interactive shell, and don't feel the difference. >It uses exactly the method I described. > > This depends on what kind of extension module you run. I agree, for Tkinter you probably won't notice the difference -- although you are still wasting processor cycles. However, if graphics performance is important, busy-waiting is not ideal. --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From greg.ewing at canterbury.ac.nz Mon Nov 14 02:27:48 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 14 Nov 2005 14:27:48 +1300 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com> References: <437100A7.5050907@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <b348a0850511121106u57c073eeicf8affae502cd86e@mail.gmail.com> <43767FA8.7090209@canterbury.ac.nz> <b348a0850511121630s6e3d8d9dr4c8beaa202c2f1b5@mail.gmail.com> Message-ID: <4377E814.6060805@canterbury.ac.nz> Noam Raphael wrote: > On 11/13/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > > > Noam Raphael wrote: > > > > > callback". Then, when the interpreter is idle, it will call all the > > > registered callbacks, one at a time, and everyone would be happy. > > > > Except for those who don't like busy waiting. > > I'm not sure I understand what you meant. If you meant that it will > work slowly - a lot of people (including me) are using Tkinter without > a mainloop from the interactive shell, and don't feel the difference. 
Busy waiting is less efficient and less responsive than a solution which is able to avoid it. In many cases there will be little noticeable difference, but there will be some people who don't like it because it's not really the "right" solution to this sort of problem. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From fperez.net at gmail.com Mon Nov 14 02:30:53 2005 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 13 Nov 2005 18:30:53 -0700 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <4377E647.7080708@c2b2.columbia.edu> Message-ID: <dl8pce$6e0$1@sea.gmane.org> Michiel Jan Laurens de Hoon wrote: > For an extension module such as > PyGtk, the developers may decide that PyGtk is likely to be run in > non-interactive mode only, for which the PyGtk mainloop is sufficient. Did you read my reply? ipython, based on code.py, implements a few simple threading tricks (they _are_ simple, since I know next to nothing about threading) and gives you interactive use of PyGTK, WXPython and PyQt applications in a manner similar to Tkinter. Meaning, you can from the command line make a window, change its title, add buttons to it, etc, all the while your interactive prompt remains responsive as well as the GUI. 
With that support, matplotlib can be used to do scientific plotting with any of these toolkits and no blocking of any kind (cross-thread signal handling is another story, but you didn't ask about that). As I said, there may be something in your problem that I don't understand. But it is certainly possible, today, to have a non-blocking Qt/WX/GTK-based scientific plotting application with interactive input. The ipython/matplotlib combo has done precisely that for over a year (well, Qt support was added this April). Cheers, f From mdehoon at c2b2.columbia.edu Mon Nov 14 02:39:36 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Sun, 13 Nov 2005 20:39:36 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4377E740.70904@canterbury.ac.nz> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <4375721D.6040907@canterbury.ac.nz> <4377E2B7.60309@c2b2.columbia.edu> <4377E740.70904@canterbury.ac.nz> Message-ID: <4377EAD8.7050105@c2b2.columbia.edu> Greg Ewing wrote: >Michiel Jan Laurens de Hoon wrote: > > >>Greg Ewing wrote: >> >> >>>How about running your event loop in a separate thread? >>> >>> >>I agree that this works for some extension modules, but not very well >>for extension modules for which graphical performance is critical >> >> > >I don't understand. If the main thread is idle, your thread >should get all the time it wants. > >I'd actually expect this to give better interactive response, >since you aren't doing busy-wait pauses all the time -- the >thread can wake up as soon as an event arrives for it. > > This is exactly the problem. 
Drawing one picture may consist of many Python commands to draw the individual elements (for example, several graphs overlaying each other). We don't know where in the window each element will end up until we have the list of elements complete. For example, the axis may change (see my example to Martin). Or, if we're drawing a 3D picture, then one element may obscure another. Now, if we have our plotting extension module in a separate thread, the window will be repainted each time a new element is added. Imagine a picture of 1000 elements: we'd have to draw 1+2+...+1000 times. So this is tricky: we want repainting to start as soon as possible, but not sooner. Being able to hook into Python's event loop allows us to do so. --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From mdehoon at c2b2.columbia.edu Mon Nov 14 02:43:21 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Sun, 13 Nov 2005 20:43:21 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <dl8pce$6e0$1@sea.gmane.org> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <4377E647.7080708@c2b2.columbia.edu> <dl8pce$6e0$1@sea.gmane.org> Message-ID: <4377EBB9.6070706@c2b2.columbia.edu> Fernando Perez wrote: >Michiel Jan Laurens de Hoon wrote: > > >>For an extension module such as >>PyGtk, the developers may decide that PyGtk is likely to be run in >>non-interactive mode only, for which the PyGtk mainloop is sufficient. 
>> >> > >Did you read my reply? ipython, based on code.py, implements a few simple >threading tricks (they _are_ simple, since I know next to nothing about >threading) and gives you interactive use of PyGTK, WXPython and PyQt >applications in a manner similar to Tkinter. > That may be, and I think that's a good thing, but it's not up to me to decide if PyGtk should support interactive use. The PyGtk developers decide whether they want to spend time on that, and they may decide not to, no matter how simple it may be. --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From foom at fuhm.net Mon Nov 14 03:06:23 2005 From: foom at fuhm.net (James Y Knight) Date: Sun, 13 Nov 2005 21:06:23 -0500 Subject: [Python-Dev] str.dedent In-Reply-To: <4377E357.5010808@canterbury.ac.nz> References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <000001c5e7c6$2f959440$2523c797@oemcomputer> <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> <43769E4A.5040408@colorstudy.com> <4377E357.5010808@canterbury.ac.nz> Message-ID: <BA7FC0D5-FEC3-453E-B2C6-B1082DCFE9ED@fuhm.net> On Nov 13, 2005, at 8:07 PM, Greg Ewing wrote: > Ian Bicking wrote: > > >> I think a better argument for this is that dedenting a literal >> string is >> more of a syntactic operation than a functional one. You don't think >> "oh, I bet I'll need to do some dedenting on line 200 of this >> module, I >> better import textwrap". >> > > And regardless of the need to import, there's a feeling > that it's something that ought to be done at compile > time, or even parse time. ITYM you mean "If only python were lisp". 
(macros, or even reader macros) James From fperez.net at gmail.com Mon Nov 14 03:15:59 2005 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 13 Nov 2005 19:15:59 -0700 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <4377E647.7080708@c2b2.columbia.edu> <dl8pce$6e0$1@sea.gmane.org> <4377EBB9.6070706@c2b2.columbia.edu> Message-ID: <dl8s10$brm$1@sea.gmane.org> Michiel Jan Laurens de Hoon wrote: > Fernando Perez wrote: >>Did you read my reply? ipython, based on code.py, implements a few simple >>threading tricks (they _are_ simple, since I know next to nothing about >>threading) and gives you interactive use of PyGTK, WXPython and PyQt >>applications in a manner similar to Tkinter. >> > That may be, and I think that's a good thing, but it's not up to me to > decide if PyGtk should support interactive use. The PyGtk developers > decide whether they want to decide to spend time on that, and they may > decide not to, no matter how simple it may be. OK, I must really not be making myself very clear. I am not saying anything about the pygtk developers: what I said is that this can be done by the application writer, trivially, today. There's nothing you need from the authors of GTK. Don't take my word for it, look at the code: 1. You can download ipython, it's a trivial pure-python install. Grab matplotlib and see for yourself (which also addresses the repaint issues you mentioned). You can test the gui support without mpl as well. 2. 
If you don't want to download/install ipython, just look at the code that implements these features: http://projects.scipy.org/ipython/ipython/file/ipython/trunk/IPython/Shell.py 3. If you really want to see how simple this is, you can run this single, standalone script: http://ipython.scipy.org/tmp/pyint-gtk.py I wrote this when I was trying to understand the necessary threading tricks for GTK, it's a little multithreaded GTK shell based on code.py. 230 lines of code total, including readline support and (optional) matplotlib support. Once this was running, the ideas in it were folded into the more complex ipython codebase. At this point, I should probably stop posting on this thread. I think this is drifting off-topic for python-dev, and I am perhaps misunderstanding the essence of your problem for some reason. All I can say is that many people are doing scientific interactive plotting with ipython/mpl and all the major GUI toolkits, and they seem pretty happy about it. Best, f From edloper at gradient.cis.upenn.edu Mon Nov 14 06:35:31 2005 From: edloper at gradient.cis.upenn.edu (Edward Loper) Date: Mon, 14 Nov 2005 00:35:31 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter Message-ID: <43782223.2070609@gradient.cis.upenn.edu> As I understand it, you want to improve the performance of interactively run plot commands by queuing up all the plot sub-commands, and then drawing them all at once. Hooking into a python event loop certainly isn't the only way to do this. Perhaps you could consider the following approach: - The plot event loop is in a separate thread, accepting messages from the interactive thread. - These messages can contain plot commands; and they can also contain two new commands: - suspend -- stop plotting, and start saving commands in a queue. 
- resume -- execute all commands in the queue (with whatever increased efficiency tricks you're using) Then you can either just add functions to generate these messages, and call them at appropriate places; or set PyOS_InputHook to wrap each interactive call with a suspend/resume pair. But note that putting an event loop in a separate thread will be problematic if you want any of the events to generate callbacks into user code -- this could cause all sorts of nasty race-conditions! Using a separate thread for an event loop only seems practical to me if the event loop will never call back into user code (or if you're willing to put the burden on your users of making sure everything is thread safe). -Edward From ronaldoussoren at mac.com Mon Nov 14 07:39:28 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 14 Nov 2005 07:39:28 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4377E647.7080708@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <4377E647.7080708@c2b2.columbia.edu> Message-ID: <B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com> On 14-nov-2005, at 2:20, Michiel Jan Laurens de Hoon wrote: > skip at pobox.com wrote: > >> If I have a Gtk app I have to feed other (socket, callback) pairs >> to it. It >> takes care of adding it to the select() call. Python could >> dictate that the >> way to play ball is for other packages (Tkinter, PyGtk, wxPython, >> etc) to >> feed Python the (socket, callback) pair. 
Then you have a uniform >> way to >> control event-driven applications. Today, a package like >> Michiel's has no >> idea what sort of event loop it will encounter. If Python >> provided the >> event loop API it would be the same no matter what widget set >> happened to be >> used. >> >> > This is essentially how Tcl does it (and which, btw, is currently > being > used in Tkinter): > Tcl has the functions *Tcl_CreateFileHandler/**Tcl_DeleteFileHandler*, > which allow a user to add a file descriptor to the list of file > descriptors to select() on, and to specify a callback function to the > function to be called when the file descriptor is signaled. A similar > API in Python would give users a clean way to hook into the event > loop, > independent of which other packages are hooked into the event loop. ... except when the GUI you're using doesn't expose (or even use) a file descriptor that you can use with select. Not all the world is Linux. BTW. I find using the term 'event loop' for the interactive mode very confusing. Ronald From jcarlson at uci.edu Mon Nov 14 08:16:28 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 13 Nov 2005 23:16:28 -0800 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4377D97E.9060507@c2b2.columbia.edu> References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> Message-ID: <20051113230400.A403.JCARLSON@uci.edu> I personally like Edward Loper's idea of just running your own event handler which deals with drawing, suspend/resume, etc... > If, however, Python contains an event loop that takes care of events as > well as Python commands, redrawing won't happen until Python has > executed all plot commands -- so no repainting in vain here. ...but even without posting and reading events as stated above, one could check for plot events every 1/100th a second. If there is an update, and it has been 10/100 seconds since that undrawn event happened, redraw. 
Tune that 10 up/down to alter responsiveness characteristics. Or heck, if you are really lazy, people can use plot() calls, but until an update_plot() is called, the plot isn't updated. There are many reasonable solutions to your problem, not all of which involve changing Python's event loop. - Josiah From ncoghlan at gmail.com Mon Nov 14 08:51:25 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 14 Nov 2005 17:51:25 +1000 Subject: [Python-Dev] Revamping the bug/patch guidelines (was Re: Implementation of PEP 341) In-Reply-To: <bbaeab100511131540y46cef4e6yf2496aa4f24fbec8@mail.gmail.com> References: <437716C4.8050309@gmail.com> <4377312E.2000002@gmail.com> <4377486B.1090400@gmail.com> <4377517C.9000808@gmail.com> <bbaeab100511131540y46cef4e6yf2496aa4f24fbec8@mail.gmail.com> Message-ID: <437841FD.3060707@gmail.com> Brett Cannon wrote: > On 11/13/05, Nick Coghlan <ncoghlan at gmail.com> wrote: >> Thomas Lee wrote: >>> Implemented as you suggested and tested. I'll submit the patch to the >>> tracker on sourceforge shortly. Are you guys still after contextual >>> diffs as per the developer pages, or is an svn diff the preferred way to >>> submit patches now? >> svn diff should be fine. Although I thought Brett had actually updated those >> pages after the move to svn. . . >> > > I did. But the docs just need to be revamped. But I can't start on > that work until people tell me if they prefer FAQ-style (question > listing all steps and then a question covering each step) or > essay-style (bulleted list and then a definition/paragraph on each > step) for bug/patch guidelines. I'd prefer essay-style for the guidelines themselves, with appropriate pointers to the guidelines from the dev FAQ. However, I also think either approach will work, so I suggest going with whichever you find easier to write :) Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From martin at v.loewis.de Mon Nov 14 08:53:07 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 14 Nov 2005 08:53:07 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4377DF94.2090003@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <4372B82C.9010800@canterbury.ac.nz> <4372DA3A.8010206@c2b2.columbia.edu> <4372F72B.9060501@v.loewis.de> <43738074.2030508@c2b2.columbia.edu> <4373A214.6060201@v.loewis.de> <4377DF94.2090003@c2b2.columbia.edu> Message-ID: <43784263.3020100@v.loewis.de> Michiel Jan Laurens de Hoon wrote: >>Yes, you can. Actually, Tkinter *always* runs in a separate thread >>(separate from all other threads). >> > Are you sure? If Tkinter is running in a separate thread, then why does > it need PyOS_InputHook? Well, my statement was (somewhat deliberately) misleading. That separate thread might be the main thread (and, in many cases, is). The main thread is still a "separate" thread (separate from all others). Regards, Martin From fperez.net at gmail.com Mon Nov 14 08:54:56 2005 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 14 Nov 2005 00:54:56 -0700 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> Message-ID: <dl9fsh$oca$1@sea.gmane.org> Josiah Carlson wrote: > Or heck, if you are really lazy, people can use a plot() calls, but > until an update_plot() is called, the plot isn't updated. I really recommend that those interested in all these issues have a look at matplotlib. All of this has been dealt with there already, a long time ago, in detail. 
The solutions may not be perfect, but they do work for a fairly wide range of uses, including the interactive case. There may be a good reason why mpl's approach is insufficient, but I think that the discussion here would be more productive if that were stated precisely and explicitly. Up to this point, all the requirements I've been able to understand clearly work just fine with ipython/mpl (though I may well have missed the key issue, I'm sure). Cheers, f From martin at v.loewis.de Mon Nov 14 09:07:50 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 14 Nov 2005 09:07:50 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4377D97E.9060507@c2b2.columbia.edu> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <4372B82C.9010800@canterbury.ac.nz> <4372DA3A.8010206@c2b2.columbia.edu> <4372F72B.9060501@v.loewis.de> <43738074.2030508@c2b2.columbia.edu> <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> Message-ID: <437845D6.3080301@v.loewis.de> Michiel Jan Laurens de Hoon wrote: > If, however, Python contains an event loop that takes care of events as > well as Python commands, redrawing won't happen until Python has > executed all plot commands -- so no repainting in vain here. Ah, I think now I understand the problem. It seems that you don't care at all about event loops. What you really want to know is "when is Python idle?", with "being idle" defined as "there are no commands being processed at the interactive interpreter", or perhaps "there are no commands being processed in the main thread", or perhaps "there are no commands being processed in any thread". Is that a correct problem statement? If so, please don't say that you want an event loop. Instead, it appears that you want to hook into the interpreter loop. As others have commented, it should be possible to get nearly the same effect without such hooking.
For example, if you choose to redraw at most 10 times per second, you will still get good performance. Alternatively, you could choose to redraw if there was no drawing command for 100ms. Regards, Martin From gmccaughan at synaptics-uk.com Mon Nov 14 10:20:50 2005 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Mon, 14 Nov 2005 09:20:50 +0000 Subject: [Python-Dev] str.dedent In-Reply-To: <43777B5A.6030602@egenix.com> References: <dga72k$cah$1@sea.gmane.org> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <43777B5A.6030602@egenix.com> Message-ID: <200511140920.51724.gmccaughan@synaptics-uk.com> On Sunday 2005-11-13 17:43, Marc-Andre Lemburg wrote: [Noam Raphael:] > > The idea is to add a method called "dedent" to strings. It would do > > exactly what the current textwrap.dedent function does. [Marc-Andre:] > You are missing a point here: string methods were introduced > to make switching from plain 8-bit strings to Unicode easier. > > As such they are only needed in cases where an algorithm > has to work on the resp. internals differently or where direct > access to the internals makes a huge difference in terms > of performance. In a language that generally pays as much attention to practical usability as Python, it seems a pity to say (as you seem to be implying) that whether something is a string method or a function in (say) the "textwrap" module should be determined by internal implementation details. > > Writing multilined strings without spaces in the beginning of lines > > makes functions harder to read, since although the Python parser is > > happy with it, it breaks the visual indentation. > > This is really a minor compiler/parser issue and not one which > warrants adding another string method. Adding another string method seems easier, and a smaller change, than altering the compiler or parser. What's your point here? I think I must be missing something.
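For readers following along: the function half of this already exists. The sketch below uses the real textwrap.dedent (in the standard library since Python 2.3); the trailing comment shows the method spelling under discussion, which is only a proposal, not an existing API:

```python
import textwrap

def make_usage():
    # The literal is indented to match the surrounding code; dedent()
    # strips the common leading whitespace afterwards.
    return textwrap.dedent("""\
        usage: frob [options]
          -h  show help
          -v  be verbose
        """)

# Under the proposal discussed in this thread, the same literal could be
# written as """...""".dedent(), with no module import at the call site.
```

The readability argument is that the literal above lines up with the surrounding code, yet the resulting string starts at column zero.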
-- g From ulrich.berning at desys.de Mon Nov 14 10:54:29 2005 From: ulrich.berning at desys.de (Ulrich Berning) Date: Mon, 14 Nov 2005 10:54:29 +0100 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> <43749D65.4040001@desys.de> <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com> Message-ID: <43785ED5.40000@desys.de> Guido van Rossum schrieb: >On 11/11/05, Ulrich Berning <ulrich.berning at desys.de> wrote: > > >>For instance, nobody would give the output of a C compiler a different >>extension when different compiler flags are used. >> >> > >But the usage is completely different. With C you explicitly manage >when compilation happens. With Python you don't. When you first run >your program with -O but it crashes, and then you run it again without >-O to enable assertions, you would be very unhappy if the bytecode >cached in a .pyo file would be reused! > > > The other way round makes definitely more sense. At development time, I would never use Python with -O or -OO. I use it only at distribution time, after doing all the tests, to generate optimized bytecode. However, this problem could be easily solved, if the value of Py_OptimizeFlag would be stored together with the generated bytecode. 
At import time, the cached bytecode would not be reused if the current value of Py_OptimizeFlag doesn't match the stored value (if the .py file isn't there any longer, we could either raise an exception or we could emit a warning and reuse the bytecode anyway). And if we do this a little bit more cleverly, we could refuse reusing optimized bytecode if we are running without -O or -OO and ignore assertions and docstrings in unoptimized bytecode when we are running with -O or -OO. >>I would appreciate to see the generation of .pyo files completely >>removed in the next release. >> >> > >You seem to forget the realities of backwards compatibility. While >there are ways to cache bytecode without having multiple extensions, >we probably can't do that until Python 3.0. > > > Please can you explain what backwards compatibility means in this context? Generated bytecode is neither upwards nor backwards compatible. No matter what I try, I always get a 'Bad magic number' when I try to import bytecode generated with a different Python version. The most obvious software that may depend on the existence of .pyo files are the various freeze/packaging tools like py2exe, py2app, cx_Freeze and Installer. I haven't checked them in detail, but after a short inspection, they seem to be independent of the existence of .pyo files. I can't imagine that there is any other Python software that depends on the existence of .pyo files, but maybe I'm totally wrong in this wild guess. Ulli From mal at egenix.com Mon Nov 14 11:41:33 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 14 Nov 2005 11:41:33 +0100 Subject: [Python-Dev] str.dedent In-Reply-To: <200511140920.51724.gmccaughan@synaptics-uk.com> References: <dga72k$cah$1@sea.gmane.org> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <43777B5A.6030602@egenix.com> <200511140920.51724.gmccaughan@synaptics-uk.com> Message-ID: <437869DD.7040800@egenix.com> Gareth McCaughan wrote: > On Sunday 2005-11-13 17:43, Marc-Andre Lemburg wrote: > > [Noam Raphael:] > >>>The idea is to add a method called "dedent" to strings. It would do >>>exactly what the current textwrap.dedent function does. > > > [Marc-Andre:] > >>You are missing a point here: string methods were introduced >>to make switching from plain 8-bit strings to Unicode easier. >> >>As such they are only needed in cases where an algorithm >>has to work on the resp. internals differently or where direct >>access to the internals makes a huge difference in terms >>of performance. > > > In a language that generally pays as much attention to > practical usability as Python, it seems a pity to say > (as you seem to be implying) that whether something is > a string method or a function in (say) the "textwrap" > module should be determined by internal implementation > details. We have to draw a line somewhere - otherwise you could just as well add all functions that accept single string arguments as methods to the basestring sub-classes. >>>Writing multilined strings without spaces in the beginning of lines >>>makes functions harder to read, since although the Python parser is >>>happy with it, it breaks the visual indentation. >> >>This is really a minor compiler/parser issue and not one which >>warrants adding another string method. > > Adding another string method seems easier, and a smaller > change, than altering the compiler or parser. What's your > point here? I think I must be missing something.
The point is that the presented use case does not originate in a common need (to dedent strings), but from a desire to write Python code with embedded indented triple-quoted strings which lies in the scope of the parser, not that of string objects. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 14 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2005-10-17: Released mxODBC.Zope.DA 1.0.9 http://zope.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From guido at python.org Mon Nov 14 14:23:57 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 14 Nov 2005 08:23:57 -0500 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <43785ED5.40000@desys.de> References: <20051109023347.GA15823@localhost.localdomain> <ca471dc20511081914x1bba649eqbd47be8735c145b3@mail.gmail.com> <b674ca220511090633r41ead9edx739577cbe00f5460@mail.gmail.com> <ca471dc20511090739h4b4daf13p3dfda0a98413820a@mail.gmail.com> <bbaeab100511091505p352e4e94we1286404ad81ecd7@mail.gmail.com> <5.1.1.6.0.20051109190838.01f51838@mail.telecommunity.com> <5.1.1.6.0.20051110124246.02bac470@mail.telecommunity.com> <43749D65.4040001@desys.de> <ca471dc20511110815p12bb82efhc887ba4f6fae670f@mail.gmail.com> <43785ED5.40000@desys.de> Message-ID: <ca471dc20511140523uf064144m9546a06abe5b07a8@mail.gmail.com> On 11/14/05, Ulrich Berning <ulrich.berning at desys.de> wrote: > >You seem to forget the realities of backwards compatibility. While > >there are ways to cache bytecode without having multiple extensions, > >we probably can't do that until Python 3.0. > > > Please can you explain what backwards compatibility means in this > context? 
Generated bytecode is neither upwards nor backwards compatible. No, but the general format of .pyc/.pyo files hasn't changed since 1991 (magic number, timestamp, marshalled data) and while the magic number has changed many times, the API for getting it has been stable for probably 10 years. Lots of tools (you mention a few) have been written that read or write these files and these would all to some extent have to be taught about the changes (most likely the changes will include a change to the file header). > No matter what I try, I always get a 'Bad magic number' when I try to > import bytecode generated with a different Python version. > The most obvious software, that may depend on the existence of .pyo > files are the various freeze/packaging tools like py2exe, py2app, > cx_Freeze and Installer. I haven't checked them in detail, but after a > short inspection, they seem to be independent of the existence of .pyo > files. I can't imagine that there is any other Python software, that > depends on the existence of .pyo files, but maybe I'm totally wrong in > this wild guess. It's not just the existence of .pyo files. It's the format of the .pyc files that will have to change to accommodate multiple versions of bytecode.
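The file layout Guido describes (magic number, then timestamp, then the marshalled code object) can be sketched roughly as below. The MAGIC constant is a made-up placeholder, since the real value differs for every interpreter version, and the function names are invented for illustration:

```python
import marshal
import struct

MAGIC = b"\x3b\xf2\x0d\x0a"  # placeholder; each Python version has its own magic

def write_legacy_pyc(path, code_obj, source_mtime):
    """Write a 2005-era style .pyc: 4-byte magic + 4-byte timestamp + marshal data."""
    with open(path, "wb") as f:
        f.write(MAGIC)
        f.write(struct.pack("<I", int(source_mtime) & 0xFFFFFFFF))
        marshal.dump(code_obj, f)

def read_legacy_pyc(path, expected_magic=MAGIC):
    """Read it back, refusing a mismatched magic -- the 'Bad magic number' case."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != expected_magic:
            raise ImportError("Bad magic number in %s" % path)
        (mtime,) = struct.unpack("<I", f.read(4))
        return mtime, marshal.load(f)
```

Any scheme that stores multiple bytecode variants (or the optimization flag) in one file would change this header, which is exactly why the tools reading it would need updating.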
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Mon Nov 14 16:00:21 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 14 Nov 2005 09:00:21 -0600 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <4377E647.7080708@c2b2.columbia.edu> <B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com> Message-ID: <17272.42629.420626.88192@montanaro.dyndns.org> Ronald> ... except when the GUI you're using doesn't expose (or even Ronald> use) a file descriptor that you can use with select. Not all the Ronald> world is Linux. Can you be more specific? Are you referring to Windows? I'm not suggesting you'd be able to use the same exact implementation on Unix and non-Unix platforms. You might well have to do different things across different platforms. Hopefully it would look the same to the programmer though, both across platforms and across toolkits. I can't imagine any of the X-based widget toolkits on Unix systems would use anything other than select() on a socket at the bottom. 
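A minimal sketch of the (socket, callback) registry being discussed -- roughly what Tcl_CreateFileHandler/Tcl_DeleteFileHandler provide, transplanted to Python. The class and method names are invented for illustration; this is not an actual Python API:

```python
import select
import socket

class FileHandlerLoop:
    """Toy dispatcher: packages register (fileobj, callback) pairs,
    and a single select() call multiplexes all of them."""

    def __init__(self):
        self._handlers = {}  # fileno -> (fileobj, callback)

    def create_file_handler(self, fileobj, callback):
        self._handlers[fileobj.fileno()] = (fileobj, callback)

    def delete_file_handler(self, fileobj):
        self._handlers.pop(fileobj.fileno(), None)

    def poll_once(self, timeout=0.0):
        # One loop iteration: fire the callback of every descriptor
        # that select() reports as readable.
        if not self._handlers:
            return
        files = [f for f, _ in self._handlers.values()]
        readable, _, _ = select.select(files, [], [], timeout)
        for f in readable:
            self._handlers[f.fileno()][1](f)
```

In the scheme skip describes, a widget toolkit would hand its display connection socket to create_file_handler, and the interactive interpreter would call something like poll_once whenever it is idle.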
Skip From ulrich.berning at desys.de Mon Nov 14 16:53:30 2005 From: ulrich.berning at desys.de (Ulrich Berning) Date: Mon, 14 Nov 2005 16:53:30 +0100 Subject: [Python-Dev] Inconsistent behaviour in import/zipimport hooks In-Reply-To: <DAELJHBGPBHPJKEBGGLNCEEKIDAD.mhammond@skippinet.com.au> References: <DAELJHBGPBHPJKEBGGLNCEEKIDAD.mhammond@skippinet.com.au> Message-ID: <4378B2FA.9020308@desys.de> Mark Hammond schrieb: >>release. The main reason why I changed the import behavior was >>pythonservice.exe from the win32 extensions. pythonservice.exe imports >>the module that contains the service class, but because >>pythonservice.exe doesn't run in optimized mode, it will only import a >>.py or a .pyc file, not a .pyo file. Because we always generate bytecode >>with -OO at distribution time, we either had to change the behavior of >>pythonservice.exe or change the import behavior of Python. >> >> > >While ignoring the question of how Python should in the future handle >optimizations, I think it safe to state that pythonservice.exe should >have the same basic functionality and operation in this regard as python.exe >does. It doesn't sound too difficult to modify pythonservice to accept -O >flags, and to modify the service installation process to allow this flag to >be specified. I'd certainly welcome any such patches. > >Although getting off-topic for this list, note that for recent pywin32 >releases, it is possible to host a service using python.exe directly, and >this is the technique py2exe uses to host service executables. It would >take a little more work to set things up to work like that, but that's >probably not too unreasonable for a custom application with specialized >distribution requirements. Using python.exe obviously means you get full >access to the command-line facilities it provides. > > Although off-topic for this list, I should give a reply. I have done both.
My first approach was to change pythonservice.exe to accept -O and -OO and set the Py_OptimizeFlag accordingly. Today, we aren't using pythonservice.exe any longer. I have done nearly all the required changes in win32serviceutil.py to let python.exe host the services. It requires no changes to the services, everything should work as before. The difference is, that the service module is always executed as a script now. This requires an additional (first) argument '--as-service' when the script runs as a service. NOTE: Debugging services doesn't work yet. --- Installing the service C:\svc\testService.py is done the usual way: C:\svc>C:\Python23\python.exe testService.py install The resulting ImagePath value in the registry is then: "C:\Python23\python.exe" C:\svc\testService.py --as-service After finishing development and testing, we convert the script into an executable with our own tool sib.py: C:\svc>C:\Python23\python.exe C:\Python23\sib.py -n testService -d . testService.py C:\svc>nmake Now, we just do: C:\svc>testService.exe update The resulting ImagePath value in the registry is then changed to: "C:\testService.exe" --as-service Starting, stopping and removing works as usual: C:\svc>testService.exe start C:\svc>testService.exe stop C:\svc>testService.exe remove --- Because not everything works as before (debugging doesn't work, but we do not use it), I haven't provided a patch yet. As soon as I have completed it, I will have a patch available. 
Ulli From mdehoon at c2b2.columbia.edu Mon Nov 14 17:07:45 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Mon, 14 Nov 2005 11:07:45 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <17272.42629.420626.88192@montanaro.dyndns.org> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <4377E647.7080708@c2b2.columbia.edu> <B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com> <17272.42629.420626.88192@montanaro.dyndns.org> Message-ID: <4378B651.4080707@c2b2.columbia.edu> skip at pobox.com wrote: > Ronald> ... except when the GUI you're using doesn't expose (or even > Ronald> use) a file descriptor that you can use with select. Not all the > Ronald> world is Linux. > >Can you be more specific? Are you referring to Windows? I'm not suggesting >you'd be able to use the same exact implementation on Unix and non-Unix >platforms. You might well have to do different things across different >platforms. Hopefully it would look the same to the programmer though, both >across platforms and across toolkits. I can't imagine any of the X-based >widget toolkits on Unix systems would use anything other than select() on a >socket at the bottom. > >Skip > > As far as I know, that is correct (except that some systems use poll instead of select). For our extension module, we use select or poll to wait for events on Unix (using X). I have not run into problems with this on the Unix systems I have used, nor have I received complaints from users that this didn't work. On Windows, the situation is even easier. 
MsgWaitForMultipleObjects can wait for events on all windows created by the thread as well as stdin (the same function is used in Tcl's event loop). In contrast to Unix' select, we don't need to tell MsgWaitForMultipleObjects which callback function is associated with each window. --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From fredrik at pythonware.com Mon Nov 14 17:19:19 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 14 Nov 2005 17:19:19 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <4377E647.7080708@c2b2.columbia.edu><dl8pce$6e0$1@sea.gmane.org> <4377EBB9.6070706@c2b2.columbia.edu> Message-ID: <dlade8$ntq$1@sea.gmane.org> Michiel Jan Laurens de Hoon wrote: > >Did you read my reply? ipython, based on code.py, implements a few simple > >threading tricks (they _are_ simple, since I know next to nothing about > >threading) and gives you interactive use of PyGTK, WXPython and PyQt > >applications in a manner similar to Tkinter. > > > That may be, and I think that's a good thing, but it's not up to me to > decide if PyGtk should support interactive use. The PyGtk developers > decide whether they want to decide to spend time on that, and they may > decide not to, no matter how simple it may be. can you *please* start reading the posts you're replying to? 
</F> From fredrik at pythonware.com Mon Nov 14 17:30:34 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 14 Nov 2005 17:30:34 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter References: <437100A7.5050907@c2b2.columbia.edu><43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu><4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <4375721D.6040907@canterbury.ac.nz> <4377E2B7.60309@c2b2.columbia.edu><4377E740.70904@canterbury.ac.nz> <4377EAD8.7050105@c2b2.columbia.edu> Message-ID: <dlae3b$quk$1@sea.gmane.org> Michiel Jan Laurens de Hoon wrote: > This is exactly the problem. Drawing one picture may consist of many > Python commands to draw the individual elements (for example, several > graphs overlaying each other). We don't know where in the window each > element will end up until we have the list of elements complete. For > example, the axis may change (see my example to Martin). Or, if we're > drawing a 3D picture, then one element may obscure another. > > Now, if we have our plotting extension module in a separate thread, the > window will be repainted each time a new element is added. Imagine a > picture of 1000 elements: we'd have to draw 1+2+...+1000 times. > > So this is tricky: we want repainting to start as soon as possible, but > not sooner. Being able to hook into Python's event loop allows us to do so. the solution to your problem is called damage/repair, is not tricky at all, and is supported by every GUI toolkit under the sun. 
(if you don't know how it works, google for "widget damage repair") </F> From ronaldoussoren at mac.com Mon Nov 14 19:19:14 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 14 Nov 2005 19:19:14 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <17272.42629.420626.88192@montanaro.dyndns.org> References: <437100A7.5050907@c2b2.columbia.edu> <43710C95.30209@v.loewis.de> <43729CAB.5070106@c2b2.columbia.edu> <87br0tw041.fsf@tleepslib.sk.tsukuba.ac.jp> <4372DD5F.70203@c2b2.columbia.edu> <ca471dc20511100850l4f2b12d4lc11a55e0bb70891f@mail.gmail.com> <43738C55.60509@c2b2.columbia.edu> <4373A300.3080501@v.loewis.de> <4374DB69.2080804@c2b2.columbia.edu> <17269.21593.575449.78938@montanaro.dyndns.org> <4375709B.4010009@canterbury.ac.nz> <17269.31319.806622.939477@montanaro.dyndns.org> <4377E647.7080708@c2b2.columbia.edu> <B28E05A6-D140-4C15-921D-C5061A2164D1@mac.com> <17272.42629.420626.88192@montanaro.dyndns.org> Message-ID: <E6BF31EF-835C-4212-B8DA-1401FF917312@mac.com> On 14-nov-2005, at 16:00, skip at pobox.com wrote: > > Ronald> ... except when the GUI you're using doesn't expose (or > even > Ronald> use) a file descriptor that you can use with select. > Not all the > Ronald> world is Linux. > > Can you be more specific? Are you referring to Windows? I was thinking of MacOS X. It does have an event loop, but doesn't expose a file descriptor to the user and might not even use one. Adding Python's input to the runloop of the GUI might be easier (e.g. feed the stdin filedescriptor to the GUI-toolkit-du-jour and process information when that runloop tells you that data is present). We have an example of that in the PyObjC source tree. I'd say either choice won't be very good. The problem is that you must interleave the execution of Python code with running the event loop to get nice behaviour, which suggests threading to me. If you don't interleave you can easily block the GUI while Python code is executing.
> I'm not suggesting > you'd be able to use the same exact implementation on Unix and non- > Unix > platforms. You might well have to do different things across > different > platforms. Hopefully it would look the same to the programmer > though, both > across platforms and across toolkits. Twisted anyone? ;-) ;-) > I can't imagine any of the X-based > widget toolkits on Unix systems would use anything other than select > () on a > socket at the bottom. I'd be very surprised if an X-based toolkit didn't use a select-loop somewhere. Ronald > > Skip From ronaldoussoren at mac.com Mon Nov 14 19:21:13 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 14 Nov 2005 19:21:13 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <20051113230400.A403.JCARLSON@uci.edu> References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> Message-ID: <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> On 14-nov-2005, at 8:16, Josiah Carlson wrote: > > I personally like Edward Loper's idea of just running your own event > handler which deals with drawing, suspend/resume, etc... > >> If, however, Python contains an event loop that takes care of >> events as >> well as Python commands, redrawing won't happen until Python has >> executed all plot commands -- so no repainting in vain here. > > ...but even without posting and reading events as stated above, one > could check for plot events every 1/100th a second. If there is an > update, and it has been 10/100 seconds since that undrawn event > happened, > redraw. Tune that 10 up/down to alter responsiveness characteristics. > > Or heck, if you are really lazy, people can use a plot() calls, but > until an update_plot() is called, the plot isn't updated. I wonder why nobody has suggested a separate thread for managing the GUI and using the hook in Python's event loop to issue the call to update_plot.
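Ronald's suggestion -- a dedicated thread that owns the GUI, fed commands from the interpreter thread, with an input hook flushing the queue when the interpreter goes idle -- can be sketched with a queue standing in for the real toolkit. PlotManager, plot, and update_plot are stand-in names here, not a real toolkit API:

```python
import queue
import threading

class PlotManager:
    """Toy model: a worker thread owns the 'GUI'; commands are queued to it."""

    def __init__(self):
        self._commands = queue.Queue()
        self.drawn = []  # stands in for what the GUI thread has rendered
        self._thread = threading.Thread(target=self._gui_loop, daemon=True)
        self._thread.start()

    def _gui_loop(self):
        while True:
            cmd = self._commands.get()
            if cmd is None:  # shutdown sentinel
                self._commands.task_done()
                break
            self.drawn.append(cmd)  # a real toolkit would draw here
            self._commands.task_done()

    def plot(self, element):
        # Called from the interpreter thread; returns immediately,
        # so issuing 1000 plot commands never blocks on repainting.
        self._commands.put(element)

    def update_plot(self):
        # What an input hook would call when the interpreter goes idle:
        # wait until the GUI thread has drained all queued commands.
        self._commands.join()

    def close(self):
        self._commands.put(None)
```

Because the worker drains the queue in order and update_plot is only called once the interpreter is idle, the repaint happens after the full batch of elements, which is the behaviour Michiel was asking for.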
Ronald > > There are many reasonable solutions to your problem, not all of which > involve changing Python's event loop. > > - Josiah > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ > ronaldoussoren%40mac.com From mdehoon at c2b2.columbia.edu Mon Nov 14 20:00:56 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Mon, 14 Nov 2005 14:00:56 -0500 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> Message-ID: <4378DEE8.70109@c2b2.columbia.edu> Ronald Oussoren wrote: > I wonder why nobody has suggested a seperate thread for managing the > GUI and > using the hook in Python's event loop to issue the call to update_plot. > Ha. That's probably the best solution I've heard so far, short of adding a Tcl-like event loop API to Python. There are two remaining issues though: 1) Currently, there's only one PyOS_InputHook. So we're stuck if we find that some other extension module already set PyOS_InputHook. An easy solution would be to have a PyOS_AddInputHook/PyOS_RemoveInputHook API, and let Python maintain a list of input hooks to be called. 2) All extension modules have to agree to return immediately from a call to the hook function. Tkinter currently does not do this. --Michiel.
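A PyOS_AddInputHook/PyOS_RemoveInputHook pair is only a proposal here; CPython exposes just the single PyOS_InputHook function pointer. A pure-Python sketch of the registry semantics Michiel is asking for, with invented names, might be:

```python
_input_hooks = []  # called in registration order whenever the interpreter idles

def add_input_hook(hook):
    """Register a callable to be invoked on each idle pass."""
    _input_hooks.append(hook)

def remove_input_hook(hook):
    """Unregister a hook; removing one that was never added is harmless."""
    try:
        _input_hooks.remove(hook)
    except ValueError:
        pass

def run_input_hooks():
    """What the interpreter would call in place of the single PyOS_InputHook.
    Each hook must return promptly -- issue (2) raised above."""
    for hook in list(_input_hooks):
        hook()
```

Iterating over a copy of the list lets a hook remove itself (or others) during the idle pass without corrupting the iteration.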
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From noamraph at gmail.com Mon Nov 14 20:14:39 2005 From: noamraph at gmail.com (Noam Raphael) Date: Mon, 14 Nov 2005 21:14:39 +0200 Subject: [Python-Dev] str.dedent In-Reply-To: <437869DD.7040800@egenix.com> References: <dga72k$cah$1@sea.gmane.org> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <43777B5A.6030602@egenix.com> <200511140920.51724.gmccaughan@synaptics-uk.com> <437869DD.7040800@egenix.com> Message-ID: <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com> On 11/14/05, M.-A. Lemburg <mal at egenix.com> wrote: > We have to draw a line somewhere - otherwise you could > just as well add all functions that accept single > string arguments as methods to the basestring > sub-classes. Please read my first post in this thread - I think there's more reason for 'dedent' to be a string method than there is, for example, for 'expandtabs', since it allows you to write clearer code. > > The point is that the presented use case does not > originate in a common need (to dedent strings), but > from a desire to write Python code with embedded > indented triple-quoted strings which lies in the scope > of the parser, not that of string objects. > That's a theoretical argument. In practice, if you do it in the parser, you have two options: 1. Automatically dedent all strings. 2. Add a 'd' or some other letter before the string. Option 1 breaks backwards compatibility, and makes the parser do unexpected things. Option 2 adds another string-prefix letter, which is confusing, and it will also be hard to find out what that letter means. On the other hand, adding ".dedent()" at the end is very clear, and is just as easy. Now, about performance, please see the message I'll post in a few minutes... 
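For reference, the behaviour being discussed already exists as a function, textwrap.dedent(); the proposal is essentially to expose the same operation as a string method. A small example of the pattern it serves (the text itself is invented for illustration):

```python
import textwrap

def f():
    # The literal is indented along with the surrounding code; the
    # common leading whitespace is stripped at run time by dedent().
    text = textwrap.dedent("""\
        Usage: prog [options]
        Options:
          -h    show this help
        """)
    return text

print(f())
```

With the proposed method this would read `"""..."""​.dedent()` instead, which is what would let a compiler perform the stripping at compile time.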
Noam From noamraph at gmail.com Mon Nov 14 20:33:13 2005 From: noamraph at gmail.com (Noam Raphael) Date: Mon, 14 Nov 2005 21:33:13 +0200 Subject: [Python-Dev] str.dedent In-Reply-To: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> References: <dga72k$cah$1@sea.gmane.org> <ca471dc2050914161070f1f425@mail.gmail.com> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> Message-ID: <b348a0850511141133s69d7c10ck4a82898da0107401@mail.gmail.com> Just two additional notes: On 9/15/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote: > > -1 > > Let it continue to live in textwrap where the existing pure python code > adequately serves all string-like objects. It's not worth losing the > duck typing by attaching new methods to str, unicode, UserString, and > everything else aspiring to be string-like. It may seem like the 'dedent' code would have to be written a lot of times, but I've checked the examples. It may be needed to write different versions for 'str' and for 'unicode', but these are going to be unified. In UserString you'll have to add exactly one line: def dedent(self): return self.data.dedent() I've just taken the line created for 'isalpha' and replaced 'isalpha' with 'dedent'. So in the long run, there will be exactly one implementation of 'dedent' in the Python code. (I don't know of any other objects which try to provide the full string interface.) Another reason for preferring a 'dedent' method over a 'dedent' function in some module, is that it allows, sometime in the future, adding an optimization to the compiler, so that it will dedent the string at compile time (this can't work for a function, since the function is found at run time). This will solve the performance problem completely, so that there will be an easy way to write multilined strings which do not interfere with the visual structure of the code, without the need to worry about performance.
I'm not saying that this optimization has to be done now, just that 'dedent' as a method makes it possible, which adds to the other arguments for making it a method. Noam From aahz at pythoncraft.com Mon Nov 14 20:56:50 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon, 14 Nov 2005 11:56:50 -0800 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <20051113230400.A403.JCARLSON@uci.edu> References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> Message-ID: <20051114195650.GA2732@panix.com> On Sun, Nov 13, 2005, Josiah Carlson wrote: > > I personally like Edward Loper's idea of just running your own event > handler which deals with drawing, suspend/resume, etc... > >> If, however, Python contains an event loop that takes care of events as >> well as Python commands, redrawing won't happen until Python has >> executed all plot commands -- so no repainting in vain here. > > ...but even without posting and reading events as stated above, one > could check for plot events every 1/100th a second. If there is an > update, and it has been 10/100 seconds since that undrawn event happened, > redraw. Tune that 10 up/down to alter responsiveness characteristics. ...and that's exactly what my sample threaded GUI application does. Can we please move this thread to comp.lang.python? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." 
--Red Adair From skip at pobox.com Mon Nov 14 21:04:02 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 14 Nov 2005 14:04:02 -0600 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4378DEE8.70109@c2b2.columbia.edu> References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> <4378DEE8.70109@c2b2.columbia.edu> Message-ID: <17272.60850.25579.583878@montanaro.dyndns.org> Michiel> 1) Currently, there's only one PyOS_InputHook. So we're stuck Michiel> if we find that some other extension module already set Michiel> PyOS_InputHook. An easy solution would be to have an Michiel> PyOS_AddInputHook/PyOS_RemoveInputHook API, and let Python Michiel> maintain a list of input hooks to be called. I think we've come more-or-less full circle to the point where I jumped onto this spinning thread. If there is only a single input hook function, you probably need to write a slightly higher level module that manages the hook. See sys.exitfunc and the atexit module for a simple example. Skip From fredrik at pythonware.com Mon Nov 14 21:01:00 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 14 Nov 2005 21:01:00 +0100 Subject: [Python-Dev] str.dedent References: <dga72k$cah$1@sea.gmane.org><b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com><43777B5A.6030602@egenix.com><200511140920.51724.gmccaughan@synaptics-uk.com><437869DD.7040800@egenix.com> <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com> Message-ID: <dlaqds$8sb$1@sea.gmane.org> Noam Raphael wrote: > That's a theoretical argument. In practice, if you do it in the > parser, you have two options: > > 1. Automatically dedent all strings. > 2. Add a 'd' or some other letter before the string. > > Option 1 breaks backwards compatibility, and makes the parser do > unexpected things. 
Option 2 adds another string-prefix letter, which > is confusing, and it will also be hard to find out what that letter > means. On the other hand, adding ".dedent()" at the end is very clear, > and is just as easy. so is putting the string constant in a global variable, outside the scope you're in, like you'd do with any other constant. (how about a new rule: you cannot post to a zombie thread on python- dev unless they've fixed/reviewed/applied or otherwise processed at least one tracker item earlier the same day. there are hundreds of items on the bugs and patches trackers that could need some loving care) </F> From noamraph at gmail.com Mon Nov 14 21:12:04 2005 From: noamraph at gmail.com (Noam Raphael) Date: Mon, 14 Nov 2005 22:12:04 +0200 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <4378DEE8.70109@c2b2.columbia.edu> References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> <4378DEE8.70109@c2b2.columbia.edu> Message-ID: <b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com> On 11/14/05, Michiel Jan Laurens de Hoon <mdehoon at c2b2.columbia.edu> wrote: > Ronald Oussoren wrote: > > > I wonder why nobody has suggested a separate thread for managing the > > GUI and > > using the hook in Python's event loop to issue the call to update_plot. > > > Ha. That's probably the best solution I've heard so far, short of adding > a Tcl-like event loop API to Python. No. It is definitely a bad solution. Where I work, we do a lot of plotting from the interactive interpreter, using Tkinter. I always wondered how it worked, and assumed that it was done using threading. So when people started using IDLE, and those plots didn't show up, I've found the solution of calling the Tkinter main() function from a thread. Everything seemed to work fine, until... It didn't. Strange freezes started to appear, only when working from IDLE.
This made me investigate a bit, and I've found that Tkinter isn't run from a separate thread - the dooneevent() function is called repeatedly by PyOS_InputHook while the interpreter is idle. The conclusions: 1. Don't use threads when you don't have to. Tkinter does callbacks to Python code, and most code isn't designed to work reliably in a multithreaded environment. 2. The non-threading solution works *really* well - the fact is that I hadn't noticed the difference between multi-threaded mode and single-threaded mode, until things began to freeze in the multi-threaded mode. Noam From noamraph at gmail.com Mon Nov 14 23:25:24 2005 From: noamraph at gmail.com (Noam Raphael) Date: Tue, 15 Nov 2005 00:25:24 +0200 Subject: [Python-Dev] str.dedent In-Reply-To: <dlaqds$8sb$1@sea.gmane.org> References: <dga72k$cah$1@sea.gmane.org> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <43777B5A.6030602@egenix.com> <200511140920.51724.gmccaughan@synaptics-uk.com> <437869DD.7040800@egenix.com> <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com> <dlaqds$8sb$1@sea.gmane.org> Message-ID: <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com> On 11/14/05, Fredrik Lundh <fredrik at pythonware.com> wrote: > so is putting the string constant in a global variable, outside the scope > you're in, like you'd do with any other constant. Usually when I use a constant a single time, I write it where I use it, and don't give it a name. I don't do: messagea = "The value of A is " ... (a long class definition) print messagea, A This is what I mean when I say "constant" - a value which is known when I write the code, not necessarily an arbitrary value that may change, so I write it at the beginning of the program for others to know it's there. There's no reason why multilined strings that are used only once should be defined at the beginning of a program (think about a simple CGI script, which prints HTML parts in a function.)
> > (how about a new rule: you cannot post to a zombie thread on python- > dev unless they've fixed/reviewed/applied or otherwise processed at least > one tracker item earlier the same day. there are hundreds of items on the > bugs and patches trackers that could need some loving care) > I posted to this thread because it was relevant to a new post about dedenting strings. Anyway, I looked at bug 1356720 (Ctrl+C for copy does not work when caps-lock is on), and posted there a very simple patch which will most probably solve the problem. I also looked at bug 1337987 (IDLE, F5 and wrong external file content. (on error!)). One problem it raises is that IDLE doesn't have a "revert" command and that it doesn't notice if the file was changed outside of IDLE. I am planning to fix it. The other problem that is reported in that bug is that exceptions show misleading code lines when the source file was changed but wasn't loaded into Python. Perhaps in compiled code, not only the file name should be written but also its modification time? This way, when tracebacks print lines of changed files, they can warn if the line might not be the right line. Noam From fredrik at pythonware.com Mon Nov 14 23:27:28 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 14 Nov 2005 23:27:28 +0100 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu><20051113230400.A403.JCARLSON@uci.edu><AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com><4378DEE8.70109@c2b2.columbia.edu> <b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com> Message-ID: <dlb30h$771$1@sea.gmane.org> Noam Raphael wrote: > It didn't. Strange freezes started to appear, only when working from > IDLE. This made me investigate a bit, and I've found that Tkinter > isn't run from a separate thread - the dooneevent() function is called > repeatedly by PyOS_InputHook while the interpreter is idle. repeatedly?
The standard myreadline implementation only calls the hook *once* for each line it reads from stdin:

    if (PyOS_InputHook != NULL)
        (void)(PyOS_InputHook)();
    errno = 0;
    p = fgets(buf, len, fp);
    if (p != NULL)
        return 0; /* No error */

which isn't enough to keep any event pump going... If you want any other behaviour, you need GNU readline, or a GUI toolkit that takes control over the InputHook, just like Tkinter. And that won't help you if you want portable code; for example, Tkinter on Windows only keeps the event pump running as long as the user doesn't type anything. As soon as the user touches the keyboard, the pump stops. To see this in action, try this:

    >>> from Tkinter import *
    >>> label = Label(text="hello")
    >>> label.pack()

and then type

    >>> label.after(1000, lambda: label.config(bg="red"))

and press return. The widget updates after a second. Next, type

    >>> label.after(1000, lambda: label.config(bg="blue"))

press return, and immediately press space. This time, nothing happens, until you press return again. If you want to write portable code that keeps things running "in the background" while the users hack away at the standard interactive prompt, InputHook won't help you.
</F> From noamraph at gmail.com Mon Nov 14 23:39:13 2005 From: noamraph at gmail.com (Noam Raphael) Date: Tue, 15 Nov 2005 00:39:13 +0200 Subject: [Python-Dev] Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <dlb30h$771$1@sea.gmane.org> References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> <4378DEE8.70109@c2b2.columbia.edu> <b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com> <dlb30h$771$1@sea.gmane.org> Message-ID: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com> On 11/15/05, Fredrik Lundh <fredrik at pythonware.com> wrote: > If you want to write portable code that keeps things running "in the > background" while the users hack away at the standard interactive > prompt, InputHook won't help you. > So probably it should be improved, or changed a bit, to work also on Windows. Or perhaps it's Tkinter. Anyway, what I'm saying is - don't use threads! Process events in the main thread while it doesn't run the user's Python code. If he runs another thread - that's his problem. The implicit event loop should never execute Python code while a user's Python code is running in the main thread. 
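The single-threaded model Noam argues for can be reduced to a toy sketch: events accumulate while user code runs, and are only processed when the main loop is idle between commands. Names are invented for illustration:

```python
events = []
processed = []

def post_event(ev):
    """Queue an event; nothing is processed while user code runs."""
    events.append(ev)

def process_pending_events():
    # Runs only between user commands, i.e. when the interpreter
    # is idle - so callbacks never interleave with user code.
    while events:
        processed.append(events.pop(0))

# "User code" runs to completion first...
post_event("redraw")
post_event("resize")
# ...and only then does the idle pump handle the backlog.
process_pending_events()
print(processed)  # ['redraw', 'resize']
```

This is the property Noam cares about: the implicit event loop never executes Python callbacks while the user's own code is mid-execution.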
Noam From BruceEckel-Python3234 at mailblocks.com Mon Nov 14 23:46:58 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Mon, 14 Nov 2005 15:46:58 -0700 Subject: [Python-Dev] Coroutines (PEP 342) In-Reply-To: <4359047B.6020203@gmail.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> <ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com> <4359047B.6020203@gmail.com> Message-ID: <1147958111.20051114154658@gmail.com> I just finished reading PEP 342, and it appears to follow Hoare's Communicating Sequential Processes (CSP) where a process is a coroutine, and the communication is via yield and send(). It seems that if you follow that form (and you don't seem forced to, pythonically), then synchronization is not an issue. What is not clear to me, and is not discussed in the PEP, is whether coroutines can be distributed among multiple processors. If that is or isn't possible I think it should be explained in the PEP, and I'd be interested to know about it here (and ideally why it would or wouldn't work). Thanks.
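For readers following along, the yield/send communication Bruce refers to looks like this in PEP 342 terms — a minimal running-average coroutine, written here with present-day generator syntax:

```python
def averager():
    # A PEP 342-style coroutine: values come in via send(),
    # the running average goes back out via yield.
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count

coro = averager()
next(coro)            # advance to the first yield ("priming" the coroutine)
print(coro.send(10))  # 10.0
print(coro.send(20))  # 15.0
```

Everything here runs sequentially in one thread: send() transfers control into the coroutine, yield transfers it back — which is why, as the replies note, the PEP by itself says nothing about multiple processors.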
Bruce Eckel From martin at v.loewis.de Mon Nov 14 23:48:34 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 14 Nov 2005 23:48:34 +0100 Subject: [Python-Dev] str.dedent In-Reply-To: <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com> References: <dga72k$cah$1@sea.gmane.org> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <43777B5A.6030602@egenix.com> <200511140920.51724.gmccaughan@synaptics-uk.com> <437869DD.7040800@egenix.com> <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com> <dlaqds$8sb$1@sea.gmane.org> <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com> Message-ID: <43791442.8050109@v.loewis.de> Noam Raphael wrote: > There's no reason why multilined strings that are used only once > should be defined at the beginning of a program (think about a simple > CGI script, which prints HTML parts in a function.) I find that simple CGI scripts are precisely the example *for* putting multi-line string literals at the beginning of a file. There are multiple styles for writing such things: 1. Put headers and trailers into separate strings. This tends to become tedious to maintain, since you always have to find the matching string (e.g. if you add an opening tag in the header, you have to put the closing tag in the trailer). 2. Use interpolation (e.g. % substitution), and put the strings into the code. This works fine for single line strings. For multi-line strings, the HTML code tends to clutter the view of the algorithm, whether it is indented or not. Functions should fit on a single screen of text, and adding multiline text into functions tends to break this requirement. 3. Use interpolation, and put the templates at the beginning. This makes the templates easy to inspect, and makes it easy to follow the code later in the file. It is the style I use and recommend. 
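Style 3 in miniature — template at the top of the file, interpolation inside the function (the template text and names are invented for illustration):

```python
# Templates collected at the beginning of the script (style 3),
# where they are easy to inspect and edit as a block.
PAGE = """\
<html><head><title>%(title)s</title></head>
<body><h1>%(title)s</h1>%(body)s</body></html>
"""

def render(title, body):
    # The function body stays uncluttered; the HTML lives above.
    return PAGE % {"title": title, "body": body}

print(render("Hello", "<p>world</p>"))
```

Dictionary-based % interpolation lets the same field (here %(title)s) appear more than once in the template.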
Of course, it may occasionally become necessary to have a few-line string literal in a function; in most cases, indenting it along with the rest of the function is fine, as HTML can stand extra spaces with no problems. Regards, Martin From greg.ewing at canterbury.ac.nz Tue Nov 15 01:41:42 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 15 Nov 2005 13:41:42 +1300 Subject: [Python-Dev] str.dedent In-Reply-To: <BA7FC0D5-FEC3-453E-B2C6-B1082DCFE9ED@fuhm.net> References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <000001c5e7c6$2f959440$2523c797@oemcomputer> <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> <43769E4A.5040408@colorstudy.com> <4377E357.5010808@canterbury.ac.nz> <BA7FC0D5-FEC3-453E-B2C6-B1082DCFE9ED@fuhm.net> Message-ID: <43792EC6.4030707@canterbury.ac.nz> James Y Knight wrote: > ITYM you mean "If only python were lisp". (macros, or even reader macros) No, I mean it would be more satisfying if there were a syntax for expressing multiline string literals that didn't force it to be at the left margin. The lack of such in such an otherwise indentation-savvy language seems a wart. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From pje at telecommunity.com Tue Nov 15 04:24:52 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Mon, 14 Nov 2005 22:24:52 -0500 Subject: [Python-Dev] Coroutines (PEP 342) In-Reply-To: <1147958111.20051114154658@gmail.com> References: <4359047B.6020203@gmail.com> <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> <ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com> <4359047B.6020203@gmail.com> Message-ID: <5.1.1.6.0.20051114221533.01f25260@mail.telecommunity.com> At 03:46 PM 11/14/2005 -0700, Bruce Eckel wrote: >I just finished reading PEP 342, and it appears to follow Hoare's >Communicating Sequential Processes (CSP) where a process is a >coroutine, and the communicaion is via yield and send(). It seems that >if you follow that form (and you don't seem forced to, pythonically), >then synchronization is not an issue. > >What is not clear to me, and is not discussed in the PEP, is whether >coroutines can be distributed among multiple processors. If you were to write a trampoline that used multiple threads, *and* you were using a Python implementation that supported multiple processors (e.g. Jython, IronPython, ?), *and* that Python implementation supported PEP 342, then yes. However, that just means the answer is, "if you can run Python code on multiple processors, you can run Python code on multiple processors". PEP 342 itself has nothing to say about that issue, which exists independently of the PEP. So, the PEP doesn't address what you're asking about, because the GIL still exists in CPython, and will continue to exist. Also, guaranteeing encapsulation of the coroutines would be *hard*, because lots of Python objects like modules, functions, and the like would be shared between more than one coroutine, and so then the issue of locking raises its ugly head again. 
> If that is or >isn't possible I think it should be explained in the PEP, and I'd be >interested in know about it here (and ideally why it would or wouldn't >work). The PEP is entirely unrelated (and entirely orthogonal) to whether a given Python implementation can interpret Python code on multiple processors simultaneously. The only difference between what PEP 342 does and what Twisted does today is in syntax. PEP 342 just provides a syntax that lets you avoid writing your code in CPS (continuation-passing style) with lots of callbacks. PEP 342 is implemented in the current Python SVN HEAD, by the way, if you want to experiment with the implementation. From abo at minkirri.apana.org.au Tue Nov 15 10:25:29 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Tue, 15 Nov 2005 09:25:29 +0000 Subject: [Python-Dev] Coroutines (PEP 342) In-Reply-To: <1147958111.20051114154658@gmail.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> <ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com> <4359047B.6020203@gmail.com> <1147958111.20051114154658@gmail.com> Message-ID: <1132046729.17944.11.camel@warna.corp.google.com> On Mon, 2005-11-14 at 15:46 -0700, Bruce Eckel wrote: [...] > What is not clear to me, and is not discussed in the PEP, is whether > coroutines can be distributed among multiple processors. If that is or > isn't possible I think it should be explained in the PEP, and I'd be > interested in know about it here (and ideally why it would or wouldn't > work). Even if different coroutines could be run on different processors, there would be nothing gained except extra overheads of interprocessor memory duplication and communication delays. The whole process communication via yield and send effectively means only one co-routine is running at a time, and all the others are blocked waiting for a yield or send. 
This was the whole point; it is a convenient abstraction that appears to do work in parallel, while actually doing it sequentially, avoiding the overheads and possible race conditions of threads. It has the problem that a single co-routine can monopolise execution, hence the other name "co-operative multi-tasking", where co-operation is the requirement for it to work. At least... that's the way I understood it... I could be totally mistaken... -- Donovan Baarda <abo at minkirri.apana.org.au> http://minkirri.apana.org.au/~abo/ From ncoghlan at iinet.net.au Tue Nov 15 10:31:03 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue, 15 Nov 2005 19:31:03 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler Message-ID: <4379AAD7.2050506@iinet.net.au> Transferring part of the discussion of Thomas Lee's PEP 341 patch to python-dev. . . Neal Norwitz wrote in the SF patch tracker: > Thomas, I hope you will write up this experience in coding > this patch. IMO it clearly demonstrates a problem with the > new AST code that needs to be addressed. ie, Memory > management is not possible to get right. I've got a 700+ > line patch to ast.c to correct many more memory issues > (hopefully that won't cause conflicts with this patch). I > would like to hear ideas of how the AST code can be improved > to make it much easier to not leak memory and be safe at the > same time. As Neal pointed out, it's tricky to write code for the AST parser and compiler without accidentally letting memory leak when the parser or compiler runs into a problem and has to bail out on whatever it was doing. Thomas's patch got to v5 (based on Neal's review comments) with memory leaks still in it, my review got rid of some of them, and we think Neal's last review of v6 of the patch got rid of the last of them. I am particularly concerned about the returns hidden inside macros in the AST compiler's symbol table generation and bytecode generation steps. 
At the moment, every function in compile.c which allocates code blocks (or anything else for that matter) and then calls one of the VISIT_* macros is a memory leak waiting to happen. Something I've seen used successfully (and used myself) to deal with similar resource-management problems in C code is to use a switch statement, rather than getting goto-happy. Specifically, the body of the entire function is written inside a switch statement, with 'break' then used as the equivalent of "raise Exception". For example:

    PyObject* switchAsTry()
    {
        switch(0) {
        default:
            /* Real function body goes here */
            return result;
        }
        /* Error cleanup code goes here */
        return NULL;
    }

It avoids the potential for labelling problems that arises when goto's are used for resource cleanup. It's a far cry from real exception handling, but it's the best solution I've seen within the limits of C. A particular benefit comes when macros which may abort function execution are used inside the function - if those macros are rewritten to use break instead of return, then the function gets a chance to clean up after an error. Cheers, Nick. P.S. Getting rid of the flow control macros entirely is another option, of course, but it would make compile.c and symtable.c a LOT harder to follow. Raymond Chen's articles notwithstanding, a preprocessor-based mini-language does make sense in some situations, and I think this is one of them. Particularly since the flow control macros are private to the relevant implementation files.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From mwh at python.net Tue Nov 15 10:50:42 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 15 Nov 2005 09:50:42 +0000 Subject: [Python-Dev] str.dedent In-Reply-To: <43792EC6.4030707@canterbury.ac.nz> (Greg Ewing's message of "Tue, 15 Nov 2005 13:41:42 +1300") References: <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <000001c5e7c6$2f959440$2523c797@oemcomputer> <b348a0850511121424n26f84b9n7c1edc45e7f9f1c@mail.gmail.com> <43769E4A.5040408@colorstudy.com> <4377E357.5010808@canterbury.ac.nz> <BA7FC0D5-FEC3-453E-B2C6-B1082DCFE9ED@fuhm.net> <43792EC6.4030707@canterbury.ac.nz> Message-ID: <2mveyuxmrx.fsf@starship.python.net> Greg Ewing <greg.ewing at canterbury.ac.nz> writes: > James Y Knight wrote: > >> ITYM you mean "If only python were lisp". (macros, or even reader macros) > > No, I mean it would be more satisfying if there > were a syntax for expressing multiline string > literals that didn't force it to be at the left > margin. The lack of such in such an otherwise > indentation-savvy language seems a wart. Wasn't there a PEP about this? Yes, 295. But that was rejected, I presume[*] because it proposed changing all multi-string literals, a plainly doomed idea (well, it would make *me* squeal, anyway). Cheers, mwh (who finds the whole issue rather hard to care about) [*] The reason for rejection isn't in the PEP, grumble. -- I would hereby duly point you at the website for the current pedal powered submarine world underwater speed record, except I've lost the URL. 
-- Callas, cam.misc From imbaczek at gmail.com Tue Nov 15 13:22:21 2005 From: imbaczek at gmail.com (=?ISO-8859-2?Q?Marek_=22Baczek=22_Baczy=F1ski?=) Date: Tue, 15 Nov 2005 13:22:21 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <4379AAD7.2050506@iinet.net.au> References: <4379AAD7.2050506@iinet.net.au> Message-ID: <5f3d2c310511150422x3e2d670r@mail.gmail.com> 2005/11/15, Nick Coghlan <ncoghlan at iinet.net.au>: > Specifically, the body of the entire function is written inside a switch > statement, with 'break' then used as the equivalent of "raise Exception". For > example: > > PyObject* switchAsTry() > { > switch(0) { > default: > /* Real function body goes here */ > return result; > } > /* Error cleanup code goes here */ > return NULL; > } > > It avoids the potential for labelling problems that arises when goto's are > used for resource cleanup. It's a far cry from real exception handling, but > it's the best solution I've seen within the limits of C. <delurk>

    do {
        ....
        ....
    } while (0);

Same benefit and saves some typing :) Now back to my usual hiding place. </delurk> -- { Marek Baczyński :: UIN 57114871 :: GG 161671 :: JID imbaczek at jabber.gda.pl } { http://www.vlo.ids.gda.pl/ | imbaczek at poczta fm | http://www.promode.org } .. .. .. .. ... ... ...... evolve or face extinction ...... ... ... .. .. .. .. From ncoghlan at gmail.com Tue Nov 15 13:26:54 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 15 Nov 2005 22:26:54 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <5f3d2c310511150422x3e2d670r@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <5f3d2c310511150422x3e2d670r@mail.gmail.com> Message-ID: <4379D40E.9050002@gmail.com> Marek Baczek Baczyński wrote: > 2005/11/15, Nick Coghlan <ncoghlan at iinet.net.au>: >> It avoids the potential for labelling problems that arises when goto's are >> used for resource cleanup.
It's a far cry from real exception handling, but >> it's the best solution I've seen within the limits of C. > > <delurk> > do { > .... > .... > } while (0); > > > Same benefit and saves some typing :) Heh. Good point. I spend so much time working with a certain language I tend to forget do/while loops exist ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From krumms at gmail.com Tue Nov 15 14:17:13 2005 From: krumms at gmail.com (Thomas Lee) Date: Tue, 15 Nov 2005 23:17:13 +1000 Subject: [Python-Dev] PEP 341 patch & memory management (was: Memory management in the AST parser & compiler) In-Reply-To: <4379D40E.9050002@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <5f3d2c310511150422x3e2d670r@mail.gmail.com> <4379D40E.9050002@gmail.com> Message-ID: <4379DFD9.3010209@gmail.com> Interesting trick! The PEP 341 patch is now using Marek's 'do ... while' resource cleanup trick instead of the nasty goto voodoo. I've also fixed the last remaining bug that Neal pointed out. I'm running the unit tests right now, shall have the updated (and hopefully final) PEP 341 patch up on sourceforge within the next 15 minutes. If anybody has feedback/suggestions for the patch, please let me know. I'm new to this stuff, so I'm still finding my way around :) Cheers, Tom Nick Coghlan wrote: >Marek Baczek Baczyński wrote: > > >>2005/11/15, Nick Coghlan <ncoghlan at iinet.net.au>: >> >> >>>It avoids the potential for labelling problems that arises when goto's are >>>used for resource cleanup. It's a far cry from real exception handling, but >>>it's the best solution I've seen within the limits of C. >>> >>> >><delurk> >>do { >> .... >> .... >>} while (0); >> >> >>Same benefit and saves some typing :) >> >> > >Heh. Good point. I spend so much time working with a certain language I tend >to forget do/while loops exist ;) > >Cheers, >Nick.
> > > From mwh at python.net Tue Nov 15 18:29:21 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 15 Nov 2005 17:29:21 +0000 Subject: [Python-Dev] Gothenburg PyPy Sprint II: 7th - 11th December 2005 Message-ID: <2mfypxyg3y.fsf@starship.python.net> Gothenburg PyPy Sprint II: 7th - 11th December 2005 ====================================================== (NOTE: internal EU-only sprint starts on the 5th!) The next PyPy sprint is scheduled to be in December 2005 in Gothenburg, Sweden. Its main focus is heading towards phase 2, which means JIT work, alternate threading models and logic programming (but there are also other possible topics). We'll give newcomer-friendly introductions. To learn more about the new PyPy Python-in-Python implementation look here: http://codespeak.net/pypy Goals and topics of the sprint ------------------------------ We have released pypy-0.8.0_, which is officially a "research base" for future work. The goal of the Gothenburg sprint is to start exploring new directions and continue in the directions started at the Paris sprint. The currently scheduled main topics are: - The L3 interpreter, a small fast interpreter for "assembler-level" flow graphs. This is heading towards JIT work. - Stackless: write an app-level interface, which might be either Tasklets, as in "Stackless CPython", or the more limited Greenlets. - Porting C modules from CPython. (_socket is not finished) - Optimization/debugging work in general. In particular our thread support is far from stable at the moment and unaccountably slow. - Experimentation: logic programming in Python. A first step might be to try to add logic variables to PyPy. .. _`pypy-0.8.0`: http://codespeak.net/pypy/dist/pypy/doc/release-0.8.0.html Location & Accommodation ------------------------ The sprint will be held in the apartment of Laura Creighton and Jacob Halen which is in Gotabergsgatan 22. The location is central in Gothenburg.
It is between the tram_ stops of Vasaplatsen and Valand, where many lines call. .. _tram: http://www.vasttrafik.se Probably cheapest and not too far away is to book accommodation at `SGS Veckobostader`_. (You can have a 10% discount there; ask in the pypy-sprint mailing list for details. We also have some possibilities of free accommodation.) .. _`SGS Veckobostader`: http://www.sgsveckobostader.com Exact times ----------- The public PyPy sprint is held Wednesday 7th - Sunday 11th December 2005. There is a sprint for people involved with the EU part of the project on the two days before the "official" sprint. Hours will be from 10:00 until people have had enough. It's a good idea to arrive a day before the sprint starts and leave a day later. In the middle of the sprint there usually is a break day and it's usually ok to take half-days off if you feel like it. Network, Food, currency ------------------------ Sweden is not part of the Euro zone. One SEK (krona in singular, kronor in plural) is roughly 1/10th of a Euro (9.15 SEK to 1 Euro). The venue is central in Gothenburg. There is a large selection of places to get food around, from edible-and-cheap to outstanding. You normally need a wireless network card to access the network, but we can provide a wireless/ethernet bridge. Sweden uses the same kind of plugs as Germany. 230V AC. Registration etc.pp. -------------------- Please subscribe to the `PyPy sprint mailing list`_, introduce yourself and post a note that you want to come. Feel free to ask any questions there! There also is a separate `Gothenburg people`_ page tracking who is already thought to come. If you have commit rights on codespeak then you can modify yourself a checkout of http://codespeak.net/svn/pypy/extradoc/sprintinfo/gothenburg-2005/people.txt .. _`PyPy sprint mailing list`: http://codespeak.net/mailman/listinfo/pypy-sprint ..
_`Gothenburg people`: http://codespeak.net/pypy/extradoc/sprintinfo/gothenburg-2005/people.html Cheers, mwh -- Speaking from personal experience, I can attest that the barrel of any firearm (or black powder weapon) pointed at one appears large enough to walk down, hands at full extension above the head, without touching the top. -- Mike Andrews, asr From niko at alum.mit.edu Tue Nov 15 18:50:53 2005 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 15 Nov 2005 18:50:53 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <4379AAD7.2050506@iinet.net.au> References: <4379AAD7.2050506@iinet.net.au> Message-ID: <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> > As Neal pointed out, it's tricky to write code for the AST parser > and compiler > without accidentally letting memory leak when the parser or > compiler runs into > a problem and has to bail out on whatever it was doing. Thomas's > patch got to > v5 (based on Neal's review comments) with memory leaks still in it, > my review > got rid of some of them, and we think Neal's last review of v6 of > the patch > got rid of the last of them. Another lurker's 2 cents: My experience with compilers in particular is that an arena is the way to go for memory management. I haven't looked at the AST code, but this can take a variety of forms: anything from linked lists of pointers to free, to something which allocates memory in large blocks and parcels them out. The goal is just to be able to free the memory en masse whatever happens and not have to track individual pointers. Generally, compilers have memory allocations which operate in phases and so are very amenable to arenas. You might have one memory pool for long lived representation, one that is freed and recreated between passes, etc. If you need to keep the AST around long term, then a mark-sweep garbage collector combined with a linked list might even be a good idea.
Obviously, the whole thing is a tradeoff of peak memory size (which goes up) against correctness (which is basically ensured, and at least easily auditable). Niko From jeremy at alum.mit.edu Tue Nov 15 20:42:13 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 15 Nov 2005 14:42:13 -0500 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> Message-ID: <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> On 11/15/05, Niko Matsakis <niko at alum.mit.edu> wrote: > > As Neal pointed out, it's tricky to write code for the AST parser > > and compiler > > without accidentally letting memory leak when the parser or > > compiler runs into > > a problem and has to bail out on whatever it was doing. Thomas's > > patch got to > > v5 (based on Neal's review comments) with memory leaks still in it, > > my review > > got rid of some of them, and we think Neal's last review of v6 of > > the patch > > got rid of the last of them. > > Another lurker's 2 cents: > > My experience with compilers in particular is that an arena is the > way to go for memory management. I haven't looked at the AST code, > but this can take a variety of forms: anything from linked lists of > pointers to free from something which allocates memory in large > blocks and parcels them out. The goal is just to be able to free the > memory en-masse whatever happens and not have to track individual > pointers. Thanks for the message. I was going to suggest the same thing. I think it's primarily a question of how to add an arena layer. The AST phase has a mixture of malloc/free and Python object allocation. It should be straightforward to change the malloc/free code to use an arena API. We'd probably need a separate mechanism to associate a set of PyObject* with the arena and have those DECREFed. 
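The arena layer described in the last two messages — grab large blocks, parcel them out, free everything en masse on the bail-out path — might look roughly like the following. The names (`arena_new`, `arena_malloc`, `arena_free_all`) and the 8 KB block size are illustrative assumptions only, not an API that was actually proposed here:

```c
#include <stdlib.h>
#include <stddef.h>

typedef struct block {
    struct block *next;
    size_t used;
    size_t size;
    char mem[];                        /* C99 flexible array member */
} block;

typedef struct arena {
    block *head;
} arena;

#define BLOCK_SIZE 8192

static block *block_new(size_t size)
{
    block *b = malloc(sizeof(block) + size);
    if (b == NULL)
        return NULL;
    b->next = NULL;
    b->used = 0;
    b->size = size;
    return b;
}

arena *arena_new(void)
{
    arena *a = malloc(sizeof(arena));
    if (a == NULL)
        return NULL;
    a->head = block_new(BLOCK_SIZE);
    if (a->head == NULL) {
        free(a);
        return NULL;
    }
    return a;
}

/* Parcel out memory from the current block; start a new block when full. */
void *arena_malloc(arena *a, size_t n)
{
    n = (n + 7) & ~(size_t)7;          /* keep allocations 8-byte aligned */
    if (n > a->head->size - a->head->used) {
        block *b = block_new(n > BLOCK_SIZE ? n : BLOCK_SIZE);
        if (b == NULL)
            return NULL;
        b->next = a->head;
        a->head = b;
    }
    void *p = a->head->mem + a->head->used;
    a->head->used += n;
    return p;
}

/* Free everything at once -- no individual pointer tracking needed. */
void arena_free_all(arena *a)
{
    block *b = a->head;
    while (b != NULL) {
        block *next = b->next;
        free(b);
        b = next;
    }
    free(a);
}
```

With something like this, error handling in the compiler collapses to a single `arena_free_all(a);` on the way out, no matter where the bail-out happens.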
Jeremy > > Generally, compilers have memory allocations which operate in phases > and so are very amenable to arenas. You might have one memory pool > for long lived representation, one that is freed and recreated > between passes, etc. > > If you need to keep the AST around long term, then a mark-sweep > garbage collector combined with a linked list might even be a good idea. > > Obviously, the whole thing is a tradeoff of peak memory size (which > goes up) against correctness (which is basically ensured, and at > least easily auditable). > > > Niko > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From bcannon at gmail.com Tue Nov 15 20:48:51 2005 From: bcannon at gmail.com (Brett Cannon) Date: Tue, 15 Nov 2005 11:48:51 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> Message-ID: <bbaeab100511151148i704bdbb3oabc1b7b5dd509a67@mail.gmail.com> On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote: > On 11/15/05, Niko Matsakis <niko at alum.mit.edu> wrote: > > > As Neal pointed out, it's tricky to write code for the AST parser > > > and compiler > > > without accidentally letting memory leak when the parser or > > > compiler runs into > > > a problem and has to bail out on whatever it was doing. Thomas's > > > patch got to > > > v5 (based on Neal's review comments) with memory leaks still in it, > > > my review > > > got rid of some of them, and we think Neal's last review of v6 of > > > the patch > > > got rid of the last of them. 
> > > > Another lurker's 2 cents: > > > > My experience with compilers in particular is that an arena is the > > way to go for memory management. I haven't looked at the AST code, > > but this can take a variety of forms: anything from linked lists of > > pointers to free from something which allocates memory in large > > blocks and parcels them out. The goal is just to be able to free the > > memory en-masse whatever happens and not have to track individual > > pointers. > > Thanks for the message. I was going to suggest the same thing. I > think it's primarily a question of how to add an arena layer. The AST > phase has a mixture of malloc/free and Python object allocation. It > should be straightforward to change the malloc/free code to use an > arena API. We'd probably need a separate mechanism to associate a set > of PyObject* with the arena and have those DECREFed. > Might just need two lists; malloc'ed pointers and PyObject pointers. Could redefine Py_INCREF and Py_DECREF locally for ast.c and compile.c to use the arena API and thus hide the detail. Otherwise just a big, flashing "USE THIS API" sign will be needed. I have gone ahead and added this as a possible topic to sprint on at PyCon. -Brett > Jeremy > > > > > Generally, compilers have memory allocations which operate in phases > > and so are very amenable to arenas. You might have one memory pool > > for long lived representation, one that is freed and recreated > > between passes, etc. > > > > If you need to keep the AST around long term, then a mark-sweep > > garbage collector combined with a linked list might even be a good idea. > > > > Obviously, the whole thing is a tradeoff of peak memory size (which > > goes up) against correctness (which is basically ensured, and at > > least easily auditable). 
> > > > > > Niko > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From nnorwitz at gmail.com Tue Nov 15 22:57:05 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 15 Nov 2005 13:57:05 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> Message-ID: <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote: > > Thanks for the message. I was going to suggest the same thing. I > think it's primarily a question of how to add an arena layer. The AST > phase has a mixture of malloc/free and Python object allocation. It > should be straightforward to change the malloc/free code to use an > arena API. We'd probably need a separate mechanism to associate a set > of PyObject* with the arena and have those DECREFed. Well good. It seems we all agree there is a problem and on the general solution. I haven't thought about Brett's idea to see if it could work or not. It would be great if we had someone start working to improve the situation. It could well be that we live with the current code for 2.5, but it would be great to use arenas for 2.6 at least. Niko, Marek, how would you like to lose your lurker status? 
;-) n From noamraph at gmail.com Wed Nov 16 00:34:19 2005 From: noamraph at gmail.com (Noam Raphael) Date: Wed, 16 Nov 2005 01:34:19 +0200 Subject: [Python-Dev] str.dedent In-Reply-To: <43791442.8050109@v.loewis.de> References: <dga72k$cah$1@sea.gmane.org> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <43777B5A.6030602@egenix.com> <200511140920.51724.gmccaughan@synaptics-uk.com> <437869DD.7040800@egenix.com> <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com> <dlaqds$8sb$1@sea.gmane.org> <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com> <43791442.8050109@v.loewis.de> Message-ID: <b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com> Thanks for your examples. I understand that sometimes it's a good idea not to write the HTML inside the function (although it may be nice to sometimes write it just before the function - and if it's a method, then we get the same indentation problem.) However, as you said, sometimes it is desired to write multilined strings inside functions. You think it's ok to add white spaces to the HTML code; I personally prefer not to add varying indentation to my output according to the level of indentation of the code that generated it. I just wanted to add another use case: long messages. Consider those lines from idlelib/run.py:133 msg = "IDLE's subprocess can't connect to %s:%d. This may be due "\ "to your personal firewall configuration. It is safe to "\ "allow this internal connection because no data is visible on "\ "external ports." % address tkMessageBox.showerror("IDLE Subprocess Error", msg, parent=root) and from idlelib/PyShell.py:734: def display_port_binding_error(self): tkMessageBox.showerror( "Port Binding Error", "IDLE can't bind TCP/IP port 8833, which is necessary to " "communicate with its Python execution server. Either " "no networking is installed on this computer or another " "process (another IDLE?) is using the port.
Run IDLE with the -n " "command line switch to start without a subprocess and refer to " "Help/IDLE Help 'Running without a subprocess' for further " "details.", master=self.tkconsole.text) I know, of course, that it could be written using textwrap.dedent, but I think that not having to load a module will encourage the use of dedent; if I have to load a module, I might say, "oh, I can live with all those marks around the text, there's no need for another module", and then, any time I want to change that message, I have a lot of editing work to do. Noam From Scott.Daniels at Acm.Org Wed Nov 16 00:31:45 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Tue, 15 Nov 2005 15:31:45 -0800 Subject: [Python-Dev] Behavoir question. Message-ID: <dldr4m$n0v$1@sea.gmane.org> Since I am fiddling with int/long conversions to/from string: Is the current behavior intentional (or mandatory?): v = int(' 55555555555555555555555555555555555555555 ') works, but: v = int(' 55555555555555555555555555555555555555555L ') fails. --Scott David Daniels Scott.Daniels at Acm.Org From mdehoon at c2b2.columbia.edu Wed Nov 16 00:48:43 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Tue, 15 Nov 2005 18:48:43 -0500 Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com> References: <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> <4378DEE8.70109@c2b2.columbia.edu> <b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com> <dlb30h$771$1@sea.gmane.org> <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com> Message-ID: <437A73DB.9000705@c2b2.columbia.edu> Thanks everybody for contributing to this discussion. I didn't expect it to become this extensive. I think that by now, everybody has had their chance to voice their opinion. 
It seems safe to conclude that there is no consensus on this topic. So what I'm planning to do is to write a small extension module that implements some of the ideas that came up in this discussion, and see how they perform in the wild. It will give us an idea of what works, what doesn't, and what the user demand is for such functionality, and will help us if this issue happens to turn up again at some point in the future. Thanks again, --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From pje at telecommunity.com Wed Nov 16 01:19:38 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 15 Nov 2005 19:19:38 -0500 Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <437A73DB.9000705@c2b2.columbia.edu> References: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com> <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> <4378DEE8.70109@c2b2.columbia.edu> <b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com> <dlb30h$771$1@sea.gmane.org> <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com> Message-ID: <5.1.1.6.0.20051115191823.01f1a4c0@mail.telecommunity.com> At 06:48 PM 11/15/2005 -0500, Michiel Jan Laurens de Hoon wrote: >Thanks everybody for contributing to this discussion. I didn't expect it >to become this extensive. >I think that by now, everybody has had their chance to voice their opinion. >It seems safe to conclude that there is no consensus on this topic. Just a question: did you ever try using IPython, and confirm whether it does or does not address the issue you were having? As far as I could tell, you never confirmed or denied that point. 
From mdehoon at c2b2.columbia.edu Wed Nov 16 02:34:02 2005 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Tue, 15 Nov 2005 20:34:02 -0500 Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook, and Tkinter In-Reply-To: <5.1.1.6.0.20051115191823.01f1a4c0@mail.telecommunity.com> References: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com> <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> <4378DEE8.70109@c2b2.columbia.edu> <b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com> <dlb30h$771$1@sea.gmane.org> <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com> <5.1.1.6.0.20051115191823.01f1a4c0@mail.telecommunity.com> Message-ID: <437A8C8A.808@c2b2.columbia.edu> Phillip J. Eby wrote: > At 06:48 PM 11/15/2005 -0500, Michiel Jan Laurens de Hoon wrote: > >> Thanks everybody for contributing to this discussion. I didn't expect it >> to become this extensive. >> I think that by now, everybody has had their chance to voice their >> opinion. >> It seems safe to conclude that there is no consensus on this topic. > > > Just a question: did you ever try using IPython, and confirm whether > it does or does not address the issue you were having? As far as I > could tell, you never confirmed or denied that point. > Yes I did try IPython. First of all, IPython, being pure Python code, does not affect the underlying Python's loop (at the C level). So just running Python through IPython does not fix our event loop problem. On Windows, for example, after importing IPython into IDLE (which most of our users will want to use), our graphics window still freezes. This leaves us with the possibility of using IPython's event loop, which it runs on top of regular Python. 
But if we use that, we'd either have to convince all our users to switch to IPython (which is impossible) or we have to maintain two mechanisms to hook our extension module into the event loop: one for Python and one for IPython. There are several other reasons why the alternative solutions that came up in this discussion are more attractive than IPython: 1) AFAICT, IPython is not intended to work with IDLE. 2) I didn't get the impression that the IPython developers understand why and how their event loop works very well (which made it hard to respond to their posts). I am primarily interested in understanding the problem first and then come up with a suitable mechanism for events. Without such understanding, IPython's event loop smells too much like a hack. 3) IPython adds another layer on top of Python. For IPython's purpose, that's fine. But if we're just interested in event loops, I think it is hard to argue that another layer is absolutely necessary. So rather than setting up an event loop in a layer on top of Python, I'd prefer to find a solution within the context of Python itself (be it threads, an event loop, or PyOS_InputHook). 4) Call me a sentimental fool, but I just happen to like regular Python. My apologies in advance to the IPython developers if I misunderstood how it works. --Michiel. 
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From fperez.net at gmail.com Wed Nov 16 03:03:52 2005 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 15 Nov 2005 19:03:52 -0700 Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook, and Tkinter References: <b348a0850511141439p3f0f4cdbp5d7332b1d1224f19@mail.gmail.com> <4373A214.6060201@v.loewis.de> <4377D97E.9060507@c2b2.columbia.edu> <20051113230400.A403.JCARLSON@uci.edu> <AE21F850-277C-43B5-89F8-60BA2B824F59@mac.com> <4378DEE8.70109@c2b2.columbia.edu> <b348a0850511141212o12556119jd2be06f9444b3d1b@mail.gmail.com> <dlb30h$771$1@sea.gmane.org> <5.1.1.6.0.20051115191823.01f1a4c0@mail.telecommunity.com> <437A8C8A.808@c2b2.columbia.edu> Message-ID: <dle429$d5d$1@sea.gmane.org> Michiel Jan Laurens de Hoon wrote: > There are several other reasons why the alternative solutions that came > up in this discussion are more attractive than IPython: > 1) AFAICT, IPython is not intended to work with IDLE. Not so far, but mostly by accident. The necessary changes are fairly easy (mainly abstracting out assumptions about being in a tty). I plan on making ipython embeddable inside any GUI (including IDLE), as there is much demand for that. > 2) I didn't get the impression that the IPython developers understand > why and how their event loop works very well (which made it hard to > respond to their posts). I am primarily interested in understanding the > problem first and then come up with a suitable mechanism for events. > Without such understanding, IPython's event loop smells too much like a > hack. I said I did get that code off the ground by stumbling in the dark, but I tried to explain to you what it does, which is pretty simple: a. You find, for each toolkit, what its timer/idle mechanism is. This requires reading a little about each toolkit's API, as they all do it slightly differently. 
But the principle is always the same, only the implementation details change. b. You subclass threading.Thread, as you do for all threading code. The run method of this class manages a one-entry queue where code is put for execution from stdin. c. The timer you set up with the info from (a) calls the method which executes the code object from the queue in (b), with suitable locking. That's pretty much it. Following this same idea, just this week I implemented an ipython-for-OpenGL shell. All I had to do was look up what OpenGL uses for an idle callback. > 3) IPython adds another layer on top of Python. For IPython's purpose, > that's fine. But if we're just interested in event loops, I think it is > hard to argue that another layer is absolutely necessary. So rather than > setting up an event loop in a layer on top of Python, I'd prefer to find > a solution within the context of Python itself (be it threads, an event > loop, or PyOS_InputHook). I gave you a link to a 200 line script which implements the core idea for GTK without any ipython at all. I explained that in my message. I don't know how to be more specific, ipython-independent or clear with you. > 4) Call me a sentimental fool, but I just happen to like regular Python. That's fine. I'd argue that ipython is exceptionally useful in a scientific computing workflow, but I'm obviously biased. Many others in the scientific community seem to agree with me, though, given the frequency of ipython prompts in posts to the scientific computing lists. But this is free software in a free world: use whatever you like. All I'm interested in is in clarifying a technical issue, not in evangelizing ipython; that's why I gave you a link to a non-ipython example which implements the key idea using only the standard python library. > My apologies in advance to the IPython developers if I misunderstood how > it works. No problem. 
But your posts so far seem to indicate you hardly read what I said, as I've had to repeat several key points over and over (the non-ipython solutions, for example). Cheers, f From ironfroggy at gmail.com Wed Nov 16 08:16:49 2005 From: ironfroggy at gmail.com (Calvin Spealman) Date: Wed, 16 Nov 2005 02:16:49 -0500 Subject: [Python-Dev] Behavoir question. In-Reply-To: <dldr4m$n0v$1@sea.gmane.org> References: <dldr4m$n0v$1@sea.gmane.org> Message-ID: <76fd5acf0511152316g68164f7em1f4fac0fc4b1d976@mail.gmail.com> On 11/15/05, Scott David Daniels <Scott.Daniels at acm.org> wrote: > Since I am fiddling with int/long conversions to/from string: > > Is the current behavior intentional (or mandatory?): > > v = int(' 55555555555555555555555555555555555555555 ') > works, but: > v = int(' 55555555555555555555555555555555555555555L ') > fails. > > --Scott David Daniels > Scott.Daniels at Acm.Org int(s) works where s is a string representing a number. 10L does not represent a number directly, but is Python syntax for making an integer constant a long, and not an int. (Consider that both are representations of mathematical integers, tho in python we only call one of them an integer by terminology). So, what you're asking is like if list('[1,2]') returned [1,2]. If you need this functionality, maybe you need a regex match and expr(). From oliphant at ee.byu.edu Wed Nov 16 08:20:47 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed, 16 Nov 2005 00:20:47 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager Message-ID: <437ADDCF.7080906@ee.byu.edu> I know (thanks to Google) that much has been said in the past about the Python Memory Manager. My purpose in posting is simply to give a use-case example of how the current memory manager (in Python 2.4.X) can be problematic in scientific/engineering code. Scipy core is a replacement for Numeric.
One of the things scipy core does is define a new python scalar object for every data type that an array can have (currently 21). This has many advantages and is made feasible by the ability of Python to subtype in C. These scalars all inherit from the standard Python types where there is a correspondence. More to the point, however, these scalar objects were allocated using the standard PyObject_New and PyObject_Del functions which of course use the Python memory manager. One user ported his (long-running) code to the new scipy core and found much to his dismay that what used to consume around 100MB now completely dominated his machine, consuming up to 2GB of memory after only a few iterations. After searching many hours for memory leaks in scipy core (not a bad exercise anyway as some were found), the real problem was tracked to the fact that his code ended up creating and destroying many of these new array scalars. The Python memory manager was not reusing memory (even though PyObject_Del was being called). I don't know enough about the memory manager to understand why that was happening. However, changing the allocation from PyObject_New to malloc and from PyObject_Del to free fixed the problems this user was seeing. Now the code runs for a long time consuming only around 100MB at a time. Thus, all of the objects in scipy core now use system malloc and system free for their memory needs. Perhaps this is unfortunate, but it was the only solution I could see in the short term. In the long term, what is the status of plans to re-work the Python memory manager to free memory that it acquires (or improve the detection of already freed memory locations)? I see from other postings that this has been a problem for other people as well. Also, is there a recommended way for dealing with this problem other than using system malloc and system free (or I suppose writing your own specialized memory manager)?
Thanks for any feedback, -Travis Oliphant From bcannon at gmail.com Wed Nov 16 09:56:41 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 16 Nov 2005 00:56:41 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> Message-ID: <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> On 11/15/05, Neal Norwitz <nnorwitz at gmail.com> wrote: > On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote: > > > > Thanks for the message. I was going to suggest the same thing. I > > think it's primarily a question of how to add an arena layer. The AST > > phase has a mixture of malloc/free and Python object allocation. It > > should be straightforward to change the malloc/free code to use an > > arena API. We'd probably need a separate mechanism to associate a set > > of PyObject* with the arena and have those DECREFed. > > Well good. It seems we all agree there is a problem and on the > general solution. I haven't thought about Brett's idea to see if it > could work or not. It would be great if we had someone start working > to improve the situation. It could well be that we live with the > current code for 2.5, but it would be great to use arenas for 2.6 at > least. > I have been thinking about this some more to put off doing homework and I have some random ideas I just wanted to toss out there to make sure I am not thinking about arena memory management incorrectly (never actually encountered it directly before). I think an arena API is going to be the best solution. Pulling trickery with redefining Py_INCREF and such like I suggested seems like a pain and possibly error-prone. 
With the compiler being a specific corner of the core, having a special API for handling the memory for PyObject* stuff seems reasonable. We might need PyArena_Malloc() and PyArena_New() to handle malloc() and PyObject* creation. We could then have a struct that just stored pointers to the allocated memory (linked list for each pointer which gives high memory overhead, or linked list of arrays that should lower memory but make having possible holes in the array for stuff already freed a pain to handle). We would then have PyArena_FreeAll() that would be strategically placed in the code for when bad things happen that would just traverse the lists and free everything. I assume having a way to free individual items might be useful. Could have the PyArena_New() and _Malloc() return structs with the needed info for a PyArena_Free(location_struct) to be able to free the specific item without triggering a complete freeing of all memory. But this usage should be discouraged and only used when proper memory management is guaranteed. Boy am I wanting RAII from C++ for automatic freeing when scope is left. Maybe we need to come up with a similar thing, like all memory that should be freed once a scope is left must use some special struct that stores references to all created memory locally and then a free call must be made at all exit points in the function using the special struct. Otherwise the pointer is stored in the arena and handled en masse later. Hopefully this all made some sense. =) Is this the basic strategy that an arena setup would need? If not, can someone enlighten me?
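The PyArena_FreeAll() idea above — record every pointer the arena hands out, then walk the list and release everything when bad things happen — can be sketched generically. To keep the sketch self-contained it stores a destructor callback with each pointer (`free` for raw memory; in CPython the callback for object pointers would be `Py_DECREF`); all of the `pyarena_*` names here are hypothetical, not a real CPython API:

```c
#include <stdlib.h>
#include <stddef.h>

typedef void (*destructor_fn)(void *);

typedef struct tracked {
    struct tracked *next;
    void *ptr;
    destructor_fn destroy;   /* free() for raw memory; Py_DECREF for objects */
} tracked;

typedef struct pyarena {
    tracked *head;           /* simple linked list of everything handed out */
} pyarena;

pyarena *pyarena_new(void)
{
    return calloc(1, sizeof(pyarena));
}

/* Record a pointer so pyarena_free_all() can release it later. */
int pyarena_track(pyarena *a, void *p, destructor_fn destroy)
{
    tracked *t = malloc(sizeof(tracked));
    if (t == NULL)
        return -1;
    t->ptr = p;
    t->destroy = destroy;
    t->next = a->head;
    a->head = t;
    return 0;
}

/* Convenience: allocate raw memory owned by the arena. */
void *pyarena_malloc(pyarena *a, size_t n)
{
    void *p = malloc(n);
    if (p == NULL || pyarena_track(a, p, free) < 0) {
        free(p);
        return NULL;
    }
    return p;
}

/* The "bad things happened" hammer: release everything in one call. */
void pyarena_free_all(pyarena *a)
{
    tracked *t = a->head;
    while (t != NULL) {
        tracked *next = t->next;
        t->destroy(t->ptr);
        free(t);
        t = next;
    }
    free(a);
}
```

This is the high-memory-overhead linked-list variant mentioned above; the linked-list-of-arrays variant would trade the per-pointer node for chunked tables.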
-Brett From krumms at gmail.com Wed Nov 16 10:49:50 2005 From: krumms at gmail.com (Thomas Lee) Date: Wed, 16 Nov 2005 19:49:50 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> Message-ID: <437B00BE.7060007@gmail.com> As the writer of the crappy code that sparked this conversation, I feel I should say something :) Brett Cannon wrote: >On 11/15/05, Neal Norwitz <nnorwitz at gmail.com> wrote: > > >>On 11/15/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote: >> >> >>>Thanks for the message. I was going to suggest the same thing. I >>>think it's primarily a question of how to add an arena layer. The AST >>>phase has a mixture of malloc/free and Python object allocation. It >>>should be straightforward to change the malloc/free code to use an >>>arena API. We'd probably need a separate mechanism to associate a set >>>of PyObject* with the arena and have those DECREFed. >>> >>> >>Well good. It seems we all agree there is a problem and on the >>general solution. I haven't thought about Brett's idea to see if it >>could work or not. It would be great if we had someone start working >>to improve the situation. It could well be that we live with the >>current code for 2.5, but it would be great to use arenas for 2.6 at >>least. >> >> >> > > I have been thinking about this some more to put off doing homework >and I have some random ideas I just wanted to toss out there to make >sure I am not thinking about arena memory management incorrectly >(never actually encountered it directly before). > >I think an arena API is going to be the best solution. 
Pulling >trickery with redefining Py_INCREF and such like I suggested seems >like a pain and possibly error-prone. With the compiler being a >specific corner of the core having a special API for handling the >memory for PyObject* stuff seems reasonable. > > > I agree. And it raises the learning curve for poor saps like myself. :) >We might need PyArena_Malloc() and PyArena_New() to handle malloc() >and PyObject* creation. We could then have a struct that just stored >pointers to the allocated memory (linked list for each pointer which >gives high memory overhead or linked list of arrays that should lower >memory but make having possible holes in the array for stuff already >freed a pain to handle). We would then have PyArena_FreeAll() that >would be strategically placed in the code for when bad things happen >that would just traverse the lists and free everything. I assume >having a way to free individual items might be useful. Could have the >PyArena_New() and _Malloc() return structs with the needed info for a >PyArena_Free(location_struct) to be able to free the specific item >without triggering a complete freeing of all memory. But this usage >should be discouraged and only used when proper memory management is >guaranteed. > > > An arena/pool (as I understood it from my quick skim) for the AST would probably best be implemented (IMHO) as an ADT based on a linked-list:

typedef struct _ast_pool_node {
    struct _ast_pool_node *next;
    PyObject *object;   /* == NULL when data != NULL */
    void *data;         /* == NULL when object != NULL */
} ast_pool_node;

deallocating a node could then be as simple as:

/* ast_pool_node *n */
Py_XDECREF(n->object);   /* object may be NULL, and it needs a DECREF
                            rather than a raw free */
if (n->data != NULL)
    free(n->data);
/* save n->next */
free(n);
/* then go on to free n->next */

I haven't really thought all that deeply about this, so somebody shoot me down if I'm completely off-base (Neal? :D).
Every allocation of a seq/stmt within ast.c would have its memory saved to the pool within the function it's allocated in. Then before we return, we can just deallocate the pool/arena/whatever you want to call it. The problem with this is that should we get to the end of the function and everything actually went okay (i.e. we return non-NULL), we then have to run through and deallocate all the nodes anyway (without deallocating n->object or n->data). Bah. Maybe we *would* be better off with a monolithic cleanup. I don't know. >Boy am I wanting RAII from C++ for automatic freeing when scope is >left. Maybe we need to come up with a similar thing, like all memory >that should be freed once a scope is left must use some special struct >that stores references to all created memory locally and then a free >call must be made at all exit points in the function using the special >struct. Otherwise the pointer is stored in the arena and handled >en-mass later. > > > Which is basically what I just rambled on about up above, I think :) >Hopefully this is all made some sense. =) Is this the basic strategy >that an arena setup would need? if not can someone enlighten me? 
> > >-Brett >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/krumms%40gmail.com > > > From ncoghlan at gmail.com Wed Nov 16 11:33:22 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 16 Nov 2005 20:33:22 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <4379D40E.9050002@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <5f3d2c310511150422x3e2d670r@mail.gmail.com> <4379D40E.9050002@gmail.com> Message-ID: <437B0AF2.7010400@gmail.com> Nick Coghlan wrote: > Marek Baczek Baczyński wrote: >> 2005/11/15, Nick Coghlan <ncoghlan at iinet.net.au>: >>> It avoids the potential for labelling problems that arises when goto's are >>> used for resource cleanup. It's a far cry from real exception handling, but >>> it's the best solution I've seen within the limits of C. >> <delurk>
>> do {
>>     ....
>>     ....
>> } while (0);
>>
>> Same benefit and saves some typing :)
> Heh. Good point. I spend so much time working with a certain language I tend > to forget do/while loops exist ;) Thomas actually tried doing things this way, and the parser/compiler code needs to use loops, which means this trick won't work reliably. So we'll need to do something smarter (such as the arena idea) to deal with the memory allocation problem. Cheers, Nick.
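For readers who haven't met the <delurk> trick above: it gives a single cleanup path without goto, because "break" jumps straight past the remaining work. A hypothetical allocation-heavy function (the function and buffer names are invented for illustration):

```c
#include <stdlib.h>

/* Single-exit cleanup via do/while (0): every failure "break"s to the
   shared cleanup code at the bottom. Returns 1 on success, 0 on failure. */
static int process(size_t n)
{
    char *buf1 = NULL, *buf2 = NULL;
    int ok = 0;
    do {
        buf1 = malloc(n);
        if (buf1 == NULL)
            break;          /* jump to cleanup */
        buf2 = malloc(n);
        if (buf2 == NULL)
            break;
        /* ... real work would go here ... */
        ok = 1;
    } while (0);
    /* Nick's caveat applies: inside a genuine loop, "break" exits that
       loop instead, so this trick can't wrap looping parser code. */
    free(buf2);
    free(buf1);
    return ok;
}
```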
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From skip at pobox.com Wed Nov 16 11:59:03 2005 From: skip at pobox.com (skip@pobox.com) Date: Wed, 16 Nov 2005 04:59:03 -0600 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437ADDCF.7080906@ee.byu.edu> References: <437ADDCF.7080906@ee.byu.edu> Message-ID: <17275.4343.724248.625173@montanaro.dyndns.org> Travis> More to the point, however, these scalar objects were allocated Travis> using the standard PyObject_New and PyObject_Del functions which Travis> of course use the Python memory manager. One user ported his Travis> (long-running) code to the new scipy core and found much to his Travis> dismay that what used to consume around 100MB now completely Travis> dominated his machine consuming up to 2GB of memory after only a Travis> few iterations. After searching many hours for memory leaks in Travis> scipy core (not a bad exercise anyway as some were found), the Travis> real problem was tracked to the fact that his code ended up Travis> creating and destroying many of these new array scalars. What Python object were his array elements a subclass of? Travis> In the long term, what is the status of plans to re-work the Travis> Python Memory manager to free memory that it acquires (or Travis> improve the detection of already freed memory locations). None that I'm aware of. It's seen a great deal of work in the past and generally doesn't cause problems. Maybe your user's usage patterns were a bad corner case. It's hard to tell without more details. 
Skip From niko at alum.mit.edu Wed Nov 16 12:34:29 2005 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 16 Nov 2005 12:34:29 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> Message-ID: <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> > Boy am I wanting RAII from C++ for automatic freeing when scope is > left. Maybe we need to come up with a similar thing, like all memory > that should be freed once a scope is left must use some special struct > that stores references to all created memory locally and then a free > call must be made at all exit points in the function using the special > struct. Otherwise the pointer is stored in the arena and handled > en-mass later. That made sense. I think I'd be opposed to what you describe here just because I think anything which *requires* that cleanup code be placed on every function is error prone. Depending on how much you care about peak memory usage, you do not necessarily need to worry about freeing pointers as you go. If you can avoid thinking about it, it makes things much simpler. If you are concerned with peak memory usage, it gets more complicated, and you will begin to have greater possibility of user error. The problem is that dynamically allocated memory often outlives the stack frame in which it was created. There are several possibilities: - If you use ref-counted memory, you can add to the ref count of the memory which outlives the stack frame; the problem is knowing when to drop it down again. I think the easiest is to have two lists: one for memory which will go away quickly, and another for more permanent memory. 
The more permanent memory list goes away at the end of the transform and is hopefully rarely used. - Another idea is to have trees of arenas: the idea is that when an arena is created, it is assigned a parent. When an arena is freed, any arenas in its subtree are also freed. This way you can have one master arena for exception handling, but if there is some sub-region where allocations can be grouped together, you create a sub-arena and free it when that region is complete. Note that if you forget to free a sub-arena, it will eventually be freed. There is no one-size-fits-all solution. The right one depends on how memory is used; but I think all of them are much simpler and less error prone than tracking individual pointers. I'd actually be happy to hack on the AST code and try to clean up the memory usage, assuming that the 2.6 release is far enough out that I will have time to squeeze it in among the other things I am doing. Niko From krumms at gmail.com Wed Nov 16 13:05:09 2005 From: krumms at gmail.com (Thomas Lee) Date: Wed, 16 Nov 2005 22:05:09 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> Message-ID: <437B2075.1000102@gmail.com> Niko Matsakis wrote: >>Boy am I wanting RAII from C++ for automatic freeing when scope is >>left. Maybe we need to come up with a similar thing, like all memory >>that should be freed once a scope is left must use some special struct >>that stores references to all created memory locally and then a free >>call must be made at all exit points in the function using the special >>struct.
Otherwise the pointer is stored in the arena and handled >>en-mass later. >> >> > >That made sense. I think I'd be opposed to what you describe here >just because I think anything which *requires* that cleanup code be >placed on every function is error prone. > > > Placing it in every function isn't really the problem: at the moment it's more the fact we have to keep track of too many variables at any given time to properly deallocate it all. Cleanup code gets tricky very fast. Then it gets further complicated by the fact that stmt_ty/expr_ty/mod_ty/etc. deallocate members (usually asdl_seq instances in my experience) - so if a construction takes place, all of a sudden you have to make sure you don't deallocate those members a second time in the cleanup code :S it gets tricky very quickly. Even if it meant we had just one function call - one, safe function call that deallocated all the memory allocated within a function - that we had to put before each and every return, that's better than what we have. Is it the best solution? Maybe not. But that's what we're looking for here I guess :) From krumms at gmail.com Wed Nov 16 13:11:26 2005 From: krumms at gmail.com (Thomas Lee) Date: Wed, 16 Nov 2005 22:11:26 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <437B2075.1000102@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> Message-ID: <437B21EE.5040804@gmail.com> By the way, I liked the sound of the arena/pool tree - really good idea. Thomas Lee wrote: >Niko Matsakis wrote: > > > >>>Boy am I wanting RAII from C++ for automatic freeing when scope is >>>left. 
Maybe we need to come up with a similar thing, like all memory >>>that should be freed once a scope is left must use some special struct >>>that stores references to all created memory locally and then a free >>>call must be made at all exit points in the function using the special >>>struct. Otherwise the pointer is stored in the arena and handled >>>en-mass later. >>> >>> >>> >>> >>That made sense. I think I'd be opposed to what you describe here >>just because I think anything which *requires* that cleanup code be >>placed on every function is error prone. >> >> >> >> >> >Placing it in every function isn't really the problem: at the moment >it's more the fact we have to keep track of too many variables at any >given time to properly deallocate it all. Cleanup code gets tricky very >fast. > >Then it gets further complicated by the fact that >stmt_ty/expr_ty/mod_ty/etc. deallocate members (usually asdl_seq >instances in my experience) - so if a construction takes place, all of a >sudden you have to make sure you don't deallocate those members a second >time in the cleanup code :S it gets tricky very quickly. > >Even if it meant we had just one function call - one, safe function call >that deallocated all the memory allocated within a function - that we >had to put before each and every return, that's better than what we >have. Is it the best solution? Maybe not. 
But that's what we're looking >for here I guess :) From fredrik at pythonware.com Wed Nov 16 13:05:42 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Nov 2005 13:05:42 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com><13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> Message-ID: <dlf7ak$ckg$1@sea.gmane.org> Thomas Lee wrote: > Even if it meant we had just one function call - one, safe function call > that deallocated all the memory allocated within a function - that we > had to put before each and every return, that's better than what we > have. alloca?
(duck) </F> From collinw at gmail.com Wed Nov 16 14:09:23 2005 From: collinw at gmail.com (Collin Winter) Date: Wed, 16 Nov 2005 14:09:23 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> Message-ID: <43aa6ff70511160509y5abdd8a9y4ec8c131e429b4c0@mail.gmail.com> On 11/16/05, Niko Matsakis <niko at alum.mit.edu> wrote: > - Another idea is to have trees of arenas: the idea is that when an > arena is created, it is assigned a parent. When an arena is freed, > an arenas in its subtree are also freed. This way you can have one > master arena for exception handling, but if there is some sub-region > where allocations can be grouped together, you create a sub-arena and > free it when that region is complete. Note that if you forget to > free a sub-arena, it will eventually be freed. You might be able to draw some inspiration from the Apache Portable Runtime. It includes a memory pool management scheme that might be of some interest. 
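The arena-tree idea quoted above might be sketched like this (the data structure and names are invented for illustration; real pool libraries such as APR's also carry per-pool allocation lists and cleanup callbacks, which are omitted here):

```c
#include <stdlib.h>

/* Hypothetical tree of arenas: freeing an arena frees its whole subtree,
   so a forgotten sub-arena is eventually reclaimed by its parent. */
typedef struct arena {
    struct arena *parent;
    struct arena *first_child;
    struct arena *next_sibling;
    /* per-arena allocation list would live here */
} arena;

static int live_arenas = 0;   /* bookkeeping for the example only */

static arena *arena_new(arena *parent)
{
    arena *a = calloc(1, sizeof(arena));
    if (a == NULL)
        return NULL;
    a->parent = parent;
    if (parent != NULL) {
        a->next_sibling = parent->first_child;
        parent->first_child = a;
    }
    live_arenas++;
    return a;
}

static void arena_free(arena *a)
{
    /* unlink from the parent so a later parent free won't touch us twice */
    if (a->parent != NULL) {
        arena **p = &a->parent->first_child;
        while (*p != a)
            p = &(*p)->next_sibling;
        *p = a->next_sibling;
    }
    /* each recursive call unlinks itself from a->first_child */
    while (a->first_child != NULL)
        arena_free(a->first_child);
    /* the arena's own allocation list would be freed here */
    live_arenas--;
    free(a);
}
```

Freeing a sub-arena early releases just its region; freeing the master arena on an error path releases everything that is left.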
The main project page is http://apr.apache.org, with the docs for the mempool API located at http://apr.apache.org/docs/apr/group__apr__pools.html Collin Winter From ncoghlan at gmail.com Wed Nov 16 14:11:02 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 16 Nov 2005 23:11:02 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <437B00BE.7060007@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <437B00BE.7060007@gmail.com> Message-ID: <437B2FE6.7080206@gmail.com> Thomas Lee wrote: > As the writer of the crappy code that sparked this conversation, I feel > I should say something :) Don't feel bad about it. It turned out the 'helpful' review comments from Neal and I didn't originally work out very well either ;) With the AST compiler being so new, this is the first serious attempt to introduce modifications based on it. It's already better than the old CST compiler, but that memory management in the parser is a cow :) >> Hopefully this is all made some sense. =) Is this the basic strategy >> that an arena setup would need? if not can someone enlighten me? I think we need to be explicit about the problems we're trying to solve before deciding on what kind of solution we want :) 1. Cleaning up after failures in symtable.c and compile.c It turns out this is already dealt with in the case of code blocks - the compiler state handles a linked list of blocks which it automatically frees when the compiler state is cleaned up. So the only rule that needs to be followed in these files is to *never* call any of the VISIT_* macros while there is a Python object which requires DECREF'ing, or a C pointer which needs to be freed. 
This rule was being broken in a couple of places in compile.c (with respect to strings). I was the offender in both cases I found - the errors date from when this was still on the ast-branch in CVS. I've fixed those errors in SVN, and added a note to the comment at the top of compile.c, to help others avoid making the same mistake I did. It's fragile in some ways, but it does work. It makes the actual compilation code look clean (because there isn't any cleanup code), but it also makes that code look *wrong* (because the lack of cleanup code makes the calls to "compiler_new_block" look unbalanced), which is a little disconcerting. 2. Parsing a token stream into the AST in ast.c This is the bit that has caused Thomas grief (the PEP 341 patch only needs to modify the front end parser). When building an AST node, each of the contained AST nodes or sequences has to be built first. That means that, if there's a problem with any of the later subnodes, the earlier subnodes need to be freed. The key problem with memory management in this module is that the free method to be invoked is dependent on the nature of the AST node to be freed. In the case of a node sequence, it is dependent on the nature of the contained elements. So not only do you have to remember to free the memory, you have to remember to free it the *right way*. Would it be worth the extra memory needed to store a pointer to an AST node's "free" method in the AST type structure itself? And do the same for ASDL sequences? Then a simple FREE_AST macro would be able to "do the right thing" when it came to freeing either AST nodes or sequences. In particular, ASDL sequences would be able to free their contents without knowing what those contents actually are. That wouldn't eliminate the problem with memory leaks or double-deletion, but it would eliminate some of the mental overhead of dealing with figuring out which freeing function to invoke. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From krumms at gmail.com Wed Nov 16 15:15:20 2005 From: krumms at gmail.com (Thomas Lee) Date: Thu, 17 Nov 2005 00:15:20 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <437B2FE6.7080206@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <437B00BE.7060007@gmail.com> <437B2FE6.7080206@gmail.com> Message-ID: <437B3EF8.2030001@gmail.com> Just messing around with some ideas. I was trying to avoid the ugly macros (note my earlier whinge about a learning curve) but they're the cleanest way I could think of to get around the problem without resorting to a mass deallocation right at the end of the AST run. Which may not be all that bad given we're going to keep everything in-memory anyway until an error occurs ... anyway, anyway, I'm getting sidetracked :) The idea is to ensure that all allocations within a single function are made using the pool so that a function finishes what it starts. This way, if the function fails it alone is responsible for cleaning up its own pool and that's all. No funkyness needed for sequences, because each member of the sequence belongs to the pool too. Note that the stmt_ty instances are also allocated using the pool. This breaks interfaces all over the place though. Not exactly a pretty change :) But yeah, maybe somebody smarter than I will come up with something a bit cleaner. -- /* snip! 
*/

#define AST_SUCCESS(pool, result) return result
/* wrapped in do/while (0) so the free + return behave as a single
   statement under an unbraced "if" */
#define AST_FAILURE(pool, result) \
    do { asdl_pool_free(pool); return result; } while (0)

static stmt_ty
ast_for_try_stmt(struct compiling *c, const node *n)
{
    /* with the pool stuff, we wouldn't need to declare _all_ the
       variables here either. I'm just lazy. */
    asdl_pool *pool;
    int i;
    const int nch = NCH(n);
    int n_except = (nch - 3)/3;
    stmt_ty result_st = NULL, except_st = NULL;
    asdl_seq *body = NULL, *orelse = NULL, *finally = NULL;
    asdl_seq *inner = NULL, *handlers = NULL;

    REQ(n, try_stmt);

    /* c->pool is the parent of pool. when pool is freed (via AST_FAILURE),
       it is also removed from c->pool's list of children */
    pool = asdl_pool_new(c->pool);
    if (pool == NULL)
        AST_FAILURE(pool, NULL);

    body = ast_for_suite(c, CHILD(n, 2));
    if (body == NULL)
        AST_FAILURE(pool, NULL);

    if (TYPE(CHILD(n, nch - 3)) == NAME) {
        if (strcmp(STR(CHILD(n, nch - 3)), "finally") == 0) {
            if (nch >= 9 && TYPE(CHILD(n, nch - 6)) == NAME) {
                /* we can assume it's an "else", because nch >= 9 for
                   try-else-finally and it would otherwise have a type
                   of except_clause */
                orelse = ast_for_suite(c, CHILD(n, nch - 4));
                if (orelse == NULL)
                    AST_FAILURE(pool, NULL);
                n_except--;
            }
            finally = ast_for_suite(c, CHILD(n, nch - 1));
            if (finally == NULL)
                AST_FAILURE(pool, NULL);
            n_except--;
        }
        else {
            /* we can assume it's an "else", otherwise it would have
               a type of except_clause */
            orelse = ast_for_suite(c, CHILD(n, nch - 1));
            if (orelse == NULL)
                AST_FAILURE(pool, NULL);
            n_except--;
        }
    }
    else if (TYPE(CHILD(n, nch - 3)) != except_clause) {
        ast_error(n, "malformed 'try' statement");
        AST_FAILURE(pool, NULL);
    }

    if (n_except > 0) {
        /* process except statements to create a try ... except */
        handlers = asdl_seq_new(pool, n_except);
        if (handlers == NULL)
            AST_FAILURE(pool, NULL);
        for (i = 0; i < n_except; i++) {
            excepthandler_ty e = ast_for_except_clause(c, CHILD(n, 3 + i * 3),
                                                       CHILD(n, 5 + i * 3));
            if (!e)
                AST_FAILURE(pool, NULL);
            asdl_seq_SET(handlers, i, e);
        }
        except_st = TryExcept(pool, body, handlers, orelse, LINENO(n));
        if (except_st == NULL)
            AST_FAILURE(pool, NULL);

        /* if a 'finally' is present too, we nest the TryExcept within a
           TryFinally to emulate try ... except ... finally */
        if (finally != NULL) {
            inner = asdl_seq_new(pool, 1);
            if (inner == NULL)
                AST_FAILURE(pool, NULL);
            asdl_seq_SET(inner, 0, except_st);
            result_st = TryFinally(pool, inner, finally, LINENO(n));
            if (result_st == NULL)
                AST_FAILURE(pool, NULL);
        }
        else
            result_st = except_st;
    }
    else {
        /* no exceptions: must be a try ... finally */
        assert(orelse == NULL);
        assert(finally != NULL);
        result_st = TryFinally(pool, body, finally, LINENO(n));
        if (result_st == NULL)
            AST_FAILURE(pool, NULL);
    }

    /* pool deallocated when c->pool is deallocated */
    return AST_SUCCESS(pool, result_st);
}

Nick Coghlan wrote: >Thomas Lee wrote: > > >>As the writer of the crappy code that sparked this conversation, I feel >>I should say something :) >> >> > >Don't feel bad about it. It turned out the 'helpful' review comments from Neal >and I didn't originally work out very well either ;) > >With the AST compiler being so new, this is the first serious attempt to >introduce modifications based on it. It's already better than the old CST >compiler, but that memory management in the parser is a cow :) > > > >>>Hopefully this is all made some sense. =) Is this the basic strategy >>>that an arena setup would need? if not can someone enlighten me? >>> >>> > >I think we need to be explicit about the problems we're trying to solve before >deciding on what kind of solution we want :) > >1.
Cleaning up after failures in symtable.c and compile.c > It turns out this is already dealt with in the case of code blocks - the >compiler state handles a linked list of blocks which it automatically frees >when the compiler state is cleaned up. > So the only rule that needs to be followed in these files is to *never* >call any of the VISIT_* macros while there is a Python object which requires >DECREF'ing, or a C pointer which needs to be freed. > This rule was being broken in a couple of places in compile.c (with respect >to strings). I was the offender in both cases I found - the errors date from >when this was still on the ast-branch in CVS. > I've fixed those errors in SVN, and added a note to the comment at the top >of compile.c, to help others avoid making the same mistake I did. > It's fragile in some ways, but it does work. It makes the actual >compilation code look clean (because there isn't any cleanup code), but it >also makes that code look *wrong* (because the lack of cleanup code makes the >calls to "compiler_new_block" look unbalanced), which is a little disconcerting. > >2. Parsing a token stream into the AST in ast.c > This is the bit that has caused Thomas grief (the PEP 341 patch only needs >to modify the front end parser). When building an AST node, each of the >contained AST nodes or sequences has to be built first. That means that, if >there's a problem with any of the later subnodes, the earlier subnodes need to >be freed. > The key problem with memory management in this module is that the free >method to be invoked is dependent on the nature of the AST node to be freed. >In the case of a node sequence, it is dependent on the nature of the contained >elements. > So not only do you have to remember to free the memory, you have to >remember to free it the *right way*. > >Would it be worth the extra memory needed to store a pointer to an AST node's >"free" method in the AST type structure itself? And do the same for ASDL >sequences? 
> >Then a simple FREE_AST macro would be able to "do the right thing" when it >came to freeing either AST nodes or sequences. In particular, ASDL sequences >would be able to free their contents without knowing what those contents >actually are. > >That wouldn't eliminate the problem with memory leaks or double-deletion, but >it would eliminate some of the mental overhead of dealing with figuring out >which freeing function to invoke. > >Cheers, >Nick. > > > From jimjjewett at gmail.com Wed Nov 16 16:29:01 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 16 Nov 2005 10:29:01 -0500 Subject: [Python-Dev] Conclusion: Event loops, PyOS_InputHook, and Tkinter Message-ID: <fb6fbf560511160729s7953fb42k3de1fcc23774b4f6@mail.gmail.com> Phillip J. Eby: > did you ever try using IPython, and confirm whether it > does or does not address the issue As I understand it, using IPython (or otherwise changing the interactive mode) works fine *if* you just want a point solution -- get something up in some environment chosen by the developer. Michiel is looking to create a component that will work in whatever environment the *user* chooses. Telling users "you must go through this particular interface" is not acceptable. Therefore, IPython is only a workaround, not a solution. On the other hand, IPython is clearly a *good* workaround. The dance described in http://mail.python.org/pipermail/python-dev/2005-November/058057.html is short enough that a real solution might well be built on IPython; it just isn't quite done yet. -jJ From arigo at tunes.org Wed Nov 16 17:20:32 2005 From: arigo at tunes.org (Armin Rigo) Date: Wed, 16 Nov 2005 17:20:32 +0100 Subject: [Python-Dev] Is some magic required to check out new files from svn? 
In-Reply-To: <17271.15039.201796.513101@montanaro.dyndns.org> References: <17270.32577.193894.694593@montanaro.dyndns.org> <dl6vvl$o9c$1@sea.gmane.org> <17271.15039.201796.513101@montanaro.dyndns.org> Message-ID: <20051116162032.GA17196@code1.codespeak.net> Hi, On Sun, Nov 13, 2005 at 07:08:15AM -0600, skip at pobox.com wrote:
> The full svn status output is
>
> % svn status
> ! .
> ! Python
The "!" definitely means that these items are missing, or for directories, incomplete in some way. You need to play around until the "!" goes away; for example, you may try

    svn revert -R .    # revert to pristine state, recursively

if you have no local changes you want to keep, followed by 'svn up'. If it still doesn't help, then I'm lost about the cause and would just recommend doing a fresh checkout. A bientot, Armin. From oliphant at ee.byu.edu Wed Nov 16 19:20:48 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed, 16 Nov 2005 11:20:48 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <17275.4343.724248.625173@montanaro.dyndns.org> References: <437ADDCF.7080906@ee.byu.edu> <17275.4343.724248.625173@montanaro.dyndns.org> Message-ID: <437B7880.10004@ee.byu.edu> skip at pobox.com wrote: > Travis> More to the point, however, these scalar objects were allocated > Travis> using the standard PyObject_New and PyObject_Del functions which > Travis> of course use the Python memory manager. One user ported his > Travis> (long-running) code to the new scipy core and found much to his > Travis> dismay that what used to consume around 100MB now completely > Travis> dominated his machine consuming up to 2GB of memory after only a > Travis> few iterations. After searching many hours for memory leaks in > Travis> scipy core (not a bad exercise anyway as some were found), the > Travis> real problem was tracked to the fact that his code ended up > Travis> creating and destroying many of these new array scalars.
> >What Python object were his array elements a subclass of? > > These were all scipy core arrays. The elements were therefore all C-like numbers (floats and integers I think). If he obtained an element in Python, he would get an instance of a new "array" scalar object which is a builtin extension type written in C. The important issue though is that these "array" scalars were allocated using PyObject_New and deallocated using PyObject_Del. The problem is that the Python memory manager did not free the memory. > Travis> In the long term, what is the status of plans to re-work the > Travis> Python Memory manager to free memory that it acquires (or > Travis> improve the detection of already freed memory locations). > >None that I'm aware of. It's seen a great deal of work in the past and >generally doesn't cause problems. Maybe your user's usage patterns were >a bad corner case. It's hard to tell without more details. > > I think definitely, his usage pattern represented a "bad" corner case. An unusable "corner" case in fact. At any rate, moving to use the system free and malloc fixed the immediate problem. I mainly wanted to report the problem here just as another piece of anecdotal evidence. -Travis From jcarlson at uci.edu Wed Nov 16 21:12:31 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 16 Nov 2005 12:12:31 -0800 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437B7880.10004@ee.byu.edu> References: <17275.4343.724248.625173@montanaro.dyndns.org> <437B7880.10004@ee.byu.edu> Message-ID: <20051116120346.A434.JCARLSON@uci.edu> Travis Oliphant <oliphant at ee.byu.edu> wrote: > > skip at pobox.com wrote: > > > Travis> More to the point, however, these scalar objects were allocated > > Travis> using the standard PyObject_New and PyObject_Del functions which > > Travis> of course use the Python memory manager. 
One user ported his > > Travis> (long-running) code to the new scipy core and found much to his > > Travis> dismay that what used to consume around 100MB now completely > > Travis> dominated his machine consuming up to 2GB of memory after only a > > Travis> few iterations. After searching many hours for memory leaks in > > Travis> scipy core (not a bad exercise anyway as some were found), the > > Travis> real problem was tracked to the fact that his code ended up > > Travis> creating and destroying many of these new array scalars. > > > >What Python object were his array elements a subclass of? > > These were all scipy core arrays. The elements were therefore all > C-like numbers (floats and integers I think). If he obtained an element > in Python, he would get an instance of a new "array" scalar object which > is a builtin extension type written in C. The important issue though is > that these "array" scalars were allocated using PyObject_New and > deallocated using PyObject_Del. The problem is that the Python memory > manager did not free the memory. This is not a bug, and there doesn't seem to be any plans to change the behavior: python.org/sf/1338264 If I remember correctly, arrays from the Python standard library (import array), as well as numarray and Numeric, all store values in their pure C representations (they don't use PyObject_New unless someone uses the Python interface to fetch a particular element). This saves the overhead of allocating base objects, as well as the 3-5x space blowup when using Python integers (depending on whether your platform has 32 or 64 bit ints). > I think definitely, his usage pattern represented a "bad" corner case. > An unusable "corner" case in fact. At any rate, moving to use the > system free and malloc fixed the immediate problem. I mainly wanted to > report the problem here just as another piece of anecdotal evidence. 
On the one hand, using PyObjects embedded in an array in scientific Python is a good idea; you can use all of the standard Python manipulations on them. On the other hand, other similar projects have found it more efficient to never embed PyObjects in their arrays, and just allocate them as necessary on access. - Josiah From robert.kern at gmail.com Wed Nov 16 21:41:00 2005 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 Nov 2005 12:41:00 -0800 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <20051116120346.A434.JCARLSON@uci.edu> References: <17275.4343.724248.625173@montanaro.dyndns.org> <437B7880.10004@ee.byu.edu> <20051116120346.A434.JCARLSON@uci.edu> Message-ID: <dlg5gt$q1g$1@sea.gmane.org> Josiah Carlson wrote: > Travis Oliphant <oliphant at ee.byu.edu> wrote: >>I think definitely, his usage pattern represented a "bad" corner case. >>An unusable "corner" case in fact. At any rate, moving to use the >>system free and malloc fixed the immediate problem. I mainly wanted to >>report the problem here just as another piece of anecdotal evidence. > > On the one hand, using PyObjects embedded in an array in scientific > Python is a good idea; you can use all of the standard Python > manipulations on them. On the other hand, other similar projects have > found it more efficient to never embed PyObjects in their arrays, and > just allocate them as necessary on access. That's not what we're doing[1]. The scipy_core arrays here are just blocks of C doubles. However, the offending code (I believe Chris Fonnesbeck's PyMC, but I could be mistaken) frequently indexes into these arrays to get scalar values. In scipy_core, we've defined a set of numerical types that generally behave like Python ints and floats but have the underlying storage of the appropriate C data type and have the various array attributes and methods. 
When the result of an indexing operation is a scalar (e.g., arange(10)[0]), it always returns an instance of the appropriate scalar type. We are "just allocat[ing] them as necessary on access." [1] There *is* an array type for general PyObjects in scipy_core, but that's not being used in the code that blows up and has nothing to do with the problem Travis is talking about. -- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From jcarlson at uci.edu Thu Nov 17 00:04:32 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 16 Nov 2005 15:04:32 -0800 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <dlg5gt$q1g$1@sea.gmane.org> References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> Message-ID: <20051116145820.A43A.JCARLSON@uci.edu> Robert Kern <robert.kern at gmail.com> wrote: > > [1] There *is* an array type for general PyObjects in scipy_core, but > that's not being used in the code that blows up and has nothing to do > with the problem Travis is talking about. I seemed to have misunderstood the discussion. Was the original user accessing and saving copies of many millions of these doubles? That's the only way that I would be able to explain the huge overhead, and in that case, perhaps the user should have been storing them in scipy arrays (or even Python array.arrays). 
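Josiah's point about packed C storage versus boxed Python objects is easy to see with the standard library's array module. This is a rough sketch written for a modern Python, not code from the thread; exact byte counts vary by platform and version:

```python
import sys
from array import array

n = 100000

# Packed C doubles: one 8-byte slot per element, no per-element PyObject.
packed = array('d', (float(i) for i in range(n)))
packed_bytes = len(packed) * packed.itemsize

# Boxed Python floats: every element is a full object with its own header.
boxed = [float(i) for i in range(n)]
boxed_bytes = sum(sys.getsizeof(x) for x in boxed)

# On a typical 64-bit build each float object is ~24 bytes versus 8 bytes
# of raw storage, so the boxed form is several times larger.
print(packed_bytes, boxed_bytes)
```

This is the overhead Josiah describes: numarray and Numeric avoid it by keeping raw C values and boxing an element only when Python code actually fetches it.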
- Josiah From oliphant at ee.byu.edu Thu Nov 17 00:47:48 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed, 16 Nov 2005 16:47:48 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <20051116145820.A43A.JCARLSON@uci.edu> References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu> Message-ID: <437BC524.2030105@ee.byu.edu> Josiah Carlson wrote: >Robert Kern <robert.kern at gmail.com> wrote: > > >>[1] There *is* an array type for general PyObjects in scipy_core, but >>that's not being used in the code that blows up and has nothing to do >>with the problem Travis is talking about. >> >> > >I seemed to have misunderstood the discussion. Was the original user >accessing and saving copies of many millions of these doubles? > He *was* accessing them (therefore generating a call to an array-scalar object creation function). But they *weren't being* saved. They were being deleted soon after access. That's why it was so confusing that his memory usage should continue to grow and grow so terribly. As verified by removing usage of the Python PyObject_MALLOC function, it was the Python memory manager that was performing poorly. Even though the array-scalar objects were deleted, the memory manager would not re-use their memory for later object creation. Instead, the memory manager kept allocating new arenas to cover the load (when it should have been able to re-use the old memory that had been freed by the deleted objects--- again, I don't know enough about the memory manager to say why this happened). The fact that it did happen is what I'm reporting on. If nothing will be done about it (which I can understand), at least this thread might help somebody else in a similar situation track down why their Python process consumes all of their memory even though their objects are being deleted appropriately. 
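Travis's distinction here -- the objects themselves really are reclaimed even though the process footprint grows -- can be checked at the Python level with weak references. A minimal sketch follows; the Scalar class is a hypothetical stand-in, since the actual report involved scipy's C-level array scalars:

```python
import weakref

class Scalar:
    """Hypothetical stand-in for an array-scalar object."""
    def __init__(self, value):
        self.value = value

refs = []
for i in range(1000):
    obj = Scalar(float(i))
    refs.append(weakref.ref(obj))
    del obj  # CPython's refcounting reclaims the object immediately

# Every object really was destroyed, so any growth in the process
# footprint would have to come from the allocator holding onto arenas,
# not from lingering Python references.
assert all(r() is None for r in refs)
```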
Best, -Travis From nnorwitz at gmail.com Thu Nov 17 01:08:48 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 16 Nov 2005 16:08:48 -0800 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437BC524.2030105@ee.byu.edu> References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu> <437BC524.2030105@ee.byu.edu> Message-ID: <ee2a432c0511161608n3d16ec63id47c8fc585b6efd1@mail.gmail.com> On 11/16/05, Travis Oliphant <oliphant at ee.byu.edu> wrote: > > As verified by removing usage of the Python PyObject_MALLOC function, it > was the Python memory manager that was performing poorly. Even though > the array-scalar objects were deleted, the memory manager would not > re-use their memory for later object creation. Instead, the memory > manager kept allocating new arenas to cover the load (when it should > have been able to re-use the old memory that had been freed by the > deleted objects--- again, I don't know enough about the memory manager > to say why this happened). Can you provide a minimal test case? It's hard to do anything about it if we can't reproduce it. n From skip at pobox.com Thu Nov 17 00:32:54 2005 From: skip at pobox.com (skip@pobox.com) Date: Wed, 16 Nov 2005 17:32:54 -0600 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <20051116145820.A43A.JCARLSON@uci.edu> References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu> Message-ID: <17275.49574.768079.524296@montanaro.dyndns.org> >> [1] There *is* an array type for general PyObjects in scipy_core, but >> that's not being used in the code that blows up and has nothing to do >> with the problem Travis is talking about. Josiah> I seemed to have misunderstood the discussion. I'm sorry, but I'm confused as well. 
If these scipy arrays have elements that are subclasses of floats shouldn't we be able to provoke this memory growth using an array.array of floats? Can you provide a simple script in pure Python (no scipy) that demonstrates the problem? Skip From oliphant at ee.byu.edu Thu Nov 17 03:15:04 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed, 16 Nov 2005 19:15:04 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> Message-ID: <437BE7A8.5000503@ee.byu.edu> Jim Jewett wrote: >Do you have the code that caused problems? > > Yes. I was able to reproduce his trouble and was trying to debug it. >The things I would check first are > >(1) Is he allocating (peak usage) a type (such as integers) that >never gets returned to the free pool, in case you need more of that >same type? > > No, I don't think so. >(2) Is he allocating new _types_, which I think don't get properly > > collected. > > Bingo. Yes, definitely allocating new _types_ (an awful lot of them...) --- that's what the "array scalars" are: new types created in C. If they don't get properly collected then that would definitely have created the problem. It would seem this should be advertised when telling people to use PyObject_New for allocating new memory for an object. >(3) Is there something in his code that keeps a live reference, or at >least a spotty memory usage so that the memory can't be cleanly >released? > > > No, that's where I thought the problem was, at first. I spent a lot of time tracking down references. What finally convinced me it was the Python memory manager was when I re-wrote the tp->alloc functions of the new types to use the system malloc instead of PyObject_Malloc. As soon as I did this the problems disappeared and memory stayed constant. 
Thanks for your comments, -Travis From decker at dacafe.com Wed Nov 16 03:33:13 2005 From: decker at dacafe.com (decker@dacafe.com) Date: Wed, 16 Nov 2005 02:33:13 -0000 (Australia/Sydney) Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications Message-ID: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> Hello, I would appreciate feedback concerning these patches before the next "PythonD" (for DOS/DJGPP) is released. Thanks in advance. Regards, Ben Decker Systems Integrator http://www.caddit.net ----------------------------------------- Stay ahead of the information curve. Receive MCAD news and jobs on your desktop daily. Subscribe today to the MCAD CafeNews newsletter. [ http://www10.mcadcafe.com/nl/newsletter_subscribe.php ] It's informative and essential. From tony.meyer at gmail.com Thu Nov 17 01:36:32 2005 From: tony.meyer at gmail.com (Tony Meyer) Date: Thu, 17 Nov 2005 13:36:32 +1300 Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-09-16 to 2005-09-30 Message-ID: <6F1AA13E-2723-43D1-B6F1-7A7A9F1A6E1C@gmail.com> It's been some time (all that concurrency discussion didn't help ;) but here's the second half of September. Many apologies for the delay; hopefully you agree with Guido's 'better late than never', and I promise to try harder in the future. Note that the delay is all my bad, and epithets should be directed at me and not Steve. As usual, please read over if you have a chance, and direct comments/ corrections to tony.meyer at gmail.com or steven.bethard at gmail.com. (One particular question is whether the concurrency summary is too long). ============= Announcements ============= ----------------------------- QOTF: Quotes of the fortnight ----------------------------- We have two quotes this week, one each from the two biggest threads of this fortnight: concurrency and conditional expressions. 
The first quote, from Donovan Baarda, puts Python's approach to threading into perspective: The reality is threads were invented as a low overhead way of easily implementing concurrent applications... ON A SINGLE PROCESSOR. Taking into account threading's limitations and objectives, Python's GIL is the best way to support threads. When hardware (seriously) moves to multiple processors, other concurrency models will start to shine. Our second QOTF, by yours truly (hey, who could refuse a nomination from Guido?), is a not-so-subtle reminder to leave syntax decisions to Guido: Please no more syntax proposals! ... We need to leave the syntax to Guido. We've already proved that ... we can't as a community agree on a syntax. That's what we have a BDFL for. =) Contributing threads: - `GIL, Python 3, and MP vs. UP <http://mail.python.org/pipermail/ python-dev/2005-September/056609.html>`__ - `Adding a conditional expression in Py3.0 <http://mail.python.org/ pipermail/python-dev/2005-September/056617.html>`__ [SJB] ------------------- Compressed MSI file ------------------- Martin v. Löwis discovered that a little more than a `MiB`_ could be saved in the Python installer by using LZX:21 instead of the standard MSZIP when compressing the CAB file. After confirmation from several testers that the new format worked, the change (for Python 2.4.2 and beyond) was made. .. _MiB: http://en.wikipedia.org/wiki/Mibibyte Contributing thread: - `Compressing MSI files: 2.4.2 candidate? <http://mail.python.org/ pipermail/python-dev/2005-September/056694.html>`__ [TAM] ========= Summaries ========= ----------------------- Conditional expressions ----------------------- Raymond Hettinger proposed that the ``and`` and ``or`` operators be modified in Python 3.0 to produce only booleans instead of producing objects, motivating this proposal in part by the common (mis-)use of ``<cond> and <true-expr> or <false-expr>`` to emulate a conditional expression.
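The trap in the ``<cond> and <true-expr> or <false-expr>`` idiom is worth spelling out: it silently returns the wrong branch whenever the true-expression is itself false. A quick illustration (the pick functions are made-up examples, not code from the thread):

```python
# The classic emulation of a conditional expression:
def pick(cond):
    return cond and [] or ["default"]

# Broken: cond is true, but the falsy [] makes the whole expression
# fall through to the or-branch.
assert pick(True) == ["default"]   # wrong -- we wanted []

# The syntax Guido later pronounced (Python 2.5+) has no such trap:
def pick_fixed(cond):
    return [] if cond else ["default"]

assert pick_fixed(True) == []
assert pick_fixed(False) == ["default"]
```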
In response, Guido suggested that the conditional expression discussion of `PEP 308`_ be reopened. This time around, people seemed almost unanimously in support of adding a conditional expression, though as before they disagreed on syntax. Fortunately, this time Guido cut the discussion short and pronounced a new syntax: ``<true-expr> if <cond> else <false-expr>``. Although it has not been implemented yet, the plan is for it to appear in Python 2.5. .. _PEP 308: http://www.python.org/peps/pep-0308.html Contributing threads: - `"and" and "or" operators in Py3.0 <http://mail.python.org/ pipermail/python-dev/2005-September/056510.html>`__ - `Adding a conditional expression in Py3.0 <http://mail.python.org/ pipermail/python-dev/2005-September/056546.html>`__ - `Conditional Expression Resolution <http://mail.python.org/ pipermail/python-dev/2005-September/056846.html>`__ [SJB] --------------------- Concurrency in Python --------------------- Once again, the subject of removing the global interpreter lock (GIL) came up. Sokolov Yura suggested that the GIL be replaced with a system where there are thread-local GILs that cooperate to share writing; Martin v. Löwis suggested that he try to implement his ideas, and predicted that he would find that doing so would be a lot of work, would require changes to all extension modules (likely to introduce new bugs, particularly race conditions), and possibly decrease performance. This kicked off several long threads about multi-processor coding. A long time ago (circa Python 1.5), Greg Stein experimented with free threading, which did yield around a 1.6 times speedup on a dual-processor machine. To avoid the overhead of multi-processor locking on a uniprocessor machine, a separate binary could be distributed. Some of the code apparently did make it into Python 1.5, but the issue died off because no-one provided working code, or a strategy for what to do with existing extension modules.
Guido pointed out that it is not clear at this time how multiple processors will be used as they become the norm. With the threaded programming model (e.g. in Java) there are problems with concurrent modification errors (without locking) or deadlocks and livelocks (with locking). Guido's hunch (and mine, FWIW) is that instead of writing massively parallel applications, we will continue to write single-threaded applications that are tied together at the process level rather than at the thread level. He also pointed out that it's likely that most problems get little benefit out of multiple processors. Guido threw down the gauntlet: rather than the endless discussion about this topic, someone should come up with a GIL-free Python (not necessarily CPython) and demonstrate its worth. Phillip J. Eby reminded everyone that Jython, IronPython, and PyPy exist, and that someone could, for example, create a multiprocessor-friendly backend for PyPy. Guido also pointed out that fast threading benefits from fast context switches, which benefits from small register sets, and that the current trend in chips is towards larger register sets. In addition, multiple processors with shared memory don't scale all that well (multiple processors with explicit interprocess communication (IPC) channels scale much better). These all favour multi-processing over multi-threading. Donovan Baarda went so far as to say (a QOTF, as above), that Python's GIL is the best way to support threads, which are for single-processor use, and that when multiple-processor platforms have matured more other concurrency models will likewise mature. OTOH, Bob Ippolito pointed out that (in many operating systems) there isn't a lot of difference between threads and processes, and that threads can typically still use IPC. Bob argued that the biggest argument for threading is that lots of existing C/C++ code uses threads.
Simon Percivall argued that the problem is that Python offers ("out of the box") some support for multi-threaded programming, but little for multi-process programming beyond the basics (e.g. data sharing, communication, control over running processes, dealing out tasks to be handled). Simon suggested that the best way to stop people complaining about the GIL is to provide solid, standardized support for multi-process programming. The idea of a "multiprocess" module gained a reasonable amount of support. Phillip J. Eby outlined an idea he is considering PEPifying, in which one could switch all context variables (such as the Decimal context and the sys.* variables) simultaneously and instantaneously when changing execution contexts (like switching between coroutines). He has a prototype implementation of the basic idea, which is less than 200 lines of Python and very fast. However, he pointed out that it's not completely PEP-ready at this point, and he needs to continue considering various parts of the concept. Bruce Eckel joined the thread, and suggested that low-level threads people are only now catching up to objects, but as far as concurrency goes their brains still think in terms of threads, so they naturally apply thread concepts to objects. He believes that pthread-style thinking is two steps backwards: you effectively throw open the innards of the object that you just spent time decoupling from the rest of your system, and the coupling is now unpredictable. Bruce and Guido had discussed offlist "active objects": defining a class as "active" would install a worker thread and concurrent queue in each object of that class, automatically turn method calls into tasks and enqueue them, and prevent any interaction other than enqueued messages.
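The "active object" pattern Bruce describes maps naturally onto a worker thread plus a queue. Here is a minimal sketch using today's threading and queue module names (not code from the thread, and it makes no attempt at the hard part -- actually preventing interaction other than enqueued messages):

```python
import queue
import threading

class ActiveObject:
    """Owns a worker thread; method calls become tasks on a queue."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:          # sentinel: shut down the worker
                break
            task()

    def enqueue(self, func, *args):
        self._tasks.put(lambda: func(*args))

    def stop(self):
        self._tasks.put(None)
        self._worker.join()

class ActiveCounter(ActiveObject):
    def __init__(self):
        super().__init__()
        self.count = 0

    def increment(self):                  # public call: just enqueues
        self.enqueue(self._do_increment)

    def _do_increment(self):              # runs only in the worker thread
        self.count += 1

c = ActiveCounter()
for _ in range(10):
    c.increment()
c.stop()          # FIFO queue drains the pending tasks, then joins
print(c.count)    # 10
```

Because the queue is FIFO, all ten enqueued increments run before the shutdown sentinel, so reading count after stop() is race-free.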
Guido felt that if multiple active objects could co-exist in the same process, but be prevented (by the language implementation) from sharing data except via channels, and dynamic reallocation of active objects across multiple CPUs were possible, then this might be a solution. He pointed out that an implementation would really be needed to prove this. Phillip and Martin pointed out that preventing any interaction other than enqueued messages is the difficult part; each active object would, for example, have to have its own sys.modules. Phillip felt that such a solution (which Bruce posed as "a" solution, not "the" solution) wouldn't help with GIL removal, but would help with effective use of multiprocessor machines on platforms where fork() is available, if the API works across processes as well as threads. Bruce then restarted the discussion, putting forth eight criteria that he felt would be necessary for the "pythonic" solution to concurrency. Items on the list were discussed further, with some disagreement about what was possible. The concurrency discussion continues next month... Contributing threads: - `Variant of removing GIL. <http://mail.python.org/pipermail/python- dev/2005-September/056423.html>`__ - `GIL, Python 3, and MP vs. UP (was Re: Variant of removing GIL.) <http://mail.python.org/pipermail/python-dev/2005-September/ 056458.html>`__ - `GIL, Python 3, and MP vs.
UP <http://mail.python.org/pipermail/ python-dev/2005-September/056498.html>`__ - `Active Objects in Python <http://mail.python.org/pipermail/python- dev/2005-September/056752.html>`__ - `Pythonic concurrency <http://mail.python.org/pipermail/python-dev/ 2005-September/056801.html>`__ - `Pythonic concurrency - cooperative MT <http://mail.python.org/ pipermail/python-dev/2005-September/056860.html>`__ [TAM] ----------------------------------- Removing nested function parameters ----------------------------------- Brett Cannon proposed removing support for nested function parameters so that instead of being able to write:: def f((x, y)): print x, y you'd have to write something like:: def f(arg): x, y = arg print x, y Brett (with help from Guido) motivated this removal (for Python 3.0) by a few factors: (1) The feature has low visibility: "For every user who is fond of them there are probably ten who have never even heard of it." - Guido (2) The feature can be difficult to read for some people. (3) The feature doesn't add any power to the language; the above functions emit essentially the same byte-code. (4) The feature makes function parameter introspection difficult because tuple unpacking information is not stored in the function object. In general, people were undecided on this proposal. While a number of people said they used the feature and would miss it, many of them also said that their code wouldn't suffer that much if the feature was removed. No decision had been made at the time of the summary. Contributing thread: - `removing nested tuple function parameters <http://mail.python.org/ pipermail/python-dev/2005-September/056459.html>`__ [SJB] ----------------------------------------- Evaluating iterators in a boolean context ----------------------------------------- In Python 2.4 some builtin iterators gained __len__ methods when the number of remaining items could be made available. 
This broke some of Guido's code that tested iterators for their boolean value (to distinguish them from None). Raymond Hettinger (who supplied the original patch) argued that `testing for None`_ using boolean tests was in general a bad idea, and that knowing the length of an iterator, when possible, had a number of use cases and allowed for some performance gains. However, Guido felt strongly that iterators should not supply __len__ methods, as this would lead to some people writing code expecting this method, which would then break when it received an iterator which could not determine its own length. The feature will be rolled back in Python 2.5, and Raymond will likely move the __len__ methods to private methods in order to maintain the performance gains. .. _testing for None: http://www.python.org/peps/ pep-0290.html#testing-for-none Contributing threads: - `bool(iter([])) changed between 2.3 and 2.4 <http://mail.python.org/ pipermail/python-dev/2005-September/056576.html>`__ - `bool(container) [was bool(iter([])) changed between 2.3 and 2.4] <http://mail.python.org/pipermail/python-dev/2005-September/ 056879.html>`__ [SJB] -------------------------------------------------- Properties that only call the getter function once -------------------------------------------------- Jim Fulton proposed adding a new builtin for a property-like descriptor that would only call the getter method once, so that something like:: class Spam(object): @readproperty def eggs(self): ... expensive computation of eggs self.eggs = result return result would only do the eggs computation once. Currently, you can't do this with a property() because the ``self.eggs = result`` statement tries to call the property's ``fset`` method instead of replacing the property with the result of the eggs() call. A few other people commented that they'd needed similar functionality at times, and Guido seemed moderately interested in the idea, but there was no final resolution. 
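Jim's ``readproperty`` can be approximated with a small non-data descriptor: because it defines no ``__set__``, the value stored into the instance dict on first access shadows the descriptor from then on. This is a sketch of the general idea, not Jim's actual proposal:

```python
class readproperty:
    """Non-data descriptor: computes once, then caches in the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        obj.__dict__[self.name] = value  # later lookups hit the dict first
        return value

class Spam:
    computations = 0

    @readproperty
    def eggs(self):
        type(self).computations += 1  # stands in for an expensive computation
        return 42

s = Spam()
print(s.eggs, s.eggs, Spam.computations)  # 42 42 1
```

Unlike a property(), assigning the result into ``self.__dict__`` works here precisely because the descriptor has no ``fset`` to intercept it.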
Contributing thread: - `RFC: readproperty <http://mail.python.org/pipermail/python-dev/ 2005-September/056769.html>`__ [SJB] -------- Codetags -------- Micah Elliott submitted his `Codetags PEP 350`_ (after revisions following the comp.lang.python discussion) to python-dev for comment. A common feeling was that this (particularly synonyms) was over-engineering; Guido pointed out that he only uses XXX, and this is certainly the most common (although not only) example in the Python source itself. Some suggestions were made, many of which Micah integrated into the PEP. The suggestion was made that an implementation should precede approval of the PEP. Micah indicated that he would continue development on the tools, and that he encourages anyone interested in using a standard set of codetags to give these a try. .. _Codetags PEP 350: http://python.org/peps/pep-0350.html - `PEP 350: Codetags <http://mail.python.org/pipermail/python-dev/ 2005-September/056744.html>`__ [TAM] ---------------------------- Improving set implementation ---------------------------- Raymond Hettinger suggested a "small, but interesting, C project" to determine whether the setobject.c implementation would be improved by recoding the set_lookkey() function to optimize key insertion order using Brent's variation of Algorithm D (c.f. Knuth vol. III, section 6.4, p525). It has the potential to boost performance for uniquification applications with duplicate keys being identified more quickly, and possibly also more frequent retirement of dummy entries during insertion operations. Andrew Durdin pointed out that Brent's variation depends on the next probe position for a key being derived from the key and its current position, which is incompatible with the current perturbation system; Raymond replaced perturbation with a secondary hash with linear probing. Antoine Pitrou did some `experimenting with this`_, resulting in a -5% to 2% speedup with various benchmarks.
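For readers unfamiliar with the terminology, open addressing with linear probing -- the scheme Raymond substituted for CPython's perturbation -- looks roughly like this. This is a toy table with fixed capacity and no deletion or resizing, purely to illustrate probing; the real set_lookkey() is considerably more involved and this does not implement Brent's variation itself:

```python
class ToySet:
    """Open-addressing hash table with linear probing (no resizing)."""

    def __init__(self, capacity=32):
        self.slots = [None] * capacity

    def _probe(self, key):
        # Start at the home slot; walk forward until we find the key
        # or an empty slot (assumes the table never fills up).
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) % len(self.slots)  # linear probe
        return i

    def add(self, key):
        self.slots[self._probe(key)] = key

    def __contains__(self, key):
        return self.slots[self._probe(key)] == key

s = ToySet()
for word in ["spam", "eggs", "ham", "spam"]:
    s.add(word)
print("eggs" in s, "bacon" in s)  # True False
```

Brent's trick, by contrast, decides at insertion time whether moving an already-placed key to its own alternative slot would shorten the new key's probe chain, which is why it needs the next probe position to be derivable from the key and its current slot.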
Raymond has also been experimenting with a simpler approach: whenever there are more than three probes, always swap the new key into the first position and then unconditionally re-insert the swapped-out key. He reported that, most of the time, this gives an improvement, and it doesn't require changing the perturbation logic. This simpler approach is cheap to implement, but the benefits are also smaller, with it improving only the worst collisions. .. _experimenting with this: http://pitrou.net/python/sets - `C coding experiment <http://mail.python.org/pipermail/python-dev/ 2005-September/055965.html>`__ [TAM] -------------- Relative paths -------------- Nathan Bullock suggested a ''relpath(path_a, path_b)'' addition to os.path that returns a relative path from path_a to path_b. Trent Mick pointed out that there are a `couple of`_ `recipes for this`_, as well as `Jason Orendorff's Path module`_. Several people supported this idea, and hopefully either Nathan or one of the recipe authors will submit a patch with this functionality. .. _couple of: http://aspn.activestate.com/ASPN/Cookbook/Python/ Recipe/302594 .. _recipes for this: http://aspn.activestate.com/ASPN/Cookbook/ Python/Recipe/208993 .. _Jason Orendorff's Path module: http://www.jorendorff.com/articles/ python/path/ Contributing threads: - `os.path.diff(path1, path2) <http://mail.python.org/pipermail/ python-dev/2005-September/056391.html>`__ - `os.path.diff(path1, path2) (and a first post) <http:// mail.python.org/pipermail/python-dev/2005-September/056703.html>`__ [TAM] ---------------------------------- Adding a vendor-packages directory ---------------------------------- Rich Burridge followed up a `comp.lang.python thread`_ about a "vendor-packages" directory for Python by submitting a `patch`_ and asking for comments about the proposal on python-dev.
General consensus was that the proposal needed a better rationale, explaining why this improved on simply adding a .pth file to the site-packages directory. Rich explained that the rationale is that Python files supplied by the vendor (Sun, Apple, RedHat, Microsoft) with their operating system software should go in a separate base directory to differentiate them from Python files installed specifically at the site. However, Bob Ippolito pointed out that, as of OS X 10.4 ("Tiger"), Apple already does this via a .pth file ("Extras.pth"), which points to ''/System/Library/Frameworks/Python.framework/Versions/2.3/Extras/lib/python'' and includes wxPython by default. Bob also pointed out that such a "vendor-packages.pth" should look like ''import site; site.addsitedir('/usr/lib/python2.4/vendor-packages')'' so that packages like Numeric, PIL, and PyObjC, which take advantage of .pth files themselves, work when installed to the vendor-packages location. Phillip J. Eby pointed out that it would be good to have a document for "Python Distributors" that explained these kinds of things, and suggested that perhaps a volunteer or two could be found within the distutils-SIG to do this. .. _comp.lang.python thread: http://mail.python.org/pipermail/python- list/2005-September/300029.html .. _patch: http://sourceforge.net/tracker/index.php? func=detail&aid=1298835&group_id=5470&atid=305470 Contributing thread: - `vendor-packages directory <http://mail.python.org/pipermail/python- dev/2005-September/056682.html>`__ [TAM] ======================= Version numbers on OS X ======================= Guido asked if platform.system_alias() could be improved on OS X by mapping uname()'s ''Darwin x.y'' to ''OS X 10.(x-4).y''. Bob Ippolito and others pointed out that this was not a good idea, because uname() only reports on the kernel version number and not the Cocoa API, which is really what OS X 10.x.y refers to.
He pointed out that the correct way to do it using a public API is to
use Gestalt, which is what platform.mac_ver() does. On further
inspection, it was discovered that parsing the
/System/Library/CoreServices/SystemVersion.plist property list is also
a supported API, and would not rely on access to the Carbon API set.
Bob and Wilfredo Sánchez Vega provided sample code that would parse
this plist; Marc-André Lemburg suggested that a patch be written for
system_alias() that would use this method (if possible) for Mac OS.

Contributing thread:

- `Mapping Darwin 8.2.0 to Mac OS X 10.4.2 in platform.py <http://mail.python.org/pipermail/python-dev/2005-September/056651.html>`__

[TAM]

================
Deferred Threads
================

- `Python 2.5a1, ast-branch and PEP 342 and 343 <http://mail.python.org/pipermail/python-dev/2005-September/056449.html>`__

===============
Skipped Threads
===============

- `Visibility scope for "for/while/if" statements <http://mail.python.org/pipermail/python-dev/2005-September/056669.html>`__
- `inplace operators and __setitem__ <http://mail.python.org/pipermail/python-dev/2005-September/056766.html>`__
- `Repository for python developers <http://mail.python.org/pipermail/python-dev/2005-September/056717.html>`__
- `For/while/if statements/comprehension/generator expressions unification <http://mail.python.org/pipermail/python-dev/2005-September/056508.html>`__
- `list splicing <http://mail.python.org/pipermail/python-dev/2005-September/056472.html>`__
- `Compatibility between Python 2.3.x and Python 2.4.x <http://mail.python.org/pipermail/python-dev/2005-September/056437.html>`__
- `python optimization <http://mail.python.org/pipermail/python-dev/2005-September/056441.html>`__
- `test__locale on Mac OS X <http://mail.python.org/pipermail/python-dev/2005-September/056463.html>`__
- `possible memory leak on windows (valgrind report) <http://mail.python.org/pipermail/python-dev/2005-September/056478.html>`__
- `Mixins.
  <http://mail.python.org/pipermail/python-dev/2005-September/056481.html>`__
- `2.4.2c1 fails test_unicode on HP-UX ia64 <http://mail.python.org/pipermail/python-dev/2005-September/056551.html>`__
- `2.4.2c1: test_macfs failing on Tiger (Mac OS X 10.4.2) <http://mail.python.org/pipermail/python-dev/2005-September/056558.html>`__
- `test_ossaudiodev hangs <http://mail.python.org/pipermail/python-dev/2005-September/056559.html>`__
- `unintentional and unsafe use of realpath() <http://mail.python.org/pipermail/python-dev/2005-September/056616.html>`__
- `Alternative name for str.partition() <http://mail.python.org/pipermail/python-dev/2005-September/056630.html>`__
- `Weekly Python Patch/Bug Summary <http://mail.python.org/pipermail/python-dev/2005-September/056713.html>`__
- `Possible bug in urllib.urljoin <http://mail.python.org/pipermail/python-dev/2005-September/056736.html>`__
- `Trasvesal thought on syntax features <http://mail.python.org/pipermail/python-dev/2005-September/056741.html>`__
- `Fixing pty.spawn() <http://mail.python.org/pipermail/python-dev/2005-September/056750.html>`__
- `64-bit bytecode compatibility (was Re: [PEAK] ez_setup on 64-bit linux problem) <http://mail.python.org/pipermail/python-dev/2005-September/056811.html>`__
- `C API doc fix <http://mail.python.org/pipermail/python-dev/2005-September/056827.html>`__
- `David Mertz on CA state e-voting panel <http://mail.python.org/pipermail/python-dev/2005-September/056840.html>`__
- `[PATCH][BUG] Segmentation Fault in xml.dom.minidom.parse <http://mail.python.org/pipermail/python-dev/2005-September/056844.html>`__
- `linecache problem <http://mail.python.org/pipermail/python-dev/2005-September/056856.html>`__

From tony.meyer at gmail.com Thu Nov 17 01:36:33 2005
From: tony.meyer at gmail.com (Tony Meyer)
Date: Thu, 17 Nov 2005 13:36:33 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-01 to 2005-10-15
Message-ID: <9C7FCF00-A936-44D0-9D36-3263BA456E1A@gmail.com>

As
you have noticed, there has been a summary delay recently. This is my
fault (insert your favourite thesis/work/leisure excuse here). Steve
has generously covered my slackness by doing all of the October
summaries himself (thanks!). Anyway, if you have some moments to
spare, cast your mind back to the start of October, and see if these
reflect what happened. Comments/corrections to tony.meyer at gmail.com
or steven.bethard at gmail.com. Thanks!

=============
Announcements
=============

----------------------------
QOTF: Quote of the Fortnight
----------------------------

From Phillip J. Eby:

    So, if threads are "easy" in Python compared to other languages,
    it's *because of* the GIL, not in spite of it.

Contributing thread:

- `Pythonic concurrency <http://mail.python.org/pipermail/python-dev/2005-October/057062.html>`__

[SJB]

----------------------------------------
GCC/G++ Issues on Linux: Patch available
----------------------------------------

Christoph Ludwig provided the previously `promised patch`_ to address
some of the issues in compiling Python with GCC/G++ on Linux. The
patch_ keeps ELF systems like x86 / Linux from having any dependencies
on the C++ runtime, and allows systems that require main() to be a C++
function to be configured appropriately.

.. _promised patch: http://www.python.org/dev/summary/2005-07-01_2005-07-15.html#gcc-g-issues-on-linux
.. _patch: http://python.org/sf/1324762

Contributing thread:

- `[C++-sig] GCC version compatibility <http://mail.python.org/pipermail/python-dev/2005-October/057230.html>`__

[SJB]

=========
Summaries
=========

---------------------
Concurrency in Python
---------------------

Michael Sparks spent a bit of time describing the current state and
future goals of the Kamaelia_ project. Mainly, Kamaelia aims to make
concurrency as simple and easy to use as possible. A scheduler manages
a set of generators that communicate with each other through Queues.
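A toy sketch of that pattern -- generators as components, queues
(deques here) as mailboxes, and a round-robin scheduler -- may make it
concrete. The names and API below are purely illustrative, not
Kamaelia's actual interface::

```python
from collections import deque

def worker(name, inbox, outbox):
    # A "component": a generator that moves items from its inbox to
    # its outbox, yielding control back to the scheduler each step.
    while True:
        if inbox:
            outbox.append((name, inbox.popleft()))
        yield

def scheduler(tasks, steps):
    # Cooperative round-robin: give every generator one step per pass.
    for _ in range(steps):
        for task in tasks:
            next(task)

inbox, outbox = deque([1, 2, 3]), deque()
scheduler([worker("w1", inbox, outbox)], steps=3)
print(list(outbox))  # -> [('w1', 1), ('w1', 2), ('w1', 3)]
```

Because the components never share mutable state directly -- only the
queues -- the same code could in principle be run with the components
in threads or processes, which is exactly the long-term goal described
below.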
The long term goals include being able to farm the various generators
off into threads or processes as needed, so that whether your
concurrency model is cooperative, threaded or process-based, your code
can basically look the same. There was also continued discussion about
how "easy" threads are. Shane Hathaway made the point that it's
actually locking that's "insanely difficult", and approaches that
simplify how much you need to think about locking can keep threading
relatively easy -- this was one of the strong points of ZODB. A fairly
large camp also got behind the claim that threads are easy if you're
limited to only message passing. There were also a few comments about
how Python makes threading easier, e.g. through the GIL (see `QOTF:
Quote of the Fortnight`_) and through threading.Thread's encapsulation
of thread-local resources as instance attributes.

.. _Kamaelia: http://kamaelia.sourceforge.net

Contributing threads:

- `Pythonic concurrency - cooperative MT <http://mail.python.org/pipermail/python-dev/2005-October/056898.html>`__
- `Pythonic concurrency <http://mail.python.org/pipermail/python-dev/2005-October/057023.html>`__

[SJB]

-------------------------------------
Organization of modules for threading
-------------------------------------

A few people took issue with the current organization of the threading
functionality into the Queue, thread and threading modules. Guido
views Queue as an application of threading, so putting it in the
threading module is inappropriate (though with a deeper package
structure, it should definitely be a sibling). Nick Coghlan suggested
that Queue should be in a threadtools module (in parallel with
itertools), while Skip proposed a hierarchy of modules, with thread
and lock in the lowest level and Thread and Queue in the highest. Aahz
suggested (and Guido approved) deprecating the thread module and
renaming it to _thread, at least in Python 3.0. It seems the
deprecation may happen sooner, though.
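Whatever module it ends up in, the "application of threading" that
Queue provides is thread-safe message passing. A minimal
producer/consumer sketch (written with today's module names; at the
time of this discussion the module was spelled ``Queue``)::

```python
import threading
from queue import Queue

def producer(q):
    for i in range(3):
        q.put(i)
    q.put(None)  # sentinel: tell the consumer to stop

def consumer(q, results):
    while True:
        item = q.get()  # blocks until an item is available
        if item is None:
            break
        results.append(item * 10)

q, results = Queue(), []
threads = [threading.Thread(target=producer, args=(q,)),
           threading.Thread(target=consumer, args=(q, results))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # -> [0, 10, 20]
```

All locking is hidden inside Queue.put() and Queue.get(), which is why
message passing keeps threaded code comparatively easy.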
Contributing threads:

- `Making Queue.Queue easier to use <http://mail.python.org/pipermail/python-dev/2005-October/057184.html>`__
- `Autoloading? (Making Queue.Queue easier to use) <http://mail.python.org/pipermail/python-dev/2005-October/057216.html>`__
- `threadtools (was Re: Autoloading? (Making Queue.Queue easier to use)) <http://mail.python.org/pipermail/python-dev/2005-October/057262.html>`__
- `Threading and synchronization primitives <http://mail.python.org/pipermail/python-dev/2005-October/057269.html>`__

[SJB]

-------------------------
Speed of Unicode decoding
-------------------------

Tony Nelson found that decoding with a codec like mac-roman or
iso8859-1 can take around ten times as long as decoding with utf-8.
Walter Dörwald provided a patch_ that implements the mapping using a
unicode string of length 256, where undefined characters are mapped to
u"\ufffd". This dropped the decode time for mac-roman to nearly the
speed of the utf-8 decoding. Hye-Shik Chang showed off a fastmap
decoder with comparable performance. In the end, Walter's patch was
accepted.

.. _patch: http://www.python.org/sf/1313939

Contributing thread:

- `Unicode charmap decoders slow <http://mail.python.org/pipermail/python-dev/2005-October/056958.html>`__

[SJB]

------------------
Updates to PEP 343
------------------

Jason Orendorff proposed replacing the __enter__() and __exit__()
methods on context managers with a simple __with__() method instead.
While Guido was unconvinced that __enter__() and __exit__() should be
dropped, he was convinced that context managers should have a
__with__() method, in parallel with the __iter__() method for
iterators. There was some talk of special-casing the @contextmanager
decorator on the __with__() method, but no conclusion was reached.
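For reference, the enter/exit protocol under discussion -- shown here
with today's syntax and the contextlib.contextmanager decorator that
eventually shipped; the Tracked class itself is just an illustration::

```python
from contextlib import contextmanager

class Tracked:
    # The explicit protocol: __enter__ runs on the way in and its
    # return value is bound by "as"; __exit__ runs on the way out,
    # even if the body raises (returning False propagates exceptions).
    def __init__(self):
        self.events = []
    def __enter__(self):
        self.events.append("enter")
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False

@contextmanager
def tracked(events):
    # The generator form: code before the yield is "enter",
    # the finally clause is "exit".
    events.append("enter")
    try:
        yield events
    finally:
        events.append("exit")

t = Tracked()
with t:
    t.events.append("body")
print(t.events)  # -> ['enter', 'body', 'exit']
```

The __with__() debate was about how a context manager object gets
produced in the first place, not about this enter/exit sequencing.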
Contributing threads:

- `Proposed changes to PEP 343 <http://mail.python.org/pipermail/python-dev/2005-October/057040.html>`__
- `PEP 343 and __with__ <http://mail.python.org/pipermail/python-dev/2005-October/056931.html>`__

[SJB]

----------------------
str and unicode issues
----------------------

Martin Blais wanted to completely disable the implicit conversions
between unicode and str, so that you would always be forced to call
either .encode() or .decode() to convert between one and the other.
This is already available by adding
``sys.setdefaultencoding('undefined')`` to your sitecustomize.py file,
but the suggestion started another long discussion over unicode
issues. Antoine Pitrou suggested that a good rule of thumb is to
convert to unicode everything that is semantically textual, and to
only use str for what is to be treated semantically as a string of
bytes. Fredrik Lundh argued against this for efficiency reasons --
pure ASCII text would consume more space as a unicode object. There
were suggestions that in Python 3.0, opening files in text mode will
require an encoding and produce string objects, while opening files in
binary mode will produce bytes objects. The bytes() type will be a
mutable array of bytes, which can be converted to a string object by
specifying an encoding.

Contributing threads:

- `Divorcing str and unicode (no more implicit conversions).
  <http://mail.python.org/pipermail/python-dev/2005-October/056916.html>`__
- `unifying str and unicode <http://mail.python.org/pipermail/python-dev/2005-October/056934.html>`__
- `bytes type <http://mail.python.org/pipermail/python-dev/2005-October/056945.html>`__

[SJB]

----------------------------------------------------------------------
Allowing \*args syntax in tuple unpacking and before keyword arguments
----------------------------------------------------------------------

Gustavo Niemeyer proposed the oft-seen request of allowing the \*args
syntax in tuple unpacking, e.g.::

    for first, second, *rest in iterator:

Guido requested a PEP, saying that he wasn't convinced that there was
much of a gain over the already valid::

    for item in iterator:
        (first, second), rest = item[:2], item[2:]

Greg Ewing and others didn't like Guido's suggestion, as it violates
DRY (Don't Repeat Yourself). Others also chimed in with some examples
in support of the proposal, but no one has yet put together a PEP.

In a related matter, Guido indicated that he wants to be able to write
keyword-only arguments after a \*args, so that you could, for example,
write::

    f(a, b, *args, foo=1, bar=2, **kwds)

People seemed almost unanimously in support of this proposal, but, to
quote Nick Coghlan, it has still "never bugged anyone enough for them
to actually get around to fixing it".

Contributing thread:

- `Extending tuple unpacking <http://mail.python.org/pipermail/python-dev/2005-October/057056.html>`__

[SJB]

----------
AST Branch
----------

Guido gave the AST branch a three week ultimatum: either the branch
should be merged into MAIN within the next three weeks, or the branch
should be abandoned entirely. This jump-started work on the branch,
and the team was hoping to merge the changes the weekend of October
15th.
Contributing threads:

- `Python 2.5a1, ast-branch and PEP 342 and 343 <http://mail.python.org/pipermail/python-dev/2005-September/056449.html>`__
- `Python 2.5 and ast-branch <http://mail.python.org/pipermail/python-dev/2005-October/056986.html>`__
- `AST branch update <http://mail.python.org/pipermail/python-dev/2005-October/057281.html>`__

[SJB]

-----------------------------------
Allowing "return obj" in generators
-----------------------------------

Piet Delport suggested having ``return obj`` in generators be
translated into ``raise StopIteration(obj)``. The return value of a
generator function would thus be available as the first arg in the
StopIteration exception. Guido asked for some examples to give the
idea a better motivation, and felt uncomfortable with the return value
being silently ignored in for-loops. The idea was postponed until at
least one release after a PEP 342 implementation enters Python, so
that people can have some more experience with coroutines.

Contributing threads:

- `Proposal for 2.5: Returning values from PEP 342 enhanced generators <http://mail.python.org/pipermail/python-dev/2005-October/056957.html>`__
- `PEP 342 suggestion: start(), __call__() and unwind_call() methods <http://mail.python.org/pipermail/python-dev/2005-October/057042.html>`__
- `New PEP 342 suggestion: result() and allow "return with arguments" in generators (was Re: PEP 342 suggestion: start(), __call__() and unwind_call() methods) <http://mail.python.org/pipermail/python-dev/2005-October/057116.html>`__

[SJB]

-----------------------------
API for the line-number table
-----------------------------

Greg Ewing suggested trying to simplify the line-number table (lnotab)
by simply matching each byte-code index with a file and line number.
Phillip J. Eby pointed out that this would make the stdlib take up an
extra megabyte, suggesting two tables instead: one matching bytecodes
to line numbers, and one matching the first line number of a chunk
with its file.
Michael Hudson suggested that what we really want is an API for
accessing the lnotab, so that the particular implementation chosen
becomes less important. The conversation trailed off without a
resolution.

Contributing thread:

- `Simplify lnotab? (AST branch update) <http://mail.python.org/pipermail/python-dev/2005-October/057285.html>`__

[SJB]

------------------------------
Current directory and sys.path
------------------------------

A question about the status of `the CurrentVersion registry entry`_
led to a discussion about the different behaviors of sys.path across
platforms. Apparently, on Windows, sys.path includes the current
directory and the directory of the script being executed, while on
Linux, it includes only the directory of the script.

.. _the CurrentVersion registry entry: http://www.python.org/windows/python/registry.html

Contributing thread:

- `PythonCore\CurrentVersion <http://mail.python.org/pipermail/python-dev/2005-October/057095.html>`__

[SJB]

----------------------------------
Changing the __class__ of builtins
----------------------------------

As of Python 2.3, you can no longer change the __class__ of any
builtin. Phillip J. Eby suggested that these rules might be overly
strict; modules and other mutable objects could probably reasonably
have their __class__ changed. No one seemed really opposed to the
idea, but no one offered up a patch to make the change either.

Contributing thread:

- `Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) <http://mail.python.org/pipermail/python-dev/2005-October/057253.html>`__

[SJB]

------------------------------------------
exec function specification for Python 3.0
------------------------------------------

In Python 3.0, exec is slated to become a function (instead of a
statement). Currently, the presence of an exec statement in a function
causes some subtle changes, since Python has to worry about exec
modifying the function's locals.
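Passing an explicit namespace sidesteps that problem: assignments made
by the executed code land in a dictionary you supply, not in the
caller's local variables. A small sketch using the exec() function as
it exists today::

```python
x = "unchanged"
namespace = {}
# With an explicit dict, the executed code's assignment goes into
# that dict rather than into the surrounding scope.
exec("x = 40 + 2", namespace)
print(x, namespace["x"])  # -> unchanged 42
```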
Guido suggested that the exec() function could require a namespace,
basically dumping exec-in-local-namespace altogether. People seemed
generally in favor of the proposal, though no official specification
was established.

Contributing thread:

- `PEP 3000 and exec <http://mail.python.org/pipermail/python-dev/2005-October/057135.html>`__

[SJB]

------------------------------------
Adding opcodes to speed up self.attr
------------------------------------

Phillip J. Eby experimented with adding LOAD_SELF and SELF_ATTR
opcodes to improve the speed of object-oriented programming. This
gained about a 5% improvement in pystone, which isn't organized in a
very OO manner. People seemed uncertain as to whether paying the cost
of adding two opcodes to gain a 5% speedup was worth it. No decision
had been made at the time of this summary.

Contributing thread:

- `LOAD_SELF and SELF_ATTR opcodes <http://mail.python.org/pipermail/python-dev/2005-October/057321.html>`__

[SJB]

--------------------------------------
Dropping support for --disable-unicode
--------------------------------------

Reinhold Birkenfeld tried unsuccessfully to make the test suite pass
with --disable-unicode set. M.-A. Lemburg suggested that the feature
should be ripped out entirely, to simplify the code. Martin v. Löwis
suggested deprecating it first, to give people a chance to object. The
plan is now to add a note to the configure switch saying that the
feature will be removed in Python 2.6.
Contributing threads:

- `Tests and unicode <http://mail.python.org/pipermail/python-dev/2005-October/056897.html>`__
- `--disable-unicode (Tests and unicode) <http://mail.python.org/pipermail/python-dev/2005-October/056920.html>`__

[SJB]

-----------------------------------------
Bug in __getitem__ inheritance at C level
-----------------------------------------

Travis Oliphant discovered that the addition of the mp_item and
sq_item descriptors, and the resolution of any competition for
__getitem__ calls, is done *before* the inheritance of any slots takes
place. This means that if you create a type in C that supports the
sequence protocol and tries to inherit the mapping protocol from a
parent C type which does not support the sequence protocol,
__getitem__ will point to the parent type's __getitem__ instead of the
child type's __getitem__. This seemed like more of a bug than a
feature, so the behavior may be changed in future Pythons.

Contributing thread:

- `Why does __getitem__ slot of builtin call sequence methods first? <http://mail.python.org/pipermail/python-dev/2005-October/056901.html>`__

[SJB]

================
Deferred Threads
================

- `Early PEP draft (For Python 3000?)
  <http://mail.python.org/pipermail/python-dev/2005-October/057251.html>`__
- `Pythonic concurrency - offtopic <http://mail.python.org/pipermail/python-dev/2005-October/057294.html>`__

===============
Skipped Threads
===============

- `PEP 350: Codetags <http://mail.python.org/pipermail/python-dev/2005-October/056894.html>`__
- `Active Objects in Python <http://mail.python.org/pipermail/python-dev/2005-October/056896.html>`__
- `IDLE development <http://mail.python.org/pipermail/python-dev/2005-October/056907.html>`__
- `Help needed with MSI permissions <http://mail.python.org/pipermail/python-dev/2005-October/056908.html>`__
- `C API doc fix <http://mail.python.org/pipermail/python-dev/2005-October/056910.html>`__
- `Static builds on Windows (continued) <http://mail.python.org/pipermail/python-dev/2005-October/056976.html>`__
- `Removing the block stack (was Re: PEP 343 and __with__) <http://mail.python.org/pipermail/python-dev/2005-October/057001.html>`__
- `Removing the block stack <http://mail.python.org/pipermail/python-dev/2005-October/057008.html>`__
- `Lexical analysis and NEWLINE tokens <http://mail.python.org/pipermail/python-dev/2005-October/057014.html>`__
- `PyObject_Init documentation <http://mail.python.org/pipermail/python-dev/2005-October/057039.html>`__
- `Sourceforge CVS access <http://mail.python.org/pipermail/python-dev/2005-October/057051.html>`__
- `__doc__ behavior in class definitions <http://mail.python.org/pipermail/python-dev/2005-October/057066.html>`__
- `Sandboxed Threads in Python <http://mail.python.org/pipermail/python-dev/2005-October/057082.html>`__
- `Weekly Python Patch/Bug Summary <http://mail.python.org/pipermail/python-dev/2005-October/057092.html>`__
- `test_cmd_line failure on Kubuntu 5.10 with GCC 4.0 <http://mail.python.org/pipermail/python-dev/2005-October/057094.html>`__
- `defaultproperty (was: Re: RFC: readproperty) <http://mail.python.org/pipermail/python-dev/2005-October/057120.html>`__
- `async IO
  and helper threads <http://mail.python.org/pipermail/python-dev/2005-October/057121.html>`__
- `defaultproperty <http://mail.python.org/pipermail/python-dev/2005-October/057129.html>`__
- `Fwd: defaultproperty <http://mail.python.org/pipermail/python-dev/2005-October/057131.html>`__
- `C.E.R. Thoughts <http://mail.python.org/pipermail/python-dev/2005-October/057137.html>`__
- `problem with genexp <http://mail.python.org/pipermail/python-dev/2005-October/057175.html>`__
- `Python-Dev Digest, Vol 27, Issue 44 <http://mail.python.org/pipermail/python-dev/2005-October/057207.html>`__
- `Europeans attention please! <http://mail.python.org/pipermail/python-dev/2005-October/057233.html>`__

From tony.meyer at gmail.com Thu Nov 17 01:36:36 2005
From: tony.meyer at gmail.com (Tony Meyer)
Date: Thu, 17 Nov 2005 13:36:36 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to 2005-10-31
Message-ID: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com>

And this one brings us up-to-date (apart from the fortnight ending
yesterday). Again, if you have the time, please send any
comments/corrections to us. Once again, thanks to Steve for covering
me and getting this all out on his own.

=============
Announcements
=============

--------------
AST for Python
--------------

As of October 21st, Python's compiler now uses a real Abstract Syntax
Tree (AST)! This should make experimenting with new syntax much
easier, as well as allowing some optimizations that were difficult
with the previous Concrete Syntax Tree (CST). While there is no Python
interface to the AST yet, one is intended for the not-so-distant
future. Thanks again to all who contributed, most notably: Armin Rigo,
Brett Cannon, Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz,
Neil Schemenauer, Nick Coghlan and Tim Peters.
Contributing threads:

- `AST branch merge status <http://mail.python.org/pipermail/python-dev/2005-October/057347.html>`__
- `AST branch update <http://mail.python.org/pipermail/python-dev/2005-October/057387.html>`__
- `AST branch is in? <http://mail.python.org/pipermail/python-dev/2005-October/057483.html>`__
- `Questionable AST wibbles <http://mail.python.org/pipermail/python-dev/2005-October/057489.html>`__
- `[Jython-dev] Re: AST branch is in? <http://mail.python.org/pipermail/python-dev/2005-October/057642.html>`__

[SJB]

--------------------
Python on Subversion
--------------------

As of October 27th, Python is now on Subversion! The new repository is
http://svn.python.org/projects/. Check the `Developers FAQ`_ for
information on how to get yourself set up with Subversion. Thanks
again to Martin v. Löwis for making this possible!

.. _Developers FAQ: http://www.python.org/dev/devfaq.html#subversion-svn

Contributing threads:

- `Migrating to subversion <http://mail.python.org/pipermail/python-dev/2005-October/057424.html>`__
- `Freezing the CVS on Oct 26 for SVN switchover <http://mail.python.org/pipermail/python-dev/2005-October/057537.html>`__
- `CVS is read-only <http://mail.python.org/pipermail/python-dev/2005-October/057679.html>`__
- `Conversion to Subversion is complete <http://mail.python.org/pipermail/python-dev/2005-October/057690.html>`__

[SJB]

---------------
Faster decoding
---------------

M.-A. Lemburg checked in Walter Dörwald's patches that improve
decoding speeds by using a character map. These should make decoding
into mac-roman or iso8859-1 nearly as fast as decoding into utf-8.
Thanks again guys!
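The technique, as the summary above describes it, is a 256-character
table in which position i holds the Unicode character for byte i, with
undefined bytes mapped to the replacement character. A rough sketch of
the idea using codecs.charmap_decode (the actual patch builds such
tables for the shipped codecs; the ASCII-only table here is just for
illustration)::

```python
import codecs

# Entry i is the character for byte i; only ASCII is defined here,
# and every other byte maps to U+FFFD (the replacement character).
table = ''.join(chr(i) if i < 128 else '\ufffd' for i in range(256))

# charmap_decode does the per-byte table lookup in C and returns
# (decoded_text, number_of_bytes_consumed).
text, consumed = codecs.charmap_decode(b'abc\xff', 'strict', table)
print(ascii(text), consumed)  # -> 'abc\ufffd' 4
```

A single string lookup per byte is what makes this nearly as fast as
the utf-8 fast path.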
Contributing threads:

- `Unicode charmap decoders slow <http://mail.python.org/pipermail/python-dev/2005-October/057341.html>`__
- `New codecs checked in <http://mail.python.org/pipermail/python-dev/2005-October/057505.html>`__
- `KOI8_U (New codecs checked in) <http://mail.python.org/pipermail/python-dev/2005-October/057576.html>`__

[SJB]

=========
Summaries
=========

---------------------
Strings in Python 3.0
---------------------

Guido proposed that in Python 3.0, all character strings would be
unicode, possibly with multiple internal representations. Some of the
issues:

- Multiple implementations could make the C API difficult. If utf-8,
  utf-16 and utf-32 are all possible, what types should the C API pass
  around?

- Windows expects utf-16, so using any other encoding will mean that
  calls to Windows will have to convert to and from utf-16. However,
  even in current Python, all strings passed to Windows system calls
  have to undergo 8 bit to utf-16 conversion.

- Surrogates (two code units encoding one code point) can slow
  indexing down, because the number of bytes per character isn't
  constant. Note that even though utf-32 doesn't need surrogates, they
  may still be used (and must be interpreted correctly) in utf-32
  data. Also, in utf-32, "graphemes" (which correspond better to the
  traditional concept of a "character" than code points do) may still
  be composed of multiple code points, e.g. "é" (e with an acute
  accent) can be written as "e" followed by a combining acute accent.

This last issue was particularly vexing -- Guido thinks "it's a bad
idea to offer an indexing operation that isn't O(1)". A number of
proposals were put forward, including:

- Adding a flag to strings to indicate whether or not they have any
  surrogates in them. This makes indexing O(1) when no surrogates are
  in a string, but O(N) otherwise.

- Using a B-tree instead of an array for storage. This would make all
  indexing O(log N).

- Discouraging use of the indexing operations by providing an
  alternate API for strings.
This would require creating iterator-like objects that keep track of
position in the unicode object. Coming up with an API that's as usable
as the slicing API seemed difficult, though.

Contributing thread:

- `Divorcing str and unicode (no more implicit conversions). <http://mail.python.org/pipermail/python-dev/2005-October/057362.html>`__

[SJB]

-------------------
Unicode identifiers
-------------------

Martin v. Löwis suggested lifting the restriction that identifiers be
ASCII. There was some concern about confusability, with the contention
that confusions like "O" (uppercase O) for "0" (zero) and "1" (one)
for "l" (lowercase L) would only multiply if larger character sets
were allowed. Guido seemed less concerned about this problem than
about how easy it would be to share code across languages. Neil
Hodgson pointed out that even though a transliteration into English
exists for Japanese, the coders he knew preferred to use relatively
meaningless names, and Oren Tirosh indicated that Israeli programmers
often preferred transliterations for local business terminology. In
either case, with or without unicode identifiers the code would
already be hard to share. In the end, people seemed mostly in favor of
the idea, though there was some suggestion that it should wait until
Python 3.0.

Contributing threads:

- `Divorcing str and unicode (no more implicit conversions). <http://mail.python.org/pipermail/python-dev/2005-October/057362.html>`__
- `i18n identifiers (was: Divorcing str and unicode (no more implicit conversions). <http://mail.python.org/pipermail/python-dev/2005-October/057812.html>`__
- `i18n identifiers <http://mail.python.org/pipermail/python-dev/2005-October/057813.html>`__

[SJB]

-----------------
Property variants
-----------------

People still seem not quite pleased with properties, both in the
syntax and in how they interact with inheritance.
Guido proposed changing the property() builtin to accept strings for
fget, fset and fdel in addition to functions (as it currently does).
If strings were passed, the property() object would have late-binding
behavior; that is, the function to call wouldn't be looked up until
the attribute was accessed. Properties whose fget, fset and fdel
functions can be overridden in subclasses might then look like::

    class C(object):
        foo = property('getFoo', 'setFoo', None, 'the foo property')
        def getFoo(self):
            return self._foo
        def setFoo(self, foo):
            self._foo = foo

There were mixed reactions to this proposal. People liked getting the
expected behavior in subclasses, but it does violate DRY (Don't Repeat
Yourself). I posted an `alternative solution`_ using metaclasses that
would allow you to write properties like::

    class C(object):
        class foo(Property):
            """The foo property"""
            def get(self):
                return self._foo
            def set(self, foo):
                self._foo = foo

which operates correctly with subclasses and follows DRY, but
introduces a confusion about the referent of "self". There were also a
few suggestions of introducing a new syntax for properties (see
`Generalizing the class declaration syntax`_), which would have
produced things like::

    class C(object):
        Property foo():
            """The foo property"""
            def get(self):
                return self._foo
            def set(self, foo):
                self._foo = foo

At the moment at least, it looks like we'll be sticking with the
status quo.

.. _alternative solution: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/442418

Contributing threads:

- `Definining properties - a use case for class decorators? <http://mail.python.org/pipermail/python-dev/2005-October/057350.html>`__
- `Defining properties - a use case for class decorators?
  <http://mail.python.org/pipermail/python-dev/2005-October/057407.html>`__
- `properties and block statement <http://mail.python.org/pipermail/python-dev/2005-October/057419.html>`__
- `Property syntax for Py3k (properties and block statement) <http://mail.python.org/pipermail/python-dev/2005-October/057427.html>`__

[SJB]

-------------------
PEP 343 resolutions
-------------------

After Guido accepted the idea of adding a __with__() method to the
context protocol, `PEP 343`_ was reverted to "Proposed" until the
remaining details could be ironed out. The end results were:

- The slot name "__context__" will be used instead of "__with__".

- The builtin name "context" is currently off-limits due to its
  ambiguity.

- Generator-iterators do NOT have a native context.

- The builtin function "contextmanager" will convert a
  generator-function into a context manager.

- The "__context__" slot will NOT be special-cased. If it defines a
  generator, the __context__() function should be decorated with
  @contextmanager.

- When the result of a __context__() call returns an object that
  lacks an __enter__() or __exit__() method, an AttributeError will
  be raised.

- Only locks, files and decimal.Context objects will gain
  __context__() methods in Python 2.5.

Guido seemed to agree with all of these, but has not yet pronounced on
the revised `PEP 343`_.

..
.. _PEP 343: http://www.python.org/peps/pep-0343.html

Contributing threads:

- `PEP 343 updated <http://mail.python.org/pipermail/python-dev/2005-October/057349.html>`__
- `Proposed resolutions for open PEP 343 issues <http://mail.python.org/pipermail/python-dev/2005-October/057516.html>`__
- `PEP 343 - multiple context managers in one statement <http://mail.python.org/pipermail/python-dev/2005-October/057637.html>`__
- `PEP 343 updated with outcome of recent discussions <http://mail.python.org/pipermail/python-dev/2005-October/057769.html>`__

[SJB]

---------------
Freeze protocol
---------------

Barry Warsaw proposed `PEP 351`_, which suggests a freeze() builtin
that would call the __freeze__() method on an object if that object
was not hashable. This would allow dicts to automatically make frozen
copies of mutable objects when they were used as dict keys. It could
reduce the need for "x" and "frozenx" builtin pairs, since the frozen
versions could be automatically derived when needed. Raymond Hettinger
indicated some problems with the proposal:

- sets.Set supported something similar, but it was found to be not
  really helpful in practice.

- Freezing a list into a tuple is not appropriate, since they do not
  have all the same methods.

- Errors can arise when the mutable object gets out of sync with its
  frozen copy.

- Manually freezing things when necessary is relatively simple.

Noam Raphael proposed a copy-on-change mechanism which would
essentially give frozen copies of an object a reference to that
object. When the object is about to be modified, a copy would be made,
and all frozen copies would be pointed at this. Thus an object that
was mutable but never changed could have lightweight frozen copies,
while an object that did change would have to pay the usual copying
costs. Noam and Josiah Carlson then had a rather heated debate about
how feasible such a copy-on-change mechanism would be for Python.

..
_PEP 351: http://www.python.org/peps/pep-0351.html Contributing thread: - `PEP 351, the freeze protocol <http://mail.python.org/pipermail/python-dev/2005-October/057543.html>`__ [SJB] ---------------------------------- Required superclass for Exceptions ---------------------------------- Guido and Brett Cannon introduced `PEP 352`_, which proposes that all Exceptions be required to derive from a new exception class, BaseException. The children of BaseException would be KeyboardInterrupt, SystemExit and Exception (which would contain the remainder of the current hierarchy). The goal here is to make the following code do the right thing:: try: ... except Exception: ... Currently, this code fails to catch string exceptions and other exceptions that do not derive from Exception, and it (probably) inappropriately catches KeyboardInterrupt and SystemExit, which are supposed to indicate that Python is shutting down. The current plan is to introduce BaseException and have KeyboardInterrupt and SystemExit multiply inherit from Exception and BaseException. The PEP lists the roadmap for deprecating the various other types of exceptions. The PEP also attempts to standardize on the arguments to Exception objects, so that by Python 3.0, all Exceptions will support a single argument which will be stored as their "message" attribute. Guido was ready to accept it on October 31st, but it has not been marked as Accepted yet. ..
_PEP 352: http://www.python.org/peps/pep-0352.html Contributing threads: - `PEP 352: Required Superclass for Exceptions <http:// mail.python.org/pipermail/python-dev/2005-October/057736.html>`__ - `PEP 352 Transition Plan <http://mail.python.org/pipermail/python- dev/2005-October/057750.html>`__ [SJB] ----------------------------------------- Generalizing the class declaration syntax ----------------------------------------- Michele Simionato suggested a generalization of the class declaration syntax, so that:: <callable> <name> <tuple>: <definitions> would be translated into:: <name> = <callable>("<name>", <tuple>, <dict-of-definitions>) Where <dict-of-definitions> is simply the namespace that results from executing <definitions>. This would actually remove the need for the class keyword, as classes could be declared as:: type <classname> <bases>: <definitions> There were a few requests for a PEP, but nothing has been made available yet. Contributing thread: - `Definining properties - a use case for class decorators? <http:// mail.python.org/pipermail/python-dev/2005-October/057435.html>`__ [SJB] -------------------- Task-local variables -------------------- Phillip J. Eby introduced a pre-PEP proposing a mechanism similar to thread-local variables, to help co-routine schedulers to swap state between tasks. Essentially, the scheduler would be required to take a snapshot of a coroutine's variables before a swap, and restore that snapshot when the coroutine is swapped back. Guido asked people to hold off on more PEP 343-related proposals until with-blocks have been out in the wild for at least a release or two. 
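(A rough sketch of the snapshot-and-restore idea in plain Python follows; the ``run_step`` scheduler and the task dictionaries are invented here purely for illustration and are not part of the pre-PEP's actual API.)

```python
# Illustrative sketch only: a toy scheduler that snapshots "task-local"
# state around each step of a coroutine, so each task sees its own values.
_current = {}  # the "task-local" namespace visible to running code

def run_step(task):
    """Run one step of a task, swapping its saved locals in and out."""
    global _current
    saved = _current
    _current = task.setdefault("locals", {})   # restore this task's snapshot
    try:
        return next(task["coro"])
    finally:
        task["locals"] = _current              # snapshot before swapping back
        _current = saved

def counter():
    while True:
        _current["n"] = _current.get("n", 0) + 1
        yield _current["n"]

t1 = {"coro": counter()}
t2 = {"coro": counter()}
print(run_step(t1), run_step(t1), run_step(t2))  # prints: 1 2 1
```

Each task keeps its own "n" even though both coroutines read the same global name, because the scheduler swaps the namespace at every step.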
Contributing thread: - `Pre-PEP: Task-local variables <http://mail.python.org/pipermail/python-dev/2005-October/057464.html>`__ [SJB] ----------------------------------------- Attribute-style access for all namespaces ----------------------------------------- Eyal Lotem proposed replacing the globals() and locals() dicts with "module" and "frame" objects that would have attribute-style access instead of __getitem__-style access. Josiah Carlson noted that the first is already available by doing ``module = __import__(__name__)``, and suggested that monkeying around with function locals is never a good idea, so adding additional support for doing so is not useful. Contributing thread: - `Early PEP draft (For Python 3000?) <http://mail.python.org/pipermail/python-dev/2005-October/057251.html>`__ [SJB] --------------------------------- Yielding all items of an iterator --------------------------------- Gustavo J. A. M. Carneiro was looking for a nicer way of indicating that all items of an iterable should be yielded. Currently, you probably want to use a for-loop to express this, e.g.:: for step in animate(win, xrange(10)): # slide down yield step Andrew Koenig suggested that the syntax:: yield from <x> be equivalent to:: for i in x: yield i People seemed uncertain as to whether or not there were enough use cases to merit the additional syntax. Contributing thread: - `Coroutines, generators, function calling <http://mail.python.org/pipermail/python-dev/2005-October/057405.html>`__ [SJB] ----------------------------------------- Getting an AST without the Python runtime ----------------------------------------- Thanks to the merging of the AST branch, Evan Jones was able to fully divorce the Python parser from the Python runtime, so that you can get AST objects without having to have Python running. He made the divorced AST parser available on `his site`_. .. _his site: http://evanjones.ca/software/pyparser.html Contributing thread: - `Parser and Runtime: Divorced!
<http://mail.python.org/pipermail/ python-dev/2005-October/057684.html>`__ [SJB] =============== Skipped Threads =============== - `Pythonic concurrency - offtopic <http://mail.python.org/pipermail/ python-dev/2005-October/057294.html>`__ - `Sourceforge CVS access <http://mail.python.org/pipermail/python- dev/2005-October/057342.html>`__ - `Weekly Python Patch/Bug Summary <http://mail.python.org/pipermail/ python-dev/2005-October/057343.html>`__ - `Guido v. Python, Round 1 <http://mail.python.org/pipermail/python- dev/2005-October/057366.html>`__ - `Autoloading? (Making Queue.Queue easier to use) <http:// mail.python.org/pipermail/python-dev/2005-October/057368.html>`__ - `problem with genexp <http://mail.python.org/pipermail/python-dev/ 2005-October/057370.html>`__ - `PEP 3000 and exec <http://mail.python.org/pipermail/python-dev/ 2005-October/057380.html>`__ - `Pythonic concurrency - offtopic <http://mail.python.org/pipermail/ python-dev/2005-October/057442.html>`__ - `enumerate with a start index <http://mail.python.org/pipermail/ python-dev/2005-October/057459.html>`__ - `list splicing <http://mail.python.org/pipermail/python-dev/2005- October/057479.html>`__ - `bool(iter([])) changed between 2.3 and 2.4 <http://mail.python.org/ pipermail/python-dev/2005-October/057481.html>`__ - `A solution to the evils of static typing and interfaces? <http:// mail.python.org/pipermail/python-dev/2005-October/057485.html>`__ - `PEP 267 -- is the semantics change OK? 
<http://mail.python.org/ pipermail/python-dev/2005-October/057506.html>`__ - `DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16 <http://mail.python.org/pipermail/python-dev/2005-October/ 057508.html>`__ - `int(string) (was: DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16) <http://mail.python.org/pipermail/python-dev/2005-October/ 057510.html>`__ - `LXR site for Python CVS <http://mail.python.org/pipermail/python- dev/2005-October/057511.html>`__ - `int(string) <http://mail.python.org/pipermail/python-dev/2005- October/057512.html>`__ - `Comparing date+time w/ just time <http://mail.python.org/pipermail/ python-dev/2005-October/057514.html>`__ - `AST reverts PEP 342 implementation and IDLE starts working again <http://mail.python.org/pipermail/python-dev/2005-October/ 057528.html>`__ - `cross compiling python for embedded systems <http:// mail.python.org/pipermail/python-dev/2005-October/057534.html>`__ - `Inconsistent Use of Buffer Interface in stringobject.c <http:// mail.python.org/pipermail/python-dev/2005-October/057589.html>`__ - `Reminder: PyCon 2006 submissions due in a week <http:// mail.python.org/pipermail/python-dev/2005-October/057618.html>`__ - `MinGW and libpython24.a <http://mail.python.org/pipermail/python- dev/2005-October/057624.html>`__ - `make testall hanging on HEAD? <http://mail.python.org/pipermail/ python-dev/2005-October/057662.html>`__ - `"? operator in python" <http://mail.python.org/pipermail/ python-dev/2005-October/057673.html>`__ - `[Docs] MinGW and libpython24.a <http://mail.python.org/pipermail/ python-dev/2005-October/057693.html>`__ - `Help with inotify <http://mail.python.org/pipermail/python-dev/ 2005-October/057705.html>`__ - `[Python-checkins] commit of r41352 - in python/trunk: . 
Lib Lib/ distutils Lib/distutils/command Lib/encodings <http://mail.python.org/ pipermail/python-dev/2005-October/057780.html>`__ - `svn:ignore <http://mail.python.org/pipermail/python-dev/2005- October/057783.html>`__ - `svn checksum error <http://mail.python.org/pipermail/python-dev/ 2005-October/057790.html>`__ - `svn:ignore (Was: [Python-checkins] commit of r41352 - in python/ trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings) <http://mail.python.org/pipermail/python-dev/2005-October/ 057793.html>`__ - `StreamHandler eating exceptions <http://mail.python.org/pipermail/ python-dev/2005-October/057798.html>`__ - `a different kind of reduce... <http://mail.python.org/pipermail/ python-dev/2005-October/057814.html>`__ From mozbugbox at yahoo.com.au Thu Nov 17 05:55:08 2005 From: mozbugbox at yahoo.com.au (JustFillBug) Date: Thu, 17 Nov 2005 04:55:08 +0000 (UTC) Subject: [Python-Dev] Problems with the Python Memory Manager References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu> <437BC524.2030105@ee.byu.edu> Message-ID: <slrndno3ou.l63.mozbugbox@mozbugbox.somehost.org> On 2005-11-16, Travis Oliphant <oliphant at ee.byu.edu> wrote: > Josiah Carlson wrote: >>I seemed to have misunderstood the discussion. Was the original user >>accessing and saving copies of many millions of these doubles? >> > He *was* accessing them (therefore generating a call to an array-scalar > object creation function). But they *weren't being* saved. They were > being deleted soon after access. That's why it was so confusing that > his memory usage should continue to grow and grow so terribly. > > As verified by removing usage of the Python PyObject_MALLOC function, it > was the Python memory manager that was performing poorly. Even though > the array-scalar objects were deleted, the memory manager would not > re-use their memory for later object creation. 
Instead, the memory > manager kept allocating new arenas to cover the load (when it should > have been able to re-use the old memory that had been freed by the > deleted objects--- again, I don't know enough about the memory manager > to say why this happened). Well, the user has to call garbage collection before the memory is freed. Python won't free memory as long as it can allocate more. It sucks, but that is my experience with Python: when Python starts swapping on my machine, I have to add manual garbage-collection calls to my code. From ronaldoussoren at mac.com Thu Nov 17 07:06:02 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 17 Nov 2005 07:06:02 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437BE7A8.5000503@ee.byu.edu> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> Message-ID: <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> On 17-nov-2005, at 3:15, Travis Oliphant wrote: > Jim Jewett wrote: > >> > >> (2) Is he allocating new _types_, which I think don't get properly >> >> collected. >> >> > > Bingo. Yes, definitely allocating new _types_ (an awful lot of > them...) > --- that's what the "array scalars" are: new types created in C. Do you really mean that someArray[1] will create a new type to represent the second element of someArray? I would guess that you create an instance of a type defined in your extension. Ronald From fredrik at pythonware.com Thu Nov 17 09:29:52 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 17 Nov 2005 09:29:52 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu> <437BC524.2030105@ee.byu.edu> Message-ID: <dlhf20$dib$1@sea.gmane.org> Travis Oliphant wrote: > The fact that it did happen is what I'm reporting on.
If nothing will > be done about it (which I can understand), at least this thread might > help somebody else in a similar situation track down why their Python > process consumes all of their memory even though their objects are being > deleted appropriately. since that doesn't happen in other applications, I'm not sure this thread will help much -- unless you can provide us with enough details to figure out what makes this case so much different... </F> From fredrik at pythonware.com Thu Nov 17 09:44:06 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 17 Nov 2005 09:44:06 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> Message-ID: <dlhfsm$fq6$1@sea.gmane.org> Travis Oliphant wrote: > Bingo. Yes, definitely allocating new _types_ (an awful lot of them...) > --- that's what the "array scalars" are: new types created in C. are you allocating PyTypeObject structures dynamically? why are you creating an awful lot of new type objects to represent the contents of a homogenous array? > If they don't get properly collected then that would definitely have > created the problem. It would seem this should be advertised when > telling people to use PyObject_New for allocating new memory for > an object. PyObject_New creates a new instance of a given type; it doesn't, in itself, create a new type. at this point, your description doesn't make much sense. more information is definitely needed... 
</F> From mwh at python.net Thu Nov 17 10:42:23 2005 From: mwh at python.net (Michael Hudson) Date: Thu, 17 Nov 2005 09:42:23 +0000 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437BE7A8.5000503@ee.byu.edu> (Travis Oliphant's message of "Wed, 16 Nov 2005 19:15:04 -0700") References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> Message-ID: <2mek5ftxts.fsf@starship.python.net> Travis Oliphant <oliphant at ee.byu.edu> writes: > Bingo. Yes, definitely allocating new _types_ (an awful lot of them...) > --- that's what the "array scalars" are: new types created in C. Ah! And, er, why? > If they don't get properly collected then that would definitely have > created the problem. types do get collected -- but only after the cycle collector has run. If you can still reproduce the problem can you try again but calling 'gc.set_threshold(1)'? > It would seem this should be advertised when telling people to use > PyObject_New for allocating new memory for an object. Nevertheless, I think it would be good if pymalloc freed its arenas. I think the reason it doesn't is worry that people might call PyObject_Free without holding the GIL, but that's been verboten for several years now so we can probably just let them suffer. I think there's even a patch on SF to do this... Cheers, mwh -- The use of COBOL cripples the mind; its teaching should, therefore, be regarded as a criminal offence. -- Edsger W.
Dijkstra, SIGPLAN Notices, Volume 17, Number 5 From oliphant at ee.byu.edu Thu Nov 17 11:00:10 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu, 17 Nov 2005 03:00:10 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> Message-ID: <437C54AA.9020203@ee.byu.edu> >> >> Bingo. Yes, definitely allocating new _types_ (an awful lot of >> them...) >> --- that's what the "array scalars" are: new types created in C. > > > Do you really mean that someArray[1] will create a new type to represent > the second element of someArray? I would guess that you create an > instance of a type defined in your extension. O.K. my bad. I can see that I was confusing in my recent description and possibly misunderstood the questions I was asked. It can get confusing given the dynamic nature of Python. The array scalars are new statically defined (in C) types (just like regular Python integers and regular Python floats). The ndarray is also a statically defined type. The ndarray holds raw memory interpreted in a certain fashion (very similar to Python's array module). Each ndarray can have a certain data type. For every data type that an array can be, there is a corresponding "array scalar" type. All of these are statically defined types. We are only talking about instances of these defined types. When the result of a user operation with an ndarray is a scalar, an instance of the appropriate "array scalar" type is created and passed back to the user. Previously we were using PyObject_New in the tp_alloc slot and PyObject_Del in the tp_free slot of the typeobject structure in order to create and destroy the memory for these instances. 
In this particular application, the user ended up creating many, many instances of these array scalars and then deleting them soon after. Despite the fact that he was not retaining any references to these scalars (PyObject_Del had been called on them), his application ground to a halt after only a few hundred iterations, consuming all of the available system memory. To verify that indeed no references were being kept, I did a detailed analysis of the result of sys.getobjects() using a debug build of Python. When I replaced PyObject_New (with malloc and PyObject_Init) and PyObject_Del (with free) for the "array scalars" types in scipy core, the user's memory problems magically disappeared. I therefore assume that the problem is the memory manager in Python. Initially, I thought this was the old problem of Python not freeing memory once it grabs it. But that should not have been a problem here, because the code quickly frees most of the objects it creates and so Python should have been able to re-use the memory. So, I now believe that his code (plus the array scalar extension type) was actually exposing a real bug in the memory manager itself. In theory, the Python memory manager should have been able to re-use the memory for the array-scalar instances because they are always the same size. In practice, the memory was apparently not being re-used but instead new blocks were being allocated to handle the load. His code is quite complicated and it is difficult to replicate the problem. I realize this is not helpful for fixing the Python memory manager, and I wish I could be more helpful. However, replacing PyObject_New with malloc does solve the problem for us and that may help anybody else in this situation in the future.
Best regards, -Travis From walter at livinglogic.de Thu Nov 17 21:21:48 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 17 Nov 2005 21:21:48 +0100 Subject: [Python-Dev] Iterating a closed StringIO Message-ID: <437CE65C.7010107@livinglogic.de> Currently StringIO.StringIO and cStringIO.StringIO behave differently when iterating a closed stream: s = StringIO.StringIO("foo") s.close() s.next() gives StopIteration, but s = cStringIO.StringIO("foo") s.close() s.next() gives "ValueError: I/O operation on closed file". Should they raise the same exception? Should this be fixed for 2.5? Bye, Walter D?rwald From bcannon at gmail.com Thu Nov 17 21:46:15 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 17 Nov 2005 12:46:15 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <dlf7ak$ckg$1@sea.gmane.org> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> Message-ID: <bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com> On 11/16/05, Fredrik Lundh <fredrik at pythonware.com> wrote: > Thomas Lee wrote: > > > Even if it meant we had just one function call - one, safe function call > > that deallocated all the memory allocated within a function - that we > > had to put before each and every return, that's better than what we > > have. > > alloca? > > (duck) > But how widespread is its support (e.g., does Windows have it)? 
-Brett From bcannon at gmail.com Thu Nov 17 21:56:30 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 17 Nov 2005 12:56:30 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <437B3EF8.2030001@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <437B00BE.7060007@gmail.com> <437B2FE6.7080206@gmail.com> <437B3EF8.2030001@gmail.com> Message-ID: <bbaeab100511171256h57ce0596rab12bd197b1e6da6@mail.gmail.com> On 11/16/05, Thomas Lee <krumms at gmail.com> wrote: > Just messing around with some ideas. I was trying to avoid the ugly > macros (note my earlier whinge about a learning curve) but they're the > cleanest way I could think of to get around the problem without > resorting to a mass deallocation right at the end of the AST run. Which > may not be all that bad given we're going to keep everything in-memory > anyway until an error occurs ... anyway, anyway, I'm getting sidetracked :) > > The idea is to ensure that all allocations within a single function are > made using the pool so that a function finishes what it starts. This > way, if the function fails it alone is responsible for cleaning up its > own pool and that's all. No funkyness needed for sequences, because each > member of the sequence belongs to the pool too. Note that the stmt_ty > instances are also allocated using the pool. > > This breaks interfaces all over the place though. Not exactly a pretty > change :) But yeah, maybe somebody smarter than I will come up with > something a bit cleaner. > > -- > > /* snip! 
*/ > > #define AST_SUCCESS(pool, result) return result > #define AST_FAILURE(pool, result) asdl_pool_free(pool); return result > This is actually exactly what I was thinking of; macros that handle returns and specify whether the return signals a success or failure. One tweak I would do is possibly lock down the variable name with AST_POOL_ALLOC() at the start of a function that creates _arena_pool. That way you don't need to pass in the specific pool. I don't see why we will need to have multiple pools within a function. This also allows the VISIT_* macros to be easily modified and not suddenly require another argument to specify the arena name. And all of this is easy to police since you can grep for 'return' and make sure that it is meant to be there and not in actuality be one of the macros. Basically gives us the mini-language that Nick mentioned way back at the beginning of this thread. Oh, and tweak the macros to be within ``do { ... } while(0)`` (``if (1) AST_FAILURE(pool, NULL);`` will not expand properly otherwise). -Brett From guido at python.org Thu Nov 17 22:03:49 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 17 Nov 2005 13:03:49 -0800 Subject: [Python-Dev] Iterating a closed StringIO In-Reply-To: <437CE65C.7010107@livinglogic.de> References: <437CE65C.7010107@livinglogic.de> Message-ID: <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com> On 11/17/05, Walter Dörwald <walter at livinglogic.de> wrote: > Currently StringIO.StringIO and cStringIO.StringIO behave differently > when iterating a closed stream: > > s = StringIO.StringIO("foo") > s.close() > s.next() > > gives StopIteration, but > > s = cStringIO.StringIO("foo") > s.close() > s.next() > > gives "ValueError: I/O operation on closed file". > > Should they raise the same exception? Should this be fixed for 2.5? I think cStringIO is doing the right thing; "real" files behave the same way.
Submit a patch for StringIO (also docs please) and assign it to me and I'll make sure it goes in. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at gmail.com Thu Nov 17 23:27:41 2005 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 17 Nov 2005 14:27:41 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com> Message-ID: <80315A07-6C80-4E27-9CA8-F62719775307@gmail.com> On Nov 17, 2005, at 12:46 PM, Brett Cannon wrote: ... >> alloca? >> >> (duck) >> > > But how widespread is its support (e.g., does Windows have it)? 
Yep, spelled with a leading underscore: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt__alloca.asp Alex From blais at furius.ca Fri Nov 18 00:26:22 2005 From: blais at furius.ca (Martin Blais) Date: Thu, 17 Nov 2005 18:26:22 -0500 Subject: [Python-Dev] Coroutines (PEP 342) In-Reply-To: <1147958111.20051114154658@gmail.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> <ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com> <4359047B.6020203@gmail.com> <1147958111.20051114154658@gmail.com> Message-ID: <8393fff0511171526o37738c50iabbb2f73eb59c56e@mail.gmail.com> On 11/14/05, Bruce Eckel <BruceEckel-Python3234 at mailblocks.com> wrote: > I just finished reading PEP 342, and it appears to follow Hoare's > Communicating Sequential Processes (CSP) where a process is a > coroutine, and the communication is via yield and send(). It seems that > if you follow that form (and you don't seem forced to, pythonically), > then synchronization is not an issue. > > What is not clear to me, and is not discussed in the PEP, is whether > coroutines can be distributed among multiple processors. If that is or > isn't possible I think it should be explained in the PEP, and I'd be > interested in knowing about it here (and ideally why it would or wouldn't > work). It seems to me that the concept of coroutines and PEP342 has very little to do with concurrency itself, apart from the fact that the generators form very convenient units of parallelization if you're willing to do some scheduling of them yourself, and only *potentially* with concurrency, i.e. only if you wrote a scheduler that supports running generator iterations concurrently on two processors. Otherwise there is no concurrency abstraction, unlike threads: it's cooperative and you clearly can see in the code the points where "switching" occurs (next(...
), yield ...). Please beat me with a stick if this is lunatic... From walter at livinglogic.de Fri Nov 18 01:03:26 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 18 Nov 2005 01:03:26 +0100 Subject: [Python-Dev] Iterating a closed StringIO In-Reply-To: <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com> References: <437CE65C.7010107@livinglogic.de> <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com> Message-ID: <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de> Am 17.11.2005 um 22:03 schrieb Guido van Rossum: > On 11/17/05, Walter D?rwald <walter at livinglogic.de> wrote: >> Currently StringIO.StringIO and cStringIO.StringIO behave differently >> when iterating a closed stream: >> >> s = StringIO.StringIO("foo") >> s.close() >> s.next() >> >> gives StopIteration, but >> >> s = cStringIO.StringIO("foo") >> s.close() >> s.next() >> >> gives "ValueError: I/O operation on closed file". >> >> Should they raise the same exception? Should this be fixed for 2.5? > > I think cStringIO is doing the right thing; "real" files behave the > same way. > > Submit a patch for StringIO (also docs please) and assign it to me and > I'll make sure it goes in. http://www.python.org/sf/1359365 Doc/lib/libstringio.tex only states "See the description of file objects for operations", so I'm not sure how to update the documentation. 
Bye, Walter D?rwald From krumms at gmail.com Fri Nov 18 02:00:56 2005 From: krumms at gmail.com (Thomas Lee) Date: Fri, 18 Nov 2005 11:00:56 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <80315A07-6C80-4E27-9CA8-F62719775307@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com> <80315A07-6C80-4E27-9CA8-F62719775307@gmail.com> Message-ID: <437D27C8.9040703@gmail.com> Portability may also be an issue to take into consideration: http://www.eskimo.com/~scs/C-faq/q7.32.html http://archives.neohapsis.com/archives/postfix/2001-05/1305.html Cheers, Tom Alex Martelli wrote: >On Nov 17, 2005, at 12:46 PM, Brett Cannon wrote: > ... > > >>>alloca? >>> >>>(duck) >>> >>> >>> >>But how widespread is its support (e.g., does Windows have it)? 
>> >> > >Yep, spelled with a leading underscore: >http://msdn.microsoft.com/library/default.asp?url=/library/en-us/ >vclib/html/_crt__alloca.asp > > >Alex > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/krumms%40gmail.com > > > From guido at python.org Fri Nov 18 02:16:23 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 17 Nov 2005 17:16:23 -0800 Subject: [Python-Dev] Iterating a closed StringIO In-Reply-To: <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de> References: <437CE65C.7010107@livinglogic.de> <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com> <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de> Message-ID: <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com> On 11/17/05, Walter D?rwald <walter at livinglogic.de> wrote: > Am 17.11.2005 um 22:03 schrieb Guido van Rossum: > > > On 11/17/05, Walter D?rwald <walter at livinglogic.de> wrote: > >> Currently StringIO.StringIO and cStringIO.StringIO behave differently > >> when iterating a closed stream: > >> > >> s = StringIO.StringIO("foo") > >> s.close() > >> s.next() > >> > >> gives StopIteration, but > >> > >> s = cStringIO.StringIO("foo") > >> s.close() > >> s.next() > >> > >> gives "ValueError: I/O operation on closed file". > >> > >> Should they raise the same exception? Should this be fixed for 2.5? > > > > I think cStringIO is doing the right thing; "real" files behave the > > same way. > > > > Submit a patch for StringIO (also docs please) and assign it to me and > > I'll make sure it goes in. > > http://www.python.org/sf/1359365 Thanks! > Doc/lib/libstringio.tex only states "See the description of file > objects for operations", so I'm not sure how to update the > documentation. OK, so that's a no-op. I hope there isn't anyone here who believes this patch would be a bad idea? 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at gmail.com Fri Nov 18 02:19:45 2005 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 17 Nov 2005 17:19:45 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <437D27C8.9040703@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com> <80315A07-6C80-4E27-9CA8-F62719775307@gmail.com> <437D27C8.9040703@gmail.com> Message-ID: <7E441ACF-1ADE-4141-953E-C64272D0629D@gmail.com> On Nov 17, 2005, at 5:00 PM, Thomas Lee wrote: > Portability may also be an issue to take into consideration: Of course -- but so is anno domini... the eskimo.com FAQ is (C) 1995, and the neohapsis.com page just points to the eskimo.com one: > http://www.eskimo.com/~scs/C-faq/q7.32.html > http://archives.neohapsis.com/archives/postfix/2001-05/1305.html In 2006, I'm not sure the need to avoid alloca is anywhere as strong. Sure, it could be wrapped into a LOCALLOC macro (with a companion LOCFREE one), the macro expanding to alloca/nothing on systems which do have alloca and to malloc/free elsewhere -- this would keep the sources just as cluttered, but still speed things up where feasible. E.g., on my iBook, a silly benchmark just freeing and allocating 80,000 hunks of 1000 bytes takes 13ms with alloca, 57 without... 
Alex From walter at livinglogic.de Fri Nov 18 11:29:17 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 18 Nov 2005 11:29:17 +0100 Subject: [Python-Dev] Iterating a closed StringIO In-Reply-To: <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com> References: <437CE65C.7010107@livinglogic.de> <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com> <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de> <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com> Message-ID: <5D3C125B-A1D4-476A-BF5C-51346238A0F6@livinglogic.de> Am 18.11.2005 um 02:16 schrieb Guido van Rossum: > On 11/17/05, Walter D?rwald <walter at livinglogic.de> wrote: >> Am 17.11.2005 um 22:03 schrieb Guido van Rossum: >> >>> On 11/17/05, Walter D?rwald <walter at livinglogic.de> wrote: >>>> [...] >>>> Should they raise the same exception? Should this be fixed for 2.5? >>> >>> I think cStringIO is doing the right thing; "real" files behave the >>> same way. >>> >>> Submit a patch for StringIO (also docs please) and assign it to >>> me and >>> I'll make sure it goes in. >> >> http://www.python.org/sf/1359365 > > Thanks! > >> Doc/lib/libstringio.tex only states "See the description of file >> objects for operations", so I'm not sure how to update the >> documentation. > > OK, so that's a no-op. > > I hope there isn't anyone here who believes this patch would be a > bad idea? I wonder whether we should introduce a new exception class for these kinds or error. IMHO ValueError is much to unspecific. What about StateError? 
Bye, Walter D?rwald From walter at livinglogic.de Fri Nov 18 13:15:33 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 18 Nov 2005 13:15:33 +0100 Subject: [Python-Dev] isatty() on closed StringIO (was: Iterating a closed StringIO) In-Reply-To: <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com> References: <437CE65C.7010107@livinglogic.de> <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com> <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de> <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com> Message-ID: <437DC5E5.1030302@livinglogic.de> Guido van Rossum wrote: > On 11/17/05, Walter D?rwald <walter at livinglogic.de> wrote: > >>Am 17.11.2005 um 22:03 schrieb Guido van Rossum: >> >> >>>On 11/17/05, Walter D?rwald <walter at livinglogic.de> wrote: >>> >>>>Currently StringIO.StringIO and cStringIO.StringIO behave differently >>>>when iterating a closed stream: > [...] > > I hope there isn't anyone here who believes this patch would be a bad idea? BTW, isatty() has a similar problem: >>> import StringIO, cStringIO >>> s = StringIO.StringIO() >>> s.close() >>> s.isatty() Traceback (most recent call last): File "<stdin>", line 1, in ? File "/usr/local/lib/python2.4/StringIO.py", line 93, in isatty _complain_ifclosed(self.closed) File "/usr/local/lib/python2.4/StringIO.py", line 40, in _complain_ifclosed raise ValueError, "I/O operation on closed file" ValueError: I/O operation on closed file >>> s = cStringIO.StringIO() >>> s.close() >>> s.isatty() False I guess cStringIO.StringIO.isatty() should raise an exception too. 
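[As it happens, this is how the modern `io` module settled the question: once a stream is closed, every method, `isatty()` included, complains rather than answering. A minimal check, runnable on a current Python:]

```python
import io

s = io.StringIO()
assert s.isatty() is False  # an open in-memory stream is plainly not a tty

s.close()
try:
    s.isatty()
except ValueError:
    # The closed stream raises instead of answering False,
    # matching the real-file behavior Walter describes.
    print("closed stream: isatty() raises ValueError")
```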
Bye, Walter D?rwald From walter at livinglogic.de Fri Nov 18 14:26:09 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 18 Nov 2005 14:26:09 +0100 Subject: [Python-Dev] Another StringIO/cStringIO discrepancy Message-ID: <437DD671.40809@livinglogic.de> >>> import StringIO, cStringIO >>> s = StringIO.StringIO() >>> s.truncate(-42) Traceback (most recent call last): File "<stdin>", line 1, in ? File "/usr/local/lib/python2.4/StringIO.py", line 203, in truncate raise IOError(EINVAL, "Negative size not allowed") IOError: [Errno 22] Negative size not allowed >>> s = cStringIO.StringIO() >>> s.truncate(-42) >>> From michael.walter at gmail.com Fri Nov 18 14:32:44 2005 From: michael.walter at gmail.com (Michael Walter) Date: Fri, 18 Nov 2005 14:32:44 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <7E441ACF-1ADE-4141-953E-C64272D0629D@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <bbaeab100511171246v5c0ea6bei93480a669011042e@mail.gmail.com> <80315A07-6C80-4E27-9CA8-F62719775307@gmail.com> <437D27C8.9040703@gmail.com> <7E441ACF-1ADE-4141-953E-C64272D0629D@gmail.com> Message-ID: <877e9a170511180532nc1ba329m48f1e61e2338e6df@mail.gmail.com> The behavior of libiberty's alloca() replacement might be interesting as well: http://gcc.gnu.org/onlinedocs/libiberty/Functions.html#index-alloca-59 Regards, Michael On 11/18/05, Alex Martelli <aleaxit at gmail.com> wrote: > > On Nov 17, 2005, at 5:00 PM, Thomas Lee wrote: > > > Portability may also be an issue to take into consideration: > > Of course -- but so is anno domini... 
the eskimo.com FAQ is (C) 1995, > and the neohapsis.com page just points to the eskimo.com one: > > > http://www.eskimo.com/~scs/C-faq/q7.32.html > > http://archives.neohapsis.com/archives/postfix/2001-05/1305.html > > In 2006, I'm not sure the need to avoid alloca is anywhere as > strong. Sure, it could be wrapped into a LOCALLOC macro (with a > companion LOCFREE one), the macro expanding to alloca/nothing on > systems which do have alloca and to malloc/free elsewhere -- this > would keep the sources just as cluttered, but still speed things up > where feasible. E.g., on my iBook, a silly benchmark just freeing > and allocating 80,000 hunks of 1000 bytes takes 13ms with alloca, 57 > without... > > > Alex > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com > From ncoghlan at gmail.com Fri Nov 18 14:57:19 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 18 Nov 2005 23:57:19 +1000 Subject: [Python-Dev] Iterating a closed StringIO In-Reply-To: <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com> References: <437CE65C.7010107@livinglogic.de> <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com> <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de> <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com> Message-ID: <437DDDBF.7010309@gmail.com> Guido van Rossum wrote: > > I hope there isn't anyone here who believes this patch would be a bad idea? Not me, but the Iterator protocol docs may need a minor tweak. Currently they say this: "The intention of the protocol is that once an iterator's next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken." 
This wording is a bit too strong, as it's perfectly acceptable for an object to provide other methods which affect the result of subsequent calls to the next() method (examples being the seek() and close() methods in the file interface). The current wording does describe the basic intent of the API correctly, but you could forgiven for thinking that it ruled out modifying the state of a completed iterator in a way that restarts it, or causes it to raise an exception other than StopIteration. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From raymond.hettinger at verizon.net Fri Nov 18 15:29:38 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 18 Nov 2005 09:29:38 -0500 Subject: [Python-Dev] Iterating a closed StringIO In-Reply-To: <437DDDBF.7010309@gmail.com> Message-ID: <005401c5ec4c$81d4be40$91af2c81@oemcomputer> [Guido van Rossum] > > I hope there isn't anyone here who believes this patch would be a bad > idea? [Nick Coglan] > Not me, but the Iterator protocol docs may need a minor tweak. Currently > they > say this: > > "The intention of the protocol is that once an iterator's next() method > raises > StopIteration, it will continue to do so on subsequent calls. > Implementations > that do not obey this property are deemed broken." FWIW, here is wording for PEP 342's close() method: """ 4. Add a close() method for generator-iterators, which raises GeneratorExit at the point where the generator was paused. If the generator then raises StopIteration (by exiting normally, or due to already being closed) or GeneratorExit (by not catching the exception), close() returns to its caller. If the generator yields a value, a RuntimeError is raised. If the generator raises any other exception, it is propagated to the caller. close() does nothing if the generator has already exited due to an exception or normal exit. 
""" For Walter's original question, my preference is to change the behavior of regular files to raise StopIteration when next() is called on an iterator for a closed file. The current behavior is an implementation artifact stemming from a file being its own iterator object. In the future, it is possible that iter(somefileobject) will return a distinct iterator object and perhaps allow multiple, distinct iterators over the same file. Also, it is sometimes nice to wrap one iterator with another (perhaps with itertools or somesuch). That use case depends on the underlying iterator raising StopIteration instead of some other exception: f = open(somefilename) for lineno, line in enumerate(f): . . . Raymond From jimjjewett at gmail.com Fri Nov 18 16:29:23 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 18 Nov 2005 10:29:23 -0500 Subject: [Python-Dev] Memory management in the AST parser & compiler Message-ID: <fb6fbf560511180729y1037f23cv832cd5edc1f1c327@mail.gmail.com> There is a public-domain implementation of alloca at http://www.cs.purdue.edu/homes/apm/courses/BITSC461-fall03/listen-code/listen-1.0-dave/lsl_cpp/alloca.c It would still fail on architectures that don't use a stack frame; other than that, it seems like a reasonable fallback, if alloca is otherwise desirable. -jJ From guido at python.org Fri Nov 18 16:49:55 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 18 Nov 2005 07:49:55 -0800 Subject: [Python-Dev] Iterating a closed StringIO In-Reply-To: <005401c5ec4c$81d4be40$91af2c81@oemcomputer> References: <437DDDBF.7010309@gmail.com> <005401c5ec4c$81d4be40$91af2c81@oemcomputer> Message-ID: <ca471dc20511180749x1f8fa1dfn7e79ce7070af88be@mail.gmail.com> On 11/18/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote: > For Walter's original question, my preference is to change the behavior > of regular files to raise StopIteration when next() is called on an > iterator for a closed file. I disagree. 
As long as there is a possibility that you might still want to use the iterator (even if it's exhausted) you shouldn't close the file. Closing a file is a strong indicator that you believe that there is no more use of the file, and *all* file methods change their behavior at that point; e.g. read() on a closed file raises an exception instead of returning an empty string. This is to catch the *bug* of closing a file that is still being used. Now it's questionable whether ValueError is the best exception in this case, since that is an exception which reasonable programmers often catch (e.g. when parsing a string that's supposed to represent an int). But I propose to leave the choice of exception reform for Python 3000. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Nov 18 16:52:12 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 18 Nov 2005 07:52:12 -0800 Subject: [Python-Dev] Another StringIO/cStringIO discrepancy In-Reply-To: <437DD671.40809@livinglogic.de> References: <437DD671.40809@livinglogic.de> Message-ID: <ca471dc20511180752m5658acd7lc43e7762e3063e47@mail.gmail.com> On 11/18/05, Walter D?rwald <walter at livinglogic.de> wrote: > >>> import StringIO, cStringIO > >>> s = StringIO.StringIO() > >>> s.truncate(-42) > Traceback (most recent call last): > File "<stdin>", line 1, in ? > File "/usr/local/lib/python2.4/StringIO.py", line 203, in truncate > raise IOError(EINVAL, "Negative size not allowed") > IOError: [Errno 22] Negative size not allowed > >>> s = cStringIO.StringIO() > >>> s.truncate(-42) > >>> Well, what does a regular file say in this case? 
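[On a modern Python the answer to Guido's question can be checked directly: the in-memory stream rejects a negative size up front, and a real file lets the OS reject it. A quick present-day analogue of the comparison, using the `io` module and a temporary file; the exact exception type for the real file can vary by platform, hence the broad catch:]

```python
import io
import tempfile

# In-memory stream: rejects the bad size itself, with ValueError.
s = io.StringIO()
try:
    s.truncate(-42)
except ValueError as exc:
    print("StringIO:", exc)

# Real file: the negative size is rejected as well, typically as
# OSError(EINVAL) from the underlying ftruncate call.
with tempfile.TemporaryFile() as f:
    try:
        f.truncate(-42)
    except (ValueError, OSError) as exc:
        print("file:", type(exc).__name__)
```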
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From walter at livinglogic.de Fri Nov 18 17:30:33 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 18 Nov 2005 17:30:33 +0100 Subject: [Python-Dev] Another StringIO/cStringIO discrepancy In-Reply-To: <ca471dc20511180752m5658acd7lc43e7762e3063e47@mail.gmail.com> References: <437DD671.40809@livinglogic.de> <ca471dc20511180752m5658acd7lc43e7762e3063e47@mail.gmail.com> Message-ID: <437E01A9.40208@livinglogic.de> Guido van Rossum wrote: > On 11/18/05, Walter D?rwald <walter at livinglogic.de> wrote: > >> >>> import StringIO, cStringIO >> >>> s = StringIO.StringIO() >> >>> s.truncate(-42) >>Traceback (most recent call last): >> File "<stdin>", line 1, in ? >> File "/usr/local/lib/python2.4/StringIO.py", line 203, in truncate >> raise IOError(EINVAL, "Negative size not allowed") >>IOError: [Errno 22] Negative size not allowed >> >>> s = cStringIO.StringIO() >> >>> s.truncate(-42) >> >>> > > > Well, what does a regular file say in this case? 
IOError: [Errno 22] Invalid argument Bye, Walter D?rwald From walter at livinglogic.de Fri Nov 18 17:51:51 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 18 Nov 2005 17:51:51 +0100 Subject: [Python-Dev] isatty() on closed StringIO In-Reply-To: <20051118101801.U90899@familjen.svensson.org> References: <437CE65C.7010107@livinglogic.de> <ca471dc20511171303t637ad7ddtd7ee2753840e2d6@mail.gmail.com> <140808AA-CFCA-4679-B5CC-24D21D45C3A3@livinglogic.de> <ca471dc20511171716x6dca0cb0qb81cae74beb9ed63@mail.gmail.com> <437DC5E5.1030302@livinglogic.de> <20051118101801.U90899@familjen.svensson.org> Message-ID: <437E06A7.1030908@livinglogic.de> Paul Svensson wrote: > On Fri, 18 Nov 2005, Walter D?rwald wrote: > >> BTW, isatty() has a similar problem: >> >> >>> import StringIO, cStringIO >> >>> s = StringIO.StringIO() >> >>> s.close() >> >>> s.isatty() >> Traceback (most recent call last): >> File "<stdin>", line 1, in ? >> File "/usr/local/lib/python2.4/StringIO.py", line 93, in isatty >> _complain_ifclosed(self.closed) >> File "/usr/local/lib/python2.4/StringIO.py", line 40, in >> _complain_ifclosed >> raise ValueError, "I/O operation on closed file" >> ValueError: I/O operation on closed file >> >>> s = cStringIO.StringIO() >> >>> s.close() >> >>> s.isatty() >> False >> >> I guess cStringIO.StringIO.isatty() should raise an exception too. > > > Why ? Is there any doubt that it's not a tty ? No, but for real files a ValueError is raised too. 
Bye, Walter D?rwald From martin at v.loewis.de Fri Nov 18 23:13:25 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 18 Nov 2005 23:13:25 +0100 Subject: [Python-Dev] str.dedent In-Reply-To: <b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com> References: <dga72k$cah$1@sea.gmane.org> <b348a0850511121152h63746cb1jfa6bf90339f9439b@mail.gmail.com> <43777B5A.6030602@egenix.com> <200511140920.51724.gmccaughan@synaptics-uk.com> <437869DD.7040800@egenix.com> <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com> <dlaqds$8sb$1@sea.gmane.org> <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com> <43791442.8050109@v.loewis.de> <b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com> Message-ID: <437E5205.2010001@v.loewis.de> Noam Raphael wrote: > I just wanted to add another use case: long messages. Consider those > lines from idlelib/run.py:133 > > msg = "IDLE's subprocess can't connect to %s:%d. This may be due "\ > "to your personal firewall configuration. It is safe to "\ > "allow this internal connection because no data is visible on "\ > "external ports." % address > tkMessageBox.showerror("IDLE Subprocess Error", msg, parent=root) You are missing an important point here: There are intentionally no line breaks in this string; it must be a single line, or else showerror will break it in funny ways. So converting it to a multi-line string would break it, dedent or not. Regards, Martin From guido at python.org Sat Nov 19 05:44:51 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 18 Nov 2005 20:44:51 -0800 Subject: [Python-Dev] Enjoy a week without me Message-ID: <ca471dc20511182044v59c3de85gc58a5814d5330713@mail.gmail.com> Folks, I'm off for a week with my wife's family (and one unlucky turkey :-) in a place where I can't care about email. I will be back here on Monday Nov 28. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From kbk at shore.net Sat Nov 19 07:34:23 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat, 19 Nov 2005 01:34:23 -0500 (EST) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200511190634.jAJ6YNMh017166@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 379 open (+14) / 2968 closed ( +7) / 3347 total (+21) Bugs : 910 open ( +6) / 5384 closed (+17) / 6294 total (+23) RFE : 200 open ( +0) / 191 closed ( +2) / 391 total ( +2) New / Reopened Patches ______________________ PythonD DJGPP-specific patch set for porting to DOS. (2005-11-08) http://python.org/sf/1351020 opened by Ben Decker PythonD new file: python2.4/plat-ms-dos5/djstat.py (2005-11-08) http://python.org/sf/1351036 opened by Ben Decker askyesnocancel helper for tkMessageBox (2005-11-08) http://python.org/sf/1351744 opened by Fredrik Lundh fix for resource leak in _subprocess (2005-11-09) CLOSED http://python.org/sf/1351997 opened by Fredrik Lundh [PATCH] Bug #1351707 (2005-11-10) http://python.org/sf/1352711 opened by Thomas Lee Small upgrades to platform.platform() (2005-11-10) http://python.org/sf/1352731 opened by daishi harada a faster Modulefinder (2005-11-11) http://python.org/sf/1353872 opened by Thomas Heller support whence argument for GzipFile.seek (bug #1316069) (2005-11-12) http://python.org/sf/1355023 opened by Fredrik Lundh PEP 341 - Unification of try/except and try/finally (2005-11-14) http://python.org/sf/1355913 opened by Thomas Lee Delete Python-ast.[ch] during "make depclean" (#1355883) (2005-11-14) CLOSED http://python.org/sf/1355940 opened by Thomas Lee Python-ast.h & Python-ast.c generated twice (#1355883) (2005-11-14) http://python.org/sf/1355971 opened by Thomas Lee Sort nodes when writing to file (2005-11-14) CLOSED http://python.org/sf/1356571 opened by Johan Str?m potential crash and free memory read (2005-11-15) http://python.org/sf/1357836 opened by Neal Norwitz ftplib 
dir() problem with certain servers (2005-11-17) http://python.org/sf/1359217 opened by Stuart D. Gathman Iterating closed StringIO.StringIO (2005-11-18) http://python.org/sf/1359365 opened by Walter D?rwald Speed charmap encoder (2005-11-18) http://python.org/sf/1359618 opened by Martin v. L?wis Patch for (Doc) #1357604 (2005-11-18) http://python.org/sf/1359879 opened by Peter van Kampen Add XML-RPC Fault Interoperability to XMLRPC libraries (2005-11-18) http://python.org/sf/1360243 opened by Joshua Ginsberg correct display of pathnames in SimpleHTTPServer (2005-11-18) http://python.org/sf/1360443 opened by Ori Avtalion Auto Complete module for IDLE (2005-11-19) http://python.org/sf/1361016 opened by Jerry Patches Closed ______________ Redundant connect() call in logging.handlers.SysLogHandler (2005-11-07) http://python.org/sf/1350658 closed by vsajip incomplete support for AF_PACKET in socketmodule.c (2004-11-19) http://python.org/sf/1069624 closed by gustavo fix for resource leak in _subprocess (2005-11-09) http://python.org/sf/1351997 closed by effbot Info Associated with Merge to AST (2005-01-07) http://python.org/sf/1097671 closed by kbk Delete Python-ast.[ch] during "make depclean" (#1355883) (2005-11-13) http://python.org/sf/1355940 closed by montanaro Sort nodes when writing to file (2005-11-14) http://python.org/sf/1356571 closed by effbot CodeContext - an extension to show you where you are (2004-04-16) http://python.org/sf/936169 closed by kbk New / Reopened Bugs ___________________ win32serviceutil bug (2005-11-08) CLOSED http://python.org/sf/1351545 opened by Tim Graber Switch to make pprint.pprint display ints and longs in hex (2005-11-08) http://python.org/sf/1351692 opened by Mark Hirota zipimport produces incomplete IOError instances (2005-11-08) http://python.org/sf/1351707 opened by Fred L. Drake, Jr. 
CVS webbrowser.py (1.40) bugs (2005-10-26) CLOSED http://python.org/sf/1338995 reopened by montanaro SVN webbrowser.py fix 41419 didn't (2005-11-09) http://python.org/sf/1352621 opened by Greg Couch poplib.POP3_SSL() class incompatible with socket.timeout (2005-11-10) http://python.org/sf/1353269 opened by Charles Http redirection error in urllib2.py (2005-11-10) http://python.org/sf/1353433 opened by Thomas Dehn Python drops core when stdin is bogus (2005-11-10) http://python.org/sf/1353504 opened by Skip Montanaro Error in documentation for os.walk (2005-11-11) CLOSED http://python.org/sf/1353793 opened by Martin Geisler logging: Default handlers broken (2005-11-11) CLOSED http://python.org/sf/1354052 opened by Jonathan S. Joseph Interactive help fails with Windows Installer (2005-11-11) CLOSED http://python.org/sf/1354265 opened by Colin J. Williams shutil.move() does not preserve ownership (2005-11-13) http://python.org/sf/1355826 opened by lightweight Incorrect Decimal-float behavior for + (2005-11-13) http://python.org/sf/1355842 opened by Connelly make depend/clean issues w/ ast (2005-11-13) http://python.org/sf/1355883 opened by Skip Montanaro Division Error (2005-11-13) CLOSED http://python.org/sf/1355903 opened by Azimuth Ctrl+C for copy does not work when caps-lock is on (2005-11-14) http://python.org/sf/1356720 opened by Lenny Domnitser Tix.py class HList missing info_bbox (2005-11-14) http://python.org/sf/1356969 opened by Ron Provost urllib/urllib2 cannot ftp files which are not listable. 
(2005-11-15) http://python.org/sf/1357260 opened by Bugs Fly os.path.makedirs DOES handle UNC paths (2005-11-15) http://python.org/sf/1357604 opened by j vickroy suprocess cannot handle shell arguments (2005-11-16) http://python.org/sf/1357915 opened by Pierre Ossman Incorrect handling of unicode "strings" in asynchat.py (2005-11-16) CLOSED http://python.org/sf/1358186 opened by Holger Lehmann subprocess.py fails on Windows when there is no console (2005-11-16) http://python.org/sf/1358527 opened by Martin Blais Incorrect documentation of raw unidaq string literals (2005-11-17) http://python.org/sf/1359053 opened by Michael Haggerty Prefer configured browser over Mozilla and friends (2005-11-17) http://python.org/sf/1359150 opened by Ville Skytt? bdist_rpm still can't handle dashes in versions (2005-11-18) http://python.org/sf/1360200 opened by jared jennings telnetlib expect() and read_until() do not time out properly (2005-11-18) http://python.org/sf/1360221 opened by Duncan Grisby Bugs Closed ___________ "setdlopenflags" leads to crash upon "import" (2005-11-07) http://python.org/sf/1350188 closed by nnorwitz pydoc seems to run some scripts! 
(2005-11-04) http://python.org/sf/1348477 closed by nnorwitz cgitb.py report wrong line number (2005-04-06) http://python.org/sf/1178148 closed by ping win32serviceutil bug (2005-11-08) http://python.org/sf/1351545 closed by nnorwitz CVS webbrowser.py (1.40) bugs (2005-10-27) http://python.org/sf/1338995 closed by birkenfeld __getslice__ taking priority over __getitem__ (2005-10-17) http://python.org/sf/1328278 closed by birkenfeld _subprocess.c calls PyInt_AsLong without error checking (2005-11-03) http://python.org/sf/1346547 closed by effbot Syntax error on large file with MBCS encoding (2005-03-15) http://python.org/sf/1163244 closed by mhammond setgroups rejects long integer arguments (2004-01-02) http://python.org/sf/869197 closed by loewis Error in documentation for os.walk (2005-11-11) http://python.org/sf/1353793 closed by tim_one logging: Default handlers broken (2005-11-11) http://python.org/sf/1354052 closed by vsajip Significant memory leak with PyImport_ReloadModule (2005-08-11) http://python.org/sf/1256669 closed by birkenfeld Interactive help fails with Windows Installer (2005-11-11) http://python.org/sf/1354265 closed by loewis os.remove fails on win32 with read-only file (2004-12-29) http://python.org/sf/1092701 closed by effbot Division Error (2005-11-13) http://python.org/sf/1355903 closed by effbot IDLE, F5 – wrong external file content. (on error!) 
(2005-10-25) http://python.org/sf/1337987 closed by kbk Random core dumps (2004-11-10) http://python.org/sf/1063937 closed by nnorwitz Incorrect handling of unicode "strings" in asynchat.py (2005-11-16) http://python.org/sf/1358186 closed by effbot New / Reopened RFE __________________ python.desktop (2005-11-10) http://python.org/sf/1353344 opened by Bj?rn Lindqvist RFE Closed __________ please support the free visual studio sdk compiler (2005-11-05) http://python.org/sf/1348719 closed by loewis fix for ms stdio tables (2005-10-11) http://python.org/sf/1324176 closed by loewis From gregory.petrosyan at gmail.com Sat Nov 19 18:01:35 2005 From: gregory.petrosyan at gmail.com (Gregory Petrosyan) Date: Sat, 19 Nov 2005 20:01:35 +0300 Subject: [Python-Dev] How to stay almost backwards compatible with all these new cool features Message-ID: <6306f97b0511190901g6757acbej@mail.gmail.com> Here's some of my ideas about subject. Maybe some of them are rather foolish, others -- rather simple and common... I just want to add my 2 cents to Python development. 1) What is the reason for making Python backwards incompatible (let it be just 'BIC', and let 'BC' stands for 'backwards compatible')? The reason is revolution. But how much can we get just from intensive evolution? 2) Is there any way both for staying (almost) BC and intense evolving? Yes. General rule is rather simple: make old way of doing something *deprecated* (but *not* remove it entirely) and *add* the new way of doing this. But how to _force_ users to use new way instead of old? 
My proposal: Python should raise DeprecationError after old way is used: old_way() should be equivalent to old_way() raise DeprecationError('description') So people who want to use old way should write something like try: old_way() except DeprecationError: pass I think they soon will migrate to new style :-) [Manual/semi-automatic migration] Another benefit is that apps that were written for old-way Python version could be run under new-way Python after this simple modification (there might be standard script for making old apps compatible with new versions of Python!). [Automatic migration] 3) Staying BC means no revolutions in syntax. But there are problems with some of new-style features: a) 'raise' statement. I dislike idea about inheriting all exceptions from the base class and about removing 'raise' in favor of '.raise()'. Reasons: we can think about 'raise' as about more powerful variant of 'return'. Classic example is recursive search in binary tree: raising the result there seems to be very elegant and agile. Exception != Error is true IMHO. b) Interfaces. I like them. But isn't it ugly to write: interface SuperInterface: ... Note: 'interface' is repeated there two times! And this is *not* BC solution at all. Remember exception classes: class MyCommonError(Exception): ... but not exception MyCommonError(...): ... and it seems to be OK! And I have great agility with it: as mentioned, I can raise just 'some_object', but not only 'exception'. So, my proposal is syntax class SuperInterface(Interface): ... or maybe class SuperInterface: ... but not interface SuperInterface: ... Note: first two variants are BC solutions! And *yes*, you should be able to implement *any* class. Example: class Fish(object): def swim(): do_swim() # else ... class Dog(object): def bark(): do_bark() # else ... class SharkLikeDog(Fish) implements Dog Isn't it very good-looking? Note: IMHO Type == Implemented interface. So that's why every type/class can be used as interface. 
(Sorry for type/class mess). Could we find some benefits of it? c) I like Optional TypeChecking. But I think it could be improved with implementing some sort of Optional InterfaceChecking. Maybe like this: def f(a implements B, c: D = 'HELLO') implements E: # ?function implements interface? well, maybe it can be some type check interface? # some code here or def f(a implements B, c: D = 'HELLO') -> implements E: # some code here Summary -------------- Well, I think the main idea is (2): - Don't remove; make it strongly deprecated Then: - Some changes to interfaces implementation - etc ('raise' statement, InterfaceCheck -- see (3) ) Sorry for my English and for mess. -- Regards, Gregory. From arigo at tunes.org Sat Nov 19 19:08:55 2005 From: arigo at tunes.org (Armin Rigo) Date: Sat, 19 Nov 2005 19:08:55 +0100 Subject: [Python-Dev] s/hotshot/lsprof Message-ID: <20051119180855.GA26733@code1.codespeak.net> Hi! The current Python profilers situation is a mess. 'profile.Profile' is the ages-old pure Python profiler. At the end of a run, it builds a dict that is inspected by 'pstats.Stats'. It has some recent support for profiling C calls, which however make it crash in some cases [1]. And of course it's slow (makes a run take about 10x longer). 'hotshot', new from 2.2, is quite faster (reportedly, only 30% added overhead). The log file is then loaded and turned into an instance of the same 'pstats.Stats'. This loading takes ages. The reason is that the log file only records events, and loading is done by instantiating a 'profile.Profile' and sending it all the events. In other words, it takes exactly as long as the time it spared in the first place! Moreover, for some reasons, the results given by hotshot seem sometimes quite wrong. (I don't understand why, but I've seen it myself, and it's been reported by various people, e.g. [2].) 'hotshot' doesn't know about C calls, but it can log line events, although this information is lost(!) 
in the final conversion to a 'pstats.Stats'. 'lsprof' is a third profiler by Brett Rosen and Ted Czotter, posted on SF in June [2]. Michael Hudson and me did some minor clean-ups and improvements on it, and it seems to be quite useful. It is, for example, the only of the three profilers that managed to give sensible information about the PyPy translation process without crashing, allowing us to accelerate it from over 30 to under 20 minutes. The SF patch contains a more detailed account on the reasons for writing 'lsprof'. The current version [3] does not support C calls nor line events. It has its own simple interface, which is not compatible with any of the other two profilers. However, unlike the other two profilers, it can record detailed stats about children, which I found quite useful (e.g. how much take is spent in a function when it is called by another specific function). Therefore, I think it would be a great idea to add 'lsprof' to the standard library. Unless there are objections, it seems that the best plan is to keep 'profile.py' as a pure Python implementation and replace 'hotshot' with 'lsprof'. Indeed, I don't see any obvious advantage that 'hotshot' has over 'lsprof', and I certainly see more than one downside. Maybe someone has a use for (and undocumented ways to fish for) line events generated by hotshot. Well, there is a script [4] to convert hotshot log files to some format that a KDE tool [5] can display. (It even looks like hotshot files were designed with this in mind.) Given that the people doing that can still compile 'hotshot' as a separate extension module, it doesn't strike me as a particularly good reason to keep Yet Another Profiler in the standard library. So here is my plan: Unify a bit more the interfaces of the pure Python and the C profilers. This also means that 'lsprof' should be made to use a pstats-compatible log format. 
The 'pstats' documentation specifically says that the file format can change: that would give 'lsprof' a place to store its detailed children stats. Then we can provide a dummy 'hotshot.py' for compatibility, remove its documentation, and provide documentation for 'lsprof'. If anyone feels like this is a bad idea, please speak up. A bientot, Armin [1] https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=1117670 [2] http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=1212837 [3] http://codespeak.net/svn/user/arigo/hack/misc/lsprof (Subversion) [4] http://mail.python.org/pipermail/python-list/2003-September/183887.html [5] http://kcachegrind.sourceforge.net/cgi-bin/show.cgi From steven.bethard at gmail.com Sat Nov 19 20:18:18 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Sat, 19 Nov 2005 12:18:18 -0700 Subject: [Python-Dev] str.dedent In-Reply-To: <437E5205.2010001@v.loewis.de> References: <dga72k$cah$1@sea.gmane.org> <43777B5A.6030602@egenix.com> <200511140920.51724.gmccaughan@synaptics-uk.com> <437869DD.7040800@egenix.com> <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com> <dlaqds$8sb$1@sea.gmane.org> <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com> <43791442.8050109@v.loewis.de> <b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com> <437E5205.2010001@v.loewis.de> Message-ID: <d11dcfba0511191118y1da61245tcb6e1a221918b55a@mail.gmail.com> On 11/18/05, "Martin v. L?wis" <martin at v.loewis.de> wrote: > Noam Raphael wrote: > > I just wanted to add another use case: long messages. Consider those > > lines from idlelib/run.py:133 > > > > msg = "IDLE's subprocess can't connect to %s:%d. This may be due "\ > > "to your personal firewall configuration. It is safe to "\ > > "allow this internal connection because no data is visible on "\ > > "external ports." 
% address > > tkMessageBox.showerror("IDLE Subprocess Error", msg, parent=root) > > You are missing an important point here: There are intentionally no line > breaks in this string; it must be a single line, or else showerror will > break it in funny ways. So converting it to a multi-line string would > break it, dedent or not. Only if you didn't include newline escapes, e.g.:: msg = textwrap.dedent('''\ IDLE's subprocess can't connect to %s:%d. This may be due \ to your personal firewall configuration. It is safe to \ allow this internal connection because no data is visible on \ external ports.''' % address) STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From noamraph at gmail.com Sat Nov 19 21:48:00 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sat, 19 Nov 2005 22:48:00 +0200 Subject: [Python-Dev] str.dedent In-Reply-To: <d11dcfba0511191118y1da61245tcb6e1a221918b55a@mail.gmail.com> References: <dga72k$cah$1@sea.gmane.org> <200511140920.51724.gmccaughan@synaptics-uk.com> <437869DD.7040800@egenix.com> <b348a0850511141114p25411ea4w704a99d1ea9a629a@mail.gmail.com> <dlaqds$8sb$1@sea.gmane.org> <b348a0850511141425y1a894ddap14d7814568c9be5d@mail.gmail.com> <43791442.8050109@v.loewis.de> <b348a0850511151534q4e8abbf6vc3c63c07d3291d6a@mail.gmail.com> <437E5205.2010001@v.loewis.de> <d11dcfba0511191118y1da61245tcb6e1a221918b55a@mail.gmail.com> Message-ID: <b348a0850511191248q72a1a134y27f1b756960817a@mail.gmail.com> On 11/19/05, Steven Bethard <steven.bethard at gmail.com> wrote: > > You are missing an important point here: There are intentionally no line > > breaks in this string; it must be a single line, or else showerror will > > break it in funny ways. So converting it to a multi-line string would > > break it, dedent or not. > > Only if you didn't include newline escapes, e.g.:: > > msg = textwrap.dedent('''\ > IDLE's subprocess can't connect to %s:%d. This may be due \ > to your personal firewall configuration. 
It is safe to \ > allow this internal connection because no data is visible on \ > external ports.''' % address) > Unfortunately, it won't help, since the 'dedent' method won't treat those spaces as indentation. But if those messages were printed to the standard error, the line breaks would be ok, and the use case valid. Noam From martin at v.loewis.de Sat Nov 19 23:06:16 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 19 Nov 2005 23:06:16 +0100 Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications In-Reply-To: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> Message-ID: <437FA1D8.7060600@v.loewis.de> decker at dacafe.com wrote: > I would appreciate feedback concerning these patches before the next > "PythonD" (for DOS/DJGPP) is released. PEP 11 says that DOS is not supported anymore since Python 2.0. So I am -1 on reintroducing support for it. Regards, Martin From aahz at pythoncraft.com Sun Nov 20 00:06:24 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat, 19 Nov 2005 15:06:24 -0800 Subject: [Python-Dev] How to stay almost backwards compatible with all these new cool features In-Reply-To: <6306f97b0511190901g6757acbej@mail.gmail.com> References: <6306f97b0511190901g6757acbej@mail.gmail.com> Message-ID: <20051119230624.GA11188@panix.com> On Sat, Nov 19, 2005, Gregory Petrosyan wrote: > > Here's some of my ideas about subject. Maybe some of them are rather > foolish, others -- rather simple and common... I just want to add my 2 > cents to Python development. This message was more appropriate for comp.lang.python; most of what you talk about has already been discussed before, and the rest has to do with user-level changes. Please continue this discussion there if you're interested in the subject. Thank you. 
-- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From aahz at pythoncraft.com Sun Nov 20 00:08:40 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat, 19 Nov 2005 15:08:40 -0800 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051119180855.GA26733@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> Message-ID: <20051119230840.GB11188@panix.com> On Sat, Nov 19, 2005, Armin Rigo wrote: > > If anyone feels like this is a bad idea, please speak up. This sounds like a good idea, and your presentation already looks almost like a PEP. How about going ahead and making it a formal PEP, which will make it easier to push through the dev process? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From martin at v.loewis.de Sun Nov 20 00:55:57 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 20 Nov 2005 00:55:57 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051119180855.GA26733@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> Message-ID: <437FBB8D.50501@v.loewis.de> Armin Rigo wrote: > If anyone feels like this is a bad idea, please speak up. As stated, it certainly is a bad idea. To make it a good idea, there should also be some commitment to maintain this library for a number of years. So who would be maintaining it, and what are their plans for doing so? 
Regards, Martin From bcannon at gmail.com Sun Nov 20 01:12:28 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sat, 19 Nov 2005 16:12:28 -0800 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051119180855.GA26733@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> Message-ID: <bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com> Just an FYI for everyone while we are talking about profilers: Floris Bruynooghe (whom I am cc'ing on this so he can contribute to the conversation) wrote, for Google's Summer of Code, a replacement for 'profile' that uses Hotshot directly. Thanks to his direct use of Hotshot and his rewrite of pstats, it loads Hotshot data 30% faster and also avoids having to keep 'profile', with its slightly questionable license, around. You can find his project at http://savannah.nongnu.org/projects/pyprof/ . I believe he also tweaked Hotshot to accept custom timing functions. I have not had a chance to go over his code to clean it up for putting it up on SF, but I thought people should be aware of it. -Brett On 11/19/05, Armin Rigo <arigo at tunes.org> wrote: > Hi! > > The current Python profiler situation is a mess. > > 'profile.Profile' is the ages-old pure Python profiler. At the end of a > run, it builds a dict that is inspected by 'pstats.Stats'. It has some > recent support for profiling C calls, which, however, makes it crash in > some cases [1]. And of course it's slow (makes a run take about 10x > longer). > > 'hotshot', new in 2.2, is much faster (reportedly, only 30% added > overhead). The log file is then loaded and turned into an instance of > the same 'pstats.Stats'. This loading takes ages. The reason is that > the log file only records events, and loading is done by instantiating a > 'profile.Profile' and sending it all the events. In other words, it > takes exactly as long as the time it saved in the first place! > Moreover, for some reason, the results given by hotshot sometimes seem > quite wrong. 
(I don't understand why, but I've seen it myself, and it's > been reported by various people, e.g. [2].) 'hotshot' doesn't know > about C calls, but it can log line events, although this information is > lost(!) in the final conversion to a 'pstats.Stats'. > > 'lsprof' is a third profiler by Brett Rosen and Ted Czotter, posted on > SF in June [2]. Michael Hudson and I did some minor clean-ups and > improvements on it, and it seems to be quite useful. It is, for > example, the only one of the three profilers that managed to give sensible > information about the PyPy translation process without crashing, > allowing us to accelerate it from over 30 to under 20 minutes. The SF > patch contains a more detailed account of the reasons for writing > 'lsprof'. The current version [3] does not support C calls or line > events. It has its own simple interface, which is not compatible with > either of the other two profilers. However, unlike the other two > profilers, it can record detailed stats about children, which I found > quite useful (e.g. how much time is spent in a function when it is > called by another specific function). > > Therefore, I think it would be a great idea to add 'lsprof' to the > standard library. Unless there are objections, it seems that the best > plan is to keep 'profile.py' as a pure Python implementation and replace > 'hotshot' with 'lsprof'. Indeed, I don't see any obvious advantage that > 'hotshot' has over 'lsprof', and I certainly see more than one downside. > Maybe someone has a use for (and undocumented ways to fish for) line > events generated by hotshot. Well, there is a script [4] to convert > hotshot log files to some format that a KDE tool [5] can display. (It > even looks like hotshot files were designed with this in mind.) Given > that the people doing that can still compile 'hotshot' as a separate > extension module, it doesn't strike me as a particularly good reason to > keep Yet Another Profiler in the standard library. 
> > So here is my plan: > > Unify a bit more the interfaces of the pure Python and the C profilers. > This also means that 'lsprof' should be made to use a pstats-compatible > log format. The 'pstats' documentation specifically says that the file > format can change: that would give 'lsprof' a place to store its > detailed children stats. > > Then we can provide a dummy 'hotshot.py' for compatibility, remove its > documentation, and provide documentation for 'lsprof'. > > If anyone feels like this is a bad idea, please speak up. > > > A bientot, > > Armin > > > [1] https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=1117670 > > [2] http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=1212837 > > [3] http://codespeak.net/svn/user/arigo/hack/misc/lsprof (Subversion) > > [4] http://mail.python.org/pipermail/python-list/2003-September/183887.html > > [5] http://kcachegrind.sourceforge.net/cgi-bin/show.cgi > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From nnorwitz at gmail.com Sun Nov 20 01:15:01 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sat, 19 Nov 2005 16:15:01 -0800 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> Message-ID: <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> I lied a bit in my previous status. I said that the refs used at the end of a regression test run from a clean state (*) were down to 380k. Well if I had remembered to remove all the .pyc's this would have been true. 
Here's the numbers now: Before AST: r39757 [362766 refs] Before AST: svn up [356255 refs] 266 OK 31 skipped clean: [342367 refs] 267 OK 31 skipped (*) Before each run I did: find . -name '*.pyc' | xargs rm Unless I screwed up again, the first line is from clean at revision 39757 which was just before the AST merge. The second line was a selective update of other files that didn't have any relationship to AST (primarily compile.c and symtable.c). The last run is after my recent checkin. So even with an additional test, we are finishing a regrtest.py run with less references. I don't know of any constructs which leak references. A patch was posted for the free memory read I reported earlier (not related to AST branch). It's on SF, I don't know the #. There are many potential memory leaks in the AST code in error conditions (hopefully these are only possible when running out of memory). It really needs the arena implementation to fix them and get it right. There are also still a few printfs in the AST code which should be changed to SystemErrors. There are still 2 memory leaks while running the regression tests. They show up when running test_fork1 and test_pty. There may be more, valgrind crashed on me the last run which was also before I fixed some of the reference leaks. It would be great if people could localize the leaks. 
512 bytes in 1 blocks are definitely lost in loss record 319 of 548 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x433CC4: new_arena (obmalloc.c:500) by 0x433EA8: PyObject_Malloc (obmalloc.c:706) by 0x43734B: PyString_FromStringAndSize (stringobject.c:74) by 0x4655B5: optimize_code (compile.c:957) by 0x467B86: makecode (compile.c:4092) by 0x467F00: assemble (compile.c:4166) by 0x46AA94: compiler_mod (compile.c:1755) by 0x46AC8B: PyAST_Compile (compile.c:285) by 0x47A870: run_mod (pythonrun.c:1195) by 0x47B0E8: PyRun_StringFlags (pythonrun.c:1159) by 0x45767A: builtin_eval (bltinmodule.c:589) by 0x41684F: PyObject_Call (abstract.c:1777) by 0x45EB4B: PyEval_CallObjectWithKeywords (ceval.c:3432) by 0x457E4E: builtin_map (bltinmodule.c:938) 1280 bytes in 2 blocks are definitely lost in loss record 383 of 548 at 0x11B1AF13: malloc (vg_replace_malloc.c:149) by 0x433CC4: new_arena (obmalloc.c:500) by 0x433EA8: PyObject_Malloc (obmalloc.c:706) by 0x4953F3: PyNode_AddChild (node.c:95) by 0x495611: shift (parser.c:112) by 0x4958F0: PyParser_AddToken (parser.c:244) by 0x411704: parsetok (parsetok.c:166) by 0x47AE4F: PyParser_ASTFromFile (pythonrun.c:1292) by 0x472338: parse_source_module (import.c:777) by 0x47262B: load_source_module (import.c:905) by 0x4735B3: load_module (import.c:1665) by 0x473C4B: import_submodule (import.c:2259) by 0x473DE0: load_next (import.c:2079) by 0x4741D5: import_module_ex (import.c:1914) by 0x474389: PyImport_ImportModuleEx (import.c:1955) From martin at v.loewis.de Sun Nov 20 10:58:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 20 Nov 2005 10:58:14 +0100 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> Message-ID: <438048B6.2030103@v.loewis.de> Neal Norwitz wrote: > There are still 2 
memory leaks while running the regression tests. > They show up when running test_fork1 and test_pty. There may be more, > valgrind crashed on me the last run which was also before I fixed some > of the reference leaks. It would be great if people could localize > the leaks. Can somebody please give a quick explanation how valgrind can give *any* reasonable leak analysis when obmalloc is used? In the current implementation, obmalloc never ever calls free(3), so all pool memory should appear to have leaked. So if valgrind does *not* report all memory as leaked: how does it find out? > 512 bytes in 1 blocks are definitely lost in loss record 319 of 548 > at 0x11B1AF13: malloc (vg_replace_malloc.c:149) > by 0x433CC4: new_arena (obmalloc.c:500) See http://mail.python.org/pipermail/python-dev/2004-June/045253.html This is the resizing of the list of arenas, which is a deliberate leak. It just happened to be exhausted in this particular call stack. > 1280 bytes in 2 blocks are definitely lost in loss record 383 of 548 > at 0x11B1AF13: malloc (vg_replace_malloc.c:149) > by 0x433CC4: new_arena (obmalloc.c:500) Likewise. Regards, Martin From jepler at unpythonic.net Sun Nov 20 16:08:51 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Sun, 20 Nov 2005 09:08:51 -0600 Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications In-Reply-To: <437FA1D8.7060600@v.loewis.de> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> Message-ID: <20051120150850.GA27838@unpythonic.net> On Sat, Nov 19, 2005 at 11:06:16PM +0100, "Martin v. L?wis" wrote: > decker at dacafe.com wrote: > > I would appreciate feedback concerning these patches before the next > > "PythonD" (for DOS/DJGPP) is released. > > PEP 11 says that DOS is not supported anymore since Python 2.0. So > I am -1 on reintroducing support for it. 
If we have someone who is volunteering the time to make it work, not just today but in the future as well, we shouldn't rule out re-adding support. I've taken a glance at the patch. There are probably a few things to quarrel over--for instance, it looks like a site.py change will cause python to print a blank line when it's started, and the removal of a '#define HAVE_FORK 1' in posixmodule.c--but this still doesn't mean the re-addition of DOS as a supported platform should be rejected out of hand. Jeff -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051120/d7bcdf5b/attachment.pgp From martin at v.loewis.de Sun Nov 20 19:00:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 20 Nov 2005 19:00:27 +0100 Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications In-Reply-To: <20051120150850.GA27838@unpythonic.net> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> Message-ID: <4380B9BB.5030208@v.loewis.de> jepler at unpythonic.net wrote: > I've taken a glance at the patch. There are probably a few things to quarrel > over--for instance, it looks like a site.py change will cause python to print > a blank line when it's started, and the removal of a '#define HAVE_FORK 1' in > posixmodule.c--but this still doesn't mean the re-addition of DOS as a supported > platform should be rejected out of hand. Well, my experience is that people contributing "minority" ports run away after getting their patches accepted more often than not (as happened with the BeOS port and the VMS port, to take recent examples). So I would prefer to see some strong commitment from the porter. Even so, I don't think I'm willing to commit such a patch myself. 
If somebody else thinks this is worthwhile, I won't object. Regards, Martin From nas at arctrix.com Fri Nov 18 18:28:03 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Fri, 18 Nov 2005 17:28:03 +0000 (UTC) Subject: [Python-Dev] Memory management in the AST parser & compiler References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> Message-ID: <dll2v3$78g$1@sea.gmane.org> Fredrik Lundh <fredrik at pythonware.com> wrote: > Thomas Lee wrote: > >> Even if it meant we had just one function call - one, safe function call >> that deallocated all the memory allocated within a function - that we >> had to put before each and every return, that's better than what we >> have. > > alloca? Perhaps we should use the memory management technique that the rest of Python uses: reference counting. I don't see why the AST structures couldn't be PyObjects. Neil From mwh at python.net Sun Nov 20 22:43:45 2005 From: mwh at python.net (Michael Hudson) Date: Sun, 20 Nov 2005 21:43:45 +0000 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <437FBB8D.50501@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sun, 20 Nov 2005 00:55:57 +0100") References: <20051119180855.GA26733@code1.codespeak.net> <437FBB8D.50501@v.loewis.de> Message-ID: <2mk6f3ro4u.fsf@starship.python.net> "Martin v. L?wis" <martin at v.loewis.de> writes: > Armin Rigo wrote: >> If anyone feels like this is a bad idea, please speak up. > > As stated, it certainly is a bad idea. This is a bit extreme... > To make it a good idea, there should also be some commitment to > maintain this library for a number of years. 
So who would be > maintaining it, and what are their plans for doing so? Well, the post was made by Armin who has been involved in CPython development for quite a few years now, and mentioned that work on lsprof was done by me who has been around for even longer -- neither of us are going to quit anytime soon. Cheers, mwh -- I think if we have the choice, I'd rather we didn't explicitly put flaws in the reST syntax for the sole purpose of not insulting the almighty. -- /will on the doc-sig From martin at v.loewis.de Sun Nov 20 23:15:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 20 Nov 2005 23:15:14 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <2mk6f3ro4u.fsf@starship.python.net> References: <20051119180855.GA26733@code1.codespeak.net> <437FBB8D.50501@v.loewis.de> <2mk6f3ro4u.fsf@starship.python.net> Message-ID: <4380F572.9040402@v.loewis.de> Michael Hudson wrote: >>As stated, it certainly is a bad idea. > > > This is a bit extreme... Yes, my apologies :-( >>To make it a good idea, there should also be some commitment to >>maintain this library for a number of years. So who would be >>maintaining it, and what are their plans for doing so? > > > Well, the post was made by Armin who has been involved in CPython > development for quite a few years now, and mentioned that work on > lsprof was done by me who has been around for even longer -- neither > of us are going to quit anytime soon. The same could be said about hotshot, which was originally contributed by Fred Drake, and hacked by Tim Peters, yourself, and others. Yet, now people want to remove it again. I'm really concerned that the same fate will happen to any new profiling library: anybody but the original author will hate it, write his own, and then suggest to replace the existing one. It is the "let's build it from scratch" attitude which makes me nervous. Perhaps the library could be distributed separately for some time, e.g. 
as a package in the cheeseshop. When it proves to be mature, I probably would object less. Regards, Martin From fredrik at pythonware.com Sun Nov 20 23:33:42 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 20 Nov 2005 23:33:42 +0100 Subject: [Python-Dev] s/hotshot/lsprof References: <20051119180855.GA26733@code1.codespeak.net> <437FBB8D.50501@v.loewis.de> <2mk6f3ro4u.fsf@starship.python.net> <4380F572.9040402@v.loewis.de> Message-ID: <dlqtk8$37q$1@sea.gmane.org> Martin v. Löwis wrote: > The same could be said about hotshot, which was originally contributed > by Fred Drake, and hacked by Tim Peters, yourself, and others. Yet, now > people want to remove it again. > > I'm really concerned that the same fate will happen to any new > profiling library: anybody but the original author will hate it, > write his own, and then suggest to replace the existing one. is this some intrinsic property of profilers? if the existing tool has problems, why not improve the tool itself? do we really need CADT-based development in the standard library? (on the other hand, I'm not sure we need a profiler as part of the standard library either, but that's me...) </F> From nnorwitz at gmail.com Mon Nov 21 01:14:07 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 20 Nov 2005 16:14:07 -0800 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <438048B6.2030103@v.loewis.de> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> <438048B6.2030103@v.loewis.de> Message-ID: <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> On 11/20/05, "Martin v. Löwis" <martin at v.loewis.de> wrote: > > Can somebody please give a quick explanation how valgrind can give > *any* reasonable leak analysis when obmalloc is used? In the current > implementation, obmalloc never ever calls free(3), so all pool memory > should appear to have leaked. 
> > So if valgrind does *not* report all memory as leaked: how does it > find out? Thanks for reminding me; I wanted to do the next step and test without pymalloc. Valgrind can't find certain kinds of leaks when pymalloc is holding on to memory, true. However, remember that lots of allocations are forwarded to the system malloc(). For example, any request > 256 bytes goes directly to the system malloc. Also, PyMem_*() call the system functions. The core is pretty clean already, since I've been running Valgrind pretty regularly over the years. Before Valgrind I used Purify, going back to 2000 or 2001. Barry had used Purify before me at some point AFAIK. So nearly all of the leaks have already been fixed. It's pretty much only new code that starts showing leaks. To give you an example, I ran the entire regression suite through Valgrind after configuring --without-pymalloc. I only found 3 additional problems in new code. There was also one problem in older code (Python/modsupport.c). The big benefit of running with pymalloc is that it only takes about 1.25 to 1.50 hours to run on my box. When running without pymalloc, I estimate it takes about 5 times longer. Plus it requires a lot of extra work since I need to run the tests in batches. I only have 1 GB of RAM and it takes a lot more than that when running without pymalloc. > This is the resizing of the list of arenas, which is a deliberate > leak. It just happened to be exhausted in this particular call stack. Thanks, I was going to look into the resizing and forgot about it. Running without pymalloc confirmed that there weren't more serious problems. 
n From nnorwitz at gmail.com Mon Nov 21 01:21:41 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 20 Nov 2005 16:21:41 -0800 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <438048B6.2030103@v.loewis.de> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> <438048B6.2030103@v.loewis.de> Message-ID: <ee2a432c0511201621w714f035er7f1ecd8072b10247@mail.gmail.com> I would really like it if someone could run Purify (or another memory tool) on Windows. Purify on any another (unix) platform would be nice, but I doubt it will show much more. By using different tools, problems not found by one tool may be found by the other. Plus there is windows specific code that isn't exercised at all right now. Any takers? I still think the total references at the end of a test run are high, 342291. I don't have anything to base this number on. Some strategic interning should help this number go down a bit. I suppose I shouldn't worry much since these references don't seem to become actual memory leaks. n From skip at pobox.com Mon Nov 21 02:43:33 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 20 Nov 2005 19:43:33 -0600 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <dlqtk8$37q$1@sea.gmane.org> References: <20051119180855.GA26733@code1.codespeak.net> <437FBB8D.50501@v.loewis.de> <2mk6f3ro4u.fsf@starship.python.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> Message-ID: <17281.9797.171955.583286@montanaro.dyndns.org> Fredrik> (on the other hand, I'm not sure we need a profiler as part of Fredrik> the standard library either, but that's me...) Painful though hotshot can be at times, I occasionally find it extremely useful to zoom in on trouble spots. I haven't used profile in awhile and haven't tried lsprof yet. 
I would think having something readily available (whether in the standard library or not) would be handy when needed, hopefully with nothing more than "python setup.py install" required to make it available. Skip From tim.peters at gmail.com Mon Nov 21 02:55:49 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 20 Nov 2005 20:55:49 -0500 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051119180855.GA26733@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> Message-ID: <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> [Armin Rigo] ... > ... > 'hotshot', new from 2.2, is quite faster (reportedly, only 30% added > overhead). The log file is then loaded and turned into an instance of > the same 'pstats.Stats'. This loading takes ages. The reason is that > the log file only records events, and loading is done by instantiating a > 'profile.Profile' and sending it all the events. In other words, it > takes exactly as long as the time it spared in the first place! We should note that hotshot didn't intend to reduce total time overhead. What it's aiming at here is to be less disruptive (than profile.py) to the code being profiled _while_ that code is running. On modern boxes, any kind of profiling gimmick has the unfortunate side effect of _changing_ the runtime behavior of the code being profiled, at least by polluting I and D caches with droppings from the profiling code itself (or, in the case of profile.py, possibly overwhelming I and top-level D caches -- and distorting non-profiling runtime so badly that, e.g., networked apps may end up taking entirely different code paths). hotshot tries to stick with tiny little C functions that pack away a tiny amount of data each time, and avoid memory alloc/dealloc, to try to minimize this disruption. It looked like it was making real progress on this at one time ;-) > Moreover, for some reasons, the results given by hotshot seem sometimes > quite wrong. 
(I don't understand why, but I've seen it myself, and it's > been reported by various people, e.g. [2].) 'hotshot' doesn't know > about C calls, but it can log line events, although this information is > lost(!) in the final conversion to a 'pstats.Stats'. Ya, hotshot isn't finished. It had corporate support for its initial development, but lost that, and became an orphan then. That's the eventual fate of most profilers, alas. They're fiddly, difficult, and always wrong in some respect. Because of this, the existence of an eager maintainer without a real life is more important than the code ;-). From tim.peters at gmail.com Mon Nov 21 03:02:58 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 20 Nov 2005 21:02:58 -0500 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <dlqtk8$37q$1@sea.gmane.org> References: <20051119180855.GA26733@code1.codespeak.net> <437FBB8D.50501@v.loewis.de> <2mk6f3ro4u.fsf@starship.python.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> Message-ID: <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com> [Martin v. L?wis] >> I'm really concerned that the same fate will happen to any new >> profiling library: anybody but the original author will hate it, >> write his own, and then suggest to replace the existing one. [Fredrik Lundh] > is this some intrinsic property of profilers? if the existing tool has > problems, why not improve the tool itself? How many regexp engines has Python gone through now? Profilers are even more irritating to write and maintain than those -- and you presumably know why you started over from scratch instead of improving pcre, or whatever-the-heck-it-was that came before that ;-) > do we really need CADT-based development in the standard library? Since I didn't know what that meant, Google helpfully told me: Center for Alcohol & Drug Treatment Fits, anyway <wink>. 
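[A note for readers following the thread: 'lsprof' did eventually land in the standard library, as cProfile in Python 2.5, and all three profilers discussed here funnel their data into pstats.Stats. A minimal sketch of that collect-then-inspect workflow, under the modern cProfile name, with a made-up workload function:]

```python
import cProfile
import pstats

def work():
    # Small workload so the profiler has something to record.
    return sum(i * i for i in range(1000))

# Collect raw timing data with the C profiler...
prof = cProfile.Profile()
prof.enable()
work()
prof.disable()

# ...then inspect it through pstats.Stats, the common reporting
# layer that profile, hotshot, and lsprof all feed into.
stats = pstats.Stats(prof)
stats.sort_stats("cumulative").print_stats(5)  # five costliest entries
```

[The same Stats API also loads dump files written by profile.run(statement, filename), which is the two-step flow Armin describes above.]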
From steve at holdenweb.com Mon Nov 21 04:04:09 2005 From: steve at holdenweb.com (Steve Holden) Date: Mon, 21 Nov 2005 03:04:09 +0000 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <437FBB8D.50501@v.loewis.de> <2mk6f3ro4u.fsf@starship.python.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com> Message-ID: <43813929.3080000@holdenweb.com> Tim Peters wrote: > [Martin v. Löwis] > >>>I'm really concerned that the same fate will happen to any new >>>profiling library: anybody but the original author will hate it, >>>write his own, and then suggest to replace the existing one. > > > [Fredrik Lundh] > >>is this some intrinsic property of profilers? if the existing tool has >>problems, why not improve the tool itself? > > > How many regexp engines has Python gone through now? Profilers are > even more irritating to write and maintain than those -- and you > presumably know why you started over from scratch instead of improving > pcre, or whatever-the-heck-it-was that came before that ;-) > > >>do we really need CADT-based development in the standard library? > > > Since I didn't know what that meant, Google helpfully told me: > > Center for Alcohol & Drug Treatment > I suspect you may already know that Fredrik referred to Cascade of Attention-Deficit Teenagers Where's the BDFL to say "yes" or "no" when you need one? regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From amk at amk.ca Mon Nov 21 05:12:08 2005 From: amk at amk.ca (A.M.
Kuchling) Date: Sun, 20 Nov 2005 23:12:08 -0500 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <dlqtk8$37q$1@sea.gmane.org> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> Message-ID: <20051121041208.GA7924@rogue.amk.ca> On Sun, Nov 20, 2005 at 11:33:42PM +0100, Fredrik Lundh wrote: > do we really need CADT-based development in the standard library? I didn't recognize the acronym, but Google told me CADT = "Cascade of Attention-Deficit Teenagers"; see http://www.jwz.org/doc/cadt.html for a rant. --amk From steve at holdenweb.com Mon Nov 21 05:07:45 2005 From: steve at holdenweb.com (Steve Holden) Date: Mon, 21 Nov 2005 04:07:45 +0000 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> <438048B6.2030103@v.loewis.de> <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> Message-ID: <43814811.2070004@holdenweb.com> Neal Norwitz wrote: [...] > To give you an example, I ran the entire regression suite through > Valgrind after configuring --without-pymalloc. I only found 3 > additional problems in new code. There was also one problem in older > code (Python/modsupport.c). > > The big benefit of running with pymalloc is that it only takes about > 1.25 to 1.50 hours to run on my box. When running without pymalloc, I > estimate it takes about 5 times longer. Plus it requires a lot of > extra work since I need to run the tests in batches. I only have 1 GB > of RAM and it takes a lot more than that when running without > pymalloc. > Is there maybe a machine in the SourceForge compile farm that could be used for this work? 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From fdrake at acm.org Mon Nov 21 05:11:19 2005 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 20 Nov 2005 23:11:19 -0500 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <dlqtk8$37q$1@sea.gmane.org> <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com> Message-ID: <200511202311.20271.fdrake@acm.org> On Sunday 20 November 2005 21:02, Tim Peters wrote: > Since I didn't know what that meant, Google helpfully told me: > > Center for Alcohol & Drug Treatment On Sunday 20 November 2005 22:04, Steve Holden wrote: > I suspect you may already know that Fredrik referred to > Cascade of Attention-Deficit Teenagers Yes, our former office in McLean, Virginia was known by many names. :-) > Where's the BDFL to say "yes" or "no" when you need one? Actually, he was just in the next room for HotShot. Guess he was distracted by the photons from the Window, which I was protected from (ironically, by his office). -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> From nas at arctrix.com Mon Nov 21 05:53:13 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Mon, 21 Nov 2005 04:53:13 +0000 (UTC) Subject: [Python-Dev] s/hotshot/lsprof References: <20051119180855.GA26733@code1.codespeak.net> <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> Message-ID: <dlrjro$l5g$1@sea.gmane.org> Tim Peters <tim.peters at gmail.com> wrote: > We should note that hotshot didn't intend to reduce total time > overhead. What it's aiming at here is to be less disruptive (than > profile.py) to the code being profiled _while_ that code is running. A statistical profiler (e.g. http://wingolog.org/archives/2005/10/28/profiling) would be a nice addition, IMHO. I guess we should get the current profilers in shape first though.
Neil From martin at v.loewis.de Mon Nov 21 07:39:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 21 Nov 2005 07:39:01 +0100 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <ee2a432c0511201621w714f035er7f1ecd8072b10247@mail.gmail.com> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> <438048B6.2030103@v.loewis.de> <ee2a432c0511201621w714f035er7f1ecd8072b10247@mail.gmail.com> Message-ID: <43816B85.1080407@v.loewis.de> Neal Norwitz wrote: > I still think the total references at the end of a test run are high, > 342291. I don't have anything to base this number on. Some strategic > interning should help this number go down a bit. I suppose I > shouldn't worry much since these references don't seem to become > actual memory leaks. You could try to classify the objects remaining, counting them by type. Perhaps selectively clearing out sys.modules to what it is after startup might also give insights. Regards, Martin From martin at v.loewis.de Mon Nov 21 07:44:50 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 21 Nov 2005 07:44:50 +0100 Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications In-Reply-To: <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> Message-ID: <43816CE2.2020808@v.loewis.de> decker at dacafe.com wrote: > The local python community here in Sydney indicated that python.org is > only upset when groups port the source to 'obscure' systems and *don't* > submit patches... It is possible that I was misinformed. I never heard such concerns. I personally wouldn't notice if somebody ported Python, and did not feed back the patches. 
Sometimes, people ask "there is this and that port, why isn't it integrated", to which the answer is in most cases "because authors didn't contribute". This is not being upset - it is merely a fact. This port (djgcc) is the first one in a long time (IIRC) where anybody proposed rejecting it. > I am not sure about the future myself. DJGPP 2.04 has been parked at beta > for two years now. It might be fair to say that the *general* DJGPP > developer base has shrunk a little bit. But the PythonD userbase has > actually grown since the first release three years ago. For the time > being, people get very angry when the servers go down here :-) It's not that much availability of the platform I worry about, but the commitment of the Python porter. We need somebody to forward bug reports to, and somebody to intervene if incompatible changes are made. This person would also indicate that the platform is no longer available, and hence the port can be removed. Regards, Martin From martin at v.loewis.de Mon Nov 21 08:12:53 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 21 Nov 2005 08:12:53 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <dlqtk8$37q$1@sea.gmane.org> References: <20051119180855.GA26733@code1.codespeak.net> <437FBB8D.50501@v.loewis.de><2mk6f3ro4u.fsf@starship.python.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> Message-ID: <43817375.6040108@v.loewis.de> Fredrik Lundh wrote: > is this some intrinsic property of profilers? if the existing tool has > problems, why not improve the tool itself? do we really need CADT- > based development in the standard library? It is, IMO, intrinsic to parts of the library that aren't used much. If bugs are in the heavily-used parts of the library, like regular expressions, it doesn't matter much if the original author goes away for some period of time - other contributors will fix the bugs that they care about, and not by rewriting the entire thing. 
If the library is less used, this kind of model is more likely, as resistance to replacing the existing library will be lower. > (on the other hand, I'm not sure we need a profiler as part of the > standard library either, but that's me...) It's a battery, to some. Regards, Martin From martin at v.loewis.de Mon Nov 21 08:16:19 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 21 Nov 2005 08:16:19 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <437FBB8D.50501@v.loewis.de> <2mk6f3ro4u.fsf@starship.python.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <1f7befae0511201802h5ddfe36fxe0879ddf91a11923@mail.gmail.com> Message-ID: <43817443.4000402@v.loewis.de> Tim Peters wrote: > Center for Alcohol & Drug Treatment Besides Jamie Zawinski's definition, Google also told me it stands for Computer Aided Drafting Technology where "to draft" turns out to have two different meanings :-) Regards, Martin From arigo at tunes.org Mon Nov 21 12:14:26 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 21 Nov 2005 12:14:26 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com> Message-ID: <20051121111426.GA13478@code1.codespeak.net> Hi Brett, hi Floris, On Sat, Nov 19, 2005 at 04:12:28PM -0800, Brett Cannon wrote: > Just for everyone's FYI while we are talking about profilers, Floris > Bruynooghe (who I am cc'ing on this so he can contribute to the > conversation), for Google's Summer of Code, wrote a replacement for > 'profile' that uses Hotshot directly. 
Thanks to his direct use of > Hotshot and rewrite of pstats it loads Hotshot data 30% faster and > also alleviates keeping 'profile' around and its slightly questionable > license. Thanks for the note! 30% faster than an incredibly long time is still quite long, but that's an improvement, I suppose. However, this code is not ready yet. For example the new loader gives wrong results in the presence of recursive function calls. A bientot, Armin. From arigo at tunes.org Mon Nov 21 12:14:30 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 21 Nov 2005 12:14:30 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> Message-ID: <20051121111430.GB13478@code1.codespeak.net> Hi Tim, On Sun, Nov 20, 2005 at 08:55:49PM -0500, Tim Peters wrote: > We should note that hotshot didn't intend to reduce total time > overhead. What it's aiming at here is to be less disruptive (than > profile.py) to the code being profiled _while_ that code is running. > hotshot tries to stick with tiny little C functions that pack away a > tiny amount of data each time, and avoid memory alloc/dealloc, to try > to minimize this disruption. It looked like it was making real > progress on this at one time ;-) I see the point. I suppose that we can discuss if hotshot is really nicer on the D cache, as it produces a constant stream of data, whereas classical profilers like lsprof would in the common case only update a few counters in existing data structures. I can tweak lsprof a bit more, though -- there is a malloc on each call, but it could be avoided. Still, people generally agree that profile.py, while taking a longer time overall, gives more meaningful results than hotshot. Now Brett's student, Floris, extended hotshot to allow custom timers. This is essential, because it enables testing.
The timing parts of hotshot were not tested previously. Given the high correlation between untestedness and brokenness, you bet that Floris' adapted test_profile for hotshot gives wrong numbers. (My guess is that Floris overlooked that test_profile was an output test, so he didn't compare the resulting numbers with the expected ones.) Looking at the errors in the numbers pointed us immediately to the bug in the C code. Some time intervals are lost: the ones before an exception is raised or a C function is called or returns. That's a lot of them. The current hotshot is hence not so much a profiler than "a reflection on the meaning of time" (quoting Samuele). > Ya, hotshot isn't finished. It had corporate support for its initial > development, but lost that, and became an orphan then. I will check in the bug fix for hotshot, but the question is what's the point. I would argue that lsprof even with children call stats is much simpler than hotshot. Lines-of-code also reflect that (factor of 2). Obviously hotshot can do much more (undocumented, unmaintained) things beside profiling if you get the correct tools. This plays in favour of lsprof as a stdlib-integrated useful-for-common-people maintained piece of software and hotshot as distributed together with the tools that can use its full potential. A bientot, Armin. From arigo at tunes.org Mon Nov 21 12:41:01 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 21 Nov 2005 12:41:01 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <43817375.6040108@v.loewis.de> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> Message-ID: <20051121114101.GC13478@code1.codespeak.net> Hi Martin, On Mon, Nov 21, 2005 at 08:12:53AM +0100, "Martin v. 
Löwis" wrote: > If bugs are in the heavily-used parts of the library, like regular > expressions, it doesn't matter much if the original author goes > away for some period of time - other contributors will fix the bugs > that they care about, and not by rewriting the entire thing. I see no incremental way of fixing some of the downsides of hotshot, like its huge log file size and loading time. I doubt people often find the motivation to dig into this large orphaned piece of software. Instead, they rewrite their own profilers, because writing a basic one is not difficult. It is much less difficult than, say, writing a basic regular expression engine (but even the latter has gotten rewritten at times) -- unless you want to go into the advanced corners mentioned by Tim. Some guys posted their 'lsprof' on SF because it was well-polished and they found it useful, so here I am, arguing for a standard library containing preferably simple pieces of code that work and are practical for the common advertised use case. I'm not even sure in this case why we are arguing: the new piece of code's interface can be made 100% compatible with the documented parts of the previous interface; the previous module has been around for longer but so far it produced half-meaningless numbers due to bugs. A bientot, Armin. From jepler at unpythonic.net Mon Nov 21 14:50:48 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Mon, 21 Nov 2005 07:50:48 -0600 Subject: [Python-Dev] Patch Req.
# 1351020 & 1351036: PythonD modifications In-Reply-To: <20051121070845.GA12993@ithaca04.ddaustralia.local> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> <43816CE2.2020808@v.loewis.de> <20051121070845.GA12993@ithaca04.ddaustralia.local> Message-ID: <20051121135047.GA22167@unpythonic.net> On Mon, Nov 21, 2005 at 06:08:45PM +1100, Ben Decker wrote: > I think the port has been supported for three years now. I am not sure what > kind of commitment you are looking for, but the patch and software are > supplied under the same terms of liability and warranty as anything else > under the GPL. Python is not GPL software. If your patch is under the terms of the GPL, it cannot be accepted into Python. Jeff -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051121/9342ee84/attachment.pgp From barry at python.org Mon Nov 21 15:25:05 2005 From: barry at python.org (Barry Warsaw) Date: Mon, 21 Nov 2005 09:25:05 -0500 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051121111430.GB13478@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> <20051121111430.GB13478@code1.codespeak.net> Message-ID: <1132583105.10235.32.camel@geddy.wooz.org> On Mon, 2005-11-21 at 12:14 +0100, Armin Rigo wrote: > Still, people generally agree that profile.py, while taking a longer > time overall, gives more meaningful results than hotshot. Now Brett's > student, Floris, extended hotshot to allow custom timers. This is > essential, because it enables testing. The timing parts of hotshot were > not tested previously.
hotshot used to produce incorrect data because it couldn't track exits from functions due to exception propagation. We fixed that a while back and since then it's been pretty useful for us. While I'm not sure I like the idea of three profilers in the stdlib, I think in this case (unless they're incompatible) it would make sense to keep hotshot around, at least until any new profiler proves it's better over a couple of releases. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051121/79efc9da/attachment.pgp From fredrik at pythonware.com Mon Nov 21 15:41:03 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 21 Nov 2005 15:41:03 +0100 Subject: [Python-Dev] ast status, memory leaks, etc References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com><ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com><438048B6.2030103@v.loewis.de> <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> Message-ID: <dlsma2$kj1$1@sea.gmane.org> Neal Norwitz wrote: > The big benefit of running with pymalloc is that it only takes about > 1.25 to 1.50 hours to run on my box. When running without pymalloc, I > estimate it takes about 5 times longer. Plus it requires a lot of > extra work since I need to run the tests in batches. I only have 1 GB > of RAM and it takes a lot more than that when running without > pymalloc. sounds like the PSF should buy you some more RAM. 
</F> From arigo at tunes.org Mon Nov 21 16:09:33 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 21 Nov 2005 16:09:33 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <1132583105.10235.32.camel@geddy.wooz.org> References: <20051119180855.GA26733@code1.codespeak.net> <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> <20051121111430.GB13478@code1.codespeak.net> <1132583105.10235.32.camel@geddy.wooz.org> Message-ID: <20051121150932.GA7134@code1.codespeak.net> Hi Barry, On Mon, Nov 21, 2005 at 09:25:05AM -0500, Barry Warsaw wrote: > hotshot used to produce incorrect data because it couldn't track exits > from functions due to exception propagation. We fixed that a while back It might be me, but I find it a bit odd that you didn't do anything with this fix. I'm sure that for each alternate profiler posted on SF there are ten half-finished ones on somebody's box. The problem of hotshot producing slightly wrong data is not new, and in hindsight the discrepancies only became larger in 2.4 with the introduction of new tracing events (C function calls). At this point I'm interpreting your mail as saying that you don't really mind if hotshot is in the standard library or not, because you are using your own fixed version anyway. Nobody is proposing to wipe out hotshot from the face of the planet. Sorry if I sound offensive, but I'd rather hear the opinion of people that care about the stdlib.
Armin From barry at python.org Mon Nov 21 17:40:37 2005 From: barry at python.org (Barry Warsaw) Date: Mon, 21 Nov 2005 11:40:37 -0500 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051121150932.GA7134@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> <20051121111430.GB13478@code1.codespeak.net> <1132583105.10235.32.camel@geddy.wooz.org> <20051121150932.GA7134@code1.codespeak.net> Message-ID: <1132591237.10237.51.camel@geddy.wooz.org> On Mon, 2005-11-21 at 16:09 +0100, Armin Rigo wrote: > It might be me, but I find it a bit odd that you didn't do anything with > this fix. Hi Armin. Actually it was SF #900092 that I was referring to. We fixed this bug and those patches were applied to CVS (pre-svn conversion) for both 2.4.2 and 2.5a1. So at least the one I was talking about are already in there! > At this point I'm interpreting your mail as saying that you don't really > mind if hotshot is in the standard library or not, because you are using > your own fixed version anyway. Nobody is proposing to wipe out hotshot > from the face of the planet. Sorry if I sound offensive, but I'd rather > hear the opinion of people that care about the stdlib. I think you just misunderstood me. I definitely care about the stdlib and no, we strongly prefer not to use some locally hacked up Python. E.g. we were running 2.4.1 with this (and a few other patches) until 2.4.2 came out, but now we're pretty much using pristine Python 2.4.2. So I still think hotshot can stay in the stdlib for a few releases, unless it's totally incompatible with lsprof, and then it's worth discussing. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051121/c8f29d14/attachment.pgp From arigo at tunes.org Mon Nov 21 18:03:04 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 21 Nov 2005 18:03:04 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <1132591237.10237.51.camel@geddy.wooz.org> References: <20051119180855.GA26733@code1.codespeak.net> <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> <20051121111430.GB13478@code1.codespeak.net> <1132583105.10235.32.camel@geddy.wooz.org> <20051121150932.GA7134@code1.codespeak.net> <1132591237.10237.51.camel@geddy.wooz.org> Message-ID: <20051121170304.GA8711@code1.codespeak.net> Hi Barry, On Mon, Nov 21, 2005 at 11:40:37AM -0500, Barry Warsaw wrote: > Hi Armin. Actually it was SF #900092 that I was referring to. Ah, we're talking about different things then. The patch in SF #900092 is not related to hotshot, it's just ceval.c not producing enough events to allow a precise timing of exceptions. (Now that ceval.c is fixed, we could remove a few hacks from profile.py, BTW.) I am referring to a specific bug of hotshot which entirely drops some genuine time intervals, all the time. It's untested code! A minimal test like Floris' test_profile shows it clearly. A bientot, Armin. 
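The custom-timer testing technique Armin credits with exposing hotshot's dropped time intervals can be sketched with the stdlib profile module, which already accepts a user-supplied timer. This is a hedged illustration, not Floris' actual test code: `FakeClock` and `work` are made-up names. A deterministic clock that advances by exactly one unit per query makes the profiler's arithmetic exactly reproducible, so any lost interval would show up as a wrong number rather than as wall-clock noise.

```python
import profile
import pstats

class FakeClock:
    """A deterministic timer: each query advances time by one unit."""
    def __init__(self):
        self.t = 0.0
    def __call__(self):
        self.t += 1.0
        return self.t

def work():
    # Trivial made-up workload; what matters is the events it generates.
    return 1 + 1

clock = FakeClock()
p = profile.Profile(timer=clock)  # profile.Profile takes a custom timer
p.runcall(work)

st = pstats.Stats(p)
(key,) = [k for k in st.stats if k[2] == "work"]
cc, nc, tt, ct, callers = st.stats[key]
# With unit ticks, tt and ct are exact whole numbers, so they can be
# asserted against expected values -- the testing hotshot never had.
print(nc, tt, ct)
```

The stdlib's own test_profile uses the same idea; the point of the thread is that hotshot only gained an equivalent hook (and thus testability) with Floris' work.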
From nnorwitz at gmail.com Mon Nov 21 20:05:19 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 21 Nov 2005 11:05:19 -0800 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <dlsma2$kj1$1@sea.gmane.org> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> <438048B6.2030103@v.loewis.de> <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> <dlsma2$kj1$1@sea.gmane.org> Message-ID: <ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com> On 11/21/05, Fredrik Lundh <fredrik at pythonware.com> wrote: > > sounds like the PSF should buy you some more RAM. I think I still have some allocation from the PSF. Wanna have a party. ;-) Seriously, I don't know that more RAM would help too much. I didn't notice much swapping, but maybe if I had run in bigger chunks --without-pymalloc I would have. I think a bigger bang for the buck would be to buy a Windows box with Purify. Rational was a real pain to deal with, maybe it's better now that IBM bought them. Parasoft (Insure++) was even worse to deal with. There would be many other benefits for someone to do more testing on Windows. The worst part of all this is ... it's still Windows. I'm not tied to Purify, I just don't know anything that works better. I've never used any such tool on Windows though. n From bcannon at gmail.com Mon Nov 21 20:38:09 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 21 Nov 2005 11:38:09 -0800 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051121114101.GC13478@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> Message-ID: <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> On 11/21/05, Armin Rigo <arigo at tunes.org> wrote: > Hi Martin, > > On Mon, Nov 21, 2005 at 08:12:53AM +0100, "Martin v. 
Löwis" wrote: > > If bugs are in the heavily-used parts of the library, like regular > > expressions, it doesn't matter much if the original author goes > > away for some period of time - other contributors will fix the bugs > > that they care about, and not by rewriting the entire thing. > > I see no incremental way of fixing some of the downsides of hotshot, > like its huge log file size and loading time. I doubt people often find > the motivation to dig into this large orphaned piece of software. > Instead, they rewrite their own profilers, because writing a basic one > is not difficult. It is much less difficult than, say, writing a basic > regular expression engine (but even the latter has gotten rewritten at > times) -- unless you want to go into the advanced corners mentioned by > Tim. > > Some guys posted their 'lsprof' on SF because it was well-polished and > they found it useful, so here I am, arguing for a standard library > containing preferably simple pieces of code that work and are practical > for the common advertised use case. I'm not even sure in this case why > we are arguing: the new piece of code's interface can be made 100% > compatible with the documented parts of the previous interface; the > previous module has been around for longer but so far it produced > half-meaningless numbers due to bugs. > Just because it is starting to feel like the objections are getting spread out amongst various parts of this thread, I want to try to summarize them as I remember them and give my input on them. So one objection seems to be the question of maintenance. Who is going to keep this code updated and running? As has been pointed out, Hotshot is not perfect and its development basically stopped. So people being a little on edge about yet another profiler that might not be maintained seems reasonable. But this worry, in my mind, is alleviated since I believe both Michael and Armin are willing to maintain the code.
With them both willing to make sure it stays working (which is a pretty damn good commitment since we have two core developers willing to keep this going and not just one) I think this worry is dealt with. The other issue seems to be some people wanting to keep Hotshot around for a few releases until lsprof can prove its worth. I believe this is what Barry is asking for. Now Armin has said that a wrapper around lsprof can be written that will match Hotshot's public API so its need is not there if lsprof works and the wrapper is good. If it wasn't Armin or someone else whose opinion I trusted, I would say go ahead and keep Hotshot around and then eventually do the wrapper. But since it is Armin making this claim and the PyPy team uses this thing (who has several members who I think know what they are doing =) I have faith in them coming up with a good wrapper. Thus I say removing Hotshot is fine. Lastly, there is the argument of whether we should even include a profiler. Personally I say yes. It is another battery that is rather nice. I think if the profiler finally had a good reputation of being accurate and useful it would get more play in the real world. Plus we already include other development tools such as IDLE with Python so it seems fitting to include other dev tools when we have the code and a maintenance commitment. In other words, I say let Armin and Michael add lsprof and the wrappers for it (all while removing any redundant profilers that they have wrappers for) with them knowing we will have a public stoning at PyCon the instant they don't keep it all working. 
=) -Brett From jeremy at alum.mit.edu Mon Nov 21 20:42:32 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 21 Nov 2005 14:42:32 -0500 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> Message-ID: <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com> Here's another attempt to disentangle some issues: - Should lsprof be added to the standard distribution? - Should hotshot be removed from the standard distribution? These two aren't at all related, unless you believe that two is the maximum number of profilers allowed per Python distribution. I've never trusted results from hotshot, but I'd rather see it fixed than removed. Jeremy On 11/21/05, Brett Cannon <bcannon at gmail.com> wrote: > On 11/21/05, Armin Rigo <arigo at tunes.org> wrote: > > Hi Martin, > > > > On Mon, Nov 21, 2005 at 08:12:53AM +0100, "Martin v. Löwis" wrote: > > > If bugs are in the heavily-used parts of the library, like regular > > > expressions, it doesn't matter much if the original author goes > > > away for some period of time - other contributors will fix the bugs > > > that they care about, and not by rewriting the entire thing. > > > > I see no incremental way of fixing some of the downsides of hotshot, > > like its huge log file size and loading time. I doubt people often find > > the motivation to dig into this large orphaned piece of software. > > Instead, they rewrite their own profilers, because writing a basic one > > is not difficult. It is much less difficult than, say, writing a basic > > regular expression engine (but even the latter has gotten rewritten at > > times) -- unless you want to go into the advanced corners mentioned by > > Tim.
> > > > Some guys posted their 'lsprof' on SF because it was well-polished and > > they found it useful, so here I am, arguing for a standard library > > containing preferably simple pieces of code that work and are practical > > for the common advertised use case. I'm not even sure in this case why > > we are arguing: the new piece of code's interface can be made 100% > > compatible with the documented parts of the previous interface; the > > previous module has been around for longer but so far it produced > > half-meaningless numbers due to bugs. > > > > Just because it is starting to feel like the objections are getting > spread out amongst various parts of this thread, I want to try to > summarize them as I remember them and give my input on them. > > So one objection seems to be the question of maintenance. Who is > going to keep this code updated and running? As has been pointed out, > Hotshot is not perfect and its development basically stopped. So > people being a little on edge about yet another profiler that might > not be maintained seems reasonable. > > But this worry, in my mind, is alleviated since I believe both Michael > and Armin are willing to maintain the code. With them both willing to > make sure it stays working (which is a pretty damn good commitment > since we have two core developers willing to keep this going and not > just one) I think this worry is dealt with. > > The other issue seems to be some people wanting to keep Hotshot around > for a few releases until lsprof can prove its worth. I believe this > is what Barry is asking for. Now Armin has said that a wrapper around > lsprof can be written that will match Hotshot's public API so its need > is not there if lsprof works and the wrapper is good. > > If it wasn't Armin or someone else whose opinion I trusted, I would > say go ahead and keep Hotshot around and then eventually do the > wrapper. 
But since it is Armin making this claim and the PyPy team > uses this thing (who has several members who I think know what they > are doing =) I have faith in them coming up with a good wrapper. > Thus I say removing Hotshot is fine. > > Lastly, there is the argument of whether we should even include a > profiler. Personally I say yes. It is another battery that is rather > nice. I think if the profiler finally had a good reputation of being > accurate and useful it would get more play in the real world. Plus we > already include other development tools such as IDLE with Python so it > seems fitting to include other dev tools when we have the code and a > maintenance commitment. > > In other words, I say let Armin and Michael add lsprof and the > wrappers for it (all while removing any redundant profilers that they > have wrappers for) with them knowing we will have a public stoning at > PyCon the instant they don't keep it all working. =) > > -Brett > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From skip at pobox.com Mon Nov 21 21:06:24 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 21 Nov 2005 14:06:24 -0600 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com> Message-ID: <17282.10432.609056.20620@montanaro.dyndns.org> Jeremy> Here's another attempt to disentangle some issues: Jeremy> - Should lsprof be added to the standard distribution?
Jeremy> - Should hotshot be removed from the standard distribution? Adding another log to the fire, what about statprof, a sampling profiler, which Neil Schemenauer mentioned? I installed it here at work. Seems to work as advertised. Took me about two minutes to modify our main app to accept a -P command line flag to enable statprof profiling. It has the beauty of being minimally invasive since it only samples the execution state every 100ms or so. Of course, sampling profilers have their own warts, but they avoid some of the problems of instrumenting profilers. Another tack to take would be to modify the generated byte code to only increment counts for each basic block, similar to what gcc's -pg flag does. I think that would yield a fully instrumented profiler, but one that's less invasive than the current alternatives. It could maybe be implemented as an import hook. Of course, such a beast has yet to be written, so this email and a couple bucks will get you a cup of coffee. This entire discussion simply serves to demonstrate that there are lots of different ways to skin this particular cat. How many of these various alternatives belong in the standard library remains to be seen. 
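The sampling idea Skip describes can be sketched with stock `signal` machinery: SIGPROF fires as CPU time is consumed, and the handler just records which function the interpreter was executing. This is a toy illustration, not statprof's actual implementation (Unix-only; the names here are made up):

```python
import collections
import signal
import time

samples = collections.Counter()

def sample(signum, frame):
    # Record the function the interpreter was in at this tick.
    samples[frame.f_code.co_name] += 1

signal.signal(signal.SIGPROF, sample)
# ITIMER_PROF counts process CPU time, so idle time is never sampled.
signal.setitimer(signal.ITIMER_PROF, 0.01, 0.01)   # tick every 10 ms of CPU

def busy():
    # burn ~0.3 s of CPU so the sampler gets a few dozen ticks
    total = 0
    deadline = time.process_time() + 0.3
    while time.process_time() < deadline:
        total += 1
    return total

busy()
signal.setitimer(signal.ITIMER_PROF, 0, 0)   # stop sampling

print(samples.most_common(3))   # the hot function dominates the counts
```

The warts Skip mentions are visible even here: short-lived functions can be missed entirely, and the resolution is bounded by the tick interval.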
Skip From bcannon at gmail.com Mon Nov 21 21:16:58 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 21 Nov 2005 12:16:58 -0800 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <17282.10432.609056.20620@montanaro.dyndns.org> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com> <17282.10432.609056.20620@montanaro.dyndns.org> Message-ID: <bbaeab100511211216v596ff9acg150fd2994e98b159@mail.gmail.com> On 11/21/05, skip at pobox.com <skip at pobox.com> wrote: > > Jeremy> Here's another attempt to disentangle some issues: > Jeremy> - Should lsprof be added to the standard distribution? > Jeremy> - Should hotshot be removed from the standard distribution? > > Adding another log to the fire, what about statprof, a sampling profiler, > which Neil Schemenauer mentioned? I installed it here at work. Seems to > work as advertised. Took me about two minutes to modify our main app to > accept a -P command line flag to enable statprof profiling. It has the > beauty of being minimally invasive since it only samples the execution state > every 100ms or so. Of course, sampling profilers have their own warts, but > they avoid some of the problems of instrumenting profilers. > My question is whether anyone is willing to maintain it in the stdlib?
-Brett From bcannon at gmail.com Mon Nov 21 21:30:16 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 21 Nov 2005 12:30:16 -0800 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com> Message-ID: <bbaeab100511211230v304cb37dw5dd16e2a81f5572e@mail.gmail.com> On 11/21/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote: > Here's another attempt to disentangle some issues: > - Should lsprof be added to the standard distribution? > - Should hotshot be removed from the standard distribution? > > These two aren't at all related, unless you believe that two is the > maximum number of profilers allowed per Python distribution. > They aren't related if Hotshot provides some functionality that lsprof cannot provide (such as profiling C code; I thought Nick Bastin added support for this?). But if there isn't, then there is some soft relatedness between them since it means that if lsprof is added then hotshot could be removed without backwards-compatibility issues. They are not mutually exclusive, but one being true does influence the other. And as for how many profilers to have, I personally think one is plenty if they all provide similar type of output using similar techniques. But backwards-compatibility obviously is going to make total removal of a module and its API hard so I am thinking more towards Python 3000 and having the best solution in now. Otherwise we should do what must be done to fix hotshot and stick with it.
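The "soft relatedness" Brett describes is exactly what a compatibility wrapper would exploit. As a rough sketch of the idea (not hotshot's actual API; cProfile -- the name under which lsprof later landed in the stdlib -- stands in as the backend, and the class and method names are made up):

```python
import cProfile
import io
import pstats

class HotshotStyleProfile:
    """Illustrative start/stop facade in the spirit of hotshot's interface,
    backed by cProfile.  Only the facade is preserved; the data collection
    underneath is entirely cProfile's."""

    def __init__(self):
        self._prof = cProfile.Profile()

    def start(self):
        self._prof.enable()

    def stop(self):
        self._prof.disable()

    def report(self):
        # Render the collected data the way callers of the old API expect.
        buf = io.StringIO()
        pstats.Stats(self._prof, stream=buf).sort_stats('cumulative').print_stats()
        return buf.getvalue()

def work():
    return sum(i * i for i in range(10000))

p = HotshotStyleProfile()
p.start()
work()
p.stop()
report = p.report()
print(report.splitlines()[0])   # summary line: "... function calls ..."
```

Whether such a shim can cover hotshot's log-file-oriented interface as well is the open question in this thread; the in-memory part, at least, is trivial.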
-Brett From arigo at tunes.org Mon Nov 21 22:15:56 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 21 Nov 2005 22:15:56 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051121164104.GA8898@laurie.sheepb.homeip.net> References: <20051119180855.GA26733@code1.codespeak.net> <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> <20051121111430.GB13478@code1.codespeak.net> <20051121164104.GA8898@laurie.sheepb.homeip.net> Message-ID: <20051121211556.GA10821@code1.codespeak.net> Hi Floris, On Mon, Nov 21, 2005 at 04:41:04PM +0000, Floris Bruynooghe wrote: > > Now Brett's > > student, Floris, extended hotshot to allow custom timers. This is > > essential, because it enables testing. The timing parts of hotshot were > > not tested previously. > > Don't be too enthusiastic here. Testing is done by feeding the profiler something that is not a real timer function, but gives easy to predict answers. Then we check that the profiler accounted all this pseudo-time to the correct functions in the correct way. This is one of the few ways to reliably test a profiler, that's why it is essential. > Iirc I did compare the output of test_profile between profile and my > wrapper. This was one of my checks to make sure it was wrapped > correctly. So could you tell me how they are different? test_profile works as I explained above. Running it with hotshot shows different numbers, which means that there is a bug (and not just some difference in real speed). More precisely, a specific number of the pseudo-clock-ticks are dropped for no reason other than a bug, and don't show up in the final results at all. A bientot, Armin From martin at v.loewis.de Mon Nov 21 22:16:16 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 21 Nov 2005 22:16:16 +0100 Subject: [Python-Dev] Patch Req.
# 1351020 & 1351036: PythonD modifications In-Reply-To: <20051121070845.GA12993@ithaca04.ddaustralia.local> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> <43816CE2.2020808@v.loewis.de> <20051121070845.GA12993@ithaca04.ddaustralia.local> Message-ID: <43823920.3070802@v.loewis.de> Ben Decker wrote: > I think the port has been supported for three years now. I am not > sure what kind of commitment you are looking for, but the patch and > software are supplied under the same terms of liability and warranty > as anything else under the GPL. That (licensed under GPL) would be an issue, as we are not accepting GPL-licensed code. I would guess that you are flexible in licensing, though: we would request that you allow us to relicense the contribution under the terms at http://www.python.org/psf/contrib.html The commitment I was looking for was rather a statement like "I will be maintaining it for several coming years; when I ever stop maintaining it, feel free to remove the code again". So it is not that much past history (although this also matters, and three years of availability is certainly a good record); it is more important to somehow commit to future support, so that we are not left alone with code we cannot maintain if you ever drop out.
Regards, Martin

From arigo at tunes.org Mon Nov 21 22:23:09 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 21 Nov 2005 22:23:09 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051121164503.GB8898@laurie.sheepb.homeip.net> References: <20051119180855.GA26733@code1.codespeak.net> <bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com> <20051121111426.GA13478@code1.codespeak.net> <20051121164503.GB8898@laurie.sheepb.homeip.net> Message-ID: <20051121212309.GB10821@code1.codespeak.net>

Hi Floris,

On Mon, Nov 21, 2005 at 04:45:03PM +0000, Floris Bruynooghe wrote:
> Afaik I did test recursive calls etc.

It seems to show up in any test case I try, e.g.

    import hprofile

    def wait(m):
        if m > 0:
            wait(m-1)

    def f(n):
        wait(n)
        if n > 1:
            return n*f(n-1)
        else:
            return 1

    hprofile.run("f(500)", 'dump-hprof')

The problem is in the cumulative time column, which (on this machine) says 163 seconds for both f() and wait(). The whole program finishes in 1 second... The same log file loaded with hotshot.stats doesn't have this problem.

A bientot, Armin.

From martin at v.loewis.de Mon Nov 21 22:29:55 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 21 Nov 2005 22:29:55 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051121114101.GC13478@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> Message-ID: <43823C53.8080403@v.loewis.de>

Armin Rigo wrote:
> I see no incremental way of fixing some of the downsides of hotshot,
> like its huge log file size and loading time.

I haven't looked into the details myself, but it appears that some google-summer-of-code contributor has found some way of fixing it.

> I doubt people often find
> the motivation to dig into this large orphaned piece of software.

As Fredrik says: this sounds like the CADT model.
The code isn't really orphaned - it's just that it isn't used much. Contributions to this code certainly would still be accepted (and happily so). So essentially: fixing bugs isn't fun, but rewriting it from scratch is.

> I'm not even sure in this case why
> we are arguing

That's pretty obvious to me: because some people are shy of letting version 0.8 of the old software be replaced with version 0.8 of the new software, which is then replaced with version 0.8 of the next rewrite. Instead, we should stick to what we have, and improve it.

Now, it might be that in this specific case, replacing the library really is the right thing to do. It would be if:

1. it has improvements over the current library already (certified by users other than the authors), AND
2. it has no drawbacks over the current library, AND
3. there is some clear indication that it will get better maintenance than the previous library.

I'm not certain lsprof has properties 2 and 3; property 1, so far, is only asserted by the library author himself. Perhaps it is true what Fredrik Lundh says: there shouldn't be a profiler in the standard library at all.

Regards, Martin

From jimjjewett at gmail.com Mon Nov 21 22:36:33 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 21 Nov 2005 16:36:33 -0500 Subject: [Python-Dev] s/hotshot/lsprof Message-ID: <fb6fbf560511211336w3d5bc7dbn71c2154bf5455c99@mail.gmail.com>

Jeremy Hylton jeremy at alum.mit.edu
> Should lsprof be added to the standard distribution?
> Should hotshot be removed from the standard distribution?
> These two aren't at all related, unless you believe that two is the
> maximum number of profilers allowed per Python distribution.

One is a better number. ("There should be one-- and preferably only one --obvious way to do it.")

Adding a second (let alone third) module to the stdlib to do the same thing just makes the documentation bulkier, and makes the "where do I start" problem harder for beginners.
And yes, I think beginners are the most important audience here; anyone sufficiently comfortable with python to make an intelligent choice between different code profilers is probably also able to install 3rd-party modules anyway. Note that I have no objection to (and would like to see) a section in the module documentation saying "This is just one alternative; many people prefer XXX because of YYY". This mention would provide enough endorsement for anyone ready to choose another profiler. Even putting the alternatives into a single stdlib package (and making it clear that they are alternatives, rather than complementary building blocks) is better than simply leaving them all scattered throughout the stdlib as roll-the-dice-to-pick alternatives. -jJ From martin at v.loewis.de Mon Nov 21 22:40:02 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon, 21 Nov 2005 22:40:02 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> Message-ID: <43823EB2.8040108@v.loewis.de> Brett Cannon wrote: > But this worry, in my mind, is alleviated since I believe both Michael > and Armin are willing to maintain the code. With them both willing to > make sure it stays working (which is a pretty damn good commitment > since we have two core developers willing to keep this going and not > just one) I think this worry is dealt with. So far, neither of them has explicitly said so: Michael said he will be around; and I'm certain that is the case for Python as a whole. An explicit commitment to lsprof maintenance would help (me, at least).
> In other words, I say let Armin and Michael add lsprof and the > wrappers for it (all while removing any redundant profilers that they > have wrappers for) with them knowing we will have a public stoning at > PyCon the instant they don't keep it all working. =) I would prefer to see some advance support from lsprof users, confirming that this is really a good thing to have. Regards, Martin From ncoghlan at gmail.com Mon Nov 21 23:02:38 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 22 Nov 2005 08:02:38 +1000 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <fb6fbf560511211336w3d5bc7dbn71c2154bf5455c99@mail.gmail.com> References: <fb6fbf560511211336w3d5bc7dbn71c2154bf5455c99@mail.gmail.com> Message-ID: <438243FE.1020504@gmail.com> Jim Jewett wrote: > Jeremy Hylton jeremy at alum.mit.edu >> Should lsprof be added to the standard distribution? >> Should hotshot be removed from the standard distribution? > >> These two aren't at all related, unless you believe that two is the >> maximum number of profilers allowed per Python distribution. > > One is a better number. > > ("There should be one-- and preferably only one --obvious way to do it.") > > Adding a second (let alone third) module to the stdlib to do > the same thing just makes the documentation bulkier, > and makes the "where do I start" problem harder for beginners. > > And yes, I think beginners are the most important audience > here; anyone sufficiently comfortable with python to make an > intelligent choice between different code profilers is probably > also able to install 3rd-party modules anyway. Chiming in as a user of 'profile', who has also attempted to use hotshot...
I used profile heavily when we were working on the implementation of the decimal module, trying to figure out where the bottlenecks were (e.g., profile showed that converting to integers to do arithmetic and back to sequences to do rounding was a net win, despite the conversion costs in switching back and forth between the two formats). I tried using hotshot to do the same thing (profiled runs of the arithmetic tests took a *long* time), and found the results to be well-nigh useless (I seem to recall it was related to the fact that profile separated out C calls, while hotshot didn't). So my experience of hotshot has been "sure it's slightly less invasive, but it doesn't actually work". If hotshot can be replaced with something that actually works as intended, or if lsprof can be added in a way that is more closely coupled with profile (so that there is a clear choice between "less invasive but less detailed results" and "more detailed results but more invasive during execution"), I'd be quite happy. If a statistical profiler was later added to round out the minimally invasive end, that actually makes for a decent profiling toolset:

1. Use the statistical profiler to identify potential problem areas
2. Use hotshot/lsprof to further analyse the potential problem areas
3. Use profile to get detailed results on the bottlenecks

Cheers, Nick.
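Step 3 of the toolset Nick describes is just the documented profile/pstats workflow; a minimal round trip might look like this (the function names and the stats filename are made up for the example):

```python
import os
import profile
import pstats
import tempfile

def inner():
    # deliberately busy inner loop: the "bottleneck" step 3 should expose
    return sum(i % 7 for i in range(20000))

def main():
    return [inner() for _ in range(5)]

statfile = os.path.join(tempfile.mkdtemp(), 'example.prof')

# runctx profiles in the current namespace and dumps the stats to a file,
# just as profile.run() does for __main__.
profile.runctx('main()', globals(), locals(), statfile)

stats = pstats.Stats(statfile)
stats.sort_stats('cumulative').print_stats(5)   # five most expensive entries
```

The same pstats step works unchanged on output from the other deterministic profilers, which is part of why coupling lsprof to this interface is attractive.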
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From nyamatongwe at gmail.com Mon Nov 21 23:04:40 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Tue, 22 Nov 2005 09:04:40 +1100 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> <438048B6.2030103@v.loewis.de> <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> <dlsma2$kj1$1@sea.gmane.org> <ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com> Message-ID: <50862ebd0511211404m2190f880sa210eda18c216140@mail.gmail.com> Neal Norwitz: > I think a bigger bang for the buck would be to buy a Windows box with > Purify. Rational was a real pain to deal with, maybe it's better now > that IBM bought them. Parasoft (Insure++) was even worse to deal > with. My experience with the other Windows option, BoundsChecker, is similarly negative and I haven't bothered upgrading for a couple of versions (so can only use it with VC++ 6). The original developer, NuMega, were great but they were absorbed into Compuware which seems to see it more as a source of consulting income than as a product. I'm fairly experienced with BoundsChecker and related programs (like their profiler) so could run it over a test suite if a license was provided. A demonstration license can probably not be installed on my machine due to earlier installs. 
Neil From fredrik at pythonware.com Mon Nov 21 23:07:06 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 21 Nov 2005 23:07:06 +0100 Subject: [Python-Dev] ast status, memory leaks, etc References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com><ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com><438048B6.2030103@v.loewis.de><ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com><dlsma2$kj1$1@sea.gmane.org> <ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com> Message-ID: <dltgef$irs$1@sea.gmane.org> Neal Norwitz wrote: > I think a bigger bang for the buck would be to buy a Windows box with > Purify. Rational was a real pain to deal with, maybe it's better now > that IBM bought them. Parasoft (Insure++) was even worse to deal > with. There would be many other benefits for someone to do more > testing on Windows. I don't think there's a shortage of Windows boxes among the python-dev crowd (I have plenty). Does anyone know what kind of box you need to run purify these days? (looks like a license costs $780 in the US but $1100 in Sweden. hmm...) > The worst part of all this is ... it's still Windows. Some of us are OS agnostics, you know.
</F> From nnorwitz at gmail.com Mon Nov 21 23:24:36 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 21 Nov 2005 14:24:36 -0800 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <dltgef$irs$1@sea.gmane.org> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> <438048B6.2030103@v.loewis.de> <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> <dlsma2$kj1$1@sea.gmane.org> <ee2a432c0511211105w7b60bae1ibaaf6e2a4bd077fb@mail.gmail.com> <dltgef$irs$1@sea.gmane.org> Message-ID: <ee2a432c0511211424v649272f1of439ad5cdad00301@mail.gmail.com> On 11/21/05, Fredrik Lundh <fredrik at pythonware.com> wrote: > > I don't think there's a shortage of Windows boxes among the python-dev > crowd (I have plenty). Does anyone know what kind of box you need to > run purify these days? Dunno, but it would probably be fine on a reasonably new box with at least 1 GB of RAM. If you are interested in using purify for python, I think that would be great and doubt there would be an issue for the PSF to buy a copy. > (looks like a license costs $780 in the US but $1100 in Sweden. hmm...) There was also PurifyPlus (I think that was the name) for $1380 or so. My guess is that also included Quantify and the other program bundled together. > > The worst part of all this is ... it's still Windows. > > Some of us are OS agnostics, you know. Yeah, it was meant as a joke (though also my preference). Guess I shouldn't go on tour.
:-) n From skip at pobox.com Mon Nov 21 23:30:47 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 21 Nov 2005 16:30:47 -0600 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <bbaeab100511211216v596ff9acg150fd2994e98b159@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <bbaeab100511211138w244f0498k728363802328df2c@mail.gmail.com> <e8bf7a530511211142v629f6c69s6b07a3025db6f2ae@mail.gmail.com> <17282.10432.609056.20620@montanaro.dyndns.org> <bbaeab100511211216v596ff9acg150fd2994e98b159@mail.gmail.com> Message-ID: <17282.19095.558236.430148@montanaro.dyndns.org> Brett> My question is whether anyone is willing to maintain it in the Brett> stdlib? My answer is: I'm not sure it matters at this point. There are so many profiling possibilities, it doesn't seem like we yet know which options are the best. There is some tacit crowning of "best of breed" when a package is added to the standard library, so we probably shouldn't be adding every candidate that comes along until we have a better idea of the best way to do things. Skip From martin at v.loewis.de Mon Nov 21 23:48:29 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 21 Nov 2005 23:48:29 +0100 Subject: [Python-Dev] svn diff -r {2001-01-01} Message-ID: <43824EBD.50402@v.loewis.de> Greg Stein points out that because of the way the subversion conversion was done, by-date revision specifications won't work. Subversion assumes that time is monotonically increasing over revision numbers - it does a binary search to find out the revision that immediately precedes(?) the specified date. Yet, as the conversion was done project-by-project (toplevel svn dirs), commit time sometimes goes backward along with increasing revision numbers; this breaks the algorithm svn uses.
There are two ways in which you might want to use date specifications (that I can think of): svn diff (find the changes since some date) and svn up (check out revision at some date). If you need to do such operations, you will have to look up the closest revision number manually (e.g. in viewcvs, or through svn log). If this is a common operation, I'm sure it would be possible to put a table of commit dates for python/ somewhere, to find the necessary revision number more quickly. For dates past the switchover, everything is fine. So svn diff -r{00:00} Lib/ works fine. Regards, Martin From simon at arrowtheory.com Tue Nov 22 05:30:38 2005 From: simon at arrowtheory.com (Simon Burton) Date: Tue, 22 Nov 2005 15:30:38 +1100 Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to 2005-10-31 In-Reply-To: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com> References: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com> Message-ID: <20051122153038.030c8586.simon@arrowtheory.com> On Thu, 17 Nov 2005 13:36:36 +1300 Tony Meyer <tony.meyer at gmail.com> wrote: > > -------------- > AST for Python > -------------- > > As of October 21st, Python's compiler now uses a real Abstract Syntax > Tree (AST)! This should make experimenting with new syntax much > easier, as well as allowing some optimizations that were difficult > with the previous Concrete Syntax Tree (CST). > While there is no > Python interface to the AST yet, one is intended for the not-so-distant future. OK, who is doing this ? I am mad keen to get this happening. Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph.
61 02 6249 6940 http://arrowtheory.com From abkhd at hotmail.com Tue Nov 22 06:16:15 2005 From: abkhd at hotmail.com (A.B., Khalid) Date: Tue, 22 Nov 2005 05:16:15 +0000 Subject: [Python-Dev] test_cmd_line on Windows Message-ID: <BAY12-F7A12E367A717740C27791AB520@phx.gbl> Currently test_directories of test_cmd_line fails on the latest Python 2.4.2 from svn branch and from the svn head. The reason it seems is that the test assumes that the local language of Windows is English and so tries to find the string " denied" in the returned system error messages of the commands ("python .") and ("python < ."). But while it is true that the first command ("python .") does return an English string error message even on so-called non-English versions of Windows, the same does not seem to be true for the second command ("python < ."), which seems to return a locale-related string error message. And since the latter test is looking for the English " denied" in a non-English language formatted string, the test fails in non-English versions of Windows. Regards Khalid _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ From nnorwitz at gmail.com Tue Nov 22 06:20:51 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 21 Nov 2005 21:20:51 -0800 Subject: [Python-Dev] Fwd: [Python-checkins] commit of r41497 - python/trunk/Lib/test In-Reply-To: <20051122051745.B440A1E400B@bag.python.org> References: <20051122051745.B440A1E400B@bag.python.org> Message-ID: <ee2a432c0511212120r2a4429c9lbda2ead70a3156f6@mail.gmail.com> I just checked in the modification below. I'm not sure if this behaviour is on purpose or by accident. Do we want to support hex values in floats? Do we want to support p, similar to e in floats?
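For the record, this is roughly where later Python versions landed on Neal's question: plain float() stopped accepting hex strings once CPython did its own string-to-float conversion, and Python 2.6 added explicit float.hex()/float.fromhex(), where p introduces a power-of-two exponent, analogous to e for powers of ten:

```python
# 0x3.1 means 3 + 1/16; the exponent after 'p' is a power of two.
assert float.fromhex('0x3.1p0') == 3.0625
assert float.fromhex('-0x3.p-1') == -1.5        # -3 * 2**-1

# hex()/fromhex() round-trip a float exactly:
assert float.fromhex((2.5).hex()) == 2.5

# plain float() no longer accepts hex literals:
try:
    float(' 0x3.1 ')
except ValueError:
    pass
else:
    raise AssertionError("float() unexpectedly parsed a hex string")
```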
Here are the lines from the test:

    + self.assertEqual(float(" 0x3.1 "), 3.0625)
    + self.assertEqual(float(" -0x3.p-1 "), -1.5)

n

---------- Forwarded message ---------- From: neal.norwitz at python.org <neal.norwitz at python.org> Date: Nov 21, 2005 9:17 PM Subject: [Python-checkins] commit of r41497 - python/trunk/Lib/test To: python-checkins at python.org

Author: neal.norwitz Date: Tue Nov 22 06:17:40 2005 New Revision: 41497 Modified: python/trunk/Lib/test/test_builtin.py Log: improve test coverage in Python/pystrtod.c and Python/mystrtoul.c.

Modified: python/trunk/Lib/test/test_builtin.py
==============================================================================
--- python/trunk/Lib/test/test_builtin.py (original)
+++ python/trunk/Lib/test/test_builtin.py Tue Nov 22 06:17:40 2005
@@ -545,6 +545,34 @@
         self.assertEqual(float(unicode(" 3.14 ")), 3.14)
         self.assertEqual(float(unicode(" \u0663.\u0661\u0664 ",'raw-unicode-escape')), 3.14)
+    def test_float_with_comma(self):
+        # set locale to something that doesn't use '.' for the decimal point
+        try:
+            import locale
+            orig_locale = locale.setlocale(locale.LC_NUMERIC, '')
+            locale.setlocale(locale.LC_NUMERIC, 'fr_FR')
+        except:
+            # if we can't set the locale, just ignore this test
+            return
+
+        try:
+            self.assertEqual(locale.localeconv()['decimal_point'], ',')
+        except:
+            # this test is worthless, just skip it and reset the locale
+            locale.setlocale(locale.LC_NUMERIC, orig_locale)
+            return
+
+        try:
+            self.assertEqual(float(" 3,14 "), 3.14)
+            self.assertEqual(float(" +3,14 "), 3.14)
+            self.assertEqual(float(" -3,14 "), -3.14)
+            self.assertEqual(float(" 0x3.1 "), 3.0625)
+            self.assertEqual(float(" -0x3.p-1 "), -1.5)
+            self.assertEqual(float(" 25.e-1 "), 2.5)
+            self.assertEqual(fcmp(float(" .25e-1 "), .025), 0)
+        finally:
+            locale.setlocale(locale.LC_NUMERIC, orig_locale)
+
     def test_floatconversion(self):
         # Make sure that calls to __float__() work properly
         class Foo0:
@@ -682,6 +710,7 @@
         self.assertRaises(TypeError, int, 1, 12)
         self.assertEqual(int('0123', 0), 83)
+        self.assertEqual(int('0x123', 16), 291)
     def test_intconversion(self):
         # Test __int__()

_______________________________________________ Python-checkins mailing list Python-checkins at python.org http://mail.python.org/mailman/listinfo/python-checkins

From arigo at tunes.org Tue Nov 22 07:01:47 2005 From: arigo at tunes.org (Armin Rigo) Date: Tue, 22 Nov 2005 07:01:47 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <43823C53.8080403@v.loewis.de> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <43823C53.8080403@v.loewis.de> Message-ID: <20051122060146.GA14960@code1.codespeak.net>

Hi Martin,

On Mon, Nov 21, 2005 at 10:29:55PM +0100, "Martin v. Löwis" wrote:
> > I see no incremental way of fixing some of the downsides of hotshot,
> > like its huge log file size and loading time.
> > I haven't looked into the details myself, but it appears that some > google-summer-of-code contributor has found some way of fixing it. As discussed elsewhere on this thread: this contribution did not fix any of the mentioned problems. The goal was only to get rid of profile.py by linking it to Hotshot. So the log file size didn't change and the loading time was only 20-30% better, which is still a really long time. > So essentially: fixing bugs isn't fun, but rewriting it from scratch is. Well, sorry for being interested in having fun. And yes, I am formally committing myself to maintaining this new piece of software, because that also looks like fun: it's simple code that does just what you expect from it. Note that I may sound too negative about Hotshot. I see by now that it is a very powerful piece of code, full of careful design trade-offs and capabilities. It can do much more than what the minimalistic documentation says, e.g. it can or could be used as the basis of a tracing tool to debug software, to measure test coverage, etc. (with external tools). Moreover, it comes with carefully chosen drawbacks -- log file size and loading time -- for advanced reasons. You won't find them discussed in the documentation, which makes user experience mostly negative, but you do find them in Tim's e-mails :-) So no, I'm not willing to debug and maintain an "unfinished" (quoting Tim) advanced piece of software doing much more than what common-people- reading-the-stdlib-docs use it for. That is not fun. > Now, it might be that in this specific case, replacing the library > really is the right thing to do. It would be if: > 1.it has improvements over the current library already > (certified by users other than the authors), AND > 2.it has no drawbacks over the current library, AND > 3.there is some clear indication that it will get better maintenance > than the previous library. 1. 
Log file size (could reuse the existing compact profile.py format) -- good "profile-tweak-reprofile" round-trip time for the developer (no ages spent loading the log) -- ability to interpret the logs in memory, no need for a file -- collecting children call stats. Positive early user experience comes from the authors, me, and at least one other company (Strakt) that cared enough to push for lsprof on the SF tracker. There is this widespread user experience that hotshot is nice "but it doesn't actually appear to work" (as Nick Coghlan put it). Hotshot is indeed buggy and has been producing wrong timings all along (up to and including the current HEAD version) as shown by the test_profile found in the Summer of Code project mentioned above. Now we can fix that one, and see if things get better. In some sense this fix will discard the meaning of any previous user experience, so that lsprof has now more of it than Hotshot... 2. Drawbacks: there are many, as Hotshot has much more capabilities or potential capabilities than lsprof. None of them is to be found in the documentation of Hotshot, though. There is no drawback for people using Hotshot only as documented. Of course we might keep both Hotshot and lsprof in the stdlib, if this sounds like a problem, but I really think the stdlib could do with clean-ups more than pile-ups. 3. Maintenance group: two core developers. A bientot, Armin. 
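[As a historical footnote with a sketch: lsprof was in fact accepted later and lives in the standard library as cProfile (from Python 2.5), with a profile-compatible pstats interface. A minimal sketch of the in-memory workflow Armin lists above, with an illustrative toy workload:]

```python
import cProfile
import io
import pstats

def fib(n):
    # A deliberately recursive toy workload to profile.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

prof = cProfile.Profile()
prof.enable()
fib(18)
prof.disable()

# No log-file round-trip: the stats are built, sorted, and printed
# entirely in memory, one of the lsprof advantages listed above.
buf = io.StringIO()
pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats()
report = buf.getvalue()
print("fib" in report)  # True: the recursive calls were recorded
```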
From bcannon at gmail.com Tue Nov 22 08:35:37 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 21 Nov 2005 23:35:37 -0800 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051122060146.GA14960@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <43823C53.8080403@v.loewis.de> <20051122060146.GA14960@code1.codespeak.net> Message-ID: <bbaeab100511212335v7be01235o71f932593b7d3fe0@mail.gmail.com> On 11/21/05, Armin Rigo <arigo at tunes.org> wrote: > Hi Martin, > > On Mon, Nov 21, 2005 at 10:29:55PM +0100, "Martin v. L?wis" wrote: > > > I see no incremental way of fixing some of the downsides of hotshot, > > > like its huge log file size and loading time. > > > > I haven't looked into the details myself, but it appears that some > > google-summer-of-code contributor has found some way of fixing it. > > As discussed elsewhere on this thread: this contribution did not fix any > of the mentioned problems. The goal was only to get rid of profile.py > by linking it to Hotshot. So the log file size didn't change and the > loading time was only 20-30% better, which is still a really long time. > > > So essentially: fixing bugs isn't fun, but rewriting it from scratch is. > > Well, sorry for being interested in having fun. And yes, I am formally > committing myself to maintaining this new piece of software, because > that also looks like fun: it's simple code that does just what you > expect from it. > > Note that I may sound too negative about Hotshot. I see by now that it > is a very powerful piece of code, full of careful design trade-offs and > capabilities. It can do much more than what the minimalistic > documentation says, e.g. it can or could be used as the basis of a > tracing tool to debug software, to measure test coverage, etc. (with > external tools). 
Moreover, it comes with carefully chosen drawbacks -- > log file size and loading time -- for advanced reasons. You won't find > them discussed in the documentation, which makes user experience mostly > negative, but you do find them in Tim's e-mails :-) > > So no, I'm not willing to debug and maintain an "unfinished" (quoting > Tim) advanced piece of software doing much more than what common-people- > reading-the-stdlib-docs use it for. That is not fun. > > > Now, it might be that in this specific case, replacing the library > > really is the right thing to do. It would be if: > > 1.it has improvements over the current library already > > (certified by users other than the authors), AND > > 2.it has no drawbacks over the current library, AND > > 3.there is some clear indication that it will get better maintenance > > than the previous library. > > 1. Log file size (could reuse the existing compact profile.py format) -- > good "profile-tweak-reprofile" round-trip time for the developer (no > ages spent loading the log) -- ability to interpret the logs in memory, > no need for a file -- collecting children call stats. Positive early > user experience comes from the authors, me, and at least one other > company (Strakt) that cared enough to push for lsprof on the SF tracker. > > There is this widespread user experience that hotshot is nice "but it > doesn't actually appear to work" (as Nick Coghlan put it). Hotshot is > indeed buggy and has been producing wrong timings all along (up to and > including the current HEAD version) as shown by the test_profile found > in the Summer of Code project mentioned above. Now we can fix that one, > and see if things get better. In some sense this fix will discard the > meaning of any previous user experience, so that lsprof has now more of > it than Hotshot... > > 2. Drawbacks: there are many, as Hotshot has much more capabilities or > potential capabilities than lsprof. 
> None of them is to be found in the documentation of Hotshot, though.
> There is no drawback for people using Hotshot only as documented. Of
> course we might keep both Hotshot and lsprof in the stdlib, if this
> sounds like a problem, but I really think the stdlib could do with
> clean-ups more than pile-ups.

I am perfectly happy with having lsprof be added with all of this and
point 3 (any chance we can replace profile with a wrapper to lsprof
without much issue?). As for cleanup, I say Hotshot should stay if we
can get it working properly and document its power features. If we
can't get it to that state then it should go (maybe not until Python
3.0, but eventually).

> 3. Maintenance group: two core developers.

-Brett

From fredrik at pythonware.com Tue Nov 22 08:48:25 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 22 Nov 2005 08:48:25 +0100
Subject: [Python-Dev] [Python-checkins] commit of r41497 - python/trunk/Lib/test
References: <20051122051745.B440A1E400B@bag.python.org> <ee2a432c0511212120r2a4429c9lbda2ead70a3156f6@mail.gmail.com>
Message-ID: <dluigb$2as$1@sea.gmane.org>

Neal Norwitz wrote:

> I just checked in the modification below. I'm not sure if this
> behaviour is on purpose or by accident.

Python 2.4 on Linux:

>>> float(" 0x3.1 ")
3.0625

Python 2.4 on Windows:

>>> float(" 0x3.1 ")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: invalid literal for float(): 0x3.1

</F>

From phd at mail2.phd.pp.ru Tue Nov 22 10:00:43 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Tue, 22 Nov 2005 12:00:43 +0300
Subject: [Python-Dev] svn diff -r {2001-01-01}
In-Reply-To: <43824EBD.50402@v.loewis.de>
References: <43824EBD.50402@v.loewis.de>
Message-ID: <20051122090043.GA30828@phd.pp.ru>

On Mon, Nov 21, 2005 at 11:48:29PM +0100, "Martin v. Löwis" wrote:
> you will have to look up the closest
> revision number manually (e.g. in viewcvs, or through svn log).

svn annotate (aka svn blame) may help too.

Oleg.
--
Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From walter at livinglogic.de Tue Nov 22 14:13:56 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue, 22 Nov 2005 14:13:56 +0100
Subject: [Python-Dev] test_cmd_line on Windows
In-Reply-To: <BAY12-F7A12E367A717740C27791AB520@phx.gbl>
References: <BAY12-F7A12E367A717740C27791AB520@phx.gbl>
Message-ID: <43831994.6060104@livinglogic.de>

A.B., Khalid wrote:

> Currently test_directories of test_cmd_line fails on the latest Python 2.4.2
> from svn branch and from the svn head. The reason it seems is that the test
> assumes that the local language of Windows is English and so tries to find
> the string " denied" in the returned system error messages of the commands
> ("python .") and ("python < .").
>
> But while it is true that the first command ("python .") does return an
> English string error message even on so-called non-English versions of
> Windows, the same does not seem to be true for the second command
> ("python < ."), which seems to return a locale-related string error message.
> And since the latter test is looking for the English " denied" in a
> non-English language formatted string, the test fails in non-English
> versions of Windows.

Does the popen2.popen4() used by the test provide return values of the
executed command? Using os.system() instead seems to provide enough
information. On Windows:

Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.system("python < .")
Zugriff verweigert
1
>>> os.system("python <NUL:")
Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
0
>>>

On Linux:

Python 2.4.2 (#1, Oct 3 2005, 15:51:22) [GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.system("python < .")
35584
>>> os.system("python < /dev/null")
0

Can you provide a patch to test_cmd_line.py?

Bye,
   Walter Dörwald

From decker at dacafe.com Mon Nov 21 01:42:32 2005
From: decker at dacafe.com (decker@dacafe.com)
Date: Mon, 21 Nov 2005 00:42:32 -0000 (Australia/Sydney)
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications
In-Reply-To: <20051120150850.GA27838@unpythonic.net>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net>
Message-ID: <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com>

<quote who="jepler at unpythonic.net">
> On Sat, Nov 19, 2005 at 11:06:16PM +0100, "Martin v. Löwis" wrote:
>> decker at dacafe.com wrote:
>> > I would appreciate feedback concerning these patches before the next
>> > "PythonD" (for DOS/DJGPP) is released.
>>
>> PEP 11 says that DOS is not supported anymore since Python 2.0. So
>> I am -1 on reintroducing support for it.

The local python community here in Sydney indicated that python.org is
only upset when groups port the source to 'obscure' systems and *don't*
submit patches... It is possible that I was misinformed.

> If we have someone who is volunteering the time to make it work, not
> just today but in the future as well, we shouldn't rule out re-adding
> support.

I am not sure about the future myself. DJGPP 2.04 has been parked at
beta for two years now. It might be fair to say that the *general* DJGPP
developer base has shrunk a little bit. But the PythonD userbase has
actually grown since the first release three years ago. For the time
being, people get very angry when the servers go down here :-)

> I've taken a glance at the patch.
> There are probably a few things to quarrel over--for instance, it
> looks like a site.py change will cause python to print a blank line
> when it's started, and the removal of a '#define HAVE_FORK 1' in
> posixmodule.c---but this still doesn't mean the re-addition of DOS as
> a supported platform should be rejected out of hand.

Well, that's for sure! These patches have never been reviewed by
python.org before, so I am sure that there are *plenty* of ways to
better fit DOS support into the Python source. Fork will never work
under DOS, no matter how much we dream :-)

The empty line 'print' was a legacy error to kludge the ANSI color
scheme to work correctly. Long story. It can be ignored. In fact, none
of the changes to site.py are essential for python to work under DOS.
They are 'additions' that most of the PythonD userbase seem to enjoy,
but few knew how to do for themselves at one time. But they aren't
essential to the port. The important aspects are the path and stat
stuff. Nothing works without them.

I should mention that one thing that never did get ported was the build
scripts themselves to accommodate DJGPP-DOS. For a complete port, we
must still look at Modules/makesetup to remember that although
directory separators "\\" or "/" are OK, the path separator ":" is
definitely not. ";" must be used. So far, we have simply changed Setup
and the Makefiles by hand after the initial configure.

Ben

From falcon at intercable.ru Mon Nov 21 08:02:04 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Mon, 21 Nov 2005 10:02:04 +0300
Subject: [Python-Dev] str.dedent
Message-ID: <438170EC.8090509@intercable.ru>

>> msg = textwrap.dedent('''\
>>     IDLE's subprocess can't connect to %s:%d. This may be due \
>>     to your personal firewall configuration. It is safe to \
>>     allow this internal connection because no data is visible on \
>>     external ports.''' % address)
>
> Unfortunately, it won't help, since the 'dedent' method won't treat
> those spaces as indentation.

So it would be useful to have the parser dedent implicitly on strings
with a 'd' prefix:

    msg = d'''\
        IDLE's subprocess can't connect to %s:%d. This may be due \
        to your personal firewall configuration. It is safe to \
        allow this internal connection because no data is visible on \
        external ports.''' % address

From bend at ddaustralia.com.au Mon Nov 21 08:08:45 2005
From: bend at ddaustralia.com.au (Ben Decker)
Date: Mon, 21 Nov 2005 18:08:45 +1100
Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications
In-Reply-To: <43816CE2.2020808@v.loewis.de>
References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> <43816CE2.2020808@v.loewis.de>
Message-ID: <20051121070845.GA12993@ithaca04.ddaustralia.local>

> It's not that much the availability of the platform I worry about, but
> the commitment of the Python porter. We need somebody to forward bug
> reports to, and somebody to intervene if incompatible changes are made.
> This person would also indicate that the platform is no longer
> available, and hence the port can be removed.
>
> Regards,
> Martin

I think the port has been supported for three years now. I am not sure
what kind of commitment you are looking for, but the patch and software
are supplied under the same terms of liability and warranty as anything
else under the GPL. Bug reports can be sent to either python at
exemail.com.au, decker at dacafe.com or developemnt at exemail.com.au.
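[Returning to the str.dedent message above: the behaviour under discussion can be reproduced with textwrap.dedent directly. It strips only the whitespace prefix common to every physical line, which is why backslash-continued lines, whose indentation ends up in the middle of a single logical line, defeat it. A minimal sketch; the message text and the (host, port) pair are illustrative:]

```python
import textwrap

# Ordinary indented lines: the common 4-space prefix is removed.
plain = """\
    IDLE's subprocess can't connect to %s:%d.
    This may be due to your personal firewall configuration.
"""
print(textwrap.dedent(plain) % ("127.0.0.1", 8833))

# Backslash continuations collapse everything into ONE logical line,
# so the spaces before "to" sit mid-line where dedent() cannot see
# them: the point made in the quoted exchange.
continued = """\
    IDLE's subprocess can't connect. This may be due \
    to your personal firewall configuration.
"""
print(repr(textwrap.dedent(continued)))
```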
From tony.meyer at gmail.com Mon Nov 21 11:14:25 2005
From: tony.meyer at gmail.com (Tony Meyer)
Date: Mon, 21 Nov 2005 23:14:25 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-11-01 through 2005-11-15
Message-ID: <6c63de570511210214o69c0a5b6q4683fcbd974441d3@mail.gmail.com>

Surprise! It's November, and here's a November summary <wink>. Thanks
to all those that proofread the triple summary hit last week; if anyone
can spare some time to take a look over these in the next couple of
days, that would be great. As always, corrections and suggestions to
tony.meyer at gmail.com or steven.bethard at gmail.com.

A couple of largish threads were skipped: one continuing discussion
about freezing (I couldn't come up with a summary longer than "the
heated debate continued"), and one on weak reference dereference
notifications (I wasn't sure what to say). If anyone wants those
summarized, let me know (ideally with some hints!) and I'll add them
in.

=============
Announcements
=============

----------------------------------------
PyPy 0.8.0 and Gothenburg PyPy Sprint II
----------------------------------------

`PyPy 0.8.0`_ has been released. This third release of PyPy includes a
translatable parser and AST compiler, some speed enhancements
(translated PyPy is now about 10 times faster than 0.7, but still 10-20
times slower than CPython), increased language compliance, and some
experimental features are now translatable. This release also includes
snapshots of interesting, but not yet completed, subprojects including
the OOtyper (an RTyper variation for higher-level backends), a
JavaScript backend, a limited (PPC) assembler backend, and some bits
for a socket module.

The next PyPy Sprint is also coming up soon. The Gothenburg PyPy
Sprint II is on the 7th to 11th of December 2005 in Gothenburg, Sweden.
Its focus is heading towards phase 2, which means JIT work, alternate
threading modules, and logic programming. Newcomer-friendly
introductions will also be given.
The main topics that are currently scheduled are the L3 interpreter (a small fast interpreter for "assembler-level" flow graphs), Stackless (e.g. Tasklets or Greenlets), porting C modules from CPython, optimization/debugging work, and logic programming in Python. .. _`PyPy 0.8.0`: http://codespeak.net/pypy/dist/pypy/doc/release-0.8.0.html Contributing threads: (1) - `PyPy 0.8.0 is released! < http://mail.python.org/pipermail/python-dev/2005-November/057878.html>`__ (1) - `Gothenburg PyPy Sprint II: 7th - 11th December 2005 < http://mail.python.org/pipermail/python-dev/2005-November/058143.html>`__ [TAM] ------------------------ PyCon Sprint suggestions ------------------------ Every PyCon has featured a python-dev `sprint`_. For the past few years, hacking on the AST branch has been a tradition, but since the AST branch has now been merged into the trunk, other options are worth considering this year. Several PEP implementations were suggested, including `PEP 343`_ ('with:'), `PEP 308`_ ('x if y else z'), `PEP 328`_ ('absolute/relative import'), and `PEP 341`_ ('unifying try/except and try/finally'). Suggestions to continue the AST theme were also made, including one of the "global variable speedup" PEPs, `Guido's instance variable speedup idea`_, using the new AST code to improve/extend/rewrite the optimization steps the compiler performs, or rewriting PyChecker to operate from the AST representation. Phillip J. Eby also suggested working on the oft-mentioned bytes type. All of these suggestions, as well as any others that are made, are being recorded on the `PythonCore sprint wiki`_. .. _sprint: http://wiki.python.org/moin/PyCon2006/Sprints .. _PEP 343: http://www.python.org/peps/pep-0343.html .. _PEP 308: http://www.python.org/peps/pep-0308.html .. _PEP 328: http://www.python.org/peps/pep-0328.html .. _PEP 341: http://www.python.org/peps/pep-0341.html .. 
_Guido's instance variable speedup idea: http://mail.python.org/pipermail/python-dev/2002-February/019854.html .. _PythonCore sprint wiki: http://wiki.python.org/moin/PyCon2006/Sprints/PythonCore Contributing threads: (13) - `python-dev sprint at PyCon < http://mail.python.org/pipermail/python-dev/2005-November/057830.html>`__ (1) - `PEP 328 - absolute imports (python-dev sprint at PyCon) < http://mail.python.org/pipermail/python-dev/2005-November/057853.html>`__ [TAM] -------------------------------------- Reminder: Python is now on Subversion! -------------------------------------- Just a reminder to everyone that the Python source repository_ is now hosted on Subversion. A few minor bugs were fixed, so you can make SVK mirrors of the repository successfully now. Be sure to check out the newly revised Python Developers FAQ_ if you haven't already. .. _repository: http://svn.python.org/projects/ .. _FAQ: http://www.python.org/dev/devfaq.html Contributing threads: (4) - `Freezing the CVS on Oct 26 for SVN switchover < http://mail.python.org/pipermail/python-dev/2005-November/057823.html>`__ (1) - `svn checksum error < http://mail.python.org/pipermail/python-dev/2005-November/057843.html>`__ (6) - `Problems with revision 4077 of new SVN repository < http://mail.python.org/pipermail/python-dev/2005-November/057867.html>`__ (4) - `No more problems with new SVN repository < http://mail.python.org/pipermail/python-dev/2005-November/057888.html>`__ (7) - `dev FAQ updated with day-to-day svn questions < http://mail.python.org/pipermail/python-dev/2005-November/057999.html>`__ (2) - `Mapping cvs version numbers to svn revisions? < http://mail.python.org/pipermail/python-dev/2005-November/058051.html>`__ (2) - `Checking working copy consistency < http://mail.python.org/pipermail/python-dev/2005-November/058056.html>`__ (9) - `Is some magic required to check out new files from svn? 
<http://mail.python.org/pipermail/python-dev/2005-November/058065.html>`__

[SJB]

---------------------------
Updating the Python-Dev FAQ
---------------------------

Brett Cannon has generously volunteered to clean up some of the
developers' documentation and wants to know if people would rather the
bug/patch guidelines be in a classic paragraph-style layout or a more
FAQ-style layout. If you have an opinion on the topic, please let him
know!

Contributing threads:

(2) - `dev FAQ updated with day-to-day svn questions
<http://mail.python.org/pipermail/python-dev/2005-November/058025.html>`__
(1) - `Revamping the bug/patch guidelines (was Re: Implementation of PEP 341)
<http://mail.python.org/pipermail/python-dev/2005-November/058108.html>`__

[SJB]

=========
Summaries
=========

-----------
Event loops
-----------

This thread originated in discussion on SourceForge about patches
1049855_ and 1252236_; Martin v. Löwis and Michiel de Hoon agreed that
the fixes were fragile, and that a larger change should be discussed on
python-dev. Michiel writes visualization software; he (and others, such
as the writers of matplotlib) has trouble creating a good event loop,
because the GUI toolkit (especially Tkinter) wants its own event loop
to be in charge. Michiel doesn't actually need Tkinter for his own
project, but he has to play nice with it because his users expect to be
able to use other tools -- particularly IDLE -- while running his
software. Note that this isn't the first time this sort of problem has
come up; usually it is phrased in terms of a problem with Tix, or not
being able to run turtle while in IDLE.

Event loops by their very nature are infinite loops; once they start,
everything else is out of luck unless it gets triggered by an event or
is already started. Donovan Baarda suggested looking at Twisted for the
state of the art in event loop integration. Unfortunately, as Phillip
Eby notes, it works by not using the Tkinter event loop.
It decides for itself when to call dooneevent (do-one-event). It is possible to run Tkinter's dooneevent version as part of your own event loop (as Twisted does), but you can't really listen for its events, so you end up with a busy loop polling, and stepping into lots of "I have nothing to do" functions for every client eventloop. You can use Tkinter's loop, but once it goes to sleep waiting for input, everything sort of stalls out for a while, and even non-Tkinter events get queued instead of processed. Mark Hammond suggests that it might be easier to replace the interactive portions of python based on the "code" module. matplotlib suggests using ipython instead of standard python for similar reasons. Another option might be to always start Tk in a new thread, rather than letting it take over the main thread. There was some concern (see patch 1049855) that Tkinter doesn't - and shouldn't - require threading. [Jim Jewett posted a summary of this very repetitive and confusing (to the participants, not just summarizers!) thread towards its end, which this summary is very heavily based on. Many thanks Jim!] Contributing threads: (60) - `Event loops, PyOS_InputHook, and Tkinter < http://mail.python.org/pipermail/python-dev/2005-November/057954.html>`__ (4) - `Event loops, PyOS_InputHook, and Tkinter - Summary attempt < http://mail.python.org/pipermail/python-dev/2005-November/058034.html>`__ .. _1049855: http://www.python.org/sf/1049855 .. _1252236: http://www.python.org/sf/1252236 [TAM] ----------------------------- Importing .pyc and .pyo files ----------------------------- Osvaldo Santana Neto pointed out that if a .pyo file exists, but a .pyc doesn't, then a regularly run python will not import it (unless run with -O), but if the .pyo is in a zip file (which is on the PYTHONPATH) then it will import it. He felt that the inconsistency should be addressed and that the zipimport behaviour was preferable. 
However, Guido said that his intention was always that, without -O, *.pyo files are entirely ignored (and, with -O, *.pyc files are entirely ignored). In other words, it is the zipimport behaviour that is incorrect. Guido suggested that perhaps .pyo should be deprecated altogether and instead we could have a post-load optimizer optimize .pyc files according to the current optimization settings. The two use cases presented for including .pyo files but not .py files were in situations where disk space is at a premium, and where a proprietary "canned" application is distributed to end users who have no intention or need to ever add to the code. A suggestion was that a new bytecode could be introduced for assertions that would turn into a jump if assertions were disabled (with -O). Guido thought that the idea had potential, but pointed out that it would take someone thinking really hard about all the use cases, edge cases, implementation details, and so on, in order to write a PEP. He suggested that Brett and Phillip might be suitable volunteers for this. Contributing thread: (40) - `Inconsistent behaviour in import/zipimport hooks < http://mail.python.org/pipermail/python-dev/2005-November/057959.html>`__ [TAM] --------------------------------------- Default __hash__() and __eq__() methods --------------------------------------- Noam Raphael suggested that having the default __hash__() and __eq__() methods based off of the object's id() might have been a mistake. He proposed that the default __hash__() method be removed, and the default __eq__() method compare the two objects' __dict__ and slot members. Jim Fulton offered a counter-proposal that both the default __hash__() and __eq__() methods should be dropped for Python 3.0, but Guido convinced him that removing __eq__() is probably a bad idea; it would mean an object wouldn't compare equal to itself. 
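[The defaults under discussion can be sketched with a hypothetical Point class (the class and its attributes are illustrative): with no __eq__ defined, == falls back to identity, and the default __hash__ derives from id(), which is exactly what lets "identity objects" live in sets.]

```python
class Point:
    # No __eq__ or __hash__ defined, so the defaults apply.
    def __init__(self, x, y):
        self.x, self.y = x, y

a = Point(1, 2)
b = Point(1, 2)

print(a == b)    # False: the default __eq__ compares identity, not value
print(a == a)    # True: an object always compares equal to itself
print(a in {a})  # True: the default id()-based hash makes sets of
                 # "identity objects" work
```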
In the end, Guido decided that having a default __hash__() method based on id() isn't really a bad decision; without it, you couldn't have sets of "identity objects" (objects which don't have a usefully defined value-based comparison). He suggested that the right decision was to make the hash() function smarter, and have it raise an exception if a class redefined __eq__() without redefining __hash__(). (In fact, this is what it used to do, but it was lost when object.__hash__() was introduced.) Contributing threads: (11) - `Why should the default hash(x) == id(x)? < http://mail.python.org/pipermail/python-dev/2005-November/057859.html>`__ (14) - `Should the default equality operator compare values instead of identities? < http://mail.python.org/pipermail/python-dev/2005-November/057868.html>`__ (13) - `For Python 3k, drop default/implicit hash, and comparison < http://mail.python.org/pipermail/python-dev/2005-November/057924.html>`__ [SJB] --------------------------- Indented multi-line strings --------------------------- Avi Kivity reintroduced the oft-requested means of writing a multi-line string without getting the spaces from the code indentation. The usual options were presented:: def f(...): ... msg = ('From: %s\n' 'To: %s\n' 'Subject: Host failure report for %s\n') ... msg = '''\ From: %s To: %s Subject: Host failure report for %s''' ... msg = textwrap.dedent('''\ From: %s To: %s Subject: Host failure report for %s''') Noam Raphael suggested that to simplify the latter option, textwrap.dedent() should become a string method, str.dedent(). There were also a few suggestions that this sort of dedenting should have syntactic support (e.g. with an appropriate string prefix). In general, the discussion harkened back to `PEP 295`_, a similar proposal that was previously rejected. People tossed the ideas around for a bit, but it didn't look like any changes were likely to be made. .. 
_PEP 295: http://www.python.org/peps/pep-0295.html Contributing threads: (3) - `indented longstrings? < http://mail.python.org/pipermail/python-dev/2005-November/058042.html>`__ (18) - `str.dedent < http://mail.python.org/pipermail/python-dev/2005-November/058058.html>`__ (1) - `OT pet peeve (was: Re: str.dedent) < http://mail.python.org/pipermail/python-dev/2005-November/058072.html>`__ [SJB] ------------------ Continued AST work ------------------ Neal Norwitz has been chasing down memory leaks; he believes that the current AST is now as good as before the AST branch was merged in. Nick explained that he is particularly concerned about the returns hidden inside in macros in the AST compiler's symbol table generation and bytecode generation steps. Niko Matsakis suggested that an arena is the way to go for memory management; the goal is to be able to free memory en-masse whatever happens and not have to track individual pointers. Jeremy Hylton noted that the AST phase has a mixture of malloc/free and Python object allocation; he felt that it should be straightforward to change the malloc/free to use an arena API, but that a separate mechanism would be needed to associate a set of PyObject* with the arena. The arena concept gained general approval, and there was some discussion about how best to implement it. In other AST news Rune Holm submitted two_ patches_ for the AST compiler that add better dead code elimination and constant folding and Thomas Lee is attempting to implement `PEP 341`_ (unification of try/except and try/finally), and asked for some help (Nick Coghlan gave some suggestions). .. _two: http://www.python.org/sf/1346214 .. _patches: http://www.python.org/sf/1346238 .. 
_PEP 341: http://python.org/peps/pep-0341.html Contributing threads: (1) - `Optimizations on the AST representation < http://mail.python.org/pipermail/python-dev/2005-November/057865.html>`__ (4) - `Implementation of PEP 341 < http://mail.python.org/pipermail/python-dev/2005-November/058075.html>`__ (1) - `ast status, memory leaks, etc < http://mail.python.org/pipermail/python-dev/2005-November/058089.html>`__ (7) - `Memory management in the AST parser & compiler < http://mail.python.org/pipermail/python-dev/2005-November/058138.html>`__ (1) - `PEP 341 patch & memory management (was: Memory management in the AST parser & compiler) < http://mail.python.org/pipermail/python-dev/2005-November/058142.html>`__ [TAM] --------------------------------------------------------------------- Adding functional methods (reduce, partial, etc.) to function objects --------------------------------------------------------------------- Raymond Hettinger suggested that some of the functionals, like map or partial, might be appropriate as attributes of function objects. This would allow code like:: results = f.map(data) newf = f.partial(somearg) A number of people liked the idea, but it was pointed out that map() and partial() are intended to work with any callable, and turning these into attributes of function objects would make it hard to use them with classes that define __call__(). Guido emphasized this point, saying that complicating the callable interface was a bad idea. Contributing thread: (9) - `a different kind of reduce... < http://mail.python.org/pipermail/python-dev/2005-November/057828.html>`__ [SJB] -------------------------------------------------- Distributing debug build binaries (python2x_d.dll) -------------------------------------------------- David Abrahams asked whether it would be possible for python.org to make a debug build of the Python DLL more accessible. 
Thomas Heller pointed out that the Microsoft debug runtime DLLs are not distributable (which is why the Windows installer does not include the debug version of the Python DLL), and that the ActiveState distribution contains Python debug DLLs. Tim Peters explained that when he used to collect up the debug-build bits at the time the official installer was built, they weren't included in the main installer, because they bloated its size for something that most users don't want. He explained that he stopped collecting the bits because no two users wanted the same set of stuff, and so it grew so large that people complained that it was too big. Tim suggested that the best thing to do would be to define precisely what an acceptable distribution format is and what exactly it should contain. Martin indicated that he would accept a patch that picked up the files and packaged them, and he would include them in the official distribution. Contributing thread: (12) - `Plea to distribute debugging lib < http://mail.python.org/pipermail/python-dev/2005-November/057896.html>`__ [TAM] ----------------------------------- Creating a python-dev-announce list ----------------------------------- Jack Jansen suggested that a low-volume, moderated, python-dev-announce mailing list be created for time-critical announcements for people developing Python. The main benefit would be the ability to keep up with important announcements such as new releases, the switch to svn, and so on, even when developers don't have time to keep up with all threads. Additionally, it would be easier to separate out such announcements, even when following all threads. Although these summaries exist (and the announcements section at the top pretty much covers what Jack is after), the summaries occur at least a week after the end of the period that they cover, which could be as much as three weeks after any announcement (if it occurred on the first of a month, for example). 
I suggested that a simpler possibility might follow along the lines of the PEP topic that the python-checkins list provides (a feature of Mailman). This would still require some sort of effort by the announcer (e.g. putting some sort of tag in the subject), but wouldn't require an additional list, or additional moderators. However, Martin pointed out that this would put an extra burden on people to remember to post to such a list; this burden would also exist using the Mailman topic mechanism. There wasn't much apparent support for the list, so this seems unlikely to occur at present. Of course, that could be because the people that would like it are too busy to have noticed the thread yet < 0.5 wink>, so perhaps there is more to come. Contributing thread: (7) - `Proposal: can we have a python-dev-announce mailing list? < http://mail.python.org/pipermail/python-dev/2005-November/057880.html>`__ [TAM] =============== Skipped Threads =============== (2) - `Divorcing str and unicode (no more implicit conversions). < http://mail.python.org/pipermail/python-dev/2005-November/057827.html>`__ (1) - `[C++-sig] GCC version compatibility < http://mail.python.org/pipermail/python-dev/2005-November/057831.html>`__ (2) - `PYTHOPN_API_VERSION < http://mail.python.org/pipermail/python-dev/2005-November/057879.html>`__ (3) - `Adding examples to PEP 263 < http://mail.python.org/pipermail/python-dev/2005-November/057891.html>`__ (3) - `Class decorators vs metaclasses < http://mail.python.org/pipermail/python-dev/2005-November/057904.html>`__ (2) - `PEP 352 Transition Plan < http://mail.python.org/pipermail/python-dev/2005-November/057911.html>`__ (4) - `PEP submission broken? 
< http://mail.python.org/pipermail/python-dev/2005-November/057935.html>`__ (7) - `cross-compiling < http://mail.python.org/pipermail/python-dev/2005-November/057939.html>`__ (1) - `[OTAnn] Feedback < http://mail.python.org/pipermail/python-dev/2005-November/057941.html>`__ (1) - `Weekly Python Patch/Bug Summary < http://mail.python.org/pipermail/python-dev/2005-November/057949.html>`__ (4) - `Unifying decimal numbers. < http://mail.python.org/pipermail/python-dev/2005-November/057951.html>`__ (1) - `int(string) (was: DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16) < http://mail.python.org/pipermail/python-dev/2005-November/057994.html>`__ (3) - `to_int -- oops, one step missing for use. < http://mail.python.org/pipermail/python-dev/2005-November/058006.html>`__ (2) - `(no subject) < http://mail.python.org/pipermail/python-dev/2005-November/058023.html>`__ (7) - `Building Python with Visual C++ 2005 Express Edition < http://mail.python.org/pipermail/python-dev/2005-November/058024.html>`__ (3) - `Coroutines (PEP 342) < http://mail.python.org/pipermail/python-dev/2005-November/058133.html>`__ (13) - `Weak references: dereference notification < http://mail.python.org/pipermail/python-dev/2005-November/057961.html>`__ (8) - `apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol) < http://mail.python.org/pipermail/python-dev/2005-November/057839.html>`__ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20051121/8138c1c8/attachment-0001.html From fb102 at soton.ac.uk Mon Nov 21 17:41:04 2005 From: fb102 at soton.ac.uk (Floris Bruynooghe) Date: Mon, 21 Nov 2005 16:41:04 +0000 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051121111430.GB13478@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> <1f7befae0511201755h2cb4bdf8s9c4b8586ee3c530a@mail.gmail.com> <20051121111430.GB13478@code1.codespeak.net> Message-ID: <20051121164104.GA8898@laurie.sheepb.homeip.net> Hello On Mon, Nov 21, 2005 at 12:14:30PM +0100, Armin Rigo wrote: > On Sun, Nov 20, 2005 at 08:55:49PM -0500, Tim Peters wrote: > > We should note that hotshot didn't intend to reduce total time > > overhead. What it's aiming at here is to be less disruptive (than > > profile.py) to the code being profiled _while_ that code is running. > > > hotshot tries to stick with tiny little C functions that pack away a > > tiny amount of data each time, and avoid memory alloc/dealloc, to try > > to minimize this disruption. It looked like it was making real > > progress on this at one time ;-) > > I see the point. I suppose that we can discuss if hotshot is really > nicer on the D cache, as it produces a constant stream of data, whereas > classical profilers like lsprof would in the common case only update a > few counters in existing data structures. I can tweak lsprof a bit > more, though -- there is a malloc on each call, but it could be avoided. > > Still, people generally agree that profile.py, while taking a longer > time overall, gives more meaningful results than hotshot. When I looked into this at the beginning of the summer I could find none around on the net. And since hotshot had been around a lot longer than the new lsprof I just made a conservative choice. > Now Brett's > student, Floris, extended hotshot to allow custom timers. This is > essential, because it enables testing. 
The timing parts of hotshot were > not tested previously. Don't be too enthusiastic here. My aim was to replicate the profile module and thus I needed to hack this into hotshot. However I feel like it is not entirely in hotshot's ideals to do this. The problem is that the call to the timing function is accounted to the code that is being profiled afaik. Since a generic timer interface was needed this means that the call goes out from the C code back to Python and back to whatever-the-timing-function-is-written-in. Thus wrongly accounting even more time to the profiled code (not sure how long execing a python statement takes from a C module). Just keep this in mind. > Given the high correlation between untestedness and brokenness, you bet > that Floris' adapted test_profile for hotshot gives wrong numbers. (My > guess is that Floris overlooked that test_profile was an output test, so > he didn't compare the resulting numbers with the expected ones.) Iirc I did compare the output of test_profile between profile and my wrapper. This was one of my checks to make sure it was wrapped correctly. So could you tell me how they are different? On a stdlib note, one recommended and good working profiler would definitely be better than two or three all with their own quirks. 
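For reference, the pure-Python profiler already exposes the kind of pluggable timer under discussion: `profile.Profile` accepts a timer callable, and substituting a deterministic fake clock is exactly what makes profiler output testable. A minimal sketch (the `fake_timer` and `work` names are purely illustrative):

```python
import io
import profile
import pstats

# Deterministic fake timer: each call advances "time" by one unit, so the
# profiler's recorded times no longer depend on the machine's real clock.
ticks = [0]
def fake_timer():
    ticks[0] += 1
    return ticks[0]

def work():
    return sum(range(100))

p = profile.Profile(timer=fake_timer)
p.runcall(work)

stream = io.StringIO()
pstats.Stats(p, stream=stream).print_stats()
report = stream.getvalue()
```

Because the timer is synthetic, repeated runs record identical call counts and timing columns, which is what makes output-comparison tests of the test_profile style feasible.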
Greetings Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org From fb102 at soton.ac.uk Mon Nov 21 17:45:03 2005 From: fb102 at soton.ac.uk (Floris Bruynooghe) Date: Mon, 21 Nov 2005 16:45:03 +0000 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <20051121111426.GA13478@code1.codespeak.net> References: <20051119180855.GA26733@code1.codespeak.net> <bbaeab100511191612o4877977bn1144c6cba4c4f5a@mail.gmail.com> <20051121111426.GA13478@code1.codespeak.net> Message-ID: <20051121164503.GB8898@laurie.sheepb.homeip.net> On Mon, Nov 21, 2005 at 12:14:26PM +0100, Armin Rigo wrote: > Hi Brett, hi Floris, > > On Sat, Nov 19, 2005 at 04:12:28PM -0800, Brett Cannon wrote: > > Just for everyone's FYI while we are talking about profilers, Floris > > Bruynooghe (who I am cc'ing on this so he can contribute to the > > conversation), for Google's Summer of Code, wrote a replacement for > > 'profile' that uses Hotshot directly. Thanks to his direct use of > > Hotshot and rewrite of pstats it loads Hotshot data 30% faster and > > also alleviates keeping 'profile' around and its slightly questionable > > license. > > Thanks for the note! 30% faster than an incredibly long time is still > quite long, but that's an improvment, I suppose. It is indeed still a long time. But it was more of a secondary aim really. > However, this code is > not ready yet. For example the new loader gives wrong results in the > presence of recursive function calls. Afaik I did test recursive calls etc. I must admit that I don't think anyone else apart from me tested it, which is far from ideal and thus it is bound to still have bugs. Could you provide a test case for this? 
Cheers Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org From amauryfa at gmail.com Tue Nov 22 09:17:00 2005 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 22 Nov 2005 09:17:00 +0100 Subject: [Python-Dev] ast status, memory leaks, etc Message-ID: <e27efe130511220017i29eb0e0bl@mail.gmail.com> Hello, Purify is not so difficult to use: just run and learn to read the output ;-) My config: Win2k using VC6sp5, and only 512Mb RAM. I downloaded the snapshot dated 2005/11/21 05:01, commented out #define WITH_PYMALLOC, built in debug mode, modified the rt.bat file to use purify, and ran "rt -d". Here are the most important results so far : 1 - Memory error in test_coding, while importing bad_coding.py : IPR: Invalid pointer read in tok_nextc {1 occurrence} Reading 1 byte from 0x048af076 (1 byte at 0x048af076 illegal) Address 0x048af076 points into a malloc'd block in unallocated region of heap 0x03120000 Thread ID: 0x718 Error location tok_nextc [tokenizer.c:881] tok_get [tokenizer.c:1104] PyTokenizer_Get [tokenizer.c:1495] parsetok [parsetok.c:125] PyParser_ParseFileFlags [parsetok.c:89] PyParser_ASTFromFile [pythonrun.c:1293] parse_source_module [import.c:778] load_source_module [import.c:905] load_module [import.c:1665] import_submodule [import.c:2259] 2 - Stack overflow in test_compile.test_extended_arg. No need to Purify, the debug build is enough to reproduce the problem. Because of the stack overflow, the test suite stopped. I ran some random tests alone, to get memory leak reports, but there is no significant message so far. Today I'll try the complete test suite, excluding test_compile only. 
-- Amaury From arigo at tunes.org Tue Nov 22 15:35:52 2005 From: arigo at tunes.org (Armin Rigo) Date: Tue, 22 Nov 2005 15:35:52 +0100 Subject: [Python-Dev] s/hotshot/lsprof In-Reply-To: <bbaeab100511212335v7be01235o71f932593b7d3fe0@mail.gmail.com> References: <20051119180855.GA26733@code1.codespeak.net> <4380F572.9040402@v.loewis.de> <dlqtk8$37q$1@sea.gmane.org> <43817375.6040108@v.loewis.de> <20051121114101.GC13478@code1.codespeak.net> <43823C53.8080403@v.loewis.de> <20051122060146.GA14960@code1.codespeak.net> <bbaeab100511212335v7be01235o71f932593b7d3fe0@mail.gmail.com> Message-ID: <20051122143552.GA19036@code1.codespeak.net> Hi Brett, On Mon, Nov 21, 2005 at 11:35:37PM -0800, Brett Cannon wrote: > (any chance we can replace profile with a wrapper to lsprof > without much issue?) Yes. In fact I am thinking about adding lsprof under the module name 'cProfile', to keep true to the (IMHO) good tradition of pickle/cPickle and StringIO/cStringIO. We could also just call it 'profile' and drop the existing profile.py, but I'm not in favor of that. Having pure Python equivalents of our modules is good. When I am in a good mood I am thinking that it would instead be fun to rewrite profile.py to look exactly like lsprof. Not sure pstats would be that much fun, though, and I can't be bothered by license issues too much. Whoever cares can probably derive a pstats replacement from the Summer of Code project. A bientot, Armin. From vinay_sajip at red-dove.com Tue Nov 22 16:17:19 2005 From: vinay_sajip at red-dove.com (Vinay Sajip) Date: Tue, 22 Nov 2005 15:17:19 -0000 Subject: [Python-Dev] Proposed additional keyword argument in logging calls Message-ID: <001a01c5ef77$d7682300$0200a8c0@alpha> On numerous occasions, requests have been made for the ability to easily add user-defined data to logging events. For example, a multi-threaded server application may want to output specific information to a particular server thread (e.g. 
the identity of the client, specific protocol options for the client connection, etc.) This is currently possible, but you have to subclass the Logger class and override its makeRecord method to put custom attributes in the LogRecord. These can then be output using a customised format string containing e.g. "%(foo)s %(bar)d". The approach is usable but requires more work than necessary. I'd like to propose a simpler way of achieving the same result, which requires use of an additional optional keyword argument in logging calls. The signature of the (internal) Logger._log method would change from def _log(self, level, msg, args, exc_info=None) to def _log(self, level, msg, args, exc_info=None, extra_info=None) The extra_info argument will be passed to Logger.makeRecord, whose signature will change from def makeRecord(self, name, level, fn, lno, msg, args, exc_info): to def makeRecord(self, name, level, fn, lno, msg, args, exc_info, extra_info) makeRecord will, after doing what it does now, use the extra_info argument as follows: If type(extra_info) != types.DictType, it will be ignored. Otherwise, any entries in extra_info whose keys are not already in the LogRecord's __dict__ will be added to the LogRecord's __dict__. Can anyone see any problems with this approach? If not, I propose to post the approach on python-list and then if there are no strong objections, check it in to the trunk. (Since it could break existing code, I'm assuming (please correct me if I'm wrong) that it shouldn't go into the release24-maint branch.) 
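For context, this is essentially the mechanism that later shipped in the logging package, with the argument spelled `extra` rather than `extra_info`. A sketch of the usage pattern it enables (the `clientip` and `user` fields are illustrative):

```python
import io
import logging

# Custom attributes supplied per call end up on the LogRecord, so the
# formatter can reference them just like the built-in fields.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(clientip)s %(user)s: %(message)s"))
logger = logging.getLogger("demo-extra")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("connection refused",
            extra={"clientip": "192.168.0.1", "user": "fbloggs"})
```

No Logger subclass or makeRecord override is needed; the per-call dictionary is merged into the record's __dict__ exactly as proposed above.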
Of course, if anyone can suggest a better way of doing it, I'm all ears :-) Regards, Vinay Sajip From nnorwitz at gmail.com Tue Nov 22 19:13:13 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 22 Nov 2005 10:13:13 -0800 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <e27efe130511220017i29eb0e0bl@mail.gmail.com> References: <e27efe130511220017i29eb0e0bl@mail.gmail.com> Message-ID: <ee2a432c0511221013s75f52ab0he98e36860f7a3649@mail.gmail.com> On 11/22/05, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote: > Hello, > > Purify is not so difficult to use: just run and learn to read the output ;-) Amaury, Thank you for running Purify. > 1 - Memory error in test_coding, while importing bad_coding.py : > IPR: Invalid pointer read in tok_nextc {1 occurrence} There is a patch for this on SourceForge. It's pretty new. > Because of the stack overflow, the test suite stopped. I ran some > random tests alone, to get memory leak reports, but there is no > significant message so far. > Today I'll try the complete test suite, excluding test_compile only. Great. Thanks! n From bcannon at gmail.com Tue Nov 22 20:31:34 2005 From: bcannon at gmail.com (Brett Cannon) Date: Tue, 22 Nov 2005 11:31:34 -0800 Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to 2005-10-31 In-Reply-To: <20051122153038.030c8586.simon@arrowtheory.com> References: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com> <20051122153038.030c8586.simon@arrowtheory.com> Message-ID: <bbaeab100511221131n39da6ad2q3604365b09d2e45@mail.gmail.com> On 11/21/05, Simon Burton <simon at arrowtheory.com> wrote: > On Thu, 17 Nov 2005 13:36:36 +1300 > Tony Meyer <tony.meyer at gmail.com> wrote: > > > > > -------------- > > AST for Python > > -------------- > > > > As of October 21st, Python's compiler now uses a real Abstract Syntax > > Tree (AST)! 
This should make experimenting with new syntax much > easier, as well as allowing some optimizations that were difficult > with the previous Concrete Syntax Tree (CST). > > > While there is no > > Python interface to the AST yet, one is intended for the not-so- > > distant future. > > OK, who is doing this ? I am mad keen to get this happening. > No one yet. Some ideas have been tossed around (read the thread for details), but no one has sat down to hammer out the details. Might happen at PyCon. -Brett From barbieri at gmail.com Tue Nov 22 20:48:38 2005 From: barbieri at gmail.com (Gustavo Sverzut Barbieri) Date: Tue, 22 Nov 2005 17:48:38 -0200 Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com> <ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com> <438048B6.2030103@v.loewis.de> <ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> Message-ID: <9ef20ef30511221148g905deefo548a8fb3e68a08ae@mail.gmail.com> On 11/20/05, Neal Norwitz <nnorwitz at gmail.com> wrote: > Thanks I was going to look into the resizing and forgot about it. > Running without pymalloc confirmed that there weren't more serious > problems. At least with gentoo's Python 2.4.2, I get a bunch of errors from invalid reads and jumps/moves that depend on uninitialized values in PyObject_Free(). Running: valgrind --leak-check=full --leak-resolution=high --show-reachable=yes python -c "pass" 2> ~/python-2.4.2-valgrind.log gives me the attached log file. -- Gustavo Sverzut Barbieri -------------------------------------- Computer Engineer 2001 - UNICAMP Mobile: +55 (19) 9165 8010 Phone: +1 (347) 624 6296 @ sip.stanaphone.com Jabber: gsbarbieri at jabber.org ICQ#: 17249123 MSN: barbieri at gmail.com GPG: 0xB640E1A2 @ wwwkeys.pgp.net -------------- next part -------------- A non-text attachment was scrubbed... 
Name: python-2.4.2-valgrind.log.bz2 Type: application/x-bzip2 Size: 4080 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051122/76707dd5/python-2.4.2-valgrind.log-0001.bin From fredrik at pythonware.com Tue Nov 22 20:55:52 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 22 Nov 2005 20:55:52 +0100 Subject: [Python-Dev] ast status, memory leaks, etc References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com><ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com><438048B6.2030103@v.loewis.de><ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> <9ef20ef30511221148g905deefo548a8fb3e68a08ae@mail.gmail.com> Message-ID: <dlvt41$cvl$1@sea.gmane.org> Gustavo Sverzut Barbieri wrote: > At least with gentoo's Python 2.4.2, I get a bunch of errors from > invalid reads and jumps/moves that depends on unitialized values in > PyObject_Free(). > > Running: > > valgrind --leak-check=full --leak-resolution=high --show-reachable=yes > python -c "pass" 2> ~/python-2.4.2-valgrind.log did you read the instructions ? $ more Misc/README.valgrind http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Misc/README.valgrind?view=markup </F> From p.f.moore at gmail.com Tue Nov 22 21:10:36 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 22 Nov 2005 20:10:36 +0000 Subject: [Python-Dev] Proposed additional keyword argument in logging calls In-Reply-To: <001a01c5ef77$d7682300$0200a8c0@alpha> References: <001a01c5ef77$d7682300$0200a8c0@alpha> Message-ID: <79990c6b0511221210g47bc10eas2531726871da92ba@mail.gmail.com> On 11/22/05, Vinay Sajip <vinay_sajip at red-dove.com> wrote: > makeRecord will, after doing what it does now, use the extra_info argument > as follows: > > If type(extra_info) != types.DictType, it will be ignored. > > Otherwise, any entries in extra_info whose keys are not already in the > LogRecord's __dict__ will be added to the LogRecord's __dict__. 
> > Can anyone see any problems with this approach? I'd suggest that you raise an error if extra_info doesn't act like a dictionary - probably, just try to add its entries and let any error pass back to the caller. You definitely want to allow dict subclasses, and anything that acts like a dictionary. And you want to catch errors like log(..., extra_info = "whatever") with a format of "... %(extra_info)s..." (ie, assuming that extra_info is a single value - it's what I expected you to propose when I started reading). The rest looks good (I don't have a need for it myself, but it looks like a nice, clean solution to the problem you describe). Paul. From reinhold-birkenfeld-nospam at wolke7.net Tue Nov 22 22:47:37 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue, 22 Nov 2005 22:47:37 +0100 Subject: [Python-Dev] something is wrong with test___all__ Message-ID: <dm03lq$41u$1@sea.gmane.org> Hi, on my machine, "make test" hangs at test_colorsys. Careful investigation shows that when the bytecode is freshly generated by "make all" (precisely in test___all__) the .pyc file is different from what a direct call to "regrtest.py test_colorsys" produces. Curiously, a call to "regrtest.py test___all__" instead of "make test" produces the correct bytecode. I can only suspect some AST bug here. Reinhold -- Mail address is perfectly valid! 
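One inexpensive way to probe a discrepancy like the one Reinhold describes is to take the interpreter's file handling out of the picture and compare marshalled code objects directly: if two compilations of identical source within one process disagree, the compiler itself is at fault. A sketch (using `colorsys` only because it is the module named above):

```python
import colorsys
import inspect
import marshal

# Compile the same source twice and compare the serialized code objects.
# Any difference here would point at non-deterministic compilation rather
# than at how the .pyc file was written.
src = inspect.getsource(colorsys)
code_a = compile(src, "colorsys.py", "exec")
code_b = compile(src, "colorsys.py", "exec")
same = marshal.dumps(code_a) == marshal.dumps(code_b)
```

If this check passes but the on-disk .pyc files still differ, the problem lies in the surrounding import/compile state rather than in the compiler proper.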
From simon at arrowtheory.com Wed Nov 23 10:15:29 2005 From: simon at arrowtheory.com (Simon Burton) Date: Wed, 23 Nov 2005 09:15:29 +0000 Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to 2005-10-31 In-Reply-To: <bbaeab100511221131n39da6ad2q3604365b09d2e45@mail.gmail.com> References: <D716D004-B827-4CB4-913B-ECE61118FF0A@gmail.com> <20051122153038.030c8586.simon@arrowtheory.com> <bbaeab100511221131n39da6ad2q3604365b09d2e45@mail.gmail.com> Message-ID: <20051123091529.6b5ae4d7.simon@arrowtheory.com> On Tue, 22 Nov 2005 11:31:34 -0800 Brett Cannon <bcannon at gmail.com> wrote: > > On 11/21/05, Simon Burton <simon at arrowtheory.com> wrote: > > On Thu, 17 Nov 2005 13:36:36 +1300 > > Tony Meyer <tony.meyer at gmail.com> wrote: > > > > > > > > -------------- > > > AST for Python > > > -------------- > > > > > > As of October 21st, Python's compiler now uses a real Abstract Syntax > > > Tree (AST)! This should make experimenting with new syntax much > > > easier, as well as allowing some optimizations that were difficult > > > with the previous Concrete Syntax Tree (CST). > > > > > While there is no > > > Python interface to the AST yet, one is intended for the not-so- > > > distant future. > > > > OK, who is doing this ? I am mad keen to get this happening. > > > > No one yet. Some ideas have been tossed around (read the thread for > details), but no one has sat down to hammer out the details. Might > happen at PyCon. > > -Brett Yes, I've been reading the threads but I don't see anything about a Python interface. Why I'm asking is because I could probably convince my employer to let me (or an intern) work on it. And PyCon is not until February. I am likely to start hacking on this before then. Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 
61 02 6249 6940 http://arrowtheory.com From steven.bethard at gmail.com Tue Nov 22 23:30:07 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 22 Nov 2005 15:30:07 -0700 Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...) Message-ID: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> I wrote (in the summary): > While there is no interface to the AST yet, one is > intended for the not-so-distant future. Simon Burton wrote: > who is doing this ? I am mad keen to get this happening. Brett Cannon wrote: > No one yet. Some ideas have been tossed around (read the thread for > details), but no one has sat down to hammer out the details. Might > happen at PyCon. Simon Burton wrote: > Yes, i've been reading the threads but I don't see anything > about a python interface. Why I'm asking is because I could > probably convince my employer to let me (or an intern) work > on it. And pycon is not until febuary. I am likely to start > hacking on this before then. Basically, all I saw was your post asking for a Python interface[1], and a few "not yet" responses. I suspect that if you were to volunteer to head up the work on the Python interface, no one would be likely to stop you. ;-) [1]http://mail.python.org/pipermail/python-dev/2005-October/057611.html Steve -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From bcannon at gmail.com Wed Nov 23 01:02:40 2005 From: bcannon at gmail.com (Brett Cannon) Date: Tue, 22 Nov 2005 16:02:40 -0800 Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...) 
In-Reply-To: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> Message-ID: <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> On 11/22/05, Steven Bethard <steven.bethard at gmail.com> wrote: > I wrote (in the summary): > > While there is no interface to the AST yet, one is > > intended for the not-so-distant future. > > Simon Burton wrote: > > who is doing this ? I am mad keen to get this happening. > > Brett Cannon wrote: > > No one yet. Some ideas have been tossed around (read the thread for > > details), but no one has sat down to hammer out the details. Might > > happen at PyCon. > > Simon Burton wrote: > > Yes, i've been reading the threads but I don't see anything > > about a python interface. Why I'm asking is because I could > > probably convince my employer to let me (or an intern) work > > on it. And pycon is not until febuary. I am likely to start > > hacking on this before then. > > Basically, all I saw was your post asking for a Python interface[1], > and a few "not yet" responses. I suspect that if you were to > volunteer to head up the work on the Python interface, no one would be > likely to stop you. ;-) > > [1]http://mail.python.org/pipermail/python-dev/2005-October/057611.html > All of the discussion has just been "we hope to have it some day" with no real planning. =) There are two problems to this topic; how to get the AST structs into Python objects and how to allow Python code to modify the AST before bytecode emission (or perhaps even after for in-place optimization). To get the AST into Python objects, there are two options. One is to use the AST grammar to generate struct -> serialized form -> Python objects and vice-versa. There might be some rough code already there in the form of emitting a string that looks like Scheme code that represents the AST. 
Then Python code could use that to make up objects, manipulate, translate back into its serialized form, and then back into the AST structs. It sounds like a lot but with the grammar right there it should be an automated generation of code to make. The other option is to have all AST structs be contained in PyObjects. Neil suggested this for the whole memory problem since we could then just do proper refcounting and we all know how to do that (supposedly =) . With that then all it is to get access is to pass the PyObject of the root out and make sure that the proper attributes or accessor methods (I prefer the former) are available. Once again this can be auto-generated from the AST grammar. The second problem is where to give access to the AST from within Python. One place is the command-line. One could be able to specify the path to function objects (using import syntax, e.g., ``optimizations.static.folding``) on the command-line that are always applied to all generated bytecode. Another possibility is to have an iterable in sys that is iterated over every time something has bytecode generated. Each call to the iterator would return a function that took in an AST object and returned an AST object. Another possibility is to have a function (like ``ast()`` as a built-in) to pass in a code object and then have the AST returned for that code object. If a function was provided that took an AST and returned the bytecode then selective AST access can be given instead of applying across the board (this could allow for decorators that performed AST optimizations or even hotshot stuff). Obviously this is all pie-in-the-sky stuff. Getting the memory leak situation resolved is a bigger priority in my mind than any of this. But if I had my way I think that having all AST objects be PyObjects and then providing support for all three ways of getting access to the AST (command-line, sys iterable, function for specific code object) would be fantastic. 
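Much of what is sketched here is recognizable as what eventually became the stdlib `ast` module: parse source into a tree of Python objects, optionally transform it, and compile the result back to bytecode. A sketch in those later terms (the constant-folding pass is illustrative only):

```python
import ast

# Source -> AST: the Python-level access to the tree discussed above.
source = "x = 1 + 2\n"
tree = ast.parse(source)

# A trivial transformation pass: fold an addition of two constants.
class FoldAdd(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            return ast.copy_location(
                ast.Constant(node.left.value + node.right.value), node)
        return node

# Transformed AST -> bytecode, then execute it.
tree = ast.fix_missing_locations(FoldAdd().visit(tree))
code = compile(tree, "<demo>", "exec")
ns = {}
exec(code, ns)
```

After execution, `ns["x"]` holds 3, computed from the folded constant rather than at runtime.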
-Brett From nnorwitz at gmail.com Wed Nov 23 02:48:33 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 22 Nov 2005 17:48:33 -0800 Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...) In-Reply-To: <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> Message-ID: <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> On 11/22/05, Brett Cannon <bcannon at gmail.com> wrote: > > But if I had my way I think that having all AST objects be PyObjects > and then providing support for all three ways of getting access to the > AST (command-line, sys iterable, function for specific code object) > would be fantastic. There needs to be a function that takes a filename (or string of code) and returns an AST. Hmm, it would be nice to give a function a module name (like from an import statement) and have Python resolve it using the normal sys.path iteration. n From bcannon at gmail.com Wed Nov 23 03:32:59 2005 From: bcannon at gmail.com (Brett Cannon) Date: Tue, 22 Nov 2005 18:32:59 -0800 Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...) In-Reply-To: <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> Message-ID: <bbaeab100511221832j7939e3e2wdd3a7bff42d4765a@mail.gmail.com> On 11/22/05, Neal Norwitz <nnorwitz at gmail.com> wrote: > On 11/22/05, Brett Cannon <bcannon at gmail.com> wrote: > > > > But if I had my way I think that having all AST objects be PyObjects > > and then providing support for all three ways of getting access to the > > AST (command-line, sys iterable, function for specific code object) > > would be fantastic. 
> > There needs to be a function that takes a filename (or string of code) > and returns an AST. "Yes" and "I guess". =) I can see taking a filename being useful for checking a module, for stuff like PyChecker. But for a string of code, I don't think it would be that critical; if you provide a way to get the AST for a code object you can just pass the string to compile() and then get the AST from there. > Hmm, it would be nice to give a function a module > name (like from an import statement) and have Python resolve it using > the normal sys.path iteration. > Yep, import path -> filename path would be cool. -Brett From pje at telecommunity.com Wed Nov 23 03:58:58 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 22 Nov 2005 21:58:58 -0500 Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...) In-Reply-To: <bbaeab100511221832j7939e3e2wdd3a7bff42d4765a@mail.gmail.com> References: <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> Message-ID: <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com> At 06:32 PM 11/22/2005 -0800, Brett Cannon wrote: > > Hmm, it would be nice to give a function a module > > name (like from an import statement) and have Python resolve it using > > the normal sys.path iteration. > > > >Yep, import path -> filename path would be cool. Zipped and frozen modules don't have filename paths, so I'd personally rather see fewer stdlib modules making the assumption that modules are files. Instead, extensions to the PEP 302 loader protocol should be used to support introspection, assuming there aren't already equivalent capabilities available. For example, PEP 302 allows a 'get_source()' method on loaders, and I believe the zipimport loader supports that. (I don't know about frozen modules.) 
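[Editorial note: the zipimport support Phillip mentions can be checked directly. This throwaway-archive sketch (the module name and contents are made up for the illustration) exercises the PEP 302 'get_source()' loader method.]

```python
import os
import tempfile
import zipfile
import zipimport

# Build a one-module zip archive to import from.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "demo.zip")
source = "GREETING = 'hello'\n"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demomod.py", source)

# zipimport's loader implements the optional get_source() method.
importer = zipimport.zipimporter(archive)
retrieved = importer.get_source("demomod")
```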
The main barrier to this being really usable is the absence of loader objects for the built-in import process. This was proposed by PEP 302, but never actually implemented, probably due to time constraints on the Python 2.3 release schedule. It's relatively easy to implement this "missing loader class" in Python, though, and in fact the PEP 302 regression test in the stdlib does exactly that. Some work, however, would be required to port this to C and expose it from an appropriate module (imp?). From krumms at gmail.com Wed Nov 23 09:44:27 2005 From: krumms at gmail.com (Thomas Lee) Date: Wed, 23 Nov 2005 18:44:27 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <dll2v3$78g$1@sea.gmane.org> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> Message-ID: <43842BEB.5000406@gmail.com> Neil Schemenauer wrote: >Fredrik Lundh <fredrik at pythonware.com> wrote: > > >>Thomas Lee wrote: >> >> >> >>>Even if it meant we had just one function call - one, safe function call >>>that deallocated all the memory allocated within a function - that we >>>had to put before each and every return, that's better than what we >>>have. >>> >>> >>alloca? >> >> > >Perhaps we should use the memory management technique that the rest >of Python uses: reference counting. I don't see why the AST >structures couldn't be PyObjects. > > Neil > > > I'm +1 for reference counting. 
It's going to be a little error prone initially (certainly much less error prone than the current system in the long run), but the pooling/arena idea is going to screw with all sorts of stuff within the AST and possibly in bits of Python/compile.c too. At least, all my attempts wound up looking that way :) Cheers, Tom >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/krumms%40gmail.com > > > From ncoghlan at gmail.com Wed Nov 23 14:51:59 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 23 Nov 2005 23:51:59 +1000 Subject: [Python-Dev] PEP 302, PEP 338 and imp.getloader (was Re: a Python interface for the AST (WAS: DRAFT: python-dev...) In-Reply-To: <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com> References: <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com> Message-ID: <438473FF.8020107@gmail.com> Phillip J. Eby wrote: > At 06:32 PM 11/22/2005 -0800, Brett Cannon wrote: >>> Hmm, it would be nice to give a function a module >>> name (like from an import statement) and have Python resolve it using >>> the normal sys.path iteration. >>> >> Yep, import path -> filename path would be cool. > > Zipped and frozen modules don't have filename paths, so I'd personally > rather see fewer stdlib modules making the assumption that modules are > files. Instead, extensions to the PEP 302 loader protocol should be used > to support introspection, assuming there aren't already equivalent > capabilities available. For example, PEP 302 allows a 'get_source()' > method on loaders, and I believe the zipimport loader supports that. 
(I > don't know about frozen modules.) > > The main barrier to this being really usable is the absence of loader > objects for the built-in import process. This was proposed by PEP 302, but > never actually implemented, probably due to time constraints on the Python > 2.3 release schedule. > > It's relatively easy to implement this "missing loader class" in Python, > though, and in fact the PEP 302 regression test in the stdlib does exactly > that. Some work, however, would be required to port this to C and expose > it from an appropriate module (imp?). Prompted by this, I finally got around to reading PEP 302 to see how it related to PEP 338 (which is intended to fix the current limitations of the '-m' switch by providing a Python fallback when the basic C code can't find the module to be run). The key thing that is missing is the "imp.getloader" functionality discussed at the end of PEP 302. Using that functionality and the exec statement, PEP 338 could easily be modified to support any module accessed via a loader which supports get_code() (and it could probably also get rid of all of the current cruft dealing with normal filesystem packages). So with that in mind, I'm thinking of updating PEP 338 to propose the following:

1. A new pure Python module called "runpy"

2. A function called "runpy.execmodule" that is very similar to execfile, but takes a module reference instead of a filename. It will NOT support modification of the caller's namespace (based on recent discussions regarding the exec statement). argv[0] and the name __file__ in the execution dictionary will be set to the file name for real files (those of type PY_SOURCE or PY_COMPILED), and the module reference otherwise. An optional argument will permit argv[0] (and __file__) to be forced to a specific value.**

3. A function called "runpy.get_source" that, given a module reference, retrieves the source code for that module via loader.get_source()

4. A function called "runpy.get_code" that, given a module reference, retrieves the code object for that module via loader.get_code()

5. A function called "runpy.is_runnable" that, given a module reference, determines if execmodule will work on that module (e.g. by checking that the loader provides the get_code method, that loader.is_package returns false, etc)

6. If invoked as a script, runpy interprets argv[1] as the module to run

7. If the '-m' switch fails to find a module, it invokes runpy as a fallback.

To make PEP 338 independent of the C implementation of imp.getloader for PEP 302 being finished, it would propose two private elements in runpy: runpy._getloader and runpy._StandardImportMetaHook. If imp.getloader was available, it would be assigned to runpy._getloader, otherwise runpy would fall back on the Python equivalents.

** I'm open to suggestions on how to deal with argv[0] and __file__. They should be set to whatever __file__ would be set to by the module loader, but the Importer Protocol in PEP 302 doesn't seem to expose that information. The current proposal is a compromise that matches the existing behaviour of -m (which supports scripts like regrtest.py) while still giving a meaningful value for scripts which are not part of the normal filesystem. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From pj at place.org Wed Nov 23 06:04:55 2005 From: pj at place.org (Paul Jimenez) Date: Tue, 22 Nov 2005 23:04:55 -0600 Subject: [Python-Dev] urlparse brokenness Message-ID: <20051123050455.9010E7FBF@place.org> It is my assertion that urlparse is currently broken. Specifically, I think that urlparse breaks an abstraction boundary with ill effect. In writing a mailclient, I wished to allow my users to specify their imap server as a url, such as 'imap://user:password at host:port/'. Which worked fine. 
I then thought that the natural extension to support configuration of imapssl would be 'imaps://user:password at host:port/'.... which failed - user:password at host:port got parsed as the *path* of the URL instead of the network location. It turns out that urlparse keeps a table of url schemes that 'use netloc'... that is to say, that have a 'user:password at host:port' part to their URL. I think this 'special knowledge' about particular schemes 1) breaks an abstraction boundary by having a function whose charter is to pull apart a particularly-formatted string behave differently based on the meaning of the string instead of the structure of it, and 2) fails to be extensible or forward compatible due to hardcoded 'magic' strings - if schemes were somehow 'registerable' as 'netloc using' or not, then this objection might be nullified, but the previous objection would still stand. So I propose that urlsplit, the main offender, be replaced with something that looks like:

def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')):
    """Parse a URL into 5 components:
    <scheme>://<netloc>/<path>?<query>#<fragment>
    Return a 5-tuple: (scheme, netloc, path, query, fragment).
    Note that we don't break the components up in smaller bits
    (e.g. netloc is a single string) and we don't expand % escapes."""
    key = url, scheme, allow_fragments, default
    cached = _parse_cache.get(key, None)
    if cached:
        return cached
    if len(_parse_cache) >= MAX_CACHE_SIZE: # avoid runaway growth
        clear_cache()
    if "://" in url:
        uscheme, npqf = url.split("://", 1)
    else:
        uscheme = scheme
        if not uscheme:
            uscheme = default[0]
        npqf = url
    pathidx = npqf.find('/')
    if pathidx == -1: # not found
        netloc = npqf
        path, query, fragment = default[1:4]
    else:
        netloc = npqf[:pathidx]
        pqf = npqf[pathidx:]
        if '?' in pqf:
            path, qf = pqf.split('?', 1)
        else:
            path, qf = pqf, ''.join(default[3:5])
        if ('#' in qf) and allow_fragments:
            query, fragment = qf.split('#', 1)
        else:
            query, fragment = default[3:5]
    tuple = (uscheme, netloc, path, query, fragment)
    _parse_cache[key] = tuple
    return tuple

Note that I'm not sold on the _parse_cache, but I'm assuming it was there for a reason so I'm leaving that functionality as-is. If this isn't the right forum for this discussion, or the right place to submit code, please let me know. Also, please cc: me directly on responses as I'm not subscribed to the firehose that is python-dev. --pj From aahz at pythoncraft.com Wed Nov 23 17:55:29 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed, 23 Nov 2005 08:55:29 -0800 Subject: [Python-Dev] urlparse brokenness In-Reply-To: <20051123050455.9010E7FBF@place.org> References: <20051123050455.9010E7FBF@place.org> Message-ID: <20051123165529.GA4322@panix.com> On Tue, Nov 22, 2005, Paul Jimenez wrote: > > If this isn't the right forum for this discussion, or the right place > to submit code, please let me know. Also, please cc: me directly on > responses as I'm not subscribed to the firehose that is python-dev. This is the right forum for discussion. You should post your patch to SourceForge *before* starting a discussion on python-dev, including a link to the patch in your post. It is not essential, but it is certainly a courtesy to subscribe to python-dev for the duration of the discussion; you can feel free to filter threads you're not interested in. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From pje at telecommunity.com Wed Nov 23 19:25:44 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 23 Nov 2005 13:25:44 -0500 Subject: [Python-Dev] PEP 302, PEP 338 and imp.getloader (was Re: a Python interface for the AST (WAS: DRAFT: python-dev...) 
In-Reply-To: <438473FF.8020107@gmail.com> References: <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com> <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051123131857.03aba388@mail.telecommunity.com> At 11:51 PM 11/23/2005 +1000, Nick Coghlan wrote: >The key thing that is missing is the "imp.getloader" functionality discussed >at the end of PEP 302. This isn't hard to implement per se; setuptools for example has a 'get_importer' function, and going from importer to loader is simple:

def get_importer(path_item):
    """Retrieve a PEP 302 "importer" for the given path item

    If there is no importer, this returns a wrapper around the
    builtin import machinery. The returned importer is only
    cached if it was created by a path hook.
    """
    try:
        importer = sys.path_importer_cache[path_item]
    except KeyError:
        for hook in sys.path_hooks:
            try:
                importer = hook(path_item)
            except ImportError:
                pass
            else:
                break
        else:
            importer = None
        sys.path_importer_cache.setdefault(path_item, importer)
    if importer is None:
        try:
            importer = ImpWrapper(path_item)
        except ImportError:
            pass
    return importer

So with the above function you could do something like:

def get_loader(fullname, path):
    for path_item in path:
        try:
            loader = get_importer(path_item).find_module(fullname)
            if loader is not None:
                return loader
        except ImportError:
            continue
    else:
        return None

in order to implement the rest. >** I'm open to suggestions on how to deal with argv[0] and __file__. They >should be set to whatever __file__ would be set to by the module loader, but >the Importer Protocol in PEP 302 doesn't seem to expose that information. 
The >current proposal is a compromise that matches the existing behaviour of -m >(which supports scripts like regrtest.py) while still giving a meaningful >value for scripts which are not part of the normal filesystem. Ugh. Those are tricky, no question. I can think of several simple answers for each, all of which are wrong in some way. :) From greg.ewing at canterbury.ac.nz Thu Nov 24 04:47:02 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 24 Nov 2005 16:47:02 +1300 Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...) In-Reply-To: <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> Message-ID: <438537B6.7020009@canterbury.ac.nz> Brett Cannon wrote: > There are two problems to this topic; how to > get the AST structs into Python objects and how to allow Python code > to modify the AST before bytecode emission I'm astounded to hear that the AST isn't made from Python objects in the first place. Is there a particular reason it wasn't done that way? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From mike at skew.org Thu Nov 24 06:38:39 2005 From: mike at skew.org (Mike Brown) Date: Wed, 23 Nov 2005 22:38:39 -0700 (MST) Subject: [Python-Dev] urlparse brokenness In-Reply-To: <20051123050455.9010E7FBF@place.org> Message-ID: <200511240538.jAO5cdb8012274@chilled.skew.org> Paul Jimenez wrote: > So I propose that urlsplit, the main offender, be replaced with something > that looks like: > > def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')): +1 in principle. 
You should probably do a global _parse_cache and add 'is not None' after 'if cached'. From bcannon at gmail.com Thu Nov 24 07:52:12 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 23 Nov 2005 22:52:12 -0800 Subject: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...) In-Reply-To: <438537B6.7020009@canterbury.ac.nz> References: <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> <438537B6.7020009@canterbury.ac.nz> Message-ID: <bbaeab100511232252l40892b56vc348a5b899accef4@mail.gmail.com> On 11/23/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > Brett Cannon wrote: > > > There are two problems to this topic; how to > > get the AST structs into Python objects and how to allow Python code > > to modify the AST before bytecode emission > > I'm astounded to hear that the AST isn't made from > Python objects in the first place. Is there a particular > reason it wasn't done that way? > I honestly don't know, Greg. All of the structs are generated by Parser/asdl_c.py which reads in the AST definition from Parser/Python.asdl . The code that is used to allocate and initialize the structs is in Python/Python-ast.c and is also auto-generated by Parser/asdl_c.py . I am guessing here, but it might have to do with type safety. Some nodes can represent different kinds of subnodes (like the stmt node) and thus are created using a single struct and a bunch of unions internally. So there is some added security that stuff is being done correctly. Otherwise memory is the only other reason I can think of. Or Jeremy just didn't think of doing it that way when this was all started years ago. =) But since it is all auto-generated it should be doable to make them Python objects. 
-Brett From martin at v.loewis.de Thu Nov 24 10:01:14 2005 From: martin at v.loewis.de (Martin v. Löwis) Date: Thu, 24 Nov 2005 10:01:14 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437ADDCF.7080906@ee.byu.edu> References: <437ADDCF.7080906@ee.byu.edu> Message-ID: <4385815A.5090705@v.loewis.de> Travis Oliphant wrote: > In the long term, what is the status of plans to re-work the Python > Memory manager to free memory that it acquires (or improve the detection > of already freed memory locations). The Python memory manager does reuse memory that has been deallocated earlier. There are patches "floating around" that make it return unused memory to the system (which it currently doesn't). Regards, Martin From martin at v.loewis.de Thu Nov 24 10:06:59 2005 From: martin at v.loewis.de (Martin v. Löwis) Date: Thu, 24 Nov 2005 10:06:59 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437BC524.2030105@ee.byu.edu> References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu> <437BC524.2030105@ee.byu.edu> Message-ID: <438582B3.80204@v.loewis.de> Travis Oliphant wrote: > As verified by removing usage of the Python PyObject_MALLOC function, it > was the Python memory manager that was performing poorly. Even though > the array-scalar objects were deleted, the memory manager would not > re-use their memory for later object creation. Instead, the memory > manager kept allocating new arenas to cover the load (when it should > have been able to re-use the old memory that had been freed by the > deleted objects--- again, I don't know enough about the memory manager > to say why this happened). 
One way (I think the only way) this could happen is if:

- the objects being allocated are all smaller than 256 bytes
- when allocating new objects, the requested size was different from any other size previously deallocated.

So if you first allocate 1,000,000 objects of size 200, and then release them, and then allocate 1,000,000 objects of size 208, the memory is not reused. If the objects are all of the same size, or all larger than 256 bytes, this effect does not occur. Regards, Martin From martin at v.loewis.de Thu Nov 24 10:14:41 2005 From: martin at v.loewis.de (Martin v. Löwis) Date: Thu, 24 Nov 2005 10:14:41 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437C54AA.9020203@ee.byu.edu> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> Message-ID: <43858481.5060202@v.loewis.de> Travis Oliphant wrote: > So, I now believe that his code (plus the array scalar extension type) > was actually exposing a real bug in the memory manager itself. In > theory, the Python memory manager should have been able to re-use the > memory for the array-scalar instances because they are always the same > size. In practice, the memory was apparently not being re-used but > instead new blocks were being allocated to handle the load. That is really very hard to believe. Most people on this list would probably agree that obmalloc certainly *will* reuse deallocated memory if the next request is for the very same size (number of bytes) that the previously-released object had. > His code is quite complicated and it is difficult to replicate the > problem. That the code is complex would not so much be a problem: we often analyse complex code here. It is a problem that the code is not available, and it would be a problem if the problem was not reproducible even if you had the code (i.e. 
if the problem would sometimes occur, but not the next day when you ran it again). So if you can, please post the code somewhere, and add a bug report on sf.net/projects/python. Regards, Martin From fredrik at pythonware.com Thu Nov 24 10:19:31 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 24 Nov 2005 10:19:31 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager References: <20051116120346.A434.JCARLSON@uci.edu><dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu><437BC524.2030105@ee.byu.edu> <438582B3.80204@v.loewis.de> Message-ID: <dm40j6$bn7$1@sea.gmane.org> Martin v. Löwis wrote: > One way (I think the only way) this could happen is if: > - the objects being allocated are all smaller than 256 bytes > - when allocating new objects, the requested size was different > from any other size previously deallocated. > > So if you first allocate 1,000,000 objects of size 200, and then > release them, and then allocate 1,000,000 objects of size 208, > the memory is not reused. > > If the objects are all of the same size, or all larger than 256 bytes, > this effect does not occur. but the allocator should be able to move empty pools between size classes via the freepools list, right? or am I missing something? maybe what's happening here is more like: So if you first allocate 1,000,000 objects of size 200, and then release most of them, and then allocate 1,000,000 objects of size 208, all memory might not be reused. ? </F> From robert.kern at gmail.com Thu Nov 24 10:59:57 2005 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Nov 2005 01:59:57 -0800 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <43858481.5060202@v.loewis.de> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de> Message-ID: <dm42uu$i4m$1@sea.gmane.org> Martin v. 
Löwis wrote: > That the code is complex would not so much be a problem: we often > analyse complex code here. It is a problem that the code is not > available, and it would be a problem if the problem was not > reproducible even if you had the code (i.e. if the problem would > sometimes occur, but not the next day when you ran it again). You can get the version of scipy_core just before the fix that Travis applied: svn co -r 1488 http://svn.scipy.org/svn/scipy_core/trunk The fix: http://projects.scipy.org/scipy/scipy_core/changeset/1489 http://projects.scipy.org/scipy/scipy_core/changeset/1490 Here's some code that eats up memory with rev1488, but not with the HEAD:

"""
import scipy
a = scipy.arange(10)
for i in xrange(10000000):
    x = a[5]
"""

-- Robert Kern robert.kern at gmail.com "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From abo at minkirri.apana.org.au Thu Nov 24 11:09:34 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Thu, 24 Nov 2005 10:09:34 +0000 Subject: [Python-Dev] urlparse brokenness In-Reply-To: <20051123050455.9010E7FBF@place.org> References: <20051123050455.9010E7FBF@place.org> Message-ID: <1132826974.24108.6.camel@warna.corp.google.com> On Tue, 2005-11-22 at 23:04 -0600, Paul Jimenez wrote: > It is my assertion that urlparse is currently broken. Specifically, I > think that urlparse breaks an abstraction boundary with ill effect. > > In writing a mailclient, I wished to allow my users to specify their > imap server as a url, such as 'imap://user:password at host:port/'. Which > worked fine. I then thought that the natural extension to support FWIW, I have a small addition related to this that I think would be handy to add to the urlparse module. It is a pair of functions "netlocparse()" and "netlocunparse()" that is for parsing and unparsing "user:password at host:port" netlocs. Feel free to use/add/ignore it... 
http://minkirri.apana.org.au/~abo/projects/osVFS/netlocparse.py -- Donovan Baarda <abo at minkirri.apana.org.au> http://minkirri.apana.org.au/~abo/ From arigo at tunes.org Thu Nov 24 12:38:58 2005 From: arigo at tunes.org (Armin Rigo) Date: Thu, 24 Nov 2005 12:38:58 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <dm42uu$i4m$1@sea.gmane.org> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de> <dm42uu$i4m$1@sea.gmane.org> Message-ID: <20051124113858.GA9262@code1.codespeak.net> Hi, On Thu, Nov 24, 2005 at 01:59:57AM -0800, Robert Kern wrote: > You can get the version of scipy_core just before the fix that Travis > applied: Now we can start debugging :-) > http://projects.scipy.org/scipy/scipy_core/changeset/1490 This changeset alone fixes the small example you provided. However, compiling python "--without-pymalloc" doesn't fix it, so we can't blame the memory allocator. That's all I can say; I am rather clueless as to how the above patch manages to make any difference even without pymalloc. A bientot, Armin From arigo at tunes.org Thu Nov 24 13:11:13 2005 From: arigo at tunes.org (Armin Rigo) Date: Thu, 24 Nov 2005 13:11:13 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <20051124113858.GA9262@code1.codespeak.net> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de> <dm42uu$i4m$1@sea.gmane.org> <20051124113858.GA9262@code1.codespeak.net> Message-ID: <20051124121113.GA9444@code1.codespeak.net> Hi, Ok, here is the reason for the leak... There is in scipy a type called 'int32_arrtype' which inherits from both another scipy type called 'signedinteger_arrtype', and from 'int'. Obscure! 
This is not 100% officially allowed: you are inheriting from two C types. You're living dangerously! Now in this case it mostly works as expected, because the parent scipy type has no field at all, so it's mostly like inheriting from both 'object' and 'int' -- which is allowed, or would be if the bases were written in the opposite order. But still, something confuses the fragile logic of typeobject.c. (I'll leave this bit to scipy people to debug :-) The net result is that unless you force your own tp_free as in revision 1490, the type 'int32_arrtype' has tp_free set to int_free(), which is the normal tp_free of 'int' objects. This causes all deallocated int32_arrtype instances to be added to the CPython free list of integers instead of being freed! A bientot, Armin From ncoghlan at gmail.com Thu Nov 24 14:10:07 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 24 Nov 2005 23:10:07 +1000 Subject: [Python-Dev] PEP 302, PEP 338 and imp.getloader (was Re: a Python interface for the AST (WAS: DRAFT: python-dev...) In-Reply-To: <5.1.1.6.0.20051123131857.03aba388@mail.telecommunity.com> References: <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com> <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> <d11dcfba0511221430j519a2f8fh6faac1ab89ee7d99@mail.gmail.com> <bbaeab100511221602h36bbc30bpc9f317d7fb3354fa@mail.gmail.com> <ee2a432c0511221748r272540ffld1ef1f772f8058e3@mail.gmail.com> <5.1.1.6.0.20051122215139.01f99f90@mail.telecommunity.com> <5.1.1.6.0.20051123131857.03aba388@mail.telecommunity.com> Message-ID: <4385BBAF.8040104@gmail.com> Phillip J. Eby wrote: > This isn't hard to implement per se; setuptools for example has a > 'get_importer' function, and going from importer to loader is simple: Thanks, I think I'll definitely be able to build something out of that. 
> So with the above function you could do something like:
>
> def get_loader(fullname, path):
>     for path_item in path:
>         try:
>             loader = get_importer(path_item).find_module(fullname)
>             if loader is not None:
>                 return loader
>         except ImportError:
>             continue
>     else:
>         return None
>
> in order to implement the rest.

I think sys.meta_path needs to figure into that before digging through sys.path, but otherwise the concept seems basically correct. [NickC] >> ** I'm open to suggestions on how to deal with argv[0] and __file__. They >> should be set to whatever __file__ would be set to by the module >> loader, but >> the Importer Protocol in PEP 302 doesn't seem to expose that >> information. The >> current proposal is a compromise that matches the existing behaviour >> of -m >> (which supports scripts like regrtest.py) while still giving a meaningful >> value for scripts which are not part of the normal filesystem. [PJE] > Ugh. Those are tricky, no question. I can think of several simple > answers for each, all of which are wrong in some way. :) Indeed. I tried turning to "exec co in d" and "execfile(name, d)" for guidance, and didn't find any real help there. The only thing they automatically add to the supplied dictionary is __builtins__. The consequence is that any code executed using "exec" or "execfile" sees its name as being "__builtin__" because the lookup for '__name__' falls back to the builtin namespace. Further, "__file__" and "__loader__" won't be set at all when using these functions, which may be something of a surprise for some modules (to say the least). My current thinking is to actually try to distance the runpy module from "exec" and "execfile" significantly more than I'd originally intended. That way, I can explicitly focus on making it look like the item was invoked from the command line, without worrying about behaviour differences between this and the exec statement. 
It also means runpy can avoid the "implicitly modify the current namespace" behaviour that exec and execfile currently have. The basic function runpy.run_code would look like:

def run_code(code, init_globals=None, mod_name=None,
             mod_file=None, mod_loader=None):
    """Executes a string of source code or a code object

    Returns the resulting top level namespace dictionary
    """
    # Handle omitted arguments
    if mod_name is None:
        mod_name = "<run>"
    if mod_file is None:
        mod_file = "<run>"
    if mod_loader is None:
        mod_loader = StandardImportLoader(".")
    # Set up the top level namespace dictionary
    run_globals = {}
    if init_globals is not None:
        run_globals.update(init_globals)
    run_globals.update(__name__ = mod_name,
                       __file__ = mod_file,
                       __loader__ = mod_loader)
    # Run it!
    exec code in run_globals
    return run_globals

Note that run_code always creates a new execution dictionary and returns it, in contrast to exec and execfile. This is so that naively doing:

    run_code("print 'Hi there!'", globals())

or:

    run_code("print 'Hi there!'", locals())

doesn't trash __name__, __file__ or __loader__ in the current module (which would be bad). And runpy.run_module would look something like:

def run_module(mod_name, run_globals=None, run_name=None, as_script=False):
    loader = _get_loader(mod_name)  # Handle lack of imp.get_loader
    code = loader.get_code(mod_name)
    filename = _get_filename(loader, mod_name)  # Handle lack of protocol
    if run_name is None:
        run_name = mod_name
    if as_script:
        sys.argv[0] = filename
    return run_code(code, run_globals, run_name, filename, loader)

Cheers, Nick.
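For readers following along later: this sketch became the runpy module (PEP 338), which also grew a run_path() function. A minimal modern usage example (the script name and contents are made up for illustration):

```python
import os
import runpy
import tempfile

# Write a throwaway script and execute it in a fresh namespace, mirroring
# the run_code/run_module design sketched above: the caller's globals are
# untouched, and the resulting namespace is returned as a dictionary.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "demo.py")
    with open(script, "w") as f:
        f.write("x = 40 + 2\n")
    ns = runpy.run_path(script)

print(ns["x"])  # the executed module's top-level binding: 42
# ns also carries metadata such as __file__, filled in by the machinery.
```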
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From duncan-pythondev at grisby.org Thu Nov 24 15:11:30 2005 From: duncan-pythondev at grisby.org (Duncan Grisby) Date: Thu, 24 Nov 2005 14:11:30 +0000 Subject: [Python-Dev] (no subject) Message-ID: <E1EfHow-0002xd-Ar@apasphere.com> Hi, I posted this to comp.lang.python, but got no response, so I thought I would consult the wise people here... I have encountered a problem with the re module. I have a multi-threaded program that does lots of regular expression searching, with some relatively complex regular expressions. Occasionally, events can conspire to mean that the re search takes minutes. That's bad enough in and of itself, but the real problem is that the re engine does not release the interpreter lock while it is running. All the other threads are therefore blocked for the entire time it takes to do the regular expression search. Is there any fundamental reason why the re module cannot release the interpreter lock, for at least some of the time it is running? The ideal situation for me would be if it could do most of its work with the lock released, since the software is running on a multi processor machine that could productively do other work while the re is being processed. Failing that, could it at least periodically release the lock to give other threads a chance to run? A quick look at the code in _sre.c suggests that for most of the time, no Python objects are being manipulated, so the interpreter lock could be released. Has anyone tried to do that? Thanks, Duncan. 
-- -- Duncan Grisby -- -- duncan at grisby.org -- -- http://www.grisby.org -- From abo at minkirri.apana.org.au Thu Nov 24 15:52:01 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Thu, 24 Nov 2005 14:52:01 +0000 Subject: [Python-Dev] (no subject) In-Reply-To: <E1EfHow-0002xd-Ar@apasphere.com> References: <E1EfHow-0002xd-Ar@apasphere.com> Message-ID: <1132843921.25145.62.camel@warna.corp.google.com> On Thu, 2005-11-24 at 14:11 +0000, Duncan Grisby wrote: > Hi, > > I posted this to comp.lang.python, but got no response, so I thought I > would consult the wise people here... > > I have encountered a problem with the re module. I have a > multi-threaded program that does lots of regular expression searching, > with some relatively complex regular expressions. Occasionally, events > can conspire to mean that the re search takes minutes. That's bad > enough in and of itself, but the real problem is that the re engine > does not release the interpreter lock while it is running. All the > other threads are therefore blocked for the entire time it takes to do > the regular expression search. I don't know if this will help, but in my experience compiling re's often takes longer than matching them... are you sure that it's the match and not a compile that is taking a long time? Are you using pre-compiled re's or are you dynamically generating strings and using them? > Is there any fundamental reason why the re module cannot release the > interpreter lock, for at least some of the time it is running? The > ideal situation for me would be if it could do most of its work with > the lock released, since the software is running on a multi processor > machine that could productively do other work while the re is being > processed. Failing that, could it at least periodically release the > lock to give other threads a chance to run? 
> > A quick look at the code in _sre.c suggests that for most of the time, > no Python objects are being manipulated, so the interpreter lock could > be released. Has anyone tried to do that? probably not... not many people would have several-minutes-to-match re's. I suspect it would be do-able... I suggest you put together a patch and submit it on SF... -- Donovan Baarda <abo at minkirri.apana.org.au> http://minkirri.apana.org.au/~abo/ From duncan-pythondev at grisby.org Thu Nov 24 16:00:57 2005 From: duncan-pythondev at grisby.org (Duncan Grisby) Date: Thu, 24 Nov 2005 15:00:57 +0000 Subject: [Python-Dev] Re: Regular expressions In-Reply-To: Message from Donovan Baarda <abo@minkirri.apana.org.au> of "Thu, 24 Nov 2005 14:52:01 GMT." <1132843921.25145.62.camel@warna.corp.google.com> Message-ID: <E1EfIao-00030y-OX@apasphere.com> On Thursday 24 November, Donovan Baarda wrote: > I don't know if this will help, but in my experience compiling re's > often takes longer than matching them... are you sure that it's the > match and not a compile that is taking a long time? Are you using > pre-compiled re's or are you dynamically generating strings and using > them? It's definitely matching time. The res are all pre-compiled. [...] > > A quick look at the code in _sre.c suggests that for most of the time, > > no Python objects are being manipulated, so the interpreter lock could > > be released. Has anyone tried to do that? > > probably not... not many people would have several-minutes-to-match > re's. > > I suspect it would be do-able... I suggest you put together a patch and > submit it on SF... The thing that scares me about doing that is that there might be single-threadedness assumptions in the code that I don't spot. It's the kind of thing where a patch could appear to work fine, but them mysteriously fail due to some occasional race condition. 
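For what it's worth, multi-minute matches of this kind usually come from backtracking blow-up rather than anything thread-related. A pattern with nested repeats is enough to demonstrate the effect; the sizes below are deliberately small so the sketch finishes quickly:

```python
import re
import time

pattern = re.compile(r"(a+)+$")   # nested repeat: each extra 'a' roughly
                                  # doubles the work on non-matching input

for n in (10, 14, 18):
    text = "a" * n + "b"          # can never match, so the engine
                                  # backtracks through every partition
    start = time.perf_counter()
    assert pattern.match(text) is None
    print(n, "%.6fs" % (time.perf_counter() - start))
```

Push n toward 30 and the same match takes minutes, all of it spent inside the C matching loop while the GIL is held.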
Does anyone know if there is any global state in _sre that would prevent it being re-entered, or know for certain that there isn't? Cheers, Duncan. -- -- Duncan Grisby -- -- duncan at grisby.org -- -- http://www.grisby.org -- From fredrik at pythonware.com Thu Nov 24 16:00:04 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 24 Nov 2005 16:00:04 +0100 Subject: [Python-Dev] (no subject) References: <E1EfHow-0002xd-Ar@apasphere.com> <1132843921.25145.62.camel@warna.corp.google.com> Message-ID: <dm4khk$8rg$1@sea.gmane.org> Donovan Baarda wrote: > I don't know if this will help, but in my experience compiling re's > often takes longer than matching them... are you sure that it's the > match and not a compile that is taking a long time? Are you using > pre-compiled re's or are you dynamically generating strings and using > them? patterns with nested repeats can behave badly on certain types of non-matching input. (each repeat is basically a loop, and if you nest enough loops things can quickly get out of hand, even if the inner loop doesn't do much...) </F> From tim.peters at gmail.com Thu Nov 24 16:44:42 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 24 Nov 2005 10:44:42 -0500 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <438582B3.80204@v.loewis.de> References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu> <437BC524.2030105@ee.byu.edu> <438582B3.80204@v.loewis.de> Message-ID: <1f7befae0511240744t1d5b40fdsf0dd5e9201ae22cb@mail.gmail.com> [Martin v. Löwis] > One way (I think the only way) this could happen if: > - the objects being allocated are all smaller than 256 bytes > - when allocating new objects, the requested size was different > from any other size previously deallocated. > > So if you first allocate 1,000,000 objects of size 200, and then > release them, and then allocate 1,000,000 objects of size 208, > the memory is not reused.
Nope, the memory is reused in this case. While each obmalloc "pool" P is devoted to a fixed size so long as at least one object from P is in use, when all objects allocated from P have been released, P can be reassigned to any other size class. The comments in obmalloc.c are quite accurate. This particular case is talked about here: """ empty == all the pool's blocks are currently available for allocation On transition to empty, a pool is unlinked from its usedpools[] list, and linked to the front of the (file static) singly-linked freepools list, via its nextpool member. The prevpool member has no meaning in this case. Empty pools have no inherent size class: the next time a malloc finds an empty list in usedpools[], it takes the first pool off of freepools. If the size class needed happens to be the same as the size class the pool last had, some pool initialization can be skipped. """ Now if you end up allocating a million pools all devoted to 72-byte objects, and leave one object from each pool in use, then all those pools remain devoted to 72-byte objects. Wholly empty pools can be (and do get) reused freely, though. > If the objects are all of same size, or all larger than 256 bytes, > this effect does not occur. If they're larger than 256 bytes, then you see the reuse behavior of the system malloc/free, about which virtually nothing can be said that's true across all Python platforms. From oliphant.travis at ieee.org Thu Nov 24 17:30:55 2005 From: oliphant.travis at ieee.org (Travis E. 
Oliphant) Date: Thu, 24 Nov 2005 09:30:55 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <20051124121113.GA9444@code1.codespeak.net> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de> <dm42uu$i4m$1@sea.gmane.org> <20051124113858.GA9262@code1.codespeak.net> <20051124121113.GA9444@code1.codespeak.net> Message-ID: <dm4ps1$qds$1@sea.gmane.org> Armin Rigo wrote: > Hi, > > Ok, here is the reason for the leak... > > There is in scipy a type called 'int32_arrtype' which inherits from both > another scipy type called 'signedinteger_arrtype', and from 'int'. > Obscure! This is not 100% officially allowed: you are inheriting from > two C types. You're living dangerously! This is allowed because the two types have compatible binaries (in fact the signed integer type is only the PyObject_HEAD) > > Now in this case it mostly works as expected, because the parent scipy > type has no field at all, so it's mostly like inheriting from both > 'object' and 'int' -- which is allowed, or would be if the bases were > written in the opposite order. But still, something confuses the > fragile logic of typeobject.c. (I'll leave this bit to scipy people to > debug :-) > This is definitely possible. I've tripped up in this logic before. I was beginning to suspect that it might have something to do with what is going on. > The net result is that unless you force your own tp_free as in revision > 1490, the type 'int32_arrtype' has tp_free set to int_free(), which is > the normal tp_free of 'int' objects. This causes all deallocated > int32_arrtype instances to be added to the CPython free list of integers > instead of being freed! I'm not sure this is true, It sounds plausible but I will have to check. Previously the tp_free should have been inherited as PyObject_Del for the int32_arrtype. 
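The dual-inheritance situation Armin describes can be mimicked with pure-Python classes (the class names below are made-up stand-ins for the scipy types; the C-level tp_free problem itself cannot be reproduced from Python, where heap types all share the generic allocator):

```python
class SignedInteger(object):
    """Stand-in for scipy's empty 'signedinteger' parent type."""

class Int32(SignedInteger, int):
    """Analogue of scipy's int32_arrtype: two bases, one of them int."""

x = Int32(5)
print(x + 1)  # behaves like an int: 6
print([c.__name__ for c in Int32.__mro__])

# Bases written as (object, int) are rejected outright, since object
# would have to precede its own subclass int in the MRO:
try:
    type("Bad", (object, int), {})
except TypeError as exc:
    print("TypeError:", exc)
```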
Unless the typeobject.c code copied the tp_free from the wrong base type, this shouldn't have been the case. Thanks for the pointers. It sounds like we're getting close. Perhaps the problem is in typeobject.c .... -Travis From oliphant.travis at ieee.org Thu Nov 24 18:02:59 2005 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Thu, 24 Nov 2005 10:02:59 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <20051124121113.GA9444@code1.codespeak.net> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de> <dm42uu$i4m$1@sea.gmane.org> <20051124113858.GA9262@code1.codespeak.net> <20051124121113.GA9444@code1.codespeak.net> Message-ID: <dm4ro4$n0$1@sea.gmane.org> Armin Rigo wrote: > Hi, > > Ok, here is the reason for the leak... > > There is in scipy a type called 'int32_arrtype' which inherits from both > another scipy type called 'signedinteger_arrtype', and from 'int'. > Obscure! This is not 100% officially allowed: you are inheriting from > two C types. You're living dangerously! > > Now in this case it mostly works as expected, because the parent scipy > type has no field at all, so it's mostly like inheriting from both > 'object' and 'int' -- which is allowed, or would be if the bases were > written in the opposite order. But still, something confuses the > fragile logic of typeobject.c. (I'll leave this bit to scipy people to > debug :-) > > The net result is that unless you force your own tp_free as in revision > 1490, the type 'int32_arrtype' has tp_free set to int_free(), which is > the normal tp_free of 'int' objects. This causes all deallocated > int32_arrtype instances to be added to the CPython free list of integers > instead of being freed! 
I can confirm that indeed the int32_arrtype object gets the tp_free slot from it's second parent (the python integer type) instead of its first parent (the new, empty signed integer type). I just did a printf after PyType_Ready was called to see what the tp_free slot contained, and indeed it contained the wrong thing. I suspect this may also be true of the float64_arrtype as well (which inherits from Python's float type). What I don't understand is why the tp_free slot from the second base type got copied over into the tp_free slot of the child. It should have received the tp_free slot of the first parent, right? I'm still looking for why that would be the case. I think, though, Armin has identified the real culprit of the problem. I apologize for any consternation over the memory manager that may have taken place. This problem is obviously an issue of dual inheritance in C. I understand this is not well tested code, but in principle it should work correctly, right? I'll keep looking to see if I made a mistake in believing that the int32_arrtype should have inherited its tp_free slot from the first parent and not the second. -Travis From oliphant.travis at ieee.org Thu Nov 24 18:17:43 2005 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Thu, 24 Nov 2005 10:17:43 -0700 Subject: [Python-Dev] Problems with mro for dual inheritance in C [Was: Problems with the Python Memory Manager] In-Reply-To: <20051124121113.GA9444@code1.codespeak.net> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de> <dm42uu$i4m$1@sea.gmane.org> <20051124113858.GA9262@code1.codespeak.net> <20051124121113.GA9444@code1.codespeak.net> Message-ID: <dm4sjp$3ov$1@sea.gmane.org> Armin Rigo wrote: > Hi, > > Ok, here is the reason for the leak... 
> > There is in scipy a type called 'int32_arrtype' which inherits from both > another scipy type called 'signedinteger_arrtype', and from 'int'. > Obscure! This is not 100% officially allowed: you are inheriting from > two C types. You're living dangerously! > > Now in this case it mostly works as expected, because the parent scipy > type has no field at all, so it's mostly like inheriting from both > 'object' and 'int' -- which is allowed, or would be if the bases were > written in the opposite order. But still, something confuses the > fragile logic of typeobject.c. (I'll leave this bit to scipy people to > debug :-) Well, I'm stumped on this. Note the method resolution order for the new scalar array type (exactly as I would expect). Why doesn't the int32 type inherit its tp_free from the early types first? a = zeros(10) type(a[0]).mro() [<type 'int32_arrtype'>, <type 'signedinteger_arrtype'>, <type 'integer_arrtype'>, <type 'numeric_arrtype'>, <type 'generic_arrtype'>, <type 'int'>, <type 'object'>] From nnorwitz at gmail.com Thu Nov 24 20:34:37 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 24 Nov 2005 11:34:37 -0800 Subject: [Python-Dev] registering unicode codecs Message-ID: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com> While running regrtest with -R to find reference leaks I found a usage issue. When a codec is registered it is stored in the interpreter state and cannot be removed. Since it is stored as a list, if you repeated add the same search function, you will get duplicates in the list and they can't be removed. This shows up as a reference leak (which it really isn't) in test_unicode with this code modified from test_codecs_errors: import codecs def search_function(encoding): def encode1(input, errors="strict"): return 42 return (encode1, None, None, None) codecs.register(search_function) ### Should the search function be added to the search path if it is already in there? 
I don't understand a benefit of having duplicate search functions. Should users have access to the search path (through a codecs.unregister())? If so, should it search from the end of the list to the beginning to remove an item? That way the last entry would be removed rather than the first. n From mal at egenix.com Thu Nov 24 20:44:38 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 24 Nov 2005 20:44:38 +0100 Subject: [Python-Dev] registering unicode codecs In-Reply-To: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com> References: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com> Message-ID: <43861826.4000705@egenix.com> Neal Norwitz wrote: > While running regrtest with -R to find reference leaks I found a usage > issue. When a codec is registered it is stored in the interpreter > state and cannot be removed. Since it is stored as a list, if you > repeated add the same search function, you will get duplicates in the > list and they can't be removed. This shows up as a reference leak > (which it really isn't) in test_unicode with this code modified from > test_codecs_errors: > > import codecs > def search_function(encoding): > def encode1(input, errors="strict"): > return 42 > return (encode1, None, None, None) > > codecs.register(search_function) > > ### > > Should the search function be added to the search path if it is > already in there? I don't understand a benefit of having duplicate > search functions. Me neither :-) I never expected someone to register a search function more than once, since there's no point in doing so. > Should users have access to the search path (through a > codecs.unregister())? Maybe, but why would you want to unregister a search function ? > If so, should it search from the end of the > list to the beginning to remove an item? That way the last entry > would be removed rather than the first. I'd suggest to raise an exception in case a user tries to register a search function twice. 
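A modern Python 3 sketch of the registry behaviour under discussion: search functions are simply appended to a list, and duplicates are neither detected nor rejected. The "demo_ascii" name is made up for illustration (note that codec names are normalized, e.g. hyphens become underscores, before reaching the search function). An unregister() did eventually appear, in Python 3.10:

```python
import codecs

def search(name):
    # Hypothetical search function: alias "demo_ascii" to the ascii codec.
    if name == "demo_ascii":
        return codecs.lookup("ascii")
    return None

codecs.register(search)
codecs.register(search)   # silently accepted: the list now holds a duplicate

print(codecs.encode("hi", "demo_ascii"))  # b'hi'

# codecs.unregister() (Python 3.10+) removes one matching entry.
if hasattr(codecs, "unregister"):
    codecs.unregister(search)
```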
Removal should be the same as doing list.remove(), ie. remove the first (and only) item in the list of search functions. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 24 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From nnorwitz at gmail.com Thu Nov 24 20:51:20 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 24 Nov 2005 11:51:20 -0800 Subject: [Python-Dev] registering unicode codecs In-Reply-To: <43861826.4000705@egenix.com> References: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com> <43861826.4000705@egenix.com> Message-ID: <ee2a432c0511241151j6b028dbcm36b8032c978d456c@mail.gmail.com> On 11/24/05, M.-A. Lemburg <mal at egenix.com> wrote: > > > Should users have access to the search path (through a > > codecs.unregister())? > > Maybe, but why would you want to unregister a search function ? > > > If so, should it search from the end of the > > list to the beginning to remove an item? That way the last entry > > would be removed rather than the first. > > I'd suggest to raise an exception in case a user tries > to register a search function twice. This should take care of the testing problem. > Removal should be the > same as doing list.remove(), ie. remove the first (and > only) item in the list of search functions. Do you recommend adding an unregister()? It's not necessary for this case. n From mal at egenix.com Thu Nov 24 21:12:39 2005 From: mal at egenix.com (M.-A. 
Lemburg) Date: Thu, 24 Nov 2005 21:12:39 +0100 Subject: [Python-Dev] registering unicode codecs In-Reply-To: <ee2a432c0511241151j6b028dbcm36b8032c978d456c@mail.gmail.com> References: <ee2a432c0511241134r173626d4u3cae9c17ccc4ea8c@mail.gmail.com> <43861826.4000705@egenix.com> <ee2a432c0511241151j6b028dbcm36b8032c978d456c@mail.gmail.com> Message-ID: <43861EB7.2030301@egenix.com> Neal Norwitz wrote: > On 11/24/05, M.-A. Lemburg <mal at egenix.com> wrote: > >>>Should users have access to the search path (through a >>>codecs.unregister())? >> >>Maybe, but why would you want to unregister a search function ? >> >> >>>If so, should it search from the end of the >>>list to the beginning to remove an item? That way the last entry >>>would be removed rather than the first. >> >>I'd suggest to raise an exception in case a user tries >>to register a search function twice. > > > This should take care of the testing problem. > > >>Removal should be the >>same as doing list.remove(), ie. remove the first (and >>only) item in the list of search functions. > > > Do you recommend adding an unregister()? It's not necessary for this case. Not really - I don't see much of a need for this; except maybe if a codec package wants to replace another codec package. So far no-one has requested such a feature, so I'd say we don't add .unregister() until a request for it pops up. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 24 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From arigo at tunes.org Thu Nov 24 23:24:52 2005 From: arigo at tunes.org (Armin Rigo) Date: Thu, 24 Nov 2005 23:24:52 +0100 Subject: [Python-Dev] Problems with mro for dual inheritance in C [Was: Problems with the Python Memory Manager] In-Reply-To: <dm4sjp$3ov$1@sea.gmane.org> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de> <dm42uu$i4m$1@sea.gmane.org> <20051124113858.GA9262@code1.codespeak.net> <20051124121113.GA9444@code1.codespeak.net> <dm4sjp$3ov$1@sea.gmane.org> Message-ID: <20051124222452.GA14236@code1.codespeak.net> Hi Travis, On Thu, Nov 24, 2005 at 10:17:43AM -0700, Travis E. Oliphant wrote: > Why doesn't the int32 > type inherit its tp_free from the early types first? In your case I suspect that the tp_free is inherited from the tp_base which is probably 'int'. I don't see how to "fix" typeobject.c, because I'm not sure that there is a solution that would do the right thing in all cases at this level. I would suggest that you just force the tp_alloc/tp_free that you want in your static types instead. That's what occurs for example if you build a similar inheritance hierarchy with classes defined in Python: these classes are then 'heap types', so they always get the generic tp_alloc/tp_free before PyType_Ready() has a chance to see them. Armin From martin at v.loewis.de Thu Nov 24 23:51:15 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 24 Nov 2005 23:51:15 +0100 Subject: [Python-Dev] SRE should release the GIL (was: no subject) In-Reply-To: <E1EfHow-0002xd-Ar@apasphere.com> References: <E1EfHow-0002xd-Ar@apasphere.com> Message-ID: <438643E3.6030106@v.loewis.de> Duncan Grisby wrote: > Is there any fundamental reason why the re module cannot release the > interpreter lock, for at least some of the time it is running? 
The > ideal situation for me would be if it could do most of its work with > the lock released, since the software is running on a multi processor > machine that could productively do other work while the re is being > processed. Failing that, could it at least periodically release the > lock to give other threads a chance to run? Formally: no; it access a Python string/Python unicode object all the time. Now, since all the shared objects it accesses are immutable, likely no harm would be done releasing the GIL. I think SRE was originally also intended to operate on array.array objects; this would have caused bigger problems. Not sure whether this is still an issue. Regards, Martin From nnorwitz at gmail.com Fri Nov 25 04:35:06 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 24 Nov 2005 19:35:06 -0800 Subject: [Python-Dev] reference leaks Message-ID: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> There are still a few reference leaks I've been able to identify. I didn't see an obvious solution to these (well, I saw one obvious solution which crashed, so obviously I was wrong). When running regrtest with -R here are the ref leaks reported: test_codeccallbacks leaked [2, 2, 2, 2] references test_compiler leaked [176, 242, 202, 248] references test_generators leaked [254, 254, 254, 254] references test_tcl leaked [35, 35, 35, 35] references test_threading_local leaked [36, 36, 28, 36] references test_urllib2 leaked [-130, 70, -120, 60] references test_compiler and test_urllib2 are probably not real leaks, but data being cached. I'm not really sure if test_tcl is a leak or not. Since there's a lot that goes on under the covers. I didn't see anything obvious in _tkinter.c. I have no idea about test_threading_local. I'm pretty certain test_codeccallbacks and test_generators are leaks. 
Here is code that I gleaned/modified from the tests and causes leaks in the interpreter:

#### test_codeccallbacks
import codecs

def test_callbacks():
    def handler(exc):
        l = [u"<%d>" % ord(exc.object[pos])
             for pos in xrange(exc.start, exc.end)]
        return (u"[%s]" % u"".join(l), exc.end)
    codecs.register_error("test.handler", handler)
    # the {} is necessary to cause the leak, {} can hold data too
    codecs.charmap_decode("abc", "test.handler", {})

test_callbacks()
# leak from PyUnicode_DecodeCharmap() each time test_callbacks() is called

#### test_generators
from itertools import tee

def fib():
    def yield_identity_forever(g):
        while 1:
            yield g
    def _fib():
        for i in yield_identity_forever(head):
            yield i
    head, tail, result = tee(_fib(), 3)
    return result

x = fib()
# x.next() leak from itertools.tee()
####

The itertools.tee() fix I thought was quite obvious:

+++ Modules/itertoolsmodule.c (working copy)
@@ -356,7 +356,8 @@
 {
     if (tdo->nextlink == NULL)
         tdo->nextlink = teedataobject_new(tdo->it);
-    Py_INCREF(tdo->nextlink);
+    else
+        Py_INCREF(tdo->nextlink);
     return tdo->nextlink;
 }

However, this creates problems elsewhere. I think test_heapq crashed when I added this fix. The patch also didn't fix all the leaks, just a bunch of them. So clearly there's more going on that I'm not getting. n From oliphant at ee.byu.edu Thu Nov 24 10:54:13 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu, 24 Nov 2005 02:54:13 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <43858481.5060202@v.loewis.de> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> <A89BF905-97B2-4E08-BFEB-33B00B3AECE0@mac.com> <437C54AA.9020203@ee.byu.edu> <43858481.5060202@v.loewis.de> Message-ID: <43858DC5.2080607@ee.byu.edu> Martin v. Löwis wrote: > Travis Oliphant wrote: > >> So, I now believe that his code (plus the array scalar extension >> type) was actually exposing a real bug in the memory manager itself.
>> In theory, the Python memory manager should have been able to re-use >> the memory for the array-scalar instances because they are always the >> same size. In practice, the memory was apparently not being re-used >> but instead new blocks were being allocated to handle the load. > > > That is really very hard to believe. Most people on this list would > probably agree that obmalloc certain *will* reuse deallocated memory > if the next request is for the very same size (number of bytes) that > the previously-release object had. Yes, I see that it does. This became more clear as all the simple tests I tried failed to reproduce the problem (and I spent some time looking at the code and reading its comments). I just can't figure out another explanation for why the problem went away when I went to using the system malloc other than some kind of corner-case in the Python memory allocator. > >> His code is quite complicated and it is difficult to replicate the >> problem. > > > That the code is complex would not so much be a problem: we often > analyse complex code here. It is a problem that the code is not > available, and it would be a problem if the problem was not > reproducable even if you had the code (i.e. if the problem would > sometimes occur, but not the next day when you ran it again). > The problem was definitely reproducible. On his machine, and on the two machines I tried to run it on. It without fail rapidly consumed all available memory. > So if you can, please post the code somewhere, and add a bugreport > on sf.net/projects/python. > I'll try to do this at some point. I'll have to get permission from him for the actual Python code. The extension modules he used are all publically available (PyMC). I changed the memory allocator in scipy --- which eliminated the problem --- so you'd have to check out an older version of the code from SVN to see the problem. Thanks for the tips. 
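The pool lifecycle Tim describes earlier in the thread can be modelled at a toy level. This is emphatically not CPython's obmalloc, just an illustrative sketch of the two behaviours under discussion: a wholly empty pool is reusable for any size class, while a pool with even one live block stays pinned to its class:

```python
class Pool:
    CAPACITY = 4  # blocks per pool (tiny, for illustration)

    def __init__(self):
        self.size_class = None
        self.live = 0

class ToyAllocator:
    def __init__(self):
        self.freepools = []   # wholly empty pools, no inherent size class
        self.usedpools = {}   # size_class -> pools holding live blocks

    def alloc(self, size_class):
        pools = self.usedpools.setdefault(size_class, [])
        for pool in pools:
            if pool.live < Pool.CAPACITY:
                pool.live += 1
                return pool
        # No partly-filled pool: reuse an empty pool (any former size
        # class) or create a brand new one.
        pool = self.freepools.pop() if self.freepools else Pool()
        pool.size_class = size_class
        pool.live = 1
        pools.append(pool)
        return pool

    def free(self, pool):
        pool.live -= 1
        if pool.live == 0:  # transition to "empty": unpin the pool
            self.usedpools[pool.size_class].remove(pool)
            self.freepools.append(pool)

# Release every size-200 block and the pool is reused for size 208:
toy = ToyAllocator()
blocks = [toy.alloc(200) for _ in range(4)]   # fills one pool
pool200 = blocks[0]
for b in blocks:
    toy.free(b)
pool208 = toy.alloc(208)
assert pool208 is pool200                     # same pool, new size class

# But a single surviving block keeps its pool pinned to 200:
pinned = ToyAllocator()
keep = pinned.alloc(200)
other = pinned.alloc(208)
assert other is not keep                      # a fresh pool was needed
```

This mirrors the scenario where a million pools each keep one 72-byte object alive: none of them ever becomes empty, so none can be reassigned.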
-Travis From oliphant.travis at ieee.org Thu Nov 24 11:08:11 2005 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu, 24 Nov 2005 03:08:11 -0700 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <438582B3.80204@v.loewis.de> References: <20051116120346.A434.JCARLSON@uci.edu> <dlg5gt$q1g$1@sea.gmane.org> <20051116145820.A43A.JCARLSON@uci.edu> <437BC524.2030105@ee.byu.edu> <438582B3.80204@v.loewis.de> Message-ID: <4385910B.40503@ieee.org> Martin v. L?wis wrote: > Travis Oliphant wrote: > >> As verified by removing usage of the Python PyObject_MALLOC function, >> it was the Python memory manager that was performing poorly. Even >> though the array-scalar objects were deleted, the memory manager >> would not re-use their memory for later object creation. Instead, the >> memory manager kept allocating new arenas to cover the load (when it >> should have been able to re-use the old memory that had been freed by >> the deleted objects--- again, I don't know enough about the memory >> manager to say why this happened). > > > One way (I think the only way) this could happen if: > - the objects being allocated are all smaller than 256 bytes > - when allocating new objects, the requested size was different > from any other size previously deallocated. In one version of the code I had moved all objects from the Python memory manager to the system malloc *except* the array scalars. The problem still remained, so I'm pretty sure these were the problem. The array scalars are all less than 256 bytes but they are always the same number of bytes. > > So if you first allocate 1,000,000 objects of size 200, and then > release them, and then allocate 1,000,000 objects of size 208, > the memory is not reused. That is useful information. I don't think his code was doing that kind of thing, but it definitely provides something to check on. 
Previously I was using the standard tp_alloc and tp_free methods (I was not setting them but just letting PyType_Ready fill those slots in with the default values). When I changed these methods to ones that used system free and system malloc the problem went away. That's why I attribute the issue to the Python memory manager. Of course, it's always possible that I was doing something wrong, but I really did try to make sure I wasn't making a mistake. I didn't do anything fancy with the Python memory allocator. The array scalars all subclass from each other in C, though. I don't see how that could be relevant, but I could be missing something. -Travis From allison at shasta.stanford.edu Thu Nov 24 17:44:27 2005 From: allison at shasta.stanford.edu (Dennis Allison) Date: Thu, 24 Nov 2005 08:44:27 -0800 (PST) Subject: [Python-Dev] Regular expressions In-Reply-To: <E1EfIao-00030y-OX@apasphere.com> Message-ID: <Pine.LNX.4.44.0511240834080.5028-100000@shasta.stanford.edu> This is probably OT for [Python-dev] I suspect that your problem is not the GIL but is due to something else. Rather than dorking with the interpreter's threading, you probably would be better off rethinking your problem and finding a better way to accomplish your task. On Thu, 24 Nov 2005, Duncan Grisby wrote: > On Thursday 24 November, Donovan Baarda wrote: > > > I don't know if this will help, but in my experience compiling re's > > often takes longer than matching them... are you sure that it's the > > match and not a compile that is taking a long time? Are you using > > pre-compiled re's or are you dynamically generating strings and using > > them? > > It's definitely matching time. The res are all pre-compiled. > > [...] > > > A quick look at the code in _sre.c suggests that for most of the time, > > > no Python objects are being manipulated, so the interpreter lock could > > > be released. Has anyone tried to do that? > > > > probably not... 
not many people would have several-minutes-to-match > > re's. > > > > I suspect it would be do-able... I suggest you put together a patch and submit it on SF... > The thing that scares me about doing that is that there might be > single-threadedness assumptions in the code that I don't spot. It's the > kind of thing where a patch could appear to work fine, but then > mysteriously fail due to some occasional race condition. Does anyone > know if there is any global state in _sre that would prevent it > being re-entered, or know for certain that there isn't? > > Cheers, > > Duncan. > > -- From victor.stinner at haypocalc.com Fri Nov 25 03:31:40 2005 From: victor.stinner at haypocalc.com (Victor STINNER) Date: Fri, 25 Nov 2005 03:31:40 +0100 Subject: [Python-Dev] Bug bz2.BZ2File(...).seek(0,2) + patch Message-ID: <1132885900.18774.5.camel@haypopc> Hi, I found a bug in the bz2 Python module. Example: import bz2 f = bz2.BZ2File("test.bz2", "r") f.seek(0, 2) assert f.tell() != 0 Details and *patch* at: http://sourceforge.net/tracker/index.php?func=detail&aid=1366000&group_id=5470&atid=105470 Please CC-me for all your answers. Bye, Victor -- Victor Stinner - student at the UTBM (Belfort, France) http://www.haypocalc.com/wiki/Accueil -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051125/0a0385a7/attachment.pgp From fanghao at corp.netease.com Fri Nov 25 08:32:15 2005 From: fanghao at corp.netease.com (Frank) Date: Fri, 25 Nov 2005 15:32:15 +0800 Subject: [Python-Dev] (no subject) Message-ID: <20051125074127.32E1C1E400B@bag.python.org> hi, test mail list :) Regards, Frank fanghao at corp.netease.com 2005-11-25 From fredrik at pythonware.com Fri Nov 25 09:23:25 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 25 Nov 2005 09:23:25 +0100 Subject: [Python-Dev] SRE should release the GIL (was: no subject) References: <E1EfHow-0002xd-Ar@apasphere.com> <438643E3.6030106@v.loewis.de> Message-ID: <dm6hmk$23v$1@sea.gmane.org> Martin v. Löwis wrote: > Formally: no; it accesses a Python string/Python unicode object all > the time. > > Now, since all the shared objects it accesses are immutable, likely > no harm would be done releasing the GIL. I think SRE was originally > also intended to operate on array.array objects; this would have > caused bigger problems. SRE can operate on anything that implements the buffer interface. </F> From arigo at tunes.org Fri Nov 25 09:41:30 2005 From: arigo at tunes.org (Armin Rigo) Date: Fri, 25 Nov 2005 09:41:30 +0100 Subject: [Python-Dev] Problems with the Python Memory Manager In-Reply-To: <437BE7A8.5000503@ee.byu.edu> References: <fb6fbf560511161750y7cef46cdk67700606e655a6ec@mail.gmail.com> <437BE7A8.5000503@ee.byu.edu> Message-ID: <20051125084130.GA18796@code1.codespeak.net> Hi Jim, You wrote: > >(2) Is he allocating new _types_, which I think don't get properly > > collected. (Off-topic) For reference, as far as I know new types are properly freed. There have been a number of bugs and lots of corner cases to fix, but I know of no remaining one.
This assumes that the new types are heap types allocated in some official way -- either by Python code or by somehow calling type() from C. A bientot, Armin From mwh at python.net Fri Nov 25 09:57:00 2005 From: mwh at python.net (Michael Hudson) Date: Fri, 25 Nov 2005 08:57:00 +0000 Subject: [Python-Dev] reference leaks In-Reply-To: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> (Neal Norwitz's message of "Thu, 24 Nov 2005 19:35:06 -0800") References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> Message-ID: <2mhda1f6lf.fsf@starship.python.net> Neal Norwitz <nnorwitz at gmail.com> writes: > There are still a few reference leaks I've been able to identify. I > didn't see an obvious solution to these (well, I saw one obvious > solution which crashed, so obviously I was wrong). > > When running regrtest with -R here are the ref leaks reported: > > test_codeccallbacks leaked [2, 2, 2, 2] references > test_compiler leaked [176, 242, 202, 248] references > test_generators leaked [254, 254, 254, 254] references > test_tcl leaked [35, 35, 35, 35] references > test_threading_local leaked [36, 36, 28, 36] references > test_urllib2 leaked [-130, 70, -120, 60] references > > test_compiler and test_urllib2 are probably not real leaks, but data > being cached. I'm not really sure if test_tcl is a leak or not. > Since there's a lot that goes on under the covers. I didn't see > anything obvious in _tkinter.c. > > I have no idea about test_threading_local. It's very odd, but probably not a leak. > I'm pretty certain test_codeccallbacks and test_generators are leaks. Isn't test_codeccallbacks just the extra references you get from registering an error handler? test_generators is new, I think. Cheers, mwh -- Good? Bad? Strap him into the IETF-approved witch-dunking apparatus immediately! 
-- NTK now, 21/07/2000 From arigo at tunes.org Fri Nov 25 09:59:55 2005 From: arigo at tunes.org (Armin Rigo) Date: Fri, 25 Nov 2005 09:59:55 +0100 Subject: [Python-Dev] reference leaks In-Reply-To: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> Message-ID: <20051125085955.GB18796@code1.codespeak.net> Hi Neal, On Thu, Nov 24, 2005 at 07:35:06PM -0800, Neal Norwitz wrote: > The itertools.tee() fix I thought was quite obvious: > > +++ Modules/itertoolsmodule.c (working copy) > @@ -356,7 +356,8 @@ > { > if (tdo->nextlink == NULL) > tdo->nextlink = teedataobject_new(tdo->it); > - Py_INCREF(tdo->nextlink); > + else > + Py_INCREF(tdo->nextlink); > return tdo->nextlink; > } No, if this object is saved as a cache on 'tdo' then obviously it needs to keep a reference on its own. This reference will go away in teedataobject_dealloc(). After debugging, the problem is a reference cycle: the teedataobject 'head' has a field 'it' pointing to the generator-iterator '_fib()', which has a reference back to 'head'. So what is missing is making teedataobject GC-aware, which it currently isn't. I suspect that there are other itertools types in the same situation. A bientot, Armin. From walter at livinglogic.de Fri Nov 25 10:27:55 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 25 Nov 2005 10:27:55 +0100 Subject: [Python-Dev] reference leaks In-Reply-To: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> Message-ID: <4386D91B.7030505@livinglogic.de> Neal Norwitz wrote: > [...]
> #### test_codeccallbacks > > import codecs > def test_callbacks(): > def handler(exc): > l = [u"<%d>" % ord(exc.object[pos]) for pos in xrange(exc.start, exc.end)] > return (u"[%s]" % u"".join(l), exc.end) > codecs.register_error("test.handler", handler) > # the {} is necessary to cause the leak, {} can hold data too > codecs.charmap_decode("abc", "test.handler", {}) > > test_callbacks() > # leak from PyUnicode_DecodeCharmap() each time test_callbacks() is called Can you move the call to codecs.register_error() out of test_callbacks() and retry? Bye, Walter Dörwald From victor.stinner-linux at haypocalc.com Fri Nov 25 12:55:23 2005 From: victor.stinner-linux at haypocalc.com (Victor STINNER) Date: Fri, 25 Nov 2005 12:55:23 +0100 Subject: [Python-Dev] Bug bz2.BZ2File(...).seek(0,2) + patch Message-ID: <1132919724.26613.4.camel@haypopc> Hi, I found a bug in the bz2 Python module. Example: import bz2 f = bz2.BZ2File("test.bz2", "r") f.seek(0, 2) assert f.tell() != 0 Details and *patch* at: http://sourceforge.net/tracker/index.php?func=detail&aid=1366000&group_id=5470&atid=105470 Please CC-me for all your answers. Bye, Victor -- Victor Stinner - student at the UTBM (Belfort, France) http://www.haypocalc.com/wiki/Accueil -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051125/9fcfd32d/attachment.pgp From eric.noyau at gmail.com Fri Nov 25 15:21:19 2005 From: eric.noyau at gmail.com (Eric Noyau) Date: Fri, 25 Nov 2005 14:21:19 +0000 Subject: [Python-Dev] SRE should release the GIL (was: no subject) In-Reply-To: <dm6hmk$23v$1@sea.gmane.org> References: <E1EfHow-0002xd-Ar@apasphere.com> <438643E3.6030106@v.loewis.de> <dm6hmk$23v$1@sea.gmane.org> Message-ID: <49e1c2960511250621t526bbc53p430a3144d1eafe5d@mail.gmail.com> Hi all, I've implemented a patch, please visit bug 1366311 for details. https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1366311&group_id=5470 This patch only releases the GIL when the engine performs a low-level search *and* if the object searched is a string or a unicode string. The GIL will not be released for any other kind of object, as there is no guarantee of immutability of the buffer during the run. I've tested this with a couple of simple tests, and also by running the application Duncan talked about. My testing indicates that everything works as before, with the added value that our application is still responsive even when processing some of the more egregious regular expressions. As it is my first foray into Python module writing, I'll welcome any feedback you may have on the patch. Regards, -- Eric On 11/25/05, Fredrik Lundh <fredrik at pythonware.com> wrote: > > Martin v. Löwis wrote: > > > Formally: no; it accesses a Python string/Python unicode object all > > the time. > > > > Now, since all the shared objects it accesses are immutable, likely > > no harm would be done releasing the GIL. I think SRE was originally > > also intended to operate on array.array objects; this would have > > caused bigger problems. > > SRE can operate on anything that implements the buffer interface.
> > </F> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20051125/308a9692/attachment.html From aahz at pythoncraft.com Fri Nov 25 15:54:47 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri, 25 Nov 2005 06:54:47 -0800 Subject: [Python-Dev] Bug bz2.BZ2File(...).seek(0,2) + patch In-Reply-To: <1132885900.18774.5.camel@haypopc> References: <1132885900.18774.5.camel@haypopc> Message-ID: <20051125145447.GA25513@panix.com> On Fri, Nov 25, 2005, Victor STINNER wrote: > > I found a bug in bz2 python module. Example: > > Details and *patch* at: > http://sourceforge.net/tracker/index.php?func=detail&aid=1366000&group_id=5470&atid=105470 Thanks! Particularly with the Thanksgiving weekend, you may not get any other responses for a while. Please be patient. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From nnorwitz at gmail.com Fri Nov 25 19:02:56 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Fri, 25 Nov 2005 10:02:56 -0800 Subject: [Python-Dev] reference leaks In-Reply-To: <4386D91B.7030505@livinglogic.de> References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> <4386D91B.7030505@livinglogic.de> Message-ID: <ee2a432c0511251002n438ca00eib1d7bdee53df30d7@mail.gmail.com> On 11/25/05, Walter Dörwald <walter at livinglogic.de> wrote: > > Can you move the call to codecs.register_error() out of test_callbacks() > and retry? It then leaks 3 refs on each call to test_callbacks(). n -- >>> import codecs [24540 refs] >>> [24541 refs] >>> def handler(exc): ... l = [u"<%d>" % ord(exc.object[pos]) for pos in xrange(exc.start, exc.end)] ... return (u"[%s]" % u"".join(l), exc.end) ... [24575 refs] >>> codecs.register_error("test.handler", handler) [24579 refs] >>> [24579 refs] >>> def test_callbacks(): ...
# the {} is necessary to cause the leak ... codecs.charmap_decode("abc", "test.handler", {}) ... [24604 refs] >>> test_callbacks() [24608 refs] >>> test_callbacks() [24611 refs] >>> test_callbacks() [24614 refs] From jjl at pobox.com Sat Nov 26 17:14:29 2005 From: jjl at pobox.com (John J Lee) Date: Sat, 26 Nov 2005 16:14:29 +0000 (UTC) Subject: [Python-Dev] urlparse brokenness In-Reply-To: <20051123050455.9010E7FBF@place.org> References: <20051123050455.9010E7FBF@place.org> Message-ID: <Pine.LNX.4.58.0511261612330.6228@alice> On Tue, 22 Nov 2005, Paul Jimenez wrote: > It is my assertion that urlparse is currently broken. Specifically, I > think that urlparse breaks an abstraction boundary with ill effect. [...] I have some comments, but I can't see a patch on SF. Did you post it? John From reinhold-birkenfeld-nospam at wolke7.net Sat Nov 26 16:57:34 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat, 26 Nov 2005 16:57:34 +0100 Subject: [Python-Dev] Python 3 Message-ID: <dma473$pi$1@sea.gmane.org> Hi, don't know if this is known here, but it seems we have quite a long way to go: http://kuerzer.de/python3 Reinhold <wink> From jjl at pobox.com Sat Nov 26 19:48:57 2005 From: jjl at pobox.com (John J Lee) Date: Sat, 26 Nov 2005 18:48:57 +0000 (UTC) Subject: [Python-Dev] ast status, memory leaks, etc In-Reply-To: <dlvt41$cvl$1@sea.gmane.org> References: <ee2a432c0511131141s72fedecax29008fd783a3b0db@mail.gmail.com><ee2a432c0511191615y6259e95bwce68aec849a7ebfa@mail.gmail.com><438048B6.2030103@v.loewis.de><ee2a432c0511201614u1dadb3b2x419e3482ccf5b145@mail.gmail.com> <9ef20ef30511221148g905deefo548a8fb3e68a08ae@mail.gmail.com> <dlvt41$cvl$1@sea.gmane.org> Message-ID: <Pine.LNX.4.58.0511261845390.6228@alice> On Tue, 22 Nov 2005, Fredrik Lundh wrote: [...] 
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Misc/README.valgrind?view=markup The up-to-date version of that (from SVN instead of old CVS repository) is here: http://svn.python.org/view/python/trunk/Misc/README.valgrind?view=markup John From martin at v.loewis.de Sat Nov 26 22:36:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 26 Nov 2005 22:36:27 +0100 Subject: [Python-Dev] CVS repository mostly closed now Message-ID: <4388D55B.1070501@v.loewis.de> I tried removing the CVS repository from SF; it turns out that this operation is not supported. Instead, it is only possible to remove it from the project page; pserver and ssh access remain indefinitely, as does viewcvs. The recommended procedure is to place a file into the repository indicating the repository has moved; this is what I just did. Regards, Martin From noamraph at gmail.com Sun Nov 27 00:11:36 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 27 Nov 2005 01:11:36 +0200 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com> Message-ID: <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com> Three weeks ago, I read this and thought, "well, you have two options for a default comparison, one based on identity and one on value, both are useful sometimes and Guido prefers identity, and it's OK." But today I understood that I still think otherwise. In two sentences: sometimes you wish to compare objects according to "identity", and sometimes you wish to compare objects according to "values". 
Identity-based comparison is done by the "is" operator; value-based comparison should be done by the == operator. Let's take the car example, and expand it a bit. Let's say wheels have attributes - say, diameter and manufacturer. Let's say those can't change (which is reasonable), to make wheels hashable. There are two ways to compare wheels: by value and by identity. Two wheels may have the same value, that is, they have the same diameter and were created by the same manufacturer. Two wheels may have the same identity, that is, they are actually the same wheel. We may want to compare wheels based on value, for example to make sure that all the car's wheels fit together nicely: assert car.wheel1 == car.wheel2 == car.wheel3 == car.wheel4. We may want to compare wheels based on identity, for example to make sure that we actually bought four wheels in order to assemble the car: assert car.wheel1 is not car.wheel2 and car.wheel3 is not car.wheel1 and car.wheel3 is not car.wheel2... We may want to associate values with wheels based on their values. For example, it's reasonable to suppose that the price of every wheel of the same model is the same. In that case, we'll write: price[wheel] = 25. We may want to associate values with wheels based on their identities. For example, we may want to note that a specific wheel is broken. For this, I'll first define a general class (I defined it before in one of the discussions, that's because I believe it's useful): class Ref(object): def __init__(self, obj): self._obj = obj def __call__(self): return self._obj def __eq__(self, other): return isinstance(other, Ref) and self._obj is other._obj def __hash__(self): return id(self._obj) ^ 0xBEEF Now again, how will we say that a specific wheel is broken? Like this: broken[Ref(wheel)] = True Note that the Ref class also allows us to group wheels of the same kind in a set, regardless of their __hash__ method.
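The Ref idea can be exercised in a self-contained snippet (the toy Wheel class here is invented for illustration, with every wheel value-equal to every other):

```python
class Ref(object):
    """Wrap an object so that equality and hashing follow identity."""
    def __init__(self, obj):
        self._obj = obj
    def __call__(self):
        return self._obj
    def __eq__(self, other):
        return isinstance(other, Ref) and self._obj is other._obj
    def __hash__(self):
        return id(self._obj) ^ 0xBEEF

class Wheel(object):
    def __eq__(self, other):
        return isinstance(other, Wheel)  # all wheels compare equal by value
    def __hash__(self):
        return 42

w1, w2 = Wheel(), Wheel()
assert w1 == w2                       # value comparison: equal
assert Ref(w1) != Ref(w2)             # identity comparison: distinct
broken = {Ref(w1): True}
assert Ref(w1) in broken              # the same wheel is found again
assert Ref(w2) not in broken          # an equal-valued wheel is not
assert len({Ref(w1), Ref(w2)}) == 2   # a set can hold both wheels
```

This is exactly the "group wheels of the same kind in a set, regardless of their __hash__ method" use: the wrapper substitutes identity semantics without touching the wrapped class.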
I think that most objects, especially most user-defined objects, have a *value*. I don't have an exact definition, but a hint is that two objects that were created in the same way have the same value. Sometimes we wish to compare objects based on their identity - in those cases we use the "is" operator. Sometimes we wish to compare objects based on their value - and that's what the == operator is for. Sometimes we wish to use the value of objects as a dictionary key or as a set member, and that's easy. Sometimes we wish to use the identity of objects as a dictionary key or as a set member - and I claim that we should do that by using the Ref class, whose *value* is the object's *identity*, or by using a dict/set subclass, and not by misusing the __hash__ and __eq__ methods. I think that whenever value-based comparison is meaningful, the __eq__ and __hash__ should be value-based. Treating objects by identity should be done explicitly, by the one who uses the objects, by using the "is" operator or the Ref class. It should not be the job of the object to decide which method (value or identity) is more useful - it should allow the user to use both methods, by defining __eq__ and __hash__ based on value. Please give me examples which prove me wrong. I currently think that the only objects for whom value-based comparison is not meaningful, are objects which represent entities which are "outside" of the process, or in other words, entities which are not "computational". This includes files, sockets, possibly user-interface objects, loggers, etc. I think that objects that represent purely "data", have a "value" that they can be compared according to. Even wheels that don't have any attributes are simply equal to other wheels, and not equal to other objects. Since user-defined classes can interact with the "environment" only through other objects or functions, it is reasonable to suggest that they should get a value-based equality operator. 
Many times the value is defined by the __dict__ and __slots__ members, so it seems to me a reasonable default. I would greatly appreciate repliers that find a tiny bit of reason in what I said (even if they don't agree), and not deny it all as a complete load of rubbish. Thanks, Noam From martin at v.loewis.de Sun Nov 27 00:48:50 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 27 Nov 2005 00:48:50 +0100 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com> <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com> Message-ID: <4388F462.1090808@v.loewis.de> Noam Raphael wrote: > I would greatly appreciate repliers that find a tiny bit of reason in > what I said (even if they don't agree), and not deny it all as a > complete load of rubbish. I don't understand what your message is. With this posting, did you suggest that somebody does something specific? If so, who is that one, and what should he do? Anyway, a lot of your posting is what I thought was common knowledge; and with some of it, I disagree. > In two sentences: sometimes you wish to compare objects according to > "identity", and sometimes you wish to compare objects according to > "values". Identity-based comparison is done by the "is" operator; > Value-based comparison should be done by the == operator. Certainly. > We may want to compare wheels based on value, for example to make sure > that all the car's wheels fit together nicely: assert car.wheel1 == > car.wheel2 == car.wheel3 == car.wheel4. I would never write it that way. 
This would suggest that the wheels have to be "the same". However, this is certainly not true for wheels: they only have to be of the same make. Now, you write that wheels only carry manufacturer and diameter. However, I would expect that wheels grow additional attributes over time, like whether they are left or right, and what their wear level is. So to write your property, I would write car.wheel1.manufacturer_and_make() == car.wheel2.manufacturer_and_make() == car.wheel3.manufacturer_and_make() == car.wheel4.manufacturer_and_make() > We may want to associate values with wheels based on their values. For > example, it's reasonable to suppose that the price of every wheel of > the same model is the same. In that case, we'll write: price[wheel] = > 25. Again, I would not write it this way. I would find wheel.price() most natural. If I have the notion of a price list, then I would try to understand what the price list is keyed by, e.g. model number: price[wheel.model] = 25 > Now again, how will we say that a specific wheel is broken? Like this: > > broken[Ref(wheel)] = True If I want things to be keyed by identity, I would write broken = IdentityDictionary() ... broken[wheel] = True although I would prefer to write wheel.broken = True > I think that most objects, especially most user-defined objects, have > a *value*. I don't have an exact definition, but a hint is that two > objects that were created in the same way have the same value. Here I disagree. Consider the wheel example. I would expect that a wheel has a "wear level" or some such, and that this changes over time, and that it belongs to the "value" of the wheel ("value" being synonymous with "state"). As this changes over time, it is certainly not the case that the object is created with that value. Think of lists: what is their value? Are they created with it?
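IdentityDictionary is not an existing stdlib class; a minimal sketch of what such a mapping could look like (names and details invented for illustration) is:

```python
# Hypothetical IdentityDictionary: keys are matched by identity ("is"),
# not by value (==).  The real key object is kept alive alongside the
# value so that its id() cannot be recycled for a different object.
class IdentityDictionary(object):
    def __init__(self):
        self._items = {}                      # id(key) -> (key, value)

    def __setitem__(self, key, value):
        self._items[id(key)] = (key, value)

    def __getitem__(self, key):
        return self._items[id(key)][1]

    def __contains__(self, key):
        return id(key) in self._items

class Wheel(object):
    def __eq__(self, other):
        return isinstance(other, Wheel)       # all wheels are value-equal
    def __hash__(self):
        return 42

w1, w2 = Wheel(), Wheel()
broken = IdentityDictionary()
broken[w1] = True
assert w1 == w2             # equal by value...
assert w1 in broken
assert w2 not in broken     # ...but distinct by identity
```

Keeping the key object referenced is the same precaution Noam's Ref takes implicitly: a bare id() key would become ambiguous once the object is garbage-collected and its address reused.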
> Sometimes we wish to use the > identity of objects as a dictionary key or as a set member - and I > claim that we should do that by using the Ref class, whose *value* is > the object's *identity*, or by using a dict/set subclass, and not by > misusing the __hash__ and __eq__ methods. I think we should use a specific type of dictionary then. > I think that whenever value-based comparison is meaningful, the __eq__ > and __hash__ should be value-based. Treating objects by identity > should be done explicitly, by the one who uses the objects, by using > the "is" operator or the Ref class. It should not be the job of the > object to decide which method (value or identity) is more useful - it > should allow the user to use both methods, by defining __eq__ and > __hash__ based on value. If objects are compared for value equality, the object should decide which part of its state goes into that comparison. It may be that two objects compare equal even though their state is memberwise different: Rational(1,2) == Rational(5,10) > Please give me examples which prove me wrong. I currently think that > the only objects for whom value-based comparison is not meaningful, > are objects which represent entities which are "outside" of the > process, or in other words, entities which are not "computational". You mean, things of the real world, right? Like people, bank accounts, and wheels.
Regards, Martin From pedronis at strakt.com Sun Nov 27 01:13:28 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Sun, 27 Nov 2005 01:13:28 +0100 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com> <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com> Message-ID: <4388FA28.5080800@strakt.com> Noam Raphael wrote: > Three weeks ago, I read this and thought, "well, you have two options > for a default comparison, one based on identity and one on value, both > are useful sometimes and Guido prefers identity, and it's OK." But > today I understood that I still think otherwise. > well, this still belongs to comp.lang.python. > In two sentences: sometimes you wish to compare objects according to > "identity", and sometimes you wish to compare objects according to > "values". Identity-based comparison is done by the "is" operator; > Value-based comparison should be done by the == operator. > > Let's take the car example, and expand it a bit. Let's say wheels have > attributes - say, diameter and manufacturer. Let's say those can't > change (which is reasonable), to make wheels hashable. There are two > ways to compare wheels: by value and by identity. Two wheels may have > the same value, that is, they have the same diameter and were created > by the same manufacturer. Two wheels may have the same identity, that > is, they are actually the same wheel. > > We may want to compare wheels based on value, for example to make sure > that all the car's wheels fit together nicely: assert car.wheel1 == > car.wheel2 == car.wheel3 == car.wheel4. 
We may want to compare wheels > based on identity, for example to make sure that we actually bought > four wheels in order to assemble the car: assert car.wheel1 is not > car.wheel2 and car.wheel3 is not car.wheel1 and car.wheel3 is not > car.wheel2... > > We may want to associate values with wheels based on their values. For > example, it's reasonable to suppose that the price of every wheel of > the same model is the same. In that case, we'll write: price[wheel] = > 25. We may want to associate values with wheels based on their > identities. For example, we may want to note that a specific wheel is > broken. For this, I'll first define a general class (I defined it > before in one of the discussions, that's because I believe it's > useful): > > class Ref(object): > def __init__(self, obj): > self._obj = obj > def __call__(self): > return self._obj > def __eq__(self, other): > return isinstance(other, ref) and self._obj is other._obj > def __hash__(self): > return id(self._obj) ^ 0xBEEF > > Now again, how will we say that a specific wheel is broken? Like this: > > broken[Ref(wheel)] = True > > Note that the Ref class also allows us to group wheels of the same > kind in a set, regardless of their __hash__ method. > > I think that most objects, especially most user-defined objects, have > a *value*. I don't have an exact definition, but a hint is that two > objects that were created in the same way have the same value. > Sometimes we wish to compare objects based on their identity - in > those cases we use the "is" operator. Sometimes we wish to compare > objects based on their value - and that's what the == operator is for. > Sometimes we wish to use the value of objects as a dictionary key or > as a set member, and that's easy. 
Sometimes we wish to use the > identity of objects as a dictionary key or as a set member - and I > claim that we should do that by using the Ref class, whose *value* is > the object's *identity*, or by using a dict/set subclass, and not by > misusing the __hash__ and __eq__ methods. > > I think that whenever value-based comparison is meaningful, the __eq__ > and __hash__ should be value-based. Treating objects by identity > should be done explicitly, by the one who uses the objects, by using > the "is" operator or the Ref class. It should not be the job of the > object to decide which method (value or identity) is more useful - it > should allow the user to use both methods, by defining __eq__ and > __hash__ based on value. > > Please give me examples which prove me wrong. I currently think that > the only objects for whom value-based comparison is not meaningful, > are objects which represent entities which are "outside" of the > process, or in other words, entities which are not "computational". > This includes files, sockets, possibly user-interface objects, > loggers, etc. I think that objects that represent purely "data", have > a "value" that they can be compared according to. Even wheels that > don't have any attributes are simply equal to other wheels, and not > equal to other objects. Since user-defined classes can interact with > the "environment" only through other objects or functions, it is > reasonable to suggest that they should get a value-based equality > operator. Many times the value is defined by the __dict__ and > __slots__ members, so it seems to me a reasonable default. > > I would greatly appreciate repliers that find a tiny bit of reason in > what I said (even if they don't agree), and not deny it all as a > complete load of rubbish. > not if you think python-dev is a forum for such discussions on OO thinking vs other paradigms. 
> Thanks,
> Noam
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/pedronis%40strakt.com

From rhamph at gmail.com  Sun Nov 27 01:25:15 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 26 Nov 2005 17:25:15 -0700
Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison
In-Reply-To: <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
Message-ID: <aac2c7cb0511261625p6cdefb6epce8fc1e30e99b1c7@mail.gmail.com>

On 11/26/05, Noam Raphael <noamraph at gmail.com> wrote:
> [...stuff about using Ref() for identity dictionaries...]

I too have thought along these lines, but I went one step further.
There is an existing function that could be modified to produce Ref
objects: id().

Making id() into a type allows it to force unsignedness, incorporate a
method for easy printing, maintain a reference to the target so that
"id(x.foo) == id(x.bar)" doesn't risk reusing the same id.. and the id
object would be the same size as an int object is today. I don't see
any disadvantage, except perhaps code that assumes id() returns an
int. That could be fixed by having id() subclass int for a few
versions while we transition, although that may require we store the
pointer separate from the integer value.

id() would be usable in dicts as a value, behaving as Noam suggests
that Ref behave. Kills two birds with one stone.
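A rough sketch of what an id-object along these lines could look like — entirely hypothetical, since id() in CPython returns a plain integer; the 2**32 wraparound below merely stands in for whatever unsignedness fix the real type would use:

```python
class Id(object):
    """Hypothetical id-as-object: printable, hashable, and holding a
    reference to its target so the address cannot be reused."""
    def __init__(self, obj):
        self._obj = obj                 # keeps the target alive
        self._value = id(obj)
        if self._value < 0:             # force unsignedness on platforms
            self._value += 2 ** 32      # where id() can come back negative
    def __eq__(self, other):
        return isinstance(other, Id) and self._obj is other._obj
    def __ne__(self, other):
        return not self == other
    def __hash__(self):
        return self._value
    def __repr__(self):
        return "Id(0x%x)" % self._value

a, b = object(), object()
assert Id(a) == Id(a) and Id(a) != Id(b)
seen = {Id(a): "first"}                 # identity-keyed, like Noam's Ref
assert Id(a) in seen and Id(b) not in seen
```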
--
Adam Olsen, aka Rhamphoryncus

From ncoghlan at gmail.com  Sun Nov 27 03:09:37 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Nov 2005 12:09:37 +1000
Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison
In-Reply-To: <aac2c7cb0511261625p6cdefb6epce8fc1e30e99b1c7@mail.gmail.com>
References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com>
	<5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com>
	<5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com>
	<5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com>
	<ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com>
	<b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com>
	<aac2c7cb0511261625p6cdefb6epce8fc1e30e99b1c7@mail.gmail.com>
Message-ID: <43891561.4080604@gmail.com>

Adam Olsen wrote:
> On 11/26/05, Noam Raphael <noamraph at gmail.com> wrote:
>> [...stuff about using Ref() for identity dictionaries...]
>
> I too have thought along these lines, but I went one step further.
> There is an existing function that could be modified to produce Ref
> objects: id().
>
> Making id() into a type allows it to force unsignedness, incorporate a
> method for easy printing, maintain a reference to the target so that
> "id(x.foo) == id(x.bar)" doesn't risk reusing the same id.. and the id
> object would be the same size as an int object is today. I don't see
> any disadvantage, except perhaps code that assumes id() returns an
> int. That could be fixed by having id() subclass int for a few
> versions while we transition, although that may require we store the
> pointer separate from the integer value.
>
> id() would be usable in dicts as a value, behaving as Noam suggests
> that Ref behave. Kills two birds with one stone.

I've occasionally considered the concept of a "Ref" class - usually when I
want to be able to access a value in multiple places, and have them all track
rebinding operations.
You can't do it perfectly (you need to rebind the attribute directly because
objects aren't notified of name rebinding) but you can get pretty close
(because objects *are* notified of augmented assignment).

However, re-using id() for this doesn't seem like the right approach.

Cheers,
Nick.

P.S. Yes, those musings were prompted at least in part by Paul Graham's
ramblings ;) The sample version below obviously misses out all the slots it
would actually need to delegate to get correct behaviour.

Py> class Ref(object):
...     def __init__(self, val):
...         self._val = val
...     def __str__(self):
...         return str(self._val)
...     def __repr__(self):
...         return "%s(%s)" % (type(self).__name__, repr(self._val))
...     def __iadd__(self, other):
...         self._val += other
...         return self
...
Py> n = Ref(1)
Py> i = n
Py> n += 2
Py> n
Ref(3)
Py> i
Ref(3)
Py> def make_accum(n):
...     def accum(i, n=Ref(n)):
...         n += i
...         return n._val
...     return accum
...
Py> acc = make_accum(3)
Py> acc(1)
4
Py> acc(5)
9

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From kbk at shore.net  Sun Nov 27 06:26:03 2005
From: kbk at shore.net (Kurt B.
Kaiser) Date: Sun, 27 Nov 2005 00:26:03 -0500 (EST) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200511270526.jAR5Q3Mh017757@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 372 open ( -7) / 2980 closed (+12) / 3352 total ( +5) Bugs : 908 open ( -2) / 5395 closed (+11) / 6303 total ( +9) RFE : 200 open ( +0) / 191 closed ( +0) / 391 total ( +0) New / Reopened Patches ______________________ CodeContext - Improved text indentation (2005-11-21) http://python.org/sf/1362975 opened by Tal Einat test_cmd_line expecting English error messages (2005-11-23) CLOSED http://python.org/sf/1364545 opened by A.B., Khalid Add reference for en/decode error types (2005-11-23) CLOSED http://python.org/sf/1364946 opened by Wummel [PATCH] mmap fails on AMD64 (2005-11-24) http://python.org/sf/1365916 opened by Joe Wreschnig Patches Closed ______________ zlib.crc32 doesn't handle 0xffffffff seed (2005-11-07) http://python.org/sf/1350573 closed by akuchling xml.dom.minidom.Node.replaceChild(obj, x, x) removes child x (2005-01-01) http://python.org/sf/1094164 closed by akuchling Patch for (Doc) #1255218 (2005-10-17) http://python.org/sf/1328526 closed by birkenfeld Patch for (Doc) #1261659 (2005-10-17) http://python.org/sf/1328566 closed by birkenfeld Patch for (Doc) #1357604 (2005-11-18) http://python.org/sf/1359879 closed by birkenfeld CallTip Modifications (2005-05-11) http://python.org/sf/1200038 closed by kbk ensure lock is released if exception is raised (2005-10-05) http://python.org/sf/1314396 closed by bcannon test_cmd_line expecting English error messages (2005-11-23) http://python.org/sf/1364545 closed by doerwalter ToolTip.py: fix main() function (2005-10-06) http://python.org/sf/1315161 closed by kbk Add reference for en/decode error types (2005-11-23) http://python.org/sf/1364946 closed by doerwalter solaris 10 should not define _XOPEN_SOURCE_EXTENDED (2005-06-27) http://python.org/sf/1227966 closed by loewis Solaris 10 fails to 
compile complexobject.c [FIX incl.] (2005-02-05) http://python.org/sf/1116722 closed by loewis New / Reopened Bugs ___________________ textwrap.dedent() expands tabs (2005-11-19) http://python.org/sf/1361643 opened by Steven Bethard Text.edit_modified() doesn't work (2005-11-20) http://python.org/sf/1362475 opened by Ron Provost Problem with tapedevices and the tarfile module (2005-11-21) http://python.org/sf/1362587 opened by Henrik spawnlp is missing (2005-11-21) http://python.org/sf/1363104 opened by Greg MacDonald A possible thinko in the description of os/chmod (2005-11-22) CLOSED http://python.org/sf/1363712 opened by Evgeny Roubinchtein urllib cannot open data: urls (2005-11-25) CLOSED http://python.org/sf/1365984 opened by Warren Butler Bug bz2.BZ2File(...).seek(0,2) (2005-11-25) http://python.org/sf/1366000 opened by STINNER Victor inoorrect documentation for optparse (2005-11-25) http://python.org/sf/1366250 opened by Michael Dunn SRE engine do not release the GIL (2005-11-25) http://python.org/sf/1366311 opened by Eric Noyau inspect.getdoc fails on objs that use property for __doc__ (2005-11-26) http://python.org/sf/1367183 opened by Drew Perttula Bugs Closed ___________ A possible thinko in the description of os.chmod (2005-11-22) http://python.org/sf/1363712 closed by birkenfeld docs need to discuss // and __future__.division (2001-08-08) http://python.org/sf/449093 closed by akuchling Prefer configured browser over Mozilla and friends (2005-11-17) http://python.org/sf/1359150 closed by birkenfeld Incorrect documentation of raw unidaq string literals (2005-11-17) http://python.org/sf/1359053 closed by birkenfeld "appropriately decorated" is undefined in MultiFile.push doc (2005-08-09) http://python.org/sf/1255218 closed by birkenfeld Tutorial doesn't cover * and ** function calls (2005-08-17) http://python.org/sf/1261659 closed by birkenfeld os.path.makedirs DOES handle UNC paths (2005-11-15) http://python.org/sf/1357604 closed by birkenfeld Exec 
Inside A Function (2005-04-06) http://python.org/sf/1177811 closed by birkenfeld Py_BuildValue k format units don't work with big values (2005-09-04) http://python.org/sf/1281408 closed by birkenfeld urllib cannot open data: urls (2005-11-25) http://python.org/sf/1365984 closed by birkenfeld imaplib: parsing INTERNALDATE (2003-03-06) http://python.org/sf/698706 closed by birkenfeld From noamraph at gmail.com Sun Nov 27 20:04:25 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 27 Nov 2005 21:04:25 +0200 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <4388F462.1090808@v.loewis.de> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com> <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com> <4388F462.1090808@v.loewis.de> Message-ID: <b348a0850511271104q387ece75sc75b186b96bd792f@mail.gmail.com> On 11/27/05, "Martin v. L?wis" <martin at v.loewis.de> wrote: > Noam Raphael wrote: > > I would greatly appreciate repliers that find a tiny bit of reason in > > what I said (even if they don't agree), and not deny it all as a > > complete load of rubbish. > > I don't understand what your message is. With this posting, did you > suggest that somebody does something specific? If so, who is that one, > and what should he do? Perhaps I felt a bit attacked. It was probably my fault, and anyway, a general message like this is not the proper way - I'm sorry. > > Anyway, a lot of your posting is what I thought was common knowledge; > and with some of it, I disagree. This is fine, of course. 
> > We may want to compare wheels based on value, for example to make sure > > that all the car's wheels fit together nicely: assert car.wheel1 == > > car.wheel2 == car.wheel3 == car.wheel4. > > I would never write it that way. This would suggest that the wheels > have to be "the same". However, this is certainly not true for wheels: > they have to be of the same make. Now, you write that wheels > only carry manufacturer and diameter. However, I would expect that > wheels grow additional attributes over time, like whether they are > left or right, and what their wear level is. So to write your property, > I would write > > car.wheel1.manufacturer_and_make() == > car.wheel2.manufacturer_and_make() == > car.wheel3.manufacturer_and_make() == > car.wheel4.manufacturer_and_make() > You may be right in the case of wheels. From time to time, in the real (programming) world, I encounter objects that I wish to compare by value - this is certainly the case for built-in objects, but is sometimes the case for more complex objects. > > We may want to associate values with wheels based on their values. For > > example, it's reasonable to suppose that the price of every wheel of > > the same model is the same. In that case, we'll write: price[wheel] = > > 25. > > Again, I would not write it this way. I would find > > wheel.price() Many times the objects are not yours to add attributes, or may have __slots__ defined. The truth is that I prefer not to add attributes to external objects even when it's possible. > > most natural. If I have the notion of a price list, then I would > try to understand what the price list is keyed-by, e.g. model number: > > price[wheel.model] = 25 > Sometimes there's no "key" - it's just the state of the object (what if wheels don't have a model number?) > > Now again, how will we say that a specific wheel is broken? 
Like this: > > > > broken[Ref(wheel)] = True > > If I want things to be keyed by identity, I would write > > broken = IdentityDictionary() > ... > broken[wheel] = True > > although I would prefer to write > > wheel.broken = True > I personally prefer the first method, but the second one is ok too. > > I think that most objects, especially most user-defined objects, have > > a *value*. I don't have an exact definition, but a hint is that two > > objects that were created in the same way have the same value. > > Here I disagree. Consider the wheel example. I would expect that > a wheel has a "wear level" or some such, and that this changes over > time, and that it belongs to the "value" of the wheel ("value" > being synonym to "state"). As this changes over time, it is certainly > not that the object is created with that value. > > Think of lists: what is their value? Are they created with it? > My tongue failed me. I meant: created in the same way = have gone through the same series of actions. That is: a = []; a.append(5); a.extend([2,1]); a.pop() b = []; b.append(5); b.extend([2,1]); b.pop() a == b > > Sometimes we wish to use the > > identity of objects as a dictionary key or as a set member - and I > > claim that we should do that by using the Ref class, whose *value* is > > the object's *identity*, or by using a dict/set subclass, and not by > > misusing the __hash__ and __eq__ methods. > > I think we should use a specific type of dictionary then. That's OK too. My point was that the one who uses the objects should explicitly specify whether he means value-based or identity-based lookup. This means that if an object has a "value", it should not make __eq__ and __hash__ be identity-based just to make identity-based lookup easier and implicit. > > > I think that whenever value-based comparison is meaningful, the __eq__ > > and __hash__ should be value-based. 
Treating objects by identity > > should be done explicitly, by the one who uses the objects, by using > > the "is" operator or the Ref class. It should not be the job of the > > object to decide which method (value or identity) is more useful - it > > should allow the user to use both methods, by defining __eq__ and > > __hash__ based on value. > > If objects are compared for value equality, the object should decide > which part of its state goes into that comparison. It may be that > two objects compare equal even though their state is memberwise > different: > > Rational(1,2) == Rational(5,10) > I completely agree. Indeed, the "value of an object" is in many times not "the value of all its attributes". > > Please give me examples which prove me wrong. I currently think that > > the only objects for whom value-based comparison is not meaningful, > > are objects which represent entities which are "outside" of the > > process, or in other words, entities which are not "computational". > > You mean, things of the real world, right? Like people, bank accounts, > and wheels. No, I meant real programming examples. My theory is that most user-defined classes have a "value", and those that don't are related to I/O, in some sort of a broad definition of the term. I may be wrong, so I ask for counter-examples. 
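The IdentityDictionary Martin mentions above is not a stdlib class; a minimal sketch of one (keying on id() while holding the key alive so its id cannot be recycled) might read:

```python
class IdentityDictionary(dict):
    """Sketch of a dict keyed on object identity rather than equality."""
    def __setitem__(self, key, value):
        # Store the key alongside the value so it stays alive; a dead
        # key's id() could otherwise be reused by a new object.
        dict.__setitem__(self, id(key), (key, value))
    def __getitem__(self, key):
        return dict.__getitem__(self, id(key))[1]
    def __contains__(self, key):
        return dict.__contains__(self, id(key))

a, b = [], []              # equal by value, distinct by identity
broken = IdentityDictionary()
broken[a] = True
assert a in broken and b not in broken
assert broken[a] is True
```

Note that lists are unhashable, so a plain dict could not key on them at all; keying on id() sidesteps the target's __hash__ entirely, which is the point.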
Thanks for your reply, Noam From noamraph at gmail.com Sun Nov 27 20:14:15 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 27 Nov 2005 21:14:15 +0200 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <4388FA28.5080800@strakt.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com> <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com> <4388FA28.5080800@strakt.com> Message-ID: <b348a0850511271114g1193090fwa5cff444d2fb8b02@mail.gmail.com> On 11/27/05, Samuele Pedroni <pedronis at strakt.com> wrote: > well, this still belongs to comp.lang.python. ... > not if you think python-dev is a forum for such discussions > on OO thinking vs other paradigms. Perhaps my style made it look like a discussion on OO thinking vs other paradigms, but my conclusion is exactly about the issue of this thread - Jim suggested to drop default __hash__ and __eq__ for Python 3K. Guido decided not to, because it's useful to use them for identity-based comparison and lookup. I say that I disagree, because I think that __hash__ and __eq__ should be used for value-based comparison and lookup, and because if the user of the object does explicit identity-based comparison/lookup, it doesn't matter to him whether __hash__ and __eq__ are defined or not. I also suggested, in a way, that it's OK to define a default value-based __eq__ method. 
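The default value-based __eq__ alluded to — equality derived from __dict__, as suggested earlier in the thread — can be sketched as a mixin (my illustration, not anything actually posted):

```python
class ValueEq(object):
    """Value-based equality and hashing derived from type and __dict__.
    Assumes all attribute values are themselves hashable."""
    def __eq__(self, other):
        return type(self) is type(other) and self.__dict__ == other.__dict__
    def __ne__(self, other):
        return not self == other
    def __hash__(self):
        return hash((type(self), tuple(sorted(self.__dict__.items()))))

class Wheel(ValueEq):
    def __init__(self, manufacturer, diameter):
        self.manufacturer = manufacturer
        self.diameter = diameter

assert Wheel('Acme', 17) == Wheel('Acme', 17)
assert Wheel('Acme', 17) != Wheel('Acme', 16)
assert len({Wheel('Acme', 17), Wheel('Acme', 17)}) == 1
```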
Noam From arigo at tunes.org Sun Nov 27 21:00:38 2005 From: arigo at tunes.org (Armin Rigo) Date: Sun, 27 Nov 2005 21:00:38 +0100 Subject: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison In-Reply-To: <b348a0850511271104q387ece75sc75b186b96bd792f@mail.gmail.com> References: <436E2C3E.7060807@zope.com> <436E6A0E.4070508@pobox.com> <5.1.1.6.0.20051106162127.01ede358@mail.telecommunity.com> <5.1.1.6.0.20051106191059.01edcf78@mail.telecommunity.com> <5.1.1.6.0.20051106191251.01fa9818@mail.telecommunity.com> <ca471dc20511070910u3e2e7ea6o6e98b46357a1af5c@mail.gmail.com> <b348a0850511261511q64ed5e6dxa8366af22846fe9a@mail.gmail.com> <4388F462.1090808@v.loewis.de> <b348a0850511271104q387ece75sc75b186b96bd792f@mail.gmail.com> Message-ID: <20051127200038.GA7033@code1.codespeak.net> Hi Noam, On Sun, Nov 27, 2005 at 09:04:25PM +0200, Noam Raphael wrote: > No, I meant real programming examples. My theory is that most > user-defined classes have a "value", and those that don't are related > to I/O, in some sort of a broad definition of the term. I may be > wrong, so I ask for counter-examples. In the source code base of PyPy, trying to count only what we really wrote and not external tools, I found 19 classes defining __eq__ on a total of 1413. There must be close to zero classes that have anything to do with I/O in there. If anything, this proves that the default comparison for classes is absolutely fine and nothing needs to be fixed in the Python language. Please move this discussion outside python-dev. 
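A census like Armin's can be approximated in a few lines; this is a crude textual count (my sketch, not whatever he actually ran) and will miss dynamically generated classes:

```python
import os
import re
import tempfile

def count_classes(root):
    """Count 'class' statements and __eq__ definitions under a source tree."""
    classes = eqs = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith('.py'):
                with open(os.path.join(dirpath, name)) as f:
                    text = f.read()
                classes += len(re.findall(r'(?m)^\s*class\s', text))
                eqs += len(re.findall(r'(?m)^\s*def __eq__\b', text))
    return classes, eqs

# Tiny demonstration tree: two classes, one of which defines __eq__.
tree = tempfile.mkdtemp()
with open(os.path.join(tree, 'demo.py'), 'w') as f:
    f.write('class A(object):\n'
            '    def __eq__(self, other):\n'
            '        return True\n'
            'class B(object):\n'
            '    pass\n')
assert count_classes(tree) == (2, 1)
```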
Armin From guido at python.org Mon Nov 28 03:24:12 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 27 Nov 2005 18:24:12 -0800 Subject: [Python-Dev] urlparse brokenness In-Reply-To: <20051123050455.9010E7FBF@place.org> References: <20051123050455.9010E7FBF@place.org> Message-ID: <ca471dc20511271824k1e227bdeo594559904b9894fe@mail.gmail.com> On 11/22/05, Paul Jimenez <pj at place.org> wrote: > > It is my assertion that urlparse is currently broken. Specifically, I > think that urlparse breaks an abstraction boundary with ill effect. IIRC I did it this way because the RFC about parsing urls specifically prescribed it had to be done this way. Maybe there's a newer RFC with different rules? > In writing a mailclient, I wished to allow my users to specify their > imap server as a url, such as 'imap://user:password at host:port/'. Which > worked fine. I then thought that the natural extension to support > configuration of imapssl would be 'imaps://user:password at host:port/'.... > which failed - user:password at host:port got parsed as the *path* of > the URL instead of the network location. It turns out that urlparse > keeps a table of url schemes that 'use netloc'... that is to say, > that have a 'user:password at host:port' part to their URL. I think this > 'special knowledge' about particular schemes 1) breaks an abstraction > boundary by having a function whose charter is to pull apart a > particularly-formatted string behave differently based on the meaning of > the string instead of the structure of it I disagree. You have to know what the scheme means before you can parse the rest -- there is (by design!) no standard parsing for anything that follows the scheme and the colon. I don't even think that you can trust that if the colon is followed by two slashes that what follows is a netloc for all schemes. But if there's an RFC that says otherwise I'll gladly concede; urlparse's main goal in life is to be RFC compliant. Is your opinion based on an RFC? 
> and 2) fails to be extensible > or forward compatible due to hardcoded 'magic' strings - if schemes were > somehow 'registerable' as 'netloc using' or not, then this objection > might be nullified, but the previous objection would still stand. I think it is reasonable to propose an extension whereby one can register a parser (or parsing flags like uses_netloc) for a specific scheme, presuming there won't be conflicting registrations (which should only happen if two independently developed libraries have a different use for the same scheme -- a failure of standardization). > So I propose that urlsplit, the main offender, be replaced with something > that looks like: > > def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')): Since you don't present your new code in diff format, could you explain in English how what it does differs from the original? Or perhaps you could present some unit tests (doctest would be ideal) showing the desired behavior of the proposed code (I understand from later posts that it may have some bugs). (For example, why add the default parameter?) > Note that I'm not sold on the _parse_cache, but I'm assuming it was there > for a reason so I'm leaving that functionality as-is. There's also a special case for http; given that the code is rather general and hence slow, it makes sense that it attempts some optimizations, and removing these might cause a nasty surprise for some users. > If this isn't the right forum for this discussion, or the right place to > submit code, please let me know. Please do submit patches to SF if you want then to be discussed. > Also, please cc: me directly on responses > as I'm not subscribed to the firehose that is python-dev. ACK. 
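For what it's worth, the scheme table Paul ran into is a plain module-level list in urlparse (urllib.parse in today's Python), so a scheme can already be registered as netloc-using by mutating it — a pragmatic workaround rather than the sanctioned registration API Guido sketches (and later versions of urlsplit treat '//' generically anyway):

```python
from urllib import parse  # the 'urlparse' module of this thread, renamed

# uses_netloc is an ordinary list; appending to it registers a scheme.
if 'imaps' not in parse.uses_netloc:
    parse.uses_netloc.append('imaps')

parts = parse.urlsplit('imaps://user:password@host:993/mailbox')
assert parts.netloc == 'user:password@host:993'
assert parts.hostname == 'host'
assert parts.port == 993
assert parts.path == '/mailbox'
```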
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From mike at skew.org Mon Nov 28 06:07:08 2005 From: mike at skew.org (Mike Brown) Date: Sun, 27 Nov 2005 22:07:08 -0700 (MST) Subject: [Python-Dev] urlparse brokenness In-Reply-To: <ca471dc20511271824k1e227bdeo594559904b9894fe@mail.gmail.com> Message-ID: <200511280507.jAS578np069306@chilled.skew.org> Guido van Rossum wrote: > IIRC I did it this way because the RFC about parsing urls specifically > prescribed it had to be done this way. That was true as of RFC 1808 (1995-1998), although the grammar actually allowed for a more generic interpretation. Such an interpretation was suggested in RFC 2396 (1998-2004) via a regular expression for parsing URI 'references' (a formal abstraction introduced in 2396) into 5 components (not six, since 'params' were moved into 'path' and eventually became an option on every path segment, not just the end of the path). The 5 components are: scheme, authority (formerly netloc), path, query, fragment. Parsing could result in some components being undefined, which is distinct from being empty (e.g., 'mailto:foo at bar?' would have an undefined authority and fragment, and a defined, but empty, query). RFC 3986 / STD 66 (2005-) did not change the regular expression, but makes several references to these '5 major components' of a URI, and says that these components are scheme-independent; parsers that operate at the generic syntax level "can parse any URI reference into its major components. Once the scheme is determined, further scheme-specific parsing can be performed on the components." > You have to know what the scheme means before you can > parse the rest -- there is (by design!) no standard parsing for > anything that follows the scheme and the colon. Not since 1998, IMHO. It was implicit, at least since RFC 2396, that all URI references can be interpreted as having the 5 components, it was made explicit in RFC 3986 / STD 66. 
> I don't even think > that you can trust that if the colon is followed by two slashes that > what follows is a netloc for all schemes. You can. > But if there's an RFC that says otherwise I'll gladly concede; > urlparse's main goal in life is to b RFC compliant. Its intent seems to be to split a URI into its major components, which are now by definition scheme-independent (and have been, implicitly, for a long time), so the function shouldn't distinguish between schemes. Do you want to keep returning that 6-tuple, or can we make it return a 5-tuple? If we keep returning 'params' for backward compatibility, then that means the 'path' we are returning is not the 'path' that people would expect (they'll have to concatenate path+params to get what the generic syntax calls a 'path' nowadays). It's also deceptive because params are now allowed on all path segments, and the current function only takes them from the last segment. Also for backward compatibility, should an absent component continue to manifest in the result as an empty string? I think a compliant parser should make a distinction between absent and empty (it could make a difference, in theory). If a regular expression were used for parsing, it would produce None for absent components and empty-string for empty ones. I implemented it this way in 4Suite's Ft.Lib.Uri and it works nicely. 
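The scheme-independent split Mike describes is spelled out as a regular expression in Appendix B of RFC 3986; a small sketch of using it, preserving the distinction between absent (None) and empty ('') components:

```python
import re

# The parsing expression from RFC 3986, Appendix B, with named groups.
URI_RE = re.compile(
    r'^(?:(?P<scheme>[^:/?#]+):)?'
    r'(?://(?P<authority>[^/?#]*))?'
    r'(?P<path>[^?#]*)'
    r'(?:\?(?P<query>[^#]*))?'
    r'(?:#(?P<fragment>.*))?$')

def urisplit(uri):
    """Split a URI reference into its 5 major components."""
    m = URI_RE.match(uri)
    return (m.group('scheme'), m.group('authority'), m.group('path'),
            m.group('query'), m.group('fragment'))

# Scheme-independent: 'imaps' gets its authority just as 'http' would.
assert urisplit('imaps://user:pw@host:993/') == \
    ('imaps', 'user:pw@host:993', '/', None, None)
# Mike's example: authority and fragment undefined, query defined but empty.
assert urisplit('mailto:foo@bar?') == ('mailto', None, 'foo@bar', '', None)
```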
Mike

From ncoghlan at iinet.net.au  Mon Nov 28 12:26:53 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon, 28 Nov 2005 21:26:53 +1000
Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343
Message-ID: <438AE97D.2050600@iinet.net.au>

Given the current semantics of PEP 343 and the following class:

class null_context(object):
    def __context__(self):
        return self
    def __enter__(self):
        return self
    def __exit__(self, *exc_info):
        pass

Mistakenly writing:

with null_context:
    # Oops, passed the class instead of an instance

Would give a less than meaningful error message:

TypeError: unbound method __context__() must be called with null_context
instance as first argument (got nothing instead)

It's the usual metaclass problem with invoking a slot (or slot equivalent) via
"obj.__slot__()" rather than via "type(obj).__slot__(obj)" the way the
underlying C code does.

I think we need to fix the proposed semantics so that they access the slots
via the type, rather than directly through the instance. Otherwise the slots
for the with statement will behave strangely when compared to the slots for
other magic methods.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Mon Nov 28 15:53:35 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Nov 2005 06:53:35 -0800
Subject: [Python-Dev] urlparse brokenness
In-Reply-To: <200511280507.jAS578np069306@chilled.skew.org>
References: <ca471dc20511271824k1e227bdeo594559904b9894fe@mail.gmail.com>
	<200511280507.jAS578np069306@chilled.skew.org>
Message-ID: <ca471dc20511280653r7520fc6av10fcda4ff217958c@mail.gmail.com>

OK, you've convinced me. But for backwards compatibility (until Python
3000), a new API should be designed. We can't change the old API in an
incompatible way. Please submit complete code + docs to SF.
(If you think this requires much design work, a PEP may be in order but I think that given the new RFCs it's probably straightforward enough to not require that.) --Guido On 11/27/05, Mike Brown <mike at skew.org> wrote: > Guido van Rossum wrote: > > IIRC I did it this way because the RFC about parsing urls specifically > > prescribed it had to be done this way. > > That was true as of RFC 1808 (1995-1998), although the grammar actually > allowed for a more generic interpretation. > > Such an interpretation was suggested in RFC 2396 (1998-2004) via a regular > expression for parsing URI 'references' (a formal abstraction introduced in > 2396) into 5 components (not six, since 'params' were moved into 'path' > and eventually became an option on every path segment, not just the end > of the path). The 5 components are: > > scheme, authority (formerly netloc), path, query, fragment. > > Parsing could result in some components being undefined, which is distinct > from being empty (e.g., 'mailto:foo at bar?' would have an undefined authority > and fragment, and a defined, but empty, query). > > RFC 3986 / STD 66 (2005-) did not change the regular expression, but makes > several references to these '5 major components' of a URI, and says that these > components are scheme-independent; parsers that operate at the generic syntax > level "can parse any URI reference into its major components. Once the scheme > is determined, further scheme-specific parsing can be performed on the > components." > > > You have to know what the scheme means before you can > > parse the rest -- there is (by design!) no standard parsing for > > anything that follows the scheme and the colon. > > Not since 1998, IMHO. It was implicit, at least since RFC 2396, that all URI > references can be interpreted as having the 5 components, it was made explicit > in RFC 3986 / STD 66. 
> > > I don't even think > > that you can trust that if the colon is followed by two slashes that > > what follows is a netloc for all schemes. > > You can. > > > But if there's an RFC that says otherwise I'll gladly concede; > > urlparse's main goal in life is to b RFC compliant. > > Its intent seems to be to split a URI into its major components, which are now > by definition scheme-independent (and have been, implicitly, for a long time), > so the function shouldn't distinguish between schemes. > > Do you want to keep returning that 6-tuple, or can we make it return a > 5-tuple? If we keep returning 'params' for backward compatibility, then that > means the 'path' we are returning is not the 'path' that people would expect > (they'll have to concatenate path+params to get what the generic syntax calls > a 'path' nowadays). It's also deceptive because params are now allowed on all > path segments, and the current function only takes them from the last segment. > > Also for backward compatibility, should an absent component continue to > manifest in the result as an empty string? I think a compliant parser should > make a distinction between absent and empty (it could make a difference, in > theory). > > If a regular expression were used for parsing, it would produce None for > absent components and empty-string for empty ones. I implemented it this > way in 4Suite's Ft.Lib.Uri and it works nicely. 
> > Mike > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Nov 28 17:24:08 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 08:24:08 -0800 Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343 In-Reply-To: <438AE97D.2050600@iinet.net.au> References: <438AE97D.2050600@iinet.net.au> Message-ID: <ca471dc20511280824y6af50950y93f70f9c19bfe0d9@mail.gmail.com> On 11/28/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote: > Given the current semantics of PEP 343 and the following class: > > class null_context(object): > def __context__(self): > return self > def __enter__(self): > return self > def __exit__(self, *exc_info): > pass > > Mistakenly writing: > > with null_context: > # Oops, passed the class instead of an instance > > Would give a less than meaningful error message: > > TypeError: unbound method __context__() must be called with null_context > instance as first argument (got nothing instead) > > It's the usual metaclass problem with invoking a slot (or slot equivalent) via > "obj.__slot__()" rather than via "type(obj).__slot__(obj)" the way the > underlying C code does. > > I think we need to fix the proposed semantics so that they access the slots > via the type, rather than directly through the instance. Otherwise the slots > for the with statement will behave strangely when compared to the slots for > other magic methods. Maybe it's because I'm just an old fart, but I can't make myself care about this. The code is broken. You get an error message. It even has the correct exception (TypeError). In this particular case the error message isn't that great -- well, the same is true in many other cases (like whenever the invocation is a method call from Python code). That most built-in operations produce a different error message doesn't mean we have to make *all* built-in operations use the same approach. 
I fail to see the value of the consistency you're calling for. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Nov 28 17:45:52 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 08:45:52 -0800 Subject: [Python-Dev] (no subject) In-Reply-To: <E1EfHow-0002xd-Ar@apasphere.com> References: <E1EfHow-0002xd-Ar@apasphere.com> Message-ID: <ca471dc20511280845k3c73a7ccj381b9013b3651871@mail.gmail.com> On 11/24/05, Duncan Grisby <duncan-pythondev at grisby.org> wrote: > Hi, > > I posted this to comp.lang.python, but got no response, so I thought I > would consult the wise people here... > > I have encountered a problem with the re module. I have a > multi-threaded program that does lots of regular expression searching, > with some relatively complex regular expressions. Occasionally, events > can conspire to mean that the re search takes minutes. That's bad > enough in and of itself, but the real problem is that the re engine > does not release the interpreter lock while it is running. All the > other threads are therefore blocked for the entire time it takes to do > the regular expression search. Rather than trying to fight the GIL, I suggest that you let a regex expert look at your regex(es) and the input that causes the long running times. As Fredrik suggested, certain patterns are just inefficient but can be rewritten more efficiently. There are plenty of regex experts on c.l.py. Unless you have a multi-CPU box, the performance of your app isn't going to improve by releasing the GIL -- it only affects the responsiveness of other threads. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Nov 28 18:11:13 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 09:11:13 -0800 Subject: [Python-Dev] Patch Req. 
# 1351020 & 1351036: PythonD modifications In-Reply-To: <43816CE2.2020808@v.loewis.de> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> <43816CE2.2020808@v.loewis.de> Message-ID: <ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com> On 11/20/05, "Martin v. L?wis" <martin at v.loewis.de> wrote: > decker at dacafe.com wrote: > > The local python community here in Sydney indicated that python.org is > > only upset when groups port the source to 'obscure' systems and *don't* > > submit patches... It is possible that I was misinformed. > > I never heard such concerns. I personally wouldn't notice if somebody > ported Python, and did not feed back the patches. I guess that I'm the source of that sentiment. My reason for wanting people to contribute ports back is that if they don't, the port is more likely to stick on some ancient version of Python (e.g. I believe Nokia is still at 2.2.2). Then, assuming the port remains popular, its users are going to pressure developers of general Python packages to provide support for old versions of Python. While I agree that maintaining port-specific code is a pain whenever Python is upgraded, I still think that accepting patches for odd-platform ports is the better alternative. Even if the patches deteriorate as Python evolves, they should still (in principle) make a re-port easier. Perhaps the following compromise can be made: the PSF accepts patches from reputable platform maintainers. (Of course, like all contributions, they must be of high quality and not break anything, etc., before they are accepted.) If such patches cause problems with later Python versions, the PSF won't maintain them, but instead invite the original contributors (or other developers who are interested in that particular port) to fix them. 
If there is insufficient response, or if it comes too late given the PSF release schedule, the PSF developers may decide to break or remove support for the affected platform. There's a subtle balance between keeping too much old cruft and being too aggressive in removing cruft that still serves a purpose for someone. I bet that we've erred in both directions at times. > Sometimes, people ask "there is this and that port, why isn't it > integrated", to which the answer is in most cases "because authors > didn't contribute". This is not being upset - it is merely a fact. > This port (djgcc) is the first one in a long time (IIRC) where > anybody proposed rejecting it. > > > I am not sure about the future myself. DJGPP 2.04 has been parked at beta > > for two years now. It might be fair to say that the *general* DJGPP > > developer base has shrunk a little bit. But the PythonD userbase has > > actually grown since the first release three years ago. For the time > > being, people get very angry when the servers go down here :-) > > It's not that much availability of the platform I worry about, but the > commitment of the Python porter. We need somebody to forward bug > reports to, and somebody to intervene if incompatible changes are made. > This person would also indicate that the platform is no longer > available, and hence the port can be removed. It sounds like Ben Decker is for the time being volunteering to provide patches and to maintain them. (I hope I'm reading you right, Ben.) I'm +1 on accepting his patches, *provided* as always they pass muster in terms of general Python development standards. (Jeff Epler's comments should be taken to heart.) 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From duncan-pythondev at grisby.org Mon Nov 28 18:59:57 2005 From: duncan-pythondev at grisby.org (Duncan Grisby) Date: Mon, 28 Nov 2005 17:59:57 +0000 Subject: [Python-Dev] SRE should release the GIL (was: no subject) In-Reply-To: Message from Guido van Rossum <guido@python.org> of "Mon, 28 Nov 2005 08:45:52 PST." <ca471dc20511280845k3c73a7ccj381b9013b3651871@mail.gmail.com> Message-ID: <E1EgnIE-0004TN-3W@apasphere.com> On Monday 28 November, Guido van Rossum wrote: > On 11/24/05, Duncan Grisby <duncan-pythondev at grisby.org> wrote: > > I have encountered a problem with the re module. I have a > > multi-threaded program that does lots of regular expression searching, > > with some relatively complex regular expressions. Occasionally, events > > can conspire to mean that the re search takes minutes. That's bad > > enough in and of itself, but the real problem is that the re engine > > does not release the interpreter lock while it is running. All the > > other threads are therefore blocked for the entire time it takes to do > > the regular expression search. > > Rather than trying to fight the GIL, I suggest that you let a regex > expert look at your regex(es) and the input that causes the long > running times. As Fredrik suggested, certain patterns are just > inefficient but can be rewritten more efficiently. There are plenty of > regex experts on c.l.py. Part of the problem is certainly inefficient regexes, and we have improved things to some extent by changing some of them. Unfortunately, the regexes come from user input, so we can't be certain that our users aren't going to do stupid things. It's not too bad if a stupid regex slows things down for a bit, but it is bad if it causes the whole application to freeze for minutes at a time. 
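[The inefficient patterns Fredrik and Guido mention usually come down to ambiguous nested repetition, which backtracks exponentially on a failing match. A made-up example of the kind of rewrite involved — the patterns are mine, not from Duncan's application:]

```python
import re

# Ambiguous: a run of 'a's can be carved into 'a'/'aa' pieces in exponentially
# many ways, and a failing match tries all of them before giving up.
slow = re.compile(r'(?:a|aa)+b')

# Unambiguous rewrite of the same language: only one way to consume the run.
fast = re.compile(r'a+b')

# The two patterns accept exactly the same strings...
for s in ('aaab', 'aaaa', 'b', 'ab'):
    assert bool(slow.match(s)) == bool(fast.match(s))
```

...but on input like 'a' * 30 with no trailing 'b', the ambiguous form can take minutes in SRE while the rewrite fails immediately — which is the failure mode Duncan is describing with user-supplied regexes.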
> Unless you have a multi-CPU box, the performance of your app isn't > going to improve by releasing the GIL -- it only affects the > responsiveness of other threads. We do have a multi-CPU box. Even with good regexes, regex matching takes up a significant proportion of the time spent processing in our application, so being able to release the GIL will hopefully increase performance overall as well as increasing responsiveness. We are currently testing our application with the patch to sre that Eric posted. Once we get on to some performance tests, we'll post the results of whether releasing the GIL does make a measurable difference for us. Cheers, Duncan. -- -- Duncan Grisby -- -- duncan at grisby.org -- -- http://www.grisby.org -- From martin at v.loewis.de Mon Nov 28 20:51:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 28 Nov 2005 20:51:27 +0100 Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications In-Reply-To: <ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> <43816CE2.2020808@v.loewis.de> <ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com> Message-ID: <438B5FBF.7050604@v.loewis.de> Guido van Rossum wrote: > Perhaps the following compromise can be made: the PSF accepts patches > from reputable platform maintainers. (Of course, like all > contributions, they must be of high quality and not break anything, > etc., before they are accepted.) If such patches cause problems with > later Python versions, the PSF won't maintain them, but instead invite > the original contributors (or other developers who are interested in > that particular port) to fix them. 
If there is insufficient response, > or if it comes too late given the PSF release schedule, the PSF > developers may decide to break or remove support for the affected > platform. This is indeed the compromise I was after. If the contributors indicate that they will maintain it for some time (which happened in this case), then I can happily accept any port (and did indeed in the past). In the specific case, there is an additional twist that we deliberately removed DOS support some time ago, and listed that as officially removed in a PEP. I understand that djgpp somehow isn't quite the same as DOS, although I don't understand the differences (anymore). But if it's fine with you, it is fine with me. Regards, Martin From amk at amk.ca Mon Nov 28 20:56:46 2005 From: amk at amk.ca (A.M. Kuchling) Date: Mon, 28 Nov 2005 14:56:46 -0500 Subject: [Python-Dev] Bug day this Sunday? Message-ID: <20051128195646.GA21584@rogue.amk.ca> Is anyone interested in joining a Python bug day this Sunday? A useful task might be to prepare for the python-core sprint at PyCon by going through the bug and patch managers, and listing bugs/patches that would be good candidates for working on at PyCon. We'd meet in the usual location: #python-dev on irc.freenode.net, from roughly 9AM to 3PM Eastern (2PM to 8PM UTC) on Sunday Dec. 4. --amk From guido at python.org Mon Nov 28 21:07:37 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 12:07:37 -0800 Subject: [Python-Dev] Patch Req. 
# 1351020 & 1351036: PythonD modifications In-Reply-To: <438B5FBF.7050604@v.loewis.de> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> <43816CE2.2020808@v.loewis.de> <ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com> <438B5FBF.7050604@v.loewis.de> Message-ID: <ca471dc20511281207i1bb3dabpa0693014d818a4a8@mail.gmail.com> On 11/28/05, "Martin v. L?wis" <martin at v.loewis.de> wrote: > Guido van Rossum wrote: > > Perhaps the following compromise can be made: the PSF accepts patches > > from reputable platform maintainers. (Of course, like all > > contributions, they must be of high quality and not break anything, > > etc., before they are accepted.) If such patches cause problems with > > later Python versions, the PSF won't maintain them, but instead invite > > the original contributors (or other developers who are interested in > > that particular port) to fix them. If there is insufficient response, > > or if it comes too late given the PSF release schedule, the PSF > > developers may decide to break or remove support for the affected > > platform. > > This is indeed the compromise I was after. If the contributors indicate > that they will maintain it for some time (which happened in this case), > then I can happily accept any port (and did indeed in the past). > > In the specific case, there is an additional twist that we deliberately > removed DOS support some time ago, and listed that as officially removed > in a PEP. I understand that djgpp somehow isn't quite the same as DOS, > although I don't understand the differences (anymore). > > But if it's fine with you, it is fine with me. Thanks. :-) I say, the more platforms the merrier. 
I don't recall why DOS support was removed (PEP 11 doesn't say) but I presume it was just because nobody volunteered to maintain it, not because we have a particular dislike for DOS. So now that we have a volunteer let's deal with his patches without prejudice. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Nov 28 21:13:09 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 12:13:09 -0800 Subject: [Python-Dev] Proposed additional keyword argument in logging calls In-Reply-To: <001a01c5ef77$d7682300$0200a8c0@alpha> References: <001a01c5ef77$d7682300$0200a8c0@alpha> Message-ID: <ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com> On 11/22/05, Vinay Sajip <vinay_sajip at red-dove.com> wrote: > On numerous occasions, requests have been made for the ability to easily add > user-defined data to logging events. For example, a multi-threaded server > application may want to output specific information to a particular server > thread (e.g. the identity of the client, specific protocol options for the > client connection, etc.) > > This is currently possible, but you have to subclass the Logger class and > override its makeRecord method to put custom attributes in the LogRecord. > These can then be output using a customised format string containing e.g. > "%(foo)s %(bar)d". The approach is usable but requires more work than > necessary. > > I'd like to propose a simpler way of achieving the same result, which > requires use of an additional optional keyword argument in logging calls. 
> The signature of the (internal) Logger._log method would change from > > def _log(self, level, msg, args, exc_info=None) > > to > > def _log(self, level, msg, args, exc_info=None, extra_info=None) > > The extra_info argument will be passed to Logger.makeRecord, whose signature > will change from > > def makeRecord(self, name, level, fn, lno, msg, args, exc_info): > > to > > def makeRecord(self, name, level, fn, lno, msg, args, exc_info, > extra_info) > > makeRecord will, after doing what it does now, use the extra_info argument > as follows: > > If type(extra_info) != types.DictType, it will be ignored. > > Otherwise, any entries in extra_info whose keys are not already in the > LogRecord's __dict__ will be added to the LogRecord's __dict__. > > Can anyone see any problems with this approach? If not, I propose to post > the approach on python-list and then if there are no strong objections, > check it in to the trunk. (Since it could break existing code, I'm assuming > (please correct me if I'm wrong) that it shouldn't go into the > release24-maint branch.) This looks like a good clean solution to me. I agree with Paul Moore's suggestion that if extra_info is not None you should just go ahead and use it as a dict and let the errors propagate. What's the rationale for not letting it override existing fields? (There may be a good one, I just don't see it without turning on my thinking cap, which would cost extra. :-) Perhaps it makes sense to call it 'extra' instead of 'extra_info'? As a new feature it should definitely not go into 2.4; but I don't see how it could break existing code. 
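[For reference, the mechanism landed in later Python versions under exactly the name Guido suggests here, `extra`. A small sketch of how it ends up being used — logger name and format string are illustrative:]

```python
import io
import logging

# Capture log output in a string buffer so we can inspect it.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# %(clientip)s is a user-defined field, supplied per-call via extra=...
handler.setFormatter(logging.Formatter('%(clientip)s %(message)s'))
logger = logging.getLogger('demo')
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Entries from the dict become attributes on the LogRecord.
logger.info('connection closed', extra={'clientip': '192.0.2.17'})
print(stream.getvalue().strip())  # 192.0.2.17 connection closed
```

On Guido's question about overriding existing fields: the implementation that eventually shipped resolves it the other way round from Vinay's proposal — keys that would clobber standard LogRecord attributes raise KeyError rather than being silently ignored.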
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Nov 28 21:14:41 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 12:14:41 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <dll2v3$78g$1@sea.gmane.org> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> Message-ID: <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> On 11/18/05, Neil Schemenauer <nas at arctrix.com> wrote: > Perhaps we should use the memory management technique that the rest > of Python uses: reference counting. I don't see why the AST > structures couldn't be PyObjects. Me neither. Adding yet another memory allocation scheme to Python's already staggering number of memory allocation strategies sounds like a bad idea. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Nov 28 21:16:21 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 12:16:21 -0800 Subject: [Python-Dev] something is wrong with test___all__ In-Reply-To: <dm03lq$41u$1@sea.gmane.org> References: <dm03lq$41u$1@sea.gmane.org> Message-ID: <ca471dc20511281216p36548ba6l6779da343d14e805@mail.gmail.com> Has this been handled yet? If not, perhaps showing the good and bad bytecode here would help trigger someone's brain into understanding the problem. On 11/22/05, Reinhold Birkenfeld <reinhold-birkenfeld-nospam at wolke7.net> wrote: > Hi, > > on my machine, "make test" hangs at test_colorsys. 
> > Careful investigation shows that when the bytecode is freshly generated > by "make all" (precisely in test___all__) the .pyc file is different from what a > direct call to "regrtest.py test_colorsys" produces. > > Curiously, a call to "regrtest.py test___all__" instead of "make test" produces > the correct bytecode. > > I can only suspect some AST bug here. > > Reinhold > > -- > Mail address is perfectly valid! > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Mon Nov 28 21:19:38 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 28 Nov 2005 21:19:38 +0100 Subject: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications In-Reply-To: <ca471dc20511281207i1bb3dabpa0693014d818a4a8@mail.gmail.com> References: <39387.202.3.192.11.1132108393.squirrel@cafemail.mcadcafe.com> <437FA1D8.7060600@v.loewis.de> <20051120150850.GA27838@unpythonic.net> <25509.202.3.192.11.1132533752.squirrel@cafemail.mcadcafe.com> <43816CE2.2020808@v.loewis.de> <ca471dc20511280911o3966d2fcr4b9c5bc932407cc4@mail.gmail.com> <438B5FBF.7050604@v.loewis.de> <ca471dc20511281207i1bb3dabpa0693014d818a4a8@mail.gmail.com> Message-ID: <438B665A.1090002@v.loewis.de> Guido van Rossum wrote: > I don't recall why DOS support was removed (PEP 11 doesn't say) The PEP was actually created after the removal, so you added (or asked me to add) this entry: Name: MS-DOS, MS-Windows 3.x Unsupported in: Python 2.0 Code removed in: Python 2.1 Regards, Martin From jeremy at alum.mit.edu Mon Nov 28 21:47:07 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 28 Nov 2005 15:47:07 -0500 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: 
<ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> Message-ID: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> On 11/28/05, Guido van Rossum <guido at python.org> wrote: > On 11/18/05, Neil Schemenauer <nas at arctrix.com> wrote: > > Perhaps we should use the memory management technique that the rest > > of Python uses: reference counting. I don't see why the AST > > structures couldn't be PyObjects. > > Me neither. Adding yet another memory allocation scheme to Python's > already staggering number of memory allocation strategies sounds like > a bad idea. The reason this thread started was the complaint that reference counting in the compiler is really difficult. Almost every line of code can lead to an error exit. The code becomes quite cluttered when it uses reference counting. Right now, the AST is created with malloc/free, but that makes it hard to free the ast at the right time. It would be fairly complex to convert the ast nodes to pyobjects. They're just simple discriminated unions right now. If they were allocated from an arena, the entire arena could be freed when the compilation pass ends. 
Jeremy From guido at python.org Mon Nov 28 22:15:58 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 13:15:58 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> Message-ID: <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> On 11/28/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote: > On 11/28/05, Guido van Rossum <guido at python.org> wrote: > > On 11/18/05, Neil Schemenauer <nas at arctrix.com> wrote: > > > Perhaps we should use the memory management technique that the rest > > > of Python uses: reference counting. I don't see why the AST > > > structures couldn't be PyObjects. > > > > Me neither. Adding yet another memory allocation scheme to Python's > > already staggering number of memory allocation strategies sounds like > > a bad idea. > > The reason this thread started was the complaint that reference > counting in the compiler is really difficult. Almost every line of > code can lead to an error exit. Sorry, I forgot that (I've been off-line for a week of quality time with Orlijn, and am now digging myself out from under several hundred emails :-). > The code becomes quite cluttered when > it uses reference counting. Right now, the AST is created with > malloc/free, but that makes it hard to free the ast at the right time. 
Would fixing the code to add free() calls in all the error exits make it more or less cluttered than using reference counting? > It would be fairly complex to convert the ast nodes to pyobjects. > They're just simple discriminated unions right now. Are they all the same size? > If they were > allocated from an arena, the entire arena could be freed when the > compilation pass ends. Then I don't understand why there was discussion of alloca() earlier on -- surely the lifetime of a node should not be limited by the stack frame that allocated it? I'm not in principle against having an arena for this purpose, but I worry that this will make it really hard to provide a Python API for the AST, which has already been requested and whose feasibility (unless I'm mistaken) also was touted as an argument for switching to the AST compiler in the first place. I hope we'll never have to deal with an API like the parser module provides... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Mon Nov 28 22:23:00 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 28 Nov 2005 16:23:00 -0500 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> Message-ID: <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> On 11/28/05, Guido van Rossum <guido at python.org> wrote: > > The code becomes quite cluttered when > > it uses reference counting. 
Right now, the AST is created with > > malloc/free, but that makes it hard to free the ast at the right time. > > Would fixing the code to add free() calls in all the error exits make > it more or less cluttered than using reference counting? If we had an arena API, we'd only need to call free on the arena at top-level entry points. If an error occurs deeps inside the compiler, the arena will still get cleaned up by calling free at the top. > > It would be fairly complex to convert the ast nodes to pyobjects. > > They're just simple discriminated unions right now. > > Are they all the same size? No. Each type is a different size and there are actually a lot of types -- statements, expressions, arguments, slices, &c. All the objects of one type are the same size. > > If they were > > allocated from an arena, the entire arena could be freed when the > > compilation pass ends. > > Then I don't understand why there was discussion of alloca() earlier > on -- surely the lifetime of a node should not be limited by the stack > frame that allocated it? Actually this is a pretty good limit, because all these data structures are temporaries used by the compiler. Once compilation has finished, there's no need for the AST or the compiler state. > I'm not in principle against having an arena for this purpose, but I > worry that this will make it really hard to provide a Python API for > the AST, which has already been requested and whose feasibility > (unless I'm mistaken) also was touted as an argument for switching to > the AST compiler in the first place. I hope we'll never have to deal > with an API like the parser module provides... My preference would be to have the ast shared by value. We generate code to serialize it to and from a byte stream and share that between Python and C. It is less efficient, but it is also very simple. 
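[For what it's worth, the Python-level API that eventually shipped (the `ast` module) behaves much as Jeremy sketches: Python code gets its own by-value tree of node objects generated from the ASDL definitions, and mutating it has no effect on compiler-internal state unless the tree is explicitly recompiled:]

```python
import ast

tree = ast.parse('x = 1 + 2')
assign = tree.body[0]
print(type(assign).__name__)        # Assign
print(type(assign.value).__name__)  # BinOp

# The tree is a mutable copy: transform it, then compile the result.
assign.value = ast.Constant(3)
ast.fix_missing_locations(tree)     # fill in lineno/col_offset on new nodes
ns = {}
exec(compile(tree, '<demo>', 'exec'), ns)
print(ns['x'])  # 3
```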
Jeremy From martin at v.loewis.de Mon Nov 28 22:37:05 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 28 Nov 2005 22:37:05 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> Message-ID: <438B7881.8020200@v.loewis.de> Jeremy Hylton wrote: > The reason this thread started was the complaint that reference > counting in the compiler is really difficult. Almost every line of > code can lead to an error exit. The code becomes quite cluttered when > it uses reference counting. Right now, the AST is created with > malloc/free, but that makes it hard to free the ast at the right time. > It would be fairly complex to convert the ast nodes to pyobjects. > They're just simple discriminated unions right now. If they were > allocated from an arena, the entire arena could be freed when the > compilation pass ends. I haven't looked at the AST code at all so far, but my experience with gcc is that such an approach is fundamentally flawed: you would always have memory that ought to survive the parsing, so you will have to copy it out of the arena. This will either lead to dangling pointers, or garbage memory. So in gcc, they eventually moved to a full garbage collector (after several iterations). Reference counting has the advantage that you can always DECREF at the end of the function. 
So if you put all local variables at the beginning of the function, and all DECREFs at the end, getting clean memory management should be doable, IMO. Plus, contributors would be familiar with the scheme in place. I don't know if details have already been proposed, but I would update asdl to generate a hierarchy of classes, i.e.:

    class mod(object): pass

    class Module(mod):
        def __init__(self, body):
            self.body = body  # List of stmt
    # ...

    class Expression(mod):
        def __init__(self, body):
            self.body = body  # expr
    # ...

    class Print(stmt):
        def __init__(self, dest, values, nl):
            self.dest = dest      # expr or None
            self.values = values  # List of expr
            self.nl = nl          # bool (True or False)

There would be convenience functions, like

    PyObject *mod_Module(PyObject* body);
    enum mod_kind mod_kind(PyObject* mod); // Module, Interactive, Expression, or mod_INVALID
    PyObject *mod_Expression_body(PyObject*);
    // ...
    PyObject *stmt_Print_dest(PyObject*);

(whether the accessors return new or borrowed reference could be debated; plain C struct accesses would also be possible) Regards, Martin From nas at arctrix.com Mon Nov 28 22:46:05 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Mon, 28 Nov 2005 14:46:05 -0700 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> References: <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> Message-ID: <20051128214605.GB26230@mems-exchange.org> On Mon, Nov 28, 2005 at 03:47:07PM -0500, Jeremy Hylton wrote: > The reason this thread started was the 
complaint that reference > counting in the compiler is really difficult. I don't think that's exactly right. The problem is that the AST compiler mixes its own memory management strategy with reference counting and the result doesn't quite work. The AST compiler mainly keeps track of memory via containment: for example, if B is an attribute of A then B gets freed when A gets freed. That works fine as long as B is never shared. My memory of the problems is a little fuzzy. Maybe Neal Norwitz can explain it better. Neil From guido at python.org Mon Nov 28 22:46:31 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Nov 2005 13:46:31 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> Message-ID: <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> [Guido] > > Then I don't understand why there was discussion of alloca() earlier > > on -- surely the lifetime of a node should not be limited by the stack > > frame that allocated it? [Jeremy] > Actually this is a pretty good limit, because all these data > structures are temporaries used by the compiler. Once compilation has > finished, there's no need for the AST or the compiler state. Are you really saying that there is one function which is called only once (per compilation) which allocates *all* the AST nodes? 
That's the only situation where I'd see alloca() working -- unless your alloca() doesn't allocate memory on the stack. I was somehow assuming that the tree would be built piecemeal by parser callbacks or some such mechanism. There's still a stack frame whose lifetime limits the AST lifetime, but it is not usually the current stackframe when a new node is allocated, so alloca() can't be used. I guess I don't understand the AST compiler code enough to participate in this discussion. Or perhaps we are agreeing violently? > > I'm not in principle against having an arena for this purpose, but I > > worry that this will make it really hard to provide a Python API for > > the AST, which has already been requested and whose feasibility > > (unless I'm mistaken) also was touted as an argument for switching to > > the AST compiler in the first place. I hope we'll never have to deal > > with an API like the parser module provides... > > My preference would be to have the ast shared by value. We generate > code to serialize it to and from a byte stream and share that between > Python and C. It is less efficient, but it is also very simple. So there would still be a Python-objects version of the AST but the compiler itself doesn't use it. At least by-value makes sense to me -- if you're making tree transformations you don't want accidental sharing to cause unexpected side effects. 
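The aliasing hazard being discussed is easy to demonstrate from Python: with a shared tree, a transformation made through one reference is visible through every other, while a by-value copy stays isolated. The `Node` class below is invented purely for illustration:

```python
import copy

class Node:
    """Minimal stand-in for an AST node."""
    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = list(children)

tree = Node("Module", [Node("Name")])
shared = tree                   # second reference to the same tree
byval = copy.deepcopy(tree)     # by-value copy

tree.children[0].kind = "Num"   # a "tree transformation"

print(shared.children[0].kind)  # Num  -- the change leaks through
print(byval.children[0].kind)   # Name -- the copy is unaffected
```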
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Mon Nov 28 22:59:04 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 28 Nov 2005 13:59:04 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> Message-ID: <bbaeab100511281359r16a8fc63k5fd300447a35e7ce@mail.gmail.com> On 11/28/05, Guido van Rossum <guido at python.org> wrote: > [Guido] > > > Then I don't understand why there was discussion of alloca() earlier > > > on -- surely the lifetime of a node should not be limited by the stack > > > frame that allocated it? > > [Jeremy] > > Actually this is a pretty good limit, because all these data > > structures are temporaries used by the compiler. Once compilation has > > finished, there's no need for the AST or the compiler state. > > Are you really saying that there is one function which is called only > once (per compilation) which allocates *all* the AST nodes? Nope, there isn't for everything. It's just that some are temporary to internal functions and thus can stand to be freed later (unless my memory is really shot). Otherwise it is piece-meal. There is the main data structure such as the compiler struct and the top-level node for the AST, but otherwise everything (currently) is allocated as needed. > That's the > only situation where I'd see alloca() working -- unless your alloca() > doesn't allocate memory on the stack. 
I was somehow assuming that the > tree would be built piecemeal by parser callbacks or some such > mechanism. There's still a stack frame whose lifetime limits the AST > lifetime, but it is not usually the current stackframe when a new node > is allocated, so alloca() can't be used. > > I guess I don't understand the AST compiler code enough to participate > in this discussion. Or perhaps we are agreeing violently? > I don't think your knowledge of the codebase precludes your participation. Actually, I think it makes it even more important since if some scheme is devised that is not easily explained it is really going to hinder who can help out with maintenance and enhancements on the compiler. > > > I'm not in principle against having an arena for this purpose, but I > > > worry that this will make it really hard to provide a Python API for > > > the AST, which has already been requested and whose feasibility > > > (unless I'm mistaken) also was touted as an argument for switching to > > > the AST compiler in the first place. I hope we'll never have to deal > > > with an API like the parser module provides... > > > > My preference would be to have the ast shared by value. We generate > > code to serialize it to and from a byte stream and share that between > > Python and C. It is less efficient, but it is also very simple. > > So there would still be a Python-objects version of the AST but the > compiler itself doesn't use it. > Yep. The idea was be to return a PyString formatted ala the parser module where it is just a bunch of nested items in a Scheme-like format. There would then be Python or C code that would generate a Python object representation from that. Then, when you were finished tweaking the structure, you would write back out as a PyString and then recreate the internal representation. That makes it pass-by-value since you pass the serialized PyString version across the C-Python boundary. 
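A toy version of that round trip, with a string standing in for the proposed PyString (the nested-tuple format and the `dump`/`load` helpers here are invented for the example, not the actual proposal):

```python
from ast import literal_eval  # stdlib helper for parsing Python literals

def dump(node):
    """Serialize a nested-tuple 'AST' to a string for the boundary crossing."""
    return repr(node)

def load(text):
    """Rebuild the nested-tuple form on the other side of the boundary."""
    return literal_eval(text)

tree = ("Module", ("FunctionDef", "f", ("arguments",), ("Pass",)))
wire = dump(tree)       # what would cross the C/Python boundary
rebuilt = load(wire)

print(rebuilt == tree)  # True: equal by value
print(rebuilt is tree)  # False: no shared objects, hence no side effects
```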
> At least by-value makes sense to me -- if you're making tree > transformations you don't want accidental sharing to cause unexpected > side effects. > Yeah, that could be bad. =) -Brett From walter at livinglogic.de Mon Nov 28 23:13:58 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon, 28 Nov 2005 23:13:58 +0100 Subject: [Python-Dev] reference leaks In-Reply-To: <ee2a432c0511251002n438ca00eib1d7bdee53df30d7@mail.gmail.com> References: <ee2a432c0511241935i70127dc0o50999f72b5094f89@mail.gmail.com> <4386D91B.7030505@livinglogic.de> <ee2a432c0511251002n438ca00eib1d7bdee53df30d7@mail.gmail.com> Message-ID: <438B8126.7090502@livinglogic.de> Neal Norwitz wrote: > On 11/25/05, Walter D?rwald <walter at livinglogic.de> wrote: >> Can you move the call to codecs.register_error() out of test_callbacks() >> and retry? > > It then leaks 3 refs on each call to test_callbacks(). This should be fixed now in r41555 and r41556. Bye, Walter D?rwald From nnorwitz at gmail.com Mon Nov 28 23:58:24 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 28 Nov 2005 14:58:24 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> Message-ID: <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> On 11/28/05, Guido van Rossum <guido at python.org> wrote: > > I guess I don't understand the AST compiler code enough to participate > in this discussion. 
I hope everyone will chime in here. This is important to improve and learn from others. Let me try to describe the current situation with a small amount of code. Hopefully it will give some idea of the larger problems. This is an entire function from Python/ast.c. It demonstrates the issues fairly clearly. It contains at least one memory leak. It uses asdl_seq, which is barely more than a somewhat dynamic array. Sequences do not know what type they hold, so there needs to be different dealloc functions to free them properly (asdl_*_seq_free()). ast_for_*() allocate memory, so in case of an error, the memory will need to be freed. Most of this memory is internal to the AST code. However, there are some identifiers (PyString's) that must be DECREF'ed. See below for the memory leak. static stmt_ty ast_for_funcdef(struct compiling *c, const node *n) { /* funcdef: 'def' [decorators] NAME parameters ':' suite */ identifier name = NULL; arguments_ty args = NULL; asdl_seq *body = NULL; asdl_seq *decorator_seq = NULL; int name_i; REQ(n, funcdef); if (NCH(n) == 6) { /* decorators are present */ decorator_seq = ast_for_decorators(c, CHILD(n, 0)); if (!decorator_seq) goto error; name_i = 2; } else { name_i = 1; } name = NEW_IDENTIFIER(CHILD(n, name_i)); if (!name) goto error; else if (!strcmp(STR(CHILD(n, name_i)), "None")) { ast_error(CHILD(n, name_i), "assignment to None"); goto error; } args = ast_for_arguments(c, CHILD(n, name_i + 1)); if (!args) goto error; body = ast_for_suite(c, CHILD(n, name_i + 3)); if (!body) goto error; return FunctionDef(name, args, body, decorator_seq, LINENO(n)); error: asdl_stmt_seq_free(body); asdl_expr_seq_free(decorator_seq); free_arguments(args); Py_XDECREF(name); return NULL; } The memory leak occurs when FunctionDef fails. name, args, body, and decorator_seq are all local and would not be freed.
The simple variables can be freed in each "constructor" like FunctionDef(), but the sequences cannot unless they keep the info about which type they hold. That would help quite a bit, but I'm not sure it's the right/best solution. Hope this helps explain a bit. Please speak up with how this can be improved. Gotta run. n From greg.ewing at canterbury.ac.nz Tue Nov 29 00:55:17 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 29 Nov 2005 12:55:17 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu> <e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com> <ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com> <bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> Message-ID: <438B98E5.5010209@canterbury.ac.nz> Jeremy Hylton wrote: > Almost every line of > code can lead to an error exit. The code becomes quite cluttered when > it uses reference counting. I don't see why very many more error exits should become possible just by introducing refcounting. Errors are possible whenever you allocate something, however you do it, so you need error checks on all your allocations in any case. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Nov 29 01:11:11 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 29 Nov 2005 13:11:11 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> Message-ID: <438B9C9F.6000802@canterbury.ac.nz> Neal Norwitz wrote: > This is an entire function from Python/ast.c. > Sequences do not know what type they hold, so there needs to be > different dealloc functions to free them properly (asdl_*_seq_free()). Well, that's one complication that would go away if the nodes were PyObjects. > The memory leak occurs when FunctionDef fails. name, args, body, and > decorator_seq are all local and would not be freed. The simple > variables can be freed in each "constructor" like FunctionDef(), but > the sequences cannot unless they keep the info about which type they > hold. 
If FunctionDef's reference semantics are defined so that it steals references to its arguments, then here is how the same function would look with PyObject AST nodes, as far as I can see: static PyObject * ast_for_funcdef(struct compiling *c, const node *n) { /* funcdef: 'def' [decorators] NAME parameters ':' suite */ PyObject *name = NULL; PyObject *args = NULL; PyObject *body = NULL; PyObject *decorator_seq = NULL; int name_i; REQ(n, funcdef); if (NCH(n) == 6) { /* decorators are present */ decorator_seq = ast_for_decorators(c, CHILD(n, 0)); if (!decorator_seq) goto error; name_i = 2; } else { name_i = 1; } name = NEW_IDENTIFIER(CHILD(n, name_i)); if (!name) goto error; else if (!strcmp(STR(CHILD(n, name_i)), "None")) { ast_error(CHILD(n, name_i), "assignment to None"); goto error; } args = ast_for_arguments(c, CHILD(n, name_i + 1)); if (!args) goto error; body = ast_for_suite(c, CHILD(n, name_i + 3)); if (!body) goto error; return FunctionDef(name, args, body, decorator_seq, LINENO(n)); error: Py_XDECREF(body); Py_XDECREF(decorator_seq); Py_XDECREF(args); Py_XDECREF(name); return NULL; } The only things I've changed are turning some type declarations into PyObject * and replacing the deallocation functions at the end with Py_XDECREF! Maybe there are other functions where it would not be so straightforward, but if this really is a typical AST function, switching to PyObjects looks like it wouldn't be difficult at all, and would actually make some things simpler. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Nov 29 01:13:29 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 29 Nov 2005 13:13:29 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> Message-ID: <438B9D29.2020403@canterbury.ac.nz> Here's a somewhat radical idea: Why not write the parser and bytecode compiler in Python? A .pyc could be bootstrapped from it and frozen into the executable. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Tue Nov 29 01:21:38 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 29 Nov 2005 01:21:38 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu> <437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> Message-ID: <438B9F12.3060607@v.loewis.de> Neal Norwitz wrote: > Hope this helps explain a bit. Please speak up with how this can be > improved. Gotta run. 
I would rewrite it as static PyObject* ast_for_funcdef(struct compiling *c, const node *n) { /* funcdef: [decorators] 'def' NAME parameters ':' suite */ PyObject *name = NULL; PyObject *args = NULL; PyObject *body = NULL; PyObject *decorator_seq = NULL; PyObject *result = NULL; int name_i; REQ(n, funcdef); if (NCH(n) == 6) { /* decorators are present */ decorator_seq = ast_for_decorators(c, CHILD(n, 0)); if (!decorator_seq) goto error; name_i = 2; } else { name_i = 1; } name = NEW_IDENTIFIER(CHILD(n, name_i)); if (!name) goto error; else if (!strcmp(STR(CHILD(n, name_i)), "None")) { ast_error(CHILD(n, name_i), "assignment to None"); goto error; } args = ast_for_arguments(c, CHILD(n, name_i + 1)); if (!args) goto error; body = ast_for_suite(c, CHILD(n, name_i + 3)); if (!body) goto error; result = FunctionDef(name, args, body, decorator_seq, LINENO(n)); error: Py_XDECREF(name); Py_XDECREF(args); Py_XDECREF(body); Py_XDECREF(decorator_seq); return result; } The convention would be that ast_for_* returns new references, which have to be released regardless of success or failure. FunctionDef would duplicate all of its parameter references if it succeeds, and leave them untouched if it fails. One could develop a checker that verifies that: a) all PyObject* local variables are initialized to NULL, and b) all such variables are Py_XDECREF'ed after the error label. c) result is initialized to NULL, and returned. Then, "goto error" at any point in the code would be correct (assuming an exception had been set prior to the goto). No special release function for the body or the decorators would be necessary - they would be plain Python lists. 
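The checker mentioned above could start out as a very crude script along these lines (regex-based and purely illustrative; a real tool would need to parse C properly):

```python
import re

def unreleased_locals(func_src):
    """Report PyObject* locals that are not Py_XDECREF'ed after 'error:'.

    Naive sketch: assumes 'PyObject *name = NULL;' declarations and a
    single error label, per the conventions described above.
    """
    decls = re.findall(r"PyObject\s*\*\s*(\w+)\s*=\s*NULL", func_src)
    head, sep, tail = func_src.partition("error:")
    if not sep:
        return decls  # no error label at all: flag every declaration
    released = set(re.findall(r"Py_XDECREF\((\w+)\)", tail))
    return [name for name in decls if name not in released]

sample = """
    PyObject *name = NULL;
    PyObject *args = NULL;
 error:
    Py_XDECREF(name);
"""
print(unreleased_locals(sample))  # ['args']
```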
Regards, Martin From bcannon at gmail.com Tue Nov 29 01:29:09 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 28 Nov 2005 16:29:09 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438B9D29.2020403@canterbury.ac.nz> References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9D29.2020403@canterbury.ac.nz> Message-ID: <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com> On 11/28/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > Here's a somewhat radical idea: > > Why not write the parser and bytecode compiler in Python? > > A .pyc could be bootstrapped from it and frozen into > the executable. > Is there a specific reason you are leaving out the AST, Greg, or do you count that as part of the bytecode compiler (I think of that as the AST->bytecode step handled by Python/compile.c)? While ease of maintenance would be fantastic and would probably lead to much more language experimentation if more of the core parts of Python were written in Python, I would worry about performance. While generating bytecode is not necessarily an everytime thing, I know Guido has said he doesn't like punishing the performance of small scripts in the name of large-scale apps (reason why interpreter startup time has always been an issue) which tend not to have a .pyc file. 
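The cost being worried about is easy to measure: re-parsing and re-compiling source on every run dwarfs the cost of executing already-compiled bytecode, which is precisely what a .pyc caches. A rough measurement (absolute numbers vary by machine):

```python
import timeit

source = "\n".join("x%d = %d" % (i, i) for i in range(500))
code = compile(source, "<demo>", "exec")  # what a .pyc would cache

recompile_every_run = timeit.timeit(
    lambda: exec(compile(source, "<demo>", "exec"), {}), number=50)
bytecode_only = timeit.timeit(
    lambda: exec(code, {}), number=50)

print(recompile_every_run > bytecode_only)  # True: compilation dominates
```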
-Brett From hyeshik at gmail.com Tue Nov 29 02:14:56 2005 From: hyeshik at gmail.com (=?EUC-KR?B?wOXH/b3E?=) Date: Tue, 29 Nov 2005 10:14:56 +0900 Subject: [Python-Dev] CVS repository mostly closed now In-Reply-To: <4388D55B.1070501@v.loewis.de> References: <4388D55B.1070501@v.loewis.de> Message-ID: <4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com> On 11/27/05, "Martin v. Löwis" <martin at v.loewis.de> wrote: > I tried removing the CVS repository from SF; it turns > out that this operation is not supported. Instead, it > is only possible to remove it from the project page; > pserver and ssh access remain indefinitely, as does > viewcvs. There's a hacky trick to remove them: put rm -rf $CVSROOT/src into CVSROOT/loginfo and remove the line then and commit again. :) Hye-Shik From fdrake at acm.org Tue Nov 29 02:32:08 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 28 Nov 2005 20:32:08 -0500 Subject: [Python-Dev] CVS repository mostly closed now In-Reply-To: <4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com> References: <4388D55B.1070501@v.loewis.de> <4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com> Message-ID: <200511282032.09373.fdrake@acm.org> On Monday 28 November 2005 20:14, Hye-Shik Chang wrote: > There's a hacky trick to remove them: > put rm -rf $CVSROOT/src into CVSROOT/loginfo > and remove the line then and commit again. :) Wow, that is tricky! Glad it wasn't me who thought of this one. :-) -Fred -- Fred L. Drake, Jr. 
<fdrake at acm.org> From greg.ewing at canterbury.ac.nz Tue Nov 29 07:31:49 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 29 Nov 2005 19:31:49 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9D29.2020403@canterbury.ac.nz> <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com> Message-ID: <438BF5D5.6090000@canterbury.ac.nz> Brett Cannon wrote: > Is there a specific reason you are leaving out the AST, Greg, or do > you count that as part of the bytecode compiler No, I consider it part of the parser. My mental model of parsing & compiling in the presence of a parse tree is like this: [source] -> scanner -> [tokens] -> parser -> [AST] -> code_generator -> [code] The fact that there still seems to be another kind of parse tree in between the scanner and the AST generator is an oddity which I hope will eventually disappear. > I know > Guido has said he doesn't like punishing the performance of small > scripts in the name of large-scale apps To me, that's an argument in favour of always generating a .pyc, even for scripts. 
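That pipeline can be walked stage by stage from Python itself using the modern standard library (these module names postdate this thread):

```python
import io
import tokenize
import ast

source = "x = 1 + 2\n"

# source -> scanner -> tokens
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
print(tokens[0].string)       # 'x'

# source -> parser -> AST
tree = ast.parse(source)
print(type(tree).__name__)    # 'Module'

# AST -> code generator -> code object
code = compile(tree, "<demo>", "exec")
ns = {}
exec(code, ns)
print(ns["x"])                # 3
```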
Greg From martin at v.loewis.de Tue Nov 29 08:14:20 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 29 Nov 2005 08:14:20 +0100 Subject: [Python-Dev] CVS repository mostly closed now In-Reply-To: <4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com> References: <4388D55B.1070501@v.loewis.de> <4f0b69dc0511281714y42a73b7fm6caa34340f0d6fc7@mail.gmail.com> Message-ID: <438BFFCC.1010005@v.loewis.de> Hye-Shik Chang wrote: > There's a hacky trick to remove them: > put rm -rf $CVSROOT/src into CVSROOT/loginfo > and remove the line then and commit again. :) Sure :-) SF makes a big fuss as to how good a service this is: open source will never go away. I tend to agree, somewhat. For historical reasons, it is surely nice to be able to browse the CVS repository (in particular if you need to correlate CVS revision numbers and svn revision numbers); also, people can take any time they want to convert CVS sandboxes. So instead of hacking them, I thought we better comply. With the mechanics in place, anybody should notice we switched to subversion (but I will write something on c.l.p.a, anyway). Regards, Martin P.S. Sorry for not getting your name right in the To: field; that's thunderbird. 
From nnorwitz at gmail.com Tue Nov 29 08:24:25 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 28 Nov 2005 23:24:25 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438B9F12.3060607@v.loewis.de> References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> Message-ID: <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> On 11/28/05, "Martin v. L?wis" <martin at v.loewis.de> wrote: > Neal Norwitz wrote: > > Hope this helps explain a bit. Please speak up with how this can be > > improved. Gotta run. > > I would rewrite it as [code snipped] For those watching, Greg's and Martin's version were almost the same. However, Greg's version left in the memory leak, while Martin fixed it by letting the result fall through. Martin added some helpful rules about dealing with the memory. Martin also gets bonus points for talking about developing a checker. :-) In both cases, their modified code is similar to the existing AST code, but all deallocation is done with Py_[X]DECREFs rather than a type specific deallocator. Definitely nicer than the current situation. It's also the same as the rest of the python code. 
With arenas the code would presumably look something like this: static stmt_ty ast_for_funcdef(struct compiling *c, const node *n) { /* funcdef: 'def' [decorators] NAME parameters ':' suite */ identifier name; arguments_ty args; asdl_seq *body; asdl_seq *decorator_seq = NULL; int name_i; REQ(n, funcdef); if (NCH(n) == 6) { /* decorators are present */ decorator_seq = ast_for_decorators(c, CHILD(n, 0)); if (!decorator_seq) return NULL; name_i = 2; } else { name_i = 1; } name = NEW_IDENTIFIER(CHILD(n, name_i)); if (!name) return NULL; Py_AST_Register(name); if (!strcmp(STR(CHILD(n, name_i)), "None")) { ast_error(CHILD(n, name_i), "assignment to None"); return NULL; } args = ast_for_arguments(c, CHILD(n, name_i + 1)); body = ast_for_suite(c, CHILD(n, name_i + 3)); if (!args || !body) return NULL; return FunctionDef(name, args, body, decorator_seq, LINENO(n)); } All the goto's become return NULLs. After allocating a PyObject, it would need to be registered (ie, the mythical Py_AST_Register(name)). This is easier than using all PyObjects in that when an error occurs, there's nothing to think about, just return. Only optional values (like decorator_seq) need to be initialized. It's harder in that one must remember to register any PyObject so it can be Py_DECREFed at the end. Since the arena is allocated in big hunk(s), it would presumably be faster than using PyObjects since there would be less memory allocation (and fragmentation). It should be possible to get rid of some of the conditionals too (I joined body and args above). Using all PyObjects has another benefit that may have been mentioned elsewhere, ie that the rest of Python uses the same techniques for handling deallocation. I'm not really advocating any particular approach. I *think* arenas would be easiest, but it's not a clear winner. I think Martin's note about GCC using GC is interesting. AFAIK GCC is a lot more complex than the Python code, so I'm not sure it's 100% relevant. 
OTOH, we need to weigh that experience. n From martin at v.loewis.de Tue Nov 29 08:33:03 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 29 Nov 2005 08:33:03 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> Message-ID: <438C042F.2050502@v.loewis.de> Neal Norwitz wrote: > For those watching, Greg's and Martin's version were almost the same. > However, Greg's version left in the memory leak, while Martin fixed it > by letting the result fall through. Actually, Greg said (correctly) that his version also fixes the leak: he assumed that FunctionDef would *consume* the references being passed (whether it is successful or not). I don't think this is a good convention, though. 
Regards, Martin From ncoghlan at gmail.com Tue Nov 29 11:48:31 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 29 Nov 2005 20:48:31 +1000 Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343 In-Reply-To: <ca471dc20511280824y6af50950y93f70f9c19bfe0d9@mail.gmail.com> References: <438AE97D.2050600@iinet.net.au> <ca471dc20511280824y6af50950y93f70f9c19bfe0d9@mail.gmail.com> Message-ID: <438C31FF.5040302@gmail.com> Guido van Rossum wrote: > On 11/28/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote: >> I think we need to fix the proposed semantics so that they access the slots >> via the type, rather than directly through the instance. Otherwise the slots >> for the with statement will behave strangely when compared to the slots for >> other magic methods. > > Maybe it's because I'm just an old fart, but I can't make myself care > about this. The code is broken. You get an error message. It even has > the correct exception (TypeError). In this particular case the error > message isn't that great -- well, the same is true in many other cases > (like whenever the invocation is a method call from Python code). I'm not particularly worried about the error message - as you say, it even has the right type. Or at least one of the two right types ;) > That most built-in operations produce a different error message > doesn't mean we have to make *all* built-in operations use the same > approach. I fail to see the value of the consistency you're calling > for. The bit that more concerns me is the behavioural discrepancy that comes from having a piece of syntax that looks in the instance dictionary. No other Python syntax is affected by the instance attributes - if the object doesn't have the right type, you're out of luck. Sticking an __iter__ method on an instance doesn't turn an object into an iterator, but with the current semantics, doing the same thing with __context__ *will* give you a manageable context. Cheers, Nick. 
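The point about syntax never consulting the instance dictionary is easy to verify for existing syntax: special methods on new-style classes are looked up on the type, so an `__iter__` stored on the instance is simply ignored:

```python
class C(object):
    pass

c = C()
c.__iter__ = lambda: iter([1, 2, 3])  # instance attribute only

try:
    iter(c)
    instance_lookup = True
except TypeError:
    instance_lookup = False  # the instance __iter__ was never consulted

print(instance_lookup)  # False

C.__iter__ = lambda self: iter([1, 2, 3])  # the same method, on the type
print(list(c))  # [1, 2, 3]
```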
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Tue Nov 29 09:29:37 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 29 Nov 2005 09:29:37 +0100 Subject: [Python-Dev] Memory management in the AST parser & compiler References: <4379AAD7.2050506@iinet.net.au><6224E623-9137-4DD8-A955-AAB9B23CB148@alum.mit.edu><e8bf7a530511151142n4ab3b757kf6835853de0d6134@mail.gmail.com><ee2a432c0511151357v5679b665lc7e238c6809be9b4@mail.gmail.com><bbaeab100511160056yeb05e9cq76e09e1d1fea7c18@mail.gmail.com><13C5E91F-ACB4-4329-A16C-56DA9E1DB4EE@alum.mit.edu><437B2075.1000102@gmail.com> <dlf7ak$ckg$1@sea.gmane.org><dll2v3$78g$1@sea.gmane.org><ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> Message-ID: <dmh3hm$rvp$1@sea.gmane.org> Jeremy Hylton wrote: > > Me neither. Adding yet another memory allocation scheme to Python's > > already staggering number of memory allocation strategies sounds like > > a bad idea. > > The reason this thread started was the complaint that reference > counting in the compiler is really difficult. Almost every line of > code can lead to an error exit. The code becomes quite cluttered when > it uses reference counting. Right now, the AST is created with > malloc/free, but that makes it hard to free the ast at the right time. > It would be fairly complex to convert the ast nodes to pyobjects. > They're just simple discriminated unions right now. If they were > allocated from an arena, the entire arena could be freed when the > compilation pass ends. if you're using PyObject's for everything, you can use a list object as the arena. just append every "transient" value to the arena list, and a single DECREF will get rid of it all. if you want to copy something out from the arena, just INCREF the object and it's yours. 
(for performance reasons, it might be a good idea to add a _PyList_APPEND helper that works like app1 but steals the value reference; e.g.

    PyObject*
    _PyList_APPEND(PyListObject *self, PyObject *v)
    {
        int n;
        if (!v)
            return v;
        n = PyList_GET_SIZE(self);
        if (n == INT_MAX) {
            PyErr_SetString(PyExc_OverflowError,
                            "cannot add more objects to list");
            return NULL;
        }
        if (list_resize(self, n+1) == -1)
            return NULL;
        PyList_SET_ITEM(self, n, v);
        return v;
    }

which can be called as

    obj = _PyList_APPEND(c->arena, AST_Foobar_New(...));
    if (!obj)
        return NULL;

</F> From ncoghlan at gmail.com Tue Nov 29 13:59:52 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 29 Nov 2005 22:59:52 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> Message-ID: <438C50C8.9040005@gmail.com> Neal Norwitz wrote: > On 11/28/05, "Martin v. Löwis" <martin at v.loewis.de> wrote: >> Neal Norwitz wrote: >>> Hope this helps explain a bit. Please speak up with how this can be >>> improved. Gotta run. >> I would rewrite it as > > [code snipped] > > For those watching, Greg's and Martin's version were almost the same. > However, Greg's version left in the memory leak, while Martin fixed it > by letting the result fall through. Martin added some helpful rules > about dealing with the memory.
Martin also gets bonus points for > talking about developing a checker. :-) > > In both cases, their modified code is similar to the existing AST > code, but all deallocation is done with Py_[X]DECREFs rather than a > type specific deallocator. Definitely nicer than the current > situation. It's also the same as the rest of the python code. When working on the CST->AST parser, there were only a few things I found to be seriously painful about the memory management: 1. Remembering which free_* variant to call for AST nodes 2. Remembering which asdl_seq_*_free variant to call for ASDL sequences (it was worse when the variant I wanted didn't exist, since this was done with functions rather than preprocessor macros) 3. Remembering to transpose free_* and *_free between freeing a single node and freeing a sequence. 4. Remembering whether or not a given cleanup function could cope with NULL's or not 5. The fact that there wasn't a consistent "goto error" exception-alike mechanism in use (I had a Spanish Inquisition-esque experience writing that list ;) Simply switching to PyObjects would solve the first four problems: everything becomes a Py_XDECREF. Declaring that none of the AST node creation methods steal references would be consistent with most of the existing C API (e.g. PySequence_SetItem, PySequence_Tuple, PySequence_List), and has nice properties if we handle AST nodes as borrowed references from a PyList used as the arena, as Fredrik suggested. If the top level function refrains from putting the top level node in the arena, then it will all "just work" - any objects will be deleted only if both the PyList arena AND the top-level node object are DECREF'ed. The top-level function only has to obey two simple rules: 1. Always DECREF the arena list 2. On success, INCREF the top-level node BEFORE DECREF'ing the arena list (otherwise Step 1 kills everything. . .) 
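The two rules above can be sketched at the Python level (again with illustrative names, and relying on CPython's reference counting); keeping the top-level node out of the arena plays the role of the INCREF in rule 2:

```python
import weakref

class Node:
    def __init__(self, *children):
        self.children = children   # a node owns strong references to its children

arena = []                         # the PyList arena holding every created node

def new_node(node):
    arena.append(node)             # each intermediate node parks itself here
    return node                    # callers borrow the arena's reference

leaf = new_node(Node())
root = Node(new_node(Node(leaf)))  # rule 2: the top-level node stays OUT of the arena
probe = weakref.ref(leaf)

del leaf
arena.clear()                      # rule 1: drop the arena list
assert probe() is not None         # the tree survives: root still owns its children

del root
assert probe() is None             # with the root gone, the whole tree is reclaimed
```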
To make the code a little more self-documenting, Fredrik's _PyList_APPEND could be called "new_ast_node" and accept the compiling struct directly:

    PyObject*
    new_ast_node(struct compiling *c, PyObject *ast_node)
    {
        int idx;
        if (!ast_node)
            return ast_node;
        idx = PyList_GET_SIZE(c->arena);
        if (idx == INT_MAX) {
            PyErr_SetString(PyExc_OverflowError,
                            "cannot add more objects to arena");
            return NULL;
        }
        if (list_resize(c->arena, idx+1) == -1)
            return NULL;
        PyList_SET_ITEM(c->arena, idx, ast_node);
        return ast_node;
    }

We'd also need to modify the helper macro for identifiers:

    #define NEW_IDENTIFIER(c, n) \
        new_ast_node(c, PyString_InternFromString(STR(n)))

Then the function is only borrowing the arena's reference, and doesn't need to decref anything:

    static PyObject*
    ast_for_funcdef(struct compiling *c, const node *n)
    {
        /* funcdef: [decorators] 'def' NAME parameters ':' suite */
        PyObject *name = NULL;
        PyObject *args = NULL;
        PyObject *body = NULL;
        PyObject *decorator_seq = NULL;
        int name_i;

        REQ(n, funcdef);

        if (NCH(n) == 6) { /* decorators are present */
            decorator_seq = ast_for_decorators(c, CHILD(n, 0));
            if (!decorator_seq)
                return NULL;
            name_i = 2;
        }
        else {
            name_i = 1;
        }

        name = NEW_IDENTIFIER(c, CHILD(n, name_i));
        if (!name)
            return NULL;
        else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
            ast_error(CHILD(n, name_i), "assignment to None");
            return NULL;
        }
        args = ast_for_arguments(c, CHILD(n, name_i + 1));
        if (!args)
            return NULL;
        body = ast_for_suite(c, CHILD(n, name_i + 3));
        if (!body)
            return NULL;

        return new_ast_node(c,
            FunctionDef(name, args, body, decorator_seq, LINENO(n)));
    }

No need for a checker, because there isn't anything special to do at the call sites: each AST node can take care of putting *itself* in the arena. And as the identifier example shows, this even works for the non-AST leaf nodes that are some other kind of PyObject. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From mcherm at mcherm.com Tue Nov 29 14:28:51 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Tue, 29 Nov 2005 05:28:51 -0800 Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343 Message-ID: <20051129052851.l8aezron2rc0ksck@login.werra.lunarpages.com> Nick writes: > I think we need to fix the proposed semantics so that they access the slots > via the type, rather than directly through the instance. Otherwise the slots > for the with statement will behave strangely when compared to the slots for > other magic methods. Guido writes: > I can't make myself care > about this. The code is broken. You get an error message. Nick writes: > The bit that more concerns me is the behavioural discrepancy that comes from > having a piece of syntax that looks in the instance dictionary. No other > Python syntax is affected by the instance attributes - if the object doesn't > have the right type, you're out of luck. > > Sticking an __iter__ method on an instance doesn't turn an object into an > iterator, but with the current semantics, doing the same thing with > __context__ *will* give you a manageable context. If I'm understanding the situation here correctly, I'd like to chime in on Nick's side. I'm unconcerned about the bit of code that uses or misuses Context objects... I'm more concerned about the bit of the manual that describes (in simple words that "fit your brain") how attribute/method resolution works in Python. Right now, we say that there's one rule for all *normal* attributes and methods, and a slightly different rule for all double-underbar methods. (I'd summarize the rules here, but they're just sufficiently complex that I'm sure I'd make a mistake and wind up having people correct my mistake. 
Suffice to say that the difference between normal and double-underbar lookup has to do with checking (or not checking) the instance dictionary.) With the current state of the code, we'd need to say that there's one rule for all *normal* attributes and a slightly different rule for all double-underbar methods except for __context__ which is just like a normal attribute. That feels too big for my brain -- what on earth is so special about __context__ that it has to be different from all other double-underbar methods? If it were __init__ that had to be an exception, I'd understand, but __context__? -- Michael Chermside From guido at python.org Tue Nov 29 16:15:26 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 29 Nov 2005 07:15:26 -0800 Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343 In-Reply-To: <438C31FF.5040302@gmail.com> References: <438AE97D.2050600@iinet.net.au> <ca471dc20511280824y6af50950y93f70f9c19bfe0d9@mail.gmail.com> <438C31FF.5040302@gmail.com> Message-ID: <ca471dc20511290715g1740938ch5d02189de8f3c2a9@mail.gmail.com> On 11/29/05, Nick Coghlan <ncoghlan at gmail.com> wrote: > The bit that more concerns me is the behavioural discrepancy that comes from > having a piece of syntax that looks in the instance dictionary. No other > Python syntax is affected by the instance attributes - if the object doesn't > have the right type, you're out of luck. I'm not sure I buy that. Surely there are plenty of other places that call PyObject_GetAttr(). Classic classes still let you put an __add__ attribute in the instance dict to make it addable (though admittedly this is a weak argument since it'll go away in Py3k). > Sticking an __iter__ method on an instance doesn't turn an object into an > iterator, but with the current semantics, doing the same thing with > __context__ *will* give you a manageable context. This is all a very gray area. Before Python 2.2 most of the built-in operations *did* call PyObject_GetAttr(). 
I added the slots mostly as a speed-up, and the change in semantics was a side-effect of that. And I'm still not sure why you care -- apart from the error case, it's not going to affect anybody's code -- you should never use __xyzzy__ names except as documented since their undocumented use can change. (So yes I'm keeping the door open for turning __context__ into a slot later.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Nov 29 16:17:56 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 29 Nov 2005 07:17:56 -0800 Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343 In-Reply-To: <20051129052851.l8aezron2rc0ksck@login.werra.lunarpages.com> References: <20051129052851.l8aezron2rc0ksck@login.werra.lunarpages.com> Message-ID: <ca471dc20511290717o568cda19nf4f1899adff2c6f5@mail.gmail.com> On 11/29/05, Michael Chermside <mcherm at mcherm.com> wrote: > Right now, we say that there's one rule for all *normal* attributes and > methods, and a slightly different rule for all double-underbar methods. But it's not normal vs. __xyzzy__. A specific set of slots (including next, but excluding things like __doc__) get special treatment. The rest don't. All I'm saying is that I don't care to give __context__ this special treatment. 
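Guido's distinction (a specific set of slot-backed methods gets special treatment, while ordinary dunder attributes like __doc__ do not) can be seen directly in a short sketch; the class name is illustrative:

```python
class Thing:
    """class docs"""

t = Thing()

# __doc__ is an ordinary attribute: an instance value shadows the class value
t.__doc__ = "per-instance docs"
assert t.__doc__ == "per-instance docs"

# __len__ is backed by a type slot: an instance value is ignored by len()
t.__len__ = lambda: 3
raised = False
try:
    len(t)
except TypeError:
    raised = True
assert raised
```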
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Nov 29 16:27:41 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 29 Nov 2005 07:27:41 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438BF5D5.6090000@canterbury.ac.nz> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9D29.2020403@canterbury.ac.nz> <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com> <438BF5D5.6090000@canterbury.ac.nz> Message-ID: <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com> On 11/28/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > [...] My mental model > of parsing & compiling in the presence of a parse tree > is like this: > > [source] -> scanner -> [tokens] > -> parser -> [AST] -> code_generator -> [code] > > The fact that there still seems to be another kind of > parse tree in between the scanner and the AST generator > is an oddity which I hope will eventually disappear. Have a look at http://python.org/sf/1337696 -- a reimplementation of pgen in Python that I did for Elemental and am contributing to the PSF. It customizes the tree generation callback so as to let you produce any style of AST you like. > > I know > > Guido has said he doesn't like punishing the performance of small > > scripts in the name of large-scale apps > > To me, that's an argument in favour of always generating > a .pyc, even for scripts. I'm not sure I follow the connection. But I wouldn't mind if someone contributed code that did this.
:) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From goodger at python.org Tue Nov 29 15:59:39 2005 From: goodger at python.org (David Goodger) Date: Tue, 29 Nov 2005 09:59:39 -0500 Subject: [Python-Dev] CVS repository mostly closed now In-Reply-To: <4388D55B.1070501@v.loewis.de> References: <4388D55B.1070501@v.loewis.de> Message-ID: <438C6CDB.7070805@python.org> You can also remove CVS write privileges from project members. It's a good way to prevent accidental checkins. -- David Goodger <http://python.net/~goodger> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 253 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20051129/37c9a2d7/signature.pgp From nnorwitz at gmail.com Tue Nov 29 19:29:07 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 29 Nov 2005 10:29:07 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438C50C8.9040005@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> Message-ID: <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> On 11/29/05, Nick Coghlan <ncoghlan at gmail.com> wrote: > > When working on the CST->AST parser, there were only a few things I found to > be seriously painful about the memory management: > > 1. Remembering which free_* variant to call for AST nodes > 2. 
Remembering which asdl_seq_*_free variant to call for ASDL sequences (it > was worse when the variant I wanted didn't exist, since this was done with > functions rather than preprocessor macros) > 3. Remembering to transpose free_* and *_free between freeing a single node > and freeing a sequence. > 4. Remembering whether or not a given cleanup function could cope with > NULL's or not > 5. The fact that there wasn't a consistent "goto error" exception-alike > mechanism in use > > (I had a Spanish Inquisition-esque experience writing that list ;) :-) I agree all those are existing issues. #3 could be easily fixed. #4 I think all cleanup functions can deal with NULLs now. #5 probably ought to be fixed in favor of using gotos. > Simply switching to PyObjects would solve the first four problems: everything > becomes a Py_XDECREF. I'm mostly convinced that using PyObjects would be a good thing. However, making the change isn't free as all the types need to be created and this is likely quite a bit of code. I'd like to hear what Jeremy thinks about this. Is anyone interested in creating a patch along these lines (even a partial patch) to see the benefits? 
n From nnorwitz at gmail.com Tue Nov 29 19:17:20 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 29 Nov 2005 10:17:20 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438C042F.2050502@v.loewis.de> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C042F.2050502@v.loewis.de> Message-ID: <ee2a432c0511291017r6c99c115y722e67fbea7e5cee@mail.gmail.com> On 11/28/05, "Martin v. Löwis" <martin at v.loewis.de> wrote: > Neal Norwitz wrote: > > For those watching, Greg's and Martin's version were almost the same. > > However, Greg's version left in the memory leak, while Martin fixed it > > by letting the result fall through. > > Actually, Greg said (correctly) that his version also fixes the > leak: he assumed that FunctionDef would *consume* the references > being passed (whether it is successful or not). Ah right, I forgot about that. Thanks for correcting me (sorry Greg). Jeremy and I had talked about this before. I keep resisting this solution, though I'm not sure why.
n From edloper at gradient.cis.upenn.edu Tue Nov 29 21:27:51 2005 From: edloper at gradient.cis.upenn.edu (Edward Loper) Date: Tue, 29 Nov 2005 15:27:51 -0500 Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343 In-Reply-To: <mailman.8173.1133288956.18700.python-dev@python.org> References: <mailman.8173.1133288956.18700.python-dev@python.org> Message-ID: <f6c9c084e0f75619d461c871954c3900@gradient.cis.upenn.edu> Michael Chermside wrote: >> Right now, we say that there's one rule for all *normal* attributes >> and >> methods, and a slightly different rule for all double-underbar >> methods. Guido responded: > But it's not normal vs. __xyzzy__. A specific set of slots (including > next, but excluding things like __doc__) get special treatment. The > rest don't. All I'm saying is that I don't care to give __context__ > this special treatment. Perhaps we should officially document that the effect on special methods of overriding a class attribute with an instance attribute is undefined, for some given set of attributes? (I would say all double-underbar methods, but it sounds like the list needs to also include next().) Otherwise, it seems like people might write code that relies on the current behavior, which will then break if we eg turn __context__ into a slot. (It sounds like you want to reserve the right to change this.) Well, of course, people may rely on the current behavior anyway, but at least they'll have been warned. 
:) -Edward From bcannon at gmail.com Tue Nov 29 23:03:00 2005 From: bcannon at gmail.com (Brett Cannon) Date: Tue, 29 Nov 2005 14:03:00 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9D29.2020403@canterbury.ac.nz> <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com> <438BF5D5.6090000@canterbury.ac.nz> <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com> Message-ID: <bbaeab100511291403t6402c613j96c6fb283fb1368@mail.gmail.com> On 11/29/05, Guido van Rossum <guido at python.org> wrote: > On 11/28/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > > [...] My mental model > > of parsing & compiling in the presence of a parse tree > > is like this: > > > > [source] -> scanner -> [tokens] > > -> parser -> [AST] -> code_generator -> [code] > > > > The fact that there still seems to be another kind of > > parse tree in between the scanner and the AST generator > > is an oddity which I hope will eventually disappear. > > Have a look at http://python.org/sf/1337696 -- a reimplementation of > pgen in Python that I did for Elemental and am contributing to the > PSF. It customizes the tree generation callback so as to let you > produce an style of AST you like. > > > > I know > > > Guido has said he doesn't like punishing the performance of small > > > scripts in the name of large-scale apps > > > > To me, that's an argument in favour of always generating > > a .pyc, even for scripts. > > I'm not sure I follow the connection. 
Greg was proposing having parser, AST, and bytecode compilation all be written in Python and frozen into the executable instead of it being all C code. I said that would be slower and would punish single file scripts that don't get a .pyc generated for them because they would need to have the file compiled every execution. Greg said that is just a good argument for having *any* file, imported or passed in on the command line, to have a .pyc generated when possible. > But I wouldn't mind if someone > contributed code that did this. :) > =) Shouldn't be that complicated (but I don't have time for it right now so it isn't dead simple either =). -Brett From bcannon at gmail.com Tue Nov 29 23:05:21 2005 From: bcannon at gmail.com (Brett Cannon) Date: Tue, 29 Nov 2005 14:05:21 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> Message-ID: <bbaeab100511291405v4061a5ben5ac014e1178f336b@mail.gmail.com> On 11/29/05, Neal Norwitz <nnorwitz at gmail.com> wrote: > On 11/29/05, Nick Coghlan <ncoghlan at gmail.com> wrote: > > > > When working on the CST->AST parser, there were only a few things I found to > > be seriously painful about the memory management: > > > > 1. Remembering which free_* variant to call for AST nodes > > 2. 
Remembering which asdl_seq_*_free variant to call for ASDL sequences (it > > was worse when the variant I wanted didn't exist, since this was done with > > functions rather than preprocessor macros) > > 3. Remembering to transpose free_* and *_free between freeing a single node > > and freeing a sequence. > > 4. Remembering whether or not a given cleanup function could cope with > > NULL's or not > > 5. The fact that there wasn't a consistent "goto error" exception-alike > > mechanism in use > > > > (I had a Spanish Inquisition-esque experience writing that list ;) > > :-) I agree all those are existing issues. #3 could be easily fixed. > #4 I think all cleanup functions can deal with NULLs now. #5 > probably ought to be fixed in favor of using gotos. > > > Simply switching to PyObjects would solve the first four problems: everything > > becomes a Py_XDECREF. > > I'm mostly convinced that using PyObjects would be a good thing. > However, making the change isn't free as all the types need to be > created and this is likely quite a bit of code. I'd like to hear what > Jeremy thinks about this. > > Is anyone interested in creating a patch along these lines (even a > partial patch) to see the benefits? > Or should perhaps a branch be made since Subversion makes it so cheap and this allows multiple people to work on it? 
-Brett From greg.ewing at canterbury.ac.nz Tue Nov 29 23:15:16 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 30 Nov 2005 11:15:16 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> Message-ID: <438CD2F4.4090702@canterbury.ac.nz> Neal Norwitz wrote: > For those watching, Greg's and Martin's version were almost the same. > However, Greg's version left in the memory leak, while Martin fixed it > by letting the result fall through. I addressed the memory leak by stipulating that FunctionDef should steal references to its arguments (whether it succeeds or not). However, while that trick works in this particular case, it wouldn't be so helpful in more complicated situations, so Martin's version is probably a better model to follow. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Nov 29 23:32:06 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 30 Nov 2005 11:32:06 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438C50C8.9040005@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <dlf7ak$ckg$1@sea.gmane.org> <dll2v3$78g$1@sea.gmane.org> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> Message-ID: <438CD6E6.4030504@canterbury.ac.nz> Nick Coghlan wrote: > Declaring that none of the AST node creation methods steal references would be > consistent with most of the existing C API (e.g. PySequence_SetItem, > PySequence_Tuple, PySequence_List), Agreed, although the rest of your proposal (while admirably cunning) requires that ast-building functions effectively return borrowed references, which is not usual. That's not to say it shouldn't be done, but it does differ from the usual conventions, and that would need to be kept in mind. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Nov 29 23:43:45 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 30 Nov 2005 11:43:45 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9D29.2020403@canterbury.ac.nz> <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com> <438BF5D5.6090000@canterbury.ac.nz> <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com> Message-ID: <438CD9A1.4050202@canterbury.ac.nz> Guido van Rossum wrote: >>To me, that's an argument in favour of always generating >>a .pyc, even for scripts. > > I'm not sure I follow the connection. You were saying that if the parser and compiler were slow, it would slow down single-file scripts that didn't have a .pyc (or at least that's what I thought you were saying). If a .pyc were always generated, this problem would not arise. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Nov 29 23:49:00 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 30 Nov 2005 11:49:00 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511291017r6c99c115y722e67fbea7e5cee@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C042F.2050502@v.loewis.de> <ee2a432c0511291017r6c99c115y722e67fbea7e5cee@mail.gmail.com> Message-ID: <438CDADC.1090806@canterbury.ac.nz> Neal Norwitz wrote: > On 11/28/05, "Martin v. L?wis" <martin at v.loewis.de> wrote: > > > he assumed that FunctionDef would *consume* the references > > being passed (whether it is successful or not). > > I keep resisting this solution, though I'm not sure why. One reason for not liking it is that it only works well when you only call one such function from a given function. If there are two, you have to worry about not reaching the second one due to the first one failing, in which case you need to decref the second one's args yourself. In the long run it's probably best to stick to the conventional conventions, which are there for a reason -- they work! -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Nov 29 23:52:21 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 30 Nov 2005 11:52:21 +1300 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> Message-ID: <438CDBA5.9050207@canterbury.ac.nz> Neal Norwitz wrote: > I'm mostly convinced that using PyObjects would be a good thing. > However, making the change isn't free as all the types need to be > created and this is likely quite a bit of code. Since they're all so similar, perhaps they could be auto-generated by a fairly simple script? (I'm being very careful not to suggest using Pyrex for this, as I can appreciate the desire not to make such a fundamental part of the core dependent on it!) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From vinay_sajip at yahoo.co.uk Tue Nov 29 23:49:48 2005 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 29 Nov 2005 22:49:48 +0000 (UTC) Subject: [Python-Dev] =?utf-8?q?Proposed_additional_keyword_argument_in_lo?= =?utf-8?q?gging=09calls?= References: <001a01c5ef77$d7682300$0200a8c0@alpha> <ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com> Message-ID: <loom.20051129T234808-309@post.gmane.org> Guido van Rossum <guido <at> python.org> writes: > This looks like a good clean solution to me. I agree with Paul Moore's > suggestion that if extra_info is not None you should just go ahead and > use it as a dict and let the errors propagate. OK. > What's the rationale for not letting it override existing fields? > (There may be a good one, I just don't see it without turning on my > thinking cap, which would cost extra.) The existing fields which could be overwritten are ones which have been computed by the logging package itself:

name            Name of the logger
levelno         Numeric logging level for the message (DEBUG, INFO, WARNING, ERROR, CRITICAL)
levelname       Text logging level for the message ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
msg             The message passed in the logging call
args            The additional args passed in the logging call
exc_info        Exception information (from sys.exc_info())
exc_text        Exception text (cached for use by multiple handlers)
pathname        Full pathname of the source file where the logging call was issued (if available)
filename        Filename portion of pathname
module          Module (name portion of filename)
lineno          Source line number where the logging call was issued (if available)
created         Time when the LogRecord was created (time.time() return value)
msecs           Millisecond portion of the creation time
relativeCreated Time in milliseconds when the LogRecord was created, relative to the time the logging module was loaded (typically at application startup time)
thread          Thread ID (if available)
process         Process ID (if available)
message         The result of record.getMessage(), computed just as the record is emitted

I couldn't think of a good reason why it should be possible to overwrite these values with values from a user-supplied dictionary, other than to spoof log entries in some way. The intention is to stop a user accidentally overwriting one of the above attributes. But thinking about "Errors should never pass silently", I propose that an exception (KeyError seems most appropriate, though here it would be because a key was present rather than absent) be thrown if one of the above attribute names is supplied as a key in the user-supplied dict. > Perhaps it makes sense to call it 'extra' instead of 'extra_info'? Fine - 'extra' it will be. > As a new feature it should definitely not go into 2.4; but I don't see > how it could break existing code. > OK - thanks for the feedback. Regards, Vinay Sajip From skip at pobox.com Wed Nov 30 00:53:47 2005 From: skip at pobox.com (skip@pobox.com) Date: Tue, 29 Nov 2005 17:53:47 -0600 Subject: [Python-Dev] =?utf-8?q?Proposed_additional_keyword_argument_in_lo?= =?utf-8?q?gging=09calls?= In-Reply-To: <loom.20051129T234808-309@post.gmane.org> References: <001a01c5ef77$d7682300$0200a8c0@alpha> <ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com> <loom.20051129T234808-309@post.gmane.org> Message-ID: <17292.59915.267228.293830@montanaro.dyndns.org> Vinay> I couldn't think of a good reason why it should be possible to Vinay> overwrite these values with values from a user-supplied Vinay> dictionary, other than to spoof log entries in some way. If the user doesn't need those values and can provide cheap substitutes, perhaps their computation can be avoided. I did that recently by inlining only the parts of logging.LogRecord.__init__ in a subclass and avoided calling logging.LogRecord.__init__ altogether. It generated lots of instance variables we never use and just slowed things down. 
Skip From guido at python.org Wed Nov 30 05:19:20 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 29 Nov 2005 20:19:20 -0800 Subject: [Python-Dev] Proposed additional keyword argument in logging calls In-Reply-To: <loom.20051129T234808-309@post.gmane.org> References: <001a01c5ef77$d7682300$0200a8c0@alpha> <ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com> <loom.20051129T234808-309@post.gmane.org> Message-ID: <ca471dc20511292019o39a863e9p2a4e030ee3eb6ee8@mail.gmail.com> On 11/29/05, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote: > But thinking about "Errors should never pass silently", I propose that an > exception (KeyError seems most appropriate, though here it would be because a > key was present rather than absent) be thrown if one of the above attribute > names is supplied as a key in the user-supplied dict. +1 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Wed Nov 30 10:42:20 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 30 Nov 2005 19:42:20 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438CDBA5.9050207@canterbury.ac.nz> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> <438CDBA5.9050207@canterbury.ac.nz> Message-ID: <438D73FC.4090009@gmail.com> Greg Ewing wrote: > Neal Norwitz wrote: > >> I'm mostly convinced that using PyObjects would be a good thing. 
>> However, making the change isn't free as all the types need to be >> created and this is likely quite a bit of code. > > Since they're all so similar, perhaps they could be > auto-generated by a fairly simple script? > > (I'm being very careful not to suggest using Pyrex > for this, as I can appreciate the desire not to make > such a fundamental part of the core dependent on it!) The ast C structs are already auto-generated by a Python script (asdl_c.py, to be precise). The trick is to make that script generate full PyObjects rather than the simple C structures that it generates now. I believe Jeremy wrote that early in the life of the AST branch, so it's worth waiting for his advice on how to go about modifying it. asdl_seq can disappear entirely: we can just use a PyList instead. The second step is to then modify ast.c to use the new structures. A branch probably wouldn't help much with initial development (this is a "break the world, check in when stuff compiles again" kind of change, which is hard to split amongst multiple people), but I think it would be of benefit when reviewing the change before moving it back to the trunk. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Wed Nov 30 10:51:26 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 30 Nov 2005 19:51:26 +1000 Subject: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343 In-Reply-To: <f6c9c084e0f75619d461c871954c3900@gradient.cis.upenn.edu> References: <mailman.8173.1133288956.18700.python-dev@python.org> <f6c9c084e0f75619d461c871954c3900@gradient.cis.upenn.edu> Message-ID: <438D761E.2040602@gmail.com> Edward Loper wrote: > Otherwise, it seems like people might write code that relies on the > current behavior, which will then break if we eg turn __context__ into > a slot. 
(It sounds like you want to reserve the right to change this.) > Well, of course, people may rely on the current behavior anyway, but > at least they'll have been warned. :) Yep - I thought "the instance dictionary has no effect" was an actual rule, but it turns out the rules are slightly looser than that (specifically, the fact that the effect of having a slot name in the instance dictionary is undefined). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From mwh at python.net Wed Nov 30 11:02:05 2005 From: mwh at python.net (Michael Hudson) Date: Wed, 30 Nov 2005 10:02:05 +0000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438CD9A1.4050202@canterbury.ac.nz> (Greg Ewing's message of "Wed, 30 Nov 2005 11:43:45 +1300") References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9D29.2020403@canterbury.ac.nz> <bbaeab100511281629xd89651eudc0c7ed5b1a36eb7@mail.gmail.com> <438BF5D5.6090000@canterbury.ac.nz> <ca471dc20511290727v3e34f2efhf4dc54150d84d28a@mail.gmail.com> <438CD9A1.4050202@canterbury.ac.nz> Message-ID: <2m1x0ycv36.fsf@starship.python.net> Greg Ewing <greg.ewing at canterbury.ac.nz> writes: > Guido van Rossum wrote: > >>>To me, that's an argument in favour of always generating >>>a .pyc, even for scripts. >> >> I'm not sure I follow the connection. > > You were saying that if the parser and compiler were > slow, it would slow down single-file scripts that > didn't have a .pyc (or at least that's what I thought > you were saying). 
If a .pyc were always generated, > this problem would not arise. Well, the current stdlib compiler is unacceptably slow, no question. I don't want "make install" to take as long as "regrtest -u all test_compiler", or make test to take nearly that long in all cases. Cheers, mwh -- 58. Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From krumms at gmail.com Wed Nov 30 12:58:40 2005 From: krumms at gmail.com (Thomas Lee) Date: Wed, 30 Nov 2005 21:58:40 +1000 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438D73FC.4090009@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> <438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com> Message-ID: <438D93F0.3000005@gmail.com> Nick Coghlan wrote: >Greg Ewing wrote: > > >>Neal Norwitz wrote: >> >> >> >>>I'm mostly convinced that using PyObjects would be a good thing. >>>However, making the change isn't free as all the types need to be >>>created and this is likely quite a bit of code. >>> >>> >>Since they're all so similar, perhaps they could be >>auto-generated by a fairly simple script? >> >>(I'm being very careful not to suggest using Pyrex >>for this, as I can appreciate the desire not to make >>such a fundamental part of the core dependent on it!) 
>> >> > >The ast C structs are already auto-generated by a Python script (asdl_c.py, to >be precise). The trick is to make that script generate full PyObjects rather >than the simple C structures that it generates now. > > > I was actually trying this approach last night. I'm back to it this evening, working with the ast-objects branch. I'll push a patch tonight with whatever I get done. Quick semi-related question: where are the marshal_* functions called? They're all static in Python-ast.c and don't seem to be actually called anywhere. Can we ditch them? >The second step is to then modify ast.c to use the new structures. A branch >probably wouldn't help much with initial development (this is a "break the >world, check in when stuff compiles again" kind of change, which is hard to >split amongst multiple people), but I think it would be of benefit when >reviewing the change before moving it back to the trunk. > > > Based on my (limited) experience and your approach, compile.c may also need to be modified a little too (this should be pretty trivial). Cheers, Tom From amk at amk.ca Wed Nov 30 14:52:18 2005 From: amk at amk.ca (A.M. Kuchling) Date: Wed, 30 Nov 2005 08:52:18 -0500 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438D73FC.4090009@gmail.com> References: <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> <438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com> Message-ID: <20051130135218.GA23728@rogue.amk.ca> On Wed, Nov 30, 2005 at 07:42:20PM +1000, Nick Coghlan wrote: > The second step is to then modify ast.c to use the new structures. 
A branch > probably wouldn't help much with initial development (this is a "break the > world, check in when stuff compiles again" kind of change, which is hard to > split amongst multiple people), ... There is a bug day scheduled for this Sunday, so maybe the AST developers could meet to coordinate this change. --amk From theller at python.net Wed Nov 30 10:04:07 2005 From: theller at python.net (Thomas Heller) Date: Wed, 30 Nov 2005 10:04:07 +0100 Subject: [Python-Dev] =?utf-8?q?Proposed_additional_keyword_argument_in_lo?= =?utf-8?q?gging=09calls?= References: <001a01c5ef77$d7682300$0200a8c0@alpha> <ca471dc20511281213i7aa48897qb4fd10d89fbae5dd@mail.gmail.com> <loom.20051129T234808-309@post.gmane.org> Message-ID: <64qaik1k.fsf@python.net> Vinay Sajip <vinay_sajip at yahoo.co.uk> writes: > The existing fields which could be overwritten are ones which have been computed > by the logging package itself: > > name Name of the logger > levelno Numeric logging level for the message (DEBUG, INFO, > WARNING, ERROR, CRITICAL) [and so on]. Shouldn't this list be documented? Or is it? Thomas From jimjjewett at gmail.com Wed Nov 30 18:39:58 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 30 Nov 2005 12:39:58 -0500 Subject: [Python-Dev] Proposed additional keyword argument in logging calls Message-ID: <fb6fbf560511300939i2e31eb16la2a3fb15bb688053@mail.gmail.com> > I couldn't think of a good reason why it should be possible to overwrite these > values with values from a user-supplied dictionary, other than to spoof log > entries in some way. The intention is to stop a user accidentally overwriting > one of the above attributes. This makes sense, but is it worth the time to check on each logging call? 
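The behaviour agreed in this thread (merge the user-supplied dict into the record, and raise KeyError on a clash with a computed attribute) is what the logging package's `extra` keyword argument does; a minimal sketch, with `clientip` as an example user-chosen field name:

```python
import io
import logging

logger = logging.getLogger("demo")
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# A user-defined field can appear directly in the format string...
handler.setFormatter(logging.Formatter("%(levelname)s %(clientip)s %(message)s"))
logger.addHandler(handler)

# ...because keys in `extra` become attributes on the LogRecord.
logger.warning("connection refused", extra={"clientip": "192.0.2.1"})
assert stream.getvalue() == "WARNING 192.0.2.1 connection refused\n"

# Clashing with a field computed by the logging package raises KeyError,
# per the proposal above ("present rather than absent").
raised = False
try:
    logger.warning("oops", extra={"msg": "spoofed"})
except KeyError:
    raised = True
assert raised
```

This also answers the per-call cost question in part: the clash check only runs when an `extra` dict is actually passed.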
-jJ From nnorwitz at gmail.com Wed Nov 30 19:21:27 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 30 Nov 2005 10:21:27 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <438D93F0.3000005@gmail.com> References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> <438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com> <438D93F0.3000005@gmail.com> Message-ID: <ee2a432c0511301021m2e72d710r173f085b84cc2f4@mail.gmail.com> On 11/30/05, Thomas Lee <krumms at gmail.com> wrote: > > Quick semi-related question: where are the marshal_* functions called? > They're all static in Python-ast.c and don't seem to be actually called > anywhere. Can we ditch them? I *think* they are not necessary. My guess is that they were there for marshaling the AST to disk, though I'm not sure why we would want to do that. It could also have been that the idea was to marshal them to PyObjects and export them that way. Unless you hear otherwise from Jeremy, I would probably remove them. I can check your patch into the branch so others can get an idea and hopefully provide comments. 
n From nas at arctrix.com Wed Nov 30 18:24:49 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 30 Nov 2005 17:24:49 +0000 (UTC) Subject: [Python-Dev] Memory management in the AST parser & compiler References: <4379AAD7.2050506@iinet.net.au> <ca471dc20511281214h11fd0572kea0f811a1fa45d3@mail.gmail.com> <e8bf7a530511281247n665ea1f6qdac70180f3d89fef@mail.gmail.com> <ca471dc20511281315s1f8caeb0wae7ce7931c5354e3@mail.gmail.com> <e8bf7a530511281323h3a1bfe49ydbf9f0b38609e56b@mail.gmail.com> <ca471dc20511281346p7156a1f7i4c705561e2e28da1@mail.gmail.com> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> <438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com> <438D93F0.3000005@gmail.com> Message-ID: <dmkn90$27q$1@sea.gmane.org> Thomas Lee <krumms at gmail.com> wrote: > Quick semi-related question: where are the marshal_* functions called? > They're all static in Python-ast.c and don't seem to be actually called > anywhere. Can we ditch them? They are intended to be used to make the AST available to Python code. It would be nice if they could be retained but nothing will break (AFAIK) if they are ditched. Neil From mfb at lotusland.dyndns.org Wed Nov 30 19:40:26 2005 From: mfb at lotusland.dyndns.org (Matthew F. Barnes) Date: Wed, 30 Nov 2005 12:40:26 -0600 Subject: [Python-Dev] Short-circuiting iterators Message-ID: <1133376026.19766.31.camel@localhost.localdomain> Hello, I've not had much luck in searching for a discussion on this in the Python-Dev archives, so bear with me. I had an idea this morning for a simple extension to Python's iterator protocol that would allow the user to force an iterator to raise StopIteration on the next call to next(). My thought was to add a new method to iterators called stop(). 
In my situation it would be useful as a control-flow mechanism, but I imagine there are many other use cases for it:

    generator = some_generator_function()
    for x in generator:
        ... deeply ...
        ... nested ...
        ... control-flow ...
        if satisfaction_condition:
            # Terminates the for-loop, but
            # finishes the current iteration
            generator.stop()
    ... more stuff ...

I'm curious if anything like this has been proposed in the past. If so, could someone kindly point me to any relevant mailing list threads? Matthew Barnes From nnorwitz at gmail.com Wed Nov 30 19:54:45 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 30 Nov 2005 10:54:45 -0800 Subject: [Python-Dev] Memory management in the AST parser & compiler In-Reply-To: <dmkn90$27q$1@sea.gmane.org> References: <4379AAD7.2050506@iinet.net.au> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> <438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com> <438D93F0.3000005@gmail.com> <dmkn90$27q$1@sea.gmane.org> Message-ID: <ee2a432c0511301054u7bae50f9i22c2e2749b2f9969@mail.gmail.com> On 11/30/05, Neil Schemenauer <nas at arctrix.com> wrote: > Thomas Lee <krumms at gmail.com> wrote: > > Quick semi-related question: where are the marshal_* functions called? > > They're all static in Python-ast.c and don't seem to be actually called > > anywhere. Can we ditch them? > > They are intended to be used to make the AST available to Python > code. It would be nice if they could be retained but nothing will > break (AFAIK) if they are ditched. If everything is a PyObject, wouldn't they be redundant? 
n From aleaxit at gmail.com Wed Nov 30 19:57:51 2005 From: aleaxit at gmail.com (Alex Martelli) Date: Wed, 30 Nov 2005 10:57:51 -0800 Subject: [Python-Dev] Short-circuiting iterators In-Reply-To: <1133376026.19766.31.camel@localhost.localdomain> References: <1133376026.19766.31.camel@localhost.localdomain> Message-ID: <e8a0972d0511301057i35e1a4cei42ba02529859ceb2@mail.gmail.com> On 11/30/05, Matthew F. Barnes <mfb at lotusland.dyndns.org> wrote: ... > I'm curious if anything like this has been proposed in the past. If so, > could someone kindly point me to any relevant mailing list threads? PEP 342, already accepted and found at http://python.org/peps/pep-0342.html , covers related functionality (as well as many other points). Alex From mfb at lotusland.dyndns.org Wed Nov 30 20:16:25 2005 From: mfb at lotusland.dyndns.org (Matthew F. Barnes) Date: Wed, 30 Nov 2005 13:16:25 -0600 Subject: [Python-Dev] Short-circuiting iterators In-Reply-To: <e8a0972d0511301057i35e1a4cei42ba02529859ceb2@mail.gmail.com> References: <1133376026.19766.31.camel@localhost.localdomain> <e8a0972d0511301057i35e1a4cei42ba02529859ceb2@mail.gmail.com> Message-ID: <1133378185.19766.39.camel@localhost.localdomain> On Wed, 2005-11-30 at 10:57 -0800, Alex Martelli wrote: > PEP 342, already accepted and found at > http://python.org/peps/pep-0342.html , covers related functionality > (as well as many other points). Thanks Alex, I'll take another look at that PEP. The first time I tried to read it my brain started to sizzle. I happened to use a generator-iterator in my example, but my thought was that the extension could be applied to iterators in general, including sequence-iterators. 
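For the generator half of the request, the PEP 342 machinery Alex points to covers it directly: generators gained a close() method, after which further iteration simply raises StopIteration (shown with the present-day next() spelling rather than 2.x's .next()):

```python
def counter():
    n = 0
    while True:
        yield n
        n += 1

gen = counter()
assert next(gen) == 0
assert next(gen) == 1

gen.close()              # PEP 342: raises GeneratorExit inside the generator

# A closed generator behaves like an exhausted iterator, so a for-loop
# over it (or list() here) terminates immediately.
assert list(gen) == []
```

For arbitrary iterators (including sequence-iterators) there is no close(), so a wrapper object or an itertools-style predicate such as takewhile is still the way to get the same effect.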
Matthew Barnes From edloper at gradient.cis.upenn.edu Wed Nov 30 20:36:54 2005 From: edloper at gradient.cis.upenn.edu (Edward Loper) Date: Wed, 30 Nov 2005 14:36:54 -0500 Subject: [Python-Dev] Short-circuiting iterators In-Reply-To: <mailman.8427.1133378193.18700.python-dev@python.org> References: <mailman.8427.1133378193.18700.python-dev@python.org> Message-ID: <4a6398cac1420f3b957ec1fd449e439f@gradient.cis.upenn.edu> > I had an idea this morning for a simple extension to Python's iterator > protocol that would allow the user to force an iterator to raise > StopIteration on the next call to next(). My thought was to add a new > method to iterators called stop(). There's no need to change the iterator protocol for your example use case; you could just define a simple iterator-wrapper:

    class InterruptableIterator:
        stopped = False
        def __init__(self, iter):
            self.iter = iter
        def next(self):
            if self.stopped:
                raise StopIteration('iterator stopped.')
            return self.iter.next()
        def stop(self):
            self.stopped = True

And then just replace: > generator = some_generator_function() with: generator = InterruptableIterator(some_generator_function()) -Edward From reinhold-birkenfeld-nospam at wolke7.net Wed Nov 30 20:51:09 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Wed, 30 Nov 2005 20:51:09 +0100 Subject: [Python-Dev] something is wrong with test___all__ In-Reply-To: <ca471dc20511281216p36548ba6l6779da343d14e805@mail.gmail.com> References: <dm03lq$41u$1@sea.gmane.org> <ca471dc20511281216p36548ba6l6779da343d14e805@mail.gmail.com> Message-ID: <dmkvre$5jc$1@sea.gmane.org> Guido van Rossum wrote: > Has this been handled yet? If not, perhaps showing the good and bad > bytecode here would help trigger someone's brain into understanding > the problem. I've created a tracker item at www.python.org/sf/1370322. Reinhold -- Mail address is perfectly valid! From mfb at lotusland.dyndns.org Wed Nov 30 20:52:03 2005 From: mfb at lotusland.dyndns.org (Matthew F. 
Barnes) Date: Wed, 30 Nov 2005 13:52:03 -0600 Subject: [Python-Dev] Short-circuiting iterators In-Reply-To: <4a6398cac1420f3b957ec1fd449e439f@gradient.cis.upenn.edu> References: <mailman.8427.1133378193.18700.python-dev@python.org> <4a6398cac1420f3b957ec1fd449e439f@gradient.cis.upenn.edu> Message-ID: <1133380323.19766.45.camel@localhost.localdomain> On Wed, 2005-11-30 at 14:36 -0500, Edward Loper wrote: > There's no need to change the iterator protocol for your example use > case; you could just define a simple iterator-wrapper: Good point. Perhaps it would be a useful addition to the itertools module then? itertools.interruptable(iterable) Matthew Barnes From nas at arctrix.com Wed Nov 30 20:49:53 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 30 Nov 2005 19:49:53 +0000 (UTC) Subject: [Python-Dev] Memory management in the AST parser & compiler References: <4379AAD7.2050506@iinet.net.au> <ee2a432c0511281458n27806bd6y4358d2eeefc94a50@mail.gmail.com> <438B9F12.3060607@v.loewis.de> <ee2a432c0511282324m2400e968n6d4f48d531268257@mail.gmail.com> <438C50C8.9040005@gmail.com> <ee2a432c0511291029m5bdc4564s84457533037a7e11@mail.gmail.com> <438CDBA5.9050207@canterbury.ac.nz> <438D73FC.4090009@gmail.com> <438D93F0.3000005@gmail.com> <dmkn90$27q$1@sea.gmane.org> <ee2a432c0511301054u7bae50f9i22c2e2749b2f9969@mail.gmail.com> Message-ID: <dmkvp1$4f3$1@sea.gmane.org> Neal Norwitz <nnorwitz at gmail.com> wrote: > If everything is a PyObject, wouldn't [the marshal functions] be > redundant? You could be right. Spending time to keep them working is probably wasted effort. Neil From barry at python.org Wed Nov 30 22:24:01 2005 From: barry at python.org (Barry Warsaw) Date: Wed, 30 Nov 2005 16:24:01 -0500 Subject: [Python-Dev] Standalone email package in the sandbox Message-ID: <1133385841.23988.10.camel@geddy.wooz.org> Unless there are any objections, I'd like to create a space in the sandbox for the standalone email package miscellany. 
This currently lives in the mimelib project's hidden CVS on SF, but that seems pretty silly. Basically I'm just going to add the test script, setup.py, generated html docs and a few additional unit tests, along with svn:external refs to pull in Lib/email from the appropriate Python svn tree. This way, I'll be able to create standalone email packages from the sandbox (which I need to do because I plan on fixing a few outstanding email bugs). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051130/e88db51d/attachment.pgp