From ned at nedbatchelder.com Sun May 1 00:49:11 2011
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sat, 30 Apr 2011 18:49:11 -0400
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
Message-ID: <4DBC91E7.9060402@nedbatchelder.com>

This week I learned something new about trace functions (how to write a C
trace function that survives a sys.settrace(sys.gettrace()) round-trip), and
while writing up what I learned, I was surprised to discover that trace
functions don't behave the way I thought, or the way the docs say they
behave.

The docs say:

    The trace function is invoked (with /event/ set to 'call') whenever a
    new local scope is entered; it should return a reference to a local
    trace function to be used that scope, or None if the scope shouldn't
    be traced.

    The local trace function should return a reference to itself (or to
    another function for further tracing in that scope), or None to turn
    off tracing in that scope.

It's that last part that's wrong: returning None from the trace function
only has an effect on the first call in a new frame. Once the trace
function returns a function for a frame, returning None from subsequent
calls is ignored. A "local trace function" can't turn off tracing in its
scope.

To demonstrate:

    import sys

    UPTO_LINE = 1

    def t(frame, event, arg):
        num = frame.f_lineno
        print("line %d" % num)
        if num < UPTO_LINE:
            return t

    def try_it():
        print("twelve")
        print("thirteen")
        print("fourteen")
        print("fifteen")

    UPTO_LINE = 1
    sys.settrace(t)
    try_it()

    UPTO_LINE = 13
    sys.settrace(t)
    try_it()

Produces:

    line 11
    twelve
    thirteen
    fourteen
    fifteen
    line 11
    line 12
    twelve
    line 13
    thirteen
    line 14
    fourteen
    line 15
    fifteen
    line 15

The first call to try_it() returns None immediately, preventing tracing for
the rest of the function. The second call returns None at line 13, but the
rest of the function is traced anyway.
This behavior is the same in all
versions from 2.3 to 3.2; in fact, the 100 lines of code in sysmodule.c
responsible for Python tracing functions are completely unchanged through
those versions. (A deeper mystery that I haven't looked into yet is why
Python 3.x intersperses all of these lines with "line 18" interjections.)

I'm writing this email because I'm not sure whether this is a behavior bug
or a doc bug. One of them is wrong, since they disagree. The documented
behavior makes sense, and is what people have all along thought the trace
function did. The actual behavior is a bit more complicated to explain, but
is what people have actually been experiencing. FWIW, PyPy implements the
documented behavior.

Should we fix the code or the docs? I'd be glad to supply a patch for
either.

--Ned.

From guido at python.org Sun May 1 02:43:27 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 30 Apr 2011 17:43:27 -0700
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
In-Reply-To: <4DBC91E7.9060402@nedbatchelder.com>
References: <4DBC91E7.9060402@nedbatchelder.com>
Message-ID: <BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>

I think you need to go back farther in time. :-) In Python 2.0 the
call_trace function in ceval.c has a completely different signature
(but the docs are the same). I haven't checked all history but
somewhere between 2.0 and 2.3, SET_LINENO-less tracing was added, and
that's where the implementation must have gone wrong. So I think we
should fix the code.
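
The disagreement between code and docs is easy to check on whatever
interpreter is at hand. Below is a minimal sketch, not taken from the
thread: record_events, keep_tracing and sample are illustrative names, and
only sys.settrace() and sys.gettrace() are real APIs. It records the events
seen in a single frame when the trace function returns None at different
points.

```python
import sys


def record_events(func, keep_tracing):
    """Run func under a trace function; return the events seen in its frame.

    keep_tracing(event, n) decides whether the trace function returns
    itself (keep tracing) or None; n is the number of events seen so far.
    All names here are illustrative, not part of any library.
    """
    events = []
    target = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None              # ignore every other scope
        events.append(event)
        return tracer if keep_tracing(event, len(events)) else None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)       # always restore the earlier tracer
    return events


def sample():
    a = 1
    b = 2
    return a + b


# Return None on the initial 'call' event: the scope is not traced at all.
untraced = record_events(sample, lambda event, n: False)

# Always return the tracer: every line plus the final return is reported.
traced = record_events(sample, lambda event, n: True)

# Return None from the *local* trace function after the first line event.
stop_midway = record_events(sample, lambda event, n: n < 2)

print(untraced)
print(traced)
print(stop_midway)
```

With the documented behaviour stop_midway would stop after its first line
event; with the behaviour Ned describes it would match the fully traced run.
The sketch prints the result rather than assuming either outcome.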

--Guido

--
--Guido van Rossum (python.org/~guido)

From techtonik at gmail.com Sun May 1 12:40:43 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Sun, 1 May 2011 13:40:43 +0300
Subject: [Python-Dev] 2to3 status, repositories and HACKING guide
In-Reply-To: <AANLkTi=frkAceLLEtBYWDsQCKSOsVrQPv1jJ=G3h_jT1@mail.gmail.com>
References: <AANLkTi=Uh7f56LdZ_kgvwQVF7t_FYNgqpfH34DdgppNG@mail.gmail.com>
 <AANLkTingP4ZwRv=ta6RC_C59nbS6gN-hkzVm15Xdd-2Y@mail.gmail.com>
 <AANLkTi=frkAceLLEtBYWDsQCKSOsVrQPv1jJ=G3h_jT1@mail.gmail.com>
Message-ID: <BANLkTi=87D_89D67swsy4gLO6-U2MGZPSg@mail.gmail.com>

Is there any high-level overview of 2to3 tool that people can use as a
quick start for writing their own fixers? Source doesn't explain much
(to me at least), and some kind of "learn by example" would really
help a lot. In particular, I find the syntax of tree matchers the most
unclear part.
--
anatoly t.

On Fri, Mar 25, 2011 at 9:12 PM, Benjamin Peterson <benjamin at python.org> wrote:
> The main cpython repo.
>
> 2011/3/25 anatoly techtonik <techtonik at gmail.com>:
>> Hi, Benjamin,
>>
>> Is your repository for 2to3 is still actual?
>> http://svn.python.org/view/sandbox/trunk/2to3/
>>
>> Which should I use to start hacking on 2to3?
>>
>> --
>> anatoly t.
>>
>> On Wed, Mar 23, 2011 at 9:01 AM, anatoly techtonik <techtonik at gmail.com> wrote:
>>> Hi,
>>>
>>> Currently 2to3 page at http://wiki.python.org/moin/2to3 lists
>>> http://svn.python.org/view/sandbox/trunk/2to3 as a repository for 2to3
>>> tool. There is also an outdated repository at http://hg.python.org/
>>> and the page says that the code is finally integrated into CPython 2.6
>>> - you can see it at
>>> http://hg.python.org/cpython/file/default/Lib/lib2to3. So, what
>>> version is more up-to-date?
>>>
>>> In svn repository there is a HACKING guide advising to use
>>> find_pattern.py script for writing new fixer. However, there is no
>>> find_pattern.py in CPython repository, no HACKING guide, no any
>>> documentation about how to write fixers or description of PATTERN
>>> format. Did I miss something?
>>> --
>>> anatoly t.
>>>
>>
>
> --
> Regards,
> Benjamin
>

From ncoghlan at gmail.com Sun May 1 13:27:44 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 1 May 2011 21:27:44 +1000
Subject: [Python-Dev] Not-a-Number (was PyObject_RichCompareBool identity shortcut)
In-Reply-To: <BANLkTi=v98ZLbqTGSBED-MdE4V4X6JoTdg@mail.gmail.com>
References: <4DB7E3EA.3030208@avl.com>
 <BANLkTik6Fr0e=5PLNTu4x=CT+v12tt3Tsg@mail.gmail.com>
 <87d3k79jvt.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BANLkTi=AusPRDsf2zKDGteZ5dGxs0EEuXw@mail.gmail.com>
 <4DB90748.4030501@g.nevcal.com>
 <BANLkTi=eAug-2n+MsQvSpaet5PM4NQDHSg@mail.gmail.com>
 <4DB916DE.1050302@g.nevcal.com>
 <BANLkTikGVfox3dXkO7B5f5iQbX5L8ypNgw@mail.gmail.com>
 <4DB927F4.3040206@dcs.gla.ac.uk>
 <ipchp6$1ba$1@dough.gmane.org>
 <871v0la5yg.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BANLkTi=v98ZLbqTGSBED-MdE4V4X6JoTdg@mail.gmail.com>
Message-ID: <BANLkTi=dcK5fewo0bdoDgySBhOqYcxz=uQ@mail.gmail.com>

On Sat, Apr 30, 2011 at 3:11 AM, Guido van Rossum <guido at python.org> wrote:
> Decimal, for that reason, has a context that lets one specify
> different behaviors when a NaN is produced. Would it make sense to add
> a float context that also lets one specify what should happen? That
> could include returning Inf for 1.0/0.0 (for experts), or raising
> exceptions when NaNs are produced (for the numerically naive like
> myself).
>
> I could see a downside too, e.g. the correctness of code that
> passingly uses floats might be affected by the context settings.
> There's also the question of whether the float context should affect
> int operations; floats vs. ints is another can of worms since (in
> Python 3) we attempt to tie them together through 1/2 == 0.5, but ints
> have a much larger range than floats.

Given that we delegate most float() behaviour to the underlying CPU
and C libraries (and then the math module tries to cope with any
cross-platform discrepancies), introducing context handling isn't
easy, and would likely harm the current speed advantage that floats
hold over the decimal module.

We decided that losing the speed advantage of native integers was
worthwhile in order to better unify the semantics of int and long for
Py3k, but both the speed differential and the semantic gap between
float() and decimal.Decimal() are significantly larger.

However, I did find Terry's suggestion of using the warnings module to
report some of the floating point corner cases that currently silently
produce unexpected results to be an interesting one. If those
operations issued a FloatWarning, then users could either silence them
or turn them into errors as desired.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From benjamin at python.org Sun May 1 17:44:10 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 1 May 2011 10:44:10 -0500
Subject: [Python-Dev] 2to3 status, repositories and HACKING guide
In-Reply-To: <BANLkTi=87D_89D67swsy4gLO6-U2MGZPSg@mail.gmail.com>
References: <AANLkTi=Uh7f56LdZ_kgvwQVF7t_FYNgqpfH34DdgppNG@mail.gmail.com>
 <AANLkTingP4ZwRv=ta6RC_C59nbS6gN-hkzVm15Xdd-2Y@mail.gmail.com>
 <AANLkTi=frkAceLLEtBYWDsQCKSOsVrQPv1jJ=G3h_jT1@mail.gmail.com>
 <BANLkTi=87D_89D67swsy4gLO6-U2MGZPSg@mail.gmail.com>
Message-ID: <BANLkTi=p3epLD_F6gPDVdiL3ihGTnwm7JA@mail.gmail.com>

2011/5/1 anatoly techtonik <techtonik at gmail.com>:
> Is there any high-level overview of 2to3 tool that people can use as a
> quick start for writing their own fixers?

No.

>
> Source doesn't explain much (to me at least), and some kind of "learn
> by example" would really help a lot. In particular, I find the syntax of
> tree matchers the most unclear part.

I think you can learn a lot by reading through the current fixers in
lib2to3/fixers/.

--
Regards,
Benjamin

From g.brandl at gmx.net Sun May 1 18:31:20 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 01 May 2011 18:31:20 +0200
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
References: <4D90EA06.3030003@stoneleaf.us>
 <AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
 <20110328223112.76482a9d@pitrou.net>
 <20110329013756.99EB8D64A7@kimball.webabinitio.net>
 <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
Message-ID: <ipk1st$nfm$1@dough.gmane.org>

On 30.04.2011 16:53, anatoly techtonik wrote:
> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>
>> The hardest part is debugging the TAL when you make a mistake, but
>> even that isn't a whole lot worse than any other templating language.
>
> How much in % is it worse than Django templating language?

I'm just guessing here, but I'd say 47.256 %.

Georg

From g.brandl at gmx.net Sun May 1 19:57:51 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 01 May 2011 19:57:51 +0200
Subject: [Python-Dev] Python 3.2.1
Message-ID: <ipk6v4$h54$1@dough.gmane.org>

Hi,

I'd like to release Python 3.2.1 on May 21, with a release candidate
on May 14. Please bring any issues you think need to be fixed in it
to my attention by assigning "release blocker" status in the tracker.

Georg

From raymond.hettinger at gmail.com Sun May 1 20:22:02 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 1 May 2011 11:22:02 -0700
Subject: [Python-Dev] Python 3.2.1
In-Reply-To: <ipk6v4$h54$1@dough.gmane.org>
References: <ipk6v4$h54$1@dough.gmane.org>
Message-ID: <5D8F6095-D052-47F6-A65B-D578A4460F20@gmail.com>

On May 1, 2011, at 10:57 AM, Georg Brandl wrote:

> I'd like to release Python 3.2.1 on May 21, with a release candidate
> on May 14. Please bring any issues you think need to be fixed in it
> to my attention by assigning "release blocker" status in the tracker.

Thanks to http://www.python.org/dev/daily-dmg/ , I've been able to work
off of the head every day. Python 3.2.1 is in pretty good shape :-)

Raymond

From tjreedy at udel.edu Sun May 1 20:45:06 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 01 May 2011 14:45:06 -0400
Subject: [Python-Dev] Not-a-Number (was PyObject_RichCompareBool identity shortcut)
In-Reply-To: <BANLkTi=dcK5fewo0bdoDgySBhOqYcxz=uQ@mail.gmail.com>
References: <4DB7E3EA.3030208@avl.com>
 <BANLkTik6Fr0e=5PLNTu4x=CT+v12tt3Tsg@mail.gmail.com>
 <87d3k79jvt.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BANLkTi=AusPRDsf2zKDGteZ5dGxs0EEuXw@mail.gmail.com>
 <4DB90748.4030501@g.nevcal.com>
 <BANLkTi=eAug-2n+MsQvSpaet5PM4NQDHSg@mail.gmail.com>
 <4DB916DE.1050302@g.nevcal.com>
 <BANLkTikGVfox3dXkO7B5f5iQbX5L8ypNgw@mail.gmail.com>
 <4DB927F4.3040206@dcs.gla.ac.uk>
 <ipchp6$1ba$1@dough.gmane.org>
 <871v0la5yg.fsf@uwakimon.sk.tsukuba.ac.jp>
 <BANLkTi=v98ZLbqTGSBED-MdE4V4X6JoTdg@mail.gmail.com>
 <BANLkTi=dcK5fewo0bdoDgySBhOqYcxz=uQ@mail.gmail.com>
Message-ID: <ipk9nh$9l$1@dough.gmane.org>

On 5/1/2011 7:27 AM, Nick Coghlan wrote:

> However, I did find Terry's suggestion of using the warnings module to
> report some of the floating point corner cases that currently silently
> produce unexpected results to be an interesting one. If those
> operations issued a FloatWarning, then users could either silence them
> or turn them into errors as desired.

I would like to take credit for that, but I was actually seconding
Alexander's insight and idea. I may have added the specific name after
looking at the current list and seeing UnicodeWarning and BytesWarning,
so why not a FloatWarning. I did read the warnings doc more carefully to
verify that it would really put the user in control, which was
apparently the intent of the committee.
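
To make the "user in control" point concrete, here is a rough sketch of how
such a category would interact with the existing warnings filters.
FloatWarning does not exist in the standard library; it and checked_div are
hypothetical names defined here only for illustration. The warnings calls
themselves (warn, catch_warnings, simplefilter) are the real module API.

```python
import warnings


class FloatWarning(Warning):
    """Hypothetical category; no such warning exists in the stdlib."""


def checked_div(x, y):
    """Divide two floats, warning when the result is NaN.

    Illustrative helper only: it stands in for a float operation that
    would report a corner case instead of passing it along silently.
    """
    result = x / y
    if result != result:          # NaN is the only float unequal to itself
        warnings.warn("float operation produced NaN", FloatWarning,
                      stacklevel=2)
    return result


inf = float("inf")

# Silence the category: the NaN comes back without any report.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FloatWarning)
    quiet = checked_div(inf, inf)

# Escalate the category: the same operation now raises an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", FloatWarning)
    try:
        checked_div(inf, inf)
        raised = False
    except FloatWarning:
        raised = True

print(quiet, raised)
```

The same two dispositions ("ignore" and "error") are the ones a default
setting or a command line switch would have to choose between.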

I am not sure whether FloatWarnings should be ignored or printed by
default. Ignored would, I guess, match current behavior, unless something
else is changed as part of a more extensive overhaul. The option names -f
and -ff are unused, so they could turn an ignored FloatWarning into a
printed warning or a raised exception, as -b and -bb do for BytesWarning.
I suspect that these would get at least as much usage as -b and -bb.

So I see 4 questions:
1. Add FloatWarning?
2. If yes, default disposition?
3. Add command line options?
4. Use the addition of FloatWarning as an opportunity to change other
defaults, given that user will have more options?

--
Terry Jan Reedy

From brian.curtin at gmail.com Sun May 1 22:51:55 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Sun, 1 May 2011 15:51:55 -0500
Subject: [Python-Dev] Windows 2000 Support
Message-ID: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com>

I'm currently writing a post about the process of removing OS/2 and VMS
support and thought about a discussion of Windows 2000 some time back.

http://mail.python.org/pipermail/python-dev/2010-March/098074.html makes a
proposal for beginning to walk away from 2000, but doesn't appear to come to
any conclusion.

Was anything decided off the list? I don't see anything in PEP-11 and don't
see any changes in the installer made around Windows 2000. If nothing was
decided, should anything be done for 3.3?

From victor.stinner at haypocalc.com Mon May 2 12:06:47 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 2 May 2011 12:06:47 +0200
Subject: [Python-Dev] Raise OSError or RuntimeError in the OS module?
Message-ID: <201105021206.47384.victor.stinner@haypocalc.com>

Hi,

I recently introduced the signal.pthread_sigmask() function (issue #8407).
pthread_sigmask() (the C function) returns an error code using errno codes.
I chose to raise a RuntimeError using this error code, but I am not sure
that RuntimeError is the best choice. It is more an OS error than a runtime
error: should signal.pthread_sigmask() raise an OSError instead?

signal.signal() raises a RuntimeError if setting the signal handler failed.
signal.siginterrupt() also raises a RuntimeError on error.

signal.setitimer() and signal.getitimer() have their own exception class:
signal.ItimerError, raised on setitimer() and getitimer() error.

Victor

From ned at nedbatchelder.com Mon May 2 13:27:40 2011
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 02 May 2011 07:27:40 -0400
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
In-Reply-To: <BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>
References: <4DBC91E7.9060402@nedbatchelder.com>
 <BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>
Message-ID: <4DBE952C.2070005@nedbatchelder.com>

Indeed, the 2.0 code is very different, and got this case right.

I'm a little surprised no one is arguing that changing this code now
could break some applications. Maybe the fact no one noticed the docs
were wrong proves that no one ever tried returning None from a local
trace function.

--Ned.

From mhammond at skippinet.com.au Mon May 2 14:47:11 2011
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Mon, 02 May 2011 22:47:11 +1000
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
In-Reply-To: <4DBE952C.2070005@nedbatchelder.com>
References: <4DBC91E7.9060402@nedbatchelder.com>
 <BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>
 <4DBE952C.2070005@nedbatchelder.com>
Message-ID: <4DBEA7CF.4030307@skippinet.com.au>

On 2/05/2011 9:27 PM, Ned Batchelder wrote:
...
> Maybe the fact no one noticed the docs
> were wrong proves that no one ever tried returning None from a local
> trace function.

Or if they did, they should have complained by now.
IMO, if the behaviour
regresses from how it is documented and how it previously worked and no
reports of the regression exist, we should just fix it without regard to
people relying on the "new" functionality...

Mark

From ncoghlan at gmail.com Mon May 2 15:12:32 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 2 May 2011 23:12:32 +1000
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
In-Reply-To: <4DBEA7CF.4030307@skippinet.com.au>
References: <4DBC91E7.9060402@nedbatchelder.com>
 <BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>
 <4DBE952C.2070005@nedbatchelder.com>
 <4DBEA7CF.4030307@skippinet.com.au>
Message-ID: <BANLkTi=_wg7KqgQXBFAOz3YoHpYvHyE-UA@mail.gmail.com>

On Mon, May 2, 2011 at 10:47 PM, Mark Hammond <mhammond at skippinet.com.au> wrote:
> On 2/05/2011 9:27 PM, Ned Batchelder wrote:
> ...
>>
>> Maybe the fact no one noticed the docs
>> were wrong proves that no one ever tried returning None from a local
>> trace function.
>
> Or if they did, they should have complained by now. IMO, if the behaviour
> regresses from how it is documented and how it previously worked and no
> reports of the regression exist, we should just fix it without regard to
> people relying on the "new" functionality...

+1

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From vinay_sajip at yahoo.co.uk Mon May 2 16:26:56 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Mon, 2 May 2011 14:26:56 +0000 (UTC)
Subject: [Python-Dev] Socket servers in the test suite
References: <loom.20110427T230704-75@post.gmane.org>
 <BANLkTimqCY02e+iy-OcV4nzZa1BTiC_sOQ@mail.gmail.com>
Message-ID: <loom.20110502T155417-507@post.gmane.org>

Nick Coghlan <ncoghlan <at> gmail.com> writes:

> sure the urllib tests already fire up a local server). Starting down
> the path of standardisation of that test functionality would be good.

I've made a start with test_logging.py by implementing some potential server
classes for use in tests: in the latest test_logging.py, the servers are between
comments containing the text "server_helper".

The basic approach for implementing socket servers is traditionally to use a
request handler class which implements the custom logic, but for some testing
applications this is overkill - you just want to be able to pass a handling
callable which is, say, a test case method. So the signatures of the servers are
all like this:

__init__(self, listen_addr, handler, poll_interval ...)

Initialise using the specified listen address and handler callable. Internally,
a RequestHandler subclass will be used whose handle() delegates to the handler
callable passed in. A zero port number can be passed in, and a port attribute
will (after binding) have the actual port number used, so that clients can
connect on that port.

start()

Start the server on a separate thread, using the poll_interval specified in the
underlying poll()/select() call. Before this is called, the request handler
class could be replaced with a subclass if need be.

stop(timeout=None)

Ask the server to stop and wait for the server thread to terminate.

The server also has a ready attribute which is a threading.Event, set just when
the server is entering its service loop. Typical mode of use would be:

class ClientTestCase(unittest.TestCase):
    def setUp(self):
        self.server = TheAppropriateServerClass(('localhost', 0),
                                                self.handle_request,
                                                0.01, ...)
        self.server.start()
        self.server.ready.wait()
        self.handled = threading.Event()

    def tearDown(self):
        self.server.stop(1.0) # wait up to 1 sec for thread to stop

    def handle_request(self, request):
        # Handle the request, e.g. by setting some attributes based on what
        # was received at the server
        # Set the flag to say we finished handling
        self.handled.set()

    def test_xxx(self):
        # set up client and send stuff to server
        # Wait for server to finish doing stuff
        self.handled.wait()
        # make assertions based on the attributes
        # set during request handling

The server classes provided are TestSMTPServer, TestTCPServer, TestUDPServer and
TestHTTPServer. There are examples of actual usage in test_logging.py:
SMTPHandlerTest, SocketHandlerTest, DatagramHandlerTest, SysLogHandlerTest,
HTTPHandlerTest.

I'd like some comments on this suggested API. I have not yet looked at how to
adapt other stdlib code than test_logging to use these classes, but the above
usage mode seems convenient and sufficient for testing applications. No doubt
people will be able to suggest problems with/improvements to the approach
outlined above.

Regards,

Vinay Sajip

From techtonik at gmail.com Mon May 2 18:06:58 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 2 May 2011 19:06:58 +0300
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <ipk1st$nfm$1@dough.gmane.org>
References: <4D90EA06.3030003@stoneleaf.us>
 <AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
 <20110328223112.76482a9d@pitrou.net>
 <20110329013756.99EB8D64A7@kimball.webabinitio.net>
 <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
 <ipk1st$nfm$1@dough.gmane.org>
Message-ID: <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>

On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 30.04.2011 16:53, anatoly techtonik wrote:
>> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>>
>>> The hardest part is debugging the TAL when you make a mistake, but
>>> even that isn't a whole lot worse than any other templating language.
>>
>> How much in % is it worse than Django templating language?
>
> I'm just guessing here, but I'd say 47.256 %.

That means switching to Django templates will make Roundup design
plumbing work 47.256% more attractive for potential contributors.
--
anatoly t.

From benjamin at python.org Mon May 2 18:17:59 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 2 May 2011 11:17:59 -0500
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
References: <4D90EA06.3030003@stoneleaf.us>
 <AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
 <20110328223112.76482a9d@pitrou.net>
 <20110329013756.99EB8D64A7@kimball.webabinitio.net>
 <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
 <ipk1st$nfm$1@dough.gmane.org>
 <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
Message-ID: <BANLkTinnksUMptvmWatwDpSDS08HwJrOYw@mail.gmail.com>

2011/5/2 anatoly techtonik <techtonik at gmail.com>:
> On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote:
>> On 30.04.2011 16:53, anatoly techtonik wrote:
>>> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>>>
>>>> The hardest part is debugging the TAL when you make a mistake, but
>>>> even that isn't a whole lot worse than any other templating language.
>>>
>>> How much in % is it worse than Django templating language?
>>
>> I'm just guessing here, but I'd say 47.256 %.
>
> That means switching to Django templates will make Roundup design
> plumbing work 47.256% more attractive for potential contributors.

Perhaps some of those eager contributors would like to volunteer for the task.
-- Regards, Benjamin From brian.curtin at gmail.com Mon May 2 18:19:28 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Mon, 2 May 2011 11:19:28 -0500 Subject: [Python-Dev] Issue Tracker In-Reply-To: <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com> References: <4D90EA06.3030003@stoneleaf.us> <AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com> <20110328223112.76482a9d@pitrou.net> <20110329013756.99EB8D64A7@kimball.webabinitio.net> <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com> <ipk1st$nfm$1@dough.gmane.org> <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com> Message-ID: <BANLkTimFPoo2BDZYyAJV8m0q41oFYzbJ6A@mail.gmail.com> On Mon, May 2, 2011 at 11:06, anatoly techtonik <techtonik at gmail.com> wrote: > On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote: > > On 30.04.2011 16:53, anatoly techtonik wrote: > >> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> > wrote: > >>> > >>> The hardest part is debugging the TAL when you make a mistake, but > >>> even that isn't a whole lot worse than any other templating language. > >> > >> How much in % is it worse than Django templating language? > > > > I'm just guessing here, but I'd say 47.256 %. > > That means switching to Django templates will make Roundup design > plumbing work 47.256% more attractive for potential contributors. What if these "potential contributors" never surface? Then we've made a 47.256% change in attractiveness, which is a 1423.843% waste of time.
From techtonik at gmail.com Mon May 2 19:14:50 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 2 May 2011 20:14:50 +0300 Subject: [Python-Dev] PEP 386 and dev repository versions workflow Message-ID: <BANLkTi=0NAuSRM=MELYAiko45BnA=u-HLw@mail.gmail.com> http://guide.python-distribute.org/quickstart.html proposes suffixing version of a module in repository with 'dev' in a way that after release of '1.0' version, the repository version is changed to '2.0dev'. This makes sense, but it is not compatible with PEP 386, which suggests using 2.0.devN, where N is a repository revision number. I'd expand PEP 386 to include 2.0dev use case. -- anatoly t. From ziade.tarek at gmail.com Mon May 2 19:19:28 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Mon, 2 May 2011 19:19:28 +0200 Subject: [Python-Dev] PEP 386 and dev repository versions workflow In-Reply-To: <BANLkTi=0NAuSRM=MELYAiko45BnA=u-HLw@mail.gmail.com> References: <BANLkTi=0NAuSRM=MELYAiko45BnA=u-HLw@mail.gmail.com> Message-ID: <BANLkTindgivLDc=-yPOf75v1iVuKUnjGCw@mail.gmail.com> On Mon, May 2, 2011 at 7:14 PM, anatoly techtonik <techtonik at gmail.com> wrote: > http://guide.python-distribute.org/quickstart.html proposes suffixing > version of a module in repository with 'dev' in a way that after > release of '1.0' version, the repository version is changed to > '2.0dev'. This makes sense, but it is not compatible with PEP 386, > which suggests using 2.0.devN, where N is a repository revision > number. I'd expand PEP 386 to include 2.0dev use case. This is a typo I'll fix, thanks for noticing > -- > anatoly t. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ziade.tarek%40gmail.com > -- Tarek Ziadé
| http://ziade.org From g.rodola at gmail.com Mon May 2 20:27:57 2011 From: g.rodola at gmail.com (Giampaolo Rodolà) Date: Mon, 2 May 2011 20:27:57 +0200 Subject: [Python-Dev] Issue Tracker In-Reply-To: <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com> References: <4D90EA06.3030003@stoneleaf.us> <AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com> <20110328223112.76482a9d@pitrou.net> <20110329013756.99EB8D64A7@kimball.webabinitio.net> <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com> Message-ID: <BANLkTimW43h6PUFm8U4bzC+mO1Dsrzzm9Q@mail.gmail.com> 2011/4/30 anatoly techtonik <techtonik at gmail.com>: > On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote: >> >> The hardest part is debugging the TAL when you make a mistake, but >> even that isn't a whole lot worse than any other templating language. > > How much in % is it worse than Django templating language? > -- > anatoly t. > Knowing both of them, I can say ZPT is one of the few things I like about Zope, and I find it a lot more powerful than the Django templating system. Other than that, I don't see how changing the templating language can make any difference. If one does not contribute something because of the language used in templates... well, I think it wouldn't have been a particularly good contribution anyway.
=) --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ From g.brandl at gmx.net Mon May 2 20:41:12 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 02 May 2011 20:41:12 +0200 Subject: [Python-Dev] Issue Tracker In-Reply-To: <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com> References: <4D90EA06.3030003@stoneleaf.us> <AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com> <20110328223112.76482a9d@pitrou.net> <20110329013756.99EB8D64A7@kimball.webabinitio.net> <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com> <ipk1st$nfm$1@dough.gmane.org> <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com> Message-ID: <ipmtsc$ao9$1@dough.gmane.org> On 02.05.2011 18:06, anatoly techtonik wrote: > On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote: >> On 30.04.2011 16:53, anatoly techtonik wrote: >>> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote: >>>> >>>> The hardest part is debugging the TAL when you make a mistake, but >>>> even that isn't a whole lot worse than any other templating language. >>> >>> How much in % is it worse than Django templating language? >> >> I'm just guessing here, but I'd say 47.256 %. > > That means switching to Django templates will make Roundup design > plumbing work 47.256% more attractive for potential contributors. That's not true actually. It'll be 89.595 % more attractive. Georg From sijinjoseph at gmail.com Mon May 2 17:27:49 2011 From: sijinjoseph at gmail.com (Sijin Joseph) Date: Mon, 2 May 2011 11:27:49 -0400 Subject: [Python-Dev] Convert Py_Buffer to Py_UNICODE Message-ID: <BANLkTi==+x8SChR6y=wEjBQxhsjXEciWeQ@mail.gmail.com> Hi - I am working on a patch where I have an argument that can either be a unicode string or binary data, I parse the argument using the PyArg_ParseTuple method using the s* format specification and get a Py_Buffer. I now need to convert this Py_Buffer object to a Py_Unicode and pass it into a function. 
What is the best way to do this? If I determine that the passed argument was binary using another flag parameter then I am passing Py_Buffer->buf as a pointer to the start of the data. This is in winsound module, here's the relevant code snippet

sound_playsound(PyObject *s, PyObject *args)
{
    Py_buffer *buffer;
    int flags;
    int ok;
    LPCWSTR pszSound;

    if (PyArg_ParseTuple(args, "s*i:PlaySound", &buffer, &flags)) {
        if (flags & SND_ASYNC && flags & SND_MEMORY) {
            /* Sidestep reference counting headache; unfortunately this also
               prevent SND_LOOP from memory. */
            PyBuffer_Release(buffer);
            PyErr_SetString(PyExc_RuntimeError,
                            "Cannot play asynchronously from memory");
            return NULL;
        }

        if(flags & SND_MEMORY) {
            pszSound = buffer->buf;
        }
        else {
            /* pszSound = ????; */
        }

-- Sijin
From mal at egenix.com Mon May 2 21:12:27 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 02 May 2011 21:12:27 +0200 Subject: [Python-Dev] Convert Py_Buffer to Py_UNICODE In-Reply-To: <BANLkTi==+x8SChR6y=wEjBQxhsjXEciWeQ@mail.gmail.com> References: <BANLkTi==+x8SChR6y=wEjBQxhsjXEciWeQ@mail.gmail.com> Message-ID: <4DBF021B.90602@egenix.com> Sijin Joseph wrote: > Hi - I am working on a patch where I have an argument that can either be a > unicode string or binary data, I parse the argument using the > PyArg_ParseTuple method using the s* format specification and get a > Py_Buffer. > > I now need to convert this Py_Buffer object to a Py_Unicode and pass it into > a function. What is the best way to do this? If I determine that the passed > argument was binary using another flag parameter then I am passing > Py_Buffer->buf as a pointer to the start of the data. I don't understand why you'd want to convert PyUnicode to PyBytes (encoded as UTF-8), only to decode it again afterwards in order to pass it to some other PyUnicode API.
It'd be more efficient to use the "O" parser marker and then use PyObject_GetBuffer() to convert non-PyUnicode objects to a Py_buffer. > This is in winsound module, here's the relevant code snippet > > sound_playsound(PyObject *s, PyObject *args) > { > Py_buffer *buffer; > int flags; > int ok; > LPCWSTR pszSound; > > if (PyArg_ParseTuple(args, "s*i:PlaySound", &buffer, &flags)) { > if (flags & SND_ASYNC && flags & SND_MEMORY) { > /* Sidestep reference counting headache; unfortunately this also > prevent SND_LOOP from memory. */ > PyBuffer_Release(buffer); > PyErr_SetString(PyExc_RuntimeError, "Cannot play asynchronously > from memory"); > return NULL; > } > > if(flags & SND_MEMORY) { > pszSound = buffer->buf; > } > else { > /* pszSound = ????; */ > } > > -- Sijin > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mal%40egenix.com -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 02 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2011-06-20: EuroPython 2011, Florence, Italy 49 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From benjamin at python.org Mon May 2 21:25:44 2011 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 2 May 2011 14:25:44 -0500 Subject: [Python-Dev] Issue Tracker In-Reply-To: <ipmtsc$ao9$1@dough.gmane.org> References: <4D90EA06.3030003@stoneleaf.us> <AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com> <20110328223112.76482a9d@pitrou.net> <20110329013756.99EB8D64A7@kimball.webabinitio.net> <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com> <ipk1st$nfm$1@dough.gmane.org> <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com> <ipmtsc$ao9$1@dough.gmane.org> Message-ID: <BANLkTimwmUJvwwzMz=C4jdiGOPD3_ABrQw@mail.gmail.com> 2011/5/2 Georg Brandl <g.brandl at gmx.net>: > On 02.05.2011 18:06, anatoly techtonik wrote: >> On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote: >>> On 30.04.2011 16:53, anatoly techtonik wrote: >>>> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote: >>>>> >>>>> The hardest part is debugging the TAL when you make a mistake, but >>>>> even that isn't a whole lot worse than any other templating language. >>>> >>>> How much in % is it worse than Django templating language? >>> >>> I'm just guessing here, but I'd say 47.256 %. >> >> That means switching to Django templates will make Roundup design >> plumbing work 47.256% more attractive for potential contributors. > > That's not true actually. > > It'll be 89.595 % more attractive. I don't understand why you're truncating to 3 digits. Let's be honest in that it will be sqrt(2)^(13e/2) % more attractive. 
-- Regards, Benjamin From tjreedy at udel.edu Mon May 2 22:49:54 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 02 May 2011 16:49:54 -0400 Subject: [Python-Dev] running/stepping python backwards In-Reply-To: <BANLkTinSQtdpOVKn0GhH4=cP6NnhGgOD0A@mail.gmail.com> References: <BANLkTinSQtdpOVKn0GhH4=cP6NnhGgOD0A@mail.gmail.com> Message-ID: <4DBF18F2.9040202@udel.edu> On 4/29/2011 10:13 PM, Adrian Johnston wrote: > This may seem like an odd question, but I'm intrigued by the idea of > using Python as a data definition language with "undo" support. > > If I were to try and instrument the Python interpreter to be able to > step backwards, would that be an unduly difficult or inefficient thing > to do? The pydev list is for development of the next version of Python. Please direct your question to a more appropriate forum such as python-list. > (Please reply to me directly.) I did this time, but you should not expect that when posting to a public list. -- Terry Jan Reedy From martin at v.loewis.de Mon May 2 23:14:06 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 02 May 2011 23:14:06 +0200 Subject: [Python-Dev] Windows 2000 Support In-Reply-To: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com> References: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com> Message-ID: <4DBF1E9E.5000006@v.loewis.de> On 01.05.2011 22:51, Brian Curtin wrote: > I'm currently writing a post about the process of removing OS/2 and VMS > support and thought about a discussion of Windows 2000 some time > back. http://mail.python.org/pipermail/python-dev/2010-March/098074.html makes > a proposal for beginning to walk away from 2000, but doesn't appear to > come to any conclusion. > > Was anything decided off the list? I don't see anything in PEP-11 and > don't see any changes in the installer made around Windows 2000. That's what you get for not following your own processes. It seems the discussion just stopped, with no action.
I vaguely recall having made changes to the installer to produce a warning, but apparently never got to commit these changes. > If nothing was decided, should anything be done for 3.3? Most certainly. It seems we missed the chance of dropping support for W2k, so we still can't actively remove any code. However, I'd a) add it to PEP 11, and b) add a warning to the installer. I stand by http://mail.python.org/pipermail/python-dev/2010-March/098101.html i.e. if there are patches that happen not to work on W2k, I'd accept them anyway - anybody interested in W2k would then have to provide fixes before 3.3rc1. So please go ahead and change PEP 11. While you are at it, also threaten to remove support for systems where the COMSPEC points to command.com (#2405). Regards, Martin From drsalists at gmail.com Mon May 2 23:19:38 2011 From: drsalists at gmail.com (Dan Stromberg) Date: Mon, 2 May 2011 14:19:38 -0700 Subject: [Python-Dev] running/stepping python backwards In-Reply-To: <4DBF18F2.9040202@udel.edu> References: <BANLkTinSQtdpOVKn0GhH4=cP6NnhGgOD0A@mail.gmail.com> <4DBF18F2.9040202@udel.edu> Message-ID: <BANLkTimSg33jOErPr_+9D7=0z47Vw_2KYw@mail.gmail.com> On Mon, May 2, 2011 at 1:49 PM, Terry Reedy <tjreedy at udel.edu> wrote: > > (Please reply to me directly.) > > I did this time, but you should not expect that when posting to a public > list. Actually, this is not only appropriate on some lists, on some lists one is actually strongly discouraged from doing anything else. EG: sun-managers, where replies are expected to be private, and the originator of the thread is expected to collect all (private) replies and summarize them, to keep the list traffic low and the S/N ratio high.
From barry at python.org Tue May 3 00:35:20 2011 From: barry at python.org (Barry Warsaw) Date: Mon, 2 May 2011 18:35:20 -0400 Subject: [Python-Dev] Python 2.6.7 schedule Message-ID: <20110502183520.1c9efdc0@neurotica.wooz.org> I'd like to make a Python 2.6.7 release candidate this Friday, May 6, with a final release scheduled for May 20. I've put these dates on the Python Release Schedule calendar. This will be a source-only security release. I see no release blockers for Python 2.6, so if you know of anything that must go into 2.6.7, please be sure there is a tracker issue for it, that 2.6 is marked as being affected, and with a release blocker priority. Cheers, -Barry From martin at v.loewis.de Tue May 3 01:09:42 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 03 May 2011 01:09:42 +0200 Subject: [Python-Dev] Fwd: viewVC shows traceback on non utf-8 module markup In-Reply-To: <4DBB19A5.4010409@voidspace.org.uk> References: <4DBB19A5.4010409@voidspace.org.uk> Message-ID: <4DBF39B6.3050100@v.loewis.de> On 29.04.2011 22:03, Michael Foord wrote: > I know that the svn repo is now for legacy purposes only, but I doubt it > is intended that the online source browser should raise exceptions. It's certainly not. However, I don't plan to do anything about it, either (nor would I know that anybody else would).
To view the source code of the file, use http://svn.python.org/view/python/trunk/Lib/heapq.py?view=co&content-type=text/plain Regards, Martin From brian.curtin at gmail.com Tue May 3 02:39:33 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Mon, 2 May 2011 19:39:33 -0500 Subject: [Python-Dev] Windows 2000 Support In-Reply-To: <4DBF1E9E.5000006@v.loewis.de> References: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com> <4DBF1E9E.5000006@v.loewis.de> Message-ID: <BANLkTi=OFubGKXv0nXaM+aNGxda2_vwtPg@mail.gmail.com> On Mon, May 2, 2011 at 16:14, "Martin v. Löwis" <martin at v.loewis.de> wrote: > On 01.05.2011 22:51, Brian Curtin wrote: > > I'm currently writing a post about the process of removing OS/2 and VMS > > support and thought about a discussion of Windows 2000 some time > > back. http://mail.python.org/pipermail/python-dev/2010-March/098074.html makes > > a proposal for beginning to walk away from 2000, but doesn't appear to > > come to any conclusion. > > > > Was anything decided off the list? I don't see anything in PEP-11 and > > don't see any changes in the installer made around Windows 2000. > > That's what you get for not following your own processes. It seems the > discussion just stopped, with no action. I vaguely recall having made > changes to the installer to produce a warning, but apparently never > got to commit these changes. > > > If nothing was decided, should anything be done for 3.3? > > Most certainly. It seems we missed the chance of dropping support for > W2k, so we still can't actively remove any code. However, I'd > > a) add it to PEP 11, and > b) add a warning to the installer > > I stand by > > http://mail.python.org/pipermail/python-dev/2010-March/098101.html > > i.e. if there are patches that happen not to work on W2k, I'd accept > them anyway - anybody interested in W2k would then have to provide > fixes before 3.3rc1. > > So please go ahead and change PEP 11.
While you are at it, also threaten > to remove support for systems where the COMSPEC points to command.com > (#2405). > Done and done - http://hg.python.org/peps/rev/b9390aa12855 I'll have a look at the installer and add some type of message. From nadeem.vawda at gmail.com Tue May 3 16:22:27 2011 From: nadeem.vawda at gmail.com (Nadeem Vawda) Date: Tue, 3 May 2011 16:22:27 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by In-Reply-To: <E1QHFVm-0002kp-TV@dinsdale.python.org> References: <E1QHFVm-0002kp-TV@dinsdale.python.org> Message-ID: <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> On Tue, May 3, 2011 at 3:19 PM, victor.stinner <python-checkins at python.org> wrote:
> +# Issue #10276 - check that inputs of 2 GB are handled correctly.
> +# Be aware of issues #1202, #8650, #8651 and #10276
> +class ChecksumBigBufferTestCase(unittest.TestCase):
> +    int_max = 0x7FFFFFFF
> +
> +    @unittest.skipUnless(mmap, "mmap() is not available.")
> +    def test_big_buffer(self):
> +        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
> +            requires('largefile',
> +                     'test requires %s bytes and a long time to run' %
> +                     str(self.int_max))
> +        try:
> +            with open(TESTFN, "wb+") as f:
> +                f.seek(self.int_max-4)
> +                f.write("asdf")
> +                f.flush()
> +                try:
> +                    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
> +                    self.assertEqual(zlib.crc32(m), 0x709418e7)
> +                    self.assertEqual(zlib.adler32(m), -2072837729)
> +                finally:
> +                    m.close()
> +        except (IOError, OverflowError):
> +            raise unittest.SkipTest("filesystem doesn't have largefile support")
> +        finally:
> +            unlink(TESTFN)
> +
> +

0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded). This causes the test to erroneously report that the filesystem doesn't support large files. The assertEqual() tests should probably be changed to assertRaises(OverflowError, ...). Also, the assignment to m needs to be moved outside of the inner try...finally block. If mmap() fails, the call to m.close() raises a new exception because m has not yet been bound. This seems to be causing failures on some of the 32-bit buildbots. As an aside, in this sort of situation is it better to just go and commit a fix myself, or is raising it on the mailing list first the right way to do things? Cheers, Nadeem From g.brandl at gmx.net Tue May 3 20:30:22 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 03 May 2011 20:30:22 +0200 Subject: [Python-Dev] Raise OSError or RuntimeError in the OS module? In-Reply-To: <201105021206.47384.victor.stinner@haypocalc.com> References: <201105021206.47384.victor.stinner@haypocalc.com> Message-ID: <ipphk3$iui$2@dough.gmane.org> On 02.05.2011 12:06, Victor Stinner wrote: > Hi, > > I recently introduced the signal.pthread_sigmask() function (issue #8407). > pthread_sigmask() (the C function) returns an error code using errno codes. I > chose to raise a RuntimeError using this error code, but I am not sure that > RuntimeError is the best choice. It is more an OS error than a runtime error: > should signal.pthread_sigmask() raise an OSError instead? > > signal.signal() raises a RuntimeError if setting the signal handler failed. > signal.siginterrupt() also raises a RuntimeError on error. > > signal.setitimer() and signal.getitimer() have their own exception class: > signal.ItimerError, raised on setitimer() and getitimer() error.
If it has an errno, it should be a subclass of EnvironmentError. Georg From brian.curtin at gmail.com Tue May 3 20:39:40 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Tue, 3 May 2011 13:39:40 -0500 Subject: [Python-Dev] Windows 2000 Support In-Reply-To: <BANLkTi=OFubGKXv0nXaM+aNGxda2_vwtPg@mail.gmail.com> References: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com> <4DBF1E9E.5000006@v.loewis.de> <BANLkTi=OFubGKXv0nXaM+aNGxda2_vwtPg@mail.gmail.com> Message-ID: <BANLkTi=9tQiFZbGFSBsXro=bXaggxSiR9g@mail.gmail.com> On Mon, May 2, 2011 at 19:39, Brian Curtin <brian.curtin at gmail.com> wrote: > On Mon, May 2, 2011 at 16:14, "Martin v. Löwis" <martin at v.loewis.de> wrote: > >> On 01.05.2011 22:51, Brian Curtin wrote: >> > I'm currently writing a post about the process of removing OS/2 and VMS >> > support and thought about a discussion of Windows 2000 some time >> > back. >> http://mail.python.org/pipermail/python-dev/2010-March/098074.html makes >> > a proposal for beginning to walk away from 2000, but doesn't appear to >> > come to any conclusion. >> > >> > Was anything decided off the list? I don't see anything in PEP-11 and >> > don't see any changes in the installer made around Windows 2000. >> >> That's what you get for not following your own processes. It seems the >> discussion just stopped, with no action. I vaguely recall having made >> changes to the installer to produce a warning, but apparently never >> got to commit these changes. >> >> > If nothing was decided, should anything be done for 3.3? >> >> Most certainly. It seems we missed the chance of dropping support for >> W2k, so we still can't actively remove any code. However, I'd >> >> a) add it to PEP 11, and >> b) add a warning to the installer >> >> I stand by >> >> http://mail.python.org/pipermail/python-dev/2010-March/098101.html >> >> i.e.
if there are patches that happen not to work on W2k, I'd accept >> them anyway - anybody interested in W2k would then have to provide >> fixes before 3.3rc1. >> >> So please go ahead and change PEP 11. While you are at it, also threaten >> to remove support for systems where the COMSPEC points to command.com >> (#2405). >> > > Done and done - http://hg.python.org/peps/rev/b9390aa12855 > I'll have a look at the installer and add some type of message. > It turns out that you did make the change at some point for 2.7 being the last, but there was no corresponding 3.x version chosen. http://hg.python.org/cpython/rev/de53c52fbcbf changed the installer to list 3.3.0 as the last Windows 2000 release on the default branch. From solipsis at pitrou.net Tue May 3 20:57:47 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 3 May 2011 20:57:47 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> Message-ID: <20110503205747.65a76522@pitrou.net> Hello, On Tue, 3 May 2011 16:22:27 +0200 Nadeem Vawda <nadeem.vawda at gmail.com> wrote: > > As an aside, in this sort of situation is it better to just go and > commit a fix myself, or is raising it on the mailing list first the > right way to do things? Raising it on the mailing-list makes it serve as a kind of post-commit review. Also, it ensures that the committer of the original patch understands the issues with it. cheers Antoine.
From victor.stinner at haypocalc.com Tue May 3 22:38:43 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 03 May 2011 22:38:43 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by In-Reply-To: <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> Message-ID: <1304455123.1971.5.camel@marge> On Tuesday, 3 May 2011 at 16:22 +0200, Nadeem Vawda wrote: > On Tue, May 3, 2011 at 3:19 PM, victor.stinner > <python-checkins at python.org> wrote:
> > +# Issue #10276 - check that inputs of 2 GB are handled correctly.
> > +# Be aware of issues #1202, #8650, #8651 and #10276
> > +class ChecksumBigBufferTestCase(unittest.TestCase):
> > +    int_max = 0x7FFFFFFF
> > +
> > +    @unittest.skipUnless(mmap, "mmap() is not available.")
> > +    def test_big_buffer(self):
> > +        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
> > +            requires('largefile',
> > +                     'test requires %s bytes and a long time to run' %
> > +                     str(self.int_max))
> > +        try:
> > +            with open(TESTFN, "wb+") as f:
> > +                f.seek(self.int_max-4)
> > +                f.write("asdf")
> > +                f.flush()
> > +                try:
> > +                    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
> > +                    self.assertEqual(zlib.crc32(m), 0x709418e7)
> > +                    self.assertEqual(zlib.adler32(m), -2072837729)
> > +                finally:
> > +                    m.close()
> > +        except (IOError, OverflowError):
> > +            raise unittest.SkipTest("filesystem doesn't have largefile support")
> > +        finally:
> > +            unlink(TESTFN)
> > +
> > +
>
> 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be
> 0x80000000. However, if you make this change, crc32() and adler32()
> raise OverflowErrors (see changeset a0681e7a6ded).

I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes, to check crc32() and adler32().
0x7FFFFFFF is the biggest size supported by these functions (zlib doesn't use Py_ssize_t in Python 2.7). If you use a buffer of 0x80000000 bytes, you test the PyArg_Parse*() functions, which already have a dedicated test (in test_xml_etree_c; it's not the best file to store such a test...). > Also, the assignment to m needs to be moved outside of the inner > try...finally block. Yeah, I noticed this with buildbots: already fixed by dd58f8072216. > As an aside, in this sort of situation is it better to just go and > commit a fix myself, or is raising it on the mailing list first the > right way to do things? I'm not sure that you understood the test, so I think that it's better to ask first on IRC and/or the mailing list. Victor From nadeem.vawda at gmail.com Tue May 3 23:11:48 2011 From: nadeem.vawda at gmail.com (Nadeem Vawda) Date: Tue, 3 May 2011 23:11:48 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by In-Reply-To: <1304455123.1971.5.camel@marge> References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> <1304455123.1971.5.camel@marge> Message-ID: <BANLkTimcU=-826rXGaPqM8LD0fvuWW-GPw@mail.gmail.com> On Tue, May 3, 2011 at 10:38 PM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > I don't want to check OverflowError: the test is supposed to compute the > checksum of a buffer of 0x7FFFFFFF bytes, to check crc32() and > adler32(). 0x7FFFFFFF is the biggest size supported by these functions > (zlib doesn't use Py_ssize_t in Python 2.7). I see. Since you mentioned issue 10276 in the commit message, I assumed you were testing for the underlying C functions truncating their arguments. It seems that I was mistaken. Sorry for the confusion.
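
[Editorial sketch, not part of the original message.] The size limit discussed above applies to a single call. As a side note, both zlib.crc32() and zlib.adler32() accept a running value as their second argument, so a large input can be checksummed piece by piece. The helper names below are hypothetical, and this is not what the test under discussion does; it is only a minimal illustration, assuming Python 3:

```python
import zlib

INT_MAX = 0x7FFFFFFF  # 2**31 - 1, the per-call limit mentioned in the thread


def crc32_chunked(chunks, start=0):
    """Fold CRC-32 over an iterable of bytes-like chunks (hypothetical helper)."""
    value = start
    for chunk in chunks:
        value = zlib.crc32(chunk, value)
    # Mask so the result is unsigned regardless of interpreter version.
    return value & 0xFFFFFFFF


def adler32_chunked(chunks, start=1):
    """Same idea for Adler-32, whose initial running value is 1."""
    value = start
    for chunk in chunks:
        value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF


data = b"asdf" * 1024
pieces = [data[i:i + 1000] for i in range(0, len(data), 1000)]
same_crc = crc32_chunked(pieces) == (zlib.crc32(data) & 0xFFFFFFFF)
same_adler = adler32_chunked(pieces) == (zlib.adler32(data) & 0xFFFFFFFF)
```

The chunk size of 1000 bytes is arbitrary; any split of the input gives the same checksum as one call over the whole buffer.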
Cheers, Nadeem From victor.stinner at haypocalc.com Wed May 4 10:58:42 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 04 May 2011 10:58:42 +0200 Subject: [Python-Dev] The zombi thread of the Tcl library Message-ID: <1304499523.15694.11.camel@marge> Hi, I have a question: would it be possible to mask all signals in the Tcl thread? To understand the question, let's see the context... I'm working on signals, especially on pthread_sigmask(), and I'm trying to understand test_signal failures. test_signal fails if the _tkinter module is loaded, because _tkinter loads the Tcl library, which creates a thread waiting for events in select(). For example, "python -m test test_pydoc test_signal" fails, because test_pydoc loads ALL Python modules. I opened an issue for test_pydoc: http://bugs.python.org/issue11995 _tkinter.c contains the following code:

#if 0
    /* This was not a good idea; through <Destroy> bindings,
       Tcl_Finalize() may invoke Python code but at that point the
       interpreter and thread state have already been destroyed! */
    Py_AtExit(Tcl_Finalize);
#endif

Tcl_Finalize() exits the thread, but this function is never called in Python. Anyway, it is not possible to unload a module implemented in C. I would like to know if it would be possible to mask all signals in the Tcl thread, or if Tcl supports/uses signals. It is possible to mask all signals in the Tcl thread using:

----------
allsignals = range(1, signal.NSIG)
oldmask = signal.pthread_sigmask(signal.SIG_BLOCK, allsignals)
import _tkinter
signal.pthread_sigmask(signal.SIG_SETMASK, oldmask)
----------

I'm not asking the question for test_signal: I have a patch fixing test_signal, even if the Tcl zombi thread is present (use pthread_kill() to send the signal directly to the main thread). (I wrote "zombi" thread because I was not aware that Tcl uses a thread, nor that test_pydoc loads all modules. The thread is valid, alive, and it's just a joke. The thread is more hidden than zombi.)
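
[Editorial sketch, not part of the original message.] The block/import/restore sequence above can be wrapped in a context manager so that the old mask is restored even if the import fails. The name all_signals_blocked is hypothetical, and the sketch assumes a Unix platform with Python 3.8 or later, where signal.valid_signals() is available alongside signal.pthread_sigmask():

```python
import contextlib
import signal


@contextlib.contextmanager
def all_signals_blocked():
    """Block every blockable signal in the calling thread, then restore the old mask.

    Hypothetical helper; threads created inside the block inherit the blocked mask.
    """
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
    try:
        yield old_mask
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


with all_signals_blocked():
    # An import that starts a helper thread (such as "import _tkinter" in the
    # message above) would go here. Blocking with an empty set just reads the
    # current mask without changing it.
    blocked = signal.pthread_sigmask(signal.SIG_BLOCK, [])
```

SIGKILL and SIGSTOP cannot be blocked; the operating system silently ignores the attempt, so passing the full set is harmless.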
Victor From marks at dcs.gla.ac.uk Wed May 4 11:08:33 2011 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Wed, 04 May 2011 10:08:33 +0100 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <1304499523.15694.11.camel@marge> References: <1304499523.15694.11.camel@marge> Message-ID: <4DC11791.2000109@dcs.gla.ac.uk> Hi, The online documentation specifies which API functions borrow and/or steal references (as opposed to the default behaviour). Yet, I cannot find this information anywhere in the source. Any clues as to where I should look? Cheers, Mark From amauryfa at gmail.com Wed May 4 11:35:19 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 4 May 2011 11:35:19 +0200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <4DC11791.2000109@dcs.gla.ac.uk> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> Message-ID: <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> Hi, On Wednesday, 4 May 2011, Mark Shannon <marks at dcs.gla.ac.uk> wrote: > The online documentation specifies which API functions borrow and/or steal references (as opposed to the default behaviour). > Yet, I cannot find this information anywhere in the source. > > Any clues as to where I should look? It's in the file Doc/data/refcounts.dat in some custom format. -- Amaury -- Amaury Forgeot d'Arc From solipsis at pitrou.net Wed May 4 12:05:19 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 May 2011 12:05:19 +0200 Subject: [Python-Dev] The zombi thread of the Tcl library References: <1304499523.15694.11.camel@marge> Message-ID: <20110504120519.7a1bc105@pitrou.net> On Wed, 04 May 2011 10:58:42 +0200 Victor Stinner <victor.stinner at haypocalc.com> wrote: > > Tcl_Finalize() exits the thread, but this function is never called in > Python. Anyway, it is not possible to unload a module implemented in C. You could expose Tcl_Finalize() for debug purposes and call it in test_signal. Regards Antoine.
From victor.stinner at haypocalc.com Wed May 4 13:54:20 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 04 May 2011 13:54:20 +0200
Subject: [Python-Dev] The zombi thread of the Tcl library
In-Reply-To: <20110504120519.7a1bc105@pitrou.net>
References: <1304499523.15694.11.camel@marge> <20110504120519.7a1bc105@pitrou.net>
Message-ID: <1304510060.15694.13.camel@marge>

On Wednesday 04 May 2011 at 12:05 +0200, Antoine Pitrou wrote:
> On Wed, 04 May 2011 10:58:42 +0200
> Victor Stinner <victor.stinner at haypocalc.com> wrote:
> >
> > Tcl_Finalize() exits the thread, but this function is never called in
> > Python. Anyway, it is not possible to unload a module implemented in C.
>
> You could expose Tcl_Finalize() for debug purposes and call it in
> test_signal.

Good idea. I opened an issue with a patch implementing Tcl_Finalize():
http://bugs.python.org/issue11998

I also added a workaround for the _tkinter side effect in test_signal.
The buildbots look happy.

Victor

From ncoghlan at gmail.com Wed May 4 19:01:58 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 5 May 2011 03:01:58 +1000
Subject: [Python-Dev] New interest areas in Experts Index
Message-ID: <BANLkTiku43f7MaTnv=4N+NBa6sBa+5cJrg@mail.gmail.com>

I just added two new interest areas in the Experts Index [1]:

context managers: for any issues relating to proposals to add context
management capabilities to objects in the stdlib, triagers should feel
free to add me to the nosy list

test coverage: this is specifically for anyone willing to help review
and commit test coverage improvement patches (rather than the more
general "testing" interest area that was already present)

Cheers,
Nick.

[1] http://docs.python.org/devguide/experts

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia From solipsis at pitrou.net Wed May 4 21:35:11 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 May 2011 21:35:11 +0200 Subject: [Python-Dev] cpython (2.7): Issue #11277: test_zlib tests a buffer of 1 GB on 32 bits References: <E1QHhji-0003D9-5t@dinsdale.python.org> Message-ID: <20110504213511.07e9f2bf@pitrou.net> On Wed, 04 May 2011 21:27:50 +0200 victor.stinner <python-checkins at python.org> wrote: > http://hg.python.org/cpython/rev/7f3cab59ef3e > changeset: 69834:7f3cab59ef3e > branch: 2.7 > parent: 69827:affec521b330 > user: Victor Stinner <victor.stinner at haypocalc.com> > date: Wed May 04 21:27:39 2011 +0200 > summary: > Issue #11277: test_zlib tests a buffer of 1 GB on 32 bits What's the point? The issue with 2GB or 4GB buffers is that they cross the potential limit of a machine type (a signed or unsigned integer). I don't see any benefit in testing a 1GB buffer; the test could probably be removed instead. Regards Antoine. From greg.ewing at canterbury.ac.nz Thu May 5 00:04:51 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 05 May 2011 10:04:51 +1200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <4DC11791.2000109@dcs.gla.ac.uk> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> Message-ID: <4DC1CD83.3000603@canterbury.ac.nz> Mark Shannon wrote: > The online documentation specifies which API function borrow and/or > steal references (as opposed to the default behaviour). > Yet, I cannot find this information anywhere in the source. There are comments in some places, e.g. in listobject.h: *** WARNING *** PyList_SetItem does not increment the new item's reference count, but does decrement the reference count of the item it replaces, if not nil. It does *decrement* the reference count if it is *not* inserted in the list. Similarly, PyList_GetItem does not increment the returned item's reference count. 
If you're looking for evidence in the actual code, there's nothing particular to look for -- it's implicit in the way the function works overall. -- Greg From greg.ewing at canterbury.ac.nz Thu May 5 00:23:01 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 05 May 2011 10:23:01 +1200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> Message-ID: <4DC1D1C5.9010507@canterbury.ac.nz> Amaury Forgeot d'Arc wrote: > It's in the file Doc/data/refcounts.dat > in some custom format. However, it doesn't seem to quite convey the same information. It lists the "refcount effect" on each parameter, but translating that into the notion of borrowed or stolen references seems to require knowledge of what the function does. For example, PyDict_SetItem has: PyDict_SetItem:PyObject*:p:0: PyDict_SetItem:PyObject*:key:+1: PyDict_SetItem:PyObject*:val:+1: All of these parameters take borrowed references, but the key and val get incremented because they're being stored in the dict. So this file appears to be of limited usefulness. -- Greg From ethan at stoneleaf.us Thu May 5 00:40:42 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 04 May 2011 15:40:42 -0700 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by In-Reply-To: <1304455123.1971.5.camel@marge> References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> <1304455123.1971.5.camel@marge> Message-ID: <4DC1D5EA.7060608@stoneleaf.us> Victor Stinner wrote: > Le mardi 03 mai 2011 ? 16:22 +0200, Nadeem Vawda a ?crit : >> On Tue, May 3, 2011 at 3:19 PM, victor.stinner >> <python-checkins at python.org> wrote: >>> +# Issue #10276 - check that inputs of 2 GB are handled correctly. 
>>> +# Be aware of issues #1202, #8650, #8651 and #10276 >>> +class ChecksumBigBufferTestCase(unittest.TestCase): >>> + int_max = 0x7FFFFFFF >>> + >>> + @unittest.skipUnless(mmap, "mmap() is not available.") >>> + def test_big_buffer(self): >>> + if sys.platform[:3] == 'win' or sys.platform == 'darwin': >>> + requires('largefile', >>> + 'test requires %s bytes and a long time to run' % >>> + str(self.int_max)) >>> + try: >>> + with open(TESTFN, "wb+") as f: >>> + f.seek(self.int_max-4) >>> + f.write("asdf") >>> + f.flush() >>> + try: >>> + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) >>> + self.assertEqual(zlib.crc32(m), 0x709418e7) >>> + self.assertEqual(zlib.adler32(m), -2072837729) >>> + finally: >>> + m.close() >>> + except (IOError, OverflowError): >>> + raise unittest.SkipTest("filesystem doesn't have largefile support") >>> + finally: >>> + unlink(TESTFN) >>> + >>> + >> 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be >> 0x80000000. However, if you make this change, crc32() and adler32() >> raise OverflowErrors (see changeset a0681e7a6ded). > > I don't want to check OverflowError: the test is supposed to compute the > checksum of a buffer of 0x7FFFFFFF bytes The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding? ~Ethan~ From victor.stinner at haypocalc.com Thu May 5 11:33:27 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 05 May 2011 11:33:27 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by In-Reply-To: <4DC1D5EA.7060608@stoneleaf.us> References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> <1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us> Message-ID: <1304588007.22418.7.camel@marge> Le mercredi 04 mai 2011 ? 
15:40 -0700, Ethan Furman a ?crit : > Victor Stinner wrote: > > Le mardi 03 mai 2011 ? 16:22 +0200, Nadeem Vawda a ?crit : > >> On Tue, May 3, 2011 at 3:19 PM, victor.stinner > >> <python-checkins at python.org> wrote: > >>> +# Issue #10276 - check that inputs of 2 GB are handled correctly. > >>> +# Be aware of issues #1202, #8650, #8651 and #10276 > >>> +class ChecksumBigBufferTestCase(unittest.TestCase): > >>> + int_max = 0x7FFFFFFF > >>> + > >>> + @unittest.skipUnless(mmap, "mmap() is not available.") > >>> + def test_big_buffer(self): > >>> + if sys.platform[:3] == 'win' or sys.platform == 'darwin': > >>> + requires('largefile', > >>> + 'test requires %s bytes and a long time to run' % > >>> + str(self.int_max)) > >>> + try: > >>> + with open(TESTFN, "wb+") as f: > >>> + f.seek(self.int_max-4) > >>> + f.write("asdf") > >>> + f.flush() > >>> + try: > >>> + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) > >>> + self.assertEqual(zlib.crc32(m), 0x709418e7) > >>> + self.assertEqual(zlib.adler32(m), -2072837729) > >>> + finally: > >>> + m.close() > >>> + except (IOError, OverflowError): > >>> + raise unittest.SkipTest("filesystem doesn't have largefile support") > >>> + finally: > >>> + unlink(TESTFN) > >>> + > >>> + > >> 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be > >> 0x80000000. However, if you make this change, crc32() and adler32() > >> raise OverflowErrors (see changeset a0681e7a6ded). > > > > I don't want to check OverflowError: the test is supposed to compute the > > checksum of a buffer of 0x7FFFFFFF bytes > > The comment says 'check that inputs of 2 GB are handled correctly' but > the file created is 1 byte short of 2Gb. Is the test wrong, or just > wrongly commented? Or am I not understanding? If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. 
But on Mac OS X and
Windows, you really write 2 GB+some bytes.

Victor

From nadeem.vawda at gmail.com Thu May 5 11:43:19 2011
From: nadeem.vawda at gmail.com (Nadeem Vawda)
Date: Thu, 5 May 2011 11:43:19 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304588007.22418.7.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> <1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us> <1304588007.22418.7.camel@marge>
Message-ID: <BANLkTinPa4aq7Q35JKipLhCPppxxrANBww@mail.gmail.com>

On Thu, May 5, 2011 at 11:33 AM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> On Wednesday 04 May 2011 at 15:40 -0700, Ethan Furman wrote:
>> The comment says 'check that inputs of 2 GB are handled correctly' but
>> the file created is 1 byte short of 2Gb.  Is the test wrong, or just
>> wrongly commented?  Or am I not understanding?
>
> If you write a byte after 2 GB of zeros, the file size is 2 GB+the few
> bytes. This trick is to create quickly a large file: some OSes support
> sparse files, zeros are not written on disk. But on Mac OS X and
> Windows, you really write 2 GB+some bytes.

Ethan's point is that 0x7FFFFFFF is not 2GB - it is (2G-1) bytes. So the test
and the preceding comment are inconsistent.

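The size arithmetic under discussion can be checked without creating a
2 GB file. The sketch below is only an illustration: it replays the
seek-then-write pattern from the quoted test with a scaled-down limit, and
the helper name size_after_seek_and_write and the 4096-byte limit are
assumptions chosen for the example, not part of test_zlib.

```python
import os
import tempfile

INT_MAX = 0x7FFFFFFF      # the int_max value used in the quoted test
TWO_GIB = 2 * 1024 ** 3   # 2 GB in the binary sense, i.e. 0x80000000


def size_after_seek_and_write(limit, payload=b"asdf"):
    """Seek to limit - len(payload), write payload, return the file size."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb+") as f:
            # Same pattern as the test: the gap before the payload is
            # filled with zeros (stored sparsely where the OS supports it).
            f.seek(limit - len(payload))
            f.write(payload)
            f.flush()
        return os.path.getsize(path)
    finally:
        os.unlink(path)


# Scaled-down run: the resulting size equals the limit, not limit + 4.
small_size = size_after_seek_and_write(4096)

# With limit = INT_MAX the file is therefore INT_MAX bytes long,
# which is one byte short of 2 GB.
shortfall = TWO_GIB - INT_MAX
```

Seeking to int_max - 4 and writing four bytes yields a file of exactly
int_max bytes, so the test exercises a (2G-1)-byte input.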
From p.f.moore at gmail.com Thu May 5 11:53:59 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 5 May 2011 10:53:59 +0100
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304588007.22418.7.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> <1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us> <1304588007.22418.7.camel@marge>
Message-ID: <BANLkTikNL=6ry4CPPdH5ZVejNNF10eoiEg@mail.gmail.com>

On 5 May 2011 10:33, Victor Stinner <victor.stinner at haypocalc.com> wrote:
> If you write a byte after 2 GB of zeros, the file size is 2 GB+the few
> bytes. This trick is to create quickly a large file: some OSes support
> sparse files, zeros are not written on disk. But on Mac OS X and
> Windows, you really write 2 GB+some bytes.

FWIW, on Windows you can create sparse files, using
DeviceIoControl() with FSCTL_SET_SPARSE. It's probably too messy to be
worth it for this case, though...

Paul

From giuott at gmail.com Thu May 5 12:14:34 2011
From: giuott at gmail.com (Giuseppe Ottaviano)
Date: Thu, 5 May 2011 11:14:34 +0100
Subject: [Python-Dev] What if replacing items in a dictionary returns the new dictionary?
In-Reply-To: <BANLkTikt4ue3NYBzna3p=GbNr6J6zEtGDA@mail.gmail.com>
References: <BANLkTin8sB+85CicRtqkbrgtN7--Ujh3jQ@mail.gmail.com> <20110429143406.GA441@iskra.aviel.ru> <BANLkTikt4ue3NYBzna3p=GbNr6J6zEtGDA@mail.gmail.com>
Message-ID: <BANLkTiksDBMJVEzcr27=rwuMdX2=ph-qjA@mail.gmail.com>

On Fri, Apr 29, 2011 at 4:05 PM, Roy Hyunjin Han
<starsareblueandfaraway at gmail.com> wrote:
>>   You can implement this in your own subclass of dict, no?
>
> Yes, I just thought it would be convenient to have in the language
> itself, but the responses to my post seem to indicate that [not
> returning the updated object] is an intended language feature for
> mutable types like dict or list.

In general nothing stops you from using a proxy object that returns itself
after each method call, something like

class using(object):
    def __init__(self, obj):
        self._wrappee = obj

    def unwrap(self):
        return self._wrappee

    def __getattr__(self, attr):
        def wrapper(*args, **kwargs):
            getattr(self._wrappee, attr)(*args, **kwargs)
            return self
        return wrapper


d = dict()
print using(d).update(dict(a=1)).update(dict(b=2)).unwrap()
# prints {'a': 1, 'b': 2}
l = list()
print using(l).append(1).append(2).unwrap()
# prints [1, 2]

From amauryfa at gmail.com Thu May 5 12:38:32 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 5 May 2011 12:38:32 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <4DC1D1C5.9010507@canterbury.ac.nz>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz>
Message-ID: <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>

Hi,

On Thursday 5 May 2011, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Amaury Forgeot d'Arc wrote:
>
>
> It's in the file Doc/data/refcounts.dat
> in some custom format.
>
>
> However, it doesn't seem to quite convey the same information.
> It lists the "refcount effect" on each parameter, but translating
> that into the notion of borrowed or stolen references seems
> to require knowledge of what the function does.
>
> For example, PyDict_SetItem has:
>
> PyDict_SetItem:PyObject*:p:0:
> PyDict_SetItem:PyObject*:key:+1:
> PyDict_SetItem:PyObject*:val:+1:
>
> All of these parameters take borrowed references, but the
> key and val get incremented because they're being stored
> in the dict.

This is not always true, for example when the item is already present
in the dict.
It's not important to know what the function does to the object;
only the action on the reference is relevant.

>
> So this file appears to be of limited usefulness.

-- Amaury -- Amaury Forgeot d'Arc From ethan at stoneleaf.us Thu May 5 14:07:04 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 05 May 2011 05:07:04 -0700 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by In-Reply-To: <1304588007.22418.7.camel@marge> References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> <1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us> <1304588007.22418.7.camel@marge> Message-ID: <4DC292E8.9010904@stoneleaf.us> Victor Stinner wrote: > Le mercredi 04 mai 2011 ? 15:40 -0700, Ethan Furman a ?crit : >> Victor Stinner wrote: >>> Le mardi 03 mai 2011 ? 16:22 +0200, Nadeem Vawda a ?crit : >>>> On Tue, May 3, 2011 at 3:19 PM, victor.stinner >>>> <python-checkins at python.org> wrote: >>>>> >>>>> + int_max = 0x7FFFFFFF >>>>> >>>>> + with open(TESTFN, "wb+") as f: >>>>> + f.seek(self.int_max-4) >>>>> + f.write("asdf") >>>>> + f.flush() >>>> >>>> 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be >>>> 0x80000000. However, if you make this change, crc32() and adler32() >>>> raise OverflowErrors (see changeset a0681e7a6ded). >>> >>> I don't want to check OverflowError: the test is supposed to compute the >>> checksum of a buffer of 0x7FFFFFFF bytes >> >> The comment says 'check that inputs of 2 GB are handled correctly' but >> the file created is 1 byte short of 2Gb. Is the test wrong, or just >> wrongly commented? Or am I not understanding? > > If you write a byte after 2 GB of zeros, the file size is 2 GB+the few > bytes. This trick is to create quickly a large file: some OSes support > sparse files, zeros are not written on disk. But on Mac OS X and > Windows, you really write 2 GB+some bytes. True, but that's not what's happening -- four bytes are being written at int_max - 4, and int_max is one less that 2GB; hence the resulting file is one less than 2GB. 
~Ethan~

From victor.stinner at haypocalc.com Thu May 5 14:27:43 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 05 May 2011 14:27:43 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <4DC292E8.9010904@stoneleaf.us>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> <1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us> <1304588007.22418.7.camel@marge> <4DC292E8.9010904@stoneleaf.us>
Message-ID: <1304598463.27042.0.camel@marge>

On Thursday 05 May 2011 at 05:07 -0700, Ethan Furman wrote:
> ... hence the resulting file is one less than 2GB.

Yep, it's 0x7FFFFFFF because it's INT_MAX, the biggest value storable in
an int. The zlib module stores the buffer size into an int in Python 2.7
(and Py_ssize_t in Python 3.3).

Victor

From ethan at stoneleaf.us Thu May 5 17:17:27 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 05 May 2011 08:17:27 -0700
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304598463.27042.0.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org> <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com> <1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us> <1304588007.22418.7.camel@marge> <4DC292E8.9010904@stoneleaf.us> <1304598463.27042.0.camel@marge>
Message-ID: <4DC2BF87.40100@stoneleaf.us>

Victor Stinner wrote:
> On Thursday 05 May 2011 at 05:07 -0700, Ethan Furman wrote:
>>
>> ... hence the resulting file is one less than 2GB.
>
> Yep, it's 0x7FFFFFFF because it's INT_MAX, the biggest value storable in
> an int. The zlib module stores the buffer size into an int in Python 2.7
> (and Py_ssize_t in Python 3.3).

So we are agreed that the file is not, in fact, 2GB in size...

> On Tue, May 3, 2011 at 3:19 PM, victor.stinner > <python-checkins at python.org> wrote: >> +# Issue #10276 - check that inputs of 2 GB are handled correctly. >> +# Be aware of issues #1202, #8650, #8651 and #10276 So why do the comments say we are testing a 2GB input? ~Ethan~ From starsareblueandfaraway at gmail.com Thu May 5 16:37:04 2011 From: starsareblueandfaraway at gmail.com (Roy Hyunjin Han) Date: Thu, 5 May 2011 10:37:04 -0400 Subject: [Python-Dev] What if replacing items in a dictionary returns the new dictionary? In-Reply-To: <BANLkTiksDBMJVEzcr27=rwuMdX2=ph-qjA@mail.gmail.com> References: <BANLkTin8sB+85CicRtqkbrgtN7--Ujh3jQ@mail.gmail.com> <20110429143406.GA441@iskra.aviel.ru> <BANLkTikt4ue3NYBzna3p=GbNr6J6zEtGDA@mail.gmail.com> <BANLkTiksDBMJVEzcr27=rwuMdX2=ph-qjA@mail.gmail.com> Message-ID: <BANLkTim0EkdVum9vkgDYBhCEpekSVc4+Ow@mail.gmail.com> >> 2011/4/29 Roy Hyunjin Han <starsareblueandfaraway at gmail.com>: >> It would be convenient if replacing items in a dictionary returns the >> new dictionary, in a manner analogous to str.replace(). What do you >> think? >> >> # Current behavior >> x = {'key1': 1} >> x.update(key1=3) == None >> x == {'key1': 3} # Original variable has changed >> >> # Possible behavior >> x = {'key1': 1} >> x.replace(key1=3) == {'key1': 3} >> x == {'key1': 1} # Original variable is unchanged >> > 2011/5/5 Giuseppe Ottaviano <giuott at gmail.com>: > In general nothing stops you to use a proxy object that returns itself > after each method call, something like > > class using(object): > def __init__(self, obj): > self._wrappee = obj > > def unwrap(self): > return self._wrappee > > def __getattr__(self, attr): > def wrapper(*args, **kwargs): > getattr(self._wrappee, attr)(*args, **kwargs) > return self > return wrapper > > > d = dict() > print using(d).update(dict(a=1)).update(dict(b=2)).unwrap() > # prints {'a': 1, 'b': 2} > l = list() > print using(l).append(1).append(2).unwrap() > # prints [1, 2] Cool! I never thought of that. 
That's a great snippet. I'll forward this to the python-ideas list. I don't think the python-dev people want this discussion to continue on their mailing list. From guido at python.org Thu May 5 19:00:54 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 5 May 2011 10:00:54 -0700 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> Message-ID: <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> On Thu, May 5, 2011 at 3:38 AM, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote: > Hi, > > Le jeudi 5 mai 2011, Greg Ewing <greg.ewing at canterbury.ac.nz> a ?crit?: >> Amaury Forgeot d'Arc wrote: >> >> >> It's in the file Doc/data/refcounts.dat >> in some custom format. >> >> >> However, it doesn't seem to quite convey the same information. >> It lists the "refcount effect" on each parameter, but translating >> that into the notion of borrowed or stolen references seems >> to require knowledge of what the function does. >> >> For example, PyDict_SetItem has: >> >> PyDict_SetItem:PyObject*:p:0: >> PyDict_SetItem:PyObject*:key:+1: >> PyDict_SetItem:PyObject*:val:+1: >> >> All of these parameters take borrowed references, but the >> key and val get incremented because they're being stored >> in the dict. > > This is not always true, for example when the item is already present > in the dict. > It's not important to know what the function does to the object, > Only the action on the reference is relevant. > >> >> So this file appears to be of limited usefulness. Seems you're in agreement with this. IMO when references are borrowed it is not very interesting. The interesting thing is when calling a function *steals* a reference. 
The other important thing to know is
whether the caller ends up owning the return value (if it is an
object) or not. I *think* you can tell the latter from the +1 for the
return value; but the former (whether it steals a reference) is
unclear from the data given. There's even an XXX comment about this in
the file:

# XXX NOTE: the 0/+1/-1 refcount information for arguments is
# confusing! Much more useful would be to indicate whether the
# function "steals" a reference to the argument or not. Take for
# example PyList_SetItem(list, i, item). This lists as a 0 change for
# both the list and the item arguments. However, in fact it steals a
# reference to the item argument!

--
--Guido van Rossum (python.org/~guido)

From amauryfa at gmail.com Thu May 5 19:17:30 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 5 May 2011 19:17:30 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
Message-ID: <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>

2011/5/5 Guido van Rossum <guido at python.org>:
> Seems you're in agreement with this. IMO when references are borrowed
> it is not very interesting. The interesting thing is when calling a
> function *steals* a reference. The other important thing to know is
> whether the caller ends up owning the return value (if it is an
> object) or not. I *think* you can tell the latter from the +1 for the
> return value; but the former (whether it steals a reference) is
> unclear from the data given. There's even an XXX comment about this in
> the file:
>
> # XXX NOTE: the 0/+1/-1 refcount information for arguments is
> # confusing!  Much more useful would be to indicate whether the
> # function "steals" a reference to the argument or not.  Take for
> # example PyList_SetItem(list, i, item).  This lists as a 0 change for
> # both the list and the item arguments.  However, in fact it steals a
> # reference to the item argument!

Should we change this file then?
And only list functions that don't follow the usual conventions.

But I'm sure that there are external tools which already use refcounts.dat
in its present format.

--
Amaury Forgeot d'Arc

From guido at python.org Thu May 5 19:18:54 2011
From: guido at python.org (Guido van Rossum)
Date: Thu, 5 May 2011 10:18:54 -0700
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
Message-ID: <BANLkTimvbe5MzYPL7ptNEs9kE8CYXfr6Lg@mail.gmail.com>

On Thu, May 5, 2011 at 10:17 AM, Amaury Forgeot d'Arc
<amauryfa at gmail.com> wrote:
> 2011/5/5 Guido van Rossum <guido at python.org>:
>> Seems you're in agreement with this. IMO when references are borrowed
>> it is not very interesting. The interesting thing is when calling a
>> function *steals* a reference. The other important thing to know is
>> whether the caller ends up owning the return value (if it is an
>> object) or not. I *think* you can tell the latter from the +1 for the
>> return value; but the former (whether it steals a reference) is
>> unclear from the data given. There's even an XXX comment about this in
>> the file:
>>
>> # XXX NOTE: the 0/+1/-1 refcount information for arguments is
>> # confusing!  Much more useful would be to indicate whether the
>> # function "steals" a reference to the argument or not.  Take for
>> # example PyList_SetItem(list, i, item).  This lists as a 0 change for
>> # both the list and the item arguments.  However, in fact it steals a
>> # reference to the item argument!
>
> Should we change this file then?
> And only list functions that don't follow the usual conventions.
>
> But I'm sure that there are external tools which already use refcounts.dat
> in its present format.

Maybe we can *add* a column with the desired information?

--
--Guido van Rossum (python.org/~guido)

From g.brandl at gmx.net Thu May 5 20:08:51 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 05 May 2011 20:08:51 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
Message-ID: <ipup3i$c9v$1@dough.gmane.org>

On 05.05.2011 19:00, Guido van Rossum wrote:
> On Thu, May 5, 2011 at 3:38 AM, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote:
>> Hi,
>>
>> On Thursday 5 May 2011, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>> Amaury Forgeot d'Arc wrote:
>>>
>>>
>>> It's in the file Doc/data/refcounts.dat
>>> in some custom format.
>>>
>>>
>>> However, it doesn't seem to quite convey the same information.
>>> It lists the "refcount effect" on each parameter, but translating
>>> that into the notion of borrowed or stolen references seems
>>> to require knowledge of what the function does.
>>> >>> For example, PyDict_SetItem has: >>> >>> PyDict_SetItem:PyObject*:p:0: >>> PyDict_SetItem:PyObject*:key:+1: >>> PyDict_SetItem:PyObject*:val:+1: >>> >>> All of these parameters take borrowed references, but the >>> key and val get incremented because they're being stored >>> in the dict. >> >> This is not always true, for example when the item is already present >> in the dict. >> It's not important to know what the function does to the object, >> Only the action on the reference is relevant. >> >>> >>> So this file appears to be of limited usefulness. > > Seems you're in agreement with this. IMO when references are borrowed > it is not very interesting. The interesting thing is when calling a > function *steals* a reference. The other important thing to know is > whether the caller ends up owning the return value (if it is an > object) or not. I *think* you can tell the latter from the +1 for the > return value; but the former (whether it steals a reference) is > unclear from the data given. There's even an XXX comment about this in > the file: > > # XXX NOTE: the 0/+1/-1 refcount information for arguments is > # confusing! Much more useful would be to indicate whether the > # function "steals" a reference to the argument or not. Take for > # example PyList_SetItem(list, i, item). This lists as a 0 change for > # both the list and the item arguments. However, in fact it steals a > # reference to the item argument! We're not using the information about arguments anyway in the doc build. So we're free to change the file to list only return types, and parameters in the event of stolen references. 
Georg

From solipsis at pitrou.net Thu May 5 20:09:30 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 5 May 2011 20:09:30 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
Message-ID: <20110505200930.0412d200@pitrou.net>

On Thu, 5 May 2011 19:17:30 +0200
"Amaury Forgeot d'Arc" <amauryfa at gmail.com> wrote:
> 2011/5/5 Guido van Rossum <guido at python.org>:
> > Seems you're in agreement with this. IMO when references are borrowed
> > it is not very interesting. The interesting thing is when calling a
> > function *steals* a reference. The other important thing to know is
> > whether the caller ends up owning the return value (if it is an
> > object) or not. I *think* you can tell the latter from the +1 for the
> > return value; but the former (whether it steals a reference) is
> > unclear from the data given. There's even an XXX comment about this in
> > the file:
> >
> > # XXX NOTE: the 0/+1/-1 refcount information for arguments is
> > # confusing!  Much more useful would be to indicate whether the
> > # function "steals" a reference to the argument or not.  Take for
> > # example PyList_SetItem(list, i, item).  This lists as a 0 change for
> > # both the list and the item arguments.  However, in fact it steals a
> > # reference to the item argument!
>
> Should we change this file then?
> And only list functions that don't follow the usual conventions.

+1

Regards

Antoine.

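For readers following the refcounts.dat discussion, the colon-separated
entries quoted in this thread can be read with a few lines of Python. This
is a rough sketch under stated assumptions: the field names (function,
ctype, name, effect, comment) are guesses based only on the three
PyDict_SetItem lines quoted above, and parse_refcounts is a hypothetical
helper, not a tool from the CPython source tree.

```python
from collections import namedtuple

# Field names are assumptions inferred from the entries quoted in the thread.
Entry = namedtuple("Entry", "function ctype name effect comment")

# The three PyDict_SetItem lines quoted by Greg Ewing.
SAMPLE = """\
PyDict_SetItem:PyObject*:p:0:
PyDict_SetItem:PyObject*:key:+1:
PyDict_SetItem:PyObject*:val:+1:
"""


def parse_effect(field):
    """Turn '0' or '+1' into an int; keep anything else as raw text."""
    try:
        return int(field)
    except ValueError:
        return field or None


def parse_refcounts(text):
    """Parse colon-separated entries, skipping blank lines and # comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(":", 4)
        if len(parts) != 5:
            continue  # not in the five-field shape assumed here
        function, ctype, name, effect, comment = parts
        entries.append(
            Entry(function, ctype, name, parse_effect(effect), comment))
    return entries


entries = parse_refcounts(SAMPLE)
# Parameters whose refcount effect is listed as +1.
incremented = [e.name for e in entries if e.effect == 1]
```

As the thread points out, a listed +1 or 0 does not by itself say whether a
reference is borrowed or stolen, which is why an extra column or a narrower
file is being proposed.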
From raymond.hettinger at gmail.com Thu May 5 20:12:55 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 5 May 2011 11:12:55 -0700 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <BANLkTimvbe5MzYPL7ptNEs9kE8CYXfr6Lg@mail.gmail.com> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com> <BANLkTimvbe5MzYPL7ptNEs9kE8CYXfr6Lg@mail.gmail.com> Message-ID: <D5F03F3E-C4A5-481F-BAFA-8C3C00E62665@gmail.com> On May 5, 2011, at 10:18 AM, Guido van Rossum wrote: > On Thu, May 5, 2011 at 10:17 AM, Amaury Forgeot d'Arc > <amauryfa at gmail.com> wrote: >> 2011/5/5 Guido van Rossum <guido at python.org>: >>> Seems you're in agreement with this. IMO when references are borrowed >>> it is not very interesting. The interesting thing is when calling a >>> function *steals* a reference. The other important thing to know is >>> whether the caller ends up owning the return value (if it is an >>> object) or not. I *think* you can tell the latter from the +1 for the >>> return value; but the former (whether it steals a reference) is >>> unclear from the data given. There's even an XXX comment about this in >>> the file: >>> >>> # XXX NOTE: the 0/+1/-1 refcount information for arguments is >>> # confusing! Much more useful would be to indicate whether the >>> # function "steals" a reference to the argument or not. Take for >>> # example PyList_SetItem(list, i, item). This lists as a 0 change for >>> # both the list and the item arguments. However, in fact it steals a >>> # reference to the item argument! >> >> Should we change this file then? >> And only list functions that don't follow the usual conventions. 
>> >> But I'm sure that there are external tools which already use refcounts.dat >> in its present format. > > Maybe we can *add* a column with the desired information? +1 Raymond From benjamin at python.org Thu May 5 20:41:50 2011 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 5 May 2011 13:41:50 -0500 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <E1QI3RQ-00050Z-JW@dinsdale.python.org> References: <E1QI3RQ-00050Z-JW@dinsdale.python.org> Message-ID: <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com> 2011/5/5 raymond.hettinger <python-checkins at python.org>: > http://hg.python.org/cpython/rev/1a56775c6e54 > changeset: 69857:1a56775c6e54 > branch: 3.2 > parent: 69855:97a4855202b8 > user: Raymond Hettinger <python at rcn.com> > date: Thu May 05 11:35:50 2011 -0700 > summary: > Avoid codec spelling issues by just using the utf-8 default. Out of curiosity, what is the issue? > > files: > Lib/random.py | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > > diff --git a/Lib/random.py b/Lib/random.py > --- a/Lib/random.py > +++ b/Lib/random.py > @@ -114,7 +114,7 @@ > if version == 2: > if isinstance(a, (str, bytes, bytearray)): > if isinstance(a, str): > - a = a.encode("utf8") > + a = a.encode() -- Regards, Benjamin From solipsis at pitrou.net Thu May 5 20:44:04 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 5 May 2011 20:44:04 +0200 Subject: [Python-Dev] cpython (merge 3.2 -> default): Avoid codec spelling issues by just using the utf-8 default.
References: <E1QI3RT-00050j-Ml@dinsdale.python.org> Message-ID: <20110505204404.5cfa02f2@pitrou.net> On Thu, 05 May 2011 20:38:27 +0200 raymond.hettinger <python-checkins at python.org> wrote: > http://hg.python.org/cpython/rev/2bc784057226 > changeset: 69858:2bc784057226 > parent: 69856:b06ad8458b32 > parent: 69857:1a56775c6e54 > user: Raymond Hettinger <python at rcn.com> > date: Thu May 05 11:38:06 2011 -0700 > summary: > Avoid codec spelling issues by just using the utf-8 default. > > files: > Lib/random.py | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > > diff --git a/Lib/random.py b/Lib/random.py > --- a/Lib/random.py > +++ b/Lib/random.py > @@ -114,7 +114,7 @@ > if version == 2: > if isinstance(a, (str, bytes, bytearray)): > if isinstance(a, str): > - a = a.encode("utf-8") > + a = a.encode() Isn't explicit better than implicit? By reading the new code it is not obvious that any thought was given to the choice of a codec, while stating "utf-8" explicitly hints that a decision was made. (also, I don't understand the spelling issue: "utf-8" just works) Regards Antoine. From alexander.belopolsky at gmail.com Thu May 5 21:01:29 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 5 May 2011 15:01:29 -0400 Subject: [Python-Dev] cpython (merge 3.2 -> default): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <20110505204404.5cfa02f2@pitrou.net> References: <E1QI3RT-00050j-Ml@dinsdale.python.org> <20110505204404.5cfa02f2@pitrou.net> Message-ID: <BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com> On Thu, May 5, 2011 at 2:44 PM, Antoine Pitrou <solipsis at pitrou.net> wrote: .. > (also, I don't understand the spelling issue: "utf-8" just works) This is probably referring to the fact that while encode() accepts many spelling variants, some are short-circuited in C code while others require codec lookup implemented in python. 
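[Editorial illustration, not part of the original thread.] The spelling variants mentioned above can be checked from Python itself. What follows is a minimal sketch, assuming a current CPython 3: `codecs.lookup()` resolves each alias to one canonical codec name, and `str.encode()` with no argument uses UTF-8. The helper names `canonical_names` and `encodings_agree` are hypothetical and exist only for this example. Whether a given spelling takes the C fast path is an implementation detail that the sketch does not measure.

```python
import codecs

# Several spellings that all name the same codec.
SPELLINGS = ["utf-8", "utf8", "UTF-8", "utf_8", "U8"]

def canonical_names(spellings):
    """Map each spelling to the canonical name reported by codecs.lookup()."""
    return {s: codecs.lookup(s).name for s in spellings}

def encodings_agree(text, spellings):
    """True if every spelling yields the same bytes as the no-argument default."""
    default = text.encode()
    return all(text.encode(s) == default for s in spellings)

names = canonical_names(SPELLINGS)
agree = encodings_agree("caf\xe9", SPELLINGS)
print(names)
print(agree)
```

The result is the same bytes for every spelling, so the choice between `a.encode("utf8")`, `a.encode("utf-8")` and `a.encode()` affects readability and possibly lookup cost, not the encoded output.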
From solipsis at pitrou.net Thu May 5 21:07:07 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 05 May 2011 21:07:07 +0200 Subject: [Python-Dev] cpython (merge 3.2 -> default): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com> References: <E1QI3RT-00050j-Ml@dinsdale.python.org> <20110505204404.5cfa02f2@pitrou.net> <BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com> Message-ID: <1304622427.3564.12.camel@localhost.localdomain> On Thursday, 5 May 2011 at 15:01 -0400, Alexander Belopolsky wrote: > On Thu, May 5, 2011 at 2:44 PM, Antoine Pitrou <solipsis at pitrou.net> wrote: > .. > > (also, I don't understand the spelling issue: "utf-8" just works) > > This is probably referring to the fact that while encode() accepts > many spelling variants, some are short-circuited in C code while > others require codec lookup implemented in python. This sounds like a bug to fix (isn't it fixed already, btw?) rather than something to add hackish workarounds for in stdlib code. Regards Antoine. From benjamin at python.org Thu May 5 21:13:34 2011 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 5 May 2011 14:13:34 -0500 Subject: [Python-Dev] cpython (merge 3.2 -> default): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com> References: <E1QI3RT-00050j-Ml@dinsdale.python.org> <20110505204404.5cfa02f2@pitrou.net> <BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com> Message-ID: <BANLkTinei4otVrp==7Q6CROodKs416z1Ng@mail.gmail.com> 2011/5/5 Alexander Belopolsky <alexander.belopolsky at gmail.com>: > On Thu, May 5, 2011 at 2:44 PM, Antoine Pitrou <solipsis at pitrou.net> wrote: > ..
>> (also, I don't understand the spelling issue: "utf-8" just works) > > This is probably referring to the fact that while encode() accepts > many spelling variants, some are short-circuited in C code while > others require codec lookup implemented in python. Isn't it cached after the first run? If this is the reasoning, I find it hard to believe that seed() is a large bottleneck in random. -- Regards, Benjamin From g.brandl at gmx.net Thu May 5 22:45:13 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 05 May 2011 22:45:13 +0200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com> Message-ID: <ipv28o$6e9$1@dough.gmane.org> On 05.05.2011 19:17, Amaury Forgeot d'Arc wrote: > 2011/5/5 Guido van Rossum <guido at python.org>: >> Seems you're in agreement with this. IMO when references are borrowed >> it is not very interesting. The interesting thing is when calling a >> function *steals* a reference. The other important thing to know is >> whether the caller ends up owning the return value (if it is an >> object) or not. I *think* you can tell the latter from the +1 for the >> return value; but the former (whether it steals a reference) is >> unclear from the data given. There's even an XXX comment about this in >> the file: >> >> # XXX NOTE: the 0/+1/-1 refcount information for arguments is >> # confusing! Much more useful would be to indicate whether the >> # function "steals" a reference to the argument or not. Take for >> # example PyList_SetItem(list, i, item). This lists as a 0 change for >> # both the list and the item arguments. 
However, in fact it steals a >> # reference to the item argument! > > Should we change this file then? > And only list functions that don't follow the usual conventions. > > But I'm sure that there are external tools which already use refcounts.dat > in its present format. I doubt it. And even if there are, the information in there is in parts highly outdated (because the docs don't use parameter info), and large numbers of functions are missing. Let's remove the cruft, and only keep interesting info. This will also make the file much more manageable. Georg From raymond.hettinger at gmail.com Thu May 5 22:55:07 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 5 May 2011 13:55:07 -0700 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com> References: <E1QI3RQ-00050Z-JW@dinsdale.python.org> <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com> Message-ID: <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com> On May 5, 2011, at 11:41 AM, Benjamin Peterson wrote: > 2011/5/5 raymond.hettinger <python-checkins at python.org>: >> http://hg.python.org/cpython/rev/1a56775c6e54 >> changeset: 69857:1a56775c6e54 >> branch: 3.2 >> parent: 69855:97a4855202b8 >> user: Raymond Hettinger <python at rcn.com> >> date: Thu May 05 11:35:50 2011 -0700 >> summary: >> Avoid codec spelling issues by just using the utf-8 default. > > Out of curiosity, what is the issue? IIRC, the performance depended on how your spelled-it. I believe that is why the spelling got changed in Py3.3. Either way, the code is simpler by just using the default. Raymond From mal at egenix.com Fri May 6 00:32:59 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 06 May 2011 00:32:59 +0200 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec spelling issues by just using the utf-8 default. 
In-Reply-To: <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com> References: <E1QI3RQ-00050Z-JW@dinsdale.python.org> <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com> <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com> Message-ID: <4DC3259B.5020804@egenix.com> Raymond Hettinger wrote: > > On May 5, 2011, at 11:41 AM, Benjamin Peterson wrote: > >> 2011/5/5 raymond.hettinger <python-checkins at python.org>: >>> http://hg.python.org/cpython/rev/1a56775c6e54 >>> changeset: 69857:1a56775c6e54 >>> branch: 3.2 >>> parent: 69855:97a4855202b8 >>> user: Raymond Hettinger <python at rcn.com> >>> date: Thu May 05 11:35:50 2011 -0700 >>> summary: >>> Avoid codec spelling issues by just using the utf-8 default. >> >> Out of curiosity, what is the issue? > > IIRC, the performance depended on how your spelled-it. > I believe that is why the spelling got changed in Py3.3. Not really. It got changed because we have canonical names for the codecs which the stdlib should use rather than rely on aliases. Performance-wise it only makes a difference if you use it in tight loops. > Either way, the code is simpler by just using the default. ... as long as the casual reader knows what the default it :-) I think it's better to make the choice explicit, if the code relies on a particular non-ASCII encoding. If it doesn't, than the default is fine. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 06 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2011-06-20: EuroPython 2011, Florence, Italy 45 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From tjreedy at udel.edu Fri May 6 00:52:34 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 05 May 2011 18:52:34 -0400 Subject: [Python-Dev] cpython (3.2): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com> References: <E1QI3RQ-00050Z-JW@dinsdale.python.org> <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com> <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com> Message-ID: <ipv9nh$f5b$1@dough.gmane.org> On 5/5/2011 4:55 PM, Raymond Hettinger wrote: > Either way, the code is simpler by just using the default. I thought about this and decided that the purpose of having defaults is so one does not have to always spell it out. So use it. Readers can always look it up and learn. -- Terry Jan Reedy From alexander.belopolsky at gmail.com Fri May 6 00:54:11 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 5 May 2011 18:54:11 -0400 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <4DC3259B.5020804@egenix.com> References: <E1QI3RQ-00050Z-JW@dinsdale.python.org> <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com> <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com> <4DC3259B.5020804@egenix.com> Message-ID: <BANLkTim-QnEyJ23ZbMeTN7M2+AkDVf9JEQ@mail.gmail.com> On Thu, May 5, 2011 at 6:32 PM, M.-A. Lemburg <mal at egenix.com> wrote: .. >> Either way, the code is simpler by just using the default. > > ... as long as the casual reader knows what the default it :-) > .. or cares. I this particular case, it hardly matters how random bits are encoded. 
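[Editorial illustration, not part of the original thread.] For seeding, the encoding determines the bytes that reach the generator. What follows is a minimal sketch, assuming a current CPython 3 in which `random.seed()` with the default `version=2` encodes a `str` seed with the UTF-8 default before hashing it, as in the changeset under discussion. The helper `first_values` is hypothetical and exists only for this example.

```python
import random

def first_values(seed, n=5):
    """Return the first n floats produced after seeding a fresh generator."""
    rng = random.Random()
    rng.seed(seed)
    return [rng.random() for _ in range(n)]

text = "\xe9"  # a single non-ASCII character

from_str = first_values(text)
from_utf8 = first_values(text.encode("utf-8"))
from_latin1 = first_values(text.encode("iso-8859-1"))

# A str seed is encoded with the UTF-8 default, so these two runs match.
print(from_str == from_utf8)
# Latin-1 produces different bytes, so the generator starts from another state.
print(from_utf8 == from_latin1)
```

Under that assumption, seeding with a `str` is reproducible only as long as the implicit encoding stays UTF-8, which is the argument for knowing (or stating) which codec is used.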
From victor.stinner at haypocalc.com Fri May 6 01:14:14 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 06 May 2011 01:14:14 +0200 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <BANLkTim-QnEyJ23ZbMeTN7M2+AkDVf9JEQ@mail.gmail.com> References: <E1QI3RQ-00050Z-JW@dinsdale.python.org> <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com> <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com> <4DC3259B.5020804@egenix.com> <BANLkTim-QnEyJ23ZbMeTN7M2+AkDVf9JEQ@mail.gmail.com> Message-ID: <1304637254.12569.4.camel@marge> On Thursday, 5 May 2011 at 18:54 -0400, Alexander Belopolsky wrote: > On Thu, May 5, 2011 at 6:32 PM, M.-A. Lemburg <mal at egenix.com> wrote: > .. > >> Either way, the code is simpler by just using the default. > > > > ... as long as the casual reader knows what the default it :-) > > > > .. or cares. I this particular case, it hardly matters how random > bits are encoded. You don't get the same random number sequence if you use a different encoding.
>>> import random
>>> r = random.Random()
>>> r.seed('\xe9'.encode('iso-8859-1')); r.randint(0, 1000)
639
>>> r.seed('\xe9'.encode('utf-8')); r.randint(0, 1000)
992
So it is useful to know how the seed was computed. The real question is which encoding gives the most random numbers?
:-) Victor From greg.ewing at canterbury.ac.nz Fri May 6 03:28:11 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 May 2011 13:28:11 +1200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> Message-ID: <4DC34EAB.9050001@canterbury.ac.nz> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]: > This is not always true, for example when the item is already present > in the dict. > It's not important to know what the function does to the object, > Only the action on the reference is relevant. Yes, that's the whole point. When using a functon, what you need to know is whether it borrows or steals a reference. But this file *doesn't tell* you that -- rather it assigns either 0 or +1 to a borrowed reference, apparently based on some notion of what the function "usually" does with that parameter. There does not seem to be enough information in that file to work out the borrowed/stolen statuses, which makes it seem rather useless. -- Greg From skip at pobox.com Fri May 6 03:52:08 2011 From: skip at pobox.com (skip at pobox.com) Date: Thu, 5 May 2011 20:52:08 -0500 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <ipv28o$6e9$1@dough.gmane.org> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com> <ipv28o$6e9$1@dough.gmane.org> Message-ID: <19907.21576.751581.958722@montanaro.dyndns.org> Georg> Let's remove the cruft, and only keep interesting info. 
This Georg> will also make the file much more manageable. If I was to do this from scratch I'd think hard about annotating the source code. No matter how hard you try, if you keep this information separate from the code and maintain it manually, it's going to get out-of-date. Skip From marks at dcs.gla.ac.uk Fri May 6 09:44:11 2011 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Fri, 06 May 2011 08:44:11 +0100 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <19907.21576.751581.958722@montanaro.dyndns.org> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com> <ipv28o$6e9$1@dough.gmane.org> <19907.21576.751581.958722@montanaro.dyndns.org> Message-ID: <4DC3A6CB.5020809@dcs.gla.ac.uk> skip at pobox.com wrote: > Georg> Let's remove the cruft, and only keep interesting info. This > Georg> will also make the file much more manageable. > > If I was to do this from scratch I'd think hard about annotating the source > code. No matter how hard you try, if you keep this information separate > from the code and maintain it manually, it's going to get out-of-date. > What about #defining PY_STOLEN in some header? Then any stolen parameter can be prefixed with PY_STOLEN in signature. For return values, similarly #define PY_BORROWED. Cheers, Mark. 
From amauryfa at gmail.com Fri May 6 10:18:32 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Fri, 6 May 2011 10:18:32 +0200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <4DC3A6CB.5020809@dcs.gla.ac.uk> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com> <ipv28o$6e9$1@dough.gmane.org> <19907.21576.751581.958722@montanaro.dyndns.org> <4DC3A6CB.5020809@dcs.gla.ac.uk> Message-ID: <BANLkTimh4a9BE+1HmuckMXf8yboxtc9m0w@mail.gmail.com> On Friday, 6 May 2011, Mark Shannon <marks at dcs.gla.ac.uk> wrote: > What about #defining PY_STOLEN in some header? > > Then any stolen parameter can be prefixed with PY_STOLEN in signature. > > For return values, similarly #define PY_BORROWED. Header files are harder to parse, and I don't see how it would apply to macros. What about additional tags in the .rst files? -- Amaury Forgeot d'Arc From solipsis at pitrou.net Fri May 6 12:27:03 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 6 May 2011 12:27:03 +0200 Subject: [Python-Dev] Borrowed and Stolen References in API References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> Message-ID: <20110506122703.17c4d889@pitrou.net> On Fri, 06 May 2011 13:28:11 +1200 Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]: > > > This is not always true, for example when the item is already present > > in the dict.
> > It's not important to know what the function does to the object, > > Only the action on the reference is relevant. > > Yes, that's the whole point. When using a functon, > what you need to know is whether it borrows or steals > a reference. Doesn't "borrow" mean the same as "steal" in that context? If an API borrows a reference, I expect it to take it from me. Regards Antoine. From marks at dcs.gla.ac.uk Fri May 6 12:45:38 2011 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Fri, 06 May 2011 11:45:38 +0100 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <20110506122703.17c4d889@pitrou.net> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> <20110506122703.17c4d889@pitrou.net> Message-ID: <4DC3D152.601@dcs.gla.ac.uk> Antoine Pitrou wrote: > On Fri, 06 May 2011 13:28:11 +1200 > Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > >> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]: >> >>> This is not always true, for example when the item is already present >>> in the dict. >>> It's not important to know what the function does to the object, >>> Only the action on the reference is relevant. >> Yes, that's the whole point. When using a functon, >> what you need to know is whether it borrows or steals >> a reference. > > Doesn't "borrow" mean the same as "steal" in that context? > If an API borrows a reference, I expect it to take it from me. "Stealing" takes the ownership. Borrowing does not. This explains it better: http://docs.python.org/py3k/c-api/intro.html#reference-count-details Cheers, Mark. 
From jimjjewett at gmail.com Fri May 6 15:49:19 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 6 May 2011 09:49:19 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Userlist.copy() wasn't returning a UserList. In-Reply-To: <E1QI6C6-0003Xg-UH@dinsdale.python.org> References: <E1QI6C6-0003Xg-UH@dinsdale.python.org> Message-ID: <BANLkTikphTzgiyGu9wZBwSnOv6W2Mwt97A@mail.gmail.com> Do you also want to assert that u is not v, or would that sort of "copy" be acceptable by some subclasses? On 5/5/11, raymond.hettinger <python-checkins at python.org> wrote: > http://hg.python.org/cpython/rev/f20373fcdde5 > changeset: 69865:f20373fcdde5 > user: Raymond Hettinger <python at rcn.com> > date: Thu May 05 14:34:35 2011 -0700 > summary: > Userlist.copy() wasn't returning a UserList. > > files: > Lib/collections/__init__.py | 2 +- > Lib/test/test_userlist.py | 6 ++++++ > 2 files changed, 7 insertions(+), 1 deletions(-) > > > diff --git a/Lib/collections/__init__.py b/Lib/collections/__init__.py > --- a/Lib/collections/__init__.py > +++ b/Lib/collections/__init__.py > @@ -887,7 +887,7 @@ > def pop(self, i=-1): return self.data.pop(i) > def remove(self, item): self.data.remove(item) > def clear(self): self.data.clear() > - def copy(self): return self.data.copy() > + def copy(self): return self.__class__(self) > def count(self, item): return self.data.count(item) > def index(self, item, *args): return self.data.index(item, *args) > def reverse(self): self.data.reverse() > diff --git a/Lib/test/test_userlist.py b/Lib/test/test_userlist.py > --- a/Lib/test/test_userlist.py > +++ b/Lib/test/test_userlist.py > @@ -52,6 +52,12 @@ > return str(key) + '!!!' 
> self.assertEqual(next(iter(T((1,2)))), "0!!!") > > + def test_userlist_copy(self): > + u = self.type2test([6, 8, 1, 9, 1]) > + v = u.copy() > + self.assertEqual(u, v) > + self.assertEqual(type(u), type(v)) > + > def test_main(): > support.run_unittest(UserListTest) > > > -- > Repository URL: http://hg.python.org/cpython > From ndbecker2 at gmail.com Fri May 6 16:04:09 2011 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 06 May 2011 10:04:09 -0400 Subject: [Python-Dev] Linus on garbage collection Message-ID: <iq0v4q$ubm$1@dough.gmane.org> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html From solipsis at pitrou.net Fri May 6 16:12:33 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 6 May 2011 16:12:33 +0200 Subject: [Python-Dev] Linus on garbage collection References: <iq0v4q$ubm$1@dough.gmane.org> Message-ID: <20110506161233.1ed647ec@pitrou.net> On Fri, 06 May 2011 10:04:09 -0400 Neal Becker <ndbecker2 at gmail.com> wrote: > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html Since we're sharing links, here's Matt Mackall's take: http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html cheers Antoine. From marks at dcs.gla.ac.uk Fri May 6 16:46:08 2011 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Fri, 06 May 2011 15:46:08 +0100 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> Message-ID: <4DC409B0.60909@dcs.gla.ac.uk> Neal Becker wrote: > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html > Being famous does not necessarily make you right. OS kernels are pretty atypical software, even if Linus is right about Linux, it doesn't apply to Python. I have empirical evidence, not opinion, that PyPy and my own HotPy are a *lot* faster (x5 or better) on Unladen Swallow's gcbench benchmark (which stresses the memory management subsystem). 
(Note that gcbench does not introduce any cycles, so its being easy on CPython) In fact, for gcbench CPython spends over twice as long in the cycle-collector as HotPy takes in total! I don't have such detailed results for PyPy. For other benchmarks, the HotPy GC times are often smaller than the inter-run variations in runtime, for example: HotPy GC stats for pystones (on a slow machine with a small cache): Total memory allocated: 20 Mbytes. 20 minor collections, 0 major collections Max heap size 2.4 Mbytes. Total time spent in GC: 3.5 milliseconds. ( <1% of execution time) My GC is quick, but its not the fastest. Evidence trumps opinion IMHO ;) Cheers, Mark. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/marks%40dcs.gla.ac.uk From solipsis at pitrou.net Fri May 6 17:33:51 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 6 May 2011 17:33:51 +0200 Subject: [Python-Dev] Linus on garbage collection References: <iq0v4q$ubm$1@dough.gmane.org> <4DC409B0.60909@dcs.gla.ac.uk> Message-ID: <20110506173351.4aef8145@pitrou.net> On Fri, 06 May 2011 15:46:08 +0100 Mark Shannon <marks at dcs.gla.ac.uk> wrote: > > Neal Becker wrote: > > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html > > > Being famous does not necessarily make you right. > > OS kernels are pretty atypical software, > even if Linus is right about Linux, it doesn't apply to Python. > > I have empirical evidence, not opinion, that PyPy and my own HotPy > are a *lot* faster (x5 or better) on Unladen Swallow's gcbench benchmark > (which stresses the memory management subsystem). > > (Note that gcbench does not introduce any cycles, so its being easy on > CPython) > > In fact, for gcbench CPython spends over twice as long in the > cycle-collector as HotPy takes in total! 
The thing is, it would be easy to change our collection heuristics so that the cycle collector gets called less often (actually, you can already do so using gc.set_threshold, IIRC). Something which is much more delicate for a "full" GC, where it would grow memory consumption a lot. Regards Antoine. From status at bugs.python.org Fri May 6 18:07:23 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 6 May 2011 18:07:23 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20110506160723.04A101CFD5@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-04-29 - 2011-05-06) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 2783 (+23) closed 21017 (+41) total 23800 (+64) Open issues with patches: 1201 Issues opened (47) ================== #11955: 3.3 : test_argparse.py fails 'make test' http://bugs.python.org/issue11955 opened by Jason.Vas.Dias #11956: 3.3 : test_import.py causes 'make test' to fail http://bugs.python.org/issue11956 opened by Jason.Vas.Dias #11957: re.sub confusion between count and flags args http://bugs.python.org/issue11957 opened by mindauga #11959: smtpd cannot be used without affecting global state http://bugs.python.org/issue11959 opened by vinay.sajip #11962: Buildbot reliability http://bugs.python.org/issue11962 opened by skrah #11963: Use real assert* for test_trigger_memory_error (test_parser) http://bugs.python.org/issue11963 opened by eric.araujo #11964: Undocumented change to indent param of json.dump in 3.2 http://bugs.python.org/issue11964 opened by eric.araujo #11965: Simplify context manager in os.popen http://bugs.python.org/issue11965 opened by eric.araujo #11968: wsgiref's wsgi application sample code does not work http://bugs.python.org/issue11968 opened by shimizukawa #11969: Can't launch Process on built-in static method http://bugs.python.org/issue11969 opened by 
cool-RR #11972: input does not strip a trailing newline correctly on Windows http://bugs.python.org/issue11972 opened by Michal.Molhanec #11973: kevent does not accept KQ_NOTE_EXIT (and other (f)flags) http://bugs.python.org/issue11973 opened by DragonSA #11974: Class definition gotcha.. should this be documented somewhere? http://bugs.python.org/issue11974 opened by sleepycal #11975: Fix referencing of built-in types (list, int, ...) http://bugs.python.org/issue11975 opened by jonash #11978: Report correct coverage.py data for tests that invoke subproce http://bugs.python.org/issue11978 opened by ncoghlan #11979: Minor improvements to the Sockets readme: typos, wording and s http://bugs.python.org/issue11979 opened by xmorel #11980: zipfile.ZipFile.write should accept fp as argument http://bugs.python.org/issue11980 opened by proppy #11981: dupe self.fp.tell() in zipfile.ZipFile.writestr http://bugs.python.org/issue11981 opened by proppy #11983: Inconsistent hash and comparison for code objects http://bugs.python.org/issue11983 opened by eltoder #11984: Wrong "See also" in symbol and token module docs http://bugs.python.org/issue11984 opened by davipo #11989: deprecate shutil.copy2 http://bugs.python.org/issue11989 opened by datamuc #11990: redirected output - stdout writes newline as \n in windows http://bugs.python.org/issue11990 opened by Jimbofbx #11992: sys.settrace doesn't disable tracing if a local trace function http://bugs.python.org/issue11992 opened by nedbat #11993: Use sub-second resolution to determine if a file is newer http://bugs.python.org/issue11993 opened by jsjgruber #11994: [2.7/gcc-4.4.3] Segfault under valgrind in string.split() http://bugs.python.org/issue11994 opened by skrah #11995: test_pydoc loads all Python modules http://bugs.python.org/issue11995 opened by haypo #11996: libpython.py: nicer py-bt output http://bugs.python.org/issue11996 opened by haypo #11998: test_signal cannot test blocked signals if _tkinter is loaded; 
http://bugs.python.org/issue11998 opened by haypo #11999: sporadic failure in test_mailbox on FreeBSD http://bugs.python.org/issue11999 opened by haypo #12001: Extend json.dumps to handle N-triples strings http://bugs.python.org/issue12001 opened by Glenn.Ammons #12002: ftplib.FTP.abort fails with TypeError on Python 3.x http://bugs.python.org/issue12002 opened by nneonneo #12003: documentation: alternate version of xrange seems to fail. http://bugs.python.org/issue12003 opened by tenuki #12004: PyZipFile.writepy gives internal error on syntax errors http://bugs.python.org/issue12004 opened by Ben.Morgan #12005: modulo result of Decimal differs from float/int http://bugs.python.org/issue12005 opened by Kotan #12006: strptime should implement %V or %u directive from libc http://bugs.python.org/issue12006 opened by Erik.Cederstrand #12007: Console commands won't work http://bugs.python.org/issue12007 opened by jake_mcaga #12008: HtmlParser non-strict goes wrong with unquoted attributes http://bugs.python.org/issue12008 opened by svilend #12009: netrc module crashes if netrc file has comment lines http://bugs.python.org/issue12009 opened by rmstoi #12010: Compile fails when sizeof(wchar_t) == 1 http://bugs.python.org/issue12010 opened by dcoles #12011: The signal module should raise OSError for OS-related exceptio http://bugs.python.org/issue12011 opened by pitrou #12012: _ssl module doesn't compile with OpenSSL 1.0.0d: SSLv2_method http://bugs.python.org/issue12012 opened by haypo #12013: file /usr/local/lib/python3.1/lib-dynload/_socket.so: symbol i http://bugs.python.org/issue12013 opened by alex_lai #12014: str.format parses replacement field incorrectly http://bugs.python.org/issue12014 opened by Ben.Wolfson #12015: possible characters in temporary file name is too few http://bugs.python.org/issue12015 opened by planet36 #12016: Wrong behavior for '\xff\n'.decode('gb2312', 'ignore') http://bugs.python.org/issue12016 opened by cdqzzy #12017: Decoding a 
highly-nested object with json (_speedups enabled) http://bugs.python.org/issue12017 opened by ivank #12018: No tests for ntpath.samefile, ntpath.sameopenfile http://bugs.python.org/issue12018 opened by ronaldoussoren Most recent 15 issues with no replies (15) ========================================== #12018: No tests for ntpath.samefile, ntpath.sameopenfile http://bugs.python.org/issue12018 #12016: Wrong behavior for '\xff\n'.decode('gb2312', 'ignore') http://bugs.python.org/issue12016 #12013: file /usr/local/lib/python3.1/lib-dynload/_socket.so: symbol i http://bugs.python.org/issue12013 #12009: netrc module crashes if netrc file has comment lines http://bugs.python.org/issue12009 #12003: documentation: alternate version of xrange seems to fail. http://bugs.python.org/issue12003 #12002: ftplib.FTP.abort fails with TypeError on Python 3.x http://bugs.python.org/issue12002 #12001: Extend json.dumps to handle N-triples strings http://bugs.python.org/issue12001 #11992: sys.settrace doesn't disable tracing if a local trace function http://bugs.python.org/issue11992 #11989: deprecate shutil.copy2 http://bugs.python.org/issue11989 #11984: Wrong "See also" in symbol and token module docs http://bugs.python.org/issue11984 #11983: Inconsistent hash and comparison for code objects http://bugs.python.org/issue11983 #11979: Minor improvements to the Sockets readme: typos, wording and s http://bugs.python.org/issue11979 #11973: kevent does not accept KQ_NOTE_EXIT (and other (f)flags) http://bugs.python.org/issue11973 #11969: Can't launch Process on built-in static method http://bugs.python.org/issue11969 #11968: wsgiref's wsgi application sample code does not work http://bugs.python.org/issue11968 Most recent 15 issues waiting for review (15) ============================================= #12015: possible characters in temporary file name is too few http://bugs.python.org/issue12015 #12012: _ssl module doesn't compile with OpenSSL 1.0.0d: SSLv2_method 
http://bugs.python.org/issue12012 #12008: HtmlParser non-strict goes wrong with unquoted attributes http://bugs.python.org/issue12008 #12004: PyZipFile.writepy gives internal error on syntax errors http://bugs.python.org/issue12004 #11999: sporadic failure in test_mailbox on FreeBSD http://bugs.python.org/issue11999 #11998: test_signal cannot test blocked signals if _tkinter is loaded; http://bugs.python.org/issue11998 #11996: libpython.py: nicer py-bt output http://bugs.python.org/issue11996 #11989: deprecate shutil.copy2 http://bugs.python.org/issue11989 #11981: dupe self.fp.tell() in zipfile.ZipFile.writestr http://bugs.python.org/issue11981 #11980: zipfile.ZipFile.write should accept fp as argument http://bugs.python.org/issue11980 #11973: kevent does not accept KQ_NOTE_EXIT (and other (f)flags) http://bugs.python.org/issue11973 #11963: Use real assert* for test_trigger_memory_error (test_parser) http://bugs.python.org/issue11963 #11956: 3.3 : test_import.py causes 'make test' to fail http://bugs.python.org/issue11956 #11949: Make float('nan') unorderable http://bugs.python.org/issue11949 #11948: Tutorial/Modules - small fix to better clarify the modules sea http://bugs.python.org/issue11948 Top 10 most discussed issues (10) ================================= #11277: Crash with mmap and sparse files on Mac OS X http://bugs.python.org/issue11277 19 msgs #8407: expose signalfd(2) and pthread_sigmask in the signal module http://bugs.python.org/issue8407 18 msgs #11935: MMDF/MBOX mailbox need utime http://bugs.python.org/issue11935 17 msgs #11999: sporadic failure in test_mailbox on FreeBSD http://bugs.python.org/issue11999 11 msgs #6721: Locks in python standard library should be sanitized on fork http://bugs.python.org/issue6721 10 msgs #9971: Optimize BufferedReader.readinto http://bugs.python.org/issue9971 9 msgs #3526: Customized malloc implementation on SunOS and AIX http://bugs.python.org/issue3526 8 msgs #11962: Buildbot reliability 
http://bugs.python.org/issue11962 8 msgs #11949: Make float('nan') unorderable http://bugs.python.org/issue11949 7 msgs #11954: 3.3 - 'make test' fails http://bugs.python.org/issue11954 7 msgs Issues closed (37) ================== #1856: shutdown (exit) can hang or segfault with daemon threads runni http://bugs.python.org/issue1856 closed by pitrou #7517: freeze.py not ported to python3 http://bugs.python.org/issue7517 closed by eric.araujo #8158: Docstring of optparse.OptionParser incomplete http://bugs.python.org/issue8158 closed by r.david.murray #9756: Crash with custom __getattribute__ http://bugs.python.org/issue9756 closed by haypo #10684: Folders get deleted when trying to change case with shutil.mov http://bugs.python.org/issue10684 closed by ronaldoussoren #10775: assertRaises as a context manager should accept a 'msg' keywor http://bugs.python.org/issue10775 closed by ezio.melotti #10922: Unexpected exception when calling function_proxy.__class__.__c http://bugs.python.org/issue10922 closed by haypo #11034: Build problem on Windows with MSVC++ Express 2008 http://bugs.python.org/issue11034 closed by loewis #11206: test_readline unconditionally calls clear_history() http://bugs.python.org/issue11206 closed by ned.deily #11247: Error sending packets to multicast IPV4 address http://bugs.python.org/issue11247 closed by neologix #11335: Memory leak after key function failure in sort http://bugs.python.org/issue11335 closed by stutzbach #11834: wrong module installation dir on Windows http://bugs.python.org/issue11834 closed by brian.curtin #11849: glibc allocator doesn't release all free()ed memory http://bugs.python.org/issue11849 closed by pitrou #11873: test_regexp() of test_compileall fails occassionally http://bugs.python.org/issue11873 closed by r.david.murray #11883: Call connect() before sending an email with smtplib http://bugs.python.org/issue11883 closed by r.david.murray #11887: unittest fails on comparing str with bytes if python has the - 
http://bugs.python.org/issue11887 closed by michael.foord #11898: Sending binary data with a POST request in httplib can cause U http://bugs.python.org/issue11898 closed by orsenthil #11912: PaX triggers a segfault in dlopen http://bugs.python.org/issue11912 closed by neologix #11930: Remove time.accept2dyear http://bugs.python.org/issue11930 closed by belopolsky #11950: logger use dict for loggers instead of WeakValueDictionary http://bugs.python.org/issue11950 closed by vinay.sajip #11958: test.test_ftplib.TestIPv6Environment failure http://bugs.python.org/issue11958 closed by python-dev #11960: Python crashes when running numpy test http://bugs.python.org/issue11960 closed by amaury.forgeotdarc #11961: Document STARTUPINFO and creationflags options for Windows http://bugs.python.org/issue11961 closed by brian.curtin #11966: Typo in PyModule_AddIntMacro's documentation http://bugs.python.org/issue11966 closed by python-dev #11967: Left shift and Right shift for floats http://bugs.python.org/issue11967 closed by loewis #11970: distutils command 'upload' crashes when --show-response is sel http://bugs.python.org/issue11970 closed by offby1 #11971: Wrong parameter -O0 instead of -OO in manpage http://bugs.python.org/issue11971 closed by r.david.murray #11976: Provide proper documentation for list data type http://bugs.python.org/issue11976 closed by georg.brandl #11977: Document int.conjugate, .denominator, ... 
http://bugs.python.org/issue11977 closed by python-dev #11982: json.loads() returns str instead of unicode for empty strings http://bugs.python.org/issue11982 closed by ezio.melotti #11985: Document that platform.python_implementation supports PyPy http://bugs.python.org/issue11985 closed by ezio.melotti #11986: Min/max not symmetric in presence of NaN http://bugs.python.org/issue11986 closed by rhettinger #11987: queue.Queue.put should acquire mutex for unfinished_tasks http://bugs.python.org/issue11987 closed by rhettinger #11988: special method lookup docs don't address some important detail http://bugs.python.org/issue11988 closed by r.david.murray #11991: test_distutils fails because of bad filename match http://bugs.python.org/issue11991 closed by eric.araujo #11997: One typo in Doc/c-api/init.rst http://bugs.python.org/issue11997 closed by ezio.melotti #12000: SSL certificate verification failed if no dNSName entry in sub http://bugs.python.org/issue12000 closed by pitrou From skip at pobox.com Fri May 6 18:18:51 2011 From: skip at pobox.com (skip at pobox.com) Date: Fri, 6 May 2011 11:18:51 -0500 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <20110506161233.1ed647ec@pitrou.net> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> Message-ID: <19908.8043.8921.50222@montanaro.dyndns.org> Antoine> Since we're sharing links, here's Matt Mackall's take: Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html >From that note: 1: You can't have meaningful destructors, because when destruction happens is undefined. And going-out-of-scope destructors are extremely useful. Python is already a rather broken in this regard, so feel free to ignore this point. Given the presence of cyclic data I don't see how reference counting or garbage collection win. 
Ignoring the fact that in a pure reference counted system you won't even consider cycles for reclamation, would both RC and GC have to punt because they can't tell which object's destructor to call first? Skip From fuzzyman at voidspace.org.uk Fri May 6 18:31:44 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 06 May 2011 17:31:44 +0100 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <19908.8043.8921.50222@montanaro.dyndns.org> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> Message-ID: <4DC42270.1000301@voidspace.org.uk> On 06/05/2011 17:18, skip at pobox.com wrote: > Antoine> Since we're sharing links, here's Matt Mackall's take: > Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html > > > From that note: > > 1: You can't have meaningful destructors, because when destruction > happens is undefined. And going-out-of-scope destructors are extremely > useful. Python is already a rather broken in this regard, so feel free > to ignore this point. > > Given the presence of cyclic data I don't see how reference counting or > garbage collection win. Ignoring the fact that in a pure reference counted > system you won't even consider cycles for reclamation, would both RC and GC > have to punt because they can't tell which object's destructor to call > first? pypy and .NET choose to arbitrarily break cycles rather than leave objects unfinalised and memory unreclaimed. Not sure what Java does. All the best, Michael Foord > Skip > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From greg at krypto.org Fri May 6 18:32:51 2011 From: greg at krypto.org (Gregory P. Smith) Date: Fri, 6 May 2011 09:32:51 -0700 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <19908.8043.8921.50222@montanaro.dyndns.org> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> Message-ID: <BANLkTikUwyELPXSC6zSWnP8Xv8OqsBkkhQ@mail.gmail.com> On Fri, May 6, 2011 at 9:18 AM, <skip at pobox.com> wrote: > > Antoine> Since we're sharing links, here's Matt Mackall's take: > Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html > > From that note: > > 1: You can't have meaningful destructors, because when destruction > happens is undefined. And going-out-of-scope destructors are extremely > useful. Python is already a rather broken in this regard, so feel free > to ignore this point. Python being "broken" in this regard is pretty much exactly why __enter__, __exit__ and the with statement (context managers) were added to the language. That gives the ability to have the equivalent of well-defined nested scopes that destroy something (exit) deterministically, much as it is easy to do in C++ with some {}s and a ~destructor(). It is not broken, just different. 
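To make the context-manager point concrete, here is a minimal sketch (not from the thread). The Resource class and the events list are hypothetical names used only for illustration; the only language features relied on are __enter__, __exit__ and the with statement.

```python
events = []  # illustrative log of what happened, in order


class Resource:
    """Hypothetical resource; stands in for a file, lock, socket, etc."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        events.append("acquire " + self.name)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs when the with block is left, normally or via an exception.
        events.append("release " + self.name)
        return False  # do not swallow exceptions


# Nested scopes are released in reverse order, like nested C++ blocks.
with Resource("outer"):
    with Resource("inner"):
        events.append("work")

# The release also happens when the block is left by an exception.
try:
    with Resource("temp"):
        raise ValueError("boom")
except ValueError:
    pass
```

The release step does not depend on when (or whether) the object is garbage collected, which is the deterministic part. It does nothing for reclaiming memory held in cycles, though.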
-gps From marks at dcs.gla.ac.uk Fri May 6 18:33:03 2011 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Fri, 06 May 2011 17:33:03 +0100 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <19908.8043.8921.50222@montanaro.dyndns.org> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> Message-ID: <4DC422BF.4010006@dcs.gla.ac.uk> skip at pobox.com wrote: > Antoine> Since we're sharing links, here's Matt Mackall's take: > Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html > >>From that note: > > 1: You can't have meaningful destructors, because when destruction > happens is undefined. And going-out-of-scope destructors are extremely > useful. Python is already a rather broken in this regard, so feel free > to ignore this point. > > Given the presence of cyclic data I don't see how reference counting or > garbage collection win. Ignoring the fact that in a pure reference counted > system you won't even consider cycles for reclmation, would both RC and GC > have to punt because they can't tell which object's destructor to call > first? It doesn't matter which is called first. In fact, the VM could call all the destructors at the same time if the machine has enough cores and there's no GIL. All objects are kept alive by the GC until after the destructors are called. Those that are still dead will have their memory reclaimed. 
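A small sketch of the "call all the destructors, then reclaim" scheme, under a loud assumption: it relies on CPython 3.4 or later (PEP 442), which postdates this thread. CPython at the time of writing parked such cycles in gc.garbage instead. Node and the finalized list are made-up names for illustration.

```python
import gc

finalized = []  # records (own name, partner's name) as seen inside __del__


class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        # Assumption (CPython >= 3.4): the partner is still a valid object
        # here, even if its own __del__ has already run.
        finalized.append((self.name, self.partner.name))


def make_cycle():
    a, b = Node("a"), Node("b")
    a.partner, b.partner = b, a  # a <-> b reference cycle


make_cycle()   # the cycle is now unreachable
gc.collect()   # both finalizers run, then the memory is reclaimed
```

Both finalizers run and each can still read its partner. The order in which they run is not specified, so code should not depend on it.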
> > Skip > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/marks%40dcs.gla.ac.uk From stefan_ml at behnel.de Fri May 6 18:51:37 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 06 May 2011 18:51:37 +0200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC422BF.4010006@dcs.gla.ac.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> Message-ID: <iq18uq$s9p$1@dough.gmane.org> Mark Shannon, 06.05.2011 18:33: > skip at pobox.com wrote: >> Antoine> Since we're sharing links, here's Matt Mackall's take: >> Antoine> >> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html >> >>> From that note: >> >> 1: You can't have meaningful destructors, because when destruction >> happens is undefined. And going-out-of-scope destructors are extremely >> useful. Python is already a rather broken in this regard, so feel free >> to ignore this point. >> >> Given the presence of cyclic data I don't see how reference counting or >> garbage collection win. Ignoring the fact that in a pure reference counted >> system you won't even consider cycles for reclmation, would both RC and GC >> have to punt because they can't tell which object's destructor to call >> first? > > It doesn't matter which is called first. May I quote you on that one the next time my software crashes? It may not make a difference for the runtime, but the difference for user software may be "dead" or "alive". 
Stefan From fuzzyman at voidspace.org.uk Fri May 6 19:04:53 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 06 May 2011 18:04:53 +0100 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <BANLkTikUwyELPXSC6zSWnP8Xv8OqsBkkhQ@mail.gmail.com> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <BANLkTikUwyELPXSC6zSWnP8Xv8OqsBkkhQ@mail.gmail.com> Message-ID: <4DC42A35.6060303@voidspace.org.uk> On 06/05/2011 17:32, Gregory P. Smith wrote: > On Fri, May 6, 2011 at 9:18 AM,<skip at pobox.com> wrote: >> Antoine> Since we're sharing links, here's Matt Mackall's take: >> Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html >> >> > From that note: >> >> 1: You can't have meaningful destructors, because when destruction >> happens is undefined. And going-out-of-scope destructors are extremely >> useful. Python is already a rather broken in this regard, so feel free >> to ignore this point. > Python being "broken" in this regard is pretty much exactly why > __enter__, __exit__ and with as context managers were added to the > language. > How does that help with cycles? Sure it makes cleaning up some resources easier, but not at all this case. Explicit destruction is of course always an alternative to the runtime doing it for you, but it doesn't help with (for example) reclaiming memory. For long running processes memory leaks due to unreclaimable cycles can be a problem with CPython. > That gives the ability to have the equivalent of well defined nested > scopes that destroy something (exit) deterministically much as it is > easy to do in C++ with some {}s and a ~destructor(). > > It is not broken, just different. 
+1 QOTW ;-) Michael > -gps > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From fuzzyman at voidspace.org.uk Fri May 6 19:06:35 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 06 May 2011 18:06:35 +0100 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq18uq$s9p$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> Message-ID: <4DC42A9B.6020000@voidspace.org.uk> On 06/05/2011 17:51, Stefan Behnel wrote: > Mark Shannon, 06.05.2011 18:33: >> skip at pobox.com wrote: >>> Antoine> Since we're sharing links, here's Matt Mackall's take: >>> Antoine> >>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html >>> >>>> From that note: >>> >>> 1: You can't have meaningful destructors, because when destruction >>> happens is undefined. And going-out-of-scope destructors are extremely >>> useful. Python is already a rather broken in this regard, so feel free >>> to ignore this point. >>> >>> Given the presence of cyclic data I don't see how reference counting or >>> garbage collection win. Ignoring the fact that in a pure reference >>> counted >>> system you won't even consider cycles for reclmation, would both RC >>> and GC >>> have to punt because they can't tell which object's destructor to call >>> first? >> >> It doesn't matter which is called first. > > May I quote you on that one the next time my software crashes? 
> Arbitrarily breaking cycles *could* cause a problem if a destructor attempts to access an already collected object. Not breaking cycles *definitely* leaks memory and definitely doesn't call finalizers. Michael > It may not make a difference for the runtime, but the difference for > user software may be "dead" or "alive". > > Stefan -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From glyph at twistedmatrix.com Fri May 6 19:07:44 2011 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Fri, 6 May 2011 13:07:44 -0400 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC42270.1000301@voidspace.org.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC42270.1000301@voidspace.org.uk> Message-ID: <8F83194F-5A5C-496E-920A-A2488F9949E4@twistedmatrix.com> On May 6, 2011, at 12:31 PM, Michael Foord wrote: > pypy and .NET choose to arbitrarily break cycles rather than leave objects unfinalised and memory unreclaimed. Not sure what Java does. I think that's a mischaracterization of their respective collectors; "arbitrarily break cycles" implies that user code would see broken or incomplete objects, at least during finalization, which I'm fairly sure is not true on either .NET or PyPy. Java definitely has a collector that can handle cycles too. (None of these are reference counting.) -glyph -------------- next part -------------- An HTML attachment was scrubbed... 
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110506/9bf3df2c/attachment.html> From stephen at xemacs.org Fri May 6 19:15:33 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 07 May 2011 02:15:33 +0900 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC409B0.60909@dcs.gla.ac.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <4DC409B0.60909@dcs.gla.ac.uk> Message-ID: <87y62jeone.fsf@uwakimon.sk.tsukuba.ac.jp> Mark Shannon writes: > > > Neal Becker wrote: > > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html > > > Being famous does not necessarily make you right. No, but being a genius sure helps you beat the odds. > OS kernels are pretty atypical software, > even if Linus is right about Linux, it doesn't apply to Python. Well, actually he was writing about GCC.... > I have empirical evidence, not opinion, that PyPy and my own HotPy > are a *lot* faster (x5 or better) on Unladen Swallow's gcbench benchmark > (which stresses the memory management subsystem). You're missing Linus's point, I think. Linus did *not* claim that it's impossible to write a fast *GC*. He claimed that it's hard to write a fast *program* that uses GC for memory management. A benchmark that stresses *only* the memory management system is unlikely to impress him. 
From fuzzyman at voidspace.org.uk Fri May 6 19:12:51 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 06 May 2011 18:12:51 +0100 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <8F83194F-5A5C-496E-920A-A2488F9949E4@twistedmatrix.com> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC42270.1000301@voidspace.org.uk> <8F83194F-5A5C-496E-920A-A2488F9949E4@twistedmatrix.com> Message-ID: <4DC42C13.8070806@voidspace.org.uk> On 06/05/2011 18:07, Glyph Lefkowitz wrote: > On May 6, 2011, at 12:31 PM, Michael Foord wrote: > >> pypy and .NET choose to arbitrarily break cycles rather than leave >> objects unfinalised and memory unreclaimed. Not sure what Java does. > > I think that's a mischaracterization of their respective collectors; > "arbitrarily break cycles" implies that user code would see broken or > incomplete objects, at least during finalization, which I'm fairly > sure is not true on either .NET or PyPy. http://morepypy.blogspot.com/2008/02/python-finalizers-semantics-part-1.html "Therefore we decided to break such a cycle at an arbitrary place, which doesn't sound too insane." All the best, Michael Foord > > Java definitely has a collector that can handles cycles too. (None of > these are reference counting.) > > -glyph -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... 
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110506/3afbfd6a/attachment.html> From marks at dcs.gla.ac.uk Fri May 6 19:46:37 2011 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Fri, 06 May 2011 18:46:37 +0100 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC4321F.3070206@voidspace.org.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> <4DC42F4B.1050509@dcs.gla.ac.uk> <4DC4321F.3070206@voidspace.org.uk> Message-ID: <4DC433FD.6090803@dcs.gla.ac.uk> Michael Foord wrote: > On 06/05/2011 18:26, Mark Shannon wrote: >> Michael Foord wrote: >>> On 06/05/2011 17:51, Stefan Behnel wrote: >>>> Mark Shannon, 06.05.2011 18:33: >>>>> skip at pobox.com wrote: >>>>>> Antoine> Since we're sharing links, here's Matt Mackall's take: >>>>>> Antoine> >>>>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html >>>>>> >>>>>>> From that note: >>>>>> 1: You can't have meaningful destructors, because when destruction >>>>>> happens is undefined. And going-out-of-scope destructors are >>>>>> extremely >>>>>> useful. Python is already a rather broken in this regard, so feel >>>>>> free >>>>>> to ignore this point. >>>>>> >>>>>> Given the presence of cyclic data I don't see how reference >>>>>> counting or >>>>>> garbage collection win. Ignoring the fact that in a pure reference >>>>>> counted >>>>>> system you won't even consider cycles for reclmation, would both >>>>>> RC and GC >>>>>> have to punt because they can't tell which object's destructor to >>>>>> call >>>>>> first? >>>>> It doesn't matter which is called first. >>>> May I quote you on that one the next time my software crashes? >>>> >>> Arbitrarily breaking cycles *could* cause a problem if a destructor >>> attempts to access an already collected object. 
Not breaking cycles >>> *definitely* leaks memory and definitely doesn't call finalizers. >> You don't need to break the cycles to call the finalizers. Just call >> them, then collect the whole cycle (assuming it is still unreachable). >> >> The GC will *never* reclaim a reachable object. Objects awaiting >> finalization are reachable, by definition. >> > Well it was sloppily worded, so replace it with: > > if a finalizer attempts to access an already finalized object. A finalized object will still be a valid object. Python code cannot make an object unsafe. Obviously C code can make it unsafe, but that's true of C code anywhere. For example, a file object will close itself during finalization, but it's still a valid object, just a closed file rather than an open one. > > Michael >>> Michael >>> >>>> It may not make a difference for the runtime, but the difference for >>>> user software may be "dead" or "alive". >>>> >>>> Stefan >>> > > From merwok at netwok.org Fri May 6 19:42:11 2011 From: merwok at netwok.org (Éric Araujo) Date: Fri, 06 May 2011 19:42:11 +0200 Subject: [Python-Dev] cpython (3.2): Avoid codec spelling issues by just using the utf-8 default. In-Reply-To: <ipv9nh$f5b$1@dough.gmane.org> References: "\"<E1QI3RQ-00050Z-JW@dinsdale.python.org>" <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>" <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com> <ipv9nh$f5b$1@dough.gmane.org> Message-ID: <da8189c6ef78d0e47bea356efadae97e@netwok.org> On 06/05/2011 00:52, Terry Reedy wrote: > On 5/5/2011 4:55 PM, Raymond Hettinger wrote: >> Either way, the code is simpler by just using the default. 
> I thought about this and decided that the purpose of having defaults > is > so one does not have to always spell it out. So use it. Readers can > always look it up and learn. Agreed. I thought about something similar after Victor's commit that changed open(mode='rU') to use just 'r': Why not remove the mode argument entirely when it is the default value? Regards From merwok at netwok.org Fri May 6 19:51:31 2011 From: merwok at netwok.org (Éric Araujo) Date: Fri, 06 May 2011 19:51:31 +0200 Subject: [Python-Dev] Problems with regrtest and with logging Message-ID: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> Hi, Sorry for the quick email; battery dying. regrtest helpfully reports when a test leaves the environment unclean (sys.path, os.environ, logging._handlerList), but I think the implementation is buggy: it compares object identity and then value. Why is comparing identity useful? I'd just use ==. It makes writing cleanup code easier (just use addCleanup(setattr, obj, 'attr', copy(obj.attr))). Second: in packaging, we have two modules that create a logging handler. I'm not sure whether we should change the code or fix the tests to restore the _handlerList, or how. Thanks for advice. Regards From skip at pobox.com Fri May 6 19:58:34 2011 From: skip at pobox.com (skip at pobox.com) Date: Fri, 6 May 2011 12:58:34 -0500 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC42C13.8070806@voidspace.org.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC42270.1000301@voidspace.org.uk> <8F83194F-5A5C-496E-920A-A2488F9949E4@twistedmatrix.com> <4DC42C13.8070806@voidspace.org.uk> Message-ID: <19908.14026.312182.540486@montanaro.dyndns.org> Michael> "Therefore we decided to break such a cycle at an arbitrary Michael> place, which doesn't sound too insane." I trust "arbitrary" != "random"? 
Skip From stefan_ml at behnel.de Fri May 6 20:06:12 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 06 May 2011 20:06:12 +0200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC42A9B.6020000@voidspace.org.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> Message-ID: <iq1dak$mr2$1@dough.gmane.org> Michael Foord, 06.05.2011 19:06: > On 06/05/2011 17:51, Stefan Behnel wrote: >> Mark Shannon, 06.05.2011 18:33: >>> skip at pobox.com wrote: >>>> Antoine> Since we're sharing links, here's Matt Mackall's take: >>>> Antoine> >>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html >>>> >>>>> From that note: >>>> >>>> 1: You can't have meaningful destructors, because when destruction >>>> happens is undefined. And going-out-of-scope destructors are extremely >>>> useful. Python is already a rather broken in this regard, so feel free >>>> to ignore this point. >>>> >>>> Given the presence of cyclic data I don't see how reference counting or >>>> garbage collection win. Ignoring the fact that in a pure reference counted >>>> system you won't even consider cycles for reclmation, would both RC and GC >>>> have to punt because they can't tell which object's destructor to call >>>> first? >>> >>> It doesn't matter which is called first. >> >> May I quote you on that one the next time my software crashes? > > Arbitrarily breaking cycles *could* cause a problem if a destructor > attempts to access an already collected object. This is more real than the "could" suggests. Remember that CPython includes a lot of C code, and is commonly used to interface with C libraries. While you will simply get an exception when cycles are broken in Python code, cycles that involve C code can suffer quite badly from this problem. 
There was a bug in the lxml.etree XML library a while ago that could make it crash hard when its Element objects participated in a reference cycle. It's based on libxml2, so there's an underlying C tree that potentially involves disconnected subtrees, and a Python space representation using Element proxies, with at least one Element for each disconnected subtree. Basically, Elements reference their Document (not the other way round) even if they are disconnected from the main C document tree. The Document needs to do some final cleanup in the end, whereas the Elements require the Document to be alive to do their own subtree cleanup, if only to know what exactly to clean up, as the subtrees share some C state through the document. Now, if any of the Elements ends up in a reference cycle for some reason, the GC will roll its dice and may decide to call the Document destructor first. Then the Element destructors are bound to crash, trying to access dead memory of the Document. This was easy to fix in CPython's refcounting environment. A double INCREF on the Document for each Element does the trick, as it effectively removes the Document from the collectable cycle and lets the Element destructors decide when to let the Document refcount go down to 0. A fix in a pure GC system is substantially harder to make efficient. 
Stefan From g.brandl at gmx.net Fri May 6 20:14:28 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 06 May 2011 20:14:28 +0200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <BANLkTimh4a9BE+1HmuckMXf8yboxtc9m0w@mail.gmail.com> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com> <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com> <ipv28o$6e9$1@dough.gmane.org> <19907.21576.751581.958722@montanaro.dyndns.org> <4DC3A6CB.5020809@dcs.gla.ac.uk> <BANLkTimh4a9BE+1HmuckMXf8yboxtc9m0w@mail.gmail.com> Message-ID: <iq1dq3$p6i$1@dough.gmane.org> On 06.05.2011 10:18, Amaury Forgeot d'Arc wrote: > On Friday 6 May 2011, Mark Shannon <marks at dcs.gla.ac.uk> wrote: >> What about #defining PY_STOLEN in some header? >> >> Then any stolen parameter can be prefixed with PY_STOLEN in signature. >> >> For return values, similarly #define PY_BORROWED. > > Header files are harder to parse, and I don't see how it would apply to macros. > What about additional tags in the .rst files? Possible, of course, and even easier to implement. 
Georg From g.brandl at gmx.net Fri May 6 20:16:20 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 06 May 2011 20:16:20 +0200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <20110506122703.17c4d889@pitrou.net> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> <20110506122703.17c4d889@pitrou.net> Message-ID: <iq1dtj$p6i$2@dough.gmane.org> On 06.05.2011 12:27, Antoine Pitrou wrote: > On Fri, 06 May 2011 13:28:11 +1200 > Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > >> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]: >> >> > This is not always true, for example when the item is already present >> > in the dict. >> > It's not important to know what the function does to the object, >> > Only the action on the reference is relevant. >> >> Yes, that's the whole point. When using a function, >> what you need to know is whether it borrows or steals >> a reference. > > Doesn't "borrow" mean the same as "steal" in that context? > If an API borrows a reference, I expect it to take it from me. Basically, "borrow" is applied to return values (or, more generally, "out" parameters), and means that *you* borrowed the reference. "steal", OTOH, is applied to "in" parameters, where it is the exception rather than the rule. 
Georg From marks at dcs.gla.ac.uk Fri May 6 20:45:41 2011 From: marks at dcs.gla.ac.uk (Mark Shannon) Date: Fri, 06 May 2011 19:45:41 +0100 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq1dak$mr2$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> <iq1dak$mr2$1@dough.gmane.org> Message-ID: <4DC441D5.2070102@dcs.gla.ac.uk> Stefan Behnel wrote: > Michael Foord, 06.05.2011 19:06: >> On 06/05/2011 17:51, Stefan Behnel wrote: >>> Mark Shannon, 06.05.2011 18:33: >>>> skip at pobox.com wrote: >>>>> Antoine> Since we're sharing links, here's Matt Mackall's take: >>>>> Antoine> >>>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html >>>>> >>>>>> From that note: >>>>> 1: You can't have meaningful destructors, because when destruction >>>>> happens is undefined. And going-out-of-scope destructors are extremely >>>>> useful. Python is already a rather broken in this regard, so feel free >>>>> to ignore this point. >>>>> >>>>> Given the presence of cyclic data I don't see how reference counting or >>>>> garbage collection win. Ignoring the fact that in a pure reference counted >>>>> system you won't even consider cycles for reclmation, would both RC and GC >>>>> have to punt because they can't tell which object's destructor to call >>>>> first? >>>> It doesn't matter which is called first. >>> May I quote you on that one the next time my software crashes? >> Arbitrarily breaking cycles *could* cause a problem if a destructor >> attempts to access an already collected object. > > This is more real than the "could" suggests. Remember that CPython includes > a lot of C code, and is commonly used to interface with C libraries. 
While > you will simply get an exception when cycles are broken in Python code, > cycles that involve C code can suffer quite badly from this problem. > > There was a bug in the lxml.etree XML library a while ago that could let it > crash hard when its Element objects participated in a reference cycle. It's > based on libxml2, so there's an underlying C tree that potentially involves > disconnected subtrees, and a Python space representation using Element > proxies, with at least one Element for each disconnected subtree. > > Basically, Elements reference their Document (not the other way round) even > if they are disconnected from the main C document tree. The Document needs > to do some final cleanup in the end, whereas the Elements require the > Document to be alive to do their own subtree cleanup, if only to know what > exactly to clean up, as the subtrees share some C state through the > document. Now, if any of the Elements ends up in a reference cycle for some > reason, the GC will throw its dices and may decide to call the Document > destructor first. Then the Element destructors are bound to crash, trying > to access dead memory of the Document. With a tracing collector it is *impossible* to access dead memory, ever. If it can be reached the GC will *not* collect it. This should be a fundamental invariant of *all* GCs. If an object is finalizable or reachable from any finalizable objects then it is reachable and its memory should not be reclaimed until it is truly unreachable. Finalization and reclamation are separate phases. > > This was easy to fix in CPython's refcounting environment. A double INCREF > on the Document for each Element does the trick, as it effectively removes > the Document from the collectable cycle and lets the Element destructors > decide when to let the Document refcount go down to 0. A fix in a pure GC > system is substantially harder to make efficient. 
With a tracing GC: While the Elements are finalized, the Document is still alive. While the Document is finalized, the Elements are still alive. Then, and only then, is the whole lot reclaimed. Mark. From vinay_sajip at yahoo.co.uk Fri May 6 20:57:24 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Fri, 6 May 2011 18:57:24 +0000 (UTC) Subject: [Python-Dev] Problems with regrtest and with logging References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> Message-ID: <loom.20110506T205048-495@post.gmane.org> Éric Araujo <merwok <at> netwok.org> writes: > Second: in packaging, we have two modules that create a logging > handler. I'm not sure whether we should change the code or fix the tests > to restore the _handlerList, or how. If you are saying this happens in your unit tests for packaging, then you can either restore the _handlerList using the approach in test_logging, or else you can just close the handlers when you're done with them. If you point me at the relevant code (is it on bitbucket or on hg.python.org?) I can perhaps take a look and advise. 
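[Editorial illustration] The second option Vinay mentions (closing handlers once a test is done with them) might look roughly like the sketch below. The logger name "packaging.example" and the test class are made up for illustration and are not taken from the packaging code base; only public logging and unittest calls are used.

```python
import logging
import unittest


class HandlerCleanupTest(unittest.TestCase):
    """Hypothetical test that attaches a handler and removes it afterwards."""

    def setUp(self):
        # Illustrative logger name, not the one used by packaging.
        self.logger = logging.getLogger("packaging.example")
        self.handler = logging.StreamHandler()
        self.logger.addHandler(self.handler)
        # Cleanups run in reverse order: detach from the logger, then close.
        self.addCleanup(self.handler.close)
        self.addCleanup(self.logger.removeHandler, self.handler)

    def test_handler_attached(self):
        self.assertIn(self.handler, self.logger.handlers)


if __name__ == "__main__":
    unittest.main()
```

Registering the cleanups in setUp, right after the handler is created, means they still run when the test body fails.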
Regards, Vinay Sajip From stefan_ml at behnel.de Fri May 6 21:10:30 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 06 May 2011 21:10:30 +0200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC441D5.2070102@dcs.gla.ac.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> <iq1dak$mr2$1@dough.gmane.org> <4DC441D5.2070102@dcs.gla.ac.uk> Message-ID: <iq1h37$dmi$1@dough.gmane.org> Mark Shannon, 06.05.2011 20:45: > Stefan Behnel wrote: >> Michael Foord, 06.05.2011 19:06: >>> On 06/05/2011 17:51, Stefan Behnel wrote: >>>> Mark Shannon, 06.05.2011 18:33: >>>>> skip at pobox.com wrote: >>>>>> Antoine> Since we're sharing links, here's Matt Mackall's take: >>>>>> Antoine> >>>>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html >>>>>> >>>>>>> From that note: >>>>>> 1: You can't have meaningful destructors, because when destruction >>>>>> happens is undefined. And going-out-of-scope destructors are extremely >>>>>> useful. Python is already a rather broken in this regard, so feel free >>>>>> to ignore this point. >>>>>> >>>>>> Given the presence of cyclic data I don't see how reference counting or >>>>>> garbage collection win. Ignoring the fact that in a pure reference >>>>>> counted >>>>>> system you won't even consider cycles for reclmation, would both RC >>>>>> and GC >>>>>> have to punt because they can't tell which object's destructor to call >>>>>> first? >>>>> It doesn't matter which is called first. >>>> May I quote you on that one the next time my software crashes? >>> Arbitrarily breaking cycles *could* cause a problem if a destructor >>> attempts to access an already collected object. >> >> This is more real than the "could" suggests. Remember that CPython >> includes a lot of C code, and is commonly used to interface with C >> libraries. 
While you will simply get an exception when cycles are broken >> in Python code, cycles that involve C code can suffer quite badly from >> this problem. >> >> There was a bug in the lxml.etree XML library a while ago that could let >> it crash hard when its Element objects participated in a reference cycle. >> It's based on libxml2, so there's an underlying C tree that potentially >> involves disconnected subtrees, and a Python space representation using >> Element proxies, with at least one Element for each disconnected subtree. >> >> Basically, Elements reference their Document (not the other way round) >> even if they are disconnected from the main C document tree. The Document >> needs to do some final cleanup in the end, whereas the Elements require >> the Document to be alive to do their own subtree cleanup, if only to know >> what exactly to clean up, as the subtrees share some C state through the >> document. Now, if any of the Elements ends up in a reference cycle for >> some reason, the GC will throw its dices and may decide to call the >> Document destructor first. Then the Element destructors are bound to >> crash, trying to access dead memory of the Document. > > With a tracing collector it is *impossible* to access dead memory, ever. > If it can be reached the GC will *not* collect it. > This should be a fundamental invariant of *all* GCs. > > If an object is finalizable or reachable from any finalizable objects > then it is reachable and its memory should not be reclaimed until it is > truly unreachable. > > Finalization and reclamation are separate phases. Sure. However, I'm talking about Python types and C memory here. Even if the Python objects are still alive, they may already have freed the underlying C memory during their *finalisation*. When an Element goes out of scope, it must free its C subtree if it is disconnected, even if the Document stays alive. 
So that's what Elements do in their destructor, and they need the Document's C memory for that, which the Document frees during its own finalisation. I do agree that CPython's destructor call algorithms could have been smarter in this case. After all, the described crash case indicates that the Document destructor was called before all of the Element destructors had been called, although all Elements reference their Document, but the Document does not refer to any of the Elements, so it's basically a dead end. That would have provided a detectable hint to call the Document destructor last, after the ones of all objects that reference it. Apparently, this hint did not lead to an appropriate action, possibly because it's an unimplemented special case and there are enough cases where multiple objects with destructors are actually part of the 'real' cycle. Stefan From drsalists at gmail.com Fri May 6 21:59:30 2011 From: drsalists at gmail.com (Dan Stromberg) Date: Fri, 6 May 2011 12:59:30 -0700 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> Message-ID: <BANLkTimYpqiBtp_EFtmeQ3X9xCAnww+tcw@mail.gmail.com> On Fri, May 6, 2011 at 7:04 AM, Neal Becker <ndbecker2 at gmail.com> wrote: > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html > Of course, a generational GC improves locality of reference. From rdmurray at bitdance.com Fri May 6 22:07:30 2011 From: rdmurray at bitdance.com (R. 
David Murray) Date: Fri, 06 May 2011 16:07:30 -0400 Subject: [Python-Dev] Problems with regrtest and with logging In-Reply-To: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> Message-ID: <20110506200734.049872500DF@webabinitio.net> On Fri, 06 May 2011 19:51:31 +0200, Éric Araujo <merwok at netwok.org> wrote: > regrtest helpfully reports when a test leaves the environment unclean > (sys.path, os.environ, logging._handlerList), but I think the > implementation is buggy: it compares object identity and then value. > Why is comparing identity useful? I'd just use ==. It makes writing > cleanup code easier (just use addCleanup(setattr, obj, 'attr', > copy(obj.attr))). Well, the implementation is intentional. Nick (I think) added the identity check, and he had a reason at the time. I don't remember what it was, though. -- R. David Murray http://www.bitdance.com From greg.ewing at canterbury.ac.nz Sat May 7 01:25:09 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 07 May 2011 11:25:09 +1200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> Message-ID: <4DC48355.2050509@canterbury.ac.nz> Neal Becker wrote: > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html There, Linus says > For example, if you have an _explicit_ refcounting system, then it is > quite natural to have operations like ... > > note_t *node = *np; > if (node->count > 1) > newnode = copy_alloc(node); It's interesting to note that, even though you *can* get reference count information in CPython, it's not all that useful for doing things like that, because it's hard to be sure how many incidental references have been created on the way to the code concerned. So tricks like this at the Python level aren't really feasible in any robust way. 
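[Editorial illustration] The point about incidental references can be seen with sys.getrefcount(), which is CPython-specific. This is only a sketch: the helper count_inside is made up for illustration, and the absolute numbers printed depend on the interpreter version, which is rather the point being made.

```python
import sys


def count_inside(obj):
    # Passing the object into a function may add incidental references
    # (argument binding, evaluation stack), depending on the CPython version.
    return sys.getrefcount(obj)


data = []
outer = sys.getrefcount(data)       # already includes the call's own temporary
alias = data                        # one deliberate extra reference
with_alias = sys.getrefcount(data)
inner = count_inside(data)

# The absolute values are an implementation detail; only the fact that
# they differ from the "obvious" count of names is of interest here.
print(outer, with_alias, inner)
```

Code that branches on such a count being exactly 1 would have to know about every temporary reference between the caller and the check.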
-- Greg From greg.ewing at canterbury.ac.nz Sat May 7 01:43:16 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 07 May 2011 11:43:16 +1200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <19908.8043.8921.50222@montanaro.dyndns.org> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> Message-ID: <4DC48794.5070808@canterbury.ac.nz> Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html > >>From that note: > > 1: You can't have meaningful destructors, because when destruction > happens is undefined. And going-out-of-scope destructors are extremely > useful. Python is already a rather broken in this regard, so feel free > to ignore this point. It's only broken if you regard RAII as the One True Way to implement scoped resource management. Python has other approaches to that, such as the with-statement. Also, you *can* have destructors that work for objects in cycles, as long as you don't insist on the destructor having access to the object that's being destroyed. Weakref callbacks provide a way of implementing this in CPython. -- Greg From greg.ewing at canterbury.ac.nz Sat May 7 01:53:39 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 07 May 2011 11:53:39 +1200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC433FD.6090803@dcs.gla.ac.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> <4DC42F4B.1050509@dcs.gla.ac.uk> <4DC4321F.3070206@voidspace.org.uk> <4DC433FD.6090803@dcs.gla.ac.uk> Message-ID: <4DC48A03.2020800@canterbury.ac.nz> Mark Shannon wrote: > For example, a file object will close itself during finalization, > but its still a valid object, just a closed file rather than an open one. 
It might be valid in the sense that you won't get a segfault. But the point is that the destructors of some objects may be relying on other objects still being in a certain state, e.g. a file still being open. One would have to adopt a highly defensive coding style in destructors, verging on paranoia, to be sure that one's destructor code was completely immune to this kind of problem. All of this worry goes away if the destructor is not a method of the object being destroyed, but something external that runs *after* the object has disappeared. -- Greg From ncoghlan at gmail.com Sat May 7 02:12:33 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 May 2011 10:12:33 +1000 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> Message-ID: <6163FB60-2F6C-4143-8D9D-EE241DD09081@gmail.com> Even if he's right (and he probably is) manual memory management is still a premature optimization for most applications. C and C++ data structures are a PITA because you have to be so careful to avoid leaks and double-frees, so people end up using dumb algorithms. Worrying about losing cycles waiting for main memory is stupid if your high level algorithm is O(N^2). Cheers, Nick. 
-- Nick Coghlan, Brisbane, Australia On 07/05/2011, at 12:04 AM, Neal Becker <ndbecker2 at gmail.com> wrote: > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com From greg.ewing at canterbury.ac.nz Sat May 7 02:22:22 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 07 May 2011 12:22:22 +1200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC441D5.2070102@dcs.gla.ac.uk> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> <iq1dak$mr2$1@dough.gmane.org> <4DC441D5.2070102@dcs.gla.ac.uk> Message-ID: <4DC490BE.4080002@canterbury.ac.nz> Mark Shannon wrote: > With a tracing GC: > While the Elements are finalized, the Document is still alive. > While the Document is finalized, the Elements are still alive. > Then, and only then, is the whole lot reclaimed. One problem is that, at the C level in CPython, you can't separate finalisation and reclamation. When an object's refcount drops to zero, its tp_dealloc method is called, which both finalises the object and reclaims its memory. Another problem is that just because an object's memory hasn't been reclaimed yet doesn't mean it's safe to do anything with that object. This is doubly true at the C level, where the consequences can include segfaults. Seems to me the basic issue here is that the C code wasn't designed with tracing GC in mind. There is a reference cycle, but it is assumed that the user is in manual control of deallocation and will deallocate the Nodes before the Document. 
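[Editorial illustration] Greg's earlier suggestion in this thread, a cleanup routine that lives outside the object and runs after the object has gone, can be sketched in pure Python. The sketch assumes weakref.finalize, which only appeared in later Python 3 releases; at the time of this thread the same idea would be spelled with a weakref.ref callback. The Node class and the resource names are hypothetical.

```python
import gc
import weakref


class Node:
    """Illustrative object that may end up in a reference cycle."""

    def __init__(self, name):
        self.name = name
        self.partner = None


released = []


def release(resource_name):
    # Receives only the data needed for cleanup, never the Node itself,
    # so it cannot observe a half-destroyed object.
    released.append(resource_name)


a, b = Node("a"), Node("b")
a.partner, b.partner = b, a          # build a reference cycle
weakref.finalize(a, release, "resource-of-a")
weakref.finalize(b, release, "resource-of-b")

del a, b
gc.collect()                         # the cycle is only found by the cyclic GC
print(sorted(released))              # ['resource-of-a', 'resource-of-b']
```

The order in which the two callbacks run is not specified, which is harmless here because neither callback touches the other object.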
-- Greg From greg.ewing at canterbury.ac.nz Sat May 7 02:26:10 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 07 May 2011 12:26:10 +1200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq1h37$dmi$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> <iq1dak$mr2$1@dough.gmane.org> <4DC441D5.2070102@dcs.gla.ac.uk> <iq1h37$dmi$1@dough.gmane.org> Message-ID: <4DC491A2.2060309@canterbury.ac.nz> Stefan Behnel wrote: > After all, the described crash case indicates that > the Document destructor was called before all of the Element destructors > had been called, although all Elements reference their Document, but the > Document does not refer to any of the Elements, In that case, why was the GC system regarding this as a cycle at all? There must be more going on. -- Greg From glyph at twistedmatrix.com Sat May 7 03:39:10 2011 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Fri, 6 May 2011 21:39:10 -0400 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> Message-ID: <AF61F6E0-4A2A-4036-8305-65CDE66458CD@twistedmatrix.com> Apologies in advance for contributing to an obviously and increasingly off-topic thread, but this kind of FUD about GC is a pet peeve of mine. On May 6, 2011, at 10:04 AM, Neal Becker wrote: > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html Counterpoint: <http://lwn.net/Articles/268783/>. Sorry Linus, sometimes correctness matters more than performance. But, even the performance argument is kind of bogus. See, for example, this paper on real-time garbage collection: <http://domino.research.ibm.com/comm/research_people.nsf/pages/dgrove.ecoop07.html>. 
That's just one example of an easy-to-find solution to a problem that Linus holds up as unsolved or unsolvable. There are solutions to pretty much all of the problems that Linus brings up. One of these solutions is even famously implemented by CPython! The CPython "string +=" idiom optimization fixes at least one case of the "you tend to always copy the node" antipattern Linus describes, and lots of languages (especially Scheme and derivatives, IIRC) have very nice optimizations around this area. One could argue that any functional language without large pools of mutable state (i.e. Erlang) is a massive optimization for this case. Another example: the "dirty cache" problem Linus talks about can be addressed by having a GC that cooperates with the VMM: <http://www.cs.umass.edu/~emery/pubs/f034-hertz.pdf>. And the "re-using stuff as fast as possible" thing is exactly the kind of problem that generational GCs address. When you run out of space in cache, you reap your first generation before you start copying stuff. One of the key insights of generational GC is that you'll usually reclaim enough (in this case, cache-local) memory that you can keep going for a little while. You don't have to read a super fancy modern paper on this, Wikipedia explains nicely: <http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Generational_GC_.28ephemeral_GC.29>. Of course if you don't tune your GC at all for your machine-specific cache size, you won't see this performance benefit play out. I don't know if there's a programming language and runtime with a real-time, VM-cooperating garbage collector that actually exists today which has all the bells and whistles required to implement an OS kernel, so I wouldn't give the Linux kernel folks too much of a hard time for still using C; but there's nothing wrong with the idea in the abstract. 
The performance differences between automatic and manual GC are dubious at best, and with a really good GC and a language that supports it, GC tends to win big. When it loses, it loses in ways which can be fixed in one area of the code (the GC) rather than millions of tiny fixes across your whole codebase, as is the case with strategies used by manual collection algorithms. The assertion that "modern hardware" is not designed for big data-structure pointer-chasing is also a bit silly. On the contrary, modern hardware has evolved staggeringly massive caches, specifically because large programs (whether they're GC'd or not) tend to do lots of this kind of thing, because there's a certain level of complexity beyond which one can no longer avoid it. It's old hardware, with tiny caches (that were, by virtue of their tininess, closer to the main instruction-processing silicon), that was optimized for the "carefully stack-allocating everything in the world to conserve cache" approach. You can see this pretty clearly by running your favorite Python benchmark of choice on machines which are similar except for cache size. The newer machine, with the bigger cache, will run Python considerably faster, but doesn't help the average trivial C benchmark that much - or, for that matter, Linux benchmarks. -glyph -------------- next part -------------- An HTML attachment was scrubbed... 
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110506/3ad368a3/attachment.html> From g.brandl at gmx.net Sat May 7 08:55:57 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 07 May 2011 08:55:57 +0200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC48355.2050509@canterbury.ac.nz> References: <iq0v4q$ubm$1@dough.gmane.org> <4DC48355.2050509@canterbury.ac.nz> Message-ID: <iq2qds$1um$1@dough.gmane.org> On 07.05.2011 01:25, Greg Ewing wrote: > Neal Becker wrote: >> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html > > There, Linus says > >> For example, if you have an _explicit_ refcounting system, then it is >> quite natural to have operations like ... >> >> note_t *node = *np; >> if (node->count > 1) >> newnode = copy_alloc(node); > > It's interesting to note that, even though you *can* get reference > count information in CPython, it's not all that useful for doing > things like that, because it's hard to be sure how many incidental > references have been created on the way to the code concerned. > So tricks like this at the Python level aren't really feasible in > any robust way. But they are at the C level, see for example the optimization for string += something if "string"'s reference count is exactly one. 
Georg From stefan_ml at behnel.de Sat May 7 09:20:38 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 07 May 2011 09:20:38 +0200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <4DC491A2.2060309@canterbury.ac.nz> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> <iq1dak$mr2$1@dough.gmane.org> <4DC441D5.2070102@dcs.gla.ac.uk> <iq1h37$dmi$1@dough.gmane.org> <4DC491A2.2060309@canterbury.ac.nz> Message-ID: <iq2rs6$91r$1@dough.gmane.org> Greg Ewing, 07.05.2011 02:26: > Stefan Behnel wrote: >> After all, the described crash case indicates that the Document >> destructor was called before all of the Element destructors had been >> called, although all Elements reference their Document, but the Document >> does not refer to any of the Elements, > > In that case, why was the GC system regarding this as a cycle > at all? There must be more going on. It's a dead-end that is referenced by a cycle, that's all. Stefan From greg.ewing at canterbury.ac.nz Sat May 7 10:20:23 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 07 May 2011 20:20:23 +1200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <iq2rs6$91r$1@dough.gmane.org> References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net> <19908.8043.8921.50222@montanaro.dyndns.org> <4DC422BF.4010006@dcs.gla.ac.uk> <iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk> <iq1dak$mr2$1@dough.gmane.org> <4DC441D5.2070102@dcs.gla.ac.uk> <iq1h37$dmi$1@dough.gmane.org> <4DC491A2.2060309@canterbury.ac.nz> <iq2rs6$91r$1@dough.gmane.org> Message-ID: <4DC500C7.1070608@canterbury.ac.nz> Stefan Behnel wrote: > It's a dead-end that is referenced by a cycle, that's all. 
But shouldn't it be breaking the cycle by clearing one of the objects that's actually part of the cycle, rather than part of the dead-end? I can't see how the Document could get picked for clearing unless it was actually in the cycle. Either that or I'm imagining the cyclic GC algorithm to be smarter than it actually is. -- Greg From solipsis at pitrou.net Sat May 7 10:34:57 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 7 May 2011 10:34:57 +0200 Subject: [Python-Dev] Linus on garbage collection References: <iq0v4q$ubm$1@dough.gmane.org> <AF61F6E0-4A2A-4036-8305-65CDE66458CD@twistedmatrix.com> Message-ID: <20110507103457.1e586a76@pitrou.net> On Fri, 6 May 2011 21:39:10 -0400 Glyph Lefkowitz <glyph at twistedmatrix.com> wrote: > > The assertion that "modern hardware" is not designed for big data-structure pointer-chasing is also a bit silly. On the contrary, modern hardware has evolved staggeringly massive caches, specifically because large programs (whether they're GC'd or not) tend to do lots of this kind of thing, because there's a certain level of complexity beyond which one can no longer avoid it. "Staggeringly massive"? The average 4MB L3 cache is very small compared to the heap of non-trivial Python (or Java) workloads. And Linus is right: modern hardware is not optimized for random pointer-chasing, simply because optimizing for it is very hard. Regards Antoine. From doug.hellmann at gmail.com Sat May 7 13:54:59 2011 From: doug.hellmann at gmail.com (Doug Hellmann) Date: Sat, 7 May 2011 07:54:59 -0400 Subject: [Python-Dev] Python Insider translations Message-ID: <23CF5EAF-FA24-48D3-89B5-E6C07F920FD1@gmail.com> I wanted to take a few minutes to let you all know that the recent call for help with translating Python Insider was met with a wave of enthusiastic contributors. We now have teams prepared to translate all posts to Simplified and Traditional Chinese, German, Japanese, Portuguese, Romanian, and Spanish. 
Setting up each blog takes a bit of effort, so we are launching them in batches as they are ready. When all of the existing teams are launched, I will be looking for translators for additional languages. The next time you have Python related information that you would like to share with the community, I hope you will consider working with us and publishing it through Python Insider, so it can reach the widest possible audience. Either Brian Curtin or I can help you get set up, so contact one of us directly when you are ready. Thanks, Doug From merwok at netwok.org Sat May 7 18:28:21 2011 From: merwok at netwok.org (Éric Araujo) Date: Sat, 07 May 2011 18:28:21 +0200 Subject: [Python-Dev] Problems with regrtest and with logging In-Reply-To: <loom.20110506T205048-495@post.gmane.org> References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> <loom.20110506T205048-495@post.gmane.org> Message-ID: <2ed4b4e7b4fc17cba2162535d2a220d8@netwok.org> On 06/05/2011 20:57, Vinay Sajip wrote: > Éric Araujo <merwok <at> netwok.org> writes: >> Second: in packaging, we have two modules that create a logging >> handler. I'm not sure whether we should change the code or fix the tests >> to restore the _handlerList, or how. > > If you are saying this happens in your unit tests for packaging, then you can > either restore the _handlerList using the approach in test_logging, or else you > can just close the handlers when you're done with them. We create one handler in a command-line script, not in the lib, which is the Right Way AFAIU, but there is also one module that creates one handler (in order to set its level depending on a verbose attribute) deep in the library code, not in the command-line script, which I think is bad. Our tests that instantiate that object (dist.Distribution) end up modifying logging._handlerList, but I feel that the code is wrong, not the tests. The code is on https://bitbucket.org/tarek/cpython, in Lib/packaging. Thanks! 
From merwok at netwok.org Sat May 7 18:28:37 2011
From: merwok at netwok.org (Éric Araujo)
Date: Sat, 07 May 2011 18:28:37 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <20110506200734.049872500DF@webabinitio.net>
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> <20110506200734.049872500DF@webabinitio.net>
Message-ID: <17c0c1bacc61292edec4600e3feb40f9@netwok.org>

Hi,

On 06/05/2011 22:07, R. David Murray wrote:
> On Fri, 06 May 2011 19:51:31 +0200, Éric Araujo <merwok at netwok.org> wrote:
>> regrtest helpfully reports when a test leaves the environment unclean
>> (sys.path, os.environ, logging._handlerList), but I think the
>> implementation is buggy: it compares object identity and then value.
>> Why is comparing identity useful? I'd just use ==. It makes writing
>> cleanup code easier (just use addCleanup(setattr, obj, 'attr',
>> copy(obj.attr))).
>
> Well, the implementation is intentional. Nick (I think) added the
> identity check, and he had a reason at the time. I don't remember what
> it was, though.

Drat. Nick, if it was indeed you, can you enlighten me?
/off to replace all those addCleanup/setattr combos :( Regards From catch-all at masklinn.net Sat May 7 20:31:45 2011 From: catch-all at masklinn.net (Xavier Morel) Date: Sat, 7 May 2011 20:31:45 +0200 Subject: [Python-Dev] Linus on garbage collection In-Reply-To: <AF61F6E0-4A2A-4036-8305-65CDE66458CD@twistedmatrix.com> References: <iq0v4q$ubm$1@dough.gmane.org> <AF61F6E0-4A2A-4036-8305-65CDE66458CD@twistedmatrix.com> Message-ID: <FB627310-A744-4C09-8035-421157A76757@masklinn.net> On 2011-05-07, at 03:39 , Glyph Lefkowitz wrote: > > I don't know if there's a programming language and runtime with a real-time, VM-cooperating garbage collector that actually exists today which has all the bells and whistles required to implement an OS kernel, so I wouldn't give the Linux kernel folks too much of a hard time for still using C; but there's nothing wrong with the idea in the abstract. Not sure it had all those bells and whistles, and there were other issues, but I believe Lisp Machines implemented garbage collection at the hardware (or at least microcode) level, and the OS itself provided a pretty direct interface to it (it was part of the core services). From solipsis at pitrou.net Sat May 7 23:52:05 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 7 May 2011 23:52:05 +0200 Subject: [Python-Dev] cpython (2.7): Some tests were incorrectly marked as C specific. References: <E1QIorr-0000Fm-IG@dinsdale.python.org> Message-ID: <20110507235205.162d414c@pitrou.net> On Sat, 07 May 2011 23:16:51 +0200 raymond.hettinger <python-checkins at python.org> wrote: > > +class TestErrorHandling_Python(unittest.TestCase): > + module = py_heapq This class contains no tests. Regards Antoine. 
From vinay_sajip at yahoo.co.uk Sun May 8 16:22:18 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 8 May 2011 14:22:18 +0000 (UTC) Subject: [Python-Dev] Problems with regrtest and with logging References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> <loom.20110506T205048-495@post.gmane.org> <2ed4b4e7b4fc17cba2162535d2a220d8@netwok.org> Message-ID: <loom.20110508T153655-217@post.gmane.org> ?ric Araujo <merwok <at> netwok.org> writes: > The code is on https://bitbucket.org/tarek/cpython, in Lib/packaging. The cases you refer to seem to be _set_logger in packaging/run.py (which appears not to be used at all - there appear to be no other references to it in the code), Dispatcher.__init__ in packaging/run.py and Distribution.parse_command_line in packaging/dist.py. I can't see why the first case is there. In the second and third cases, can you be sure that only one of these code paths will be executed, at most once? If not, multiple StreamHandler instances would be added to the logger, resulting in duplicated messages. If the code paths will be executed at most once, then the code seems to be acceptable. You may wish to add a guard using "if not logger.hasHandlers():" so that even if the code is executed multiple times, a handler isn't added multiple times. In the case of the test support code, I'm not really sure that LoggingCatcher is needed. There is already a TestHandler class in test.support which captures records in a buffer, and allows flexible matching for assertions, as described in http://plumberjack.blogspot.com/2010/09/unit-testing-and-logging.html The _handlerList in logging contains weak references to handlers, and when the referent is finalised, it's removed from the list. 
If you want to control this more finely, you could do something like (untested):

    class MyTestCase(unittest.TestCase):
        def setUp(self):
            self.handler = TestHandler(Matcher())
            logging.getLogger().addHandler(self.handler)

        def tearDown(self):
            logging.getLogger().removeHandler(self.handler)
            self.handler.close()
            refs = weakref.getweakrefs(self.handler)
            for ref in refs:
                logging._removeHandlerRef(ref)

        def test_something(self):
            logging.warning('Test')
            self.assertTrue(self.handler.matches(message='Test'))

Regards,

Vinay Sajip

From victor.stinner at haypocalc.com Mon May 9 12:32:48 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 09 May 2011 12:32:48 +0200
Subject: [Python-Dev] Commit changelog: issue number and merges
Message-ID: <1304937168.22910.21.camel@marge>

Hi,

Commit changelogs are important to understand why the code was changed. I regularly use hg blame to search which commit introduced a particular line of code, and I am always happy if I can find an issue number because it usually contains the whole story.

And since the migration to Mercurial, we also have a great tool adding a comment to an issue if the changelog contains an issue number (e.g. changelog starting with "Issue #118888: ..."). So if someone watches an issue (is in the nosy list), (s)he will be notified that a related commit was pushed. It is not exactly something new: we already do that with Subversion except that today it is more automatic.

I noticed that some recent commits don't contain the issue number: please try to always prefix your changelog with the issue number. It is not "mandatory", but it helps me when I dig through the Python history.

--

For merge commits: many developers just write "merge" or "merge 3.1". I have to go to the parent commit (and sometimes to the grandparent, 3.1->3.2->3.3) to learn more about the commit.

Would it be possible to repeat the changelog of the original commit in the merge commits?
svnmerge toold prepared a nice changelog containing the changelog of all pendings commits, even when a commit was "blocked". For a merge commit, I copy/paste the changelog of the original commit and I add a "(Merge 3.1) " prefix. I prefer to add explictly a prefix because it is not easy to notice that it is a merge commit in a python-checkins email or in the history of hg.python.org. We need maybe new tools to help the process. -- Usecases needing better changelogs: - "All changes" section of a buildbot build - hg blame (or just hg log) Victor From rdmurray at bitdance.com Mon May 9 14:40:03 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 09 May 2011 08:40:03 -0400 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <1304937168.22910.21.camel@marge> References: <1304937168.22910.21.camel@marge> Message-ID: <20110509124003.EB9D5250044@webabinitio.net> On Mon, 09 May 2011 12:32:48 +0200, Victor Stinner <victor.stinner at haypocalc.com> wrote: > For merge commits: many developers just write "merge" or "merge 3.1". I > have to go to the parent commit (and something to the grandparent, > 3.1->3.2->3.3) to learn more about the commit. > > Would it be possible to repeat the changelog of the original commit in > the merge commits? svnmerge toold prepared a nice changelog containing > the changelog of all pendings commits, even when a commit was "blocked". > > For a merge commit, I copy/paste the changelog of the original commit > and I add a "(Merge 3.1) " prefix. I prefer to add explictly a prefix > because it is not easy to notice that it is a merge commit in a > python-checkins email or in the history of hg.python.org. +1. What I do is, in the edit window for the commit message, I pull in .hg/last-message.txt, and just type 'Merge' in front of my previous first line. 
I don't add the merge-from number, because I figure if you know which branch you are looking at you know which branch the merge came from, given that there is a strict progression. -- R. David Murray http://www.bitdance.com From jimjjewett at gmail.com Mon May 9 14:53:52 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 9 May 2011 08:53:52 -0400 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277: Remove useless test from test_zlib. In-Reply-To: <E1QIe26-0006OM-07@dinsdale.python.org> References: <E1QIe26-0006OM-07@dinsdale.python.org> Message-ID: <BANLkTi=BGGJRbi-i94tj6HmuOkqGq+RL9Q@mail.gmail.com> Can you clarify (preferably in the commit message as well) exactly *why* these largefile tests are useless? For example, is there another test that covers this already? -jJ On 5/7/11, nadeem.vawda <python-checkins at python.org> wrote: > http://hg.python.org/cpython/rev/201dcfc56e86 > changeset: 69886:201dcfc56e86 > branch: 2.7 > parent: 69881:a0147a1f1776 > user: Nadeem Vawda <nadeem.vawda at gmail.com> > date: Sat May 07 11:28:03 2011 +0200 > summary: > Issue #11277: Remove useless test from test_zlib. > > files: > Lib/test/test_zlib.py | 42 ------------------------------- > 1 files changed, 0 insertions(+), 42 deletions(-) > > > diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py > --- a/Lib/test/test_zlib.py > +++ b/Lib/test/test_zlib.py > @@ -72,47 +72,6 @@ > zlib.crc32('spam', (2**31))) > > > -# Issue #11277 - check that inputs of 2 GB (or 1 GB on 32 bits system) are > -# handled correctly. Be aware of issues #1202. We cannot test a buffer of 4 > GB > -# or more (#8650, #8651 and #10276), because the zlib stores the buffer > size > -# into an int. 
> -class ChecksumBigBufferTestCase(unittest.TestCase): > - if sys.maxsize > _4G: > - # (64 bits system) crc32() and adler32() stores the buffer size > into an > - # int, the maximum filesize is INT_MAX (0x7FFFFFFF) > - filesize = 0x7FFFFFFF > - else: > - # (32 bits system) On a 32 bits OS, a process cannot usually > address > - # more than 2 GB, so test only 1 GB > - filesize = _1G > - > - @unittest.skipUnless(mmap, "mmap() is not available.") > - def test_big_buffer(self): > - if sys.platform[:3] == 'win' or sys.platform == 'darwin': > - requires('largefile', > - 'test requires %s bytes and a long time to run' % > - str(self.filesize)) > - try: > - with open(TESTFN, "wb+") as f: > - f.seek(self.filesize-4) > - f.write("asdf") > - f.flush() > - m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) > - try: > - if sys.maxsize > _4G: > - self.assertEqual(zlib.crc32(m), 0x709418e7) > - self.assertEqual(zlib.adler32(m), -2072837729) > - else: > - self.assertEqual(zlib.crc32(m), 722071057) > - self.assertEqual(zlib.adler32(m), -1002962529) > - finally: > - m.close() > - except (IOError, OverflowError): > - raise unittest.SkipTest("filesystem doesn't have largefile > support") > - finally: > - unlink(TESTFN) > - > - > class ExceptionTestCase(unittest.TestCase): > # make sure we generate some expected errors > def test_badlevel(self): > @@ -595,7 +554,6 @@ > def test_main(): > run_unittest( > ChecksumTestCase, > - ChecksumBigBufferTestCase, > ExceptionTestCase, > CompressTestCase, > CompressObjectTestCase > > -- > Repository URL: http://hg.python.org/cpython > From eliben at gmail.com Mon May 9 14:56:57 2011 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 9 May 2011 15:56:57 +0300 Subject: [Python-Dev] more timely detection of unbound locals Message-ID: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> Hi all, It's a known Python gotcha (*) that the following code: x = 5 def foo(): print(x) x = 1 print(x) foo() Will throw: UnboundLocalError: local variable 'x' 
referenced before assignment On the usage of 'x' in the *first* print. Recently, while reading the zillionth question on StackOverflow on some variation of this case, I started thinking whether this behavior is desired or just an implementation artifact. IIUC, the reason it behaves this way is that the symbol table logic goes over the code before the code generation runs, sees the assignment 'x = 1` and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST for all loads of 'x' in 'foo', even though 'x' is actually bound locally after the first print. When the bytecode is run, since it's LOAD_FAST and no store was made into the local 'x', ceval.c then throws the exception. On first sight, it's possible to signal that 'x' truly becomes local only after it's bound in the scope (and before that LOAD_NAME can be generated for it instead of LOAD_FAST). To do this, some modifications to the symbol table creation and usage are required, because we can no longer say "x is local in this block", but rather should attach scope information to each instance of "x". This has some overhead, but it's only at the compilation stage so it shouldn't have a real effect on the runtime of Python code. This is also less convenient and "clean" than the current approach - this is why I'm wondering whether the behavior is an artifact of the implementation. Would it not be worth to make Python's behavior more expected in this case, at the cost of some implementation complexity? What are the cons to making such a change? At least judging by the amount of people getting confused by it, maybe it's in line with the zen of Python to behave more explicitly here. Thanks in advance, Eli (*) Variation of this FAQ: http://docs.python.org/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value -------------- next part -------------- An HTML attachment was scrubbed... 
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110509/2fd5298b/attachment.html> From jimjjewett at gmail.com Mon May 9 15:00:17 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 9 May 2011 09:00:17 -0400 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <E1QIf1U-0001ch-SK@dinsdale.python.org> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> Message-ID: <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> Are you asserting that all foreign modules (or at least all handled by this) are in C, as opposed to C++ or even Java or Fortran? (And the C won't change?) Is this ASCII restriction (as opposed to even UTF8) really needed? Or are you just saying that we need to create an ASCII name for passing to C? -jJ On 5/7/11, victor.stinner <python-checkins at python.org> wrote: > http://hg.python.org/cpython/rev/eb003c3d1770 > changeset: 69889:eb003c3d1770 > user: Victor Stinner <victor.stinner at haypocalc.com> > date: Sat May 07 12:46:05 2011 +0200 > summary: > _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII > > The name must be encodable to ASCII because dynamic module must have a > function > called "PyInit_NAME", they are written in C, and the C language doesn't > accept > non-ASCII identifiers. > > files: > Python/importdl.c | 40 +++++++++++++++++++++------------- > 1 files changed, 25 insertions(+), 15 deletions(-) > > > diff --git a/Python/importdl.c b/Python/importdl.c > --- a/Python/importdl.c > +++ b/Python/importdl.c > @@ -20,31 +20,36 @@ > const char *pathname, FILE *fp); > #endif > > -/* name should be ASCII only because the C language doesn't accept > non-ASCII > - identifiers, and dynamic modules are written in C. 
*/ > - > PyObject * > _PyImport_LoadDynamicModule(PyObject *name, PyObject *path, FILE *fp) > { > - PyObject *m; > + PyObject *m = NULL; > #ifndef MS_WINDOWS > PyObject *pathbytes; > #endif > + PyObject *nameascii; > char *namestr, *lastdot, *shortname, *packagecontext, *oldcontext; > dl_funcptr p0; > PyObject* (*p)(void); > struct PyModuleDef *def; > > - namestr = _PyUnicode_AsString(name); > - if (namestr == NULL) > - return NULL; > - > m = _PyImport_FindExtensionObject(name, path); > if (m != NULL) { > Py_INCREF(m); > return m; > } > > + /* name must be encodable to ASCII because dynamic module must have a > + function called "PyInit_NAME", they are written in C, and the C > language > + doesn't accept non-ASCII identifiers. */ > + nameascii = PyUnicode_AsEncodedString(name, "ascii", NULL); > + if (nameascii == NULL) > + return NULL; > + > + namestr = PyBytes_AS_STRING(nameascii); > + if (namestr == NULL) > + goto error; > + > lastdot = strrchr(namestr, '.'); > if (lastdot == NULL) { > packagecontext = NULL; > @@ -60,34 +65,33 @@ > #else > pathbytes = PyUnicode_EncodeFSDefault(path); > if (pathbytes == NULL) > - return NULL; > + goto error; > p0 = _PyImport_GetDynLoadFunc(shortname, > PyBytes_AS_STRING(pathbytes), fp); > Py_DECREF(pathbytes); > #endif > p = (PyObject*(*)(void))p0; > if (PyErr_Occurred()) > - return NULL; > + goto error; > if (p == NULL) { > PyErr_Format(PyExc_ImportError, > "dynamic module does not define init function" > " (PyInit_%s)", > shortname); > - return NULL; > + goto error; > } > oldcontext = _Py_PackageContext; > _Py_PackageContext = packagecontext; > m = (*p)(); > _Py_PackageContext = oldcontext; > if (m == NULL) > - return NULL; > + goto error; > > if (PyErr_Occurred()) { > - Py_DECREF(m); > PyErr_Format(PyExc_SystemError, > "initialization of %s raised unreported exception", > shortname); > - return NULL; > + goto error; > } > > /* Remember pointer to module init function. 
*/ > @@ -101,12 +105,18 @@ > Py_INCREF(path); > > if (_PyImport_FixupExtensionObject(m, name, path) < 0) > - return NULL; > + goto error; > if (Py_VerboseFlag) > PySys_FormatStderr( > "import %U # dynamically loaded from %R\n", > name, path); > + Py_DECREF(nameascii); > return m; > + > +error: > + Py_DECREF(nameascii); > + Py_XDECREF(m); > + return NULL; > } > > #endif /* HAVE_DYNAMIC_LOADING */ > > -- > Repository URL: http://hg.python.org/cpython > From orsenthil at gmail.com Mon May 9 15:08:48 2011 From: orsenthil at gmail.com (Senthil Kumaran) Date: Mon, 9 May 2011 21:08:48 +0800 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <20110509124003.EB9D5250044@webabinitio.net> References: <1304937168.22910.21.camel@marge> <20110509124003.EB9D5250044@webabinitio.net> Message-ID: <20110509130848.GA2402@kevin> On Mon, May 09, 2011 at 08:40:03AM -0400, R. David Murray wrote: > +1. What I do is, in the edit window for the commit message, I pull > in .hg/last-message.txt, and just type 'Merge' in front of my previous Thanks for this tip. I shall start following this one too. -- Senthil From ijmorlan at uwaterloo.ca Mon May 9 15:26:38 2011 From: ijmorlan at uwaterloo.ca (Isaac Morland) Date: Mon, 9 May 2011 09:26:38 -0400 (EDT) Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> Message-ID: <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> On Mon, 9 May 2011, Eli Bendersky wrote: > It's a known Python gotcha (*) that the following code: > > x = 5 > def foo(): > print(x) > x = 1 > print(x) > foo() > > Will throw: > > UnboundLocalError: local variable 'x' referenced before assignment > > On the usage of 'x' in the *first* print. 
Recently, while reading the > zillionth question on StackOverflow on some variation of this case, I > started thinking whether this behavior is desired or just an implementation > artifact. > > IIUC, the reason it behaves this way is that the symbol table logic goes > over the code before the code generation runs, sees the assignment 'x = 1` > and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST > for all loads of 'x' in 'foo', even though 'x' is actually bound locally > after the first print. When the bytecode is run, since it's LOAD_FAST and no > store was made into the local 'x', ceval.c then throws the exception. > > On first sight, it's possible to signal that 'x' truly becomes local only > after it's bound in the scope (and before that LOAD_NAME can be generated > for it instead of LOAD_FAST). To do this, some modifications to the symbol > table creation and usage are required, because we can no longer say "x is > local in this block", but rather should attach scope information to each > instance of "x". This has some overhead, but it's only at the compilation > stage so it shouldn't have a real effect on the runtime of Python code. This > is also less convenient and "clean" than the current approach - this is why > I'm wondering whether the behavior is an artifact of the implementation. 
x = 5
def foo ():
    print (x)
    if bar ():
        x = 1
    print (x)

Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

From stefan_ml at behnel.de Mon May 9 15:27:09 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 09 May 2011 15:27:09 +0200
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
Message-ID: <iq8q3d$kfk$1@dough.gmane.org>

Eli Bendersky, 09.05.2011 14:56:
> It's a known Python gotcha (*) that the following code:
>
> x = 5
> def foo():
>     print(x)
>     x = 1
>     print(x)
> foo()
>
> Will throw:
>
> UnboundLocalError: local variable 'x' referenced before assignment
>
> On the usage of 'x' in the *first* print. Recently, while reading the
> zillionth question on StackOverflow on some variation of this case, I
> started thinking whether this behavior is desired or just an implementation
> artifact.

Well, basically any compiler these days can detect that a variable is being used before assignment, or at least that this is possibly the case, depending on prior branching.

ISTM that your suggestion is to let x refer to the outer x up to the assignment and to the inner x from that point on. IMHO, that's much worse than the current behaviour and potentially impractical due to conditional assignments.

However, it's also a semantic change to reject code with unbound locals at compile time, as the specific code in question may actually be unreachable at runtime. This makes me think that it would be best to discuss this on the python-ideas list first.

If nothing else, I'd like to see a discussion on this behaviour being an implementation detail of CPython or a feature of the Python language.
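
The behaviour discussed in this thread can be reproduced with a short sketch. It assumes CPython 3 and shows that the name is classified as local for the whole function when it is compiled, including the conditional-assignment case. The function `bar` here takes a flag and is only an illustration, not the `bar` from the example above.

```python
x = 5


def foo():
    # The assignment further down makes the compiler treat 'x' as a local
    # name everywhere in foo, so this read fails instead of seeing the global.
    try:
        print(x)
    except UnboundLocalError as exc:
        return type(exc).__name__
    x = 1
    return x


def bar(flag):
    # Conditional assignment: 'x' is still a local name, but it is bound
    # only when flag is true.
    if flag:
        x = 1
    try:
        return x
    except UnboundLocalError:
        return None


# 'x' appears among foo's local variable names, decided at compile time.
is_local = "x" in foo.__code__.co_varnames
result = foo()
```

Here `is_local` is true and `result` is the string "UnboundLocalError", while `bar(True)` returns 1 and `bar(False)` returns None. Any per-use scoping scheme would have to decide what the read in `bar` means on each path.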
Stefan From ericsnowcurrently at gmail.com Mon May 9 15:41:43 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 9 May 2011 07:41:43 -0600 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> Message-ID: <BANLkTi=d3JpxxC59ckuxSmE4K-v2woJ_eg@mail.gmail.com> On May 9, 2011 6:59 AM, "Eli Bendersky" <eliben at gmail.com> wrote: > > Hi all, > > It's a known Python gotcha (*) that the following code: > > x = 5 > def foo(): > print(x) > x = 1 > print(x) > foo() > > Will throw: > > UnboundLocalError: local variable 'x' referenced before assignment > > On the usage of 'x' in the *first* print. Recently, while reading the zillionth question on StackOverflow on some variation of this case, I started thinking whether this behavior is desired or just an implementation artifact. > > IIUC, the reason it behaves this way is that the symbol table logic goes over the code before the code generation runs, sees the assignment 'x = 1` and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST for all loads of 'x' in 'foo', even though 'x' is actually bound locally after the first print. When the bytecode is run, since it's LOAD_FAST and no store was made into the local 'x', ceval.c then throws the exception. > > On first sight, it's possible to signal that 'x' truly becomes local only after it's bound in the scope (and before that LOAD_NAME can be generated for it instead of LOAD_FAST). To do this, some modifications to the symbol table creation and usage are required, because we can no longer say "x is local in this block", but rather should attach scope information to each instance of "x". This has some overhead, but it's only at the compilation stage so it shouldn't have a real effect on the runtime of Python code. 
This is also less convenient and "clean" than the current approach - this is why I'm wondering whether the behavior is an artifact of the implementation. > > Would it not be worth to make Python's behavior more expected in this case, at the cost of some implementation complexity? What are the cons to making such a change? At least judging by the amount of people getting confused by it, maybe it's in line with the zen of Python to behave more explicitly here. This is about mixing scopes for the the same name in the same block, right? Perhaps a more specific error would be enough, unless there is a good use case for having that mixed scope for the name. -eric > Thanks in advance, > Eli > > (*) Variation of this FAQ: http://docs.python.org/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ericsnowcurrently%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-dev/attachments/20110509/010f77dc/attachment.html> From nadeem.vawda at gmail.com Mon May 9 16:08:55 2011 From: nadeem.vawda at gmail.com (Nadeem Vawda) Date: Mon, 9 May 2011 16:08:55 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277: Remove useless test from test_zlib. In-Reply-To: <BANLkTi=BGGJRbi-i94tj6HmuOkqGq+RL9Q@mail.gmail.com> References: <E1QIe26-0006OM-07@dinsdale.python.org> <BANLkTi=BGGJRbi-i94tj6HmuOkqGq+RL9Q@mail.gmail.com> Message-ID: <BANLkTimiA9OzfNob0-b0GfPnymeb0h7zrg@mail.gmail.com> On Mon, May 9, 2011 at 2:53 PM, Jim Jewett <jimjjewett at gmail.com> wrote: > Can you clarify (preferably in the commit message as well) exactly > *why* these largefile tests are useless? For example, is there > another test that covers this already? 
Ah, sorry about that. It was discussed on the tracker issue, but I guess I can't expect people to read through 90+ messages to figure it out :P The short version is that it was supposed to test 4GB+ inputs, but in 2.7, the functions being tested don't accept inputs that large. The details: The test was originally intended to catch the case where crc32() or adler32() would get a buffer of >=4GB, and then silently truncate the buffer size and produce an incorrect result (issue10276). It had been written for 3.x, and then backported to 2.7. However, in 2.7, zlibmodule.c doesn't define PY_SSIZE_T_CLEAN, so passing in a buffer of >=2GB raises an OverflowError (see issue8651). This means that it is impossible to trigger the bug in question on 2.7, making the test pointless. Of course, the code that was deleted tests with an input sized 2GB-1 or 1GB, rather than 4GB (the size used in 3.x). When the test was backported, the size of the input was reduced, to avoid triggering an OverflowException. At the time, no-one realized that this also would not trigger the bug being tested for; it only came to light when the test started crashing for unrelated reasons (issue11277). Cheers, Nadeem From benjamin at python.org Mon May 9 16:08:53 2011 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 9 May 2011 09:08:53 -0500 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <1304937168.22910.21.camel@marge> References: <1304937168.22910.21.camel@marge> Message-ID: <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> 2011/5/9 Victor Stinner <victor.stinner at haypocalc.com>: > Hi, > > Commit changelogs are important to understand why the code was changed. > I regulary use hg blame to search which commit introduced a particular > line of code, and I am always happy if I can find an issue number > because it usually contains the whole story. 
> > And since the migration to Mercurial, we have also a great tool adding a > comment to an issue if the changelog contains an issue number (e.g. > changelog starting with "Issue #118888: ..."). So if someone watchs an > issue (is in the nosy list), (s)he will be noticed that a related commit > was pushed. It is not exactly something new: we already do that with > Subversion except that today it is more automatic. > > I noticed that some recent commits don't contain the issue number: > please try to always prefix your changelog with the issue number. It is > not "mandatory", but it helps me when I dig the Python history. > > -- > > For merge commits: many developers just write "merge" or "merge 3.1". I > have to go to the parent commit (and something to the grandparent, > 3.1->3.2->3.3) to learn more about the commit. I thought the whole point of merging was that you brought a changeset from one branch to another. This why I just write "merge" because otherwise you're technically duplicating information that is pulled onto the branch by merging. It seems like something that should be solved by tools like a display visual graph indicating what is merged. (like Bazaar) -- Regards, Benjamin From victor.stinner at haypocalc.com Mon May 9 16:11:15 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 09 May 2011 16:11:15 +0200 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> Message-ID: <1304950275.22910.32.camel@marge> Le lundi 09 mai 2011 ? 09:00 -0400, Jim Jewett a ?crit : > Are you asserting that all foreign modules (or at least all handled by > this) are in C, as opposed to C++ or even Java or Fortran? (And the C > won't change?) C and C++ identifiers are restricted to ASCII. 
I don't know for Fortran or Java. Is it possible to write a CPython extension module in Java or Fortran? (My change doesn't concern Jython: it's an implementation detail of dynamic modules in CPython.) > Is this ASCII restriction (as opposed to even UTF8) really needed? I prefer to explicitly limit module names of dynamic modules to ASCII. If we decide to extend the support to something else than ASCII, we will need a working module to test it, and maybe also a test. > Or are you just saying that we need to create an ASCII name for passing to C? You pass a Unicode module name to import (import h? or __import__('h?')), and Python encodes the name to ASCII if it is a dynamic module. It is still possible to use non-ASCII module names, but only for modules written in Python. Victor From victor.stinner at haypocalc.com Mon May 9 16:14:03 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 09 May 2011 16:14:03 +0200 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> Message-ID: <1304950443.22910.34.camel@marge> Le lundi 09 mai 2011 ? 09:08 -0500, Benjamin Peterson a ?crit : > It seems like something that should be solved by tools like a display > visual graph indicating what is merged. (like Bazaar) Yeah, we could fix buildbot, hg.python.org website, improve hg log, and all other tools using Mercurial. But until that, I would prefer to duplicate the information. 
Victor From ncoghlan at gmail.com Mon May 9 16:36:04 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 May 2011 00:36:04 +1000 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <20110506122703.17c4d889@pitrou.net> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> <20110506122703.17c4d889@pitrou.net> Message-ID: <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com> On Fri, May 6, 2011 at 8:27 PM, Antoine Pitrou <solipsis at pitrou.net> wrote: > On Fri, 06 May 2011 13:28:11 +1200 > Greg Ewing <greg.ewing at canterbury.ac.nz> wrote: > >> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]: >> >> > This is not always true, for example when the item is already present >> > in the dict. >> > It's not important to know what the function does to the object, >> > Only the action on the reference is relevant. >> >> Yes, that's the whole point. When using a functon, >> what you need to know is whether it borrows or steals >> a reference. > > Doesn't "borrow" mean the same as "steal" in that context? > If an API borrows a reference, I expect it to take it from me. 
Input parameter, borrowed or new reference: caller retains ownership and must still decref item
Input parameter, stolen reference: caller transfers ownership and must NOT decref item (or must incref before call to guarantee lifecycle if planning to continue using the object after the call)
Output parameter or return value, borrowed reference: caller does NOT receive ownership and does not need to decref item, but needs to be careful of lifecycle (and promote to a full reference with incref if the borrowed reference may outlive the original)
Output parameter or return value, stolen or new reference: caller receives ownership and must decref item
One interesting aspect is that from the caller's point of view, a *new* reference to the relevant object behaves like a borrowed reference for input parameters, but like a stolen reference for output parameters and return values. It is typically the converse cases (stolen reference to an input parameter, borrowed reference to an output parameter or return value) that require special attention on the caller's part. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Mon May 9 16:45:09 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 May 2011 00:45:09 +1000 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> Message-ID: <4DC7FDF5.9060003@pearwood.info> Eli Bendersky wrote: > Hi all, > > It's a known Python gotcha (*) that the following code: > > x = 5 > def foo(): > print(x) > x = 1 > print(x) > foo() > > Will throw: > > UnboundLocalError: local variable 'x' referenced before assignment I think part of the problem is that UnboundLocalError is a jargon name, while its predecessor NameError (used up to Python 1.5) is far more intuitively obvious. > On the usage of 'x' in the *first* print.
Recently, while reading the > zillionth question on StackOverflow on some variation of this case, I > started thinking whether this behavior is desired or just an implementation > artifact. [...] > Would it not be worth to make Python's behavior more expected in this case, > at the cost of some implementation complexity? What are the cons to making > such a change? At least judging by the amount of people getting confused by > it, maybe it's in line with the zen of Python to behave more explicitly > here. I think you are making an unwarranted assumption about what is "more expected". I presume you are thinking that the expected behaviour is that foo() should: print global x (5) assign 1 to local x print local x (1) If we implemented this change, there would be no more questions about UnboundLocalError, but instead there would be lots of questions like "why is it that globals revert to their old value after I change them in a function?". -- Steven From ncoghlan at gmail.com Mon May 9 17:00:38 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 May 2011 01:00:38 +1000 Subject: [Python-Dev] Problems with regrtest and with logging In-Reply-To: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> Message-ID: <BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com> On Sat, May 7, 2011 at 3:51 AM, ?ric Araujo <merwok at netwok.org> wrote: > regrtest helpfully reports when a test leaves the environment unclean > (sys.path, os.environ, logging._handlerList), but I think the implementation > is buggy: it compares object identity and then value. ?Why is comparing > identity useful? ?I?d just use ==. ?It makes writing cleanup code easier > (just use addCleanup(setattr, obj, 'attr', copy(obj.attr))). Because changing the identity of any of those global state attributes that regrtest monitors is itself suggestive of a bug. 
When it comes to containers, identity matters at least as much as value does (and sometimes more so - e.g. sys.modules). Replacing those global containers with new ones isn't guaranteed to work, as they may be cached in various places rather than always retrieved fresh from the relevant module namespace. Modifying them in place, on the other hand, does the right thing even in the presence of cached references. A comment to that effect may be a useful addition to regrtest, as I expect others may have similar questions about those identity checks in the future. (It may even be a useful addition to the documentation, but I have no idea where it could be sensibly included). Also, don't be surprised if wholesale cleanup like that isn't completely reliable - it's far, far better if the test case understands the changes it is making (even indirectly) and explicitly reverses them. Save-and-restore should be a last resort technique (although context managers that are designed for more general use, such as warnings.catch_warnings(), use save-and-restore by necessity, since they have no control over the body of the relevant with statements). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From eliben at gmail.com Mon May 9 17:01:06 2011 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 9 May 2011 18:01:06 +0300 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <4DC7FDF5.9060003@pearwood.info> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <4DC7FDF5.9060003@pearwood.info> Message-ID: <BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com> > I think you are making an unwarranted assumption about what is "more > expected". 
I presume you are thinking that the expected behaviour is that > foo() should: > > print global x (5) > assign 1 to local x > print local x (1) > > If we implemented this change, there would be no more questions about > UnboundLocalError, but instead there would be lots of questions like "why is > it that globals revert to their old value after I change them in a > function?". > True, but this is less confusing and follows the rules in a more straightforward way. x = 1 without a 'global x' assigns a local x, this make sense and is similar to what happens in C where an inner declaration temporarily shadows a global one. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-dev/attachments/20110509/20553648/attachment.html> From ncoghlan at gmail.com Mon May 9 17:04:21 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 May 2011 01:04:21 +1000 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> Message-ID: <BANLkTiko98eop8xF18q3KZFSBQFXJ3mdtQ@mail.gmail.com> On Mon, May 9, 2011 at 11:00 PM, Jim Jewett <jimjjewett at gmail.com> wrote: > Are you asserting that all foreign modules (or at least all handled by > this) are in C, as opposed to C++ or even Java or Fortran? ?(And the C > won't change?) The extension module that interfaces them to CPython will be written in C, or something that can export a C-compatible library interface (after reading in the Python C API headers). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From eliben at gmail.com Mon May 9 17:06:18 2011 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 9 May 2011 18:06:18 +0300 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> Message-ID: <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> > x = 5 > def foo (): > print (x) > if bar (): > x = 1 > print (x) > I wish you'd annotate this code sample, what do you intend it to demonstrate? It probably shows the original complaint even more strongly. As for being a problem with the suggested solution, I suppose you're right, although it doesn't make it much different. Still, before a *possible* assignment to 'x', it should be loaded as LOAD_NAME since it was surely not bound as local, yet. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-dev/attachments/20110509/82d1a583/attachment.html> From ncoghlan at gmail.com Mon May 9 17:17:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 May 2011 01:17:35 +1000 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <4DC7FDF5.9060003@pearwood.info> <BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com> Message-ID: <BANLkTikW9UgdaPvFLjF3_OrzPJ-aUMHAUQ@mail.gmail.com> On Tue, May 10, 2011 at 1:01 AM, Eli Bendersky <eliben at gmail.com> wrote: > >> I think you are making an unwarranted assumption about what is "more >> expected". 
I presume you are thinking that the expected behaviour is that >> foo() should: >> >> print global x (5) >> assign 1 to local x >> print local x (1) >> >> If we implemented this change, there would be no more questions about >> UnboundLocalError, but instead there would be lots of questions like "why is >> it that globals revert to their old value after I change them in a >> function?". > > True, but this is less confusing and follows the rules in a more > straightforward way. x = 1 without a 'global x' assigns a local x, this make > sense and is similar to what happens in C where an inner declaration > temporarily shadows a global one. However, since flow control constructs in Python don't create new scopes (unlike C/C++), you run into a fundamental problem with cases like the one Isaac posted, or even nastier ones like the following:

    def f():
        if bar():
            fill = 1
        else:
            fiil = 2
        print(fill)  # Q: What does this do when bool(bar()) is False?

Since we want to make the decision categorically at compile-time, the simplest, least-confusing option is to say "assignment makes a variable name local, referencing it before the first assignment is now an error". I don't know of anyone that particularly *likes* UnboundLocalError, but it's better than letting errors like the one above pass silently. (It obviously doesn't trap *all* typo-related errors, but it at least lets you reason sanely about name bindings.) On the reasoning-sanely front, closures likely present a more compelling argument:

    def f():
        def g():
            print(x)  # We want this to refer to the closure in f(), thanks
        x = 1
        return g

UnboundLocalError is really about aligning the rules for the current scope with those for references from nested scopes (i.e. x is a local variable of f, whether it is referenced from f's local scope, or any nested scope within f). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Mon May 9 17:22:36 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 May 2011 01:22:36 +1000 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> Message-ID: <BANLkTin-V6HZ0Z_kxSj9mgjnMgkPE0sTRw@mail.gmail.com> On Tue, May 10, 2011 at 1:06 AM, Eli Bendersky <eliben at gmail.com> wrote: > It probably shows the original complaint even more strongly. As for being a > problem with the suggested solution, I suppose you're right, although it > doesn't make it much different. Still, before a *possible* assignment to > 'x', it should be loaded as LOAD_NAME since it was surely not bound as > local, yet. Yeah, I've decided I'm happier with the closure based arguments than the conditional statement related ones. "Assignments create local variables" is a relatively simple rule to reason about, and is equally valid for the current scope and for any nested scopes. The symtable analysis for nested scopes is ordering independent (and can't be changed for backwards compatibility reasons if nothing else), and UnboundLocalError is a natural outgrowth of applying those semantics to the current scope as well. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From rdmurray at bitdance.com Mon May 9 17:44:15 2011 From: rdmurray at bitdance.com (R. 
David Murray) Date: Mon, 09 May 2011 11:44:15 -0400 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> Message-ID: <20110509154416.35BBF250037@webabinitio.net> On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson <benjamin at python.org> wrote: > I thought the whole point of merging was that you brought a changeset > from one branch to another. This why I just write "merge" because > otherwise you're technically duplicating information that is pulled > onto the branch by merging. No it isn't. The commit message isn't pulled into the new branch. > It seems like something that should be solved by tools like a display > visual graph indicating what is merged. (like Bazaar) You'd need some extension to hg log that would show the original commit message for the first changeset in the merge line in order to "fix" this. I doubt that is going to happen. Note that saying just 'merge' makes perfect sense when you are pulling in a whole group of changesets in order to synchronize two branches. But if you are applying a single changeset to multiple branches, as we often do in our workflow, then I think duplicating the commit message is (1) easy to do and (2) very helpful when looking at hg log output. -- R. 
David Murray http://www.bitdance.com From ijmorlan at uwaterloo.ca Mon May 9 17:44:21 2011 From: ijmorlan at uwaterloo.ca (Isaac Morland) Date: Mon, 9 May 2011 11:44:21 -0400 (EDT) Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> Message-ID: <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca> On Mon, 9 May 2011, Eli Bendersky wrote: >> x = 5 >> def foo (): >> print (x) >> if bar (): >> x = 1 >> print (x) >> > > I wish you'd annotate this code sample, what do you intend it to > demonstrate? > > It probably shows the original complaint even more strongly. As for being a > problem with the suggested solution, I suppose you're right, although it > doesn't make it much different. Still, before a *possible* assignment to > 'x', it should be loaded as LOAD_NAME since it was surely not bound as > local, yet. Extrapolating from your suggestion, you're saying before a *possible* assignment it will be treated as global, and after a *possible* assignment it will be treated as local? But surely:

    print (x)
    if False:
        x = 1
    print (x)

should always print the same thing twice (in the absence of actions taken by other threads)! Replace "False" by something that is usually (but not always) True, and "print (x)" by something that actually does something, and you had best put on your helmet because it's going to be a fun ride. But I won't be on it. The idea that the same name within the same scope always refers to the same value is an idea from functional programming and not part of Python; but surely the same name within the same scope should at least always refer to the same variable!
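A minimal sketch of how CPython treats the `if False:` fragment above today, assuming Python 3; the wrapper function `foo` and the helper names `x_is_local` and `outcome` are illustrative and not from the thread. The name is classified once, at compile time, for the whole function body, so the dead assignment still makes `x` local and the first `print` fails instead of falling back to the global.

```python
x = 5

def foo():
    print(x)
    if False:   # never executed, but the assignment still makes x local
        x = 1
    print(x)

# The compiler has already recorded x as a local variable of foo.
x_is_local = "x" in foo.__code__.co_varnames

try:
    foo()
    outcome = "printed twice"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(x_is_local, outcome)
```

Under the existing rules the name therefore refers to one variable throughout the scope, which is the property argued for above.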
If something is to be done here, it occurs to me that the same parser that decides that the initial reference to x should use the local x could conceivably issue an error right away - "local variable can never be assigned before use" rather than waiting until runtime. But even if I haven't confused myself about the possibility of this raising a false positive (and it certainly could in the presence of dead code), it wouldn't catch cases of conditional premature use of a local variable. I think in those cases people would still ask the same questions they do with the existing implementation. Isaac Morland CSCF Web Guru DC 2554C, x36650 WWW Software Specialist From merwok at netwok.org Mon May 9 17:55:42 2011 From: merwok at netwok.org (=?UTF-8?Q?=C3=89ric_Araujo?=) Date: Mon, 09 May 2011 17:55:42 +0200 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> Message-ID: <7fa082450fb750d082e71d5070a62171@netwok.org> Hi, On 09/05/2011 16:08, Benjamin Peterson wrote: > 2011/5/9 Victor Stinner <victor.stinner at haypocalc.com>: >> For merge commits: many developers just write "merge" or "merge >> 3.1". I >> have to go to the parent commit (and something to the grandparent, >> 3.1->3.2->3.3) to learn more about the commit. I follow conventions I've seen elsewhere (maybe Mercurial itself): I use "Branch merge" when I merge anonymous branches on the same named branch, and "Merge x.y" for forward-porting across named branches. I also tend to do more than one commit before merging. It would not be very easy with my current toolchain to get the commit message(s) to insert into the new message, and I think it's not necessary. > I thought the whole point of merging was that you brought a changeset > from one branch to another.
This why I just write "merge" because > otherwise you're technically duplicating information that is pulled > onto the branch by merging. +1. No interest in manually duplicating available information. On 09/05/2011 17:44, R. David Murray wrote: > No it isn't. The commit message isn't pulled into the new branch. Sorry, your terminology does not make sense. If you mean that the commit message is not reused in the new commit after the merge, it's true. However, the commit message with the relevant information is available as part of the changesets that have been pulled and merged. Regards From merwok at netwok.org Mon May 9 18:36:43 2011 From: merwok at netwok.org (=?UTF-8?Q?=C3=89ric_Araujo?=) Date: Mon, 09 May 2011 18:36:43 +0200 Subject: [Python-Dev] Problems with regrtest and with logging In-Reply-To: <loom.20110508T153655-217@post.gmane.org> References: "\"<acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>" <loom.20110506T205048-495@post.gmane.org>" <2ed4b4e7b4fc17cba2162535d2a220d8@netwok.org> <loom.20110508T153655-217@post.gmane.org> Message-ID: <d2381201d78ab4c329487cc9f0c20236@netwok.org> Hi, Thanks for the help. I didn't know about handler.close. (By which I mean that I used logging without re-reading its documentation, which is a testimony to its usability :) > The cases you refer to seem to be _set_logger in packaging/run.py > (which appears > not to be used at all - there appear to be no other references to it > in the > code), Yep, probably dead code. I think that a handler should be defined only once, in the "if __name__ == '__main__'" block. Am I right? Just like you don't call sys.exit from library code (hello optparse!), you don't set logging handlers in library code, only in the outermost layer of the script. > Dispatcher.__init__ in packaging/run.py and This is the new-fangled command line parser, which can run global (Python-wide) commands (search, uninstall, etc.) as well as traditional project-wide commands (build, check, sdist, etc.)
> Distribution.parse_command_line in packaging/dist.py. This is the older command line parser, which can handle only project-wide commands. I'm not sure the work is finished to integrate both parsers; my smoke test used to be --help-commands, which can be hard to run these days. The problem is that Dispatcher or Distribution get the quiet or verbose options from the command-line deep in the library code, and want to use them to configure the log level on the handler, which I've just said should be set up at a much higher level. To solve this, I'm going to add a *logginghandler* argument to Dispatcher/Distribution; that way, the creation of the handler will happen only once and at a high level, but the command-line parsing code will be able to set the log handler from the command-line arguments. :) > In the second and third cases, can you be sure that only one of these > code paths > will be executed, at most once? Gut feeling is yes, but we've learned not to trust our instinct with distutils. > In the case of the test support code, I'm not really sure that > LoggingCatcher is > needed. There is already a TestHandler class in test.support which > captures > records in a buffer, and allows flexible matching for assertions, as > described in distutils used its own log module; this mixin was used to intercept messages sent with this system. When we migrated to stdlib logging, I added a todo comment to update the code to use something less kludgy :) The post you linked to is already in my bookmarks. Note that this support module also helps with Python 2.4+, so I may have to copy-paste TestHandler. So, I will fix the LoggingCatcher mixin to use the much cleaner addHandler/removeHandler combo (I'll avoid calling logging._removeHandlerRef if I don't have to) and try my idea about the handler instantiation in the code. Thanks a lot!
Cheers From steve at pearwood.info Mon May 9 18:39:14 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 May 2011 02:39:14 +1000 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <4DC7FDF5.9060003@pearwood.info> <BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com> Message-ID: <4DC818B2.5060508@pearwood.info> Eli Bendersky wrote: >> I think you are making an unwarranted assumption about what is "more >> expected". I presume you are thinking that the expected behaviour is that >> foo() should: >> >> print global x (5) >> assign 1 to local x >> print local x (1) >> >> If we implemented this change, there would be no more questions about >> UnboundLocalError, but instead there would be lots of questions like "why is >> it that globals revert to their old value after I change them in a >> function?". >> > > True, but this is less confusing and follows the rules in a more > straightforward way. x = 1 without a 'global x' assigns a local x, this make > sense and is similar to what happens in C where an inner declaration > temporarily shadows a global one. I disagree that it is less confusing. Instead of a nice, straightforward error that you can google, the function will silently do the wrong thing, giving no clue that weirdness is happening.

    def spam():
        if x < 0:  # refers to global x
            x = 1  # now local
        if x > 0:  # could be either global or local
            x = x - 1  # local on the LHS of the equal
                       # sometimes global on the RHS
        else:
            x += 1  # local x, but what value does it have?

Just thinking about debugging the mess that this could make gives me a headache.
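For comparison, a small sketch of what spam() does under the existing rules, assuming CPython 3. The module-level `x = -1` and the names `error`, `raised_at_first_test` and `is_also_name_error` are added here for illustration only. Because `x` is assigned somewhere in the body, every use of `x` in spam() is local, so the very first comparison raises rather than reading the global.

```python
x = -1  # illustrative module-level value, not part of the original message

def spam():
    if x < 0:        # x is local here too, and still unbound
        x = 1
    if x > 0:
        x = x - 1
    else:
        x += 1

try:
    spam()
    error = None
except UnboundLocalError as exc:
    error = exc

raised_at_first_test = error is not None
# UnboundLocalError is a subclass of NameError, so older handlers still catch it.
is_also_name_error = isinstance(error, NameError)
print(raised_at_first_test, is_also_name_error)
```

The global is never touched, and the failure is loud and immediate, which is the "nice, straightforward error" described above.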
-- Steven From merwok at netwok.org Mon May 9 18:42:06 2011 From: merwok at netwok.org (=?UTF-8?Q?=C3=89ric_Araujo?=) Date: Mon, 09 May 2011 18:42:06 +0200 Subject: [Python-Dev] Problems with regrtest and with logging In-Reply-To: <BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com> References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> <BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com> Message-ID: <190b407c9c414b46c8658307a88e5dfa@netwok.org> Hi, > When it comes to > containers, identity matters at least as much as value does (and > sometimes more so - e.g. sys.modules). Replacing those global > containers with new ones isn't guaranteed to work, as they may be > cached in various places rather than always retrieved fresh from the > relevant module namespace. Modifying them in place, on the other > hand, > does the right thing even in the presence of cached references. That makes sense, thanks for the explanation! > A comment to that effect may be a useful addition to regrtest, as I > expect others may have similar questions about those identity checks > in the future. (It may even be a useful addition to the > documentation, > but I have no idea where it could be sensibly included). Somewhere in unittest doc, say in the section about tearDown. Or maybe it's time for a Python testing best practices howto? > Also, don't be surprised if wholesale cleanup like that isn't > completely reliable - it's far, far better if the test case > understands the changes it is making (even indirectly) and explicitly > reverses them. Yep, I was probably bringing out the big guns too early. self.addCleanup(sys.path.remove, path) is better and even shorter than my previous code!
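The quoted point about identity versus value can be sketched as follows. This is only an illustration: `fake_sys` is a hypothetical stand-in for a module holding a global container (so the real sys.path is left alone), and `cached` plays the role of a reference some other module grabbed earlier.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a module with a global list such as sys.path.
fake_sys = SimpleNamespace(path=["a", "b"])
original = list(fake_sys.path)

# Restore by rebinding: the value looks right, but the identity changes.
cached = fake_sys.path                 # reference cached elsewhere
fake_sys.path.append("tmp")            # the test pollutes the container
fake_sys.path = list(original)         # "cleanup" installs a brand new list
rebinding_in_sync = cached == fake_sys.path   # cached list still holds "tmp"

# Restore in place: every holder of the list sees the cleanup.
cached = fake_sys.path
fake_sys.path.append("tmp")
fake_sys.path[:] = original            # slice assignment keeps the same object
in_place_in_sync = cached is fake_sys.path and cached == original

print(rebinding_in_sync, in_place_in_sync)
```

An equality-only check would accept the first cleanup even though the cached reference is now stale, which is presumably why an identity comparison is useful alongside it.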
Cheers From tjreedy at udel.edu Mon May 9 18:59:29 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 May 2011 12:59:29 -0400 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <iq8q3d$kfk$1@dough.gmane.org> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <iq8q3d$kfk$1@dough.gmane.org> Message-ID: <iq96he$1q8$1@dough.gmane.org> On 5/9/2011 9:27 AM, Stefan Behnel wrote: > Eli Bendersky, 09.05.2011 14:56: >> It's a known Python gotcha (*) that the following code: >> >> x = 5 >> def foo(): >> print(x) >> x = 1 >> print(x) >> foo() >> >> Will throw: >> >> UnboundLocalError: local variable 'x' referenced before assignment >> >> On the usage of 'x' in the *first* print. Recently, while reading the >> zillionth question on StackOverflow on some variation of this case, I >> started thinking whether this behavior is desired or just an >> implementation >> artifact. > > Well, basically any compiler these days can detect that a variable is > being used before assignment, or at least that this is possibly the > case, depending on prior branching. > > ISTM that your suggestion is to let x refer to the outer x up to the > assignment and to the inner x from that point on. IMHO, that's much > worse than the current behaviour and potentially impractical due to > conditional assignments. > > However, it's also a semantic change to reject code with unbound locals > at compile time, as the specific code in question may actually be > unreachable at runtime. This makes me think that it would be best to > discuss this on the python-ideas list first. > > If nothing else, I'd like to see a discussion on this behaviour being an > implementation detail of CPython or a feature of the Python language. 
> > Stefan > -- Terry Jan Reedy From tjreedy at udel.edu Mon May 9 19:24:20 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 May 2011 13:24:20 -0400 Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity Message-ID: <iq9802$e9m$1@dough.gmane.org> A commit (push) partitions time and behavior into before and after (with a short change period in between during which behavior is undefined). Some commit messages have the form 'x does y'. Does 'does' mean before or after? Sometimes that is clear. 'x crashes' means before. 'x returns correct value' means after. But some messages of this type are unclear to me as written. Consider 'x raises exception'. The temporal reference is obvious to the committer but not necessarily to everyone else. It could mean 'x used to segfault and now raises a catchable exception'. There was a fix like this (with a clear message) just today. It could also mean 'x used to raise but now returns an answer'. There have been many fixes like this. Two minimal fixes are 'x raised exception' or 'make x raise exception'. -- Terry Jan Reedy From vinay_sajip at yahoo.co.uk Mon May 9 19:40:03 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 9 May 2011 17:40:03 +0000 (UTC) Subject: [Python-Dev] Problems with regrtest and with logging Message-ID: <loom.20110509T193140-280@post.gmane.org> Éric Araujo <merwok <at> netwok.org> writes: > Yep, probably dead code. I think that an handler should be defined > only once, in the "if __name__ == '__main__'" block. Am I right? Just > like you don't call sys.exit from library code (hello optparse!), you > don't set logging handlers in library code, only in the outmost layer of > the script. That's right, though it's OK to provide a documented convenience API for adding handlers.
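A minimal sketch of that division of labour, using only the standard logging module. The logger name "hypothetical_lib" and the functions `do_work` and `main` are made up for illustration and are not the packaging code under discussion: the library side only obtains a logger and emits records, while the outermost script layer creates the single handler and picks the verbosity.

```python
import io
import logging

# Library side: get a logger and log; never install handlers here.
logger = logging.getLogger("hypothetical_lib")

def do_work():
    logger.debug("detail")
    logger.info("progress")

# Script side: the one place where a handler is created and removed.
def main(verbose=False, stream=None):
    handler = logging.StreamHandler(stream)   # stream=None means sys.stderr
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    try:
        do_work()
    finally:
        logger.removeHandler(handler)         # leave global logging state clean
        handler.close()

if __name__ == "__main__":
    main(verbose=True)

# Capturing the output in memory shows the effect of the verbosity switch.
quiet, chatty = io.StringIO(), io.StringIO()
main(verbose=False, stream=quiet)
main(verbose=True, stream=chatty)
```

Removing the handler in a `finally` block also keeps test runs from leaving extra handlers behind, which is the kind of leftover state regrtest reports.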
> The problem is that Dispatcher or Distribution get the quiet or verbose > options from the command-line deep in the library code, and want to use > it to configure the log level on the handler, which I?ve just said > should be set up at a much higher level. To solve this, I?m going to > add a *logginghandler* argument to Dispatcher/Distribution; that way, > the creation of the handler will happen only once and at a high level, > but the command-line parsing code will be able to set the log handler > from the command-line arguments. :) You don't necessarily need to set the level on the handler - why can you not just set it on the logger? The effect would often be the same: the logger's level is checked first, and then the handler's level. Generally you set levels on handlers when you want specific behaviour, such as all ERROR and above to a particular file, all CRITICAL to an email handler etc. For command-line scripts outputting to the console and nowhere else, usually you could just add a StreamHandler (with no level set on it), and set the level on the logger. Where the functionality may be used in an API, you should perhaps check logger.hasHandlers() and avoid adding handlers if there are already some added by a using library or application. Regards, Vinay Sajip From rdmurray at bitdance.com Mon May 9 19:54:45 2011 From: rdmurray at bitdance.com (R. 
David Murray) Date: Mon, 09 May 2011 13:54:45 -0400 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <7fa082450fb750d082e71d5070a62171@netwok.org> References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> <7fa082450fb750d082e71d5070a62171@netwok.org> Message-ID: <20110509175447.4DC56250039@webabinitio.net> On Mon, 09 May 2011 17:55:42 +0200, =?UTF-8?Q?=C3=89ric_Araujo?= <merwok at netwok.org> wrote: > Le 09/05/2011 16:08, Benjamin Peterson a ??crit : > > 2011/5/9 Victor Stinner <victor.stinner at haypocalc.com>: > >> For merge commits: many developers just write "merge" or "merge > >> 3.1". I > >> have to go to the parent commit (and something to the grandparent, > >> 3.1->3.2->3.3) to learn more about the commit. > > I follow conventions I???ve seen elsewhere (maybe Mercurial itself): I > use ???Branch merge??? when I merge anonymous branches on the same named > branch, and ???Merge x.y??? for forward-porting across named branches. > > I also tend to do more than one commit before merging. It would not be > very easy with my current toolchain to get the commit message(s) to > insert into the new message, and I think it???s not necessary. > > > I thought the whole point of merging was that you brought a changeset > > from one branch to another. This why I just write "merge" because > > otherwise you're technically duplicating information that is pulled > > onto the branch by merging. > > +1. No interest in manually duplicating available information. > > Le 09/05/2011 17:44, R. David Murray a ??crit : > > No it isn't. The commit message isn't pulled into the new branch. > > Sorry, your terminology does not make sense. If you mean that the > commit message is not reused in the new commit after the merge, it???s > true. However, the commit message with the relevant information is > available as part of the changesets that have been pulled and merged. 
The changesets are in the repository and there are pointers to them from the merge changeset, sure, but the data isn't in the checkout (that's how I understood "pulled in to the new branch"). If I do 'hg log' and search for a revno (that I got from hg annotate), the commit message describing the change is not attached to that revno, nor as far as I know is there a tool that makes it easy to get from that revno to the explanatory commit message. That's what Victor and I are talking about. Is there a tool that fixes this problem? (svnmerge did a nice job of that from the automate-the-message-generation end of things). -- R. David Murray http://www.bitdance.com From ned at nedbatchelder.com Mon May 9 20:36:44 2011 From: ned at nedbatchelder.com (Ned Batchelder) Date: Mon, 09 May 2011 14:36:44 -0400 Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity In-Reply-To: <iq9802$e9m$1@dough.gmane.org> References: <iq9802$e9m$1@dough.gmane.org> Message-ID: <4DC8343C.2050005@nedbatchelder.com> On 5/9/2011 1:24 PM, Terry Reedy wrote: > A commit (push) partition time and behavior into before and after > (with a short change period in between during which behavior is > undefined). > > Some commit messages have the form 'x does y'. Does 'does' mean before > or after? Sometimes that is clear. 'x crashes' means before. 'x return > correct value' means after. But some messages of this type are unclear > to me as written. > > Consider 'x raises exception'? The temporal reference is obvious to > the committer but not necessary to everyone else. It could mean 'x > used to segfault and now raises a catchable exception'. There was a > fix like this (with a clear message) just today. It could also mean 'x > used to raise but now return an answer. There have been many fixes > like this. > > Two minimal fixes are 'x raised exception' or 'make x raise exception'. > I've always favored "X now properly raises an exception." --Ned. 
From guido at python.org Mon May 9 21:17:45 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 9 May 2011 12:17:45 -0700 Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity In-Reply-To: <4DC8343C.2050005@nedbatchelder.com> References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com> Message-ID: <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder <ned at nedbatchelder.com> wrote: > On 5/9/2011 1:24 PM, Terry Reedy wrote: >> >> A commit (push) partition time and behavior into before and after (with a >> short change period in between during which behavior is undefined). >> >> Some commit messages have the form 'x does y'. Does 'does' mean before or >> after? Sometimes that is clear. 'x crashes' means before. 'x return correct >> value' means after. But some messages of this type are unclear to me as >> written. >> >> Consider 'x raises exception'? The temporal reference is obvious to the >> committer but not necessary to everyone else. It could mean 'x used to >> segfault and now raises a catchable exception'. There was a fix like this >> (with a clear message) just today. It could also mean 'x used to raise but >> now return an answer. There have been many fixes like this. >> >> Two minimal fixes are 'x raised exception' or 'make x raise exception'. >> > I've always favored "X now properly raises an exception." While my own preference is "make X properly raise an exception" I'm happy with any of the alternatives proposed here, and grateful to Terry for calling this out. Checkin comments of the form "X does Y" are ambiguous and confusing. (Same for feature requests in the tracker.) I'm curious where the habit to use the present tense comes from; I wonder if it originates in some agile development practice? 
-- --Guido van Rossum (python.org/~guido) From eric at trueblade.com Mon May 9 21:36:21 2011 From: eric at trueblade.com (Eric Smith) Date: Mon, 09 May 2011 15:36:21 -0400 Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity In-Reply-To: <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com> <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> Message-ID: <4DC84235.4060600@trueblade.com> On 05/09/2011 03:17 PM, Guido van Rossum wrote: > On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder <ned at nedbatchelder.com> wrote: >> On 5/9/2011 1:24 PM, Terry Reedy wrote: >>> >>> A commit (push) partition time and behavior into before and after (with a >>> short change period in between during which behavior is undefined). >>> >>> Some commit messages have the form 'x does y'. Does 'does' mean before or >>> after? Sometimes that is clear. 'x crashes' means before. 'x return correct >>> value' means after. But some messages of this type are unclear to me as >>> written. >>> >>> Consider 'x raises exception'? The temporal reference is obvious to the >>> committer but not necessary to everyone else. It could mean 'x used to >>> segfault and now raises a catchable exception'. There was a fix like this >>> (with a clear message) just today. It could also mean 'x used to raise but >>> now return an answer. There have been many fixes like this. >>> >>> Two minimal fixes are 'x raised exception' or 'make x raise exception'. >>> >> I've always favored "X now properly raises an exception." > > While my own preference is "make X properly raise an exception" I'm > happy with any of the alternatives proposed here, and grateful to > Terry for calling this out. Checkin comments of the form "X does Y" > are ambiguous and confusing. (Same for feature requests in the > tracker.) 
> > I'm curious where the habit to use the present tense comes from; I > wonder if it originates in some agile development practice? > Thanks indeed for bringing this up, Terry. It's been on my to-do list for a while. I think it comes from just copying the title of a bug report. The bug is "X does Y", and that's what's used in the fix. Eric. From guido at python.org Mon May 9 22:05:30 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 9 May 2011 13:05:30 -0700 Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity In-Reply-To: <4DC84235.4060600@trueblade.com> References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com> <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> <4DC84235.4060600@trueblade.com> Message-ID: <BANLkTi=BGR3D+RwHFff3rgUfBt4A9pOX5w@mail.gmail.com> On Mon, May 9, 2011 at 12:36 PM, Eric Smith <eric at trueblade.com> wrote: > On 05/09/2011 03:17 PM, Guido van Rossum wrote: >> On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder <ned at nedbatchelder.com> wrote: >>> On 5/9/2011 1:24 PM, Terry Reedy wrote: >>>> >>>> A commit (push) partition time and behavior into before and after (with a >>>> short change period in between during which behavior is undefined). >>>> >>>> Some commit messages have the form 'x does y'. Does 'does' mean before or >>>> after? Sometimes that is clear. 'x crashes' means before. 'x return correct >>>> value' means after. But some messages of this type are unclear to me as >>>> written. >>>> >>>> Consider 'x raises exception'? The temporal reference is obvious to the >>>> committer but not necessary to everyone else. It could mean 'x used to >>>> segfault and now raises a catchable exception'. There was a fix like this >>>> (with a clear message) just today. It could also mean 'x used to raise but >>>> now return an answer. There have been many fixes like this. >>>> >>>> Two minimal fixes are 'x raised exception' or 'make x raise exception'. 
>>>> >>> I've always favored "X now properly raises an exception." >> >> While my own preference is "make X properly raise an exception" I'm >> happy with any of the alternatives proposed here, and grateful to >> Terry for calling this out. Checkin comments of the form "X does Y" >> are ambiguous and confusing. (Same for feature requests in the >> tracker.) >> >> I'm curious where the habit to use the present tense comes from; I >> wonder if it originates in some agile development practice? >> > > Thanks indeed for bringing this up, Terry. It's been on my to-do list > for a while. I think it comes from just copying the title of a bug > report. The bug is "X does Y", and that's what's used in the fix. But in bug reports it is also ambiguous, since I've often seen it used meaning "X should do Y" which is very confusing when it doesn't do Y yet at the time the bug is created. :-( -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Tue May 10 00:59:41 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 May 2011 18:59:41 -0400 Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity In-Reply-To: <BANLkTi=BGR3D+RwHFff3rgUfBt4A9pOX5w@mail.gmail.com> References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com> <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> <4DC84235.4060600@trueblade.com> <BANLkTi=BGR3D+RwHFff3rgUfBt4A9pOX5w@mail.gmail.com> Message-ID: <iq9rkr$2j7$1@dough.gmane.org> On 5/9/2011 4:05 PM, Guido van Rossum wrote: > On Mon, May 9, 2011 at 12:36 PM, Eric Smith<eric at trueblade.com> wrote: >> On 05/09/2011 03:17 PM, Guido van Rossum wrote: >>> While my own preference is "make X properly raise an exception" I'm >>> happy with any of the alternatives proposed here, and grateful to >>> Terry for calling this out. I am willing to admit that I do not know all corners of Python ;-) I read the commit messages to learn more; in particular what sort of errors exist and how are they fixed. 
>>> Checkin comments of the form "X does Y" >>> are ambiguous and confusing. (Same for feature requests in the >>> tracker.) I have always assumed that an issue entitled 'x does y' is a bug report about doing y now, before a fix. >> Thanks indeed for bringing this up, Terry. It's been on my to-do list >> for a while. I think it comes from just copying the title of a bug >> report. The bug is "X does Y", and that's what's used in the fix. I have also seen this type of message for non-tracker-issue commits. > But in bug reports it is also ambiguous, since I've often seen it used > meaning "X should do Y" which is very confusing when it doesn't do Y > yet at the time the bug is created. :-( If I notice a title that bad, I will try to change it. -- Terry Jan Reedy From tjreedy at udel.edu Tue May 10 01:03:24 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 May 2011 19:03:24 -0400 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <20110509175447.4DC56250039@webabinitio.net> References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> <7fa082450fb750d082e71d5070a62171@netwok.org> <20110509175447.4DC56250039@webabinitio.net> Message-ID: <iq9rrq$3nh$1@dough.gmane.org> On 5/9/2011 1:54 PM, R. David Murray wrote: > If I do 'hg log' and search for a revno (that I got from hg annotate), > the commit message describing the change is not attached to that revno, > nor as far as I know is there a tool that makes it easy to get from that > revno to the explanatory commit message. That's what Victor and I are > talking about. Is there a tool that fixes this problem? (svnmerge did a > nice job of that from the automate-the-message-generation end of things). TortoiseSVN, and I presume TortoiseHg also, has a 'recent messages' box that makes it trivial to reuse a message. I used it with svn and will make sure to use it, if it exists, when I get started with hg.
-- Terry Jan Reedy From benjamin at python.org Tue May 10 01:23:45 2011 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 9 May 2011 18:23:45 -0500 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <20110509154416.35BBF250037@webabinitio.net> References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> <20110509154416.35BBF250037@webabinitio.net> Message-ID: <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com> 2011/5/9 R. David Murray <rdmurray at bitdance.com>: > On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson <benjamin at python.org> wrote: >> I thought the whole point of merging was that you brought a changeset >> from one branch to another. This why I just write "merge" because >> otherwise you're technically duplicating information that is pulled >> onto the branch by merging. > > No it isn't. The commit message isn't pulled into the new branch. > >> It seems like something that should be solved by tools like a display >> visual graph indicating what is merged. (like Bazaar) > > You'd need some extension to hg log that would show the original commit > message for the first changeset in the merge line in order to "fix" > this. I doubt that is going to happen. *cough* http://mercurial.selenic.com/wiki/GraphlogExtension > > Note that saying just 'merge' makes perfect sense when you are pulling > in a whole group of changesets in order to synchronize two branches. > But if you are applying a single changeset to multiple branches, > as we often do in our workflow, then I think duplicating the commit > message is (1) easy to do and (2) very helpful when looking at > hg log output. What's the difference between pulling in multiple changesets and pulling in one, then?
-- Regards, Benjamin From nyamatongwe at gmail.com Tue May 10 01:52:49 2011 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Tue, 10 May 2011 09:52:49 +1000 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <1304950275.22910.32.camel@marge> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> Message-ID: <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> Victor Stinner: > C and C++ identifiers are restricted to ASCII. I don't know for Fortran > or Java. Some C and C++ implementations currently allow non-ASCII identifiers and the forthcoming C1X and C++0x language standards include non-ASCII identifiers. The allowed characters are specified in Annexes of the respective standards. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E Neil From solipsis at pitrou.net Tue May 10 02:06:03 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 10 May 2011 02:06:03 +0200 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> Message-ID: <20110510020603.25774eb9@pitrou.net> On Mon, 09 May 2011 16:11:15 +0200 Victor Stinner <victor.stinner at haypocalc.com> wrote: > Le lundi 09 mai 2011 à 09:00 -0400, Jim Jewett a écrit : > > Are you asserting that all foreign modules (or at least all handled by > > this) are in C, as opposed to C++ or even Java or Fortran? (And the C > > won't change?) > > C and C++ identifiers are restricted to ASCII. I don't know for Fortran > or Java. Why is it important, though? What matters is not what C/C++ can produce, but what a shared library can export.
So the question is: are shared libraries limited to ASCII symbols? Regards Antoine. From greg.ewing at canterbury.ac.nz Tue May 10 02:13:47 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 May 2011 12:13:47 +1200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> <20110506122703.17c4d889@pitrou.net> <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com> Message-ID: <4DC8833B.9080503@canterbury.ac.nz> Nick Coghlan wrote: > One interesting aspect is that from the caller's point of view, a > *new* reference to the relevant behaves like a borrowed reference for > input parameters, but like a stolen reference for output parameters > and return values. I think it's less confusing to use the term "new" only for output/return values, and "stolen" only for input values. Inputs are either "borrowed" or "stolen" (by the callee). Outputs are either "new" (to the caller) or "borrowed" (by the caller). (Or maybe the terms for outputs should be "given" and "lent"?-) -- Greg From victor.stinner at haypocalc.com Tue May 10 02:57:14 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 10 May 2011 02:57:14 +0200 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> Message-ID: <1304989034.29582.7.camel@marge> Le mardi 10 mai 2011 à
09:52 +1000, Neil Hodgson a écrit : > Some C and C++ implementations currently allow non-ASCII > identifiers and the forthcoming C1X and C++0x language standards > include non-ASCII identifiers. The allowed characters are specified in > Annexes of the respective standards. > > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E I read these documents but they don't explain which encoding is used in libraries and programs. Does it mean that Windows and Linux may use different encodings? At least the surrogate range (U+D800-U+DFFF) is excluded, which is good news (the UTF-8 decoder of Python 3 rejects surrogate characters). I discovered the -fextended-identifiers option of gcc: using this option, you can use \uHHHH and \UHHHHHHHH in identifiers, but not \xHH. On Linux, identifiers are encoded to UTF-8. Example:
--------------
#define _ISOC99_SOURCE
#include <stdio.h>
#include <wchar.h>   /* declares wprintf() */

/* Identifiers spelled with universal character names; the helpers
   return nothing, so they are declared void. */
void f\u00E9() { wprintf(L"U+00E9 = \xE9\n"); }
void g\U000000E8() { wprintf(L"U+00E8 = \xE8\n"); }

int main() { f\u00E9(); g\U000000E8(); return 0; }
--------------
It's not very practical; I would prefer to write Unicode characters directly (as I can do in Python 3!). I'm not sure that Chinese developers would prefer to call \u4f60\u597d() instead of hello(). Ok, I now agree, it is possible to use non-ASCII characters in C. But what about the encoding of symbols in a dynamic library: is it always UTF-8?
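As a rough illustration of the encoding question (a sketch only, not taken from the import machinery): the snippet below takes the identifier from the C example above and shows how it fares under ASCII and UTF-8 from the Python 3 side, plus the surrogate rejection mentioned earlier. The variable names are made up for the illustration, and whether a given dynamic loader accepts the resulting UTF-8 bytes as a symbol name is exactly the open question, so nothing here answers it.

```python
# Illustrative sketch: variable names are hypothetical, not from CPython.
name = "f\u00e9"  # the identifier used in the C example

# Python 3 accepts this as an identifier.
valid_identifier = name.isidentifier()

# UTF-8 needs two bytes for U+00E9.
utf8_symbol = name.encode("utf-8")

# An explicit ASCII encoding cannot represent the name at all.
try:
    name.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The byte sequence ED B2 80 would decode to the lone surrogate U+DC80
# if surrogates were allowed; Python 3's UTF-8 decoder refuses it.
try:
    b"\xed\xb2\x80".decode("utf-8")
    surrogate_rejected = False
except UnicodeDecodeError:
    surrogate_rejected = True
```

So an ASCII-only encoding step fails cleanly for such a name, while a UTF-8 step produces bytes whose handling depends on the platform loader.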
Victor From nyamatongwe at gmail.com Tue May 10 03:08:32 2011 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Tue, 10 May 2011 11:08:32 +1000 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <1304989034.29582.7.camel@marge> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> <1304989034.29582.7.camel@marge> Message-ID: <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com> Victor Stinner: > I read these documents but they don't explain which encoding is used in > libraries and programs. Does it mean that Windows and Linux may use > different encodings? Yes, Windows will use UTF-16 as it does for almost everything. From a user's point of view, these should both just be seen as Unicode. Neil From greg.ewing at canterbury.ac.nz Tue May 10 03:28:04 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 May 2011 13:28:04 +1200 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <20110510005835.GA29281@rectangular.com> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> <20110506122703.17c4d889@pitrou.net> <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com> <4DC8833B.9080503@canterbury.ac.nz> <20110510005835.GA29281@rectangular.com> Message-ID: <4DC894A4.2000709@canterbury.ac.nz> Marvin Humphrey wrote: > incremented: The caller has to account for an additional refcount. > decremented: The caller has to account for a lost refcount. I'm not sure that really clarifies anything. 
These terms sound like they're talking about the reference count of the object, but if they correspond to borrowed/stolen, they don't necessarily correlate with what actually happens to the reference count. -- Greg From marvin at rectangular.com Tue May 10 02:58:35 2011 From: marvin at rectangular.com (Marvin Humphrey) Date: Mon, 9 May 2011 17:58:35 -0700 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <4DC8833B.9080503@canterbury.ac.nz> References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> <20110506122703.17c4d889@pitrou.net> <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com> <4DC8833B.9080503@canterbury.ac.nz> Message-ID: <20110510005835.GA29281@rectangular.com> On Tue, May 10, 2011 at 12:13:47PM +1200, Greg Ewing wrote: > Nick Coghlan wrote: > >> One interesting aspect is that from the caller's point of view, a >> *new* reference to the relevant behaves like a borrowed reference for >> input parameters, but like a stolen reference for output parameters >> and return values. > > I think it's less confusing to use the term "new" only for > output/return values, and "stolen" only for input values. > > Inputs are either "borrowed" or "stolen" (by the callee). > > Outputs are either "new" (to the caller) or "borrowed" > (by the caller). > > (Or maybe the terms for outputs should be "given" and "lent"?-) To solve this problem in a similar system (the Clownfish object system used by Apache Lucy) we used the keywords "incremented" and "decremented". Applied to some Python C API function documentation: incremented PyObject* PyTuple_New(Py_ssize_t len) int PyTuple_SetItem(PyObject *p, Py_ssize_t pos, decremented PyObject *o) With "incremented" and "decremented", the perspective is always that of the caller. 
incremented: The caller has to account for an additional refcount. decremented: The caller has to account for a lost refcount. Marvin Humphrey From rdmurray at bitdance.com Tue May 10 03:32:46 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 09 May 2011 21:32:46 -0400 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com> References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> <20110509154416.35BBF250037@webabinitio.net> <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com> Message-ID: <20110510013247.655DA250037@webabinitio.net> On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson <benjamin at python.org> wrote: > 2011/5/9 R. David Murray <rdmurray at bitdance.com>: > > On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson <benjamin at python.org> wrote: > >> I thought the whole point of merging was that you brought a changeset > >> from one branch to another. This why I just write "merge" because > >> otherwise you're technically duplicating information that is pulled > >> onto the branch by merging. > > > > No it isn't. The commit message isn't pulled into the new branch. > > > >> It seems like something that should be solved by tools like a display > >> visual graph indicating what is merged. (like Bazaar) > > > > You'd need some extension to hg log that would show the original commit > > message for the first changeset in the merge line in order to "fix" > > this. I doubt that is going to happen. > > *cough* http://mercurial.selenic.com/wiki/GraphlogExtension I'm sorry, but I've looked at the output of that and the mental overhead has so far proven too high for it to be of any use to me. I apologize for not having made the full mental transition to "distributed VCS"/DAG (apparently), but it sounds like I'm not the only one....
> > Note that saying just 'merge' makes perfect sense when you are pulling > > in a whole group of changesets in order to synchronize two branches. > > But if you are applying a single changeset to multiple branches, > > as we often do in our workflow, then I think duplicating the commit > > message is (1) easy to do and (2) very helpful when looking at > > hg log output. > > What's the difference between pulling multiple changesets in and one then? I'm talking about merging trunk to a feature branch, for example. I'd not expect any message other than 'merge' for that. I'd be satisfied if the commit messages listed the issue numbers involved in the merge, especially if someone (like Éric) is merging more than one change at a time. But as I think about this, frankly I'd rather see atomic commits, even on merges. That was something I disliked about svnmerge, the fact that often an svnmerge commit involved many changesets from the other branch. That was especially painful in exactly the same situation: trying to backtrack a change starting from 'svn blame'. I limited my own use of multiple-changeset-svnmerge to doc changes and changesets that were actually related, despite the overhead involved in doing it that way. All that said, I'm not trying to impose my will on the workflow, I'll certainly live with the consensus (though unless there is an outcry against it I'll continue putting the full commit message in my own merges). -- R.
David Murray http://www.bitdance.com From marvin at rectangular.com Tue May 10 03:53:10 2011 From: marvin at rectangular.com (Marvin Humphrey) Date: Mon, 9 May 2011 18:53:10 -0700 Subject: [Python-Dev] Borrowed and Stolen References in API In-Reply-To: <4DC894A4.2000709@canterbury.ac.nz> References: <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> <20110506122703.17c4d889@pitrou.net> <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com> <4DC8833B.9080503@canterbury.ac.nz> <20110510005835.GA29281@rectangular.com> <4DC894A4.2000709@canterbury.ac.nz> Message-ID: <20110510015310.GA29407@rectangular.com> On Tue, May 10, 2011 at 01:28:04PM +1200, Greg Ewing wrote: > Marvin Humphrey wrote: > >> incremented: The caller has to account for an additional refcount. >> decremented: The caller has to account for a lost refcount. > > I'm not sure that really clarifies anything. These terms > sound like they're talking about the reference count of the > object, but if they correspond to borrowed/stolen, they > don't necessarily correlate with what actually happens to > the reference count. Hmm, they don't correspond to borrowed/stolen. stolen from the caller -> decremented stolen from the callee -> incremented borrowed -> [no modifier] We don't have a modifier keyword which is analogous to "borrowed". The user is expected to understand object lifespan issues for borrowed references without explicit guidance. With regards to "what actually happens to the reference count", I would argue that "incremented" and "decremented" are accurate descriptions. * When a function returns an "incremented" object, that function has added a refcount to it. * When a function accepts a "decremented" object as an argument, it will consume a refcount from it -- either right away, or at some point in the future. 
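A loose Python-level analogy for this bookkeeping (a sketch, not the C API and not Clownfish): on CPython, sys.getrefcount() lets one watch an owner gain and then give up a reference. The names below are hypothetical, the absolute counts are implementation details, and only the differences are meaningful.

```python
import sys

obj = object()
before = sys.getrefcount(obj)

holder = [obj]                 # the list now owns one reference to obj
during = sys.getrefcount(obj)

del holder                     # the list is freed and releases its reference
after = sys.getrefcount(obj)

gained = during - before       # one additional refcount to account for
lost = after - during          # one refcount given back
```

In these terms, "incremented" marks the case where that extra count becomes the caller's responsibility, and "decremented" the case where the callee takes over a count the caller used to own.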
In my view, it is not desirable to label arguments or return values as "borrowed"; it is only necessary to advise the user when they must take action to account for a refcount, gained or lost. Cheers, Marvin Humphrey From stephen at xemacs.org Tue May 10 04:51:19 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 10 May 2011 11:51:19 +0900 Subject: [Python-Dev] Commit changelog: issue number and merges In-Reply-To: <20110510013247.655DA250037@webabinitio.net> References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> <20110509154416.35BBF250037@webabinitio.net> <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com> <20110510013247.655DA250037@webabinitio.net> Message-ID: <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp> R. David Murray writes: > On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson <benjamin at python.org> wrote: > > *cough* http://mercurial.selenic.com/wiki/GraphlogExtension > > I'm sorry, but I've looked at the output of that and the mental overhead > has so far proven too high for it to be of any use to me. How about the hgk extension, and "hg view"? http://mercurial.selenic.com/wiki/HgkExtension > But as I think about this, frankly I'd rather see atomic commits, even > on merges. That was something I disliked about svnmerge, the fact that > often an svnmerge commit involved many changesets from the other branch. > That was especially painful in exactly the same situation: trying to > backtrack a change starting from 'svn blame'. I don't understand the issue. In my experience, hg annotate will point to the commit on the branch, not to the merge, unless there was a conflict, in which case the merge is the "right" place (although not necessarily the most useful place) to point. 
From murman at gmail.com Tue May 10 05:18:10 2011 From: murman at gmail.com (Michael Urman) Date: Mon, 9 May 2011 22:18:10 -0500 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> <1304989034.29582.7.camel@marge> <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com> Message-ID: <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com> On Mon, May 9, 2011 at 20:08, Neil Hodgson <nyamatongwe at gmail.com> wrote: > Yes, Windows will use UTF-16 as it does for almost everything. From > a user's point of view, these should both just be seen as Unicode. I'm not convinced this is correct for this case. GetProcAddress takes an "ANSI" string, meaning while it could theoretically use UTF-8, in practice I doubt it uses anything outside of ASCII safely. So while the name of the library would be encoded in UTF-16, the name of the function loaded from the library would not be.
http://msdn.microsoft.com/en-us/library/ms683212(v=vs.85).aspx -- Michael Urman From nyamatongwe at gmail.com Tue May 10 06:09:06 2011 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Tue, 10 May 2011 14:09:06 +1000 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> <1304989034.29582.7.camel@marge> <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com> <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com> Message-ID: <BANLkTi=J4GGFY2FQF37ahnJJ69csmTTi2Q@mail.gmail.com> Michael Urman: > I'm not convinced this is correct for this case. GetProcAddress takes > an "ANSI" string, meaning while it could theoretically use UTF-8, in > practice I doubt it uses anything outside of ASCII safely. So while > the name of the library would be encoded in UTF-16, the name of the > function loaded from the library would not be. 
Yes you are right: http://scintilla.org/NarrowName.png Neil From murman at gmail.com Tue May 10 06:23:54 2011 From: murman at gmail.com (Michael Urman) Date: Mon, 9 May 2011 23:23:54 -0500 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <BANLkTi=J4GGFY2FQF37ahnJJ69csmTTi2Q@mail.gmail.com> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> <1304989034.29582.7.camel@marge> <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com> <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com> <BANLkTi=J4GGFY2FQF37ahnJJ69csmTTi2Q@mail.gmail.com> Message-ID: <BANLkTincMf_YA8exSrr1=R8LWnC3Ty3VvQ@mail.gmail.com> On Mon, May 9, 2011 at 23:09, Neil Hodgson <nyamatongwe at gmail.com> wrote: > Michael Urman: > >> I'm not convinced this is correct for this case. GetProcAddress takes >> an "ANSI" string, meaning while it could theoretically use UTF-8, in >> practice I doubt it uses anything outside of ASCII safely. So while >> the name of the library would be encoded in UTF-16, the name of the >> function loaded from the library would not be. > > Yes you are right: > http://scintilla.org/NarrowName.png > > Neil > That screenshot seems to show UTF-8 is being used. This may just be the literal bytes in the .c file, but could it be something more dependable?
http://unicode.org/cgi-bin/GetUnihanData.pl?codepoint=6728 From nyamatongwe at gmail.com Tue May 10 06:35:27 2011 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Tue, 10 May 2011 14:35:27 +1000 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <BANLkTincMf_YA8exSrr1=R8LWnC3Ty3VvQ@mail.gmail.com> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> <1304989034.29582.7.camel@marge> <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com> <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com> <BANLkTi=J4GGFY2FQF37ahnJJ69csmTTi2Q@mail.gmail.com> <BANLkTincMf_YA8exSrr1=R8LWnC3Ty3VvQ@mail.gmail.com> Message-ID: <BANLkTinJ-mcY5A-W0EUY7qfJRNu8x7TKHg@mail.gmail.com> Michael Urman: > That screenshot seems to show UTF-8 is being used. This may just be > the literal bytes in the .c file, but could it be something more > dependable? The file is in UTF-8 so the compiler may just be copying the bytes. There is a setlocale pragma but that seems to be just for string literals. 
Neil

From eliben at gmail.com Tue May 10 07:36:38 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 10 May 2011 08:36:38 +0300
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
Message-ID: <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>

On Mon, May 9, 2011 at 18:44, Isaac Morland <ijmorlan at uwaterloo.ca> wrote:
> On Mon, 9 May 2011, Eli Bendersky wrote:
>
>  x = 5
>>> def foo ():
>>>     print (x)
>>>     if bar ():
>>>         x = 1
>>>     print (x)
>>>
>>>
>> I wish you'd annotate this code sample, what do you intend it to
>> demonstrate?
>>
>> It probably shows the original complaint even more strongly. As for being a
>> problem with the suggested solution, I suppose you're right, although it
>> doesn't make it much different. Still, before a *possible* assignment to
>> 'x', it should be loaded as LOAD_NAME since it was surely not bound as
>> local, yet.
>>
>
> Extrapolating from your suggestion, you're saying before a *possible*
> assignment it will be treated as global, and after a *possible* assignment
> it will be treated as local?
>
> But surely:
>
> print (x)
> if False:
>     x = 1
> print (x)
>
> [snip]

Alright, I now understand the problems with the suggestion. Indeed, conditional assignments that are only really resolved at runtime are the big stumbling block here. However, maybe the error message/reporting can still be improved?

ISTM the UnboundLocalError exception gets raised only in those weird and confusing cases. After all, why would Python decide an access to some name is to a local? Only if it found an assignment to that local in the scope. But that assignment clearly didn't happen yet, so the error is thrown.
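
To make the rule in the last paragraph concrete, here is a minimal sketch (not part of the original message; the function names are invented for illustration). It assumes only standard CPython scoping: an assignment anywhere in a function body makes the name local for the whole body, so a read that runs before the assignment fails.

```python
x = 2  # module-level (global) binding


def reads_global():
    # No assignment to x anywhere in this function,
    # so x is looked up as a global.
    return x


def reads_before_conditional_assignment(flag):
    # The assignment below makes x local for the whole function body,
    # even though it only executes when flag is true.
    if flag:
        x = 10
    return x


def outcome(flag):
    # Hypothetical helper: report either the value read or the
    # name of the exception that the read raised.
    try:
        return reads_before_conditional_assignment(flag)
    except UnboundLocalError as exc:
        return type(exc).__name__
```

With these definitions, outcome(False) reports UnboundLocalError because the local x was never bound, while reads_global() still sees the module-level value.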
So cases like these:

    x = 2
    def foo1():
        x += 1

    def foo2():
        print(x)
        x = 10

    def foo3():
        if something_that_didnot_happen:
            x = 10
        print(x)

All belong to the category.

With an unlimited error message length it could make sense to say "Hey, I see 'x' may be assigned in this scope, so I mark it local. But this access to 'x' happens before assignment - so ERROR". This isn't realistic, of course, so I'm wondering:

1. Does this error message (although unrealistic) capture all possible appearances of UnboundLocalError?
2. If the answer to (1) is yes - could it be usefully shortened to be clearer than the current "local variable referenced before assignment"?

This may not be possible, of course, but it doesn't harm trying :-)

Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110510/95de9ed5/attachment.html>

From stefan_ml at behnel.de Tue May 10 08:16:24 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 10 May 2011 08:16:24 +0200
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <iq8q3d$kfk$1@dough.gmane.org>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <iq8q3d$kfk$1@dough.gmane.org>
Message-ID: <iqal7o$g96$2@dough.gmane.org>

[forwarded to the python-ideas list]

Stefan

From victor.stinner at haypocalc.com Tue May 10 10:03:29 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 10 May 2011 10:03:29 +0200
Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> <1304989034.29582.7.camel@marge> <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com> <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
Message-ID: <1305014609.2014.6.camel@marge>

On Monday 09 May 2011 at 22:18 -0500, Michael Urman wrote:
> On Mon, May 9, 2011 at 20:08, Neil Hodgson <nyamatongwe at gmail.com> wrote:
> > Yes, Windows will use UTF-16 as it does for almost everything. From
> > a user's point of view, these should both just be seen as Unicode.
>
> I'm not convinced this is correct for this case. GetProcAddress takes
> an "ANSI" string, meaning while it could theoretically use UTF-8, in
> practice I doubt it uses anything outside of ASCII safely.

If GetProcAddress() expects a byte string encoded to the ANSI code page, my patch is correct because the function used the UTF-8 encoding, not the ANSI code page. We can maybe use GetProcAddressW() to pass a Unicode string. I don't know which encoding is used by GetProcAddressW()...

I already patched _PyImport_GetDynLoadFunc() for Windows: the path is now a Unicode object instead of a byte string encoded to the filesystem encoding. _PyImport_GetDynLoadWindows() uses GetFullPathNameW() and LoadLibraryExW(). The work to be fully Unicode compliant (for the path field, not for the name) is not completely done... but I have a pending patch, see:
http://bugs.python.org/issue11619

But this patch is huge and creates many functions. I am not sure that we need it, I will work on this later.

Victor

From orsenthil at gmail.com Tue May 10 10:37:42 2011
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Tue, 10 May 2011 16:37:42 +0800
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12039: Add end_headers() call to avoid BadStatusLine.
In-Reply-To: <E1QJi1H-0003qr-Cd@dinsdale.python.org>
References: <E1QJi1H-0003qr-Cd@dinsdale.python.org>
Message-ID: <20110510083742.GA16239@kevin>

On Tue, May 10, 2011 at 10:10:15AM +0200, vinay.sajip wrote:
> diff --git a/Lib/test/test_logging.py b/Lib/test/test_logging.py
> --- a/Lib/test/test_logging.py
> +++ b/Lib/test/test_logging.py
> @@ -1489,6 +1489,7 @@
>              except:
>                  self.post_data = None
>              request.send_response(200)
> +            request.end_headers()
>              self.handled.set()

This is accurate. It should have resulted from the change made in http.server, because the headers are now cached and then written to the output stream in one shot when end_headers/flush_headers are called.

Thanks,
Senthil

From aurelien.campeas at logilab.fr Tue May 10 13:51:41 2011
From: aurelien.campeas at logilab.fr (Aurélien Campéas)
Date: Tue, 10 May 2011 13:51:41 +0200
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> <20110509154416.35BBF250037@webabinitio.net> <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com> <20110510013247.655DA250037@webabinitio.net> <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4DC926CD.1040108@logilab.fr>

On 10/05/2011 04:51, Stephen J. Turnbull wrote:
> R. David Murray writes:
> > On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson<benjamin at python.org> wrote:
>
> > > *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
> >
> > I'm sorry, but I've looked at the output of that and the mental overhead
> > has so far proven too high for it to be of any use to me.
>
> How about the hgk extension, and "hg view"?
>
> http://mercurial.selenic.com/wiki/HgkExtension
>

or, FWIW, "hgview" (http://www.logilab.org/project/hgview)

From ncoghlan at gmail.com Tue May 10 14:29:58 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 22:29:58 +1000
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <4DC84235.4060600@trueblade.com>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com> <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> <4DC84235.4060600@trueblade.com>
Message-ID: <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>

On Tue, May 10, 2011 at 5:36 AM, Eric Smith <eric at trueblade.com> wrote:
> Thanks indeed for bringing this up, Terry. It's been on my to-do list
> for a while. I think it comes from just copying the title of a bug
> report. The bug is "X does Y", and that's what's used in the fix.

I believe I've actually seen it in NEWS entries as well (although thankfully not often and I can't recall any specific instances off the top of my head).

I'm also a fan of including the word "now" and describing the new behaviour, although I'll sometimes use "no longer" and describe the old behaviour for some bugs where that seems more appropriate.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Tue May 10 14:44:29 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 22:44:29 +1000
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <20110510015310.GA29407@rectangular.com>
References: <4DC11791.2000109@dcs.gla.ac.uk> <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com> <4DC1D1C5.9010507@canterbury.ac.nz> <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com> <4DC34EAB.9050001@canterbury.ac.nz> <20110506122703.17c4d889@pitrou.net> <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com> <4DC8833B.9080503@canterbury.ac.nz> <20110510005835.GA29281@rectangular.com> <4DC894A4.2000709@canterbury.ac.nz> <20110510015310.GA29407@rectangular.com>
Message-ID: <BANLkTinzYcvKVzP8heCUy0uC8nerqv7beQ@mail.gmail.com>

On Tue, May 10, 2011 at 11:53 AM, Marvin Humphrey <marvin at rectangular.com> wrote:
> With regards to "what actually happens to the reference count", I would argue
> that "incremented" and "decremented" are accurate descriptions.
>
>  * When a function returns an "incremented" object, that function has added
>    a refcount to it.

Except that's not quite true in cases like PySet_Pop(). In that case, the net effect on the refcount is neutral. The significant point is that the set no longer holds a reference, it has passed that responsibility back to the caller.

> In my view, it is not desirable to label arguments or return values as
> "borrowed"; it is only necessary to advise the user when they must take action
> to account for a refcount, gained or lost.

Agreed on this part, though. Callers need to know when:

1. The return value is a new reference that must be decremented (currently indicated in the docs by "Return value: New reference")
2. An input parameter transfers responsibility for the reference to the callee (the only example I noticed in the docs is PyList_SetItem, which uses an explicit note rather than any kind of markup or the refcount data)

I believe the current refcount data covers the first case reasonably well, but not the latter (and still has the problem of being separated from the documentation of the functions themselves).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rdmurray at bitdance.com Tue May 10 14:50:34 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 08:50:34 -0400
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> <20110509154416.35BBF250037@webabinitio.net> <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com> <20110510013247.655DA250037@webabinitio.net> <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20110510125035.8F20B250041@webabinitio.net>

On Tue, 10 May 2011 11:51:19 +0900, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> R. David Murray writes:
> > On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson <benjamin at python.org> wrote:
>
> > > *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
> >
> > I'm sorry, but I've looked at the output of that and the mental overhead
> > has so far proven too high for it to be of any use to me.
>
> How about the hgk extension, and "hg view"?

I think the problem is in my brain, not the graphical tools :) With rare exceptions I don't use tools that require a mouse to operate, though, so unless I feel like doing tcl hacking to make good keyboard bindings that particular tool won't help me anyway.

> > But as I think about this, frankly I'd rather see atomic commits, even
> > on merges. That was something I disliked about svnmerge, the fact that
> > often an svnmerge commit involved many changesets from the other branch.
> > That was especially painful in exactly the same situation: trying to
> > backtrack a change starting from 'svn blame'.
>
> I don't understand the issue. In my experience, hg annotate will
> point to the commit on the branch, not to the merge, unless there was
> a conflict, in which case the merge is the "right" place (although not
> necessarily the most useful place) to point.

That's what I get for reasoning about hg based on my svn experience. Someone on IRC also pointed this out. I haven't done this often enough in HG for the difference to have penetrated my brain. I have a feeling I'm still going to get confused occasionally, but then I'm sure I did with svn too...

--
R. David Murray           http://www.bitdance.com

From ncoghlan at gmail.com Tue May 10 14:59:02 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 22:59:02 +1000
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1304937168.22910.21.camel@marge> <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com> <20110509154416.35BBF250037@webabinitio.net> <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com> <20110510013247.655DA250037@webabinitio.net> <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <BANLkTinuY1qCp0oQjbqJDyEk2Cg7kH2Spg@mail.gmail.com>

On Tue, May 10, 2011 at 12:51 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> R. David Murray writes:
> > On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson <benjamin at python.org> wrote:
>
> > > *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
> >
> > I'm sorry, but I've looked at the output of that and the mental overhead
> > has so far proven too high for it to be of any use to me.
>
> How about the hgk extension, and "hg view"?
>
> http://mercurial.selenic.com/wiki/HgkExtension

I don't think it's really a jump up to the "graphical" level that we're after. It's more a matter of:

1. Display commit message for current commit
2. Notice that this commit has two parents
3. Ignore any parent commit in the same branch as the current commit
4. For a parent commit in another branch, also display that commit message
5. If the commit in step 4 also has multiple parents, repeat from step 3 (but based off that parent commit)

So a standard 3.1->3.2->default merge could be displayed along the lines of:

  Merge from 3.2
  3.2: Merge from 3.1
  3.1: Issue #123456: mod.func now works correctly when argument is negative

It won't help if the last commit on the initial branch was something boring like "Fix whitespace", but it will be adequate for our typical single-commit bug fix workflow.

(If nobody does anything before then, I'll see what I can do with the email hook next week)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rdmurray at bitdance.com Tue May 10 15:11:44 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 09:11:44 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca> <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
Message-ID: <20110510131144.C8D75250041@webabinitio.net>

On Tue, 10 May 2011 08:36:38 +0300, Eli Bendersky <eliben at gmail.com> wrote:
> With an unlimited error message length it could make sense to say "Hey, I
> see 'x' may be assigned in this scope, so I mark it local. But this access
> to 'x' happens before assignment - so ERROR". This isn't realistic, of
> course, so I'm wondering:
>
> 1. Does this error message (although unrealistic) capture all possible
> appearances of UnboundLocalError?
> 2. If the answer to (1) is yes - could it be usefully shortened to be
> clearer than the current "local variable referenced before assignment"?
>
> This may not be possible, of course, but it doesn't harm trying :-)

How about:

"reference to variable 'y' precedes an assignment that makes it a local variable"

IMO this still leaves room for confusion, but is better because it indicates the causation more clearly. (I don't think it is necessary to capture the subtlety of conditional assignment in the error message.)

--
R. David Murray           http://www.bitdance.com

From rdmurray at bitdance.com Tue May 10 15:33:13 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 09:33:13 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com> <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> <4DC84235.4060600@trueblade.com> <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
Message-ID: <20110510133314.66B48250041@webabinitio.net>

On Tue, 10 May 2011 22:29:58 +1000, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, May 10, 2011 at 5:36 AM, Eric Smith <eric at trueblade.com> wrote:
> > Thanks indeed for bringing this up, Terry. It's been on my to-do list
> > for a while. I think it comes from just copying the title of a bug
> > report. The bug is "X does Y", and that's what's used in the fix.
>
> I believe I've actually seen it in NEWS entries as well (although
> thankfully not often and I can't recall any specific instances off the
> top of my head).
>
> I'm also a fan of including the word "now" and describing the new
> behaviour, although I'll sometimes use "no longer" and describe the
> old behaviour for some bugs where that seems more appropriate.
I generally don't use the same text for commit and NEWS, because I like to stick to one-liners for the first line of the commit, possibly with more detail in the body, while for NEWS items I'm aiming for a one to three line description. But in both cases what I'm thinking about is "what have I *changed*". In the commit message that will probably focus more on code changes, while the NEWS item will focus more on behavior changes, but the results are generally similar. So for example my most recent two comments look like this: commit: 11999: sync based on comparing mtimes, not mtime to system clock NEWS: Issue 11999: fixed sporadic sync failure mailbox.Maildir due to its trying to detect mtime changes by comparing to the system clock instead of to the previous value of the mtime. commit: #11873: Improve test regex so random directory names don't cause test to fail NEWS: Issue #11873: Change regex in test_compileall to fix occasional failures when when the randomly generated temporary path happened to match the regex. You will note the *active* verbs "fixed", "improve", and "change" figure in there prominently :) (Eh. And proofreading this email I see I made a grammar error in that first NEWS example :( -- R. 
David Murray http://www.bitdance.com From murman at gmail.com Tue May 10 15:34:38 2011 From: murman at gmail.com (Michael Urman) Date: Tue, 10 May 2011 08:34:38 -0500 Subject: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII In-Reply-To: <1305014609.2014.6.camel@marge> References: <E1QIf1U-0001ch-SK@dinsdale.python.org> <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com> <1304950275.22910.32.camel@marge> <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com> <1304989034.29582.7.camel@marge> <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com> <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com> <1305014609.2014.6.camel@marge> Message-ID: <BANLkTinZjexmqhFfBshL3NANangZE8o7AA@mail.gmail.com> On Tue, May 10, 2011 at 03:03, Victor Stinner <victor.stinner at haypocalc.com> wrote: > If GetProcAddress() expects a byte string encoded to the ANSI code page, > my patch is correct because the function used the UTF-8 encoding, not > the ANSI code page. We can maybe use GetProcAddressW() to pass a Unicode > string. I don't know which encoding is used by GetProcAddressW()... While I can find references to a GetProcAddressW, most of them seem to agree it doesn't exist. "My kernel32.dll only exports GetProcAddress." This suggests to me it accepts a null-terminated bytestring instead of specifically an ANSI string. What data ends up in the export table is likely similar to the linux filesystem case, only with less likelihood of the environment telling you its encoding. > I already patched _PyImport_GetDynLoadFunc() for Windows: the path is > now a Unicode object instead of a byte string encoded to the filesystem > encoding. _PyImport_GetDynLoadWindows() uses GetFullPathNameW() and > LoadLibraryExW(). The work to be fully Unicode compliant (for the path > field, not for the name) is not completly done... 
but I have a pending > patch, see: > http://bugs.python.org/issue11619 > > But this patch is huge and creates many functions. I am not sure that we > need it, I will work on this later. I'm comfortable with the idea of requiring UTF-8 encoding for the initmodule entry points of modules named with non-ASCII identifiers, especially if there is nothing which works consistently today. I've only seen pure-ASCII library names in all my C++ work, so I feel it borders on YAGNI (but I like it in theory). As an alternate approach, one article I read suggested to use ordinals instead of names if you wanted to use non-ASCII names. Python could certainly try to load by ordinal on Windows, and fall back to loading by name. I don't have a clue what the rate of false positives would be. -- Michael Urman From phd at phdru.name Tue May 10 15:45:44 2011 From: phd at phdru.name (Oleg Broytman) Date: Tue, 10 May 2011 17:45:44 +0400 Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity In-Reply-To: <20110510133314.66B48250041@webabinitio.net> References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com> <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> <4DC84235.4060600@trueblade.com> <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com> <20110510133314.66B48250041@webabinitio.net> Message-ID: <20110510134544.GA9665@iskra.aviel.ru> On Tue, May 10, 2011 at 09:33:13AM -0400, R. David Murray wrote: > commit: > 11999: sync based on comparing mtimes, not mtime to system clock > NEWS: > Issue 11999: fixed sporadic sync failure mailbox.Maildir due to its > trying to detect mtime changes by comparing to the system clock > instead of to the previous value of the mtime. > > commit: > #11873: Improve test regex so random directory names don't cause test to fail > NEWS: > Issue #11873: Change regex in test_compileall to fix occasional > failures when when the randomly generated temporary path happened to > match the regex. 
> > You will note the *active* verbs "fixed", "improve", and "change" > figure in there prominently :) Why "fixed" is in the past tense, but "improve", and "change" are in present tense? I use past tense to describe what I did on the code, and present simple to describe what the new code does when running. For example: "Fixed a bug in time comparison: compare mtime to mtime, not mtime to system clock" I.e., "fixed" - that what I did, and "compare" is what the code does. (I used an excerpt from above only for the example, not to correct something.) Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From eliben at gmail.com Tue May 10 16:02:07 2011 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 10 May 2011 17:02:07 +0300 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <20110510131144.C8D75250041@webabinitio.net> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca> <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com> <20110510131144.C8D75250041@webabinitio.net> Message-ID: <BANLkTimP6He_jqQjBi0WQ38Ckxk+ZoZ=YQ@mail.gmail.com> On Tue, May 10, 2011 at 16:11, R. David Murray <rdmurray at bitdance.com>wrote: > On Tue, 10 May 2011 08:36:38 +0300, Eli Bendersky <eliben at gmail.com> > wrote: > > With an unlimited error message length it could make sense to say "Hey, I > > see 'x' may be assigned in this scope, so I mark it local. But this > access > > to 'x' happens before assignment - so ERROR". This isn't realistic, of > > course, so I'm wondering: > > > > 1. Does this error message (although unrealistic) capture all possible > > appearances of UnboundLocalError? > > 2. 
If the answer to (1) is yes - could it be usefully shortened to be > > clearer than the current "local variable referenced before assignment"? > > > > This may not be possible, of course, but it doesn't harm trying :-) > > How about: > > "reference to variable 'y' precedes an assignment that makes it a local > variable" > <http://www.bitdance.com> Yes, this is much better and not too long IMHO Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-dev/attachments/20110510/ce04e972/attachment.html> From rdmurray at bitdance.com Tue May 10 16:46:18 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 10 May 2011 10:46:18 -0400 Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity In-Reply-To: <20110510134544.GA9665@iskra.aviel.ru> References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com> <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com> <4DC84235.4060600@trueblade.com> <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com> <20110510133314.66B48250041@webabinitio.net> <20110510134544.GA9665@iskra.aviel.ru> Message-ID: <20110510144618.DEC5B250041@webabinitio.net> On Tue, 10 May 2011 17:45:44 +0400, Oleg Broytman <phd at phdru.name> wrote: > On Tue, May 10, 2011 at 09:33:13AM -0400, R. David Murray wrote: > > commit: > > 11999: sync based on comparing mtimes, not mtime to system clock > > NEWS: > > Issue 11999: fixed sporadic sync failure mailbox.Maildir due to its > > trying to detect mtime changes by comparing to the system clock > > instead of to the previous value of the mtime. > > > > commit: > > #11873: Improve test regex so random directory names don't cause test to fail > > NEWS: > > Issue #11873: Change regex in test_compileall to fix occasional > > failures when when the randomly generated temporary path happened to > > match the regex. 
> > > > You will note the *active* verbs "fixed", "improve", and "change" > > figure in there prominently :) > > Why "fixed" is in the past tense, but "improve", and "change" are in > present tense? > > I use past tense to describe what I did on the code, and present > simple to describe what the new code does when running. For example: > > "Fixed a bug in time comparison: compare mtime to mtime, not mtime to system clock" > > I.e., "fixed" - that what I did, and "compare" is what the code does. > > (I used an excerpt from above only for the example, not to correct > something.) Yes, that's a good point. I'll try to be more consistent about that in the future. Change should have been Changed. -- R. David Murray http://www.bitdance.com From ncoghlan at gmail.com Tue May 10 16:59:08 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 11 May 2011 00:59:08 +1000 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <20110510131144.C8D75250041@webabinitio.net> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca> <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com> <20110510131144.C8D75250041@webabinitio.net> Message-ID: <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com> On Tue, May 10, 2011 at 11:11 PM, R. 
David Murray <rdmurray at bitdance.com> wrote: > How about: > > "reference to variable 'y' precedes an assignment that makes it a local > variable" For comparison, the error messages I was able to elicit from 2.7 were as follows: # Module level NameError: name 'bob' is not defined # Function level reference to implicit global NameError: global name 'bob' is not defined # Early reference to local UnboundLocalError: local variable 'bob' referenced before assignment # Early reference from closure NameError: free variable 'bob' referenced before assignment in enclosing scope Personally, I would just add "in current scope" to the existing error message for the unbound local case (and potentially collapse the exception hierarchy a bit by setting UnboundLocalError = NameError). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From rdmurray at bitdance.com Tue May 10 19:31:04 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 10 May 2011 13:31:04 -0400 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca> <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com> <20110510131144.C8D75250041@webabinitio.net> <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com> Message-ID: <20110510173107.74C9B250041@webabinitio.net> On Wed, 11 May 2011 00:59:08 +1000, Nick Coghlan <ncoghlan at gmail.com> wrote: > On Tue, May 10, 2011 at 11:11 PM, R. 
David Murray <rdmurray at bitdance.com> w= > rote: > > How about: > > > > "reference to variable 'y' precedes an assignment that makes it a local > > variable" > > For comparison, the error messages I was able to elicit from 2.7 were > as follows: > > # Module level > NameError: name 'bob' is not defined > > # Function level reference to implicit global > NameError: global name 'bob' is not defined > > # Early reference to local > UnboundLocalError: local variable 'bob' referenced before assignment > > # Early reference from closure > NameError: free variable 'bob' referenced before assignment in enclosing sc= > ope > > Personally, I would just add "in current scope" to the existing error > message for the unbound local case (and potentially collapse the > exception hierarchy a bit by setting UnboundLocalError = NameError). I don't think adding that phrase would add any clarity, myself. The mental disconnect comes from the fact that the UnboundLocal error message is emitted for the reference, but it is not immediately obvious *why* the variable is considered local. My rephrasing emphasizes that it is the assignment statement that led to that classification and therefore the error. This disconnect doesn't apply in the global cases. It applies less strongly in the free variable case because there is visibly another scope involved (that is, the triggering assignment isn't in the same scope as the reference producing the error message). -- R. 
David Murray http://www.bitdance.com From tjreedy at udel.edu Tue May 10 19:56:58 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 May 2011 13:56:58 -0400 Subject: [Python-Dev] more timely detection of unbound locals In-Reply-To: <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com> References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca> <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com> <20110510131144.C8D75250041@webabinitio.net> <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com> Message-ID: <iqbu9b$981$1@dough.gmane.org> On 5/10/2011 10:59 AM, Nick Coghlan wrote: > On Tue, May 10, 2011 at 11:11 PM, R. David Murray<rdmurray at bitdance.com> wrote: >> How about: >> >> "reference to variable 'y' precedes an assignment that makes it a local >> variable" > > For comparison, the error messages I was able to elicit from 2.7 were > as follows: > > # Module level > NameError: name 'bob' is not defined > > # Function level reference to implicit global > NameError: global name 'bob' is not defined > > # Early reference to local > UnboundLocalError: local variable 'bob' referenced before assignment I would change this to "local name 'bob' used before the assignment that makes it a local name" Calling names 'variables' is itself a point of confusion. > > # Early reference from closure > NameError: free variable 'bob' referenced before assignment in enclosing scope > > Personally, I would just add "in current scope" to the existing error > message for the unbound local case (and potentially collapse the > exception hierarchy a bit by setting UnboundLocalError = NameError). > > Cheers, > Nick. > -- Terry Jan Reedy From rdmurray at bitdance.com Tue May 10 20:31:17 2011 From: rdmurray at bitdance.com (R. 
David Murray)
Date: Tue, 10 May 2011 14:31:17 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <iqbu9b$981$1@dough.gmane.org>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
    <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
    <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
    <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
    <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
    <20110510131144.C8D75250041@webabinitio.net>
    <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
    <iqbu9b$981$1@dough.gmane.org>
Message-ID: <20110510183118.6D2B8250041@webabinitio.net>

On Tue, 10 May 2011 13:56:58 -0400, Terry Reedy <tjreedy at udel.edu> wrote:
> On 5/10/2011 10:59 AM, Nick Coghlan wrote:
> > On Tue, May 10, 2011 at 11:11 PM, R. David Murray <rdmurray at bitdance.com> wrote:
> >> How about:
> >>
> >> "reference to variable 'y' precedes an assignment that makes it a local
> >> variable"
> >
> > For comparison, the error messages I was able to elicit from 2.7 were
> > as follows:
> >
> > # Module level
> > NameError: name 'bob' is not defined
> >
> > # Function level reference to implicit global
> > NameError: global name 'bob' is not defined
> >
> > # Early reference to local
> > UnboundLocalError: local variable 'bob' referenced before assignment
>
> I would change this to
> "local name 'bob' used before the assignment that makes it a local name"
>
> Calling names 'variables' is itself a point of confusion.

Yes, your phrasing is much better.

--
R.
David Murray http://www.bitdance.com

From eliben at gmail.com Tue May 10 20:59:04 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 10 May 2011 21:59:04 +0300
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <20110510183118.6D2B8250041@webabinitio.net>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
    <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
    <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
    <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
    <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
    <20110510131144.C8D75250041@webabinitio.net>
    <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
    <iqbu9b$981$1@dough.gmane.org>
    <20110510183118.6D2B8250041@webabinitio.net>
Message-ID: <BANLkTimk+7Qv-_AJF6oRif1oGX=4uFe1+A@mail.gmail.com>

<snip>

> > > # Early reference to local
> > > UnboundLocalError: local variable 'bob' referenced before assignment
> >
> > I would change this to
> > "local name 'bob' used before the assignment that makes it a local name"
> >
> > Calling names 'variables' is itself a point of confusion.
>
> Yes, your phrasing is much better.
>

+1
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110510/780da783/attachment.html>

From steve at pearwood.info Wed May 11 00:38:53 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 May 2011 08:38:53 +1000
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
    <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
    <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
    <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
    <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
    <20110510131144.C8D75250041@webabinitio.net>
    <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
Message-ID: <4DC9BE7D.1070800@pearwood.info>

Nick Coghlan wrote:

> Personally, I would just add "in current scope" to the existing error
> message for the unbound local case (and potentially collapse the
> exception hierarchy a bit by setting UnboundLocalError = NameError).

-0

That was the case prior to Python 2.0. Reverting is potentially a
semantic change that will break any code that distinguishes between
(global) NameError and (local) UnboundLocalError.

But personally, I don't know why it was thought necessary to distinguish
between them in the first place.
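
To make the distinction under discussion concrete, here is a minimal sketch that is not taken from the thread. The function names are hypothetical, and `bob` is assumed not to exist as a global or builtin. The sketch relies only on `UnboundLocalError` being a subclass of `NameError`, which is why code that treats the two cases differently has to test for the subclass first. The message wording differs between Python versions, so only the exception class is inspected.

```python
def read_missing_global():
    # 'bob' is never assigned in this function, so it is looked up as a
    # global (then a builtin) and the lookup fails with NameError.
    return bob


def read_before_local_assignment():
    # The assignment two lines down makes 'bob' local to the whole
    # function, so this earlier reference raises UnboundLocalError.
    value = bob
    bob = 1
    return value


def classify(func):
    """Call func and return the name of the exception class it raised."""
    try:
        func()
    except UnboundLocalError:
        # Must be tested first: UnboundLocalError is a NameError subclass.
        return "UnboundLocalError"
    except NameError:
        return "NameError"
    return "no error"


results = {
    "missing global": classify(read_missing_global),
    "early local": classify(read_before_local_assignment),
}
```

Swapping the two `except` clauses in `classify` would report both cases as `NameError`, which is the kind of code a merged hierarchy would affect.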
--
Steven

From fdrake at acm.org Wed May 11 01:37:09 2011
From: fdrake at acm.org (Fred Drake)
Date: Tue, 10 May 2011 19:37:09 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <4DC9BE7D.1070800@pearwood.info>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
    <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
    <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
    <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
    <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
    <20110510131144.C8D75250041@webabinitio.net>
    <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
    <4DC9BE7D.1070800@pearwood.info>
Message-ID: <BANLkTi=MeFiNBKm5RgsYhYrmPisjRugW5A@mail.gmail.com>

On Tue, May 10, 2011 at 6:38 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> I don't know why it was thought necessary to distinguish between them in the
> first place.

New users almost constantly expressed confusion by NameError when the
name was clearly bound at global scope, and a subsequent assignment
caused it to be considered a local in their function.

  -Fred

--
Fred L. Drake, Jr.    <fdrake at acm.org>
"Give me the luxuries of life and I will willingly do without the necessities."
   --Frank Lloyd Wright

From palla74 at gmail.com Wed May 11 11:51:59 2011
From: palla74 at gmail.com (Palla)
Date: Wed, 11 May 2011 11:51:59 +0200
Subject: [Python-Dev] EuroPython: Early Bird will end in 2 days!
Message-ID: <BANLkTikg6L64aJET4yH1mSs3AgsARNQ7XA@mail.gmail.com>

Hi all,

If you plan to attend, you could save quite a bit on registration fees!

The end of Early bird is on May 12th, Friday, 23:59:59 CEST.
We'd like to ask you to forward this post to anyone that you feel may
be interested.

We have an amazing lineup of tutorials, events and talks. We have some
excellent keynote speakers and a very complete partner program... but
early bird registration ends in 2 days!
Right now, you still get discounts on talks and tutorials so if you plan
to attend

Register Now: http://ep2011.europython.eu/registration/

While you are booking, remember to have a look at the partner program and
our offer for a prepaid, data+voice+tethering SIM.

We'd like to ask you to forward this post to anyone that you feel may
be interested.

All the best,

--
->PALLA

From merwok at netwok.org Wed May 11 18:38:53 2011
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 11 May 2011 18:38:53 +0200
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <20110509175447.4DC56250039@webabinitio.net>
References: <1304937168.22910.21.camel@marge>
    <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
    <7fa082450fb750d082e71d5070a62171@netwok.org>
    <20110509175447.4DC56250039@webabinitio.net>
Message-ID: <5899b3f80aa1a05b5f6f3364eebde44e@netwok.org>

On 09/05/2011 19:54, R. David Murray wrote:
>>> No it isn't. The commit message isn't pulled into the new branch.
>> Sorry, your terminology does not make sense. If you mean that the
>> commit message is not reused in the new commit after the merge, it's
>> true. However, the commit message with the relevant information is
>> available as part of the changesets that have been pulled and
>> merged.
>
> The changesets are in the repository and there are pointers to them
> from the merge changeset, sure, but the data isn't in the checkout
> (that's how I understood "pulled in to the new branch").

No commit message is ever in the checkout, so I don't follow you.

> If I do 'hg log' and search for a revno (that I got from hg annotate),
> the commit message describing the change is not attached to that
> revno,

Ah, I understand your problem now. I would not object to a policy
requiring helpful information in the commit messages of merge
changesets, like "Merge fixes for #4444 and #5555" or "Merge doc fixes"
when there are no bug reports. I'm not sure about the "atomic"
merge changesets idea that someone else expressed; I don't think it
would be that useful.

> nor as far as I know is there a tool that makes it easy to get from that
> revno to the explanatory commit message. That's what Victor and I are
> talking about. Is there a tool that fixes this problem?

I tend to use graphical tools for history viewing. I like the GTK
version of TortoiseHg, or failing that the graph displayed by "hg serve"
if you enable the graphlog extension and use a browser with JavaScript.

From merwok at netwok.org Wed May 11 18:39:21 2011
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 11 May 2011 18:39:21 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <loom.20110509T193140-280@post.gmane.org>
References: <loom.20110509T193140-280@post.gmane.org>
Message-ID: <d1989510eccc69219dc75384faf7be23@netwok.org>

Hi,

> That's right, though it's OK to provide a documented convenience API
> for adding handlers.

I think I'll aim for simplicity. We'll document that we use the logger
"packaging" throughout and let people use getLogger and addHandler with
that.

> You don't necessarily need to set the level on the handler - why can
> you not just set it on the logger? The effect would often be the same:
> the logger's level is checked first, and then the handler's level.

I thought that if we set the level on the logger, we would prevent
third-party code from getting some messages. E.g., we set the level to
INFO but pip uses some packaging functions and would like to get DEBUG
messages.
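
The ordering described in the quoted text (logger level first, then handler level) can be sketched with the standard `logging` module. This is an illustration under stated assumptions, not code from the thread: `captured_messages` is a hypothetical helper, and the only name taken from the discussion is the "packaging" logger.

```python
import io
import logging


def captured_messages(logger_level, handler_level):
    """Log one DEBUG and one INFO record; return what the handler emitted."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setLevel(handler_level)

    logger = logging.getLogger("packaging")  # logger name used in the thread
    logger.propagate = False                 # keep demo records away from the root logger
    logger.setLevel(logger_level)
    logger.addHandler(handler)
    try:
        logger.debug("debug message")
        logger.info("info message")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue().splitlines()


# Logger at INFO: the DEBUG record is dropped before any handler sees it,
# even though this handler asked for DEBUG.
strict = captured_messages(logging.INFO, logging.DEBUG)

# Logger at DEBUG, handler at INFO: only this handler filters the record;
# another handler on the same logger could still receive it.
relaxed = captured_messages(logging.DEBUG, logging.INFO)

# Application code can lower the logger's level itself to get everything.
overridden = captured_messages(logging.DEBUG, logging.DEBUG)
```

In this sketch a level set on the logger acts as a default that the calling application is free to change with `setLevel`.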

Regards

From merwok at netwok.org Wed May 11 18:39:54 2011
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 11 May 2011 18:39:54 +0200
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <20110510144618.DEC5B250041@webabinitio.net>
References: <iq9802$e9m$1@dough.gmane.org>
    <4DC8343C.2050005@nedbatchelder.com>
    <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
    <4DC84235.4060600@trueblade.com>
    <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
    <20110510133314.66B48250041@webabinitio.net>
    <20110510134544.GA9665@iskra.aviel.ru>
    <20110510144618.DEC5B250041@webabinitio.net>
Message-ID: <ed9491d44d238df972576bf8b215c91b@netwok.org>

On 10/05/2011 16:46, R. David Murray wrote:
> On Tue, 10 May 2011 17:45:44 +0400, Oleg Broytman <phd at phdru.name> wrote:
>> Why "fixed" is in the past tense, but "improve", and "change" are in
>> present tense?
>> I use past tense to describe what I did on the code, and present
>> simple to describe what the new code does when running. For example:

Funny, I always use the present tense, to convey what the code does now.

From merwok at netwok.org Wed May 11 19:05:48 2011
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 11 May 2011 19:05:48 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
In-Reply-To: <E1QJaFY-00046A-Tn@dinsdale.python.org>
References: <E1QJaFY-00046A-Tn@dinsdale.python.org>
Message-ID: <c5683e5e669b57e6645e01eb80501fa9@netwok.org>

On 10/05/2011 01:52, victor.stinner wrote:
> http://hg.python.org/cpython/rev/3c87a13980be
> changeset: 70001:3c87a13980be
> branch: 2.7
> parent: 69996:c9f07c69b138
> user: Victor Stinner <victor.stinner at haypocalc.com>
> date: Tue May 10 01:52:03 2011 +0200
> summary:
> (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional

"(Merge 3.1)" is inaccurate for 2.7.

Regards

From tjreedy at udel.edu Wed May 11 19:45:39 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 11 May 2011 13:45:39 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <ed9491d44d238df972576bf8b215c91b@netwok.org>
References: <iq9802$e9m$1@dough.gmane.org>
    <4DC8343C.2050005@nedbatchelder.com>
    <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
    <4DC84235.4060600@trueblade.com>
    <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
    <20110510133314.66B48250041@webabinitio.net>
    <20110510134544.GA9665@iskra.aviel.ru>
    <20110510144618.DEC5B250041@webabinitio.net>
    <ed9491d44d238df972576bf8b215c91b@netwok.org>
Message-ID: <iqei04$kic$1@dough.gmane.org>

On 5/11/2011 12:39 PM, Éric Araujo wrote:

> Funny, I always use the present tense, to convey what the code does now.

Which code ;-).

At the moment you write a push message, your private clone does
something different from the public repository (and other private
clones). At the moment people read a push message, they may not have
pulled the change, so that there is a difference between the repository
and *their* clone.

Besides the ambiguity, there is also inconsistency between writers.
Hence my request for a few clarifying keystrokes when needed.

--
Terry Jan Reedy

From victor.stinner at haypocalc.com Wed May 11 20:08:49 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 11 May 2011 20:08:49 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
In-Reply-To: <c5683e5e669b57e6645e01eb80501fa9@netwok.org>
References: <E1QJaFY-00046A-Tn@dinsdale.python.org>
    <c5683e5e669b57e6645e01eb80501fa9@netwok.org>
Message-ID: <1305137329.12577.1.camel@marge>

On Wednesday 11 May 2011 at 19:05 +0200, Éric Araujo wrote:
> On 10/05/2011 01:52, victor.stinner wrote:
> > http://hg.python.org/cpython/rev/3c87a13980be
> > changeset: 70001:3c87a13980be
> > branch: 2.7
> > parent: 69996:c9f07c69b138
> > user: Victor Stinner <victor.stinner at haypocalc.com>
> > date: Tue May 10 01:52:03 2011 +0200
> > summary:
> > (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
>
> "(Merge 3.1)" is inaccurate for 2.7.

Ah, why? I did not use the "hg merge" command (but hg export|hg import),
but it's a "merge" between two branches. Which term would you use?

Victor

From guido at python.org Wed May 11 20:48:50 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 May 2011 11:48:50 -0700
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <ed9491d44d238df972576bf8b215c91b@netwok.org>
References: <iq9802$e9m$1@dough.gmane.org>
    <4DC8343C.2050005@nedbatchelder.com>
    <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
    <4DC84235.4060600@trueblade.com>
    <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
    <20110510133314.66B48250041@webabinitio.net>
    <20110510134544.GA9665@iskra.aviel.ru>
    <20110510144618.DEC5B250041@webabinitio.net>
    <ed9491d44d238df972576bf8b215c91b@netwok.org>
Message-ID: <BANLkTi=NSeCknTq4wrJ66J_Ce3x-+otU6A@mail.gmail.com>

On Wed, May 11, 2011 at 9:39 AM, Éric Araujo <merwok at netwok.org> wrote:
> Funny, I always use the present tense, to convey what the code does now.

Yeah, and that's exactly what I am objecting to. Please describe what
changed how, since that is the focus of the patch.

--
--Guido van Rossum (python.org/~guido)

From vinay_sajip at yahoo.co.uk Wed May 11 21:45:12 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Wed, 11 May 2011 19:45:12 +0000 (UTC)
Subject: [Python-Dev] Problems with regrtest and with logging
References: <loom.20110509T193140-280@post.gmane.org>
    <d1989510eccc69219dc75384faf7be23@netwok.org>
Message-ID: <loom.20110511T213726-472@post.gmane.org>

Éric Araujo <merwok <at> netwok.org> writes:

> I thought that if we set the level on the logger, we would prevent
> third-party code to get some messages. E.g., we set level to INFO but
> pip uses some packaging functions and would like to get DEBUG messages.

Then pip can set the level of the packaging logger as it wishes, perhaps
in response to command-line arguments for verbosity. It'd be easier for
pip to do that, regardless of which handlers are attached. And pip
itself might be being used, say by virtualenv. It's hard in general to
say what the top-level code will be, and generally that's the code which
should set the handlers.

The levels set by a library for its loggers are merely defaults.
Applications using the library can choose to override those levels as
they wish.

Regards,

Vinay Sajip

From tjreedy at udel.edu Wed May 11 23:12:39 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 11 May 2011 17:12:39 -0400
Subject: [Python-Dev] [Python-checkins] cpython (2.7): (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
In-Reply-To: <1305137329.12577.1.camel@marge>
References: <E1QJaFY-00046A-Tn@dinsdale.python.org>
    <c5683e5e669b57e6645e01eb80501fa9@netwok.org>
    <1305137329.12577.1.camel@marge>
Message-ID: <iqeu47$udg$1@dough.gmane.org>

On 5/11/2011 2:08 PM, Victor Stinner wrote:
> On Wednesday 11 May 2011 at 19:05 +0200, Éric Araujo wrote:
>> On 10/05/2011 01:52, victor.stinner wrote:
>>> http://hg.python.org/cpython/rev/3c87a13980be
>>> changeset: 70001:3c87a13980be
>>> branch: 2.7
>>> parent: 69996:c9f07c69b138
>>> user: Victor Stinner <victor.stinner at haypocalc.com>
>>> date: Tue May 10 01:52:03 2011 +0200
>>> summary:
>>> (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
>>
>> "(Merge 3.1)" is inaccurate for 2.7.
>
> Ah, why? I did not use "hg merge" command (but hg export|hg import), but
> it's a "merge" between two branches. Which term would you use?

export/import sounds like transport: "(transport from 3.1)" would be
clear enough to me.

--
Terry Jan Reedy

From pythondev at genstein.net Thu May 12 04:35:16 2011
From: pythondev at genstein.net (Genstein)
Date: Thu, 12 May 2011 03:35:16 +0100
Subject: [Python-Dev] py3k buffered I/O - flush() required between read/write?
Message-ID: <4DCB4764.5080902@genstein.net>

Hi all,

Sincere apologies for posting a question without lurking for a while
first. I'm not sure whether I'm being dumb (which is very plausible) or
whether this is a potential bug. I asked on comp.lang.python but
responses were equivocal, so I'm following the README.txt advice and
asking here. If I'm out of line, do feel free to slap me down viciously,
remove me from the list, or whatever seems most appropriate.

Under py3k, is it necessary to flush() a file between buffered
read/write calls in order to see consistent results?

I have a case under Python 3.2 (r32:88445) where I see different results
depending on whether buffering is active, on Gentoo Linux and Windows
Vista. Perusing the docs and PEPs I couldn't seem to find an answer; I
did find bufferedio.c's comment: "BufferedReader, BufferedWriter and
BufferedRandom...share a single buffer...this enables interleaved reads
and writes without flushing" which is suggestive but I may be taking it
out of context.

The following is the smallest code I can conjure which demonstrates the
issue I'm seeing:

[code]
START = 0
MID = 1
LENGTH = 4

def test(buffering):
    f = open("test.bin", "w+b", buffering = buffering)
    for i in range(LENGTH):
        f.write(b'\x00')
    f.seek(MID)
    f.read(1)
    f.write(b'\x00')
    f.seek(MID)
    f.write(b'\x01')
    f.seek(START)
    f.seek(MID)
    print(f.read(1))
    f.close()

print("Buffered result: ")
test(-1)
print("Unbuffered result:")
test(0)
[end code]

Output on both Gentoo and Vista is:
    Buffered result:
    b'\x00'
    Unbuffered result:
    b'\x01'

I expected the results to be the same, but they aren't. The issue is
reproducible with larger files provided that the constants are increased
~proportionally (START 0, MID 500, LENGTH 1000 for example). Transposing
the buffered/unbuffered tests and/or using different buffer sizes for
the buffered test seem to have no effect.

Apologies once more if I'm wasting your time.

All the best,

-eg.

PS. By way of entirely belated introduction, I'm a UK software developer
with a background mostly in C#, C++ and Lua in both "real software" and
commercial games. In my spare time I mostly write code (curiously I
don't know many developers who do; I suspect I just know the wrong
people.) I perpetrated the Trizbort mapper for interactive fiction which
doubtless nobody will have heard of, and with good reason. I'm toying
with Python as a genuinely portable alternative to C# for my own
projects, and so far loving it.

From solipsis at pitrou.net Thu May 12 12:47:27 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 12 May 2011 12:47:27 +0200
Subject: [Python-Dev] py3k buffered I/O - flush() required between read/write?
References: <4DCB4764.5080902@genstein.net>
Message-ID: <20110512124727.3f4d921e@pitrou.net>

Hello,

On Thu, 12 May 2011 03:35:16 +0100
Genstein <pythondev at genstein.net> wrote:
>
> The following is the smallest code I can conjure which demonstrates the
> issue I'm seeing:

This is a bug indeed. Can you report it on http://bugs.python.org ?

Thanks a lot for finding this,

Antoine.

From pythondev at genstein.net Thu May 12 15:22:22 2011
From: pythondev at genstein.net (Genstein)
Date: Thu, 12 May 2011 14:22:22 +0100
Subject: [Python-Dev] py3k buffered I/O - flush() required between read/write?
In-Reply-To: <20110512124727.3f4d921e@pitrou.net>
References: <4DCB4764.5080902@genstein.net>
    <20110512124727.3f4d921e@pitrou.net>
Message-ID: <4DCBDF0E.4070504@genstein.net>

On 12/05/2011 11:47, Antoine Pitrou wrote:
> This is a bug indeed. Can you report it on http://bugs.python.org ?
>
> Thanks a lot for finding this,
>
> Antoine.
>

Duly reported as http://bugs.python.org/issue12062. I'm glad it wasn't
me being dumb(er than usual). It took a while to pin down to a small
reproducible case.

Thanks for the fast and definite response, I'll cheerfully revert to
lurking now ;)

All the best,

-eg.

From skip at montanaro.dyndns.org Thu May 12 18:33:37 2011
From: skip at montanaro.dyndns.org (Skip Montanaro)
Date: Thu, 12 May 2011 11:33:37 -0500 (CDT)
Subject: [Python-Dev] Could these restrictions be removed?
Message-ID: <20110512163337.3758D12B7749@montanaro.dyndns.org>

A friend at work who is new to Python wondered why this didn't work with
pickle:

    class Outer:

        class Inner:

            ...

        def __init__(self):
            self.i = Outer.Inner()

I explained:

> http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled
>
>
> From that:
>
> # functions defined at the top level of a module
> # built-in functions defined at the top level of a module
> # classes that are defined at the top level of a module

I've never questioned this, but I wonder, is this a fundamental
restriction or could it be overcome with a modest amount of work?

Just curious...

Skip

From walter at livinglogic.de Thu May 12 18:58:12 2011
From: walter at livinglogic.de (Walter Dörwald)
Date: Thu, 12 May 2011 18:58:12 +0200
Subject: [Python-Dev] Could these restrictions be removed?
In-Reply-To: <4DCC10A3.9000209@livinglogic.de>
References: <20110512163337.3758D12B7749@montanaro.dyndns.org>
    <4DCC10A3.9000209@livinglogic.de>
Message-ID: <4DCC11A4.4050101@livinglogic.de>

On 12.05.11 18:53, Walter Dörwald wrote:
> On 12.05.11 18:33, skip at pobox.com wrote:
>
>> A friend at work who is new to Python wondered why this didn't work with
>> pickle:
>>
>>     class Outer:
>>
>>         class Inner:
>>
>>             ...
>>
>>         def __init__(self):
>>             self.i = Outer.Inner()
>>
>> I explained:
>>
>>> http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled
>>>
>>>
>>> From that:
>>>
>>> # functions defined at the top level of a module
>>> # built-in functions defined at the top level of a module
>>> # classes that are defined at the top level of a module
>>
>> I've never questioned this, but I wonder, is this a fundamental restriction
>> or could it be overcome with a modest amount of work?
>
> This is related to http://bugs.python.org/issue633930

See also the thread started at:
http://mail.python.org/pipermail/python-dev/2005-March/052454.html

Servus,
   Walter

From solipsis at pitrou.net Thu May 12 19:05:46 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 12 May 2011 19:05:46 +0200
Subject: [Python-Dev] Could these restrictions be removed?
References: <20110512163337.3758D12B7749@montanaro.dyndns.org>
Message-ID: <20110512190546.116a9a91@pitrou.net>

On Thu, 12 May 2011 11:33:37 -0500 (CDT)
Skip Montanaro <skip at montanaro.dyndns.org> wrote:
>
> A friend at work who is new to Python wondered why this didn't work with
> pickle:
>
>     class Outer:
>
>         class Inner:
>
>             ...
>
>         def __init__(self):
>             self.i = Outer.Inner()
>
[...]
>
> I've never questioned this, but I wonder, is this a fundamental restriction
> or could it be overcome with a modest amount of work?

pickle uses heuristics to try to find out the "official name" of a class
or function. It would be a matter of improving these heuristics.
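
As a rough illustration of why a lookup by simple name loses nested classes, here is a sketch that is not pickle's actual code. The names `namespace`, `find_by_simple_name` and `find_by_dotted_name` are hypothetical, and `namespace` merely stands in for the module that defines the class. Later Python versions (3.3 onwards) record a dotted path of this kind on classes and functions as `__qualname__`; the sketch passes the path explicitly instead.

```python
from types import SimpleNamespace


class Outer:
    class Inner:
        pass


# Stand-in for the defining module: only top-level names are attributes.
namespace = SimpleNamespace(Outer=Outer)


def find_by_simple_name(ns, cls):
    """Look the class up by its bare name, as a top-level-only scheme would."""
    return getattr(ns, cls.__name__, None)


def find_by_dotted_name(ns, dotted_name):
    """Walk a dotted path such as 'Outer.Inner' one attribute at a time."""
    obj = ns
    for part in dotted_name.split("."):
        obj = getattr(obj, part)
    return obj


top_level = find_by_simple_name(namespace, Outer)          # found
nested_simple = find_by_simple_name(namespace, Outer.Inner)  # not found: no top-level 'Inner'
nested_dotted = find_by_dotted_name(namespace, "Outer.Inner")  # found via the dotted path
```

The bare name of the nested class is just `Inner`, which is not an attribute of the namespace, so only the dotted walk reaches it.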

There are other cases in which pickle similarly fails:

>>> pickle.dumps(random.random)
b'\x80\x03crandom\nrandom\nq\x00.'
>>> pickle.dumps(random.randint)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class 'method'>: attribute lookup builtins.method failed

Regards

Antoine.

From walter at livinglogic.de Thu May 12 18:53:55 2011
From: walter at livinglogic.de (Walter Dörwald)
Date: Thu, 12 May 2011 18:53:55 +0200
Subject: [Python-Dev] Could these restrictions be removed?
In-Reply-To: <20110512163337.3758D12B7749@montanaro.dyndns.org>
References: <20110512163337.3758D12B7749@montanaro.dyndns.org>
Message-ID: <4DCC10A3.9000209@livinglogic.de>

On 12.05.11 18:33, skip at pobox.com wrote:

> A friend at work who is new to Python wondered why this didn't work with
> pickle:
>
>     class Outer:
>
>         class Inner:
>
>             ...
>
>         def __init__(self):
>             self.i = Outer.Inner()
>
> I explained:
>
>> http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled
>>
>>
>> From that:
>>
>> # functions defined at the top level of a module
>> # built-in functions defined at the top level of a module
>> # classes that are defined at the top level of a module
>
> I've never questioned this, but I wonder, is this a fundamental restriction
> or could it be overcome with a modest amount of work?

This is related to http://bugs.python.org/issue633930

Servus,
   Walter

From dickinsm at gmail.com Fri May 13 10:14:12 2011
From: dickinsm at gmail.com (Mark Dickinson)
Date: Fri, 13 May 2011 09:14:12 +0100
Subject: [Python-Dev] Python Language Summit at EuroPython: 19th June
In-Reply-To: <4DA9ACB5.6030505@python.org>
References: <4DA9ACB5.6030505@python.org>
Message-ID: <BANLkTinAHe9kAumvJFqqJ6sbtHpW9KJdmg@mail.gmail.com>

Hi Michael,

Sorry for the late reply; it's been kinda busy around here.

If there are places left, I'll definitely be there at the summit.

Congratulations on your impending doom!
(And sorry to hear that you might not be there in Florence.)

Mark

On Sat, Apr 16, 2011 at 3:50 PM, Michael Foord <michael at python.org> wrote:
> Hello all,
>
> This is an invite to all core-python developers, and developers of
> alternative implementations, to attend the Python Language Summit at
> EuroPython. The summit will be on June 19th and EuroPython this year will be
> held at the beautiful city of Florence in Italy.
>
>    http://ep2011.europython.eu/
>
> If you are not a core-Python developer but would like to attend then please
> email me privately and I will let you know if spaces are available. If you
> are a core developer, or you have received a direct invitation, then please
> respond by private email to let me know if you are able to attend. A maybe
> is fine, you can always change your mind later. Attending for only part of
> the day is fine.
>
> We expect the summit to run from 10am - 4pm with appropriate breaks.
>
> Like previous language summits it is an opportunity to discuss topics like,
> Python 3 adoption, PEPs and changes for Python 3.3, the future of Python
> 2.7, documentation, package index, web site, etc.
>
> If you have topics you'd like to discuss at the language summit please let
> me know.
>
> Volunteers for taking notes at the language summit, for posting to
> Python-dev and the Python Insiders blog after the event, would be much
> appreciated.
>
> All the best,
>
> Michael Foord
>
> N.B. Due to my impending doom (oops, I mean impending fatherhood) I am not
> yet 100% certain I will be able to attend. If I can't I will arrange for
> someone else to chair.
>
> --
> http://www.voidspace.org.uk/
>
> May you do good and not evil
> May you find forgiveness for yourself and forgive others
> May you share freely, never taking more than you give.
> -- the sqlite blessing http://www.sqlite.org/different.html
>
>

From sandeep.mathew at hp.com Fri May 13 11:25:44 2011
From: sandeep.mathew at hp.com (Mathew, Sandeep (OpenVMS))
Date: Fri, 13 May 2011 09:25:44 +0000
Subject: [Python-Dev] Python Support on OpenVMS
Message-ID: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net>

Hi Folks,

I am Sandeep Mathew from OpenVMS engineering in Hewlett-Packard. I have
worked on various components of the OpenVMS operating system including
MONITOR, TDF, EXEC, LIBRTL, DCL and SYSMAN. I happened to read this blog
post about dropping OpenVMS support for further releases of python here:
http://blog.python.org/2011/05/python-33-to-drop-support-for-os2.html.

I am willing to spend time and effort to ensure that python remains
supported on OpenVMS. Please let me know what needs to be done for
continued OpenVMS Support in Python. Looking forward to working with the
Python community.

Regards
Sandeep Mathew
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110513/163b2103/attachment.html>

From solipsis at pitrou.net Fri May 13 12:08:18 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 13 May 2011 12:08:18 +0200
Subject: [Python-Dev] Python Support on OpenVMS
References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net>
Message-ID: <20110513120818.139dca63@pitrou.net>

Welcome Sandeep,

> I am willing to spend time and effort to ensure that python remains supported
> on OpenVMS. Please let me know what needs to be done for continued
> OpenVMS Support in Python. Looking forward to working with the Python
> community.

The first thing would be to check whether the current development tree
(the future Python 3.3) compiles and works fine for OpenVMS. Given that
3.x has had many changes compared to 2.x, this is not guaranteed.
Instructions for getting the source tree are here:
http://docs.python.org/devguide/setup.html

Once the interpreter compiles fine, the second step is to run the test
suite:
http://docs.python.org/devguide/runtests.html

Any compilation errors and test suite failures should be reported to the
bug tracker (http://bugs.python.org/), preferably with patches since I
doubt any of us would be able to fix the issues themselves.

If you have any questions, don't hesitate to ask.

Regards

Antoine.

From merwok at netwok.org Fri May 13 17:14:46 2011
From: merwok at netwok.org (Éric Araujo)
Date: Fri, 13 May 2011 17:14:46 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
In-Reply-To: <1305137329.12577.1.camel@marge>
References: <E1QJaFY-00046A-Tn@dinsdale.python.org>
    <c5683e5e669b57e6645e01eb80501fa9@netwok.org>
    <1305137329.12577.1.camel@marge>
Message-ID: <4DCD4AE6.7030704@netwok.org>

On 11/05/2011 20:08, Victor Stinner wrote:
>>> (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
>> "(Merge 3.1)" is inaccurate for 2.7.
> Ah, why? I did not use "hg merge" command (but hg export|hg import), but
> it's a "merge" between two branches. Which term would you use?

I prefer to use merge only to refer to hg merges. The 2.7 and 3.x lines
are independent, so I wouldn't put any marker in the commit message,
just use the same as the message used in 3.1.

Regards

From merwok at netwok.org Fri May 13 17:35:01 2011
From: merwok at netwok.org (Éric Araujo)
Date: Fri, 13 May 2011 17:35:01 +0200
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures doc (1)
In-Reply-To: <20110508200046.GA2465@kbk-i386-bb.dyndns.org>
References: <20110508200046.GA2465@kbk-i386-bb.dyndns.org>
Message-ID: <4DCD4FA5.3040607@netwok.org>

Hi,

On 08/05/2011 22:00, Neal Norwitz wrote:
> rm -rf build/*
> rm -rf tools/sphinx
> rm -rf tools/pygments
> rm -rf tools/jinja2
> rm -rf tools/docutils
> Checking out Sphinx...
> svn: PROPFIND request failed on '/projects/external/Sphinx-0.6.5/sphinx'
> svn: PROPFIND of '/projects/external/Sphinx-0.6.5/sphinx': Could not resolve hostname `svn.python.org': Host not found (http://svn.python.org)
> make: *** [checkout] Error 1

I always wonder about these messages. They're mostly error messages
recently; what are python-checkins subscribers supposed to do in
reaction? In non-error mode, what are they useful for?

Thanks in advance for enlightening me.

Regards

From merwok at netwok.org Fri May 13 17:44:00 2011
From: merwok at netwok.org (Éric Araujo)
Date: Fri, 13 May 2011 17:44:00 +0200
Subject: [Python-Dev] [Python-checkins] cpython (3.1): Fix for issue 10684: Folders get deleted when trying to change case with
In-Reply-To: <E1QIHOF-0003NK-11@dinsdale.python.org>
References: <E1QIHOF-0003NK-11@dinsdale.python.org>
Message-ID: <4DCD51C0.9080304@netwok.org>

Hi,

On 06/05/2011 11:32, ronald.oussoren wrote:
> http://hg.python.org/cpython/rev/26da299ca88e
> summary:
> Fix for issue 10684: Folders get deleted when trying to change case with shutil.move (case insensitive file systems only)
>
> -        except OSError:
> +        except OSError as exc:
>              if os.path.isdir(src):
>                  if _destinsrc(src, dst):
>                      raise Error("Cannot move a directory '%s' into itself '%s'." % (src, dst))

Is this change a debugging leftover?
Regards From status at bugs.python.org Fri May 13 18:07:22 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 13 May 2011 18:07:22 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20110513160722.1F2C31CE85@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-05-06 - 2011-05-13) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 2784 ( +1) closed 21069 (+52) total 23853 (+53) Open issues with patches: 1198 Issues opened (36) ================== #6011: python doesn't build if prefix contains non-ascii characters http://bugs.python.org/issue6011 reopened by haypo #11786: ConfigParser.[Raw]ConfigParser optionxform() http://bugs.python.org/issue11786 reopened by eric.araujo #11873: test_regexp() of test_compileall fails occassionally http://bugs.python.org/issue11873 reopened by haypo #11977: Document int.conjugate, .denominator, ... 
http://bugs.python.org/issue11977 reopened by georg.brandl #12019: Dead or buggy code in importlib.test.__main__ http://bugs.python.org/issue12019 opened by eric.araujo #12020: Attribute error with flush on stdout,stderr http://bugs.python.org/issue12020 opened by Jimbofbx #12021: mmap.read requires an argument http://bugs.python.org/issue12021 opened by rich-noir #12022: AttributeError should report the same details when raised by l http://bugs.python.org/issue12022 opened by dholth #12024: 2.6 svn and hg branches are out of sync http://bugs.python.org/issue12024 opened by barry #12026: Support more of MSI api http://bugs.python.org/issue12026 opened by markm #12028: threading._get_ident(): remove it in the doc and make it publi http://bugs.python.org/issue12028 opened by haypo #12029: ABC registration of Exceptions http://bugs.python.org/issue12029 opened by acooke #12034: check_GetFinalPathNameByHandle() suboptimal http://bugs.python.org/issue12034 opened by pitrou #12037: test_email failures under Windows with the eol extension activ http://bugs.python.org/issue12037 opened by pitrou #12038: assertEqual doesn't display newline differences quite well http://bugs.python.org/issue12038 opened by pitrou #12040: Expose a Process.sentinel property (and fix polling loop in Pr http://bugs.python.org/issue12040 opened by pitrou #12042: What's New multiprocessing example error http://bugs.python.org/issue12042 opened by davipo #12043: Update shutil documentation http://bugs.python.org/issue12043 opened by sandro.tosi #12045: external shell command executed twice in ctypes.util._get_sona http://bugs.python.org/issue12045 opened by pitrou #12046: Windows build identification incomplete http://bugs.python.org/issue12046 opened by loewis #12048: Python 3, ZipFile Bug In Chinese http://bugs.python.org/issue12048 opened by yaoyu #12049: expose RAND_bytes() function of OpenSSL http://bugs.python.org/issue12049 opened by haypo #12050: unconsumed_tail of zlib.Decompress is not 
always cleared on de http://bugs.python.org/issue12050 opened by Takeshi.Yoshino #12053: Add prefetch() for Buffered IO (experiment) http://bugs.python.org/issue12053 opened by jcon #12055: doctest not working on nested functions http://bugs.python.org/issue12055 opened by dabrahams #12057: HZ codec has no test http://bugs.python.org/issue12057 opened by haypo #12059: hashlib does not handle missing hash functions correctly http://bugs.python.org/issue12059 opened by Ian.Wienand #12060: Python doesn't support real time signals http://bugs.python.org/issue12060 opened by haypo #12063: tokenize module appears to treat unterminated single and doubl http://bugs.python.org/issue12063 opened by Devin Jeanpierre #12065: test_ssl failure when svn.python.org fails to resolve http://bugs.python.org/issue12065 opened by r.david.murray #12066: Empty ('') xmlns attribute is not properly handled by xml.dom. http://bugs.python.org/issue12066 opened by atamyrat #12067: Doc: remove errors about mixed-type comparisons. http://bugs.python.org/issue12067 opened by terry.reedy #12068: test_logging failure in test_rollover http://bugs.python.org/issue12068 opened by pitrou #12069: test_signal.test_without_siginterrupt() failure on AMD64 OpenI http://bugs.python.org/issue12069 opened by haypo #12070: Unlimited loop in sysconfig._parse_makefile() http://bugs.python.org/issue12070 opened by haypo #12071: test_concurrent_futures.test_context_manager_shutdown() hangs http://bugs.python.org/issue12071 opened by haypo Most recent 15 issues with no replies (15) ========================================== #12071: test_concurrent_futures.test_context_manager_shutdown() hangs http://bugs.python.org/issue12071 #12069: test_signal.test_without_siginterrupt() failure on AMD64 OpenI http://bugs.python.org/issue12069 #12066: Empty ('') xmlns attribute is not properly handled by xml.dom. 
http://bugs.python.org/issue12066 #12063: tokenize module appears to treat unterminated single and doubl http://bugs.python.org/issue12063 #12059: hashlib does not handle missing hash functions correctly http://bugs.python.org/issue12059 #12055: doctest not working on nested functions http://bugs.python.org/issue12055 #12053: Add prefetch() for Buffered IO (experiment) http://bugs.python.org/issue12053 #12045: external shell command executed twice in ctypes.util._get_sona http://bugs.python.org/issue12045 #12043: Update shutil documentation http://bugs.python.org/issue12043 #12037: test_email failures under Windows with the eol extension activ http://bugs.python.org/issue12037 #12034: check_GetFinalPathNameByHandle() suboptimal http://bugs.python.org/issue12034 #12029: ABC registration of Exceptions http://bugs.python.org/issue12029 #12024: 2.6 svn and hg branches are out of sync http://bugs.python.org/issue12024 #12019: Dead or buggy code in importlib.test.__main__ http://bugs.python.org/issue12019 #11992: sys.settrace doesn't disable tracing if a local trace function http://bugs.python.org/issue11992 Most recent 15 issues waiting for review (15) ============================================= #12060: Python doesn't support real time signals http://bugs.python.org/issue12060 #12059: hashlib does not handle missing hash functions correctly http://bugs.python.org/issue12059 #12057: HZ codec has no test http://bugs.python.org/issue12057 #12049: expose RAND_bytes() function of OpenSSL http://bugs.python.org/issue12049 #12040: Expose a Process.sentinel property (and fix polling loop in Pr http://bugs.python.org/issue12040 #12026: Support more of MSI api http://bugs.python.org/issue12026 #12018: No tests for ntpath.samefile, ntpath.sameopenfile http://bugs.python.org/issue12018 #12015: possible characters in temporary file name is too few http://bugs.python.org/issue12015 #12014: str.format parses replacement field incorrectly http://bugs.python.org/issue12014 #12008: 
HtmlParser non-strict goes wrong with unquoted attributes http://bugs.python.org/issue12008 #12004: PyZipFile.writepy gives internal error on syntax errors http://bugs.python.org/issue12004 #12002: ftplib.FTP.abort fails with TypeError on Python 3.x http://bugs.python.org/issue12002 #11999: sporadic failure in test_mailbox http://bugs.python.org/issue11999 #11998: test_signal cannot test blocked signals if _tkinter is loaded; http://bugs.python.org/issue11998 #11996: libpython.py: nicer py-bt output http://bugs.python.org/issue11996 Top 10 most discussed issues (10) ================================= #11948: Tutorial/Modules - small fix to better clarify the modules sea http://bugs.python.org/issue11948 15 msgs #6727: ImportError when package is symlinked on Windows http://bugs.python.org/issue6727 14 msgs #8407: expose signalfd(2) and pthread_sigmask in the signal module http://bugs.python.org/issue8407 14 msgs #11877: Change os.fsync() to support physical backing store syncs http://bugs.python.org/issue11877 14 msgs #12015: possible characters in temporary file name is too few http://bugs.python.org/issue12015 12 msgs #9205: Parent process hanging in multiprocessing if children terminat http://bugs.python.org/issue9205 10 msgs #10666: OS X installer variants have confusing readline differences http://bugs.python.org/issue10666 10 msgs #12057: HZ codec has no test http://bugs.python.org/issue12057 10 msgs #5723: Incomplete json tests http://bugs.python.org/issue5723 9 msgs #12010: Compile fails when sizeof(wchar_t) == 1 http://bugs.python.org/issue12010 9 msgs Issues closed (51) ================== #1195: Problems on Linux with Ctrl-D and Ctrl-C during raw_input http://bugs.python.org/issue1195 closed by haypo #1350: IDLE - CallTips enhancement - show full doc-string in new wind http://bugs.python.org/issue1350 closed by kbk #5154: OSX broken poll testing doesn't work http://bugs.python.org/issue5154 closed by ronaldoussoren #5559: IDLE Output Window 's goto fails 
when path has spaces http://bugs.python.org/issue5559 closed by kbk #8498: Cannot use backlog = 0 for sockets http://bugs.python.org/issue8498 closed by pitrou #8808: imaplib should support SSL contexts http://bugs.python.org/issue8808 closed by pitrou #9971: Optimize BufferedReader.readinto http://bugs.python.org/issue9971 closed by pitrou #10169: socket.sendto raises incorrect exception when passed incorrect http://bugs.python.org/issue10169 closed by ezio.melotti #10419: distutils command build_scripts fails with UnicodeDecodeError http://bugs.python.org/issue10419 closed by python-dev #11072: Add MLSD command support to ftplib http://bugs.python.org/issue11072 closed by giampaolo.rodola #11164: xml shouldn't use _xmlplus http://bugs.python.org/issue11164 closed by python-dev #11347: libpython3.so: Broken soname and linking http://bugs.python.org/issue11347 closed by python-dev #11607: Apllication crashes when saving file http://bugs.python.org/issue11607 closed by ronaldoussoren #11617: Sporadic failure in test_httpservers http://bugs.python.org/issue11617 closed by haypo #11743: Rewrite PipeConnection and Connection in pure Python http://bugs.python.org/issue11743 closed by pitrou #11799: urllib HTTP authentication behavior with unrecognized auth met http://bugs.python.org/issue11799 closed by orsenthil #11888: Add C99's log2() function to the math library http://bugs.python.org/issue11888 closed by haypo #11896: Save on Close fails in IDLE, from Linux system http://bugs.python.org/issue11896 closed by kbk #11910: test_heapq C tests are not skipped when _heapq is missing http://bugs.python.org/issue11910 closed by ezio.melotti #11916: A few errnos from OSX http://bugs.python.org/issue11916 closed by python-dev #11927: SMTP_SSL doesn't use port 465 by default http://bugs.python.org/issue11927 closed by pitrou #11962: Buildbot reliability http://bugs.python.org/issue11962 closed by skrah #11968: wsgiref's wsgi application sample code does not work 
http://bugs.python.org/issue11968 closed by orsenthil #11972: input does not strip a trailing newline correctly on Windows http://bugs.python.org/issue11972 closed by terry.reedy #11994: [2.7/gcc-4.4.3] Segfault under valgrind in string.split() http://bugs.python.org/issue11994 closed by haypo #12001: Extend json.dumps to handle N-triples strings http://bugs.python.org/issue12001 closed by terry.reedy #12011: The signal module should raise OSError for OS-related exceptio http://bugs.python.org/issue12011 closed by haypo #12012: _ssl module doesn't compile with OpenSSL 1.0.0d: SSLv2_method http://bugs.python.org/issue12012 closed by haypo #12013: file /usr/local/lib/python3.1/lib-dynload/_socket.so: symbol i http://bugs.python.org/issue12013 closed by eric.araujo #12017: Decoding a highly-nested object with json (_speedups enabled) http://bugs.python.org/issue12017 closed by ezio.melotti #12023: non causal behavior http://bugs.python.org/issue12023 closed by ezio.melotti #12025: strangely missing separator in "resource" table http://bugs.python.org/issue12025 closed by jcea #12027: Optimize import this (patch to make it 10x faster) http://bugs.python.org/issue12027 closed by rhettinger #12030: Roundup Refused Update with No text/plain http://bugs.python.org/issue12030 closed by benjamin.peterson #12031: subprocess module does not accept file twice http://bugs.python.org/issue12031 closed by neologix #12032: Tools/Scripts/crlv.py needs updating for python 3+ http://bugs.python.org/issue12032 closed by python-dev #12033: AttributeError: 'module' object has no attribute 'scipy' http://bugs.python.org/issue12033 closed by alex #12035: problem with installing validator.nu on windows http://bugs.python.org/issue12035 closed by amaury.forgeotdarc #12036: ConfigParser: Document items() added the vars dictionary to th http://bugs.python.org/issue12036 closed by python-dev #12039: test_logging: bad file descriptor on FreeBSD bot http://bugs.python.org/issue12039 closed by 
vinay.sajip #12041: test_os test_ctypes test_wait3 causes test_wait3 error http://bugs.python.org/issue12041 closed by pitrou #12044: subprocess.Popen.__exit__ doesn't wait for process end http://bugs.python.org/issue12044 closed by brian.curtin #12047: Expand the style guide http://bugs.python.org/issue12047 closed by rhettinger #12051: Segfaults in _json while encoding objects http://bugs.python.org/issue12051 closed by ezio.melotti #12052: round() erroneous for some large arguments http://bugs.python.org/issue12052 closed by mark.dickinson #12054: test_socket: replace custom _get_unused_port() by support.find http://bugs.python.org/issue12054 closed by pitrou #12056: "…" (HORIZONTAL ELLIPSIS) should be an alternative syntax fo http://bugs.python.org/issue12056 closed by benjamin.peterson #12058: Minor edits to comments in faulthandler http://bugs.python.org/issue12058 closed by ezio.melotti #12061: Remove duplicate 'key functions' entry in Glossary http://bugs.python.org/issue12061 closed by georg.brandl #12062: Buffered I/O inconsistent with unbuffered I/O in certain cases http://bugs.python.org/issue12062 closed by pitrou #12064: unexpected behavior with exception variable http://bugs.python.org/issue12064 closed by ezio.melotti From merwok at netwok.org Fri May 13 19:56:28 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 13 May 2011 19:56:28 +0200 Subject: [Python-Dev] Problems with regrtest and with logging In-Reply-To: <loom.20110511T213726-472@post.gmane.org> References: <loom.20110509T193140-280@post.gmane.org> <d1989510eccc69219dc75384faf7be23@netwok.org> <loom.20110511T213726-472@post.gmane.org> Message-ID: <4DCD70CC.7030406@netwok.org> Le 11/05/2011 21:45, Vinay Sajip a écrit : > Éric Araujo <merwok <at> netwok.org> writes: >> I thought that if we set the level on the logger, we would prevent >> third-party code to get some messages.
E.g., we set level to INFO but >> pip uses some packaging functions and would like to get DEBUG messages. > Then pip can set the level of the packaging logger as it wishes, perhaps in > response to command-line arguments for verbosity. It'd be easier for pip to do > that, regardless of which handlers are attached. And pip itself might be being > used, say by virtualenv. It's hard in general to say what the top-level code > will be, and generally that's the code which should set the handlers. Okay. I’ll go ahead and remove handlers (except for the command-line script), and set the level on the logger. If it turns out that the code in packaging incorrectly resets the level set by calling code, we’ll fix it later; now we want to fix the tests to produce the patch that will add packaging to CPython. > The levels set by a library for its loggers are merely defaults. The conflict here is that there’s a class setting the logging level on instantiation, which could reset the level set by calling code. Thanks again for your messages (and blog). From eric at netwok.org Fri May 13 20:02:05 2011 From: eric at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 13 May 2011 20:02:05 +0200 Subject: [Python-Dev] Problems with regrtest and with logging In-Reply-To: <BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com> References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org> <BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com> Message-ID: <4DCD721D.1080108@netwok.org> On Sat, May 7, 2011 at 3:51 AM, Éric Araujo <merwok at netwok.org> wrote: > regrtest helpfully reports when a test leaves the environment unclean > (sys.path, os.environ, logging._handlerList) A quick follow-up: I talked about regrtest with RDM on IRC, and I will use the context manager that detects changes in the “if __name__ == '__main__'” blocks of our test files to find the guilty ones.
Some warnings are subtle to track down: the test runs a command which instantiates a class which calls a function and here’s the code that sets an environment variable. In the future, I’ll take part in the efforts to reimplement parts of regrtest with new unittest features. Right now it’s quite painful to have to use either unittest to run just one file or regrtest to get the warnings. Cheers From sdaoden at googlemail.com Fri May 13 21:49:01 2011 From: sdaoden at googlemail.com (Steffen Daode Nurpmeso) Date: Fri, 13 May 2011 21:49:01 +0200 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <20110513160722.1F2C31CE85@psf.upfronthosting.co.za> References: <20110513160722.1F2C31CE85@psf.upfronthosting.co.za> Message-ID: <20110513194901.GA40824@sherwood.local> The summary mail's part 1 was declared as US-ASCII, 8bit, but it contained a UTF-8 character: > #12056: "…" (HORIZONTAL ELLIPSIS) should be an alternative syntax fo > http://bugs.python.org/issue12056 closed by benjamin.peterson This is handled without any problem by Python 3000 due to David Murray's patch of issue 11605 for 3.2 and 3.3. (It however broke my obviously insufficient non-postman thing :(, and it's of course not a valid mail, strictly speaking. So I report this just in case your stricken MUAs simply do the right thing and no one recognizes it at all.)
May the juice be with you -- Steffen, sdaoden(*)(gmail.com) From ncoghlan at gmail.com Sat May 14 09:00:30 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 14 May 2011 17:00:30 +1000 Subject: [Python-Dev] Python Support on OpenVMS In-Reply-To: <20110513120818.139dca63@pitrou.net> References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net> <20110513120818.139dca63@pitrou.net> Message-ID: <BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com> On Fri, May 13, 2011 at 8:08 PM, Antoine Pitrou <solipsis at pitrou.net> wrote: > Any compilation errors and test suite failures should be reported to > the bug tracker (http://bugs.python.org/), preferably with patches > since I doubt any of us would be able to fix the issues themselves. For ongoing support, it would also be *really* helpful if HP could provide an OpenVMS buildbot. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From peck at us.ibm.com Sat May 14 18:03:10 2011 From: peck at us.ibm.com (Jon K Peck) Date: Sat, 14 May 2011 10:03:10 -0600 Subject: [Python-Dev] AUTO: Jon K Peck is out of the office (returning 05/18/2011) Message-ID: <OFE00C5BAC.4D8A3197-ON87257890.00582E60-87257890.00582E60@us.ibm.com> I am out of the office until 05/18/2011. I am out of the office traveling Wed - Thursday, May 11-12 and Saturday-Tuesday, May 14-17. I will have limited access to email during this time, so I will be delayed in responding. Note: This is an automated response to your message "Python-Dev Digest, Vol 94, Issue 25" sent on 5/14/11 4:00:03. This is the only notification you will receive while this person is away. -------------- next part -------------- An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110514/83d53036/attachment.html> From vinay_sajip at yahoo.co.uk Sun May 15 10:55:13 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 15 May 2011 08:55:13 +0000 (UTC) Subject: [Python-Dev] more timely detection of unbound locals References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com> <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca> <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com> <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca> <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com> <20110510131144.C8D75250041@webabinitio.net> <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com> <iqbu9b$981$1@dough.gmane.org> Message-ID: <loom.20110515T105420-429@post.gmane.org> Terry Reedy <tjreedy <at> udel.edu> writes: > I would change this to > "local name 'bob' used before the assignment that makes it a local name" > > Calling names 'variables' is itself a point of confusion. +1 From senthil at uthcode.com Mon May 16 04:15:03 2011 From: senthil at uthcode.com (Senthil Kumaran) Date: Mon, 16 May 2011 10:15:03 +0800 Subject: [Python-Dev] Python Support on OpenVMS In-Reply-To: <BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com> References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net> <20110513120818.139dca63@pitrou.net> <BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com> Message-ID: <20110516021503.GB2808@kevin> On Sat, May 14, 2011 at 05:00:30PM +1000, Nick Coghlan wrote: > For ongoing support, it would also be *really* helpful if HP could > provide an OpenVMS buildbot. Yes, that would be best first step in the on-going struggle to support OpenVMS platform. The problem in the first place is no one has the hardware to try install python, leaving alone fixing the bugs in that. So, Sandeep, if you can setup a buildbot ( http://python.org/dev/buildbot/) and be the owner of the buildbot, it would be really helpful. 
-- Senthil From martin at v.loewis.de Mon May 16 09:20:41 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 16 May 2011 09:20:41 +0200 Subject: [Python-Dev] Python Support on OpenVMS In-Reply-To: <20110516021503.GB2808@kevin> References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net> <20110513120818.139dca63@pitrou.net> <BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com> <20110516021503.GB2808@kevin> Message-ID: <4DD0D049.9050407@v.loewis.de> Am 16.05.2011 04:15, schrieb Senthil Kumaran: > On Sat, May 14, 2011 at 05:00:30PM +1000, Nick Coghlan wrote: >> For ongoing support, it would also be *really* helpful if HP could >> provide an OpenVMS buildbot. > > Yes, that would be best first step in the on-going struggle to support > OpenVMS platform. I guess the best first step would be to make it compile at all. Then try to make it pass the test suite. This may well take several months to complete. Regards, Martin From ncoghlan at gmail.com Mon May 16 10:04:05 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 16 May 2011 18:04:05 +1000 Subject: [Python-Dev] Python Support on OpenVMS In-Reply-To: <4DD0D049.9050407@v.loewis.de> References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net> <20110513120818.139dca63@pitrou.net> <BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com> <20110516021503.GB2808@kevin> <4DD0D049.9050407@v.loewis.de> Message-ID: <BANLkTi=oG6brdSvOP8fZSQsYA8vPXyHRow@mail.gmail.com> On Mon, May 16, 2011 at 5:20 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote: > Am 16.05.2011 04:15, schrieb Senthil Kumaran: >> On Sat, May 14, 2011 at 05:00:30PM +1000, Nick Coghlan wrote: >>> For ongoing support, it would also be *really* helpful if HP could >>> provide an OpenVMS buildbot. >> >> Yes, that would be best first step in the on-going struggle to support >> OpenVMS platform. > > I guess the best first step would be to make it compile at all.
Then try > to make it pass the test suite. This may well take several months to > complete. And then make sure the buildbot client runs properly. Still, having someone start down that path now (with a green stable buildbot as the target end state) provides a specific goal that any patches can work towards. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From sandeep.mathew at hp.com Mon May 16 10:08:27 2011 From: sandeep.mathew at hp.com (Mathew, Sandeep (OpenVMS)) Date: Mon, 16 May 2011 08:08:27 +0000 Subject: [Python-Dev] Python Support on OpenVMS In-Reply-To: <mailman.53.1305453602.25450.python-dev@python.org> References: <mailman.53.1305453602.25450.python-dev@python.org> Message-ID: <DB140D138DBD2F42A2768791634E108021064E4E7B@GVW1351EXA.americas.hpqcorp.net> Hi All, Thanks for your responses! First thing on my radar is to get buildbot working on OpenVMS. I had a quick glance at source, although buildbot is written purely in python it has many platform specific issues. See: https://github.com/buildbot/buildbot/blob/master/master/README.w32 However I am guessing that it may not be very difficult to resolve. I will be concentrating on Itanium systems initially and will later port it to Alpha in a similar way. I have requested an account in HP's OpenVMS cluster meant for open source development. I will kick off my work after the account has been activated! Regards Sandeep Mathew From solipsis at pitrou.net Mon May 16 15:46:47 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 16 May 2011 15:46:47 +0200 Subject: [Python-Dev] Python Support on OpenVMS References: <mailman.53.1305453602.25450.python-dev@python.org> <DB140D138DBD2F42A2768791634E108021064E4E7B@GVW1351EXA.americas.hpqcorp.net> Message-ID: <20110516154647.4a927e2e@pitrou.net> On Mon, 16 May 2011 08:08:27 +0000 "Mathew, Sandeep (OpenVMS)" <sandeep.mathew at hp.com> wrote: > Hi All, > > Thanks for your responses!
First thing on my radar is to get buildbot working on OpenVMS. > I had a quick glance at source, although buildbot is written purely in python it has many > platform specific issues. See: https://github.com/buildbot/buildbot/blob/master/master/README.w32 I think this file is way out of date. We have Windows buildbots running fine, and I don't think they required a modification of the buildbot software. Furthermore, you only need the buildbot slave, not master. See http://wiki.python.org/moin/BuildBot for more info Regards Antoine. From dsuch at gefira.pl Mon May 16 20:31:45 2011 From: dsuch at gefira.pl (Dariusz Suchojad) Date: Mon, 16 May 2011 20:31:45 +0200 Subject: [Python-Dev] Simple XML-RPC server over SSL/TLS In-Reply-To: <27392.1304016849@parc.com> References: <BANLkTinDGtWZsDPZ37U5_zqw9Aio-CpeXw@mail.gmail.com> <4DB975BB.1040402@netwok.org> <27392.1304016849@parc.com> Message-ID: <4DD16D91.6040805@gefira.pl> Bill Janssen wrote: Hello, >>> But what I would like to know, is if is there any reason why XML-RPC can't >>> optionally work over TLS/SSL using Python's ssl module. I'll create a >>> ticket, and send a patch, but I was wondering if it was a reason why this >>> was not implemented. >> >> I think there's no deeper reason than nobody thought about it. The ssl >> module is new in 2.6 and 3.x, xmlrpc is an older module for an old >> technology *cough*, so feel free to open a bug report. Patch guidelines >> are found at http://docs.python.org/devguide Thanks in advance! > > What he said. I'm not a big fan of XMLRPC in the first place, so I > probably didn't even notice that there wasn't SSL support for it. > > Go for it! 
I know it's been some time but I've only now spotted the thread and just in case it could be helpful to anyone, Spring Python project has implemented the feature last year for Python 2.x http://static.springsource.org/spring-python/1.2.x/sphinx/html/remoting.html#secure-xml-rpc cheers, -- Dariusz Suchojad From digitalxero at gmail.com Tue May 17 01:15:48 2011 From: digitalxero at gmail.com (Dj Gilcrease) Date: Mon, 16 May 2011 19:15:48 -0400 Subject: [Python-Dev] [OT] Server Side Clone mode Message-ID: <BANLkTimBFRTyD3us6pPSb=aXCwEuG4Q2cQ@mail.gmail.com> I was wondering if there was a place I could get the modifications that have been made at hg.python.org to add the Server Side Clone to the hgweb interface. Dj Gilcrease From mhammond at skippinet.com.au Tue May 17 09:38:07 2011 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue, 17 May 2011 17:38:07 +1000 Subject: [Python-Dev] Updated version of PEP-0397 - Python launcher for Windows. Message-ID: <4DD225DF.2060605@skippinet.com.au> Hi all, I've updated PEP-0397 to try and address some of the comments from the last draft. I've checked the new version into hg, so you can find a full diff there, but the key items I've changed are: * Spelled out the "version qualifier" rules for the shebang lines. * Spelled out some customization options, both for fine-tuning the specific Python version selected and for supporting other Python implementations via "custom" commands. * Indicated the launcher is not supported at all on Win2k or earlier. * Removed some cruft. The new version is attached and I welcome all comments, including bike-shedding on the environment variable names and INI section/value names.
Note that the reference implementation has not changed - I'll update that once there is general agreement on the functionality described in the PEP. Thanks, Mark -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep-0397.txt URL: <http://mail.python.org/pipermail/python-dev/attachments/20110517/72c828f8/attachment.txt> From victor.stinner at haypocalc.com Tue May 17 16:01:35 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 17 May 2011 16:01:35 +0200 Subject: [Python-Dev] Success x86 XP-4 2.7 buildbot without any log and should be a failure Message-ID: <201105171601.35813.victor.stinner@haypocalc.com> Hi, I broke recently all tests of CJK encodings (#12057) in Python 2.7 (sorry, it is now fixed). But the "x86 XP-4 2.7" buildbot is green, I don't understand how (the bug was not fixed in the build 894): http://www.python.org/dev/buildbot/all/builders/x86%20XP-4%202.7/builds/894 This build doesn't contain any log. Victor From ziade.tarek at gmail.com Tue May 17 17:36:10 2011 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Tue, 17 May 2011 17:36:10 +0200 Subject: [Python-Dev] "packaging" merge imminent Message-ID: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> Hello I am about to merge packaging in the stdlib, and we will continue our work there :) The impact is: - addition of Lib/packaging - addition of test/test_packaging.py - changes in Lib/sysconfig.py - addition of Lib/sysconfig.cfg For the last one, I would like to make sure again that everyone is ok with having a .cfg file added in the Lib/ directory. If not, we need to discuss how to do this differently. == purpose of sysconfig.cfg == The sysconfig.cfg file is an ini-like file that sysconfig.py reads to get the installation paths. We currently have these paths hardcoded in the python module. The next change I have planned is to allow several levels of configuration, like distutils.cfg does.
sysconfig.py will look for a sysconfig.cfg file in these places: 1. the current working directory -- so can be potentially included in a project source release 2. the user home (specific location be defined, maybe in ~/local) [inherits from the previous one] 3. the global [inherits from the previous one] I have decided to make it a .cfg file instead of a .py file for various reasons: - easier for people to edit, without the danger of ending-up with an over-engineered python module (that's the problem we have with setup.py files) - the override logic is easier to implement and understand: if I want to change a single path, I add an ini file in my home with this single path. If I have no complaints, the merge will happen tomorrow, my time == next moves == - make sysconfig.py stop reading Makefile and pyconfig.h, this will be done by adding a _sysconfig.py file created by the Makefile - continue our work in packaging for 3.3 - we're planning to merge the doc in Doc/packaging very soon (still working on it) Cheers Tarek -- Tarek Ziadé | http://ziade.org From lists at cheimes.de Tue May 17 18:42:59 2011 From: lists at cheimes.de (Christian Heimes) Date: Tue, 17 May 2011 18:42:59 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> Message-ID: <4DD2A593.90203@cheimes.de> Am 17.05.2011 17:36, schrieb Tarek Ziadé: > The next change I have planned is to allow several levels of > configuration, like distutils.cfg does. sysconfig.py will look for a > sysconfig.cfg file in these places: > > 1. the current working directory -- so can be potentially included in > a project source release > 2. the user home (specific location be defined, maybe in ~/local) > [inherits from the previous one] > 3. the global You may want to study my site package PEP [1] regarding possible security implications.
I recommend that you ignore the current working directory and user's home directory under conditions like different effective user or the -E option. A good place for a local sysconfig.cfg could be the user's stdlib directory (e.g. ~/.local/lib/python3.2/sysconfig.cfg). Christian [1] http://www.python.org/dev/peps/pep-0370 From jdunck at gmail.com Tue May 17 19:40:04 2011 From: jdunck at gmail.com (Jeremy Dunck) Date: Tue, 17 May 2011 12:40:04 -0500 Subject: [Python-Dev] Bug in json (the format and the module) Message-ID: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com> This blog post describes a bug in a common usage pattern of JSON: http://timelessrepo.com/json-isnt-a-javascript-subset That is, there are some characters which are legal in JSON serializations, but not in JavaScript strings. This works OK for JSON parsers, but a common use case of JSON is JSONP, where the result of a request is presumed to be executable javascript: <script src="http://someapi.com/jsonp?callback=foo"> might return a response: foo({"some_json":"which might or might not be legal javascript"}) The post also suggests a solution -- to replace literal U+2028 - Line separator and U+2029 - Paragraph separator with their escape sequences \u2028 and \u2029. This is a nice solution in that it makes the JSON valid JS while keeping the same semantics. Of course there's the annoyance of processing the full string, comparable in overhead to utf-8 encoding, I presume. So, to start with, is there a maintainer for the json module, or how should I go about discussing implementing this solution? 
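A minimal sketch of the escaping approach described in the message above, written for Python 3. The helper name dumps_js_safe is hypothetical and is not part of the json module; the sketch assumes that U+2028 and U+2029 can only appear inside string values in the serializer's output, so a plain text replacement on the result is sufficient.

```python
import json

# Hypothetical mapping (an assumption for illustration): the two code points
# named in the message, paired with their JSON escape sequences.
_JS_UNSAFE = {"\u2028": "\\u2028", "\u2029": "\\u2029"}


def dumps_js_safe(obj, **kwargs):
    """Serialize obj with json.dumps, then escape U+2028 and U+2029.

    This is a post-processing step, not a feature of the json module.
    """
    text = json.dumps(obj, **kwargs)
    for char, escape in _JS_UNSAFE.items():
        text = text.replace(char, escape)
    return text


payload = {"JSON": "ro\u2028cks!"}

# With ensure_ascii=False the raw separator is written through unchanged.
raw = json.dumps(payload, ensure_ascii=False)

# The helper replaces it with an escape sequence that JSON parsers decode
# back to the same character.
safe = dumps_js_safe(payload, ensure_ascii=False)
```

The cost is one extra pass over the serialized string per code point, which matches the overhead concern raised in the message. With the default ensure_ascii=True the helper changes nothing, because every non-ASCII character is already escaped.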
From bob at redivi.com Tue May 17 20:18:15 2011 From: bob at redivi.com (Bob Ippolito) Date: Tue, 17 May 2011 12:18:15 -0600 Subject: [Python-Dev] Bug in json (the format and the module) In-Reply-To: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com> References: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com> Message-ID: <BANLkTim+wEMv_kSBqvtBPM=aHYytd7+N+g@mail.gmail.com> By default the json module already escapes anything outside of 7-bit ASCII, so unless you're using ensure_ascii=False then this is a non-issue. I implemented a workaround for ensure_ascii=False in simplejson here, it would be pretty trivial to add this feature to the json module as well: https://github.com/simplejson/simplejson/commit/4989e693bab39b1ce5cf6fc0b21dbacd108c312c On Tue, May 17, 2011 at 11:40 AM, Jeremy Dunck <jdunck at gmail.com> wrote: > This blog post describes a bug in a common usage pattern of JSON: > > http://timelessrepo.com/json-isnt-a-javascript-subset > > That is, there are some characters which are legal in JSON > serializations, but not in JavaScript strings. > > This works OK for JSON parsers, but a common use case of JSON is > JSONP, where the result of a request is presumed to be executable > javascript: > > <script src="http://someapi.com/jsonp?callback=foo"> might return a response: > > foo({"some_json":"which might or might not be legal javascript"}) > > The post also suggests a solution -- to replace literal U+2028 - Line > separator and U+2029 - Paragraph separator with their escape sequences > \u2028 and \u2029. > > This is a nice solution in that it makes the JSON valid JS while > keeping the same semantics. Of course there's the annoyance of > processing the full string, comparable in overhead to utf-8 encoding, > I presume. > > So, to start with, is there a maintainer for the json module, or how > should I go about discussing implementing this solution?
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/bob%40redivi.com > From dirkjan at ochtman.nl Tue May 17 20:21:26 2011 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 17 May 2011 20:21:26 +0200 Subject: [Python-Dev] Bug in json (the format and the module) In-Reply-To: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com> References: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com> Message-ID: <BANLkTinF-c5mNFkU6_1wZXrrSCg8L-B8Nw@mail.gmail.com> On Tue, May 17, 2011 at 19:40, Jeremy Dunck <jdunck at gmail.com> wrote: > So, to start with, is there a maintainer for the json module, or how > should I go about discussing implementing this solution? Your subject states that there is an actual bug in the json module, but your message fails to mention any actual bug. Is this what you mean? Python 2.7.1 (r271:86832, Mar 28 2011, 09:54:04) [GCC 4.4.5] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import json >>> print json.dumps(u'foo\u2028bar') "foo\u2028bar" Cheers, Dirkjan From ronaldoussoren at mac.com Tue May 17 19:21:26 2011 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Tue, 17 May 2011 19:21:26 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> Message-ID: <D3727EF7-AEFC-4335-96FE-53E44A21ACBC@mac.com> On 17 May, 2011, at 17:36, Tarek Ziad? 
wrote: > Hello > > I am about to merge packaging in the stdlib, and we will continue our > work there :) > > The impact is: > > - addition of Lib/packaging > - addition of test/test_packaging.py > - changes in Lib/sysconfig.py > - addition of Lib/sysconfig.cfg > > For the last one, I would like to make sure again that everyone is ok > with having a .cfg file added in the Lib/ directory. If not, we need > to discuss how to do this differently. > > == purpose of sysconfig.cfg == > > The sysconfig.cfg file is an ini-like file that sysconfig.py reads to > get the installation paths. We currently have these paths hardcoded in > the python module. > > The next change I have planned is to allow several levels of > configuration, like distutils.cfg does. sysconfig.py will look for a > sysconfig.cfg file in these places: > > 1. the current working directory -- so can be potentially included in > a project source release Does this mean that python behaves differently when there happens to be a sysconfig.cfg file in the current working directory? That's a potential security risk. > 2. the user home (specific location to be defined, maybe in ~/local) > [inherits from the previous one] How hard would it be to disable this behavior for tools like virtualenv and py2app? Ronald -------------- next part -------------- A non-text attachment was scrubbed...
Name: smime.p7s Type: application/pkcs7-signature Size: 2224 bytes Desc: not available URL: <http://mail.python.org/pipermail/python-dev/attachments/20110517/9caee69c/attachment.bin> From jdunck at gmail.com Tue May 17 20:48:44 2011 From: jdunck at gmail.com (Jeremy Dunck) Date: Tue, 17 May 2011 13:48:44 -0500 Subject: [Python-Dev] Bug in json (the format and the module) In-Reply-To: <BANLkTinF-c5mNFkU6_1wZXrrSCg8L-B8Nw@mail.gmail.com> References: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com> <BANLkTinF-c5mNFkU6_1wZXrrSCg8L-B8Nw@mail.gmail.com> Message-ID: <BANLkTi=W3Ep_O8E+r9bYtdLpVpfeFffZPg@mail.gmail.com> On Tue, May 17, 2011 at 1:21 PM, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote: > On Tue, May 17, 2011 at 19:40, Jeremy Dunck <jdunck at gmail.com> wrote: >> So, to start with, is there a maintainer for the json module, or how >> should I go about discussing implementing this solution? > > Your subject states that there is an actual bug in the json module, > but your message fails to mention any actual bug. Is this what you > mean? > > Python 2.7.1 (r271:86832, Mar 28 2011, 09:54:04) > [GCC 4.4.5] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import json >>>> print json.dumps(u'foo\u2028bar') > "foo\u2028bar" Actually, that would be fine, and Bob's right that this is a non-issue with ensure_ascii=True (the default). His change upstream seems good for the ensure_ascii=False case. 
To be complete, this is what I meant: >>> s = '{"JSON":"ro cks!"}' # this string has a literal U+2028 in it >>> s '{"JSON":"ro\xe2\x80\xa8cks!"}' >>> import json >>> json.dumps(s) # fine by default '"{\\"JSON\\":\\"ro\\u2028cks!\\"}"' >>> json.dumps(s, ensure_ascii=False) # not fine with ensure_ascii=False '"{\\"JSON\\":\\"ro\xe2\x80\xa8cks!\\"}"' From georg at python.org Tue May 17 20:50:37 2011 From: georg at python.org (Georg Brandl) Date: Tue, 17 May 2011 20:50:37 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 Message-ID: <4DD2C37D.7000008@python.org> On behalf of the Python development team, I am pleased to announce the first release candidate of Python 3.2.1. Python 3.2.1 will be the first bugfix release for Python 3.2, fixing over 120 bugs and regressions in Python 3.2. For an extensive list of changes and features in the 3.2 line, see http://docs.python.org/3.2/whatsnew/3.2.html To download Python 3.2.1 visit: http://www.python.org/download/releases/3.2.1/ This is a testing release: Please consider trying Python 3.2.1 with your code and reporting any bugs you may notice to: http://bugs.python.org/ Enjoy! -- Georg Brandl, Release Manager georg at python.org (on behalf of the entire python-dev team and 3.2's contributors) From victor.stinner at haypocalc.com Tue May 17 22:40:35 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 17 May 2011 22:40:35 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> Message-ID: <1305664835.29701.2.camel@marge> On Tuesday, 17 May 2011 at 17:36 +0200, Tarek Ziadé wrote: > - addition of Lib/packaging > - addition of test/test_packaging.py > - changes in Lib/sysconfig.py > - addition of Lib/sysconfig.cfg Does setup.py continue to use the "old" distutils module? I fixed recently some bugs in distutils.
Should I also fix them in the packaging module, or are both modules already "synchronized"? Victor From ziade.tarek at gmail.com Tue May 17 23:20:23 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Tue, 17 May 2011 23:20:23 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <D3727EF7-AEFC-4335-96FE-53E44A21ACBC@mac.com> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> <D3727EF7-AEFC-4335-96FE-53E44A21ACBC@mac.com> Message-ID: <BANLkTikCo4_oYfQCPAwv1M9zM2tUAmJpzg@mail.gmail.com> On Tue, May 17, 2011 at 7:21 PM, Ronald Oussoren <ronaldoussoren at mac.com> wrote: ... >> 1. the current working directory -- so can be potentially included in >> a project source release > > Does this mean that python behaves differently when there happens to be a sysconfig.cfg file in the current working directory? That's a potential security risk. The use case is to have it there at install time so packaging can have alternative locations if needed. We could also drop the working dir scanning and have it: 1- passed explicitly to the pysetup script via an option. 2- used only if found in a root of a project at installation time, and only then > >> 2. the user home (specific location to be defined, maybe in ~/local) >> [inherits from the previous one] > > How hard would it be to disable this behavior for tools like virtualenv and py2app? Not hard at all, just an option. And the goal is also to allow virtualenv to have its own copy, like it does for distutils.cfg > > Ronald -- Tarek Ziadé
| http://ziade.org From ziade.tarek at gmail.com Tue May 17 23:23:32 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Tue, 17 May 2011 23:23:32 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <1305664835.29701.2.camel@marge> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> <1305664835.29701.2.camel@marge> Message-ID: <BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg@mail.gmail.com> On Tue, May 17, 2011 at 10:40 PM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > On Tuesday, 17 May 2011 at 17:36 +0200, Tarek Ziadé wrote: >> - addition of Lib/packaging >> - addition of test/test_packaging.py >> - changes in Lib/sysconfig.py >> - addition of Lib/sysconfig.cfg > > Does setup.py continue to use the "old" distutils module? Yes. The plan is to keep distutils support, so projects with setup.py still work. For the new packaging, people will have to provide new sections in setup.cfg. The pysetup script will detect at installation time if the project has the required bits in the cfg, and if not will fall back to executing setup.py > I fixed recently some bugs in distutils. Should I also fix them in the > packaging module, or are both modules already "synchronized"? They need to be backported, yes. We did some, but we'll need to double check distutils timeline to make sure it's synced > > Victor > > -- Tarek Ziadé
| http://ziade.org From ziade.tarek at gmail.com Tue May 17 23:25:38 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Tue, 17 May 2011 23:25:38 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <4DD2A593.90203@cheimes.de> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> <4DD2A593.90203@cheimes.de> Message-ID: <BANLkTingN8a1my-e5O6soXY9b9uEM2eu_A@mail.gmail.com> On Tue, May 17, 2011 at 6:42 PM, Christian Heimes <lists at cheimes.de> wrote: > On 17.05.2011 17:36, Tarek Ziadé wrote: >> The next change I have planned is to allow several levels of >> configuration, like distutils.cfg does. sysconfig.py will look for a >> sysconfig.cfg file in these places: >> >> 1. the current working directory -- so can be potentially included in >> a project source release >> 2. the user home (specific location to be defined, maybe in ~/local) >> [inherits from the previous one] >> 3. the global > > You may want to study my site package PEP [1] regarding possible > security implications. I recommend that you ignore the current working > directory and user's home directory under conditions like different > effective user or the -E option. Sounds good, thanks > A good place for a local sysconfig.cfg could be the user's stdlib > directory (e.g. ~/.local/lib/python3.2/sysconfig.cfg). Yes, so, part of the incoming packaging work will be to relocate all user .cfg files in the same, python-specific place. That includes pydistutils.cfg, and pypirc. I remember we did talk about that a few months ago, and will restart this discussion asap > Christian > > [1] http://www.python.org/dev/peps/pep-0370 > -- Tarek Ziadé
| http://ziade.org From ethan at stoneleaf.us Wed May 18 00:27:45 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 17 May 2011 15:27:45 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> Message-ID: <4DD2F661.2050005@stoneleaf.us> The bytes type in Python 3 does not feel very consistent. For example: --> some_var = 'abcdef' --> some_var 'abcdef' --> some_var[3] 'd' --> some_other_var = b'abcdef' --> some_other_var b'abcdef' --> some_other_var[3] 100 On the one hand we have the 'bytes are ascii data' type interface, and on the other we have the 'bytes are a list of integers between 0 - 256' interface. And trying to use the two is not intuitive: --> some_other_var[3] == b'd' False When I'm parsing a .dbf file and extracting field types from the byte stream, I'm not thinking, "okay, 67 is a Character field" -- what I'm thinking is, "b'C' is a Character field". Considering that ord() still works fine, I'm not sure why it was done this way. Is there code out there that is using this "list of int's" interface, or is there time to make changes to bytes? ~Ethan~ From benjamin at python.org Wed May 18 01:04:51 2011 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 17 May 2011 18:04:51 -0500 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD2F661.2050005@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> Message-ID: <BANLkTi=B4EmqRyfUMBFzQVLgWdEY9v7RZw@mail.gmail.com> 2011/5/17 Ethan Furman <ethan at stoneleaf.us>: > Considering that ord() still works fine, I'm not sure why it was done this > way. 
I agree that this change was unfortunate and not too useful in practice. > > Is there code out there that is using this "list of int's" interface, or is > there time to make changes to bytes? I don't doubt there is, and I'm afraid it's far too late to change this. -- Regards, Benjamin From raymond.hettinger at gmail.com Wed May 18 01:05:00 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 17 May 2011 18:05:00 -0500 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD2F661.2050005@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> Message-ID: <F0C153BA-29BB-45D0-88F3-0EE5C21D1B19@gmail.com> On May 17, 2011, at 5:27 PM, Ethan Furman wrote: > The bytes type in Python 3 does not feel very consistent. > > For example: > > --> some_var = 'abcdef' > --> some_var > 'abcdef' > --> some_var[3] > 'd' > --> some_other_var = b'abcdef' > --> some_other_var > b'abcdef' > --> some_other_var[3] > 100 > > > On the one hand we have the 'bytes are ascii data' type interface, This is incidental. Bytes can and often do contain data with non-ascii encoded text, plain binary data, or structs, or raw data read off a disk, etc. > and on the other we have the 'bytes are a list of integers between 0 - 256' interface. And trying to use the two is not intuitive: > > --> some_other_var[3] == b'd' > False > > When I'm parsing a .dbf file and extracting field types from the byte stream, I'm not thinking, "okay, 67 is a Character field" -- what I'm thinking is, "b'C' is a Character field". > > Considering that ord() still works fine, I'm not sure why it was done this way. > > Is there code out there that is using this "list of int's" interface, Yes. > or is there time to make changes to bytes? No.
Raymond From nad at acm.org Wed May 18 01:25:16 2011 From: nad at acm.org (Ned Deily) Date: Tue, 17 May 2011 16:25:16 -0700 Subject: [Python-Dev] "packaging" merge imminent References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> <1305664835.29701.2.camel@marge> <BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg@mail.gmail.com> Message-ID: <nad-FF73E4.16251617052011@news.gmane.org> In article <BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg at mail.gmail.com>, Tarek Ziadé <ziade.tarek at gmail.com> wrote: > On Tue, May 17, 2011 at 10:40 PM, Victor Stinner > <victor.stinner at haypocalc.com> wrote: > > On Tuesday, 17 May 2011 at 17:36 +0200, Tarek Ziadé wrote: > >> - addition of Lib/packaging > >> - addition of test/test_packaging.py > >> - changes in Lib/sysconfig.py > >> - addition of Lib/sysconfig.cfg > > > > Does setup.py continue to use the "old" distutils module? > > Yes. The plan is to keep distutils support, so projects with setup.py > still work. Just to be clear: what about for the build of the interpreter itself, i.e. its setup.py for the standard library extension modules? Will the existing distutils code continue to be used for that? Or is it being replaced by code in packaging? If so, have Python builds been tested yet on the various platforms? -- Ned Deily, nad at acm.org From ziade.tarek at gmail.com Wed May 18 01:37:18 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Wed, 18 May 2011 01:37:18 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <nad-FF73E4.16251617052011@news.gmane.org> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> <1305664835.29701.2.camel@marge> <BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg@mail.gmail.com> <nad-FF73E4.16251617052011@news.gmane.org> Message-ID: <BANLkTikhNDSPPdeOce5KF18G+vbjcwqwEw@mail.gmail.com> On Wed, May 18, 2011 at 1:25 AM, Ned Deily <nad at acm.org> wrote: ... > Just to be clear: what about for the build of the interpreter itself, > i.e.
its setup.py for the standard library extension modules? Will the > existing distutils code continue to be used for that? Or is it being > replaced by code in packaging? If so, have Python builds been tested > yet on the various platforms? It will remain distutils-based for now. Moving it to packaging is not our top priority. Cheers Tarek -- Tarek Ziadé | http://ziade.org From nad at acm.org Wed May 18 01:53:56 2011 From: nad at acm.org (Ned Deily) Date: Tue, 17 May 2011 16:53:56 -0700 Subject: [Python-Dev] "packaging" merge imminent References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> <1305664835.29701.2.camel@marge> <BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg@mail.gmail.com> <nad-FF73E4.16251617052011@news.gmane.org> <BANLkTikhNDSPPdeOce5KF18G+vbjcwqwEw@mail.gmail.com> Message-ID: <nad-86F423.16535617052011@news.gmane.org> In article <BANLkTikhNDSPPdeOce5KF18G+vbjcwqwEw at mail.gmail.com>, Tarek Ziadé <ziade.tarek at gmail.com> wrote: > On Wed, May 18, 2011 at 1:25 AM, Ned Deily <nad at acm.org> wrote: > > Just to be clear: what about for the build of the interpreter itself, > > i.e. its setup.py for the standard library extension modules? Will the > > existing distutils code continue to be used for that? [...] > It will remain distutils-based for now. Moving it to packaging is not > our top priority. +1. Thanks!
-- Ned Deily, nad at acm.org From ncoghlan at gmail.com Wed May 18 05:13:32 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 May 2011 13:13:32 +1000 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD2F661.2050005@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> Message-ID: <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> On Wed, May 18, 2011 at 8:27 AM, Ethan Furman <ethan at stoneleaf.us> wrote: > On the one hand we have the 'bytes are ascii data' type interface, and on > the other we have the 'bytes are a list of integers between 0 - 256' > interface. No. Bytes are a list of integers between 0-256. End of story. Using them to represent text as well was precisely the problem with 2.x 8-bit strings, since the boundaries got blurred. However, as a matter of practicality, many byte-oriented protocols use ASCII to make elements of the protocol readable by humans. The "text-like" elements of the bytes and bytearray types are a concession to the existence of those protocols. However, that doesn't make them text - they're still binary data streams. If you want to treat them as text, convert them to "str" objects first (e.g. that's what urllib.parse does internally in order to operate on bytes and bytearray instances). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From robertc at robertcollins.net Wed May 18 05:23:07 2011 From: robertc at robertcollins.net (Robert Collins) Date: Wed, 18 May 2011 15:23:07 +1200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> Message-ID: <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> On Wed, May 18, 2011 at 3:13 PM, Nick Coghlan <ncoghlan at gmail.com> wrote: > On Wed, May 18, 2011 at 8:27 AM, Ethan Furman <ethan at stoneleaf.us> wrote: >> On the one hand we have the 'bytes are ascii data' type interface, and on >> the other we have the 'bytes are a list of integers between 0 - 256' >> interface. > > No. Bytes are a list of integers between 0-256. End of story. Using > them to represent text as well was precisely the problem with 2.x > 8-bit strings, since the boundaries got blurred. > > However, as a matter of practicality, many byte-oriented protocols use > ASCII to make elements of the protocol readable by humans. The > "text-like" elements of the bytes and bytearray types are a concession > to the existence of those protocols. However, that doesn't make them > text - they're still binary data streams. If you want to treat them as > text, convert them to "str" objects first (e.g. that's what > urllib.parse does internally in order to operate on bytes and > bytearray instances). This is not a useful argument - it's an implementation choice in Python 3, and urlparse converting bytes to 'str' to operate on them is at best a kludge - you're forcing 5 times the storage (the original bytes + 4 bytes-per-byte when it's decoded into unicode) to work on something which is defined as a BNF * that uses ascii *.
The Python 2 confusion was deplorable, but it doesn't make the Python 3 situation better: it's different, but still very awkward for people to write code that is correct and fast in. It's probably too late to change, but please don't try to argue that it's correct: the continued confusion of folk running into this is evidence that confusion *is happening*. Treat that as evidence and think about how to fix it going forward. _Rob From ncoghlan at gmail.com Wed May 18 05:40:14 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 May 2011 13:40:14 +1000 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> Message-ID: <BANLkTinBiOKZiTknmV5+jxMGbwG39E4Uuw@mail.gmail.com> On Wed, May 18, 2011 at 1:23 PM, Robert Collins <robertc at robertcollins.net> wrote: > The Python 2 confusion was deplorable, but it doesn't make the Python > 3 situation better: it's different, but still very awkward for people > to write code that is correct and fast in. When Python 3 goes wrong, it raises exceptions or executes the wrong control flow. That's a vast improvement over silently corrupting the data stream the way that 2.x does. If it really bothers anyone, they should feel free to implement and promote their own "ascii" data type on PyPI. If it is explicitly restricted to 7 bit characters, it may even avoid many of the problems of silent corruption that the 2.x str had. Speculation on python-dev isn't going to be convincing here, though: only code in real use will be effective on that front.
As far as the memory and runtime overhead goes, yes, that's a real problem (indeed, that overhead is *why* bytes and bytearray have as many str-like features as they do). PEP 393 is intended to at least alleviate the memory burden of the Unicode text. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From greg.ewing at canterbury.ac.nz Wed May 18 07:39:40 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 May 2011 17:39:40 +1200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD2F661.2050005@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> Message-ID: <4DD35B9C.3030702@canterbury.ac.nz> Ethan Furman wrote: > On the one hand we have the 'bytes are ascii data' type interface, and > on the other we have the 'bytes are a list of integers between 0 - 256' > interface. I think the weird part is that there exists a literal for writing a byte array as an ascii string, and furthermore that it's the *only* kind of literal available for bytes. Personally I think that the default literal syntax for bytes, and also the form produced by repr(), should have been something more neutral, such as hex, with the ascii form available for use when it makes sense. Currently if you want to write a bytes literal in hex, you have to say something like some_var = b'\xde\xad\xbe\xef' which is ugly and unreadable. Much nicer would be some_var = x'deadbeef' As for > --> some_other_var[3] == b'd' there ought to be a literal for specifying an integer using an ascii character, so you could say something like if some_other_var[3] == c'd': which would be equivalent to if some_other_var[3] == ord(b'd') but without the overhead of computing the value each time at run time.
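For readers following the thread, a small sketch of what existing Python 3 already offers for the cases discussed above, without the proposed x'...' or c'...' literals. The names FIELD_CHARACTER and is_character_field, and the sample record bytes, are hypothetical and only illustrate the .dbf example from earlier in the thread; they are not taken from any real parser.

```python
data = b"abcdef"

# Indexing a bytes object yields an int, while slicing yields bytes,
# so a one-byte slice can be compared against a bytes literal directly.
code = data[3]          # the integer value of b"d"
one_byte = data[3:4]    # a bytes object of length one

# bytes.fromhex gives a readable way to spell binary data in hex.
magic = bytes.fromhex("deadbeef")

# Hypothetical constant: computing ord() once at import time avoids
# repeating the call on every comparison.
FIELD_CHARACTER = ord(b"C")


def is_character_field(record: bytes, offset: int) -> bool:
    """Return True if the byte at offset marks a Character field.

    The record layout here is an assumption made for illustration.
    """
    return record[offset] == FIELD_CHARACTER
```

Comparing data[3] with b'd' still evaluates to False, as shown earlier in the thread, because an int and a bytes object never compare equal; the slice form or a precomputed ord() value sidesteps that.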
-- Greg From greg.ewing at canterbury.ac.nz Wed May 18 07:43:37 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 May 2011 17:43:37 +1200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> Message-ID: <4DD35C89.2030807@canterbury.ac.nz> Robert Collins wrote: > urlparse converting bytes to 'str' to operate on them is > at best a kludge - you're forcing 5 times the storage (the original > bytes + 4 bytes-per-byte when its decoded into unicode) That is itself an implementation detail of current Python, though, due to it only having one internal representation of unicode. In principle there could be a form of str that keeps its data encoded in latin1, in which case constructing it from a byte string could simply involve storing a pointer to the original bytes data. -- Greg From v+python at g.nevcal.com Wed May 18 07:46:34 2011 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 17 May 2011 22:46:34 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD35B9C.3030702@canterbury.ac.nz> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> Message-ID: <4DD35D3A.6050004@g.nevcal.com> On 5/17/2011 10:39 PM, Greg Ewing wrote: > Personally I think that the default literal syntax for > bytes, and also the form produced by repr(), should have > been something more neutral, such as hex, with the ascii > form available for use when it makes sense. 
> Much nicer would be > > some_var = x'deadbeef' > > As for > >> --> some_other_var[3] == b'd' > > there ought to be a literal for specifying an integer > using an ascii character, so you could say something like > > if some_other_var[3] == c'd': > > which would be equivalent to > > if some_other_var[3] == ord(b'd') > > but without the overhead of computing the value each time > at run time. +1 Seems this could be added compatibly? -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-dev/attachments/20110517/17ceb88f/attachment.html> From chris at simplistix.co.uk Wed May 18 07:51:43 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Wed, 18 May 2011 06:51:43 +0100 Subject: [Python-Dev] how do you find out what version of Python a PEP landed in? Message-ID: <4DD35E6F.8030901@simplistix.co.uk> Hi All, A friend of mine is coming over to Python and asked a question I thought would have a better answer than it appears to: How do I know which version of Python a PEP lands in? I was expecting there to be a note at the bottom of the PEP, 342 in this case, but that doesn't appear to be the case. What is the policy on this? Where should we be looking? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From amauryfa at gmail.com Wed May 18 08:00:08 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 18 May 2011 08:00:08 +0200 Subject: [Python-Dev] how do you find out what version of Python a PEP landed in? In-Reply-To: <4DD35E6F.8030901@simplistix.co.uk> References: <4DD35E6F.8030901@simplistix.co.uk> Message-ID: <BANLkTi=bpk98TXuE2T-dPtsJ6wGKbQOzQA@mail.gmail.com> Hi, 2011/5/18 Chris Withers <chris at simplistix.co.uk>: > A friend of mine is coming over to Python and asked a question I thought > would have a better answer than it appears to: > > How do I know which version of Python a PEP lands in? 
> > I was expecting there to be a note at the bottom of the PEP, 342 in this > case, but that doesn't appear to be the case. > > What is the policy on this? Where should we be looking? Normally PEPs are important enough to be mentioned in the "whatsnew" document of each release. Googling for "what's new pep 342" suggests that it was released with Python 2.5. Now, an "official" way to get this information would probably be better... -- Amaury Forgeot d'Arc From techtonik at gmail.com Wed May 18 08:07:19 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 18 May 2011 09:07:19 +0300 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD2C37D.7000008@python.org> References: <4DD2C37D.7000008@python.org> Message-ID: <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> That's great, but where is the list of changes? -- anatoly t. On Tue, May 17, 2011 at 9:50 PM, Georg Brandl <georg at python.org> wrote: > On behalf of the Python development team, I am pleased to announce the > first release candidate of Python 3.2.1. > > Python 3.2.1 will be the first bugfix release for Python 3.2, fixing over 120 > bugs and regressions in Python 3.2. > > For an extensive list of changes and features in the 3.2 line, see > > http://docs.python.org/3.2/whatsnew/3.2.html > > To download Python 3.2.1 visit: > > http://www.python.org/download/releases/3.2.1/ > > This is a testing release: Please consider trying Python 3.2.1 with your code > and reporting any bugs you may notice to: > > http://bugs.python.org/ > > > Enjoy!
> > -- > Georg Brandl, Release Manager > georg at python.org > (on behalf of the entire python-dev team and 3.2's contributors) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/techtonik%40gmail.com > From amauryfa at gmail.com Wed May 18 08:18:06 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 18 May 2011 08:18:06 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> Message-ID: <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> Hi, 2011/5/18 anatoly techtonik <techtonik at gmail.com>: > That's great, but where is the list if changes? All changes are always listed in the Misc/NEWS file. A "Change log" link on every download page displays this file. -- Amaury Forgeot d'Arc From martin at v.loewis.de Wed May 18 08:24:29 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 08:24:29 +0200 Subject: [Python-Dev] how do you find out what version of Python a PEP landed in? In-Reply-To: <4DD35E6F.8030901@simplistix.co.uk> References: <4DD35E6F.8030901@simplistix.co.uk> Message-ID: <4DD3661D.30908@v.loewis.de> > How do I know which version of Python a PEP lands in? You should look at the Python-Version header of the PEP. 
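A minimal sketch of reading that header programmatically, assuming the PEP text opens with RFC 822-style headers. The sample preamble and the helper name `pep_python_version` are illustrative only and are not taken from any real PEP or tool.

```python
from email.parser import HeaderParser

# Hypothetical PEP preamble, made up for illustration only.
SAMPLE_PEP = """\
PEP: 9999
Title: An Example Proposal
Status: Final
Type: Standards Track
Python-Version: 2.5

Abstract
========
"""


def pep_python_version(pep_text):
    """Return the Python-Version header value, or None if it is absent."""
    # headersonly=True stops parsing at the first blank line.
    headers = HeaderParser().parsestr(pep_text, headersonly=True)
    return headers.get("Python-Version")


print(pep_python_version(SAMPLE_PEP))
print(pep_python_version("PEP: 9998\nTitle: No version header\n\nBody\n"))
```

The second call returns None, which is the case a caller has to handle when a PEP does not carry the header.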
Regards, Martin From g.brandl at gmx.net Wed May 18 08:31:28 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 18 May 2011 08:31:28 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD35B9C.3030702@canterbury.ac.nz> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> Message-ID: <iqvp43$et3$1@dough.gmane.org> On 18.05.2011 07:39, Greg Ewing wrote: > Ethan Furman wrote: > >> On the one hand we have the 'bytes are ascii data' type interface, and >> on the other we have the 'bytes are a list of integers between 0 - 256' >> interface. > > I think the weird part is that there exists a literal for > writing a byte array as an ascii string, and furthermore > that it's the *only* kind of literal available for bytes. > > Personally I think that the default literal syntax for > bytes, and also the form produced by repr(), should have > been something more neutral, such as hex, with the ascii > form available for use when it makes sense. Currently if > you want to write a bytes literal in hex, you have to > say something like > > some_var = b'\xde\xad\xbe\xef' > > which is ugly and unreadable. 
> Much nicer would be > > some_var = x'deadbeef' We do have bytes.fromhex('deadbeef') Georg From martin at v.loewis.de Wed May 18 08:32:17 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 08:32:17 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD2F661.2050005@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> Message-ID: <4DD367F1.2050306@v.loewis.de> > Is there code out there that is using this "list of int's" interface Just in case this isn't clear yet: yes, certainly. Any non-trivial piece of Python 3 code that has been written already (and there is some) will have run into that issue. Regards, Martin From martin at v.loewis.de Wed May 18 08:34:07 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 08:34:07 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> Message-ID: <4DD3685F.1040503@v.loewis.de> >> That's great, but where is the list of changes? > > All changes are always listed in the Misc/NEWS file. > A "Change log" link on every download page displays this file. I think it would be good if the release announcement made some summary statement, though, like "NNN bugs have been fixed, in MMM modules; see NEWS for details", or some such. Regards, Martin From amauryfa at gmail.com Wed May 18 08:38:17 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 18 May 2011 08:38:17 +0200 Subject: [Python-Dev] how do you find out what version of Python a PEP landed in?
In-Reply-To: <4DD3661D.30908@v.loewis.de> References: <4DD35E6F.8030901@simplistix.co.uk> <4DD3661D.30908@v.loewis.de> Message-ID: <BANLkTiny++ziwc_+UsOfWfiUD-dxM7KZxA@mail.gmail.com> 2011/5/18 "Martin v. Löwis" <martin at v.loewis.de>: >> How do I know which version of Python a PEP lands in? > > You should look at the Python-Version header of the PEP. But some PEPs don't have it: 341, 342, 343, 353... -- Amaury Forgeot d'Arc From ncoghlan at gmail.com Wed May 18 08:39:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 May 2011 16:39:54 +1000 Subject: [Python-Dev] how do you find out what version of Python a PEP landed in? In-Reply-To: <4DD3661D.30908@v.loewis.de> References: <4DD35E6F.8030901@simplistix.co.uk> <4DD3661D.30908@v.loewis.de> Message-ID: <BANLkTi=_uW6mo_H=i8YhEve8yexdK8z6mQ@mail.gmail.com> On Wed, May 18, 2011 at 4:24 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote: >> How do I know which version of Python a PEP lands in? > > You should look at the Python-Version header of the PEP. Which is unfortunately missing from some PEPs (including PEP 342). PEP 344 shows where this information *should* be, though. Cheers, Nick. -- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia From g.brandl at gmx.net Wed May 18 09:57:45 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 18 May 2011 09:57:45 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD3685F.1040503@v.loewis.de> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> Message-ID: <iqvu5t$82s$1@dough.gmane.org> On 18.05.2011 08:34, "Martin v. Löwis" wrote: >>> That's great, but where is the list of changes? >> >> All changes are always listed in the Misc/NEWS file. >> A "Change log" link on every download page displays this file.
> > I think it would be good if the release announcement made some > summary statement, though, like "NNN bugs have been fixed, in MMM > modules; see NEWS for details", or some such. It does say "over NNN bugs have been fixed", not sure if the MMM modules add anything of value. I agree that a link to the NEWS file should be present though. Georg From techtonik at gmail.com Wed May 18 09:58:11 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 18 May 2011 10:58:11 +0300 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> Message-ID: <BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com> On Wed, May 18, 2011 at 9:18 AM, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote: > Hi, > > 2011/5/18 anatoly techtonik <techtonik at gmail.com>: >> That's great, but where is the list if changes? > > All changes are always listed in the Misc/NEWS file. > A "Change log" link on every download page displays this file. I actually followed http://docs.python.org/3.2/whatsnew/3.2.html to Misc/NEWS, but it doesn't contain any references of 3.2.1 -- anatoly t. From techtonik at gmail.com Wed May 18 11:25:51 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 18 May 2011 12:25:51 +0300 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD3685F.1040503@v.loewis.de> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> Message-ID: <BANLkTikVJy64DqYqwD5xSJgJRJ7aU19+cg@mail.gmail.com> On Wed, May 18, 2011 at 9:34 AM, "Martin v. L?wis" <martin at v.loewis.de> wrote: >>> That's great, but where is the list if changes? >> >> All changes are always listed in the Misc/NEWS file. 
>> A "Change log" link on every download page displays this file. > > I think it would be good if the release announcement made some > summary statement, though, like "NNN bugs have been fixed, in MMM > modules; see NEWS for details", or some such. That's a good idea. But for such kind of query Roundup should be module aware [1,2]. I'd say if Jesse could make a competition on best announcement format - we could easily see what information we tend to skip while preparing the releases (and improve NEWS format [3]). [1] http://code.google.com/p/pydotorg/issues/detail?id=8 [2] http://psf.upfronthosting.co.za/roundup/meta/issue373 [3] https://convore.com/the-changelog/the-best-changelog/ -- anatoly t. From ncoghlan at gmail.com Wed May 18 12:40:09 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 May 2011 20:40:09 +1000 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com> Message-ID: <BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com> On Wed, May 18, 2011 at 5:58 PM, anatoly techtonik <techtonik at gmail.com> wrote: > On Wed, May 18, 2011 at 9:18 AM, Amaury Forgeot d'Arc > <amauryfa at gmail.com> wrote: >> Hi, >> >> 2011/5/18 anatoly techtonik <techtonik at gmail.com>: >>> That's great, but where is the list if changes? >> >> All changes are always listed in the Misc/NEWS file. >> A "Change log" link on every download page displays this file. > > I actually followed http://docs.python.org/3.2/whatsnew/3.2.html to > Misc/NEWS, but it doesn't contain any references of 3.2.1 What's New and Misc/NEWS are not the same thing. Misc/NEWS is the second info link on the download page ("Change log for this release"). 
(In this case, it lands at http://hg.python.org/releasing/3.2.1/file/v3.2.1rc1/Misc/NEWS) Agreed that What's New isn't a hugely useful thing to link from a point release announcement, though. It sounds like Georg is going to change that for the actual release. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Wed May 18 12:49:08 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 May 2011 20:49:08 +1000 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <iqvu5t$82s$1@dough.gmane.org> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> Message-ID: <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> On Wed, May 18, 2011 at 5:57 PM, Georg Brandl <g.brandl at gmx.net> wrote: > On 18.05.2011 08:34, "Martin v. L?wis" wrote: >>>> That's great, but where is the list if changes? >>> >>> All changes are always listed in the Misc/NEWS file. >>> A "Change log" link on every download page displays this file. >> >> I think it would be good if the release announcement made some >> summary statement, though, like "NNN bugs have been fixed, in MMM >> modules; see NEWS for details", or some such. > > It does say "over NNN bugs have been fixed", not sure if the MMM modules > add anything of value. > > I agree that a link to the NEWS file should be present though. Wishlist item: How hard would it be to run a ReST parser over Misc/NEWS and create a HTML version for inclusion in the release pages? (Bonus points if it steals the issue reference linkification code from the tracker...) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From techtonik at gmail.com Wed May 18 12:50:03 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 18 May 2011 13:50:03 +0300 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com> <BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com> Message-ID: <BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com> On Wed, May 18, 2011 at 1:40 PM, Nick Coghlan <ncoghlan at gmail.com> wrote: > On Wed, May 18, 2011 at 5:58 PM, anatoly techtonik <techtonik at gmail.com> wrote: >> On Wed, May 18, 2011 at 9:18 AM, Amaury Forgeot d'Arc >> <amauryfa at gmail.com> wrote: >>> Hi, >>> >>> 2011/5/18 anatoly techtonik <techtonik at gmail.com>: >>>> That's great, but where is the list if changes? >>> >>> All changes are always listed in the Misc/NEWS file. >>> A "Change log" link on every download page displays this file. >> >> I actually followed http://docs.python.org/3.2/whatsnew/3.2.html to >> Misc/NEWS, but it doesn't contain any references of 3.2.1 > > What's New and Misc/NEWS are not the same thing. I believe you misunderstood. If you follow what's new link above, you will see a link to Misc/NEWS, but this one leads to http://hg.python.org/cpython/file/default/Misc/NEWS where no references to 3.2.1 are available. > Agreed that What's New isn't a hugely useful thing to link from a > point release announcement, though. It sounds like Georg is going to > change that for the actual release. There is nothing bad in linking to major release notes (i.e. What's New). IIRC, Mozilla does that for their minor releases, but they explicitly mention changes since last minor release. -- anatoly t. 
From g.brandl at gmx.net Wed May 18 12:58:18 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 18 May 2011 12:58:18 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> Message-ID: <ir08oe$5et$1@dough.gmane.org> On 18.05.2011 12:49, Nick Coghlan wrote: > On Wed, May 18, 2011 at 5:57 PM, Georg Brandl <g.brandl at gmx.net> wrote: >> On 18.05.2011 08:34, "Martin v. L?wis" wrote: >>>>> That's great, but where is the list if changes? >>>> >>>> All changes are always listed in the Misc/NEWS file. >>>> A "Change log" link on every download page displays this file. >>> >>> I think it would be good if the release announcement made some >>> summary statement, though, like "NNN bugs have been fixed, in MMM >>> modules; see NEWS for details", or some such. >> >> It does say "over NNN bugs have been fixed", not sure if the MMM modules >> add anything of value. >> >> I agree that a link to the NEWS file should be present though. > > Wishlist item: How hard would it be to run a ReST parser over > Misc/NEWS and create a HTML version for inclusion in the release > pages? (Bonus points if it steals the issue reference linkification > code from the tracker...) See http://dev.pocoo.org/~gbrandl/news.html which I made as an experiment a while ago. 
Georg From ncoghlan at gmail.com Wed May 18 13:04:15 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 May 2011 21:04:15 +1000 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <ir08oe$5et$1@dough.gmane.org> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> <ir08oe$5et$1@dough.gmane.org> Message-ID: <BANLkTimY3r57cspcYp63Xk5GF+iX2pzFqw@mail.gmail.com> On Wed, May 18, 2011 at 8:58 PM, Georg Brandl <g.brandl at gmx.net> wrote: > On 18.05.2011 12:49, Nick Coghlan wrote: >> Wishlist item: How hard would it be to run a ReST parser over >> Misc/NEWS and create a HTML version for inclusion in the release >> pages? (Bonus points if it steals the issue reference linkification >> code from the tracker...) > > See > > http://dev.pocoo.org/~gbrandl/news.html > > which I made as an experiment a while ago. I quite like that! What would we need to do to make it part of the docs build process? Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From g.brandl at gmx.net Wed May 18 12:59:55 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 18 May 2011 12:59:55 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com> <BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com> <BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com> Message-ID: <ir08re$5et$2@dough.gmane.org> On 18.05.2011 12:50, anatoly techtonik wrote: > On Wed, May 18, 2011 at 1:40 PM, Nick Coghlan <ncoghlan at gmail.com> wrote: >> On Wed, May 18, 2011 at 5:58 PM, anatoly techtonik <techtonik at gmail.com> wrote: >>> On Wed, May 18, 2011 at 9:18 AM, Amaury Forgeot d'Arc >>> <amauryfa at gmail.com> wrote: >>>> Hi, >>>> >>>> 2011/5/18 anatoly techtonik <techtonik at gmail.com>: >>>>> That's great, but where is the list if changes? >>>> >>>> All changes are always listed in the Misc/NEWS file. >>>> A "Change log" link on every download page displays this file. >>> >>> I actually followed http://docs.python.org/3.2/whatsnew/3.2.html to >>> Misc/NEWS, but it doesn't contain any references of 3.2.1 >> >> What's New and Misc/NEWS are not the same thing. > > I believe you misunderstood. If you follow what's new link above, you > will see a link to Misc/NEWS, but this one leads to > http://hg.python.org/cpython/file/default/Misc/NEWS where no > references to 3.2.1 are available. This link is wrong, it should point to /cpython/file/3.2/Misc/NEWS. (But you'll still not see 3.2.1 changes until the 3.2.1 final release, because the rc is made from a separate clone.) 
Georg From ncoghlan at gmail.com Wed May 18 13:12:29 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 May 2011 21:12:29 +1000 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com> <BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com> <BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com> Message-ID: <BANLkTi=unD89XG2zKtRGpEvdzMwSKnAWKA@mail.gmail.com> On Wed, May 18, 2011 at 8:50 PM, anatoly techtonik <techtonik at gmail.com> wrote: > I believe you misunderstood. If you follow what's new link above, you > will see a link to Misc/NEWS, but this one leads to > http://hg.python.org/cpython/file/default/Misc/NEWS where no > references to 3.2.1 are available. Ah, I see what you mean. That actually looks to be a bug in the ":source:" tag that generates the file links. It should really generate version appropriate links, but it currently just always links to "default". (This wasn't an issue until 3.2 was released and 3.3 development started. Older versions didn't have that tag, and hence referenced the specific release directly). The source code links in the module docs have the same problem (e.g. see http://docs.python.org/3.2/library/functools) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From victor.stinner at haypocalc.com Wed May 18 13:26:40 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 18 May 2011 13:26:40 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <ir08oe$5et$1@dough.gmane.org> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> <ir08oe$5et$1@dough.gmane.org> Message-ID: <1305718000.16682.0.camel@marge> On Wednesday 18 May 2011 at 12:58 +0200, Georg Brandl wrote: > On 18.05.2011 12:49, Nick Coghlan wrote: > > On Wed, May 18, 2011 at 5:57 PM, Georg Brandl <g.brandl at gmx.net> wrote: > >> On 18.05.2011 08:34, "Martin v. Löwis" wrote: > >>>>> That's great, but where is the list of changes? > >>>> > >>>> All changes are always listed in the Misc/NEWS file. > >>>> A "Change log" link on every download page displays this file. > >>> > >>> I think it would be good if the release announcement made some > >>> summary statement, though, like "NNN bugs have been fixed, in MMM > >>> modules; see NEWS for details", or some such. > >> > >> It does say "over NNN bugs have been fixed", not sure if the MMM modules > >> add anything of value. > >> > >> I agree that a link to the NEWS file should be present though. > > > > Wishlist item: How hard would it be to run a ReST parser over > > Misc/NEWS and create a HTML version for inclusion in the release > > pages? (Bonus points if it steals the issue reference linkification > > code from the tracker...) > > See > > http://dev.pocoo.org/~gbrandl/news.html > > which I made as an experiment a while ago. Oh, I like it. But the output should be reST to be able to include it directly in the Python documentation. Sphinx would generate a new table of contents with links to each release.
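The issue-reference linkification mentioned in the quoted wishlist could be sketched as a plain regex substitution that keeps the text as reST. This is only an illustration: the "Issue #NNN" entry format, the `http://bugs.python.org/issueNNN` URL scheme, and the helper name `linkify_issues` are assumptions, not the tracker's or Georg's actual code.

```python
import re

# Assumed entry format: "Issue #1234: ..." (not verified against Misc/NEWS).
ISSUE_RE = re.compile(r'\bIssue #(\d+)')
# Assumed tracker URL scheme.
TRACKER_URL = 'http://bugs.python.org/issue%s'


def linkify_issues(news_text):
    """Turn each 'Issue #NNN' into a reST external hyperlink."""
    def repl(match):
        number = match.group(1)
        return '`Issue #%s <%s>`_' % (number, TRACKER_URL % number)
    return ISSUE_RE.sub(repl, news_text)


# Hypothetical NEWS entry, for illustration only.
sample = "- Issue #1234: Fixed a hypothetical crash in an example module."
print(linkify_issues(sample))
```

Because the result is still reST, it could be fed to Sphinx or any other reST processor afterwards.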
Victor From orsenthil at gmail.com Wed May 18 13:33:51 2011 From: orsenthil at gmail.com (Senthil Kumaran) Date: Wed, 18 May 2011 19:33:51 +0800 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <1305718000.16682.0.camel@marge> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> <ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge> Message-ID: <20110518113351.GA3199@kevin> On Wed, May 18, 2011 at 01:26:40PM +0200, Victor Stinner wrote: > > http://dev.pocoo.org/~gbrandl/news.html > > > Oh, I like it. But the output should be reST to be able to include it > directly in the Python documentation. Sphinx would generate a new table Interesting ideas! It would be really useful too. +1 -- Senthil From g.brandl at gmx.net Wed May 18 13:35:51 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 18 May 2011 13:35:51 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <1305718000.16682.0.camel@marge> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> <ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge> Message-ID: <ir0aur$hl4$1@dough.gmane.org> On 18.05.2011 13:26, Victor Stinner wrote: >> See >> >> http://dev.pocoo.org/~gbrandl/news.html >> >> which I made as an experiment a while ago. > > Oh, I like it. But the output should be reST to be able to include it > directly in the Python documentation. Sphinx would generate a new table > of contents with links to each release. The output of processing reST should be reST? Now I'm confused. 
Georg From techtonik at gmail.com Wed May 18 14:01:17 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 18 May 2011 15:01:17 +0300 Subject: [Python-Dev] Inconsistent case in directory names for installed Python on Windows Message-ID: <BANLkTinxQU=4AeEncfai1K5fdA3sFL6Zhg@mail.gmail.com> Greetings, While studying `virtualenv` code I've noticed that in Python directory tree `include`, `libs` and `tcl` are lowercased while other dirs are capitalized. It doesn't seem important (especially for developers here), but it still can leave an unpleasant image for people new to Python (and programming in general).

[Python27]
  DLLs
  Doc
  include
  Lib
  libs
  Scripts
  tcl
  Tools

How about making a consistent lowercased or uppercased scheme? Windows filesystems are case-insensitive, so the change shouldn't affect anybody. Another candidate for normalization is Tools/Scripts dir, which I'd lowercase FWIW:

Tools
  i18n
  pynche
  Scripts
  versioncheck
  webchecker

Lowercased dirs on a top level seem to contain files that are relevant to C developers only. However, I cannot say for sure. It seems that there could be a better place for them like top level directory named Dev or C-API. -- anatoly t. From victor.stinner at haypocalc.com Wed May 18 14:06:07 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 18 May 2011 14:06:07 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <ir0aur$hl4$1@dough.gmane.org> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> <ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge> <ir0aur$hl4$1@dough.gmane.org> Message-ID: <1305720367.16682.2.camel@marge> On Wednesday 18 May 2011 at 13:35 +0200, Georg Brandl wrote: > On 18.05.2011 13:26, Victor Stinner wrote: > > >> See > >> > >> http://dev.pocoo.org/~gbrandl/news.html > >> > >> which I made as an experiment a while ago. > > > > Oh, I like it. But the output should be reST to be able to include it > > directly in the Python documentation. Sphinx would generate a new table > > of contents with links to each release. > > The output of processing reST should be reST? Now I'm confused. Misc/NEWS is already formatted to reST? It doesn't contain any link (to the issues). We may replace "Issue #xxx" by :issue:`xxx` (directly in Misc/NEWS) to simplify the process? And maybe move Misc/NEWS to Doc? http://dev.pocoo.org/~gbrandl/news.html is an HTML document. Victor From g.brandl at gmx.net Wed May 18 14:17:16 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 18 May 2011 14:17:16 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <1305720367.16682.2.camel@marge> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> <ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge> <ir0aur$hl4$1@dough.gmane.org> <1305720367.16682.2.camel@marge> Message-ID: <ir0dcg$vok$1@dough.gmane.org> On 18.05.2011 14:06, Victor Stinner wrote: > On Wednesday 18 May 2011 at 13:35 +0200, Georg Brandl wrote: >> On 18.05.2011 13:26, Victor Stinner wrote: >> >> >> See >> >> >> >> http://dev.pocoo.org/~gbrandl/news.html >> >> >> >> which I made as an experiment a while ago. >> > >> > Oh, I like it. But the output should be reST to be able to include it >> > directly in the Python documentation. Sphinx would generate a new table >> > of contents with links to each release. >> >> The output of processing reST should be reST? Now I'm confused. > > Misc/NEWS is already formatted to reST?
Yes, it is. > It doesn't contain any link (to > the issues). We may replace "Issue #xxx" by :issue:`xxx` (directly in > Misc/NEWS) to simplify the process? Replacing the issue links is the only preprocessing that I did. > And maybe move Misc/NEWS to Doc? I don't think people would like that :) > http://dev.pocoo.org/~gbrandl/news.html is an HTML document. As the file name says :) Georg From victor.stinner at haypocalc.com Wed May 18 14:21:55 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 18 May 2011 14:21:55 +0200 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator Message-ID: <1305721315.16682.10.camel@marge> Hi, ''.join(c for c in 'abc') and ''.join([c for c in 'abc']) do create a temporary c variable. In this case, the variable is useless and requires two opcodes: STORE_FAST(c), LOAD_FAST(c). The variable is not available outside the list comprehension/generator. I would like to remove the variable in these cases to speed up (micro-optimize!) Python. Removing the variable breaks code using introspection like: list([locals()['x'] for x in range(3)]) We may detect the usage of introspection (I don't know how hard it is), only optimize trivial cases (like "x for x in ..."), or only optimize when Python is running in optimize mode (python -O or python -OO). What do you think? Is it useless and/or stupid? I heard about optimization in the AST tree instead of working on the bytecode. What is the status of this project?
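The store/load pair described above can be inspected with the `dis` module. A minimal sketch, assuming CPython; opcode names and the way comprehensions are compiled differ between versions, so the printed list is not guaranteed to match any particular release.

```python
import dis

# The generator expression from the message; its code object is gen.gi_code.
gen = (c for c in 'abc')

# Collect the opcode names used by the generator's body.
opnames = [ins.opname for ins in dis.get_instructions(gen.gi_code)]

# Opcodes that handle the loop variable `c` as a fast local.
fast_ops = [name for name in opnames if 'FAST' in name]
print(fast_ops)

# The temporary variable costs bytecode but does not change the result.
print(''.join(gen))
```

Running `dis.dis(gen.gi_code)` instead prints the full listing, which makes it easier to see where the store and load sit inside the loop.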
Victor From brian.curtin at gmail.com Wed May 18 14:47:27 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Wed, 18 May 2011 07:47:27 -0500 Subject: [Python-Dev] Inconsistent case in directory names for installed Python on Windows Message-ID: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com> On May 18, 2011 7:03 AM, "anatoly techtonik" <techtonik at gmail.com> wrote: > > Greetings, > > While studying `virtualenv` code I've noticed that in Python directory > tree `include`, `libs` and `tcl` are lowercased while other dirs are > capitalized. It doesn't seem important (especially for developers > here), but it still can leave an unpleasant image for people new to > Python (and programming in general). In theory there are probably a lot of things that might seem unpleasant but are actually non-issues. I don't believe there have been any complaints about actual unpleasantries with directory case.

> [Python27]
>   DLLs
>   Doc
>   include
>   Lib
>   libs
>   Scripts
>   tcl
>   Tools
>
> How about making a consistent lowercased or uppercased scheme? Windows > filesystems are case-insensitive, so the change shouldn't affect > anybody. Some Macs have case-sensitive file systems, and some people use case-sensitive file systems on various flavors of UNIX. The change would probably require a thorough look through the build chain. > Another candidate for > normalization is Tools/Scripts dir, > which I'd lowercase FWIW:

> Tools
>   i18n
>   pynche
>   Scripts
>   versioncheck
>   webchecker
>
> Lowercased dirs on a top level seem to contain files that are > relevant to C developers only. However, I cannot say for sure. It > seems that there could be a better place for them like top level > directory named Dev or C-API. > -- > anatoly t. Overall I think it boils down to a cosmetic change that I'm not sure we need to make, which could unnecessarily break people's work.
-1 From benjamin at python.org Wed May 18 15:41:48 2011 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 18 May 2011 08:41:48 -0500 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <1305721315.16682.10.camel@marge> References: <1305721315.16682.10.camel@marge> Message-ID: <BANLkTingy7NHxP6w5j5OpghNwdOcO6Zw=w@mail.gmail.com> 2011/5/18 Victor Stinner <victor.stinner at haypocalc.com>: > Hi, > > ''.join(c for c in 'abc') and ''.join([c for c in 'abc']) do create a > temporary c variable. In this case, the variable is useless and requires > two opcodes: STORE_FAST(c), LOAD_FAST(c). The variable is not available > outside the list comprehension/generator. > > I would like to remove the variable in these cases to speed up > (micro-optimize!) Python. > > Removing the variable breaks code using introspection like: > > list([locals()['x'] for x in range(3)]) > > We may detect the usage of introspection (I don't know how hard it is), > only optimize trivial cases (like "x for x in ..."), or only optimize > when Python is running in optimize mode (python -O or python -OO). > > What do you think? Is it useless and/or stupid? Far more useful would be figuring out how to remove the call.
-- Regards, Benjamin From ncoghlan at gmail.com Wed May 18 16:05:37 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 May 2011 00:05:37 +1000 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <1305718000.16682.0.camel@marge> References: <4DD2C37D.7000008@python.org> <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com> <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com> <4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org> <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com> <ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge> Message-ID: <BANLkTiniDyExGSiJW6N=Tqs53MzMh=zjeQ@mail.gmail.com> On Wed, May 18, 2011 at 9:26 PM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > > Oh, I like it. But the output should be reST to be able to include it > directly in the Python documentation. Sphinx would generate a new table > of contents with links to each release. As Georg noted, Misc/NEWS is already ReST. My proposal was essentially to add an extra step to the docs build process that invoked the same commands that Georg used to generate the sample version (with appropriate additions to Doc/tools as needed to make that work). The generated NEWS.html file could easily live inside the whatsnew directory alongside the actual What's New document. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From benjamin at python.org Wed May 18 16:12:34 2011 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 18 May 2011 09:12:34 -0500 Subject: [Python-Dev] 2.7.2 and 3.1.4 Message-ID: <BANLkTikhNOA1LpzJgNX2CEeav87LgowxHg@mail.gmail.com> It's time to continue 2.7.* point releases with 2.7.2 and finish off 3.1.* with 3.1.4. I plan to do a RC for both on May 28th and a final on June 11th.
-- Regards, Benjamin From ncoghlan at gmail.com Wed May 18 16:17:28 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 May 2011 00:17:28 +1000 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <1305721315.16682.10.camel@marge> References: <1305721315.16682.10.camel@marge> Message-ID: <BANLkTikroG0Vkx545Spy-jEiiR+1qeEkDA@mail.gmail.com> On Wed, May 18, 2011 at 10:21 PM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > What do you think? Is it useless and/or stupid? I wouldn't call it useless or stupid - merely "lost in the noise". In small cases, I expect it would be swamped completely by the high fixed overhead of entering the new scope; in all generator expressions I expect it would be swamped by the cost of resuming the generator on each iteration; and even for comprehensions, any time spent on the unneeded variable assignment is likely still going to be dominated by the __next__() call overhead. > I heard about optimization in the AST tree instead of working on the > bytecode. What is the status of this project? First step is getting back to Eugene Toder's AST cleanup patch and working on getting that in. It's a big patch though, and I'd like to see it broken up into a couple of distinct phases before we proceed. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From nadeem.vawda at gmail.com Wed May 18 16:19:59 2011 From: nadeem.vawda at gmail.com (Nadeem Vawda) Date: Wed, 18 May 2011 16:19:59 +0200 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <1305721315.16682.10.camel@marge> References: <1305721315.16682.10.camel@marge> Message-ID: <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> On Wed, May 18, 2011 at 2:21 PM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > ''.join(c for c in 'abc') and ''.join([c for c in 'abc']) do create a > temporary c variable.
I'm not sure why you would encounter code like that in the first place. Surely any code of the form: ''.join(c for c in my_string) would just return my_string? Or am I missing something? > I heard about optimization in the AST tree instead of working on the > bytecode. What is the status of this project? Are you referring to issue11549? There was some related discussion [1] on python-dev about six weeks ago, but I haven't seen anything on the topic since then. Cheers, Nadeem [1] http://mail.python.org/pipermail/python-dev/2011-April/110399.html From janssen at parc.com Wed May 18 16:59:58 2011 From: janssen at parc.com (Bill Janssen) Date: Wed, 18 May 2011 07:59:58 PDT Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <iqvp43$et3$1@dough.gmane.org> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <iqvp43$et3$1@dough.gmane.org> Message-ID: <86793.1305730798@parc.com> Georg Brandl <g.brandl at gmx.net> wrote: > We do have > > bytes.fromhex('deadbeef') Sort of reminds me of Java's Integer.parseInt(), and not in a good way. Bill From ethan at stoneleaf.us Wed May 18 17:57:46 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 18 May 2011 08:57:46 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD35B9C.3030702@canterbury.ac.nz> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> Message-ID: <4DD3EC7A.8070801@stoneleaf.us> Greg Ewing wrote: > Ethan Furman wrote: > >> On the one hand we have the 'bytes are ascii data' type interface, and >> on the other we have the 'bytes are a list of integers between 0 - >> 255' interface. 
> > I think the weird part is that there exists a literal for > writing a byte array as an ascii string, and furthermore > that it's the *only* kind of literal available for bytes. That is the point I was trying to make -- thank you for stating it more clearly than I managed to. :) > Personally I think that the default literal syntax for > bytes, and also the form produced by repr(), should have > been something more neutral, such as hex, Agreed. It is surprising to extract an element out of bytes, and not end up with bytes, but with an int -- if the repr used something besides the plain ascii representation, this would not be an expectation. For comparison, when one extracts an element out of a str one gets a str -- not the int representing the unicode code point. > with the ascii form available for use when it makes sense. > > As for > >> --> some_other_var[3] == b'd' > > there ought to be a literal for specifying an integer > using an ascii character, so you could say something like > > if some_other_var[3] == c'd': > > which would be equivalent to > > if some_other_var[3] == ord(b'd') > > but without the overhead of computing the value each time > at run time. Given that we can't change the behavior of b'abc'[1], that would be better than what we have. +1 ~Ethan~ From stephen at xemacs.org Wed May 18 18:16:44 2011 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Thu, 19 May 2011 01:16:44 +0900 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> Message-ID: <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> Robert Collins writes: > Its probably too late to change, but please don't try to argue that > its correct: the continued confusion of folk running into this is > evidence that confusion *is happening*. Treat that as evidence and > think about how to fix it going forward. Sorry, Rob, but you're just wrong here, and Nick is right. It's possible to improve Python 3, but not to "fix" it in this respect. The Python 3 solution is correct, the Python 2 approach is not. There's no way to avoid discontinuity and confusion here. Confusion is indeed happening, but it's real confusion in the way people think about the problem space, not a language design cockup. The problem can't be solved by embedding ASCII in Unicode, because non-ASCII bytes don't have a canonical embedding in Unicode. Ie, the situation is inherently confusing. You can't wish it away, you can only choose to impose more or less of it on particular constituencies. Now, it's quite possible that there are other correct approaches that allow straightforward manipulation of non-ASCII text, but I don't know what they are, and I don't know anybody else who does. From merwok at netwok.org Wed May 18 18:47:49 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Wed, 18 May 2011 18:47:49 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Skip some more tests in the absence of threading. 
In-Reply-To: <E1QMDZY-0002xi-1W@dinsdale.python.org> References: <E1QMDZY-0002xi-1W@dinsdale.python.org> Message-ID: <4DD3F835.6070609@netwok.org> Hi, > http://hg.python.org/cpython/rev/c83fb59b73ea > user: Vinay Sajip <vinay_sajip at yahoo.co.uk> > date: Tue May 17 07:15:53 2011 +0100 > summary: > Skip some more tests in the absence of threading > diff --git a/Lib/test/test_logging.py b/Lib/test/test_logging.py > --- a/Lib/test/test_logging.py > +++ b/Lib/test/test_logging.py > try: > import threading > + # The following imports are needed only for tests which > + import asynchat I guess "for tests which use threading" > +if threading: > + class TestSMTPChannel(smtpd.SMTPChannel): I wonder if you could have saved yourself all this reindenting if your import had fallen back to dummy_threading. > +@unittest.skipUnless(threading, 'Threading required for this test.') I'd have used lower-case threading, to make it a bit clearer that it's the threading module that's required, not some abstract notion of threading. But they may be the same thing, I'm not sure. Regards From merwok at netwok.org Wed May 18 18:51:18 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Wed, 18 May 2011 18:51:18 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Skip some tests in the absence of multiprocessing. In-Reply-To: <E1QMDyC-0004jq-1J@dinsdale.python.org> References: <E1QMDyC-0004jq-1J@dinsdale.python.org> Message-ID: <4DD3F906.2080100@netwok.org> Hi again, > http://hg.python.org/cpython/rev/4b7c29201c60 > user: Vinay Sajip <vinay_sajip at yahoo.co.uk> > summary: > Skip some tests in the absence of multiprocessing. > + @unittest.skipUnless(threading, 'Threading required for this test.') Who wins, the commit message or the code?
:) > + try: > + import multiprocessing as mp > + r = logging.makeLogRecord({}) > + self.assertEqual(r.processName, mp.current_process().name) > + except ImportError: > + pass Isn't support.import_module or somesuch useful for this kind of check? Regards From rdmurray at bitdance.com Wed May 18 19:10:17 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 18 May 2011 13:10:17 -0400 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20110518171018.807E1250045@webabinitio.net> On Thu, 19 May 2011 01:16:44 +0900, "Stephen J. Turnbull" <stephen at xemacs.org> wrote: > Robert Collins writes: > > > Its probably too late to change, but please don't try to argue that > > its correct: the continued confusion of folk running into this is > > evidence that confusion *is happening*. Treat that as evidence and > > think about how to fix it going forward. > > Sorry, Rob, but you're just wrong here, and Nick is right. It's > possible to improve Python 3, but not to "fix" it in this respect. > The Python 3 solution is correct, the Python 2 approach is not. > There's no way to avoid discontinuity and confusion here. > > Confusion is indeed happening, but it's real confusion in the way > people think about the problem space, not a language design cockup.
Note that the more common idiom (not that I can measure it, mind) when dealing with byte strings is something analogous to if my_byte_string[i:i+1] == b'x': rather than if my_byte_string[i] == 120: and the former is a lot more readable than the latter, even though you have to stare at the slice for a couple seconds the first time you encounter it to realize what is going on. So *something* is wrong with Python3's approach. Python2 was wronger, though :) --David From merwok at netwok.org Wed May 18 19:46:16 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Wed, 18 May 2011 19:46:16 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <4DD2A593.90203@cheimes.de> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> <4DD2A593.90203@cheimes.de> Message-ID: <4DD405E8.1090401@netwok.org> Le 17/05/2011 18:42, Christian Heimes a écrit : > A good place for a local sysconfig.cfg could be the user's stdlib > directory (e.g. ~/.local/lib/python3.2/sysconfig.cfg). I don't think so. See http://bugs.python.org/issue7175 and http://mail.python.org/pipermail/python-dev/2010-August/103011.html Regards From merwok at netwok.org Wed May 18 19:48:25 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Wed, 18 May 2011 19:48:25 +0200 Subject: [Python-Dev] "packaging" merge imminent In-Reply-To: <1305664835.29701.2.camel@marge> References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com> <1305664835.29701.2.camel@marge> Message-ID: <4DD40669.7000904@netwok.org> > I fixed recently some bugs in distutils. Should I also fix them in the > packaging module, or are both modules already "synchronized"? I ported some fixes, especially in sysconfig; for distutils, I have a number of them marked for backport in the bug tracker (distutils2 component) or in personal bookmarks. There are not very many.
Cheers From ethan at stoneleaf.us Wed May 18 20:51:54 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 18 May 2011 11:51:54 -0700 Subject: [Python-Dev] Equality testing Message-ID: <4DD4154A.3080603@stoneleaf.us> In Python 3 inequality comparisons became forbidden. --> 123 < [1, 2, 3] Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unorderable types: int() < list() However, equality comparisons are still allowed --> 123 == [1, 2, 3] False But you can't mix them (inequality wins) --> 123 <= [1, 2, 3] Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unorderable types: int() <= list() I realize this is probably a Py4000 change if it happens at all, but does this make sense? Shouldn't an attempt to compare two unlike objects be a TypeError, just like trying to order them is? It bit me when I tried to compare a byte string element with a single character byte string (of course they should have matched, but since the element was an int, the match was no longer True). ~Ethan~ From hagen at zhuliguan.net Wed May 18 20:39:58 2011 From: hagen at zhuliguan.net (=?ISO-8859-1?Q?Hagen_F=FCrstenau?=) Date: Wed, 18 May 2011 20:39:58 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD2C37D.7000008@python.org> References: <4DD2C37D.7000008@python.org> Message-ID: <4DD4127E.6050301@zhuliguan.net> > On behalf of the Python development team, I am pleased to announce the > first release candidate of Python 3.2.1. Shouldn't there be a tag "v3.2.1rc1" in the hg repo? Cheers, Hagen From martin at v.loewis.de Wed May 18 20:52:17 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 20:52:17 +0200 Subject: [Python-Dev] how do you find out what version of Python a PEP landed in?
In-Reply-To: <BANLkTiny++ziwc_+UsOfWfiUD-dxM7KZxA@mail.gmail.com> References: <4DD35E6F.8030901@simplistix.co.uk> <4DD3661D.30908@v.loewis.de> <BANLkTiny++ziwc_+UsOfWfiUD-dxM7KZxA@mail.gmail.com> Message-ID: <4DD41561.6090305@v.loewis.de> Am 18.05.2011 08:38, schrieb Amaury Forgeot d'Arc: > 2011/5/18 "Martin v. Löwis" <martin at v.loewis.de>: >>> How do I know which version of Python a PEP lands in? >> >> You should look at the Python-Version header of the PEP. > > But some PEPs don't have it: 341, 342, 343, 353... In these cases, the respective authors (or somebody else who cares) should add it. Regards, Martin From martin at v.loewis.de Wed May 18 21:06:09 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 21:06:09 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <20110518171018.807E1250045@webabinitio.net> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <20110518171018.807E1250045@webabinitio.net> Message-ID: <4DD418A1.6000508@v.loewis.de> > Note that the more common idiom (not that I can measure it, mind) > when dealing with byte strings is something analogous to > > if my_byte_string[i:i+1] == b'x': > > rather than > > if my_byte_string[i] == 120: FWIW, Another spelling of this is if my_byte_string[i] == ord(b'x') From a readability point, it's in the same category as the first one, but less twisted.
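The spellings discussed in this thread can be put side by side. A minimal sketch for Python 3; the sample value b'xyz' and the result variable names are only illustrative.

```python
my_byte_string = b'xyz'
i = 0

# Indexing bytes yields an int, so comparing it with a bytes literal is False.
index_vs_literal = my_byte_string[i] == b'x'

# Slicing yields a length-1 bytes object, which compares equal to b'x'.
slice_vs_literal = my_byte_string[i:i+1] == b'x'

# ord() on a length-1 bytes object gives the same int that indexing does (120).
index_vs_ord = my_byte_string[i] == ord(b'x')

# Indexing the literal itself also yields that int.
index_vs_index = my_byte_string[i] == b'x'[0]

print(index_vs_literal, slice_vs_literal, index_vs_ord, index_vs_index)
# prints: False True True True
```

Running the interpreter with -b turns the first comparison into a BytesWarning, which is one way to catch this mistake in existing code.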
Regards, Martin From martin at v.loewis.de Wed May 18 21:09:03 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 21:09:03 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD4127E.6050301@zhuliguan.net> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> Message-ID: <4DD4194F.9020009@v.loewis.de> Am 18.05.2011 20:39, schrieb Hagen Fürstenau: >> On behalf of the Python development team, I am pleased to announce the >> first release candidate of Python 3.2.1. > > Shouldn't there be a tag "v3.2.1rc1" in the hg repo? http://hg.python.org/releasing/3.2.1/ Regards, Martin P.S. "Shouldn't" makes it sound as if there was a mistake. From eric at trueblade.com Wed May 18 21:10:15 2011 From: eric at trueblade.com (Eric Smith) Date: Wed, 18 May 2011 15:10:15 -0400 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4DD41997.4060401@trueblade.com> On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote: > Robert Collins writes: > > > Its probably too late to change, but please don't try to argue that > > its correct: the continued confusion of folk running into this is > > evidence that confusion *is happening*. Treat that as evidence and > > think about how to fix it going forward. > > Sorry, Rob, but you're just wrong here, and Nick is right. It's > possible to improve Python 3, but not to "fix" it in this respect. > The Python 3 solution is correct, the Python 2 approach is not. > There's no way to avoid discontinuity and confusion here.
I don't think there's any connection between the way 2.x confused text strings and binary data (which certainly needed addressing) with the way that 3.x returns a different type for byte_str[i] than it does for byte_str[i:i+1]. I think it's the latter that's confusing to people. There's no particular requirement for different types that's needed to fix the byte/str problem. And of course it's too late to make any change to this. Eric. From ethan at stoneleaf.us Wed May 18 21:29:47 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 18 May 2011 12:29:47 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD3EC7A.8070801@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> Message-ID: <4DD41E2B.7000404@stoneleaf.us> Ethan Furman wrote: > Greg Ewing wrote: >> As for >> >>> --> some_other_var[3] == b'd' >> >> there ought to be a literal for specifying an integer >> using an ascii character, so you could say something like >> >> if some_other_var[3] == c'd': >> >> which would be equivalent to >> >> if some_other_var[3] == ord(b'd') >> >> but without the overhead of computing the value each time >> at run time. > > Given that we can't change the behavior of b'abc'[1], that would be > better than what we have. > > +1 Here's another thought, that perhaps is not backwards-incompatible... some_var[3] == b'd' At some point, the bytes class' __eq__ will be called -- is there a reason why we cannot have 1) a check to see if the bytes instance is length 1 2) a check to see if i) the other object is an int, and ii) 0 <= other_obj < 256 3) if 1 and 2, make the comparison instead of returning NotImplemented?
This makes sense to me -- after all, the bytes class is an array of ints in range(256); it is a special case, but doesn't feel any more special than passing an int into bytes() giving a string of that many null bytes; and it would get rid of the, in my opinion ugly, idiom of some_var[i:i+1] == b'd' It would also not require a new literal syntax. ~Ethan~ From benjamin at python.org Wed May 18 21:22:18 2011 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 18 May 2011 14:22:18 -0500 Subject: [Python-Dev] Equality testing In-Reply-To: <4DD4154A.3080603@stoneleaf.us> References: <4DD4154A.3080603@stoneleaf.us> Message-ID: <BANLkTikyb7A_sjdwB_jNRVybOm5OnqTeqQ@mail.gmail.com> 2011/5/18 Ethan Furman <ethan at stoneleaf.us>: > In Python 3 inequality comparisons became forbidden. > > --> 123 < [1, 2, 3] > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: unorderable types: int() < list() > > However, equality comparisons are still allowed > > --> 123 == [1, 2, 3] > False > > But you can't mix them (inequality wins) > > --> 123 <= [1, 2, 3] > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: unorderable types: int() <= list() > > I realize this is probably a Py4000 change if it happens at all, but does > this make sense? Shouldn't an attempt to compare two unlike objects be a > TypeError, just like trying to order them is? No. Ordering for types which are completely different doesn't make any sense, but equality testing is just fine because it has an obvious answer: no. -- Regards, Benjamin From hagen at zhuliguan.net Wed May 18 21:37:29 2011 From: hagen at zhuliguan.net (=?ISO-8859-1?Q?Hagen_F=FCrstenau?=) Date: Wed, 18 May 2011 21:37:29 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD4194F.9020009@v.loewis.de> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> Message-ID: <4DD41FF9.9040704@zhuliguan.net> > P.S.
"Shouldn't" makes it sound as if there was a mistake. Well, I thought there was. When do these tags get merged into "cpython" then? "v3.2.1b1" is there, but "v3.2.1rc1" isn't: http://hg.python.org/cpython/tags Cheers, Hagen From g.brandl at gmx.net Wed May 18 21:37:57 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 18 May 2011 21:37:57 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD4194F.9020009@v.loewis.de> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> Message-ID: <ir176p$25a$1@dough.gmane.org> On 18.05.2011 21:09, "Martin v. L?wis" wrote: > Am 18.05.2011 20:39, schrieb Hagen F?rstenau: >>> On behalf of the Python development team, I am pleased to announce the >>> first release candidate of Python 3.2.1. >> >> Shouldn't there be a tag "v3.2.1rc1" in the hg repo? > > http://hg.python.org/releasing/3.2.1/ > > Regards, > Martin > > P.S. "Shouldn't" makes it sound as if there was a mistake. To clarify: once the final is done, the repo Martin mentioned will be merged back to main and then vanish. Georg From g.brandl at gmx.net Wed May 18 21:47:43 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 18 May 2011 21:47:43 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD418A1.6000508@v.loewis.de> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <20110518171018.807E1250045@webabinitio.net> <4DD418A1.6000508@v.loewis.de> Message-ID: <ir17p3$60h$1@dough.gmane.org> On 18.05.2011 21:06, "Martin v. 
Löwis" wrote: >> Note that the more common idiom (not that I can measure it, mind) >> when dealing with byte strings is something analogous to >> >> if my_byte_string[i:i+1] == b'x': >> >> rather than >> >> if my_byte_string[i] == 120: > > FWIW, Another spelling of this is > > if my_byte_string[i] == ord(b'x') > > From a readability point, it's in the same category as the first one, > but less twisted. Probably more twisted: if my_byte_string[i] == b'x'[0]: :) Georg From ethan at stoneleaf.us Wed May 18 22:10:11 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 18 May 2011 13:10:11 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD41E2B.7000404@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> Message-ID: <4DD427A3.9080207@stoneleaf.us> Ethan Furman wrote: [...] Also posted to Python-Ideas. ~Ethan~ From martin at v.loewis.de Wed May 18 22:01:12 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 22:01:12 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD41FF9.9040704@zhuliguan.net> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> <4DD41FF9.9040704@zhuliguan.net> Message-ID: <4DD42588.4080202@v.loewis.de> Am 18.05.2011 21:37, schrieb Hagen Fürstenau: >> P.S. "Shouldn't" makes it sound as if there was a mistake. > > Well, I thought there was. When do these tags get merged into "cpython" > then?
See PEP 101 Regards, Martin From martin at v.loewis.de Wed May 18 22:06:26 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 22:06:26 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD41E2B.7000404@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> Message-ID: <4DD426C2.7060706@v.loewis.de> > Here's another thought, that perhaps is not backwards-incompatible... > > some_var[3] == b'd' > > At some point, the bytes class' __eq__ will be called -- is there a > reason why we cannot have > > 1) a check to see if the bytes instance is length 1 > 2) a check to see if > i) the other object is an int, and > ii) 0 <= other_obj < 256 > 3) if 1 and 2, make the comparison instead of returning NotImplemented? Immutable objects that compare equal should hash equal; so we would also have to change the hashing of byte strings. Not sure whether that, in turn, has undesirable consequences. In addition, equality should be transitive, so b'A' == 65.0. Regards, Martin From lac at openend.se Wed May 18 22:30:28 2011 From: lac at openend.se (Laura Creighton) Date: Wed, 18 May 2011 22:30:28 +0200 Subject: [Python-Dev] how do you find out what version of Python a PEP landed in? In-Reply-To: Message from =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <martin@v.loewis.de> of "Wed, 18 May 2011 20:52:17 +0200." <4DD41561.6090305@v.loewis.de> References: <4DD35E6F.8030901@simplistix.co.uk> <4DD3661D.30908@v.loewis.de> <BANLkTiny++ziwc_+UsOfWfiUD-dxM7KZxA@mail.gmail.com><4DD41561.6090305@v.loewis.de> Message-ID: <201105182030.p4IKUSU9005831@theraft.openend.se> Politely ask them to add it. (just my suggestion).
Laura From ethan at stoneleaf.us Wed May 18 22:48:07 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 18 May 2011 13:48:07 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD426C2.7060706@v.loewis.de> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD426C2.7060706@v.loewis.de> Message-ID: <4DD43087.6090602@stoneleaf.us> Martin v. Löwis wrote: >> Here's another thought, that perhaps is not backwards-incompatible... >> >> some_var[3] == b'd' >> >> At some point, the bytes class' __eq__ will be called -- is there a >> reason why we cannot have >> >> 1) a check to see if the bytes instance is length 1 >> 2) a check to see if >> i) the other object is an int, and >> ii) 0 <= other_obj < 256 >> 3) if 1 and 2, make the comparison instead of returning NotImplemented? > > Immutable objects that compare equal should hash equal; > so we would also have to change the hashing of byte strings. Not sure > whether that, in turn, has undesirable consequences. I thought it was the other-way-round -- if they hash equal, they should compare equal? Or is this just for immutables? > In addition, equality should be transitive, so b'A' == 65.0. I'm not sure what you're getting at... we could certainly have step 2 check for a number instead of an int, and then step 3 could extract the one element, giving an int, and then let that int compare itself with the other number, whether it be int, float, fraction, what-have-you.
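The hash and transitivity concerns raised in this exchange can be made concrete. This is a sketch under stated assumptions: LooseByte is a hypothetical bytes subclass invented here for illustration (it is not anyone's proposed patch), and it adds the int comparison while keeping the inherited bytes hash.

```python
class LooseByte(bytes):
    """Hypothetical: a length-1 instance compares equal to the matching int."""

    def __eq__(self, other):
        if len(self) == 1 and isinstance(other, int):
            return self[0] == other
        # Fall back to the normal bytes comparison (may be NotImplemented).
        return bytes.__eq__(self, other)

    # Defining __eq__ clears the inherited __hash__, so restore it explicitly.
    __hash__ = bytes.__hash__


a = LooseByte(b'A')

equal_to_int = (a == 65)        # the new comparison applies
equal_to_float = (a == 65.0)    # not handled, although 65 == 65.0
# The hashes are unrelated, so a hash-based lookup will almost certainly miss.
found_in_set = (65 in {a})
ints_and_floats_agree = (65 == 65.0 and hash(65) == hash(65.0))

print(equal_to_int, equal_to_float, found_in_set, ints_and_floats_agree)
```

The rule runs in one direction only: objects that compare equal must hash equal, while equal hashes do not imply equality. The built-in numeric types keep that promise across int and float, which is what a changed bytes comparison would also have to do.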
~Ethan~ From tjreedy at udel.edu Wed May 18 22:41:45 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 May 2011 16:41:45 -0400 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD427A3.9080207@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD427A3.9080207@stoneleaf.us> Message-ID: <ir1au8$o4s$1@dough.gmane.org> On 5/18/2011 4:10 PM, Ethan Furman wrote: > Ethan Furman wrote: > > [...] > > Also posted to Python-Ideas. Good. That is where it should have gone in the first place, as this is about ideas not yet even in the PEP stage. -- Terry Jan Reedy From tjreedy at udel.edu Wed May 18 23:01:28 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 May 2011 17:01:28 -0400 Subject: [Python-Dev] Equality testing In-Reply-To: <4DD4154A.3080603@stoneleaf.us> References: <4DD4154A.3080603@stoneleaf.us> Message-ID: <ir1c3a$rk$1@dough.gmane.org> On 5/18/2011 2:51 PM, Ethan Furman wrote: > In Python 3 inequality comparisons became forbidden. > > --> 123 < [1, 2, 3] > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: unorderable types: int() < list() > > However, equality comparisons are still allowed > > --> 123 == [1, 2, 3] > False > > But you can't mix them (inequality wins) > > --> 123 <= [1, 2, 3] > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: unorderable types: int() <= list() > > I realize this is probably a Py4000 change if it happens at all, but > does this make sense? Shouldn't an attempt to compare to unlike objects > be a TypeError, just like trying to order them is? 
> > It bit me when I tried to compare a byte string element with a single > character byte string (of course they should have matched, but since the > element was an int, the match was not longer True). Questions/comments like this that are not about developing the next versions of Python, as you acknowledge above, really belong elsewhere, like on the ideas list. -- Terry Jan Reedy From tjreedy at udel.edu Wed May 18 23:13:23 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 May 2011 17:13:23 -0400 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> Message-ID: <ir1cpj$52a$1@dough.gmane.org> On 5/18/2011 10:19 AM, Nadeem Vawda wrote: > I'm not sure why you would encounter code like that in the first place. > Surely any code of the form: > > ''.join(c for c in my_string) > > would just return my_string? Or am I missing something? Good question. Anything useful like "'-'.join(c for c in 'abc')" is the same as "'-'.join('abc'). The same, as far as I can think of, for anything like list() or set() taking an iterable arg. -- Terry Jan Reedy From ethan at stoneleaf.us Wed May 18 23:42:37 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 18 May 2011 14:42:37 -0700 Subject: [Python-Dev] Equality testing In-Reply-To: <ir1c3a$rk$1@dough.gmane.org> References: <4DD4154A.3080603@stoneleaf.us> <ir1c3a$rk$1@dough.gmane.org> Message-ID: <4DD43D4D.1080902@stoneleaf.us> Terry Reedy wrote: > On 5/18/2011 2:51 PM, Ethan Furman wrote: >> In Python 3 inequality comparisons became forbidden. 
>> >> --> 123 < [1, 2, 3] >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> TypeError: unorderable types: int() < list() >> >> However, equality comparisons are still allowed >> >> --> 123 == [1, 2, 3] >> False >> >> But you can't mix them (inequality wins) >> >> --> 123 <= [1, 2, 3] >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> TypeError: unorderable types: int() <= list() >> >> I realize this is probably a Py4000 change if it happens at all, but >> does this make sense? Shouldn't an attempt to compare to unlike objects >> be a TypeError, just like trying to order them is? >> >> It bit me when I tried to compare a byte string element with a single >> character byte string (of course they should have matched, but since the >> element was an int, the match was not longer True). > > Questions/comments like this that are not about developing the next > versions of Python, as you acknowledge above, really belong elsewhere, > like on the ideas list. My apologies. I'll be more careful. ~Ethan~ From victor.stinner at haypocalc.com Wed May 18 23:34:09 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 18 May 2011 23:34:09 +0200 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> Message-ID: <1305754449.27389.30.camel@marge> On Wednesday 18 May 2011 at 16:19 +0200, Nadeem Vawda wrote: > I'm not sure why you would encounter code like that in the first place.
Well, I found the STORE_FAST/LOAD_FAST "issue" while trying to optimize the "this" module, which reimplements rot13 using a dict in Python 3: d = {} for c in (65, 97): for i in range(26): d[chr(i+c)] = chr((i+13) % 26 + c) I tried: d = {chr(i+c): chr((i+13) % 26 + c) for i in range(26) for c in (65, 97)} But it is slower, whereas I read somewhere that generators are faster than loops. By the way, (c for c in ...) is slower than [c for c in ...]. I suppose that a generator is slower because it exits and re-enters PyEval_EvalFrameEx() at each step, whereas [c for c in ...] uses BUILD_LIST in a dummy (but fast) loop. (c for c in ...) and [c for c in ...] are stupid, but I used a simplified example to explain the problem. A more realistic example would be: squares = (x*x for x in range(10000)) You don't really need the "x" variable, you just want the square. Another example is the syntax using an if to filter the data set: (x for x in ... if condition(x)) > > I heard about optimization in the AST tree instead of working on the > > bytecode. What is the status of this project? > > Are you referring to issue11549? There was some related discussion [1] on > python-dev about six weeks ago, but I haven't seen anything on the topic > since then. Ah yes, it looks to be this issue. I didn't know that there was an issue. Victor From amauryfa at gmail.com Wed May 18 23:37:30 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 18 May 2011 23:37:30 +0200 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <ir1cpj$52a$1@dough.gmane.org> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <ir1cpj$52a$1@dough.gmane.org> Message-ID: <BANLkTinih_3ght8GSknCNrTjaD3t9i-Ayw@mail.gmail.com> Hi, 2011/5/18 Terry Reedy <tjreedy at udel.edu>: > On 5/18/2011 10:19 AM, Nadeem Vawda wrote: > >> I'm not sure why you would encounter code like that in the first place.
>> Surely any code of the form: >> >> ''.join(c for c in my_string) >> >> would just return my_string? Or am I missing something? > > Good question. Anything useful like "'-'.join(c for c in 'abc')" is the same > as "'-'.join('abc'). The same, as far as I can think of, for anything like > list() or set() taking an iterable arg. With a little imagination you can build something non trivial. For example, a join_words function: def join_words(words): return ', '.join(w.strip() for w in words) Like Victor says, the code of the generator object contains a STORE_FAST followed by LOAD_FAST. This pair of opcodes could be removed, and the value left on the stack. >>> dis.dis(join_words.func_code.co_consts[2]) 1 0 SETUP_LOOP 24 (to 27) 3 LOAD_FAST 0 (.0) >> 6 FOR_ITER 17 (to 26) 9 STORE_FAST 1 (w) 12 LOAD_FAST 1 (w) 15 LOAD_ATTR 0 (strip) 18 CALL_FUNCTION 0 21 YIELD_VALUE 22 POP_TOP 23 JUMP_ABSOLUTE 6 >> 26 POP_BLOCK >> 27 LOAD_CONST 0 (None) 30 RETURN_VALUE It's probably not easy to do though. Think of expressions where the variable appears several times, or even where the variable is not the first object, like str(ord(x)). -- Amaury Forgeot d'Arc From martin at v.loewis.de Wed May 18 23:58:21 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 May 2011 23:58:21 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD43087.6090602@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD426C2.7060706@v.loewis.de> <4DD43087.6090602@stoneleaf.us> Message-ID: <4DD440FD.7060208@v.loewis.de> >> Immutable objects that compare equal should hash equal; >> so we would also have to change the hashing of byte strings.
Not sure >> whether that, in turn, has undesirable consequences. > > I thought it was the other-way-round -- if they hash equal, they should > compare equal? No no no. If they hash equal, it could just be a hash collision - objects of a class could all hash to 42, if they wanted to. Dictionaries require the property I mentioned. If they compare equal, but hash differently, a dictionary lookup would fail to find the key. >> In addition, equality should be transitive, so b'A' == 65.0. > > I'm not sure what you're getting at... That it is counter-intuitive to have a bytes object compare equal to a floating-point number. Regards, Martin From greg.ewing at canterbury.ac.nz Thu May 19 00:02:48 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 May 2011 10:02:48 +1200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <iqvp43$et3$1@dough.gmane.org> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <iqvp43$et3$1@dough.gmane.org> Message-ID: <4DD44208.70101@canterbury.ac.nz> Georg Brandl wrote: > We do have > > bytes.fromhex('deadbeef') But again, there is a run-time overhead to this. 
-- Greg From greg.ewing at canterbury.ac.nz Thu May 19 00:32:28 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 May 2011 10:32:28 +1200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD41997.4060401@trueblade.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <4DD41997.4060401@trueblade.com> Message-ID: <4DD448FC.9030301@canterbury.ac.nz> Eric Smith wrote: > And of course it's too late to make any change to this. It's too late to change the meaning of b'...', but is it really too late to introduce an x'...' literal and change the repr() to produce it? -- Greg From greg.ewing at canterbury.ac.nz Thu May 19 00:39:34 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 May 2011 10:39:34 +1200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD41E2B.7000404@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> Message-ID: <4DD44AA6.9030600@canterbury.ac.nz> Ethan Furman wrote: > some_var[3] == b'd' > > 1) a check to see if the bytes instance is length 1 > 2) a check to see if > i) the other object is an int, and > 2) 0 <= other_obj < 256 > 3) if 1 and 2, make the comparison instead of returning NotImplemented? It might seem convenient, but I'd worry that it would lead to even more confusion in other ways. 
If someone sees that some_var[3] == b'd' is true, and that some_var[3] == 100 is also true, they might expect to be able to do things like n = b'd' + 1 and get 101... or maybe b'e'... -- Greg From eric at trueblade.com Thu May 19 00:46:01 2011 From: eric at trueblade.com (Eric Smith) Date: Wed, 18 May 2011 18:46:01 -0400 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD448FC.9030301@canterbury.ac.nz> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <4DD41997.4060401@trueblade.com> <4DD448FC.9030301@canterbury.ac.nz> Message-ID: <4DD44C29.6050008@trueblade.com> On 5/18/2011 6:32 PM, Greg Ewing wrote: > Eric Smith wrote: > >> And of course it's too late to make any change to this. > > It's too late to change the meaning of b'...', but is it > really too late to introduce an x'...' literal and change > the repr() to produce it? My "this" was the different types returned by b[i] and b[i:i+1]. Eric. From greg.ewing at canterbury.ac.nz Thu May 19 00:47:09 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 May 2011 10:47:09 +1200 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <1305754449.27389.30.camel@marge> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <1305754449.27389.30.camel@marge> Message-ID: <4DD44C6D.8000808@canterbury.ac.nz> Victor Stinner wrote: > squares = (x*x for x in range(10000)) What bytecode would you optimise that into? 
-- Greg From robertc at robertcollins.net Thu May 19 01:39:19 2011 From: robertc at robertcollins.net (Robert Collins) Date: Thu, 19 May 2011 11:39:19 +1200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <BANLkTikh6D1KgF7_k_guy4s4JETpz5cJpw@mail.gmail.com> On Thu, May 19, 2011 at 4:16 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote: > Robert Collins writes: > > > Its probably too late to change, but please don't try to argue that > > its correct: the continued confusion of folk running into this is > > evidence that confusion *is happening*. Treat that as evidence and > > think about how to fix it going forward. > > Sorry, Rob, but you're just wrong here, and Nick is right. It's > possible to improve Python 3, but not to "fix" it in this respect. > The Python 3 solution is correct, the Python 2 approach is not. > There's no way to avoid discontinuity and confusion here. The top level description: 'bytes is a different type to text[unicode] and casting between them must be explicit' is completely correct in Python 3: I didn't (and have never AFAIK) quibbled about that. That's separate from the implementation issues I have mentioned in this thread and previous ones. Arguing that implicit casting is a good idea isn't what I was doing, nor what Nick was rebutting, AFAICT.
-Rob From tjreedy at udel.edu Thu May 19 03:44:24 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 May 2011 21:44:24 -0400 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <1305754449.27389.30.camel@marge> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <1305754449.27389.30.camel@marge> Message-ID: <ir1slo$ibs$1@dough.gmane.org> On 5/18/2011 5:34 PM, Victor Stinner wrote: Your initial example gave me the impression that the issue has something to do with join in particular, or even comprehensions in particular. It is really about for loops. > squares = (x*x for x in range(10000)) >>> dis('for x in range(3): y = x*x') 1 0 SETUP_LOOP 30 (to 33) 3 LOAD_NAME 0 (range) 6 LOAD_CONST 0 (3) 9 CALL_FUNCTION 1 12 GET_ITER >> 13 FOR_ITER 16 (to 32) 16 STORE_NAME 1 (x) 19 LOAD_NAME 1 (x) 22 LOAD_NAME 1 (x) 25 BINARY_MULTIPLY 26 STORE_NAME 2 (y) 29 JUMP_ABSOLUTE 13 >> 32 POP_BLOCK >> 33 LOAD_CONST 1 (None) 36 RETURN_VALUE > You don't really need the "x" variable, you just want the square. It is nothing new that hand-crafted assembler (which mnemonic bytecode is) can sometimes beat a compiler. In this case, you want store, load, load before the multiply replaced with dup, and you cannot get that with Python code without a much smarter optimizer.
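To check the store/load pattern on whatever interpreter is at hand, a small sketch along the following lines lists the opcodes for a plain for loop and for a generator expression side by side. The function names are invented for this example, and opcode names differ between CPython versions, so no particular output is shown.

```python
import dis


def loop_squares():
    # Plain for loop: the loop variable is stored, then loaded again.
    result = []
    for x in range(3):
        result.append(x * x)
    return result


def gen_squares():
    # Same computation written as a generator expression.
    return (x * x for x in range(3))


def opnames(code):
    """Return the opcode names of a code object, in order."""
    return [ins.opname for ins in dis.get_instructions(code)]


# The body of the generator expression is a nested code object stored
# among the constants of the enclosing function.
genexpr_code = next(
    const for const in gen_squares.__code__.co_consts
    if hasattr(const, "co_code")
)

print(opnames(loop_squares.__code__))
print(opnames(genexpr_code))
```

Comparing the two lists shows whether the interpreter in use emits a store followed by a load of the loop variable in both forms.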
> -- Terry Jan Reedy From tjreedy at udel.edu Thu May 19 03:59:47 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 May 2011 21:59:47 -0400 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <BANLkTinih_3ght8GSknCNrTjaD3t9i-Ayw@mail.gmail.com> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <ir1cpj$52a$1@dough.gmane.org> <BANLkTinih_3ght8GSknCNrTjaD3t9i-Ayw@mail.gmail.com> Message-ID: <ir1til$lo6$1@dough.gmane.org> On 5/18/2011 5:37 PM, Amaury Forgeot d'Arc wrote: > Hi, > > 2011/5/18 Terry Reedy<tjreedy at udel.edu>: >> On 5/18/2011 10:19 AM, Nadeem Vawda wrote: >> >>> I'm not sure why you would encounter code like that in the first place. >>> Surely any code of the form: >>> >>> ''.join(c for c in my_string) >>> >>> would just return my_string? Or am I missing something? >> >> Good question. Anything useful like "'-'.join(c for c in 'abc')" is the same >> as "'-'.join('abc'). The same, as far as I can think of, for anything like >> list() or set() taking an iterable arg. > > With a little imagination you can build something non trivial. > For example, a join_words function: > > def join_words(words): > return ', '.join(w.strip() for w in words) > > Like Victor says, the code of the generator object contains a > STORE_FAST followed by LOAD_FAST. > This pair of opcodes could be removed, and the value left on the stack. > >>>> dis.dis(join_words.func_code.co_consts[2]) > 1 0 SETUP_LOOP 24 (to 27) > 3 LOAD_FAST 0 (.0) > >> 6 FOR_ITER 17 (to 26) > 9 STORE_FAST 1 (w) > 12 LOAD_FAST 1 (w) > 15 LOAD_ATTR 0 (strip) > 18 CALL_FUNCTION 0 > 21 YIELD_VALUE > 22 POP_TOP > 23 JUMP_ABSOLUTE 6 > >> 26 POP_BLOCK > >> 27 LOAD_CONST 0 (None) > 30 RETURN_VALUE As I pointed out in response to Victor, you get nearly the same with bytecode with regular old for loops; in particular, the store x/load x pair. > It's probably not easy to do though. 
> Think of expressions where the variable appears several times, > or even where the variable is not the first object, like str(ord(x)). Where first means first in left-to-right order rather than in innermost to outermost order. (OT: I think Python is a bit unusual in this way.) -- Terry Jan Reedy From techtonik at gmail.com Thu May 19 04:33:11 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 19 May 2011 05:33:11 +0300 Subject: [Python-Dev] Inconsistent case in directory names for installed Python on Windows In-Reply-To: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com> References: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com> Message-ID: <BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com> On Wed, May 18, 2011 at 3:47 PM, Brian Curtin <brian.curtin at gmail.com> wrote: > > On May 18, 2011 7:03 AM, "anatoly techtonik" <techtonik at gmail.com> wrote: >> >> Greetings, >> >> While studying `virtualenv` code I've noticed that in Python directory >> tree `include`, `libs` and `tcl` are lowercased while other dirs are >> capitalized. It doesn't seem important (especially for developers >> here), but it still can leave an unpleasant image for people new to >> Python (and programming in general). > > In theory there are probably a lot of things that might seem unpleasant but > are actually non-issues. I don't believe there have been any complaints > about actual unpleasantries with directory case. Among web folks there are no people who care less about typography than those who spend most of their time in text terminals. =) I think that probability of receiving such complaint is very low even if everybody notices that. "Why should I bother about consistency if Python developers are not giving damn about it?" >> >> [Python27] >> |-- DLLs >> |-- Doc >> |-- include >> |-- Lib >> |-- libs >> |-- Scripts >> |-- tcl >> |-- Tools >> >> How about making a consistent lowercased or uppercased scheme?
Windows >> filesystems are case-insensitive, so the change shouldn't affect >> anybody. > > Some Macs have case-sensitive file systems, and some people use > case-sensitive file systems on various flavors of UNIX. The change would > probably require a thorough look through the build chain. But we are speaking only about Windows. >> Another candidate for >> normalization is Tools/Scripts dir, >> which I'd lowercase FWIW: >> >> Tools >> |-- i18n >> |-- pynche >> |-- Scripts >> |-- versioncheck >> |-- webchecker >> >> >> Lowercased dirs on a top level seem to contains files that are >> relevant to C developers only. However, I can not say for sure. It >> seems that there could be a better place for them like top level >> directory named Dev or C-API. > > Overall I think it boils down to a cosmetic change that I'm not sure we need > to make, which could unnecessarily break people's work. -1 That's right - my point is that without cosmetic changes the project becomes ugly and starts to accumulate a lot of garbage. With due attention to improving the image of Python from the perspective of project layout organization, this change could be made in Python 3. It is something to keep in mind for the future. -- anatoly t. From techtonik at gmail.com Thu May 19 04:46:23 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 19 May 2011 05:46:23 +0300 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <ir176p$25a$1@dough.gmane.org> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> <ir176p$25a$1@dough.gmane.org> Message-ID: <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com> On Wed, May 18, 2011 at 10:37 PM, Georg Brandl <g.brandl at gmx.net> wrote: > On 18.05.2011 21:09, "Martin v. Löwis" wrote: >> On 18.05.2011 20:39, Hagen Fürstenau wrote: >>>> On behalf of the Python development team, I am pleased to announce the >>>> first release candidate of Python 3.2.1.
>>> >>> Shouldn't there be a tag "v3.2.1rc1" in the hg repo? >> >> http://hg.python.org/releasing/3.2.1/ >> >> Regards, >> Martin >> >> P.S. "Shouldn't" makes it sound as if there was a mistake. > > To clarify: once the final is done, the repo Martin mentioned will be > merged back to main and then vanish. Can't this work be done in the branch of main repo, so that everybody can track the progress in place? Is there any picture of the process similar to http://nvie.com/posts/a-successful-git-branching-model/ ? -- anatoly t. From brian.curtin at gmail.com Thu May 19 04:48:03 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Wed, 18 May 2011 21:48:03 -0500 Subject: [Python-Dev] Inconsistent case in directory names for installed Python on Windows In-Reply-To: <BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com> References: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com> <BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com> Message-ID: <BANLkTikJxMkM+ByNRc4hV8L8Mf1k2VeQNw@mail.gmail.com> On Wed, May 18, 2011 at 21:33, anatoly techtonik <techtonik at gmail.com>wrote: > On Wed, May 18, 2011 at 3:47 PM, Brian Curtin <brian.curtin at gmail.com> > wrote: > > > > On May 18, 2011 7:03 AM, "anatoly techtonik" <techtonik at gmail.com> > wrote: > >> > >> Greetings, > >> > >> While studying `virtualenv` code I've noticed that in Python directory > >> tree `include`, `libs` and `tcl` are lowercased while other dirs are > >> capitalized. It doesn't seem important (especially for developers > >> here), but it still can leave an unpleasant image for people new to > >> Python (and programming in general). > > > > In theory there are probably a lot of things that might seem unpleasant > but > > are actually non-issues. I don't believe there have been any complaints > > about actual unpleasantries with directory case. > > Among web folks there are no people who care less about typography > than those who spend most of their time in text terminals. 
=) I think > that probability of receiving such complaint is very low even if > everybody notices that. "Why should I bother about consistency if > Python developers are not giving damn about it?" > > >> > >> [Python27] > >> |-- DLLs > >> |-- Doc > >> |-- include > >> |-- Lib > >> |-- libs > >> |-- Scripts > >> |-- tcl > >> |-- Tools > >> > >> How about making a consistent lowercased or uppercased scheme? Windows > >> filesystems are case-insensitive, so the change shouldn't affect > >> anybody. > > > > Some Macs have case-sensitive file systems, and some people use > > case-sensitive file systems on various flavors of UNIX. The change would > > probably require a thorough look through the build chain. > > But we are speaking only about Windows. > Definitely -1 to change the folder names only on Windows. From tjreedy at udel.edu Thu May 19 05:20:25 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 May 2011 23:20:25 -0400 Subject: [Python-Dev] Inconsistent case in directory names for installed Python on Windows In-Reply-To: <BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com> References: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com> <BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com> Message-ID: <ir229p$9kq$1@dough.gmane.org> On 5/18/2011 10:33 PM, anatoly techtonik wrote: >>> [Python27] >>> |-- DLLs >>> |-- Doc >>> |-- include >>> |-- Lib >>> |-- libs >>> |-- Scripts >>> |-- tcl >>> |-- Tools Except for DLLs and tcl, these are the platform-independent names in the source tree. They are copied directly over to the installations, and I would not want it any other way. Since I suspect change on *nix is out, I would feel the same for winX. I actually like having 'Lib' uppercase versus 'libs' lowercase, to make it easier to pick out 'Lib'.
Most users have little reason to look at this directory list very often. Certainly, Doc, Lib, Scripts, and Tools are ones they might want to look in, while include, libs, and tcl have nothing to look at. Notice the pattern? Hmmm. By the same logic, DLLs should have been dlls, but I suspect someone wanted to distinguish the plural s from dll. -- Terry Jan Reedy From tjreedy at udel.edu Thu May 19 05:24:38 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 May 2011 23:24:38 -0400 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> <ir176p$25a$1@dough.gmane.org> <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com> Message-ID: <ir22ho$amc$1@dough.gmane.org> On 5/18/2011 10:46 PM, anatoly techtonik wrote: > On Wed, May 18, 2011 at 10:37 PM, Georg Brandl<g.brandl at gmx.net> wrote: >> On 18.05.2011 21:09, "Martin v. Löwis" wrote: >>> http://hg.python.org/releasing/3.2.1/ >> To clarify: once the final is done, the repo Martin mentioned will be >> merged back to main and then vanish. > > Can't this work be done in the branch of main repo, so that everybody > can track the progress in place? As I understand it, this is a snapshot that Georg hopes will require no work between the candidate and final release and which will get only the minimum needed.
-- Terry Jan Reedy From martin at v.loewis.de Thu May 19 05:59:15 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 19 May 2011 05:59:15 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> <ir176p$25a$1@dough.gmane.org> <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com> Message-ID: <4DD49593.30605@v.loewis.de> > Can't this work be done in the branch of main repo, so that everybody > can track the progress in place? Is there any picture of the process > similar to http://nvie.com/posts/a-successful-git-branching-model/ ? It *is* a branch of the main repo, so everybody *can* track the progress (not sure what "track in place" means). If you are asking for a named branch: no, that shouldn't be done. Regards, Martin From g.brandl at gmx.net Thu May 19 07:28:36 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 19 May 2011 07:28:36 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD44AA6.9030600@canterbury.ac.nz> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD44AA6.9030600@canterbury.ac.nz> Message-ID: <ir29q8$90n$1@dough.gmane.org> On 19.05.2011 00:39, Greg Ewing wrote: > Ethan Furman wrote: > >> some_var[3] == b'd' >> >> 1) a check to see if the bytes instance is length 1 >> 2) a check to see if >> i) the other object is an int, and >> 2) 0 <= other_obj < 256 >> 3) if 1 and 2, make the comparison instead of returning NotImplemented? > > It might seem convenient, but I'd worry that it would lead to > even more confusion in other ways. 
If someone sees that > > some_var[3] == b'd' > > is true, and that > > some_var[3] == 100 > > is also true, they might expect to be able to do things > like > > n = b'd' + 1 > > and get 101... or maybe b'e'... Maybe they should :) Georg From g.brandl at gmx.net Thu May 19 07:32:18 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 19 May 2011 07:32:18 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <4DD41FF9.9040704@zhuliguan.net> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> <4DD41FF9.9040704@zhuliguan.net> Message-ID: <ir2a16$90n$2@dough.gmane.org> On 18.05.2011 21:37, Hagen F?rstenau wrote: >> P.S. "Shouldn't" makes it sound as if there was a mistake. > > Well, I thought there was. When do these tags get merged into "cpython" > then? "v3.2.1b1" is there, but "v3.2.1rc1" isn't: > > http://hg.python.org/cpython/tags 3.2.1b1 was already merged back. (And 3.2.1rc1 will also be merged back soon, since there will be a 3.2.1rc2.) Georg From stefan_ml at behnel.de Thu May 19 08:11:20 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 19 May 2011 08:11:20 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD44208.70101@canterbury.ac.nz> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <iqvp43$et3$1@dough.gmane.org> <4DD44208.70101@canterbury.ac.nz> Message-ID: <ir2ca8$h97$1@dough.gmane.org> Greg Ewing, 19.05.2011 00:02: > Georg Brandl wrote: > >> We do have >> >> bytes.fromhex('deadbeef') > > But again, there is a run-time overhead to this. Well, yes, but it's negligible if you assign it to a suitable variable first. 
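As a minimal sketch of that suggestion (MAGIC and starts_with_magic are names made up for this example, not anything from the thread or the standard library), the conversion can be done once at import time so that later code only pays for a name lookup:

```python
# Built once, when the module is imported.
MAGIC = bytes.fromhex('deadbeef')


def starts_with_magic(data):
    """Hypothetical helper: no hex parsing happens on each call."""
    return data.startswith(MAGIC)


print(MAGIC)                                         # b'\xde\xad\xbe\xef'
print(starts_with_magic(b'\xde\xad\xbe\xef rest'))   # True
print(starts_with_magic(b'something else'))          # False
```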
Stefan From hagen at zhuliguan.net Thu May 19 09:01:01 2011 From: hagen at zhuliguan.net (=?ISO-8859-1?Q?Hagen_F=FCrstenau?=) Date: Thu, 19 May 2011 09:01:01 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <ir2a16$90n$2@dough.gmane.org> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> <4DD41FF9.9040704@zhuliguan.net> <ir2a16$90n$2@dough.gmane.org> Message-ID: <4DD4C02D.2030100@zhuliguan.net> > 3.2.1b1 was already merged back. (And 3.2.1rc1 will also be merged back > soon, since there will be a 3.2.1rc2.) Thanks for the clarification! :-) Cheers, Hagen From python-dev at masklinn.net Thu May 19 09:41:08 2011 From: python-dev at masklinn.net (Xavier Morel) Date: Thu, 19 May 2011 09:41:08 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <ir29q8$90n$1@dough.gmane.org> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD44AA6.9030600@canterbury.ac.nz> <ir29q8$90n$1@dough.gmane.org> Message-ID: <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net> On 2011-05-19, at 07:28 , Georg Brandl wrote: > On 19.05.2011 00:39, Greg Ewing wrote: >> Ethan Furman wrote: >> >>> some_var[3] == b'd' >>> >>> 1) a check to see if the bytes instance is length 1 >>> 2) a check to see if >>> i) the other object is an int, and >>> 2) 0 <= other_obj < 256 >>> 3) if 1 and 2, make the comparison instead of returning NotImplemented? >> >> It might seem convenient, but I'd worry that it would lead to >> even more confusion in other ways. If someone sees that >> >> some_var[3] == b'd' >> >> is true, and that >> >> some_var[3] == 100 >> >> is also true, they might expect to be able to do things >> like >> >> n = b'd' + 1 >> >> and get 101... 
or maybe b'e'... > > Maybe they should :) But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well? From ncoghlan at gmail.com Thu May 19 09:49:47 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 May 2011 17:49:47 +1000 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD41997.4060401@trueblade.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <4DD41997.4060401@trueblade.com> Message-ID: <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com> On Thu, May 19, 2011 at 5:10 AM, Eric Smith <eric at trueblade.com> wrote: > On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote: >> Robert Collins writes: >> >> ?> Its probably too late to change, but please don't try to argue that >> ?> its correct: the continued confusion of folk running into this is >> ?> evidence that confusion *is happening*. Treat that as evidence and >> ?> think about how to fix it going forward. >> >> Sorry, Rob, but you're just wrong here, and Nick is right. ?It's >> possible to improve Python 3, but not to "fix" it in this respect. >> The Python 3 solution is correct, the Python 2 approach is not. >> There's no way to avoid discontinuity and confusion here. > > I don't think there's any connection between the way 2.x confused text > strings and binary data (which certainly needed addressing) with the way > that 3.x returns a different type for byte_str[i] than it does for > byte_str[i:i+1]. I think it's the latter that's confusing to people. > There's no particular requirement for different types that's needed to > fix the byte/str problem. It's a mental model problem. 
People try to think of bytes as equivalent to 2.x str and that's just wrong, wrong, wrong. It's far closer to array.array('c'). Strings are basically *unique* in returning a length 1 instance of themselves for indexing operations. For every other sequence type, including tuples, lists and arrays, slicing returns a new instance of the same type, while indexing will typically return something different. Now, we definitely didn't *help* matters by keeping so many of the default behaviours of bytes() and bytearray() coupled to ASCII-encoded text, but that was a matter of practicality beating purity: there really *are* a lot of wire protocols out there that are ASCII based. In hindsight, perhaps we should have gone further in breaking things to try to make the point about the mental model shift more forcefully. (However, that idea carries with it its own problems). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From stephen at xemacs.org Thu May 19 10:00:24 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 19 May 2011 17:00:24 +0900 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTikh6D1KgF7_k_guy4s4JETpz5cJpw@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <BANLkTikh6D1KgF7_k_guy4s4JETpz5cJpw@mail.gmail.com> Message-ID: <87fwobazmv.fsf@uwakimon.sk.tsukuba.ac.jp> Robert Collins writes: > Thats separate to the implementation issues I have mentioned in this > thread and previous. Oops, sorry. Nevertheless, I personally think that b'a'[0] == 97 is a good idea, and consistent with everything else in Python. 
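
The indexing-versus-slicing point can be checked directly. This is a small sketch for Python 3 only; the variable names are made up for the illustration.

```python
text = 'abc'
data = b'abc'
items = [97, 98, 99]

# str is the odd one out: indexing gives back another str of length 1.
str_index_type = type(text[0])

# bytes and list behave alike: indexing gives the element,
# slicing gives a new instance of the container type.
bytes_index_type = type(data[0])
bytes_slice_type = type(data[0:1])
list_index_type = type(items[0])
list_slice_type = type(items[0:1])

# The comparison mentioned above.
first_byte_is_97 = (b'a'[0] == 97)
```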
It's Unicode (str) that is weird; str is surprising when first encountered by a C or Lisp programmer, but not enough to cause a heart attack given how weird natural language is. But I don't see why that weirdness (an element of LIST of TYPE is a LIST of TYPE, hey, young man, you're very smart but *it's turtles all the way down!*) should be replicated elsewhere. If you want your bytes object to behave like a str, it's very easy to get that (.decode('latin1')), and nobody has yet demonstrated that this is too time-inefficient for real work, given the other overhead imposed by Python. The space inefficiency could be dealt with as Greg points out (by internally having a Unicode representation using 1 byte instead of 2 or 4). But if you want your bytes object to *be* a string, then you're confused. It isn't (any more). Even if it's just a matter of flipping one bit in the type field, a str-with-unibyte-representation is not equal to a bytes object with the same bytes. For example, you write: > urlparse converting bytes to 'str' to operate on them is at best a > kludge - you're forcing 5 times the storage (the original bytes + 4 > bytes-per-byte when its decoded into unicode) to work on something > which is defined as a BNF * that uses ascii *. Indeed it (RFC 3986) does *use* ASCII. But I think there is confusion in your words. This is what the RFC says about that use of ASCII: 2. Characters The URI syntax provides a method of encoding data, presumably for the sake of identifying a resource, as a sequence of characters. [...] The ABNF notation defines its terminal values to be non-negative integers (codepoints) based on the US-ASCII coded character set [ASCII]. Because a URI is a sequence of characters, we must invert that relation in order to understand the URI syntax. Therefore, the integer values used by the ABNF must be mapped back to their corresponding characters via US-ASCII in order to complete the syntax rules.
Ie, ASCII is *irrelevant* to (the modern definition of) URLs except as it is a convenient and familiar way to refer to a certain familiar and rather small set of *characters*. There are reasons for this (that I'm not going to rehash here), and they are the *same* reasons why Python 3's behavior is "correct" IMHO (modulo the issue about the type of a list element, which I discuss above). It is true that one might like there to be a literal that expresses `ord(bytes-object-of-length-one)', ie, something like o'a' == 97. (This is different from Greg's x'6465616462656566' == b'deadbeef', which I don't think helps solve the confusion problem although it would definitely be convenient.) From python-dev at masklinn.net Thu May 19 10:05:04 2011 From: python-dev at masklinn.net (Xavier Morel) Date: Thu, 19 May 2011 10:05:04 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <4DD41997.4060401@trueblade.com> <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com> Message-ID: <78681B05-171D-40EF-BEFC-0ABE57FBE3ED@masklinn.net> On 2011-05-19, at 09:49 , Nick Coghlan wrote: > On Thu, May 19, 2011 at 5:10 AM, Eric Smith <eric at trueblade.com> wrote: >> On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote: >>> Robert Collins writes: >>> >>> > Its probably too late to change, but please don't try to argue that >>> > its correct: the continued confusion of folk running into this is >>> > evidence that confusion *is happening*. Treat that as evidence and >>> > think about how to fix it going forward. >>> >>> Sorry, Rob, but you're just wrong here, and Nick is right. 
It's >>> possible to improve Python 3, but not to "fix" it in this respect. >>> The Python 3 solution is correct, the Python 2 approach is not. >>> There's no way to avoid discontinuity and confusion here. >> >> I don't think there's any connection between the way 2.x confused text >> strings and binary data (which certainly needed addressing) with the way >> that 3.x returns a different type for byte_str[i] than it does for >> byte_str[i:i+1]. I think it's the latter that's confusing to people. >> There's no particular requirement for different types that's needed to >> fix the byte/str problem. > > It's a mental model problem. People try to think of bytes as > equivalent to 2.x str and that's just wrong, wrong, wrong. It's far > closer to array.array('c'). Strings are basically *unique* in > returning a length 1 instance of themselves for indexing operations. > For every other sequence type, including tuples, lists and arrays, > slicing returns a new instance of the same type, while indexing will > typically return something different. > > Now, we definitely didn't *help* matters by keeping so many of the > default behaviours of bytes() and bytearray() coupled to ASCII-encoded > text, but that was a matter of practicality beating purity: there > really *are* a lot of wire protocols out there that are ASCII based. > In hindsight, perhaps we should have gone further in breaking things > to try to make the point about the mental model shift more forcefully. > (However, that idea carries with it its own problems). For what it's worth, Erlang's approach to the subject is, in my opinion, excellent: binaries (whose literals are called "bit syntax" there) are quite distinct from strings in both syntax and API, but you can put chunks of strings within binaries (the bit syntax acts as a container, in which you can put a literal or non-literal string).
This simultaneously impresses upon the user that binaries are *not* strings and that they can still easily create binaries from strings. From stefan_ml at behnel.de Thu May 19 10:37:03 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 19 May 2011 10:37:03 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD44AA6.9030600@canterbury.ac.nz> <ir29q8$90n$1@dough.gmane.org> <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net> Message-ID: <ir2krg$20r$1@dough.gmane.org> Xavier Morel, 19.05.2011 09:41: > On 2011-05-19, at 07:28 , Georg Brandl wrote: >> On 19.05.2011 00:39, Greg Ewing wrote: >>> If someone sees that >>> >>> some_var[3] == b'd' >>> >>> is true, and that >>> >>> some_var[3] == 100 >>> >>> is also true, they might expect to be able to do things >>> like >>> >>> n = b'd' + 1 >>> >>> and get 101... or maybe b'e'... >> >> Maybe they should :) > > But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well? The result of this must obviously be b"de1". 
Stefan From ncoghlan at gmail.com Thu May 19 10:43:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 May 2011 18:43:54 +1000 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD2F661.2050005@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> Message-ID: <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> OK, summarising the thread so far from my point of view. 1. There are some aspects of the behavior of bytes() objects that tempt people to think of them as string-like objects (primarily the b'' literals and their use in repr(), along with the fact that they fill roles that were filled by str in its "arbitrary binary data" incarnation in Python 2.x). The mental model this creates in the reader is incorrect, as bytes() are far closer to array.array('c') in their underlying behaviour (and deliberately so - cf. PEP 358, 3112, 3137). One proposal for addressing this is to add an x'deadbeef' literal and use that in repr() rather than the bytestring. Another would be to escape all characters, even printable ASCII, in the bytes() representation. Both of these are undesirable, as they miss the original purpose of this behaviour: making it easier to work with the many ASCII based wire protocols that are in widespread use. To be honest, I don't think there is a lot we can do here except to further emphasise in the documentation and elsewhere that *bytes is not a string type* (regardless of any API similarities retained to ease transition from the 2.x series). For example, if we have any lingering references to "byte strings" they should be replaced with "byte sequences" or "bytes objects" (depending on context, as the former phrasing also encompasses bytearray objects). 2.
As a concrete usability issue, it is awkward to programmatically check the value of a specific byte when working with an ASCII based protocol:

    data[i] == b'a'       # Intuitive, but always False due to type mismatch
    data[i:i+1] == b'a'   # Works, but clumsy
    data[i] == b'a'[0]    # Ditto (but at least susceptible to compiler const-expression optimisation)
    data[i] == ord('a')   # Clumsy and slow
    data[i] == 97         # Hard to read

Proposals to address this include:

- introduce a "character" literal to allow c'a' as an alternative to ord('a')
  Potentially workable, but leaves the intuitive answer above silently producing an unexpected answer
- allow 1-element byte sequences to compare equal to the corresponding integer values.
  - would require reworking of bytes.__hash__ to use the hash of the contained element when the data length is exactly 1
  - transitivity of equality would recommend also supporting equivalences such as b'a' == 97.0
  - backwards compatibility concerns arise due to introduction of new key collisions in dictionaries and sets and other value based containers
  - yet more string-like behaviour in a type that is *not* a string (further reinforcing the mistaken impression from point 1)
  - One thing that *isn't* a concern from my point of view is the fact that we have ample precedent in decimal.Decimal for supporting implicit coercion in comparison operations while disallowing them in arithmetic operations (Decimal("1") == 1.0 is allowed, but Decimal("1") + 1.0 will raise TypeError).

For point 2, I'm personally +0 on the idea of having 1-element bytes and bytearray objects delegate hashing and comparison operations to the corresponding integer object. We have the power to make the obvious code correct code, so let's do that. However, the implications of the additional key collisions in value based containers may need to be explored further.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Thu May 19 10:54:18 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 May 2011 18:54:18 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Skip some tests in the absence of multiprocessing. In-Reply-To: <4DD3F906.2080100@netwok.org> References: <E1QMDyC-0004jq-1J@dinsdale.python.org> <4DD3F906.2080100@netwok.org> Message-ID: <BANLkTikwW-Eh33B4zBooH3qCpVCUakA8ig@mail.gmail.com> On Thu, May 19, 2011 at 2:51 AM, Éric Araujo <merwok at netwok.org> wrote: > Isn't support.import_module or somesuch useful for this kind of checks? You have to restructure your tests into the appropriate files for that to work, as support.import_module() throws SkipTest if the module isn't available. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu May 19 11:03:10 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 May 2011 19:03:10 +1000 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <1305754449.27389.30.camel@marge> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <1305754449.27389.30.camel@marge> Message-ID: <BANLkTikxbGi7UPt4_KfBvS7C5fQCA24TGA@mail.gmail.com> On Thu, May 19, 2011 at 7:34 AM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > But it is slower whereas I read somewhere than generators are faster > than loops. Are you sure it wasn't that generator expressions can be faster than list comprehensions (if the memory savings are significant)? Or that a reduction function with a generator expression can be faster than a module-level explicit loop (due to the replacement of dict-based variable assignment with fast locals in the generator and C looping in the reduction function)?
In general, as long as both are using fast locals and looping in Python, I would expect inline looping code to be faster than the equivalent generator (but often harder to maintain due to lack of reusability). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From lukasz at langa.pl Thu May 19 11:25:23 2011 From: lukasz at langa.pl (=?iso-8859-2?Q?=A3ukasz_Langa?=) Date: Thu, 19 May 2011 11:25:23 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <ir2krg$20r$1@dough.gmane.org> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD44AA6.9030600@canterbury.ac.nz> <ir29q8$90n$1@dough.gmane.org> <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net> <ir2krg$20r$1@dough.gmane.org> Message-ID: <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl> Stefan Behnel wrote on 2011-05-19 at 10:37: >> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well? > > The result of this must obviously be b"de1". I hope you're joking. At best, the result should be b"de\x01". But I don't think such a construct should be allowed, just as you can't do `[1, 2, 3] + 4`. I wouldn't ever expect a single byte to behave like a sequence of bytes. In the case of bytes, b'a' is obviously still a sequence of bytes that just happens to store a single one. Indexing should return a byte, so I'm not surprised it returns a number. Slicing, on the other hand, returns a sub-sequence. However inconvenient, I find the current behaviour logical and predictable. A shortcut for b'a'[0] would obviously be nice but that's for python-ideas.
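
To make the behaviour discussed above concrete, here is a small Python 3 sketch. It evaluates the comparison spellings listed earlier in the thread against the first byte of a request line, and checks that adding an integer to a sequence is rejected. `BYTE_G`, `results` and `addition_is_rejected` are hypothetical names chosen for this example.

```python
data = b'GET / HTTP/1.1'

# Look up the integer value of the byte once (hypothetical constant name).
BYTE_G = b'G'[0]

# Comparison spellings applied to the first byte of `data`.
results = {
    "data[0] == b'G'": data[0] == b'G',        # int vs bytes: never equal
    "data[0:1] == b'G'": data[0:1] == b'G',    # bytes vs bytes
    "data[0] == b'G'[0]": data[0] == b'G'[0],  # int vs int
    "data[0] == ord('G')": data[0] == ord('G'),
    "data[0] == BYTE_G": data[0] == BYTE_G,
}


def addition_is_rejected(left, right):
    """Return True if `left + right` raises TypeError."""
    try:
        left + right
    except TypeError:
        return True
    return False
```

Only the first spelling evaluates to False; `addition_is_rejected([1, 2, 3], 4)` and `addition_is_rejected(b'de', 1)` both report a TypeError.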
-- Best regards, ?ukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o. From stefan_ml at behnel.de Thu May 19 12:06:19 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 19 May 2011 12:06:19 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD44AA6.9030600@canterbury.ac.nz> <ir29q8$90n$1@dough.gmane.org> <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net> <ir2krg$20r$1@dough.gmane.org> <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl> Message-ID: <ir2q2r$ag$1@dough.gmane.org> ?ukasz Langa, 19.05.2011 11:25: > Wiadomo?? napisana przez Stefan Behnel w dniu 2011-05-19, o godz. 10:37: > >>> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well? >> >> The result of this must obviously be b"de1". > > I hope you're joking. I "obviously" was. My point is that expectations and "obvious behaviour" may not be obvious to everyone. Nick summed it up very nicely IMHO. 
Stefan From catch-all at masklinn.net Thu May 19 12:12:56 2011 From: catch-all at masklinn.net (Xavier Morel) Date: Thu, 19 May 2011 12:12:56 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD44AA6.9030600@canterbury.ac.nz> <ir29q8$90n$1@dough.gmane.org> <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net> <ir2krg$20r$1@dough.gmane.org> <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl> Message-ID: <052FA5C5-F6F2-4702-9E8A-E78C8E6DD34F@masklinn.net> On 2011-05-19, at 11:25 , ?ukasz Langa wrote: > Wiadomo?? napisana przez Stefan Behnel w dniu 2011-05-19, o godz. 10:37: > >>> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well? >> >> The result of this must obviously be b"de1". > I hope you're joking. At best, the result should be b"de\x01". Actually, if `b'd'+1` returns `b'e'` an equivalent behavior should be that `b'de'+1` returns `b'df'`. From victor.stinner at haypocalc.com Thu May 19 12:34:29 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 19 May 2011 12:34:29 +0200 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <4DD44C6D.8000808@canterbury.ac.nz> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <1305754449.27389.30.camel@marge> <4DD44C6D.8000808@canterbury.ac.nz> Message-ID: <1305801269.2380.4.camel@marge> Le jeudi 19 mai 2011 ? 
10:47 +1200, Greg Ewing a écrit :
> Victor Stinner wrote:
>
> > squares = (x*x for x in range(10000))
>
> What bytecode would you optimise that into?

I suppose that you have the current value of range(10000) on the stack: DUP_TOP; BINARY_MULTIPLY; gives you the square. You don't need the x variable (LOAD_FAST/STORE_FAST).

Full example using a function (instead of loop, so I need to load x):
-----------
import dis, opcode, struct
def f(x):
    return x*x

def patch_bytecode(f, bytecode):
    fcode = f.__code__
    code_type = type(f.__code__)
    new_code = code_type(
        fcode.co_argcount,
        fcode.co_kwonlyargcount,
        fcode.co_nlocals,
        fcode.co_stacksize,
        fcode.co_flags,
        bytecode,
        fcode.co_consts,
        fcode.co_names,
        fcode.co_varnames,
        fcode.co_filename,
        fcode.co_name,
        fcode.co_firstlineno,
        fcode.co_lnotab,
    )
    f.__code__ = new_code

print("Original:")
print("f(4) = %s" % f(4))
dis.dis(f)
print()

LOAD_FAST = opcode.opmap['LOAD_FAST']
DUP_TOP = opcode.opmap['DUP_TOP']
BINARY_MULTIPLY = opcode.opmap['BINARY_MULTIPLY']
RETURN_VALUE = opcode.opmap['RETURN_VALUE']
bytecode = struct.pack(
    '=BHBBB',
    LOAD_FAST, 0,
    DUP_TOP,
    BINARY_MULTIPLY,
    RETURN_VALUE)

print("Patched:")
patch_bytecode(f, bytecode)
print("f(4) patched = %s" % f(4))
dis.dis(f)
-----------

Output:
-----------
$ python3 square.py
Original:
f(4) = 16
  3           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 RETURN_VALUE

Patched:
f(4) patched = 16
  3           0 LOAD_FAST                0 (x)
              3 DUP_TOP
              4 BINARY_MULTIPLY
              5 RETURN_VALUE
-----------

Victor

From solipsis at pitrou.net Thu May 19 12:37:27 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 19 May 2011 12:37:27 +0200 Subject: [Python-Dev] Python 3.x and bytes References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <4DD41997.4060401@trueblade.com> <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com> Message-ID: <20110519123727.408b401f@pitrou.net> On Thu, 19 May 2011 17:49:47 +1000 Nick Coghlan <ncoghlan at gmail.com> wrote: > > It's a mental model problem. People try to think of bytes as > equivalent to 2.x str and that's just wrong, wrong, wrong. It's far > closer to array.array('c'). Strings are basically *unique* in > returning a length 1 instance of themselves for indexing operations. > For every other sequence type, including tuples, lists and arrays, > slicing returns a new instance of the same type, while indexing will > typically return something different. > > Now, we definitely didn't *help* matters by keeping so many of the > default behaviours of bytes() and bytearray() coupled to ASCII-encoded > text, but that was a matter of practicality beating purity: there > really *are* a lot of wire protocols out there that are ASCII based. I think "practicality beating purity" should have been extended to __getitem__ as well. I have almost never had a use for treating a bytestring as a sequence of integers, while treating a bytestring as a sequence of one-byte strings is *very* common. (and, as you say, if you want a sequence of integers you can already use array.array() which gives you more flexibility as to the width and signedness of integers) Regards Antoine. From victor.stinner at haypocalc.com Thu May 19 12:39:57 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 19 May 2011 12:39:57 +0200 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <ir1slo$ibs$1@dough.gmane.org> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <1305754449.27389.30.camel@marge> <ir1slo$ibs$1@dough.gmane.org> Message-ID: <1305801597.2380.9.camel@marge> Le mercredi 18 mai 2011 ? 
21:44 -0400, Terry Reedy a écrit :
> On 5/18/2011 5:34 PM, Victor Stinner wrote:
>
> You initial example gave me the impression that the issue has something
> to do with join in particular, or even comprehensions in particular. It
> is really about for loops.
>
> >>> dis('for x in range(3): y = x*x')
> ...
>  >>   13 FOR_ITER                16 (to 32)
>       16 STORE_NAME               1 (x)
>       19 LOAD_NAME                1 (x)
>       22 LOAD_NAME                1 (x)
>       25 BINARY_MULTIPLY
>       26 STORE_NAME               2 (y)
> ...

Yeah, "STORE_NAME; LOAD_NAME; LOAD_NAME" can be replaced by a single opcode: DUP_TOP. But the user expects x to be defined outside the loop:

>>> for x in range(3): y = x*x
...
>>> x
2

Well, it is possible to detect whether x is used after the loop, but that is a little more complex to optimize than a list comprehension or generator :-)

> .. you cannot get that with Python code without a much smarter optimizer.

Yes, I would like to write a smarter optimizer. But I first asked whether it would be acceptable to avoid the temporary loop variable, because it changes the Python language: the user can expect to see a loop variable using introspection or a debugger. That's why I suggested enabling the optimization only if Python is running in optimized mode (python -O or python -OO).
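
The language-level difference described above can be sketched as follows, assuming Python 3: a `for` statement leaves its loop variable bound after the loop, while a list comprehension keeps its iteration variable to itself. The function names are hypothetical, chosen only for this example.

```python
def loop_variable_survives():
    """A for statement leaves its loop variable bound afterwards."""
    for x in range(3):
        y = x * x
    # x still holds the last value produced by range(3).
    return x, y


def comprehension_variable_is_private():
    """In Python 3 the comprehension's x does not leak into this scope."""
    squares = [x * x for x in range(3)]
    # Report whether a local named 'x' is visible after the comprehension.
    return squares, 'x' in locals()
```

This is why dropping the variable looks safer for comprehensions and generator expressions than for plain loops: code after a `for` statement may legitimately read `x`.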
Victor From ncoghlan at gmail.com Thu May 19 13:02:12 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 May 2011 21:02:12 +1000 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> Message-ID: <BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com> On Thu, May 19, 2011 at 6:43 PM, Nick Coghlan <ncoghlan at gmail.com> wrote: > For point 2, I'm personally +0 on the idea of having 1-element bytes > and bytearray objects delegate hashing and comparison operations to > the corresponding integer object. We have the power to make the > obvious code correct code, so let's do that. However, the implications > of the additional key collisions in value based containers may need to > be explored further. On further reflection, the key collision and semantics blurring problems mean I am at best -0 on this particular solution to the problem (and heading fairly rapidly in the direction of -1). Best to just go with b'a'[0] and let the optimiser sort it out (PyPy should handle it automatically, CPython would need work). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From fuzzyman at voidspace.org.uk Thu May 19 13:29:07 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 19 May 2011 12:29:07 +0100 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD44AA6.9030600@canterbury.ac.nz> <ir29q8$90n$1@dough.gmane.org> <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net> <ir2krg$20r$1@dough.gmane.org> <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl> Message-ID: <4DD4FF03.5070005@voidspace.org.uk> On 19/05/2011 10:25, ?ukasz Langa wrote: > Wiadomo?? napisana przez Stefan Behnel w dniu 2011-05-19, o godz. 10:37: > >>> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well? >> The result of this must obviously be b"de1". > I hope you're joking. At best, the result should be b"de\x01". The behaviour Stefan suggests is what some "weakly typed" languages like perl (and possibly php?) do, which masks errors and is rightly abhorred by Python programmers (although semantically not *so* different from 1 + 1.0 == 2.0). I think it's safe to say that Stefan was joking. Michael > But I don't think such construct should be allowed. Just like you can't do `[1, 2, 3] + 4`. I wouldn't ever expect that a single byte behaves like a sequence of bytes. In the case of bytes b'a' is obviously still a sequence of bytes, just happening to store a single one. Indexing should return a byte so I'm not surprised it returns a number. Slicing on the other hand returns a sub-sequence. > > However inconvenient, I find the current behaviour logical and predictable. 
A shortcut for b'a'[0] would obviously be nice but that's for python-ideas. > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From ziade.tarek at gmail.com Thu May 19 13:35:39 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Thu, 19 May 2011 13:35:39 +0200 Subject: [Python-Dev] packaging landed in stdlib Message-ID: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com> Hey I've pushed packaging into the stdlib. There are a few buildbot errors we're fixing right now. We will continue our work there directly from now on. The next "big" commit will be for the documentation, Cheers Tarek -- Tarek Ziadé | http://ziade.org From greg.ewing at canterbury.ac.nz Thu May 19 14:16:31 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 May 2011 00:16:31 +1200 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <1305801269.2380.4.camel@marge> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <1305754449.27389.30.camel@marge> <4DD44C6D.8000808@canterbury.ac.nz> <1305801269.2380.4.camel@marge> Message-ID: <4DD50A1F.3010008@canterbury.ac.nz> Victor Stinner wrote: > I suppose that you have the current value of range(10000) on the stack: > DUP_TOP; BINARY_MULTIPLY; gives you the square. You don't need the x > variable (LOAD_FAST/STORE_FAST). That seems far too special-purpose to be worth it to me. 
-- Greg From doug.hellmann at gmail.com Thu May 19 15:07:27 2011 From: doug.hellmann at gmail.com (Doug Hellmann) Date: Thu, 19 May 2011 09:07:27 -0400 Subject: [Python-Dev] looking for a contact at Google on the Blogger team Message-ID: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> Several of the PSF blogs hosted on Google's Blogger platform are experiencing issues as fallout from the recent maintenance problems they had. We have already had to recreate at least one of the translations for Python Insider in order to be able to publish to it, and now we can't edit posts on Python Insider itself. Can anyone put me in contact with someone at Google from the Blogger team? I would at least like to know whether the "bX-qpvq7q" problem is being worked on, so I can decide whether to take a hiatus or start moving us to another platform. There are a lot of posts about the error on the support forums, but no obvious response from Google. Thanks, Doug -- Doug Hellmann Communications Director Python Software Foundation http://python.org/psf/ From tseaver at palladion.com Thu May 19 19:05:36 2011 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 19 May 2011 13:05:36 -0400 Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1 In-Reply-To: <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com> References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net> <4DD4194F.9020009@v.loewis.de> <ir176p$25a$1@dough.gmane.org> <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com> Message-ID: <ir3il1$ulp$1@dough.gmane.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/18/2011 10:46 PM, anatoly techtonik wrote: > On Wed, May 18, 2011 at 10:37 PM, Georg Brandl <g.brandl at gmx.net> wrote: >> On 18.05.2011 21:09, "Martin v. Löwis" wrote: >>> On 18.05.2011 20:39, Hagen Fürstenau wrote: >>>>> On behalf of the Python development team, I am pleased to announce the >>>>> first release candidate of Python 3.2.1. >>>> >>>> Shouldn't there be a tag "v3.2.1rc1" in the hg repo? 
>>> >>> http://hg.python.org/releasing/3.2.1/ >>> >>> Regards, >>> Martin >>> >>> P.S. "Shouldn't" makes it sound as if there was a mistake. >> >> To clarify: once the final is done, the repo Martin mentioned will be >> merged back to main and then vanish. > > Can't this work be done in the branch of main repo, so that everybody > can track the progress in place? Is there any picture of the process > similar to http://nvie.com/posts/a-successful-git-branching-model/ ? Note that in that writeup, 'release-*' (and 'hotfix-*') branches are not shown as pushed to the 'origin' repository. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk3VTeAACgkQ+gerLs4ltQ42kgCeMbIDH6zRU5uyd0Su28Nb9E5q WAMAniWnrvzRReDa+b3mYtavbyaywGVJ =Dr2p -----END PGP SIGNATURE----- From ziade.tarek at gmail.com Thu May 19 19:12:14 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Thu, 19 May 2011 19:12:14 +0200 Subject: [Python-Dev] packaging landed in stdlib In-Reply-To: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com> References: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com> Message-ID: <BANLkTinDyNY0E_NyECXeNGW4zgqHadCqBw@mail.gmail.com> On Thu, May 19, 2011 at 1:35 PM, Tarek Ziadé <ziade.tarek at gmail.com> wrote: > Hey > > I've pushed packaging in stdlib. There are a few buildbots errors > we're fixing right now. FYI, there are still some failures we're fixing. Thanks for your patience and thanks to the folks that are helping me on this :) I expect the bbots to be back on track later today Cheers Tarek -- Tarek Ziadé 
| http://ziade.org From ethan at stoneleaf.us Thu May 19 19:50:10 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 19 May 2011 10:50:10 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> Message-ID: <4DD55852.9070903@stoneleaf.us> Nick Coghlan wrote: > OK, summarising the thread so far from my point of view. [snip] > To be honest, I don't think there is a lot we can do here except to > further emphasise in the documentation and elsewhere that *bytes is > not a string type* (regardless of any API similarities retained to > ease transition from the 2.x series). For example, if we have any > lingering references to "byte strings" they should be replaced with > "byte sequences" or "bytes objects" (depending on context, as the > former phrasing also encompasses bytearray objects). I think this would be a big help. > 2. 
As a concrete usability issue, it is awkward to programmatically > check the value of a specific byte when working with an ASCII based > protocol: > > data[i] == b'a' # Intuitive, but always False due to type mismatch > data[i:i+1] == b'a' # Works, but clumsy > data[i] == b'a'[0] # Ditto (but at least susceptible to compiler > const-expression optimisation) > data[i] == ord('a') # Clumsy and slow > data[i] == 97 # Hard to read > > Proposals to address this include: > - introduce a "character" literal to allow c'a' as an alternative to ord('a') > Potentially workable, but leaves the intuitive answer above > silently producing an unexpected answer [snip] > For point 2, I'm personally +0 on the idea of having 1-element bytes > and bytearray objects delegate hashing and comparison operations to > the corresponding integer object. We have the power to make the > obvious code correct code, so let's do that. However, the implications > of the additional key collisions in value based containers may need to > be explored further. Nick Coghlan also wrote: > On further reflection, the key collision and semantics blurring > problems mean I am at best -0 on this particular solution to the > problem (and heading fairly rapidly in the direction of -1). Last thought I have for a possible 'solution' -- when a bytes object is tested for equality against an int raise TypeError. Precedent being sum() raising a TypeError when passed a list of strings because performance is so poor. Reason here being that the intuitive behavior will never work and will always produce silent bugs. 
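(For readers following along: a minimal sketch, assuming Python 3 semantics, of how the five spellings quoted above evaluate. The sample payload, the index, and the `checks` name are illustrative only and do not come from the thread.)

```python
# Sample payload and index are illustrative, not taken from the thread.
data = b"abc"
i = 0

checks = {
    "data[i] == b'a'": data[i] == b"a",            # int vs bytes: always False
    "data[i:i+1] == b'a'": data[i:i + 1] == b"a",  # bytes vs bytes
    "data[i] == b'a'[0]": data[i] == b"a"[0],      # int vs int
    "data[i] == ord('a')": data[i] == ord("a"),    # int vs int
    "data[i] == 97": data[i] == 97,                # int vs int
}

for spelling, outcome in checks.items():
    print(f"{spelling:22} -> {outcome}")
```

Only the first spelling compares an int with a bytes object, which is why it is the one that silently yields False.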
~Ethan~ From guido at python.org Thu May 19 19:43:02 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 19 May 2011 10:43:02 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> Message-ID: <BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com> On Thu, May 19, 2011 at 1:43 AM, Nick Coghlan <ncoghlan at gmail.com> wrote: > OK, summarising the thread so far from my point of view. > > 1. There are some aspects of the behavior of bytes() objects that > tempt people to think of them as string-like objects (primarily the > b'' literals and their use in repr(), along with the fact that they > fill roles that were filled by str in its "arbitrary binary data" > incarnation in Python 2.x). The mental model this creates in the > reader is incorrect, as bytes() are far closer to array.array('c') in > their underlying behaviour (and deliberately so - cf. PEP 358, 3112, > 3137). I think most of this "wrong mental model" is actually due to people not having completely internalized the Python 3 way. > One proposal for addressing this is to add a x'deadbeef' literal and > using that in repr() rather than the bytestring. Another would be to > escape all characters, even printable ASCII, in the bytes() > representation. Both of these are undesirable, as they miss the > original purpose of this behaviour: making it easier to work with the > many ASCII based wire protocols that are in widespread use. Indeed, -1 on both. 
> To be honest, I don't think there is a lot we can do here except to > further emphasise in the documentation and elsewhere that *bytes is > not a string type* (regardless of any API similarities retained to > ease transition from the 2.x series). For example, if we have any > lingering references to "byte strings" they should be replaced with > "byte sequences" or "bytes objects" (depending on context, as the > former phrasing also encompasses bytearray objects). +1 > 2. As a concrete usability issue, it is awkward to programmatically > check the value of a specific byte when working with an ASCII based > protocol: > > data[i] == b'a' # Intuitive, but always False due to type mismatch > data[i:i+1] == b'a' # Works, but clumsy > data[i] == b'a'[0] # Ditto (but at least susceptible to compiler > const-expression optimisation) > data[i] == ord('a') # Clumsy and slow > data[i] == 97 # Hard to read > > Proposals to address this include: > - introduce a "character" literal to allow c'a' as an alternative to ord('a') -1; the result is not a *character* but an integer. I'm personally favoring using b'a'[0] and possibly hiding this in a constant definition. > Potentially workable, but leaves the intuitive answer above > silently producing an unexpected answer I'm not convinced that that problem is any worse than other comparison-related problems. E.g. b'a' == 'a' also always returns False (most likely it'll be disguised by at least one operand being a variable of course.) > - allow 1-element byte sequences to compare equal to the corresponding > integer values. > - would require reworking of bytes.__hash__ to use the hash of the > contained element when the data length is exactly 1 > - transitivity of equality would recommend also supporting > equivalences such as b'a' == 97.0 > - backwards compatibility concerns arise due to introduction of > new key collisions in dictionaries and sets and other value based > containers > 
- yet more string-like behaviour in a type that is *not* a string > (further reinforcing the mistaken impression from point 1) > - One thing that *isn't* a concern from my point of view is the > fact that we have ample precedent in decimal.Decimal for supporting > implicit coercion in comparison operations while disallowing them in > arithmetic operations (Decimal("1") == 1.0 is allowed, but > Decimal("1") + 1.0 will raise TypeError). > > For point 2, I'm personally +0 on the idea of having 1-element bytes > and bytearray objects delegate hashing and comparison operations to > the corresponding integer object. We have the power to make the > obvious code correct code, so let's do that. However, the implications > of the additional key collisions in value based containers may need to > be explored further. My gut feeling about this is that this will probably introduce some confusing or unintended side effect elsewhere, and I am -1 on this change. -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu May 19 19:46:14 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 19 May 2011 10:46:14 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD55852.9070903@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> <4DD55852.9070903@stoneleaf.us> Message-ID: <BANLkTimYJc0s=WJjMzwWtHwwC+JKmY68Og@mail.gmail.com> On Thu, May 19, 2011 at 10:50 AM, Ethan Furman <ethan at stoneleaf.us> wrote: > Last thought I have for a possible 'solution' -- when a bytes object is > tested for equality against an int raise TypeError. Precedent being sum() > raising a TypeError when passed a list of strings because performance is so > poor. 
Reason here being that the intuitive behavior will never work and > will always produce silent bugs. Not the same thing at all. The == operator is special, and should not raise exceptions; too many things would start randomly failing (e.g. membership tests for a dict that has both ints and bytes as keys, or for a list containing a variety of types). -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu May 19 19:56:23 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 19 May 2011 10:56:23 -0700 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <1305754449.27389.30.camel@marge> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <1305754449.27389.30.camel@marge> Message-ID: <BANLkTik+av3-HSTRPGJYwiq07dSGtcV6zw@mail.gmail.com> On Wed, May 18, 2011 at 2:34 PM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > On Wednesday 18 May 2011 at 16:19 +0200, Nadeem Vawda wrote: >> I'm not sure why you would encounter code like that in the first place. > > Well, I found the STORE_FAST/LOAD_FAST "issue" while trying to optimize > the this module which reimplements rot13 using a dict in Python 3: > > d = {} > for c in (65, 97): > for i in range(26): > d[chr(i+c)] = chr((i+13) % 26 + c) > > I tried: > > d = {chr(i+c): chr((i+13) % 26 + c) > for i in range(26) > for c in (65, 97)} > > But it is slower whereas I read somewhere that generators are faster > than loops. I'm curious where you read that. The explicit loop should be faster or equally fast *except* when you can avoid a loop in bytecode by applying map() to a built-in function. However map() with a lambda is significantly slower. Maybe what you recall actually (correctly) said that a comprehension is faster than map+lambda? > By the way, (c for c in ...) is slower than [c for c > in ...]. 
I suppose that a generator is slower because it exits/re-enters > PyEval_EvalFrameEx() at each step, whereas [c for c ...] uses > BUILD_LIST in a dummy (but fast) loop. Did you test this in Python 2 or 3? In 2 the genexpr is definitely slower than the comprehension; in 3 I'm not sure there's much difference any more. > (c for c in ...) and [c for c in ...] is stupid, but I used a simplified > example to explain the problem. A more realistic example would be: > > squares = (x*x for x in range(10000)) > > You don't really need the "x" variable, you just want the square. > Another example is the syntax using an if to filter the data set: > > (x for x in ... if condition(x)) > >> > I heard about optimization in the AST tree instead of working on the >> > bytecode. What is the status of this project? >> >> Are you referring to issue11549? There was some related discussion [1] on >> python-dev about six weeks ago, but I haven't seen anything on the topic >> since then. > > Ah yes, it looks to be this issue. I didn't know that there was an > issue. Hm, probably. -- --Guido van Rossum (python.org/~guido) From glyph at twistedmatrix.com Thu May 19 20:22:20 2011 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Thu, 19 May 2011 14:22:20 -0400 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> <BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com> Message-ID: <16FC9995-2C52-44C2-BDDE-7E7E4B54C9E3@twistedmatrix.com> On May 19, 2011, at 1:43 PM, Guido van Rossum wrote: > -1; the result is not a *character* but an integer. 
Well, really the result ought to be an octet, but I suppose adding an 'octet' type is beyond the scope of even this sprawling discussion :). > I'm personally favoring using b'a'[0] and possibly hiding this in a constant definition. As someone who spends a frankly unfortunate amount of time handling protocols where things like this are necessary, I agree with this recommendation. In protocols where one needs to compare network data with one-byte type identifiers or packet prefixes, more (documented) constants and less inscrutable junk like if p == 'c': ... elif p == 'j': ... elif p == 'J': # for compatibility ... would definitely be a good thing. Of course, I realize that this sort of programmer will most likely replace those constants with 99, 106, 74 than take a moment to document what they mean, but at least they'll have to pause for a moment and realize that they have now lost _all_ mnemonics... In fact, I feel like I would want to push in the opposite direction: don't treat one-byte bytes slices less like integers; I wish I could more easily treat n-byte sequences _more_ like integers! :). More protocols have 2-byte or 4-byte network-endian packed integers embedded in them than have individual tag bytes that I want to examine. For the typical ASCII-ish protocol where you want to look at command names and CRLF-separated messages, you'd never want to look at an individual octet, stringish operations like split() will give you what you want. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110519/16c4fe14/attachment.html> From g.brandl at gmx.net Thu May 19 20:30:18 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 19 May 2011 20:30:18 +0200 Subject: [Python-Dev] packaging landed in stdlib In-Reply-To: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com> References: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com> Message-ID: <ir3njj$u61$1@dough.gmane.org> On 19.05.2011 13:35, Tarek Ziadé wrote: > Hey > > I've pushed packaging in stdlib. There are a few buildbots errors > we're fixing right now. > > We will continue our work there directly from now on. Rock on! Georg From g.brandl at gmx.net Thu May 19 21:31:01 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 19 May 2011 21:31:01 +0200 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <ir2krg$20r$1@dough.gmane.org> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz> <4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us> <4DD44AA6.9030600@canterbury.ac.nz> <ir29q8$90n$1@dough.gmane.org> <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net> <ir2krg$20r$1@dough.gmane.org> Message-ID: <ir3r5d$ii1$1@dough.gmane.org> On 19.05.2011 10:37, Stefan Behnel wrote: > Xavier Morel, 19.05.2011 09:41: >> On 2011-05-19, at 07:28 , Georg Brandl wrote: >>> On 19.05.2011 00:39, Greg Ewing wrote: >>>> If someone sees that >>>> >>>> some_var[3] == b'd' >>>> >>>> is true, and that >>>> >>>> some_var[3] == 100 >>>> >>>> is also true, they might expect to be able to do things like >>>> >>>> n = b'd' + 1 >>>> >>>> and get 101... or maybe b'e'... >>> >>> Maybe they should :) >> >> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? 
If >> a 1-byte bytes is equivalent to an integer, why not an arbitrary one as >> well? > > The result of this must obviously be b"de1". To clarify my original one-liner: if bytes objects (but only one-char bytes objects) equal integers, you should rightly expect to treat them as integers. This is obviously *not* desirable from a strong-typing POV. Georg From tjreedy at udel.edu Thu May 19 22:36:42 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 19 May 2011 16:36:42 -0400 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com> <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com> <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp> <4DD41997.4060401@trueblade.com> <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com> Message-ID: <ir3v0p$g3d$1@dough.gmane.org> On 5/19/2011 3:49 AM, Nick Coghlan wrote: > It's a mental model problem. People try to think of bytes as > equivalent to 2.x str and that's just wrong, wrong, wrong. It's far > closer to array.array('c'). Or like C char arrays > Strings are basically *unique* in > returning a length 1 instance of themselves for indexing operations. I still remember having to work that out and get used to it. 
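(A short sketch, assuming Python 3, of the indexing behaviour described above: a str index yields another str of length 1, whereas bytes and list indexing yield the element itself. The variable names are illustrative only.)

```python
text = "abc"
raw = b"abc"
items = [1, 2, 3]

str_item = text[0]    # a str of length 1, so it can be indexed again
bytes_item = raw[0]   # an int in range(256), not a bytes object
list_item = items[0]  # the element itself, not a one-element list

# Slicing, by contrast, returns a sub-sequence of the same type in all three cases.
bytes_slice = raw[0:1]

print(type(str_item).__name__, type(bytes_item).__name__, type(bytes_slice).__name__)
```

Because `text[0][0][0]` keeps returning a str, habits formed on strings carry over badly to `bytes`, which behaves like the list here.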
-- Terry Jan Reedy From skip at pobox.com Fri May 20 01:47:57 2011 From: skip at pobox.com (skip at pobox.com) Date: Thu, 19 May 2011 18:47:57 -0500 Subject: [Python-Dev] Don't set local variable in a list comprehension or generator In-Reply-To: <ir1til$lo6$1@dough.gmane.org> References: <1305721315.16682.10.camel@marge> <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com> <ir1cpj$52a$1@dough.gmane.org> <BANLkTinih_3ght8GSknCNrTjaD3t9i-Ayw@mail.gmail.com> <ir1til$lo6$1@dough.gmane.org> Message-ID: <19925.44077.710651.843807@montanaro.dyndns.org> On 5/18/2011 10:19 AM, Nadeem Vawda wrote: > I'm not sure why you would encounter code like that in the first place. > Surely any code of the form: > > ''.join(c for c in my_string) > > would just return my_string? Or am I missing something? You might more-or-less legitimately encounter it if the generator expression originally contained a condition which got removed. Skip From victor.stinner at haypocalc.com Fri May 20 00:51:23 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 20 May 2011 00:51:23 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Issue #12120, Issue #12119: tests were missing a sys.dont_write_bytecode check In-Reply-To: <E1QN7SR-0004Nt-A9@dinsdale.python.org> References: <E1QN7SR-0004Nt-A9@dinsdale.python.org> Message-ID: <1305845483.10075.5.camel@marge> Python 3.3 is not supposed to create .pyc files in the same directory as the .py files. So I don't understand the following code. 
On Thursday 19 May 2011 at 19:56 +0200, tarek.ziade wrote: > http://hg.python.org/cpython/rev/9d1fb6a9104b > changeset: 70207:9d1fb6a9104b > user: Tarek Ziade <tarek at ziade.org> > date: Thu May 19 19:56:12 2011 +0200 > summary: > Issue #12120, Issue #12119: tests were missing a sys.dont_write_bytecode check > > files: > Lib/distutils/tests/test_build_py.py | 3 ++- > Lib/packaging/tests/test_command_build_py.py | 3 ++- > Misc/NEWS | 3 +++ > 3 files changed, 7 insertions(+), 2 deletions(-) > > > diff --git a/Lib/distutils/tests/test_build_py.py b/Lib/distutils/tests/test_build_py.py > --- a/Lib/distutils/tests/test_build_py.py > +++ b/Lib/distutils/tests/test_build_py.py > @@ -58,7 +58,8 @@ > pkgdest = os.path.join(destination, "pkg") > files = os.listdir(pkgdest) > self.assertTrue("__init__.py" in files) > - self.assertTrue("__init__.pyc" in files) > + if not sys.dont_write_bytecode: > + self.assertTrue("__init__.pyc" in files) > self.assertTrue("README.txt" in files) > > def test_empty_package_dir (self): > diff --git a/Lib/packaging/tests/test_command_build_py.py b/Lib/packaging/tests/test_command_build_py.py > --- a/Lib/packaging/tests/test_command_build_py.py > +++ b/Lib/packaging/tests/test_command_build_py.py > @@ -61,7 +61,8 @@ > pkgdest = os.path.join(destination, "pkg") > files = os.listdir(pkgdest) > self.assertIn("__init__.py", files) > - self.assertIn("__init__.pyc", files) > + if not sys.dont_write_bytecode: > + self.assertIn("__init__.pyc", files) > self.assertIn("README.txt", files) > > def test_empty_package_dir(self): > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -153,6 +153,9 @@ > Library > ------- > > +- Issue #12120, #12119: skip a test in packaging and distutils > + if sys.dont_write_bytecode is set to True. > + > - Issue #12065: connect_ex() on an SSL socket now returns the original errno > when the socket's timeout expires (it used to return None). 
> > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins From ethan at stoneleaf.us Fri May 20 02:40:26 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 19 May 2011 17:40:26 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> <BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com> Message-ID: <4DD5B87A.9060902@stoneleaf.us> Nick Coghlan wrote: > On Thu, May 19, 2011 at 6:43 PM, Nick Coghlan <ncoghlan at gmail.com> wrote: >> For point 2, I'm personally +0 on the idea of having 1-element bytes >> and bytearray objects delegate hashing and comparison operations to >> the corresponding integer object. We have the power to make the >> obvious code correct code, so let's do that. However, the implications >> of the additional key collisions in value based containers may need to >> be explored further. Several folk have said that objects that compare equal must hash equal... Why? It's an honest question. Here's what I have tried: --> class Wierd(): ... def __init__(self, value): ... self.value = value ... def __eq__(self, other): ... return self.value == other ... def __hash__(self): ... return hash((self.value + 13) ** 3) ... 
--> one = Wierd(1) --> two = Wierd(2) --> three = Wierd(3) --> one <Wierd object at 0x00BFE710> --> one == 1 True --> one == 2 False --> two == 2 True --> three == 3 True --> d = dict() --> d[one] = '1' --> d[two] = '2' --> d[three] = '3' --> d {<Wierd object at 0x00BFE710>: '1', <Wierd object at 0x00BFE870>: '3', <Wierd object at 0x00BFE830>: '2'} --> d[1] = '1.0' --> d[2] = '2.0' --> d[3] = '3.0' --> d {<Wierd object at 0x00BFE870>: '3', 1: '1.0', 2: '2.0', 3: '3.0', <Wierd object at 0x00BFE830>: '2', <Wierd object at 0x00BFE710>: '1'} --> d[2] '2.0' --> d[two] '2' This behavior matches what I was imagining for having b'a' == 97. They compare equal, yet remain distinct objects for all other purposes. If anybody has a link to or an explanation why equal values must be equal hashes I'm all ears. My apologies in advance if this is an incredibly naive question. ~Ethan~ From benjamin at python.org Fri May 20 02:51:16 2011 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 19 May 2011 19:51:16 -0500 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD5B87A.9060902@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> <BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com> <4DD5B87A.9060902@stoneleaf.us> Message-ID: <BANLkTimQ+yPB+2_PM_u5MMYUZ88XbUYebQ@mail.gmail.com> 2011/5/19 Ethan Furman <ethan at stoneleaf.us>: > If anybody has a link to or an explanation why equal values must be equal > hashes I'm all ears. My apologies in advance if this is an incredibly naive > question. 
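(A minimal sketch of what goes wrong when equal objects hash differently, assuming CPython's dict and set, which compare hashes before falling back to ==. The class name `Mismatched` is hypothetical; it reuses the hash formula from the session quoted above.)

```python
class Mismatched:
    """Hypothetical class: compares equal to an int but hashes differently."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other

    def __hash__(self):
        # Same formula as in the quoted session; differs from hash(self.value).
        return hash((self.value + 13) ** 3)


one = Mismatched(1)

equal_by_eq = (one == 1)     # equality alone says they are the same value
found_in_list = 1 in [one]   # list membership only uses ==
found_in_set = 1 in {one}    # set lookup checks the hash first and misses

d = {one: "object"}
d[1] = "int"                 # lands in a separate entry instead of overwriting
entry_count = len(d)

print(equal_by_eq, found_in_list, found_in_set, entry_count)
```

The containers disagree about whether `1` is present, which is the inconsistency the "equal objects must hash equal" rule exists to prevent.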
https://secure.wikimedia.org/wikipedia/en/wiki/Hash_table -- Regards, Benjamin From raymond.hettinger at gmail.com Fri May 20 05:10:44 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 19 May 2011 22:10:44 -0500 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD5B87A.9060902@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> <BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com> <4DD5B87A.9060902@stoneleaf.us> Message-ID: <8102E548-63BE-4674-902E-C458DC5FBA9F@gmail.com> On May 19, 2011, at 7:40 PM, Ethan Furman wrote: > Several folk have said that objects that compare equal must hash equal... And so do the docs: http://docs.python.org/dev/reference/datamodel.html#object.__hash__ , "the only required property is that objects which compare equal have the same hash value". Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-dev/attachments/20110519/f8100944/attachment.html> From eliben at gmail.com Fri May 20 09:02:03 2011 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 20 May 2011 10:02:03 +0300 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> Message-ID: <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> On Thu, May 19, 2011 at 16:07, Doug Hellmann <doug.hellmann at gmail.com> wrote: > Several of the PSF blogs hosted on Google's Blogger platform are experiencing issues as fallout from the recent maintenance problems they had. 
We have already had to recreate at least one of the translations for Python Insider in order to be able to publish to it, and now we can't edit posts on Python Insider itself. > > Can anyone put me in contact with someone at Google from the Blogger team? I would at least like to know whether the "bX-qpvq7q" problem is being worked on, so I can decide whether to take a hiatus or start moving us to another platform. There are a lot of posts about the error on the support forums, but no obvious response from Google. > With respect to Google Blogger, I don't see a good reason to use it as the platform for the blog. IMHO it would be much better to go for a less-dependencies approach and just deploy a Wordpress installation, or possibly even something Python-based (if volunteers to maintain it are found). Eli From ncoghlan at gmail.com Fri May 20 10:40:09 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 20 May 2011 18:40:09 +1000 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> Message-ID: <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> On Fri, May 20, 2011 at 5:02 PM, Eli Bendersky <eliben at gmail.com> wrote: > On Thu, May 19, 2011 at 16:07, Doug Hellmann <doug.hellmann at gmail.com> wrote: >> Several of the PSF blogs hosted on Google's Blogger platform are experiencing issues as fallout from the recent maintenance problems they had. We have already had to recreate at least one of the translations for Python Insider in order to be able to publish to it, and now we can't edit posts on Python Insider itself. >> >> Can anyone put me in contact with someone at Google from the Blogger team? I would at least like to know whether the "bX-qpvq7q" problem is being worked on, so I can decide whether to take a hiatus or start moving us to another platform. 
There are a lot of posts about the error on the support forums, but no obvious response from Google. >> > > With respect to Google Blogger, I don't see a good reason to use it as > the platform for the blog. As with any infrastructure, there is a reasonably high cost in changing, as people have become used to a certain way of doing things, and porting the contents from the old system to the new one requires additional effort. Blogger has its problems, but it typically gets the job done well enough (modulo cases like the one currently affecting Doug and his team). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From Tim.Golden at cbsoutdoor.co.uk Fri May 20 10:38:24 2011 From: Tim.Golden at cbsoutdoor.co.uk (Tim Golden) Date: Fri, 20 May 2011 09:38:24 +0100 Subject: [Python-Dev] os.access on Windows Message-ID: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local> There's a thread on python-list at the moment: http://mail.python.org/pipermail/python-list/2011-May/1272505.html which is discussing the validity of os.access results on Windows. Now we've been here before: I raised issue2528 for a previous enquiry some years ago and proffered a patch which uses the AccessCheck API to perform the equivalent check, but didn't follow through. Someone on the new thread is suggesting -- validly -- that the docs should highlight the limitations of this call on Windows. But the docs for that call are already fairly involved: http://docs.python.org/library/os.html#os.access We seem to have a few options in increasing order of difficulty: * Do nothing - inform the occasional enquirer of the situation and leave it at that. * Update the docs to add something which describes what the function actually does on the Windows platform. (Whether or not we change any code).
* Apply the patch in issue2528 to 3.3 and maybe 2.7 * Leave os.access alone but offer alternative Windows-specific functionality in the os module or elsewhere, using essentially the code in the issue2528 patch. As a side note, the pywin32 packages don't actually include AccessCheck at the moment. (Which makes it slightly harder to explain to people how they could do this check for themselves). It could probably be added over there which might ease the burden over here. Opinions? TJG From ncoghlan at gmail.com Fri May 20 11:21:24 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 20 May 2011 19:21:24 +1000 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DD5B87A.9060902@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> <BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com> <4DD5B87A.9060902@stoneleaf.us> Message-ID: <BANLkTi=smKJaW9vYM0v8isU5_n5UBUZs6g@mail.gmail.com> On Fri, May 20, 2011 at 10:40 AM, Ethan Furman <ethan at stoneleaf.us> wrote: > This behavior matches what I was imagining for having > b'a' == 97. They compare equal, yet remain distinct objects > for all other purposes. > > If anybody has a link to or an explanation why equal values must be equal > hashes I'm all ears. My apologies in advance if this is an incredibly naive > question. Because whether or not two objects can coexist in the same hash table should *not* depend on their hash values - it should depend on whether or not they compare equal to each other. The use of hashing should just be an optimisation, not fundamentally change the nature of the comparison operation. (i.e. "hash(a) == hash(b) and a == b" is meant to be a fast alternative to "a == b", not a completely different check). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From eliben at gmail.com Fri May 20 11:39:22 2011 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 20 May 2011 12:39:22 +0300 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> Message-ID: <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> >> With respect to Google Blogger, I don't see a good reason to use it as >> the platform for the blog. > > As with any infrastructure, there is a reasonably high cost in > changing, as people have become used to a certain way of doing things, > and porting the contents from the old system to the new one requires > additional effort. > > Blogger has its problems, but it typically gets the job done well > enough (modulo cases like the one currently affecting Doug and his > team). Has the Python insider blog really accumulated enough history and cruft to make this move problematic? It's a fairly new blog, with not much content in it. From my blogging experience, Blogger has other limitations which eventually bite you, and since it's not very flexible you can either live with it or move to a more flexible platform. All of this completely IMHO, of course. 
Just friendly advice ;-) Eli From jnoller at gmail.com Fri May 20 16:24:29 2011 From: jnoller at gmail.com (Jesse Noller) Date: Fri, 20 May 2011 10:24:29 -0400 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> Message-ID: <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> On Fri, May 20, 2011 at 5:39 AM, Eli Bendersky <eliben at gmail.com> wrote: >>> With respect to Google Blogger, I don't see a good reason to use it as >>> the platform for the blog. >> >> As with any infrastructure, there is a reasonably high cost in >> changing, as people have become used to a certain way of doing things, >> and porting the contents from the old system to the new one requires >> additional effort. >> >> Blogger has its problems, but it typically gets the job done well >> enough (modulo cases like the one currently affecting Doug and his >> team). > > Has the Python insider blog really accumulated enough history and > cruft to make this move problematic? It's a fairly new blog, with not > much content in it. From my blogging experience, Blogger has other > limitations which eventually bite you, and since it's not very > flexible you can either live with it or move to a more flexible > platform. > > All of this completely IMHO, of course. Just friendly advice ;-) > Eli There is ongoing work for an RFP by the board to improve the python.org publishing system/site to allow us to self-host these things. Moving PSF properties off of it, and onto another "hosted by someone else" site is probably not a good idea, but our hands may be forced if google/blogger can not resolve the issues. 
jesse From brian.curtin at gmail.com Fri May 20 17:21:02 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Fri, 20 May 2011 10:21:02 -0500 Subject: [Python-Dev] os.access on Windows In-Reply-To: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local> References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local> Message-ID: <BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com> On Fri, May 20, 2011 at 03:38, Tim Golden <Tim.Golden at cbsoutdoor.co.uk> wrote: > There's a thread on python-list at the moment: > > http://mail.python.org/pipermail/python-list/2011-May/1272505.html > > which is discussing the validity of os.access results on > Windows. Now we've been here before: I raised issue2528 > for a previous enquiry some years ago and proffered a patch > which uses the AccessCheck API to perform the equivalent check, > but didn't follow through. > > Someone on the new thread is suggesting -- validly -- that the > docs should highlight the limitations of this call on Windows. > But the docs for that call are already fairly involved: > > http://docs.python.org/library/os.html#os.access > > We seem to have a few options in increasing order of difficulty: > > * Do nothing - inform the occasional enquirer of the situation and > leave it at that. > > * Update the docs to add something which describes what the function > actually does on the Windows platform. (Whether or not we change any > code). > I think we should tread lightly in the documentation area. We already have two note boxes, and adding a third probably scares everyone away. Maybe there should be a bullet list of considerations to be made when using os.access? > * Apply the patch in issue2528 to 3.3 and maybe 2.7 > I'd vote in favor of this. If we can be a bit smarter in determining os.access results, let's do it. I haven't reviewed the patch other than a one-minute scan, but I'll put this on my radar and try to get you a review.
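A minimal sketch of the two approaches at issue in this thread, not taken from the issue2528 patch: asking os.access() first versus simply attempting the operation and handling the failure. The helper name read_if_possible and the scratch file are hypothetical and exist only for illustration. Modern Python 3 is assumed, where a failed open() is reported as OSError.

```python
import os
import tempfile


def read_if_possible(path):
    """Attempt the read and report failure, rather than asking first."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError:
        # Covers a missing file and a refused permission alike.
        return None


# Create a scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(b"payload")
    scratch_path = scratch.name

# "Look before you leap": os.access() returns a bool, but the answer is only
# a hint. It can be stale by the time the file is opened, and it may not
# reflect the full permission picture on every platform.
looked_readable = os.access(scratch_path, os.R_OK)

# Attempting the operation is the check that actually counts.
data = read_if_possible(scratch_path)

os.remove(scratch_path)
missing = read_if_possible(scratch_path)
```

Under these assumptions the os.access() result serves at most as a hint for a user-facing message, while the try/except around open() decides what the program really does.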
From mail at timgolden.me.uk Fri May 20 17:25:45 2011 From: mail at timgolden.me.uk (Tim Golden) Date: Fri, 20 May 2011 16:25:45 +0100 Subject: [Python-Dev] os.access on Windows In-Reply-To: <BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com> References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local> <BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com> Message-ID: <4DD687F9.1040403@timgolden.me.uk> On 20/05/2011 16:21, Brian Curtin wrote: > On Fri, May 20, 2011 at 03:38, Tim Golden <Tim.Golden at cbsoutdoor.co.uk (Sorry about that; I had no idea I'd sent that from my work account) > I think we should tread lightly in the documentation area. We already > have two note boxes, and adding a third probably scares everyone away. I entirely agree. (That's what I meant by "involved" above) > Maybe there should be a bullet list of considerations to be made when > using os.access? > > * Apply the patch in issue2528 to 3.3 and maybe 2.7 > > > I'd vote in favor of this. If we can be a bit smarter in determining > os.access results, let's do it. > > I haven't reviewed the patch other than 1 minute scan, but I'll put this > on my radar and try to get you a review. Thanks. To be honest I wrote the patch 3 years ago; I haven't even tried to apply it to either of the current posixmodule.c.
Let's see if I can dust it off and mould it into shape, or you'll be left fighting patch errors instead of reviewing code :) TJG From ziade.tarek at gmail.com Fri May 20 17:29:16 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Fri, 20 May 2011 17:29:16 +0200 Subject: [Python-Dev] packaging landed in stdlib In-Reply-To: <ir3njj$u61$1@dough.gmane.org> References: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com> <ir3njj$u61$1@dough.gmane.org> Message-ID: <BANLkTimC-QEFP5u+2QHSgsYag50DB1k2Pg@mail.gmail.com> On Thu, May 19, 2011 at 8:30 PM, Georg Brandl <g.brandl at gmx.net> wrote: > On 19.05.2011 13:35, Tarek Ziadé wrote: >> Hey >> >> I've pushed packaging in stdlib. There are a few buildbot errors >> we're fixing right now. >> >> We will continue our work in there directly from now on. > > Rock on! Thanks :) Still working on some issues under Windows and Solaris bbots today, but we're getting there. Sorry for the inconvenience. Tarek > > Georg -- Tarek Ziadé
| http://ziade.org From eliben at gmail.com Fri May 20 17:35:56 2011 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 20 May 2011 18:35:56 +0300 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> Message-ID: <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> > There is ongoing work for an RFP by the board to improve the > python.org publishing system/site to allow us to self-host these > things. Moving PSF properties off of it, and onto another "hosted by > someone else" site is probably not a good idea, but our hands may be > forced if google/blogger can not resolve the issues. > > jesse The whole idea of a Wordpress-(or similar)-based solution is self hosting, and less reliance on outside providers like blogger. Wordpress is just a bunch of PHP code you place in a directory on your server and you have a blog. You don't depend on anyone, except your own hosting. Eli From ncoghlan at gmail.com Fri May 20 17:37:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 21 May 2011 01:37:35 +1000 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> Message-ID: <BANLkTik9v8O250C07Zz7U=GYkAkpW_JxzQ@mail.gmail.com> On Fri, May 20, 2011 at 7:39 PM, Eli Bendersky <eliben at gmail.com> wrote: > Has the Python insider blog really accumulated enough history and > cruft to make this move problematic? 
It's a fairly new blog, with not > much content in it. From my blogging experience, Blogger has other > limitations which eventually bite you, and since it's not very > flexible you can either live with it or move to a more flexible > platform. It's not just the Python Insider blog that is affected (and *any* effort directed towards platform changes is effort that isn't going towards writing new articles. Of course, if Blogger don't fix the current problems, then that will be a moot point - moving will be necessary to get *anything* done). In general, though, infrastructure changes start from a position of "not worth the hassle", just like code changes. It takes a pretty compelling set of features to justify switching, and, while Blogger isn't the best engine out there, it isn't terrible either (especially once you replace their lousy comment system with something that is at least half usable like DISQUS). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri May 20 17:44:39 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 21 May 2011 01:44:39 +1000 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> Message-ID: <BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com> On Sat, May 21, 2011 at 1:35 AM, Eli Bendersky <eliben at gmail.com> wrote: >> There is ongoing work for an RFP by the board to improve the >> python.org publishing system/site to allow us to self-host these >> things.
Moving PSF properties off of it, and onto another "hosted by >> someone else" site is probably not a good idea, but our hands may be >> forced if google/blogger can not resolve the issues. >> >> jesse > > The whole idea of a Wordpress-(or similar)-based solution is self > hosting, and less reliance on outside providers like blogger. > Wordpress is just a bunch of PHP code you place in a directory on your > server and you have a blog. You don't depend on anyone, except your > own hosting. As Jesse has said, there is an RFP in development to improve python.org to the point where we can self-host blogs and the like and deal with the associated user account administration appropriately. But when it comes to collaborative blogs, it *isn't* just a matter of dropping a blogging engine in and running with it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tseaver at palladion.com Fri May 20 18:00:20 2011 From: tseaver at palladion.com (Tres Seaver) Date: Fri, 20 May 2011 12:00:20 -0400 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> Message-ID: <ir636k$8v1$1@dough.gmane.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/20/2011 11:35 AM, Eli Bendersky wrote: >> There is ongoing work for an RFP by the board to improve the >> python.org publishing system/site to allow us to self-host these >> things. Moving PSF properties off of it, and onto another "hosted by >> someone else" site is probably not a good idea, but our hands may be >> forced if google/blogger can not resolve the issues.
>> >> jesse > > The whole idea of a Wordpress-(or similar)-based solution is self > hosting, and less reliance on outside providers like blogger. > Wordpress is just a bunch of PHP code you place in a directory on your > server and you have a blog. You don't depend on anyone, except your > own hosting. And your own sysadmins now have to chase fixes for remotely-exploitable WP bugs: http://www.wordpressexploit.com/ Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk3WkBQACgkQ+gerLs4ltQ72iwCeIhkCLXm26ujJJ3kqh9vKB4fr dMYAn05qsoyiNxio02UAYJ7luLjVaSML =OFdv -----END PGP SIGNATURE----- From status at bugs.python.org Fri May 20 18:07:23 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 20 May 2011 18:07:23 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20110520160723.0E4011CE30@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-05-13 - 2011-05-20) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 2794 (+10) closed 21115 (+46) total 23909 (+56) Open issues with patches: 1201 Issues opened (37) ================== #8796: Deprecate codecs.open() http://bugs.python.org/issue8796 reopened by haypo #11377: Deprecate platform.popen() http://bugs.python.org/issue11377 reopened by eric.araujo #12068: test_logging failure in test_rollover http://bugs.python.org/issue12068 reopened by pitrou #12073: regrtest: use faulthandler to dump the tracebacks on SIGUSR1 http://bugs.python.org/issue12073 opened by haypo #12074: regrtest: display the current number of failures http://bugs.python.org/issue12074 opened by haypo #12075: python3.2 memory leak when setting integer key in dictionary http://bugs.python.org/issue12075 opened by kaizhu #12077: Harmonizing descriptor protocol documentation http://bugs.python.org/issue12077 opened by davide.rizzo #12079: decimal.py: TypeError precedence in fma() http://bugs.python.org/issue12079 opened by skrah #12080: decimal.py: performance in _power_exact http://bugs.python.org/issue12080 opened by skrah #12081: Remove distributed copy of libffi http://bugs.python.org/issue12081 opened by benjamin.peterson #12082: Python/import.c still references fstat even with DONT_HAVE_FST http://bugs.python.org/issue12082 opened by joshtriplett #12084: os.stat() on windows doesn't consider relative symlink http://bugs.python.org/issue12084 opened by ocean-city #12085: subprocess.Popen.__del__ raises AttributeError if __init__ was http://bugs.python.org/issue12085 opened by chortos #12086: Tutorial doesn't discourage name mangling http://bugs.python.org/issue12086 opened by sheep #12087: install_egg_info fails with UnicodeEncodeError depending on lo http://bugs.python.org/issue12087 opened by hagen #12089: regrtest.py doesn't check for unexpected output anymore? 
http://bugs.python.org/issue12089 opened by haypo #12090: 3.2: build --without-threads fails http://bugs.python.org/issue12090 opened by skrah #12091: multiprocessing: simplify ApplyResult and MapResult with threa http://bugs.python.org/issue12091 opened by charles-francois.natali #12097: python.exe crashes if it is unable to find its .dll http://bugs.python.org/issue12097 opened by techtonik #12098: Child process running as debug on Windows http://bugs.python.org/issue12098 opened by thebits #12100: Incremental encoders of CJK codecs reset the codec at each cal http://bugs.python.org/issue12100 opened by haypo #12101: PEPs should have consecutive revision numbers http://bugs.python.org/issue12101 opened by techtonik #12102: mmap requires file to be synced http://bugs.python.org/issue12102 opened by rion4ik at gmail.com #12103: Documentation of open() does not claim 'e' support in mode str http://bugs.python.org/issue12103 opened by mmarkk #12105: open() does not able to set flags, such as O_CLOEXEC http://bugs.python.org/issue12105 opened by mmarkk #12106: reflect syntatic sugar in with ast http://bugs.python.org/issue12106 opened by benjamin.peterson #12107: TCP listening sockets created without FD_CLOEXEC flag http://bugs.python.org/issue12107 opened by Christophe.Devriese #12112: The new packaging module should not use the locale encoding http://bugs.python.org/issue12112 opened by haypo #12113: test_packaging fails when run twice http://bugs.python.org/issue12113 opened by pitrou #12114: packaging.util._find_exe_version(): potential deadlock http://bugs.python.org/issue12114 opened by haypo #12115: some tests need to be skipped on threadless systems http://bugs.python.org/issue12115 opened by tarek #12117: Failures with PYTHONDONTWRITEBYTECODE: test_importlib, test_im http://bugs.python.org/issue12117 opened by pitrou #12121: test_packaging failure when ssl is not available http://bugs.python.org/issue12121 opened by pitrou #12124: python -m test 
test_packaging test_zipimport failure http://bugs.python.org/issue12124 opened by haypo #12125: test_sysconfig fails on OpenIndiana because of test_packaging http://bugs.python.org/issue12125 opened by haypo #12126: incorrect select documentation http://bugs.python.org/issue12126 opened by exarkun #12127: Inconsistent leading zero treatment http://bugs.python.org/issue12127 opened by Peter.Wentworth Most recent 15 issues with no replies (15) ========================================== #12126: incorrect select documentation http://bugs.python.org/issue12126 #12125: test_sysconfig fails on OpenIndiana because of test_packaging http://bugs.python.org/issue12125 #12121: test_packaging failure when ssl is not available http://bugs.python.org/issue12121 #12114: packaging.util._find_exe_version(): potential deadlock http://bugs.python.org/issue12114 #12106: reflect syntatic sugar in with ast http://bugs.python.org/issue12106 #12100: Incremental encoders of CJK codecs reset the codec at each cal http://bugs.python.org/issue12100 #12091: multiprocessing: simplify ApplyResult and MapResult with threa http://bugs.python.org/issue12091 #12085: subprocess.Popen.__del__ raises AttributeError if __init__ was http://bugs.python.org/issue12085 #12066: Empty ('') xmlns attribute is not properly handled by xml.dom. 
http://bugs.python.org/issue12066 #12063: tokenize module appears to treat unterminated single and doubl http://bugs.python.org/issue12063 #12055: doctest not working on nested functions http://bugs.python.org/issue12055 #12053: Add prefetch() for Buffered IO (experiment) http://bugs.python.org/issue12053 #12037: test_email failures under Windows with the eol extension activ http://bugs.python.org/issue12037 #12029: ABC registration of Exceptions http://bugs.python.org/issue12029 #12019: Dead or buggy code in importlib.test.__main__ http://bugs.python.org/issue12019 Most recent 15 issues waiting for review (15) ============================================= #12124: python -m test test_packaging test_zipimport failure http://bugs.python.org/issue12124 #12114: packaging.util._find_exe_version(): potential deadlock http://bugs.python.org/issue12114 #12112: The new packaging module should not use the locale encoding http://bugs.python.org/issue12112 #12106: reflect syntatic sugar in with ast http://bugs.python.org/issue12106 #12105: open() does not able to set flags, such as O_CLOEXEC http://bugs.python.org/issue12105 #12102: mmap requires file to be synced http://bugs.python.org/issue12102 #12100: Incremental encoders of CJK codecs reset the codec at each cal http://bugs.python.org/issue12100 #12098: Child process running as debug on Windows http://bugs.python.org/issue12098 #12091: multiprocessing: simplify ApplyResult and MapResult with threa http://bugs.python.org/issue12091 #12085: subprocess.Popen.__del__ raises AttributeError if __init__ was http://bugs.python.org/issue12085 #12084: os.stat() on windows doesn't consider relative symlink http://bugs.python.org/issue12084 #12074: regrtest: display the current number of failures http://bugs.python.org/issue12074 #12073: regrtest: use faulthandler to dump the tracebacks on SIGUSR1 http://bugs.python.org/issue12073 #12057: HZ codec has no test http://bugs.python.org/issue12057 #12049: expose RAND_bytes() function of 
OpenSSL http://bugs.python.org/issue12049 Top 10 most discussed issues (10) ================================= #11610: Improving property to accept abstract methods http://bugs.python.org/issue11610 12 msgs #6721: Locks in python standard library should be sanitized on fork http://bugs.python.org/issue6721 9 msgs #12105: open() does not able to set flags, such as O_CLOEXEC http://bugs.python.org/issue12105 9 msgs #11877: Change os.fsync() to support physical backing store syncs http://bugs.python.org/issue11877 8 msgs #12086: Tutorial doesn't discourage name mangling http://bugs.python.org/issue12086 8 msgs #12112: The new packaging module should not use the locale encoding http://bugs.python.org/issue12112 8 msgs #12127: Inconsistent leading zero treatment http://bugs.python.org/issue12127 8 msgs #1615158: POSIX capabilities support http://bugs.python.org/issue1615158 8 msgs #6727: ImportError when package is symlinked on Windows http://bugs.python.org/issue6727 7 msgs #12097: python.exe crashes if it is unable to find its .dll http://bugs.python.org/issue12097 7 msgs Issues closed (49) ================== #4621: zipfile returns string but expects binary http://bugs.python.org/issue4621 closed by haypo #5723: Incomplete json tests http://bugs.python.org/issue5723 closed by ezio.melotti #6059: ctypes/uuid-related segmentation fault http://bugs.python.org/issue6059 closed by charles-francois.natali #6498: Py_Main() does not return on SystemExit http://bugs.python.org/issue6498 closed by python-dev #7656: test_hashlib fails on some installations (specifically Neal's http://bugs.python.org/issue7656 closed by gregory.p.smith #7960: test.support.captured_output has invalid docstring example http://bugs.python.org/issue7960 closed by ezio.melotti #8650: zlibmodule.c isn't 64-bit clean http://bugs.python.org/issue8650 closed by nadeem.vawda #8809: smtplib should support SSL contexts http://bugs.python.org/issue8809 closed by pitrou #9516: sysconfig: 
$MACOSX_DEPLOYMENT_TARGET mismatch: now "10.3" but http://bugs.python.org/issue9516 closed by ronaldoussoren #9927: Leak around GetFinalPathNameByHandle (Windows) http://bugs.python.org/issue9927 closed by ocean-city #10090: python -m locale fails on OSX http://bugs.python.org/issue10090 closed by ronaldoussoren #10154: locale.normalize strips "-" from UTF-8, which fails on Mac http://bugs.python.org/issue10154 closed by ronaldoussoren #10239: multiprocessing signal defect http://bugs.python.org/issue10239 closed by charles-francois.natali #10756: Error in atexit._run_exitfuncs [...] Exception expected for v http://bugs.python.org/issue10756 closed by haypo #11088: IDLE on OS X with Cocoa Tk 8.5 can hang waiting on input / raw http://bugs.python.org/issue11088 closed by ronaldoussoren #11614: import __hello__ is broken in Python 3 http://bugs.python.org/issue11614 closed by haypo #11731: Simplify email API via 'policy' objects http://bugs.python.org/issue11731 closed by r.david.murray #11949: Make float('nan') unorderable http://bugs.python.org/issue11949 closed by rhettinger #11979: Minor improvements to the Sockets readme: typos, wording and s http://bugs.python.org/issue11979 closed by ezio.melotti #11996: libpython.py: nicer py-bt output http://bugs.python.org/issue11996 closed by haypo #12002: ftplib.FTP.abort fails with TypeError on Python 3.x http://bugs.python.org/issue12002 closed by giampaolo.rodola #12048: Python 3, ZipFile Bug In Chinese http://bugs.python.org/issue12048 closed by haypo #12050: unconsumed_tail of zlib.Decompress is not always cleared on de http://bugs.python.org/issue12050 closed by nadeem.vawda #12059: hashlib does not handle missing hash functions correctly http://bugs.python.org/issue12059 closed by gregory.p.smith #12060: Python doesn't support real time signals http://bugs.python.org/issue12060 closed by gregory.p.smith #12065: test_ssl failure when svn.python.org fails to resolve http://bugs.python.org/issue12065 closed by pitrou 
#12072: Missing parenthesis in c-api/buffer PyBuffer_FillContiguousStr http://bugs.python.org/issue12072 closed by ezio.melotti #12076: IDLE v.3.2 crashing randomly on MacOSX 10.6.7 http://bugs.python.org/issue12076 closed by amaury.forgeotdarc #12083: Compile-time option to avoid writing files, including generate http://bugs.python.org/issue12083 closed by loewis #12088: tarfile.extractall fails to overwrite unresolved symlinks and http://bugs.python.org/issue12088 closed by orsenthil #12092: Clarify sentence in tutorial http://bugs.python.org/issue12092 closed by ezio.melotti #12093: Typo in struct unpacking example http://bugs.python.org/issue12093 closed by ezio.melotti #12094: Cannot Launch IDLE http://bugs.python.org/issue12094 closed by ezio.melotti #12095: test failures due to missing module http://bugs.python.org/issue12095 closed by haypo #12096: test_threading.test_waitfor() timeout (1 hour) on x86 Gentoo 3 http://bugs.python.org/issue12096 closed by haypo #12099: re pattern objects have no __class__ http://bugs.python.org/issue12099 closed by python-dev #12104: os.path.join('/some/path', '') adds extra slash at end of resu http://bugs.python.org/issue12104 closed by brian.curtin #12108: test_packaging monkeypatches httplib http://bugs.python.org/issue12108 closed by pitrou #12109: test_packaging monkeypatches httplib http://bugs.python.org/issue12109 closed by pitrou #12110: test_packaging monkeypatches httplib http://bugs.python.org/issue12110 closed by pitrou #12111: email's use of __setitem__ is highly counterintuitive http://bugs.python.org/issue12111 closed by r.david.murray #12116: io.Buffer*.seek() doesn't seek if "seeking leaves us inside th http://bugs.python.org/issue12116 closed by pitrou #12118: test_imp failure http://bugs.python.org/issue12118 closed by haypo #12119: test_distutils failure http://bugs.python.org/issue12119 closed by haypo #12120: test_packaging failure http://bugs.python.org/issue12120 closed by haypo #12122: test_runpy 
failure http://bugs.python.org/issue12122 closed by haypo #12123: test_import failures http://bugs.python.org/issue12123 closed by haypo #1746656: IPv6 Interface naming/indexing functions http://bugs.python.org/issue1746656 closed by gregory.p.smith #12078: re.sub() replaces only several matches http://bugs.python.org/issue12078 closed by ezio.melotti From guido at python.org Fri May 20 18:09:48 2011 From: guido at python.org (Guido van Rossum) Date: Fri, 20 May 2011 09:09:48 -0700 Subject: [Python-Dev] os.access on Windows In-Reply-To: <4DD687F9.1040403@timgolden.me.uk> References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local> <BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com> <4DD687F9.1040403@timgolden.me.uk> Message-ID: <BANLkTinkCSZT=b1Bi+iSB57OjMV2VWHbOg@mail.gmail.com> On May 20, 2011 8:30 AM, "Tim Golden" <mail at timgolden.me.uk> wrote: > On 20/05/2011 16:21, Brian Curtin wrote: > >> On Fri, May 20, 2011 at 03:38, Tim Golden <Tim.Golden at cbsoutdoor.co.uk > (Sorry about that; I had no idea I'd sent that from my work account) > >> I think we should tread lightly in the documentation area. We already >> have two note boxes, and adding a third probably scares everyone away. > > I entirely agree. (That's what I meant by "involved" above) TBH I think the less attractive we can make os.access() look the better. It uses the real uid instead of the effective uid, it encourages LBYL behavior, the outcome may be incorrect, it doesn't work on Windows... The ONLY reason to ever use it is in a setuid() program. But who writes those any more? (Esp. in Python!) -- --Guido van Rossum (python.org/~guido) From cf.natali at gmail.com Fri May 20 19:01:26 2011 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 20 May 2011 19:01:26 +0200 Subject: [Python-Dev] Hello! 
Message-ID: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>

Hi,

My name is Charles-François Natali, I've been using Python for a
couple years, and I've recently been granted commit privilege.
I just wanted to say hi to everyone on this list, and let you know
that I'm really happy and proud of joining this great community.

Cheers,

cf

From stefan_ml at behnel.de Fri May 20 21:04:49 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 20 May 2011 21:04:49 +0200
Subject: [Python-Dev] in latest Py3k site.py: configparser.NoSectionError: No section: 'posix_prefix'
Message-ID: <ir6e0i$hn6$1@dough.gmane.org>

Hi,

since May 19, I get the exception below in the latest py3k site.py when
trying to run a distutils build with it (building Cython). The changelog
since the previous (working) CPython build is here:

https://sage.math.washington.edu:8091/hudson/job/py3k-hg/374/

The failing build is here:

https://sage.math.washington.edu:8091/hudson/job/cython-devel-build-py3k/1313/console

This is on 64-bit Linux. I tried with a clean checkout, no difference. Is
this problem obvious to someone, is there anything that needs adaptation on
our side (I hope not), or should I file a bug report?
Thanks,

Stefan

"""
$ python setup.py bdist --formats=gztar --cython-profile
Traceback (most recent call last):
  File "/.../python/lib/python3.3/configparser.py", line 842, in items
    d.update(self._sections[section])
KeyError: 'posix_prefix'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/.../python/lib/python3.3/site.py", line 537, in <module>
    main()
  File "/.../python/lib/python3.3/site.py", line 522, in main
    known_paths = addusersitepackages(known_paths)
  File "/.../python/lib/python3.3/site.py", line 263, in addusersitepackages
    user_site = getusersitepackages()
  File "/.../python/lib/python3.3/site.py", line 238, in getusersitepackages
    user_base = getuserbase() # this will also set USER_BASE
  File "/.../python/lib/python3.3/site.py", line 228, in getuserbase
    USER_BASE = get_config_var('userbase')
  File "/.../python/lib/python3.3/sysconfig.py", line 576, in get_config_var
    return get_config_vars().get(name)
  File "/.../python/lib/python3.3/sysconfig.py", line 472, in get_config_vars
    _init_posix(_CONFIG_VARS)
  File "/.../python/lib/python3.3/sysconfig.py", line 324, in _init_posix
    makefile = get_makefile_filename()
  File "/.../python/lib/python3.3/sysconfig.py", line 318, in get_makefile_filename
    return os.path.join(get_path('stdlib'), config_dir_name, 'Makefile')
  File "/.../python/lib/python3.3/sysconfig.py", line 436, in get_path
    return get_paths(scheme, vars, expand)[name]
  File "/.../python/lib/python3.3/sysconfig.py", line 426, in get_paths
    return _expand_vars(scheme, vars)
  File "/.../python/lib/python3.3/sysconfig.py", line 142, in _expand_vars
    for key, value in _SCHEMES.items(scheme):
  File "/.../python/lib/python3.3/configparser.py", line 845, in items
    raise NoSectionError(section)
configparser.NoSectionError: No section: 'posix_prefix'
"""

From g.brandl at gmx.net Fri May 20 22:30:19 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 20 May 2011 22:30:19 +0200
Subject: [Python-Dev] looking for a
contact at Google on the Blogger team In-Reply-To: <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> Message-ID: <ir6j0k$g6q$1@dough.gmane.org> On 20.05.2011 17:35, Eli Bendersky wrote: >> There is ongoing work for an RFP by the board to improve the >> python.org publishing system/site to allow us to self-host these >> things. Moving PSF properties off of it, and onto another "hosted by >> someone else" site is probably not a good idea, but our hands may be >> forced if google/blogger can not resolve the issues. >> >> jesse > > The whole idea of a Wordpress-(or similar)-based solution is self > hosting, and less reliance on outside providers like blogger. > Wordpress is just a bunch of PHP code you place in a directory on your > server That's exactly the problem. 
Georg From martin at v.loewis.de Fri May 20 23:47:35 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 May 2011 23:47:35 +0200 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> <BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com> Message-ID: <4DD6E177.5020202@v.loewis.de> > As Jesse has said, there is an RFP in development to improve > python.org to the point where we can self-host blogs and the like and > deal with the associated user account administration appropriately. To run a blog on www.python.org, a PEP is not needed. If anybody would volunteer to set this up, it could be done in no time. Regards, Martin From martin at v.loewis.de Fri May 20 23:56:01 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 20 May 2011 23:56:01 +0200 Subject: [Python-Dev] os.access on Windows In-Reply-To: <BANLkTinkCSZT=b1Bi+iSB57OjMV2VWHbOg@mail.gmail.com> References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local> <BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com> <4DD687F9.1040403@timgolden.me.uk> <BANLkTinkCSZT=b1Bi+iSB57OjMV2VWHbOg@mail.gmail.com> Message-ID: <4DD6E371.2020706@v.loewis.de> > TBH I think the less attractive we can make os.access() look the > better. It uses the real uid instead of the effective uid, it > encourages LBYL behavior, the outcome may be incorrect, it doesn't > work on Windows... The ONLY reason to ever use it is in a setuid() > program. But who writes those any more? (Esp. in Python!) +1. 
The best way to determine "could I access this file" is to try
to access it, and be prepared to get an exception.

So we might deprecate-then-delete it on Windows. People who *really*
need to know in advance should use the Windows API for that on Windows
(i.e. call AccessCheck).

Regards,
Martin

From doug.hellmann at gmail.com Sat May 21 00:36:49 2011
From: doug.hellmann at gmail.com (Doug Hellmann)
Date: Fri, 20 May 2011 18:36:49 -0400
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <4DD6E177.5020202@v.loewis.de>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
 <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
 <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
 <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
 <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>
 <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
 <BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com>
 <4DD6E177.5020202@v.loewis.de>
Message-ID: <5CC8D21C-156F-4D27-B490-9DF29CB1C5F5@gmail.com>

On May 20, 2011, at 5:47 PM, Martin v. Löwis wrote:

>> As Jesse has said, there is an RFP in development to improve
>> python.org to the point where we can self-host blogs and the like and
>> deal with the associated user account administration appropriately.
>
> To run a blog on www.python.org, a PEP is not needed. If anybody would
> volunteer to set this up, it could be done in no time.

The blog is working again, so we can continue using the tool chain we have.

Thanks,
Doug

--
Doug Hellmann
Communications Director
Python Software Foundation
http://python.org/psf/

From barry at python.org Sat May 21 02:53:14 2011
From: barry at python.org (Barry Warsaw)
Date: Fri, 20 May 2011 20:53:14 -0400
Subject: [Python-Dev] Python 2.6.7 release candidate 2 now available
Message-ID: <20110520205314.1be39eec@neurotica.wooz.org>

Hello to all you Pythoneers and Pythonistas,

I'm happy to announce the availability of Python 2.6.7 release candidate 2.
Release candidate 1 was not widely announced due to a mismatch between the Mercurial and Subversion branches. Barring any unforeseen issues, this will be the last release candidate before 2.6.7 final, which is currently scheduled for June 3, 2011. As previously announced, Python 2.6 is in security-fix only mode. This means that general bug fix maintenance has ended, and only critical security fixes are supported. We will support Python 2.6 in security-fix only mode until October 2013. Also, this is a source-only release; no installers for Windows or Mac OS X will be provided. Please download and test this release candidate. http://www.python.org/download/releases/2.6.7/ The NEWS file contains a list of changes since 2.6.6. http://www.python.org/download/releases/2.6.7/NEWS.txt Many thanks go out to the entire Python community for their contributions great and small. Enjoy, -Barry (on behalf of the Python development community) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://mail.python.org/pipermail/python-dev/attachments/20110520/e51c7872/attachment.pgp> From nad at acm.org Sat May 21 05:18:51 2011 From: nad at acm.org (Ned Deily) Date: Fri, 20 May 2011 20:18:51 -0700 Subject: [Python-Dev] in latest Py3k site.py: configparser.NoSectionError: No section: 'posix_prefix' References: <ir6e0i$hn6$1@dough.gmane.org> Message-ID: <nad-3CDB70.20185120052011@news.gmane.org> In article <ir6e0i$hn6$1 at dough.gmane.org>, Stefan Behnel <stefan_ml at behnel.de> wrote: > since May 19, I get the exception below in the latest py3k site.py when > trying to run a distutils build with it (building Cython). 
The changelog
> since the previous (working) CPython build is here:
>
> https://sage.math.washington.edu:8091/hudson/job/py3k-hg/374/
>
> The failing build is here:
>
> https://sage.math.washington.edu:8091/hudson/job/cython-devel-build-py3k/1313/console
>
> This is on 64-bit Linux. I tried with a clean checkout, no difference. Is
> this problem obvious to someone, is there anything that needs adaptation on
> our side (I hope not), or should I file a bug report?

It's a bug introduced by the packaging (Distutils2) feature. Thanks for
finding it first.

http://bugs.python.org/issue12131

--
 Ned Deily,
 nad at acm.org

From rosslagerwall at gmail.com Sat May 21 06:42:43 2011
From: rosslagerwall at gmail.com (Ross Lagerwall)
Date: Sat, 21 May 2011 06:42:43 +0200
Subject: [Python-Dev] Hello!
In-Reply-To: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>
References: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>
Message-ID: <1305952963.1475.0.camel@hobo>

On Fri, 2011-05-20 at 19:01 +0200, Charles-François Natali wrote:
> Hi,
>
> My name is Charles-François Natali, I've been using Python for a
> couple years, and I've recently been granted commit privilege.
> I just wanted to say hi to everyone on this list, and let you know
> that I'm really happy and proud of joining this great community.

Congratulations, welcome.

Ross

From solipsis at pitrou.net Sat May 21 13:09:03 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 May 2011 13:09:03 +0200
Subject: [Python-Dev] cpython: Added SSL test for HTTPHandler.
References: <E1QNjTp-0002V3-Gf@dinsdale.python.org>
Message-ID: <20110521130903.2f7cf91f@pitrou.net>

On Sat, 21 May 2011 12:32:21 +0200
vinay.sajip <python-checkins at python.org> wrote:
> +        if secure:
> +            import ssl
> +            fd, fn = tempfile.mkstemp()
> +            os.close(fd)
> +            with open(fn, 'w') as f:
> +                f.write(self.PEMFILE)
> +            sslctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
> +            sslctx.load_cert_chain(fn)

We already bundle a couple of cert files in Lib/test, so you shouldn't
have to use your own (see e.g. Lib/test/keycert.pem).

> +        self.h_hdlr = logging.handlers.HTTPHandler(host, '/frob', secure=secure)

If you want real security, HTTPHandler should configure its SSLContext
in CERT_REQUIRED mode (and be given the proper root certificate(s)).
Otherwise you are vulnerable to man-in-the-middle attacks.
See the "context" and "check_hostname" arguments to HTTPSConnection:
http://docs.python.org/dev/library/http.client.html#http.client.HTTPSConnection

Regards

Antoine.

From solipsis at pitrou.net Sat May 21 13:59:25 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 May 2011 13:59:25 +0200
Subject: [Python-Dev] Hello!
References: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>
Message-ID: <20110521135925.33599a44@pitrou.net>

On Fri, 20 May 2011 19:01:26 +0200
Charles-François Natali <cf.natali at gmail.com> wrote:
> Hi,
>
> My name is Charles-François Natali, I've been using Python for a
> couple years, and I've recently been granted commit privilege.
> I just wanted to say hi to everyone on this list, and let you know
> that I'm really happy and proud of joining this great community.

Welcome, and keep up the good work.

Regards

Antoine.

From solipsis at pitrou.net Sat May 21 16:37:14 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 May 2011 16:37:14 +0200
Subject: [Python-Dev] Stable buildbots update
Message-ID: <20110521163714.68c5384f@pitrou.net>

Hello,

We recently got a couple of new stable buildbots:
- R.
David Murray's "x86 Gentoo" machine, which builds in non-debug
  mode and therefore checks that release Pythons work fine
- Stefan Krah's "AMD64 FreeBSD 8.2" machine
- Bill Janssen's "AMD64 Snow Leopard" machine

Many stable buildbots on the default branch (*) are currently red
because of test_packaging issues.

(*) http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable

Regards

Antoine.

From solipsis at pitrou.net Sat May 21 17:07:25 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 May 2011 17:07:25 +0200
Subject: [Python-Dev] The socket HOWTO
Message-ID: <20110521170725.51eab5f9@pitrou.net>

Hello,

I would like to suggest that we remove the socket HOWTO (currently at
http://docs.python.org/dev/howto/sockets.html)

My main issue with this document is that it doesn't seem to have
a well-defined destination:
- people who know sockets won't learn anything from it
- but people who don't know sockets will probably find it clear as mud
  (for example, what's an "INET" or "STREAM" socket? what's "select"?)

I have other issues, such as the style/tone it's written in. I'm sure
the author had fun writing it but it doesn't fit well with the rest of
the documentation. Also, the author gives a lot of "advice" without
explaining or justifying it ("if somewhere in those input lists of
sockets is one which has died a nasty death, the select will fail" ->
is that really true? what is a "nasty death" and how is that supposed to
happen? couldn't the author have put a 3-line example to demonstrate
this supposed drawback and how it manifests?).

And, finally, many statements seem arbitrary ("There's no question that
the fastest sockets code uses non-blocking sockets and select to
multiplex them") or plain wrong ("threading support in Unixes varies
both in API and quality. So the normal Unix solution is to fork a
subprocess to deal with each connection"). I don't think giving
misleading advice to users is really a good idea.
And suggesting that beginners use non-blocking sockets without even
*showing* how (or pointing to asyncore or Twisted) is a very bad idea.
select() is not enough, you still have to be prepared to get EAGAIN or
EWOULDBLOCK when calling recv() or send() (i.e. select() can give false
positives).

Oh and I think it's obsolete too, because the "class mysocket"
concatenates the output of recv() with a str rather than a bytes
object. Not to mention that features of the "class mysocket" can be had
using a buffered socket.makefile() instead of writing custom code.

(followed up from http://bugs.python.org/issue12126 at Eli's request)

Regards

Antoine.

From g.brandl at gmx.net Sat May 21 17:37:05 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 21 May 2011 17:37:05 +0200
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521170725.51eab5f9@pitrou.net>
References: <20110521170725.51eab5f9@pitrou.net>
Message-ID: <ir8m70$e7c$1@dough.gmane.org>

On 05/21/11 17:07, Antoine Pitrou wrote:
>
> Hello,
>
> I would like to suggest that we remove the socket HOWTO (currently at
> http://docs.python.org/dev/howto/sockets.html)

+1, or a big rewrite.

Georg

From eliben at gmail.com Sat May 21 17:48:37 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Sat, 21 May 2011 18:48:37 +0300
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521170725.51eab5f9@pitrou.net>
References: <20110521170725.51eab5f9@pitrou.net>
Message-ID: <BANLkTikbNd8=AAMeUvD345P4sichrQZmZQ@mail.gmail.com>

> I would like to suggest that we remove the socket HOWTO (currently at
> http://docs.python.org/dev/howto/sockets.html)
>
> My main issue with this document is that it doesn't seem to have
> a well-defined destination:
> - people who know sockets won't learn anything from it
> - but people who don't know sockets will probably find it clear as mud
> (for example, what's an "INET" or "STREAM" socket? what's "select"?)
> <snip>

I definitely recall finding this document useful when I first learned
Python. I knew socket programming from other languages, and the
document helped me see how it maps to Python.

That said, I must agree that there is probably no place for such a
tutorial in Python's official documentation. Python is a
general-purpose language, and socket programming is just one of a
plethora of things it supports, so special treatment for sockets
probably isn't warranted, especially given that the `socket` module
itself is a relatively thin wrapper over the OS socket interface.

I don't think a rewrite will help either. To describe socket
programming in full, accurately and without missing anything, would
require no less than a small book (and in fact many such books already
exist).

Therefore, I'm +1 on removing it from the official docs. It can be
relegated to the Python wiki, where it can be improved if someone
wishes to contribute to that.

Eli

From rosslagerwall at gmail.com Sat May 21 17:48:48 2011
From: rosslagerwall at gmail.com (Ross Lagerwall)
Date: Sat, 21 May 2011 17:48:48 +0200
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521170725.51eab5f9@pitrou.net>
References: <20110521170725.51eab5f9@pitrou.net>
Message-ID: <1305992928.1475.10.camel@hobo>

On Sat, 2011-05-21 at 17:07 +0200, Antoine Pitrou wrote:
> Hello,
>
> I would like to suggest that we remove the socket HOWTO (currently at
> http://docs.python.org/dev/howto/sockets.html)
>
> My main issue with this document is that it doesn't seem to have
> a well-defined destination:
> - people who know sockets won't learn anything from it
> - but people who don't know sockets will probably find it clear as mud
> (for example, what's an "INET" or "STREAM" socket? what's "select"?)
>
> I have other issues, such as the style/tone it's written in. I'm sure
> the author had fun writing it but it doesn't fit well with the rest of
> the documentation.
Also, the author gives a lot of "advice" without
> explaining or justifying it ("if somewhere in those input lists of
> sockets is one which has died a nasty death, the select will fail" ->
> is that really true? what is a "nasty death" and how is that supposed to
> happen? couldn't the author have put a 3-line example to demonstrate
> this supposed drawback and how it manifests?).
>
> And, finally, many statements seem arbitrary ("There's no question that
> the fastest sockets code uses non-blocking sockets and select to
> multiplex them") or plain wrong ("threading support in Unixes varies
> both in API and quality. So the normal Unix solution is to fork a
> subprocess to deal with each connection"). I don't think giving
> misleading advice to users is really a good idea. And suggesting that
> beginners use non-blocking sockets without even *showing* how (or
> pointing to asyncore or Twisted) is a very bad idea. select() is not
> enough, you still have to be prepared to get EAGAIN or EWOULDBLOCK when
> calling recv() or send() (i.e. select() can give false positives).
>
> Oh and I think it's obsolete too, because the "class mysocket"
> concatenates the output of recv() with a str rather than a bytes
> object. Not to mention that features of the "class mysocket" can be had
> using a buffered socket.makefile() instead of writing custom code.
>
> (followed up from http://bugs.python.org/issue12126 at Eli's request)

While I agree with most of what you said, I actually did find it very
useful when first learning sockets. It's on the first page of Google
results for "socket programming" or "socket how to". Also, it hinted at
some concepts that could then be googled for more information, like
select, nonblocking sockets, etc.

However, I would agree that this should be moved out of the
documentation and, as suggested in the issue, into the wiki.
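
[Editorial illustration, not part of the original thread.] The two
technical points quoted above (that select() alone is not enough because
recv() on a non-blocking socket can still fail with EAGAIN or
EWOULDBLOCK, and that a buffered socket.makefile() covers much of what a
hand-written wrapper class does) can be sketched roughly as follows.
This is only a minimal sketch under stated assumptions: the helper names
recv_when_ready, demo_nonblocking and read_one_line are made up for this
example, socket.socketpair() stands in for a real connection, and
BlockingIOError assumes Python 3.3 or later (at the time of this thread
one would have caught socket.error and checked errno instead).

```python
import select
import socket


def recv_when_ready(sock, nbytes, timeout=1.0):
    """Wait for readability with select(), then try recv().

    Returns the received bytes, or None if select() timed out or if
    recv() reported that no data was available after all.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # timed out, nothing to read
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: select() said "readable" but recv()
        # had nothing to return, so the caller should simply retry.
        return None


def demo_nonblocking():
    """Show BlockingIOError on an empty non-blocking socket, then a read."""
    a, b = socket.socketpair()  # hypothetical stand-in for a connection
    with a, b:
        a.setblocking(False)
        try:
            a.recv(16)  # nothing has been sent yet
            raised = False
        except BlockingIOError:
            raised = True
        b.sendall(b"hello\n")
        data = recv_when_ready(a, 16)
        return raised, data


def read_one_line():
    """Use a buffered file object from makefile() instead of custom code."""
    a, b = socket.socketpair()
    with a, b:
        b.sendall(b"first line\nsecond")
        with a.makefile("rb") as reader:
            # readline() buffers internally and stops at the newline.
            return reader.readline()


if __name__ == "__main__":
    print(demo_nonblocking())
    print(read_one_line())
```

Note that makefile() is used here on a blocking socket; the socket
module documentation says the socket must be in blocking mode for
makefile().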
From orsenthil at gmail.com Sat May 21 18:01:19 2011
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Sun, 22 May 2011 00:01:19 +0800
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <ir8m70$e7c$1@dough.gmane.org>
References: <20110521170725.51eab5f9@pitrou.net> <ir8m70$e7c$1@dough.gmane.org>
Message-ID: <20110521160118.GA22904@kevin>

On Sat, May 21, 2011 at 05:37:05PM +0200, Georg Brandl wrote:
> >
> > I would like to suggest that we remove the socket HOWTO (currently at
> > http://docs.python.org/dev/howto/sockets.html)
>
> +1, or a big rewrite.
>

I favor a rewrite over removal. I have read it once or twice and have
never revisited it (probably because it was not helpful enough to
warrant a revisit), but it still gives some important pointers.

One document cannot cover it all; there are many pointers (examples at
effbot.org, Python MoTW docs) that all serve as good introductions to
sockets in Python.

So a rewrite with good pointers would be more appropriate.

--
Senthil

From g.brandl at gmx.net Sat May 21 19:38:42 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 21 May 2011 19:38:42 +0200
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521160118.GA22904@kevin>
References: <20110521170725.51eab5f9@pitrou.net> <ir8m70$e7c$1@dough.gmane.org> <20110521160118.GA22904@kevin>
Message-ID: <ir8tb1$k0o$1@dough.gmane.org>

On 05/21/11 18:01, Senthil Kumaran wrote:
> On Sat, May 21, 2011 at 05:37:05PM +0200, Georg Brandl wrote:
>> >
>> > I would like to suggest that we remove the socket HOWTO (currently at
>> > http://docs.python.org/dev/howto/sockets.html)
>>
>> +1, or a big rewrite.
>>
>
> I favor a rewrite over removal. I have read it once or twice and have
> never revisited it (probably because it was not helpful enough to
> warrant a revisit), but it still gives some important pointers.
>
> One document cannot cover it all; there are many pointers (examples at
> effbot.org, Python MoTW docs) that all serve as good introductions to
> sockets in Python.
>
> So a rewrite with good pointers would be more appropriate.

Even then, it's better off in the Wiki until the rewrite is complete.

Georg

From ziade.tarek at gmail.com Sat May 21 20:17:40 2011
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Sat, 21 May 2011 20:17:40 +0200
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <20110521163714.68c5384f@pitrou.net>
References: <20110521163714.68c5384f@pitrou.net>
Message-ID: <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>

On Sat, May 21, 2011 at 4:37 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>
> Hello,
>
> We recently got a couple of new stable buildbots:
> - R. David Murray's "x86 Gentoo" machine, which builds in non-debug
>   mode and therefore checks that release Pythons work fine
> - Stefan Krah's "AMD64 FreeBSD 8.2" machine
> - Bill Janssen's "AMD64 Snow Leopard" machine
>
> Many stable buildbots on the default branch (*) are currently red
> because of test_packaging issues.
> (*) http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable

Yes, I am aware of this. I fixed most of the remaining issues today, and
I'm fixing the final ones right now.

>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ziade.tarek%40gmail.com
>

--
Tarek Ziadé | http://ziade.org

From artur.siekielski at gmail.com Sun May 22 01:57:55 2011
From: artur.siekielski at gmail.com (Artur Siekielski)
Date: Sun, 22 May 2011 01:57:55 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects
Message-ID: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>

Hi.
The problem with reference counters is that they are very often
incremented/decremented, even for read-only algorithms (like traversal
of a list). It has two drawbacks:
1.
CPU cache lines (64 bytes on x86) containing the beginning of a
PyObject are very often invalidated, resulting in losing many chances
to use the CPU caches
2. The copy-on-write after fork() optimization (Linux) is almost
useless in CPython, because even if you don't modify data directly,
refcounts are modified, and PyObjects with refcounts inside are spread
all over the process's memory (and one small refcount modification causes
the whole page - 4kB - to be copied into a child process).

So an idea I would like to try is to move reference counts outside of
PyObjects, to one or more contiguous blocks of memory. PyObjects would
have a pointer to a reference count inside this block. Doing this, I
think that
1. The beginning of PyObject structs could be CPU-cached for a much
longer time (small objects like ints could be fully cached). I don't
know whether having localized writes into the block with refcounts
would also help performance.
2. copy-on-write after fork() will work much better, only the block
with refcounts would be copied into a child process (for read-only
algorithms)

However, the drawback is that such a design introduces a new level of
indirection: a pointer inside a PyObject instead of a direct value.
Also, it seems that the "block" with refcounts would have to be a
non-trivial data structure.

I'm not a compiler/profiling expert so the main question is whether such
a design can work, and whether someone has already thought about
something similar. And has CPython been profiled for CPU cache usage?

From solipsis at pitrou.net Sun May 22 14:48:37 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 22 May 2011 14:48:37 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
Message-ID: <20110522144837.10e9b95d@pitrou.net>

Hello,

On Sun, 22 May 2011 01:57:55 +0200
Artur Siekielski <artur.siekielski at gmail.com> wrote:
> 1.
CPU cache lines (64 bytes on x86) containing the beginning of a
> PyObject are very often invalidated, resulting in losing many chances
> to use the CPU caches

Mutating data doesn't invalidate a cache line. It just makes it
necessary to write it back to memory at some point.

> 2. The copy-on-write after fork() optimization (Linux) is almost
> useless in CPython, because even if you don't modify data directly,
> refcounts are modified, and PyObjects with refcounts inside are spread
> all over the process's memory (and one small refcount modification causes
> the whole page - 4kB - to be copied into a child process).

Indeed.

> I'm not a compiler/profiling expert so the main question is whether such
> a design can work, and whether someone has already thought about
> something similar. And has CPython been profiled for CPU cache usage?

This has already been proposed a couple of times. I guess what's
needed is for someone to experiment and post benchmark results.

Regards

Antoine.

From neologix at free.fr Sun May 22 16:23:55 2011
From: neologix at free.fr (Charles-François Natali)
Date: Sun, 22 May 2011 16:23:55 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects
In-Reply-To: <20110522144837.10e9b95d@pitrou.net>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <20110522144837.10e9b95d@pitrou.net>
Message-ID: <BANLkTinjggE98g+6Mso5CAF7VH9qyhHRRQ@mail.gmail.com>

>> 1. CPU cache lines (64 bytes on x86) containing the beginning of a
>> PyObject are very often invalidated, resulting in losing many chances
>> to use the CPU caches
>
> Mutating data doesn't invalidate a cache line. It just makes it
> necessary to write it back to memory at some point.
>

I think he's referring to the multi-core case. In MESI terminology,
the cache line will become modified in the current cache (current
thread), but invalid in other cores' caches.
But given that access to objects is serialized by the GIL (which will
issue a memory barrier anyway), I'm not sure that the performance
impact will be noticeable. Furthermore, given that threads are actually
serialized, I suspect that the scheduler tends to bind them naturally
to the same CPU.

>> 2. The copy-on-write after fork() optimization (Linux) is almost
>> useless in CPython, because even if you don't modify data directly,
>> refcounts are modified, and PyObjects with refcounts inside are spread
>> all over the process's memory (and one small refcount modification causes
>> the whole page - 4kB - to be copied into a child process).
>
> Indeed.
>

There was a bug report a couple of months ago from someone using large
datasets for some scientific application. He suggested adding support
for Linux's MADV_MERGEABLE, but the root cause is really the reference
count being incremented even when objects are treated as read-only.
For the record, it's http://bugs.python.org/issue9942 (and this idea
was brought up here).

cf

From janssen at parc.com Mon May 23 03:00:51 2011
From: janssen at parc.com (Bill Janssen)
Date: Sun, 22 May 2011 18:00:51 PDT
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
Message-ID: <58834.1306112451@parc.com>

Tarek Ziadé <ziade.tarek at gmail.com> wrote:

> Yes, I am aware of this. I fixed most of the remaining issues today, and
> I'm fixing the final ones right now.
Just FYI: the "AMD64 Snow Leopard" buildbot and "PPC Leopard" buildbots are now green, but the "PPC Tiger" buildbot is still failing for all branches because of packaging errors: ====================================================================== FAIL: test_user_site (packaging.tests.test_command_install_dist.InstallTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_dist.py", line 95, in test_user_site self._test_user_site() File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_dist.py", line 124, in _test_user_site self.assertTrue(os.path.exists(self.user_base)) AssertionError: False is not true ====================================================================== FAIL: test_get_outputs (packaging.tests.test_command_install_lib.InstallLibTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_lib.py", line 71, in test_get_outputs self.assertEqual(len(cmd.get_outputs()), 4) AssertionError: 2 != 4 Bill From martin at v.loewis.de Mon May 23 06:59:19 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 May 2011 06:59:19 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> Message-ID: <4DD9E9A7.50807@v.loewis.de> > I'm not a compiler/profiling expert so the main question is if such > design can work, and maybe someone was thinking about something > similar? My expectation is that your approach would likely make the issues worse in a multi-CPU setting. 
If you put multiple reference counters into a contiguous block of memory, unrelated reference counters will live in the same cache line. Consequently, changing one reference counter on one CPU will invalidate the cached reference counters of that cache line on other CPUs, making your problem a) actually worse. Regards, Martin From cesare.di.mauro at gmail.com Mon May 23 07:35:31 2011 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Mon, 23 May 2011 07:35:31 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <4DD9E9A7.50807@v.loewis.de> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <4DD9E9A7.50807@v.loewis.de> Message-ID: <BANLkTi=6Htr6Gt61GAYBEJ2VZLJ-CSYRCA@mail.gmail.com> 2011/5/23 "Martin v. Löwis" <martin at v.loewis.de> > > I'm not a compiler/profiling expert so the main question is if such > > design can work, and maybe someone was thinking about something > > similar? > > My expectation is that your approach would likely make the issues > worse in a multi-CPU setting. If you put multiple reference counters > into a contiguous block of memory, unrelated reference counters will > live in the same cache line. Consequently, changing one reference > counter on one CPU will invalidate the cached reference counters of > that cache line on other CPUs, making your problem a) actually worse. > > Regards, > Martin > I don't think that moving ob_refcnt to a proper memory pool will solve the problem of cache pollution anyway. ob_refcnt is obviously the most stressed field in PyObject, but it's not the only one. We also have ob_type, which is needed to model each object's (instance's) "behavior" and is heavily accessed too, so a cache line will be loaded anyway when the object is used. Also, only a few simple objects have just ob_refcnt and ob_type. Most of them have other fields too, and accessing them means a cache line load. Regards, Cesare P.S.
Memory allocation granularity can help sometimes, leaving some data (ob_refcnt and/or ob_type) on one cache line, and the other on the next one. From ncoghlan at gmail.com Mon May 23 08:15:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 May 2011 16:15:35 +1000 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <4DD6E177.5020202@v.loewis.de> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> <BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com> <4DD6E177.5020202@v.loewis.de> Message-ID: <BANLkTi=RyeTT-mnwYES90tCBaVoBeen9ow@mail.gmail.com> On Sat, May 21, 2011 at 7:47 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote: >> As Jesse has said, there is an RFP in development to improve >> python.org to the point where we can self-host blogs and the like and >> deal with the associated user account administration appropriately. > > To run a blog on www.python.org, a PEP is not needed. If anybody would > volunteer to set this up, it could be done in no time. If I understand correctly, the RFP is more about improving the entire python.org toolchain to make it something that non-programmers can easily provide content for (and even *programmers* don't particularly like the current toolchain). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon May 23 08:17:27 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 May 2011 16:17:27 +1000 Subject: [Python-Dev] Hello!
In-Reply-To: <20110521135925.33599a44@pitrou.net> References: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com> <20110521135925.33599a44@pitrou.net> Message-ID: <BANLkTi=UsrLqR6t_tJA96UxCgLGJ7E4kTw@mail.gmail.com> On Sat, May 21, 2011 at 9:59 PM, Antoine Pitrou <solipsis at pitrou.net> wrote: > On Fri, 20 May 2011 19:01:26 +0200 > Charles-François Natali <cf.natali at gmail.com> wrote: > >> Hi, >> >> My name is Charles-François Natali, I've been using Python for a >> couple years, and I've recently been granted commit privilege. >> I just wanted to say hi to everyone on this list, and let you know >> that I'm really happy and proud of joining this great community. > > Welcome, and keep up the good work. Indeed! Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon May 23 08:22:20 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 May 2011 16:22:20 +1000 Subject: [Python-Dev] The socket HOWTO In-Reply-To: <ir8tb1$k0o$1@dough.gmane.org> References: <20110521170725.51eab5f9@pitrou.net> <ir8m70$e7c$1@dough.gmane.org> <20110521160118.GA22904@kevin> <ir8tb1$k0o$1@dough.gmane.org> Message-ID: <BANLkTin6WNTTkfQoJd=fthmEgf25fheHPA@mail.gmail.com> On Sun, May 22, 2011 at 3:38 AM, Georg Brandl <g.brandl at gmx.net> wrote: > On 05/21/11 18:01, Senthil Kumaran wrote: >> So a rewrite with good pointers would be more appropriate. > > Even then, it's better off in the Wiki until the rewrite is complete. Perhaps replacing it with a placeholder page that refers to the Wiki would be appropriate? A simple summary saying that the HOWTO had not aged well, and hence had been removed from the official documentation until it had been updated on the Wiki would allow people looking for it to better understand the situation, and also how to help improve it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ziade.tarek at gmail.com Mon May 23 08:48:15 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Mon, 23 May 2011 08:48:15 +0200 Subject: [Python-Dev] Stable buildbots update In-Reply-To: <58834.1306112451@parc.com> References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> Message-ID: <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> On Mon, May 23, 2011 at 3:00 AM, Bill Janssen <janssen at parc.com> wrote: > Tarek Ziadé <ziade.tarek at gmail.com> wrote: > >> Yes, I am aware of this. I have fixed today most remaining issues, and >> fixing the final ones right now. > > Just FYI: the "AMD64 Snow Leopard" buildbot and "PPC Leopard" buildbots > are now green, but the "PPC Tiger" buildbot is still failing for all > branches because of packaging errors: All the linux and windows stable slaves are now green, and I have a few issues left to be fixed for all solaris flavors and the two you are showing, that are also failing under FreeBSD. Thanks Tarek -- Tarek Ziadé | http://ziade.org From mail at timgolden.me.uk Mon May 23 10:42:38 2011 From: mail at timgolden.me.uk (Tim Golden) Date: Mon, 23 May 2011 09:42:38 +0100 Subject: [Python-Dev] os.access on Windows In-Reply-To: <4DD6E371.2020706@v.loewis.de> References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local> <BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com> <4DD687F9.1040403@timgolden.me.uk> <BANLkTinkCSZT=b1Bi+iSB57OjMV2VWHbOg@mail.gmail.com> <4DD6E371.2020706@v.loewis.de> Message-ID: <4DDA1DFE.6070800@timgolden.me.uk> On 20/05/2011 22:56, "Martin v. Löwis" wrote: >> TBH I think the less attractive we can make os.access() look the >> better. It uses the real uid instead of the effective uid, it >> encourages LBYL behavior, the outcome may be incorrect, it doesn't >> work on Windows... The ONLY reason to ever use it is in a setuid() >> program.
But who writes those any more? (Esp. in Python!) > > +1. The best way to determine "could I access this file" is to try > to access it, and be prepared to get an exception. FWIW the OP knew this but -- for some reason specific to his use case -- wanted to avoid updating the mod dates of the containing directory. Obviously that's his problem, not ours... > So we might deprecate-then-delete it on Windows. I'll rework that patch to be a DeprecationWarning in that case. > People who *really* need to know in advance should use the Windows > API for that on Windows (i.e. call AccessCheck). And indeed this is what I've recommended to the OP. I'll follow this up in that python-list thread. I see that Benjamin's updated the os.access docs so I'll let this thread die and talk the OP through the AccessCheck route (which is, unfortunately, more tricky because it's not exposed by pywin32. Also not our problem...) TJG From ziade.tarek at gmail.com Mon May 23 11:58:29 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Mon, 23 May 2011 11:58:29 +0200 Subject: [Python-Dev] the distutils2 repo and 3to2 Message-ID: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> Hey, Now that packaging has landed, the distutils2 repo is going to be reset and will be the Python 2.x / 3.1 / 3.2 backport of packaging. In theory, we want to automate the extraction of packaging from the stdlib and a few other modules, and run 3to2 at install time. Or should I say 3.3tosomething. I want to do this to avoid maintaining yet another code base. In practice, I don't really know the current state of 3to2 so we'll see.. Any help or hints on this project would be appreciated. Thanks Tarek -- Tarek Ziadé
| http://ziade.org From lukasz at langa.pl Mon May 23 12:51:27 2011 From: lukasz at langa.pl (Łukasz Langa) Date: Mon, 23 May 2011 12:51:27 +0200 Subject: [Python-Dev] the distutils2 repo and 3to2 In-Reply-To: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> Message-ID: <84205284-4F06-408A-95B7-57B504849F59@langa.pl> On 2011-05-23, at 11:58, Tarek Ziadé wrote: > Hey, > > Now that packaging has landed, the distutils2 repo is going to be > reset and will be the Python 2.x / 3.1 / 3.2 backport of packaging. > > In theory, we want to automate the extraction of packaging from the > stdlib and a few other modules, and run 3to2 at install time. Or > should I say 3.3tosomething. > I want to do this to avoid maintaining yet another code base. In > practice, I don't really know the current state of 3to2 so we'll see.. > > Any help or hints on this project would be appreciated. I'm maintaining a configparser 3.2+ backport for 2.6-2.7 using 3to2. A fully automatic conversion is not really possible, partly because the 3to2 tool is not perfect, and partly because there are parts of the code (esp. in the tests) which no mechanical converter could have figured out on its own. Anyway, the backport is available here: http://pypi.python.org/pypi/configparser There's some documentation there on the conversion process I came up with. As for distutils2, I was already contacted by Éric Araujo and will help him improve 3to2. We have yet to contact its authors to see if they believe merging our changes upstream will be possible. -- Best regards, Łukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o.
From ziade.tarek at gmail.com Mon May 23 12:58:22 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Mon, 23 May 2011 12:58:22 +0200 Subject: [Python-Dev] the distutils2 repo and 3to2 In-Reply-To: <84205284-4F06-408A-95B7-57B504849F59@langa.pl> References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> <84205284-4F06-408A-95B7-57B504849F59@langa.pl> Message-ID: <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com> 2011/5/23 Łukasz Langa <lukasz at langa.pl>: .. > I'm maintaining a configparser 3.2+ backport for 2.6-2.7 using 3to2. Do you backport to 3.1? .. > > There's some documentation there on the conversion process I came up with. Awesome, will look it up, thanks > > As for distutils2, I was already contacted by Éric Araujo and will help him improve 3to2. We are yet to contact its authors to see if they believe merging our changes upstream will be possible. Great, has anything been started already? If so, we should sync to see how we can initiate the d2 repo Cheers Tarek -- Tarek Ziadé | http://ziade.org From lukasz at langa.pl Mon May 23 13:14:58 2011 From: lukasz at langa.pl (Łukasz Langa) Date: Mon, 23 May 2011 13:14:58 +0200 Subject: [Python-Dev] the distutils2 repo and 3to2 In-Reply-To: <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com> References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> <84205284-4F06-408A-95B7-57B504849F59@langa.pl> <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com> Message-ID: <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl> On 2011-05-23, at 12:58, Tarek Ziadé wrote: > 2011/5/23 Łukasz Langa <lukasz at langa.pl>: > .. >> I'm maintaining a configparser 3.2+ backport for 2.6-2.7 using 3to2. > > Do you backport to 3.1? > Not really. I personally think people already using py3k will migrate sooner (even if they have to do it on their own) than the folk on 2.x. The new Ubuntu already ships with Python 3.2.
As for Python 2.x I've learnt that keeping compatibility with a Python version without decorators, `io` library, abstract base classes, etc. would mean either diverging branches or reproducing and maintaining bits of the newer stdlib. This is something 3to2 won't help you with as it's out of scope for that tool. For configparser I only support 2.6+ and nonetheless the backport has a helpers module with a couple of things copied over from 2.7 or 3.1. There's also an external dependency on ordereddict, etc. You see where this is going. I've heard you're targeting 2.4 compatibility so be prepared that this is not going to be easy. -- Best regards, Łukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o. From ziade.tarek at gmail.com Mon May 23 13:23:32 2011 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Mon, 23 May 2011 13:23:32 +0200 Subject: [Python-Dev] the distutils2 repo and 3to2 In-Reply-To: <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl> References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> <84205284-4F06-408A-95B7-57B504849F59@langa.pl> <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com> <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl> Message-ID: <BANLkTi=RCRrQfnz+V8Ec-HKGRPmtNcGYzw@mail.gmail.com> 2011/5/23 Łukasz Langa <lukasz at langa.pl>: ... > > I've heard you're targeting 2.4 compatibility so be prepared that this is not going to be easy. yeah well, we might raise the bar to 2.5 and use some __future__ statements.
I am not sure that keeping 2.4 support is that useful anymore. Cheers Tarek -- Tarek Ziadé | http://ziade.org From ncoghlan at gmail.com Mon May 23 14:14:50 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 May 2011 22:14:50 +1000 Subject: [Python-Dev] the distutils2 repo and 3to2 In-Reply-To: <BANLkTi=RCRrQfnz+V8Ec-HKGRPmtNcGYzw@mail.gmail.com> References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> <84205284-4F06-408A-95B7-57B504849F59@langa.pl> <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com> <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl> <BANLkTi=RCRrQfnz+V8Ec-HKGRPmtNcGYzw@mail.gmail.com> Message-ID: <BANLkTinTxBn_H2YS=z_n5FacGY7BcgZZpQ@mail.gmail.com> 2011/5/23 Tarek Ziadé <ziade.tarek at gmail.com>: > 2011/5/23 Łukasz Langa <lukasz at langa.pl>: > ... >> >> I've heard you're targetting 2.4 compatibility so be prepared that this is not going to be easy. > > yeah well, we might raise the bar to 2.5 and use some __future__ > statements. I am not sure that keeping 2.4 support is that useful > anymore. Anyone still stuck with 2.4 at this point in time is probably going to struggle to switch their packaging support library from distutils to distutils2 anyway. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From fdrake at acm.org Mon May 23 14:25:22 2011 From: fdrake at acm.org (Fred Drake) Date: Mon, 23 May 2011 08:25:22 -0400 Subject: [Python-Dev] the distutils2 repo and 3to2 In-Reply-To: <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl> References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> <84205284-4F06-408A-95B7-57B504849F59@langa.pl> <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com> <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl> Message-ID: <BANLkTinL26Fd1GhN=OuM90QJt=ui5_fsSQ@mail.gmail.com> 2011/5/23 Łukasz Langa <lukasz at langa.pl>: > The new Ubuntu already ships with Python 3.2.
Uptake on Ubuntu 11.04 will take longer than 10.10 uptake, given the reliability issues and the reaction to the new user interface. That's not to say it won't be significant, but the strength of the indicator may be less significant than in the past. -Fred -- Fred L. Drake, Jr.? ? <fdrake at acm.org> "Give me the luxuries of life and I will willingly do without the necessities." ?? --Frank Lloyd Wright From jnoller at gmail.com Mon May 23 14:44:43 2011 From: jnoller at gmail.com (Jesse Noller) Date: Mon, 23 May 2011 08:44:43 -0400 Subject: [Python-Dev] looking for a contact at Google on the Blogger team In-Reply-To: <BANLkTi=RyeTT-mnwYES90tCBaVoBeen9ow@mail.gmail.com> References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com> <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com> <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com> <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com> <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com> <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com> <BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com> <4DD6E177.5020202@v.loewis.de> <BANLkTi=RyeTT-mnwYES90tCBaVoBeen9ow@mail.gmail.com> Message-ID: <BANLkTik7XVD13x2He=o8kQ+ZUG3V3hgJcQ@mail.gmail.com> On Mon, May 23, 2011 at 2:15 AM, Nick Coghlan <ncoghlan at gmail.com> wrote: > On Sat, May 21, 2011 at 7:47 AM, "Martin v. L?wis" <martin at v.loewis.de> wrote: >>> As Jesse has said, there is an RFP in development to improve >>> python.org to the point where we can self-host blogs and the like and >>> deal with the associated user account administration appropriately. >> >> To run a blog on www.python.org, a PEP is not needed. If anybody would >> volunteer to set this up, it could be done in no time. > > If I understand correctly, the RFP is more about improving the entire > python.org toolchain to make it something that non-programmers can > easily provide content for (and even *programmers* don't particularly > like the current toolchain). > > Cheers, > Nick. That is correct. 
From barry at python.org Mon May 23 16:40:30 2011 From: barry at python.org (Barry Warsaw) Date: Mon, 23 May 2011 10:40:30 -0400 Subject: [Python-Dev] the distutils2 repo and 3to2 In-Reply-To: <BANLkTinL26Fd1GhN=OuM90QJt=ui5_fsSQ@mail.gmail.com> References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> <84205284-4F06-408A-95B7-57B504849F59@langa.pl> <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com> <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl> <BANLkTinL26Fd1GhN=OuM90QJt=ui5_fsSQ@mail.gmail.com> Message-ID: <20110523104030.6a08801f@neurotica.wooz.org> Okay, this reply is getting off-topic, so I won't belabor the point (please email me directly if you want to discuss further). On May 23, 2011, at 08:25 AM, Fred Drake wrote: >2011/5/23 ?ukasz Langa <lukasz at langa.pl>: >> The new Ubuntu already ships with Python 3.2. > >Uptake on Ubuntu 11.04 will take longer than 10.10 uptake, given the >reliability issues and the reaction to the new user interface. You're not required to run the default desktop (Unity) of course. There are several options out of the box, including the classic desktop and Unity 2D, and there are a wide range of supported derivatives of Ubuntu offering additional desktops, such as KDE (Kubuntu) and Xfce (Xubuntu). Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://mail.python.org/pipermail/python-dev/attachments/20110523/37d92b6b/attachment.pgp> From fdrake at acm.org Mon May 23 16:53:07 2011 From: fdrake at acm.org (Fred Drake) Date: Mon, 23 May 2011 10:53:07 -0400 Subject: [Python-Dev] the distutils2 repo and 3to2 In-Reply-To: <20110523104030.6a08801f@neurotica.wooz.org> References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com> <84205284-4F06-408A-95B7-57B504849F59@langa.pl> <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com> <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl> <BANLkTinL26Fd1GhN=OuM90QJt=ui5_fsSQ@mail.gmail.com> <20110523104030.6a08801f@neurotica.wooz.org> Message-ID: <BANLkTim-fkz6qVCpH5JAUePVk2NB9=2uww@mail.gmail.com> On Mon, May 23, 2011 at 10:40 AM, Barry Warsaw <barry at python.org> wrote: > You're not required to run the default desktop (Unity) of course. ?There are > several options out of the box, including the classic desktop and Unity 2D, > and there are a wide range of supported derivatives of Ubuntu offering > additional desktops, such as KDE (Kubuntu) and Xfce (Xubuntu). Of course, but I still think the default affects the rate of uptake. I'm not attacking Ubuntu, but I think the uptake rate is relevant to our current discussion. That said, the multi-monitor issues prevent my updating to 11.04. -Fred -- Fred L. Drake, Jr.? ? <fdrake at acm.org> "Give me the luxuries of life and I will willingly do without the necessities." ?? 
--Frank Lloyd Wright From ethan at stoneleaf.us Mon May 23 19:20:50 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 23 May 2011 10:20:50 -0700 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <16FC9995-2C52-44C2-BDDE-7E7E4B54C9E3@twistedmatrix.com> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> <BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com> <16FC9995-2C52-44C2-BDDE-7E7E4B54C9E3@twistedmatrix.com> Message-ID: <4DDA9772.2060401@stoneleaf.us> Glyph Lefkowitz wrote: > In fact, I feel like I would want to push in the opposite direction: > don't treat one-byte bytes slices less like integers; I wish I could > more easily treat n-byte sequences _more_ like integers! :). More > protocols have 2-byte or 4-byte network-endian packed integers embedded > in them than have individual tag bytes that I want to examine. So are you thinking that bytes([01,56])[:2] == 120 ? Or more along the lines of a .to_int() method? ~Ethan~ From ziade.tarek at gmail.com Mon May 23 19:16:36 2011 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Mon, 23 May 2011 19:16:36 +0200 Subject: [Python-Dev] Stable buildbots update In-Reply-To: <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> Message-ID: <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> On Mon, May 23, 2011 at 8:48 AM, Tarek Ziad? <ziade.tarek at gmail.com> wrote: > On Mon, May 23, 2011 at 3:00 AM, Bill Janssen <janssen at parc.com> wrote: >> Tarek Ziad? <ziade.tarek at gmail.com> wrote: >> >>> Yes, I am aware of this. 
I have fixed today most remaining issues, and >>> fixing the final ones right now. >> >> Just FYI: the "AMD64 Snow Leopard" buildbot and "PPC Leopard" buildbots >> are now green, but the "PPC Tiger" buildbot is still failing for all >> branches because of packaging errors: > > All the linux and windows stable slaves are now green, and I have a > few issues left to be fixed for all solaris flavors and the two you > are showing, that are also failing under FreeBSD. I have now completed the cleanup and we're back on green-land for the stable bots. The red slaves should get green when they catch up with the latest rev (they are slow). If they're not and they are failing in packaging or sysconfig let me know. Sorry again that it has taken so long. Setting up Solaris and BSD VMs took some time ;) Cheers Tarek -- Tarek Ziadé | http://ziade.org From sturla at molden.no Mon May 23 18:39:07 2011 From: sturla at molden.no (Sturla Molden) Date: Mon, 23 May 2011 18:39:07 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <4DD9E9A7.50807@v.loewis.de> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <4DD9E9A7.50807@v.loewis.de> Message-ID: <4DDA8DAB.2060209@molden.no> On 23.05.2011 06:59, "Martin v. Löwis" wrote: > > My expectation is that your approach would likely make the issues > worse in a multi-CPU setting. If you put multiple reference counters > into a contiguous block of memory, unrelated reference counters will > live in the same cache line. Consequently, changing one reference > counter on one CPU will invalidate the cached reference counters of > that cache line on other CPUs, making your problem a) actually worse. In a multi-threaded setting with concurrent threads accessing reference counts, this would certainly worsen the situation. In a single-threaded setting, this will likely be an improvement. CPython, however, has a GIL.
Thus there is only one active thread at a time with access to reference counts. On a thread switch in the interpreter, I think the performance result will depend on the nature of the Python code: If threads share a lot of objects, it could help to reduce the number of dirty cache lines. If threads mainly work on private objects, it would likely have the effect you predict. Which will dominate is hard to tell. Instead, we could use multiple heaps: Each Python thread could manage its own heap for malloc and free (cf. HeapAlloc and HeapFree in Windows). Objects local to one thread only reside in the locally managed heap. When an object becomes shared by several Python threads, it is moved from a local heap to the global heap of the process. Some objects, such as modules, would be stored directly on the global heap. This way, objects used by only one thread would never dirty cache lines used by other threads. This would also be a way to reduce the CPython dependency on the GIL. Only the global heap would need to be protected by the GIL, whereas the local heaps would not need any global synchronization. (I am setting follow-up to the Python Ideas list, it does not belong on Python dev.)
Sturla Molden From tjreedy at udel.edu Mon May 23 19:55:41 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 23 May 2011 13:55:41 -0400 Subject: [Python-Dev] Python 3.x and bytes In-Reply-To: <4DDA9772.2060401@stoneleaf.us> References: <4DD2C2A5.3080403@stoneleaf.us> <BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com> <4DD2D89D.4000303@stoneleaf.us> <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com> <4DD2F661.2050005@stoneleaf.us> <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com> <BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com> <16FC9995-2C52-44C2-BDDE-7E7E4B54C9E3@twistedmatrix.com> <4DDA9772.2060401@stoneleaf.us> Message-ID: <ire72p$s38$1@dough.gmane.org> On 5/23/2011 1:20 PM, Ethan Furman wrote: > Glyph Lefkowitz wrote: >> In fact, I feel like I would want to push in the opposite direction: >> don't treat one-byte bytes slices less like integers; I wish I could >> more easily treat n-byte sequences _more_ like integers! :). More >> protocols have 2-byte or 4-byte network-endian packed integers >> embedded in them than have individual tag bytes that I want to examine. > > So are you thinking that bytes([01,56])[:2] == 120 ? Or more along the > lines of a .to_int() method? I believe that such things can be handled by the struct module. -- Terry Jan Reedy From artur.siekielski at gmail.com Mon May 23 22:55:21 2011 From: artur.siekielski at gmail.com (Artur Siekielski) Date: Mon, 23 May 2011 22:55:21 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> Message-ID: <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> Ok, I managed to make a quick but working patch (sufficient to get working interpreter, it segfaults for extension modules). It uses the "ememoa" allocator (http://code.google.com/p/ememoa/) which seems a reasonable pool allocator. The patch: http://dpaste.org/K8en/. 
The main obstacle was that there isn't a single function/macro that can be used to initialize all PyObjects, so I had to initialize static PyObjects (mainly PyTypeObjects) by hand. I used a naive quicksort algorithm as a benchmark: http://dpaste.org/qquh/ . The result is that after patching it runs 50% SLOWER. I profiled it and allocator methods used 35% of the time. So there is still a 15% performance loss even if the allocator is poor. Anyway, I'd like to have working copy-on-write in CPython - in the presence of the GIL I find it important to have multiprocess programs optimized (and I think it's a common idiom that a parent process prepares some big data structure, and child "worker" processes do some read-only querying). Artur From guido at python.org Mon May 23 23:08:48 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 23 May 2011 14:08:48 -0700 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> Message-ID: <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> On Mon, May 23, 2011 at 1:55 PM, Artur Siekielski <artur.siekielski at gmail.com> wrote: > Ok, I managed to make a quick but working patch (sufficient to get > working interpreter, it segfaults for extension modules). It uses the > "ememoa" allocator (http://code.google.com/p/ememoa/) which seems a > reasonable pool allocator. The patch: http://dpaste.org/K8en/. The > main obstacle was that there isn't a single function/macro that can be > used to initialize all PyObjects, so I had to initialize static > PyObjects (mainly PyTypeObjects) by hand. > > I used a naive quicksort algorithm as a benchmark: > http://dpaste.org/qquh/ . The result is that after patching it runs > 50% SLOWER. I profiled it and allocator methods used 35% of the time.
So > there is still 15% performance loss even if the allocator is poor. > > Anyway, I'd like to have working copy-on-write in CPython - in the > presence of GIL I find it important to have multiprocess programs > optimized (and I think it's a common idiom that a parent process > prepares some big data structure, and child "worker" processes do some > read-only querying). That is the question though -- *is* the idiom commonly used? It doesn't seem to me that it would scale all that far, since it only works as long as all forked copies live on the same machine and run on the same symmetrical multi-core processor. -- --Guido van Rossum (python.org/~guido) From artur.siekielski at gmail.com Tue May 24 00:07:27 2011 From: artur.siekielski at gmail.com (Artur Siekielski) Date: Tue, 24 May 2011 00:07:27 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> Message-ID: <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> 2011/5/23 Guido van Rossum <guido at python.org>: >> Anyway, I'd like to have working copy-on-write in CPython - in the >> presence of GIL I find it important to have multiprocess programs >> optimized (and I think it's a common idiom that a parent process >> prepares some big data structure, and child "worker" processes do some >> read-only querying). > > That is the question though -- *is* the idiom commonly used? In fact I came to the whole idea with this optimization because the idiom didn't work for me. I had a big word index built by a parent process, and then wanted the children to handle queries on this index (I wanted to use all cores on a server). The index consumed 50% of RAM and after a few minutes the children consumed all RAM.
I find it common in languages like Java to use thread pools, in Python+Linux we have multiprocess pools if we want to use all cores, and in this setting having a working copy-on-write is really valuable. Oh, and using explicit shared memory or mmap is much harder, because you have to map the whole object graph into bytes. > It > doesn't seem to me that it would scale all that far, since it only > works as long as all forked copies live on the same machine and run on > the same symmetrical multi-core processor. ? I don't know about multi-processor systems, but on single-processor multi-core systems (which are common even on servers) and Linux it works. Artur From sturla at molden.no Tue May 24 00:33:43 2011 From: sturla at molden.no (Sturla Molden) Date: Tue, 24 May 2011 00:33:43 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> Message-ID: <4DDAE0C7.9040501@molden.no> Den 24.05.2011 00:07, skrev Artur Siekielski: > > Oh, and using explicit shared memory or mmap is much harder, because > you have to map the whole object graph into bytes. It sounds like you need PYRO, POSH or multiprocessing's proxy objects. Sturla From victor.stinner at haypocalc.com Tue May 24 02:08:49 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 24 May 2011 02:08:49 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader Message-ID: <1306195729.605.27.camel@marge> Hi, In Python 2, codecs.open() is the best way to read and/or write files using Unicode. But in Python 3, open() is preferred with its fast io module. I would like to deprecate codecs.open() because it can be replaced by open() and io.TextIOWrapper. 
I would like your opinion and that's why I'm writing this email. -- codecs.open() and StreamReader, StreamWriter and StreamReaderWriter classes of the codecs module don't support universal newlines, still have some issues with stateful codecs (like UTF-16/32 BOMs), and each codec has to implement a StreamReader and a StreamWriter class. StreamReader and StreamWriter are stateless codecs (no reset() or setstate() method), and so it's not possible to write a generic fix for all child classes in the codecs module. Each stateful codec has to handle special cases like seek() problems. For example, the UTF-16 codec duplicates some IncrementalEncoder/IncrementalDecoder code into its StreamWriter/StreamReader class. The io module is well tested, supports non-seekable streams, correctly handles corner cases (like UTF-16/32 BOMs) and supports any kind of newlines including a "universal newline" mode. TextIOWrapper reuses incremental encoders and decoders, so BOM issues were fixed only once, in TextIOWrapper. It's trivial to replace a call to codecs.open() by a call to open(), because the two APIs are very close. The main difference is that codecs.open() doesn't support universal newline, so you have to use open(..., newline='') to keep the same behaviour (keep newlines unchanged). This task can be done by 2to3. But I suppose that most people will be happy with the universal newline mode. I don't see which use case is not covered by TextIOWrapper. But I know some cases which are not supported by StreamReader/StreamWriter. -- I opened an issue for this idea. Brett and Marc-Andre Lemburg don't want to deprecate codecs.open() & friends because they want to be able to write code working on Python 2 and on Python 3 without any change. I don't think it's realistic: nontrivial programs require at least the six module, and most likely the 2to3 program. The six module can have its "codecs.open" function if codecs.open is removed from Python 3.4.
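A minimal sketch of the replacement described earlier in this message (not code from the thread): it reads the same bytes through codecs.open(), through open(..., newline=''), and through open() in its default universal-newline mode. The helper name read_three_ways and the sample data are made up for illustration, and the warning filter is only an assumption that newer Python versions may emit a DeprecationWarning for codecs.open().

```python
import codecs
import os
import tempfile
import warnings


def read_three_ways(data, encoding):
    """Return the text decoded via codecs.open(), open(newline=''), and plain open()."""
    fd, path = tempfile.mkstemp()
    try:
        # Write the raw bytes so no newline translation happens on the way in.
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        with warnings.catch_warnings():
            # Assumption: recent Pythons may warn that codecs.open() is deprecated.
            warnings.simplefilter("ignore", DeprecationWarning)
            with codecs.open(path, "r", encoding=encoding) as f:
                via_codecs = f.read()
        # newline='' keeps "\r\n" untouched, matching codecs.open().
        with open(path, "r", encoding=encoding, newline="") as f:
            via_open_raw = f.read()
        # The default mode translates "\r\n" to "\n" (universal newlines).
        with open(path, "r", encoding=encoding) as f:
            via_open_universal = f.read()
    finally:
        os.remove(path)
    return via_codecs, via_open_raw, via_open_universal


sample = "one\r\ntwo\n".encode("utf-16")
via_codecs, via_open_raw, via_open_universal = read_three_ways(sample, "utf-16")
```

Only the third result differs: the first two keep the "\r\n" pair, while the default mode of open() normalises it to "\n".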
StreamReader, StreamWriter, StreamReaderEncoder and EncodedFile are not used in the Python 3 standard library. I tried removing them: except for the tests in test_codecs which test them directly, the full test suite passes. Read the issue for more information: http://bugs.python.org/issue8796 Victor From stephen at xemacs.org Tue May 24 04:12:35 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 24 May 2011 11:12:35 +0900 Subject: [Python-Dev] Stable buildbots update In-Reply-To: <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> Message-ID: <87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp> Tarek Ziadé writes: > I have now completed the cleanup and we're back on green-land for the > stable bots. Are you saying you expect Mac OS X 10.4 "Tiger" to go green once the bots update? If so, I'm impressed, and "thank you!" to all involved. Apple and MacPorts have long since washed their hands of that release. From rdmurray at bitdance.com Tue May 24 04:50:54 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 23 May 2011 22:50:54 -0400 Subject: [Python-Dev] Stable buildbots update In-Reply-To: <87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> <87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20110524025055.4B7E1250042@webabinitio.net> On Tue, 24 May 2011 11:12:35 +0900, "Stephen J. Turnbull" <stephen at xemacs.org> wrote: > Tarek Ziadé writes: > > > I have now completed the cleanup and we're back on green-land for the > > stable bots.
> > Are you saying you expect Mac OS X 10.4 "Tiger" to go green once the > bots update? If so, I'm impressed, and "thank you!" to all involved. > Apple and MacPorts have long since washed their hands of that release. You will note that Tiger is *not* in the stable set :) -- R. David Murray http://www.bitdance.com From nad at acm.org Tue May 24 07:03:13 2011 From: nad at acm.org (Ned Deily) Date: Mon, 23 May 2011 22:03:13 -0700 Subject: [Python-Dev] Stable buildbots update References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> <87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <nad-0412B2.22031323052011@news.gmane.org> In article <87zkmcalt8.fsf at uwakimon.sk.tsukuba.ac.jp>, "Stephen J. Turnbull" <stephen at xemacs.org> wrote: > Are you saying you expect Mac OS X 10.4 "Tiger" to go green once the > bots update? If so, I'm impressed, and "thank you!" to all involved. > Apple and MacPorts have long since washed their hands of that release. OS X 10.4 does have its quirks that makes it challenging to get all of the tests to run without a few cornercase failures but, besides the buildbots, I still test regularly with 10.4 and occasionally build there, too. And, FWIW, while top-of-trunk MacPorts may not officially support 10.4, many ports work there just fine including python2.6, 2.7, and 3.1. (3.2 has a build issue that may get fixed in 3.2.1). 
-- Ned Deily, nad at acm.org From ncoghlan at gmail.com Tue May 24 07:07:03 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 May 2011 15:07:03 +1000 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <4DDAE0C7.9040501@molden.no> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> <4DDAE0C7.9040501@molden.no> Message-ID: <BANLkTikwewqji-acMFY4HxzmxK9K3__z2g@mail.gmail.com> On Tue, May 24, 2011 at 8:33 AM, Sturla Molden <sturla at molden.no> wrote: > Den 24.05.2011 00:07, skrev Artur Siekielski: >> >> Oh, and using explicit shared memory or mmap is much harder, because >> you have to map the whole object graph into bytes. > > It sounds like you need PYRO, POSH or multiprocessing's proxy objects. Indeed. Abstractions over mmap (local machine sharing) and serialisation (remote sharing) are likely to be far more beneficial in this area than trying to change the underlying memory model to support copy-on-write. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue May 24 07:24:20 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 May 2011 15:24:20 +1000 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306195729.605.27.camel@marge> References: <1306195729.605.27.camel@marge> Message-ID: <BANLkTik64iVohjzvgLC+LvqA4nhVMNUR=g@mail.gmail.com> On Tue, May 24, 2011 at 10:08 AM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > It's trivial to replace a call to codecs.open() by a call to open(), > because the two APIs are very close. The main difference is that > codecs.open() doesn't support universal newline, so you have to use > open(..., newline='') to keep the same behaviour (keep newlines > unchanged).
This task can be done by 2to3. But I suppose that most > people will be happy with the universal newline mode. Is there any reason that codecs.open() can't become a thin wrapper around builtin open in 3.3? > I don't see which usecase is not covered by TextIOWrapper. But I know > some cases which are not supported by StreamReader/StreamWriter. How API compatible is TextIOWrapper with StreamReader/StreamWriter? How hard would it be to change them to be adapters over the main IO machinery rather than independent classes? Rather than deprecating them, that seems like a more profitable direction to take them. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Tue May 24 08:38:19 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 24 May 2011 08:38:19 +0200 Subject: [Python-Dev] cpython: Issue #11377: platform.popen() emits a DeprecationWarning In-Reply-To: <E1QOdQq-0002J4-RT@dinsdale.python.org> References: <E1QOdQq-0002J4-RT@dinsdale.python.org> Message-ID: <irfjoe$121$1@dough.gmane.org> On 24.05.2011 00:17, victor.stinner wrote: > http://hg.python.org/cpython/rev/e44b851d0a2b > changeset: 70323:e44b851d0a2b > parent: 70321:202d973e8bf5 > user: Victor Stinner <victor.stinner at haypocalc.com> > date: Tue May 24 00:16:16 2011 +0200 > summary: > Issue #11377: platform.popen() emits a DeprecationWarning Please see http://mail.python.org/pipermail/python-dev/2011-May/111303.html about the style of your commit messages. 9a16fa0c9548 is another example. Georg From victor.stinner at haypocalc.com Tue May 24 09:23:38 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 24 May 2011 09:23:38 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <BANLkTik64iVohjzvgLC+LvqA4nhVMNUR=g@mail.gmail.com> References: <1306195729.605.27.camel@marge> <BANLkTik64iVohjzvgLC+LvqA4nhVMNUR=g@mail.gmail.com> Message-ID: <1306221818.2619.6.camel@marge> Le mardi 24 mai 2011 à
15:24 +1000, Nick Coghlan a écrit : > On Tue, May 24, 2011 at 10:08 AM, Victor Stinner > <victor.stinner at haypocalc.com> wrote: > > It's trivial to replace a call to codecs.open() by a call to open(), > > because the two API are very close. The main different is that > > codecs.open() doesn't support universal newline, so you have to use > > open(..., newline='') to keep the same behaviour (keep newlines > > unchanged). This task can be done by 2to3. But I suppose that most > > people will be happy with the universal newline mode. > > Is there any reason that codecs.open() can't become a thin wrapper > around builtin open in 3.3? Yes, it's trivial to implement codecs.open using:

    import builtins

    def open(filename, mode='rb', encoding=None, errors='strict', buffering=1):
        # builtins.open() rejects encoding= and newline= in binary mode, so drop the 'b'
        return builtins.open(filename, mode.replace('b', ''), buffering, encoding, errors, newline='')

But do we really need two ways to open a file? Extract of import this: "There should be one-- and preferably only one --obvious way to do it." Another example: Python 3.2 has subprocess.Popen, os.popen and platform.popen to open a subprocess. platform.popen is now deprecated in Python 3.3. Well, it's already better than Python 2.5 which has os.popen(), os.popen2(), os.popen3(), os.popen4(), os.spawnl(), os.spawnle(), os.spawnlp(), os.spawnlpe(), os.spawnv(), os.spawnve(), os.spawnvp(), os.spawnvpe(), subprocess.Popen, platform.popen and maybe others :-) > How API compatible is TextIOWrapper with StreamReader/StreamWriter? It's fully compatible. > How hard would it to be change them to be adapters over the main IO > machinery rather than independent classes? I don't understand your proposition. We don't need StreamReader and StreamWriter to open a stream as a text file, only incremental decoders and encoders. Why do you want to keep them? Victor From mal at egenix.com Tue May 24 10:03:22 2011 From: mal at egenix.com (M.-A.
Lemburg) Date: Tue, 24 May 2011 10:03:22 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306195729.605.27.camel@marge> References: <1306195729.605.27.camel@marge> Message-ID: <4DDB664A.7050705@egenix.com> Victor Stinner wrote: > Hi, > > In Python 2, codecs.open() is the best way to read and/or write files > using Unicode. But in Python 3, open() is preferred with its fast io > module. I would like to deprecate codecs.open() because it can be > replaced by open() and io.TextIOWrapper. I would like your opinion and > that's why I'm writing this email. I think you should have moved this part of your email further up, since it explains the reason why this idea was rejected for now: > I opened an issue for this idea. Brett and Marc-Andree Lemburg don't > want to deprecate codecs.open() & friends because they want to be able > to write code working on Python 2 and on Python 3 without any change. I > don't think it's realistic: nontrivial programs require at least the six > module, and most likely the 2to3 program. The six module can have its > "codecs.open" function if codecs.open is removed from Python 3.4. And now for something completely different: > codecs.open() and StreamReader, StreamWriter and StreamReaderWriter > classes of the codecs module don't support universal newlines, still > have some issues with stateful codecs (like UTF-16/32 BOMs), and each > codec has to implement a StreamReader and a StreamWriter class. > > StreamReader and StreamWriter are stateless codecs (no reset() or > setstate() method), and so it's not possible to write a generic fix for > all child classes in the codecs module. Each stateful codec has to > handle special cases like seek() problems. For example, UTF-16 codec > duplicates some IncrementalEncoder/IncrementalDecoder code into its > StreamWriter/StreamReader class. Please read PEP 100 regarding StreamReader and StreamWriter. 
Those codecs parts were explicitly designed to be stateful, unlike the stateless encoder/decoder methods. Please read my reply on the ticket: """ StreamReader and StreamWriter classes provide the base codec implementations for stateful interaction with streams. They define the interface and provide a working implementation for those codecs that choose not to implement their own variants. Each codec can, however, implement variants which are optimized for the specific encoding or intercept certain stream methods to add functionality or improve the encoding/decoding performance. Both are essential parts of the codec interface. TextIOWrapper and StreamReaderWriter are merely wrappers around streams that make use of the codecs. They don't provide any codec logic themselves. That's the conceptual difference. """ > The io module is well tested, supports non-seekable streams, handles > correctly corner-cases (like UTF-16/32 BOMs) and supports any kind of > newlines including an "universal newline" mode. TextIOWrapper reuses > incremental encoders and decoders, so BOM issues were fixed only once, > in TextIOWrapper. > > It's trivial to replace a call to codecs.open() by a call to open(), > because the two API are very close. The main different is that > codecs.open() doesn't support universal newline, so you have to use > open(..., newline='') to keep the same behaviour (keep newlines > unchanged). This task can be done by 2to3. But I suppose that most > people will be happy with the universal newline mode. > > I don't see which usecase is not covered by TextIOWrapper. But I know > some cases which are not supported by StreamReader/StreamWriter. This is a misunderstanding of the concepts behind the two. StreamReader and StreamWriters are implemented by the codecs, they are part of the API that each codec has to provide in order to register in the Python codecs system. 
Their purpose is to provide a stateful interface and work efficiently and directly on streams rather than buffers. Here's my reply from the ticket regarding using incremental encoders/decoders for the StreamReader/Writer parts of the codec set of APIs: """ The point about having them use incremental codecs for encoding and decoding is a good one and would need to be investigated. If possible, we could use incremental encoders/decoders for the standard StreamReader/Writer base classes or add new IncrementalStreamReader/Writer classes which then use the IncrementalEncode/Decoder per default. Please open a new ticket for this. """ > StreamReader, StreamWriter, StreamReaderEncoder and EncodedFile are not > used in the Python 3 standard library. I tried removed them: except > tests of test_codecs which test them directly, the full test suite pass. > > Read the issue for more information: http://bugs.python.org/issue8796 -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 24 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2011-06-20: EuroPython 2011, Florence, Italy 27 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From vinay_sajip at yahoo.co.uk Tue May 24 10:16:03 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 24 May 2011 08:16:03 +0000 (UTC) Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader References: <1306195729.605.27.camel@marge> Message-ID: <loom.20110524T095820-780@post.gmane.org> Victor Stinner <victor.stinner <at> haypocalc.com> writes: > I opened an issue for this idea. Brett and Marc-Andree Lemburg don't > want to deprecate codecs.open() & friends because they want to be able > to write code working on Python 2 and on Python 3 without any change. I > don't think it's realistic: nontrivial programs require at least the six > module, and most likely the 2to3 program. The six module can have its > "codecs.open" function if codecs.open is removed from Python 3.4. What's "non-trivial"? Both pip and virtualenv (widely used programs) were ported to Python 3 using a single codebase for 2.x and 3.x, because it seemed to involve the least ongoing maintenance burden. Though these particular programs don't use codecs.open, I don't see much value in making it harder to write programs which can run under both 2.x and 3.x; that's not going to speed adoption of 3.x. I find 2to3 very useful indeed for showing where changes may need to be made for 2.x/3.x portability, but do not use it as an automatic conversion tool. The six module is very useful, too, but some projects won't necessarily want to add it as an additional dependency, and will reimplement just the parts they need from that bag of tricks. So I would also want to keep codecs.open() and friends, at least for now - though it seems to make sense to implement them as wrappers (as Nick suggested).
Regards, Vinay Sajip From victor.stinner at haypocalc.com Tue May 24 10:31:50 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 24 May 2011 10:31:50 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <loom.20110524T095820-780@post.gmane.org> References: <1306195729.605.27.camel@marge> <loom.20110524T095820-780@post.gmane.org> Message-ID: <1306225910.2619.12.camel@marge> Le mardi 24 mai 2011 à 08:16 +0000, Vinay Sajip a écrit : > So I would also want to keep codecs.open() and friends, at least for now Well, I would agree to keep codecs.open() (if we patch it to reuse TextIOWrapper and add a note to say that it is kept for backward compatibility and open() should be preferred in Python 3), but deprecate StreamReader, StreamWriter and EncodedFile. As I wrote, codecs.open() is useful in Python 2. But I don't know any program or library directly using StreamReader or StreamWriter. I found some projects (e.g. twisted-mail, feeds2imap, pyflag, pygsm, ...) implementing their own Python codec (cool!) and their codec has their StreamReader and StreamWriter class, but I don't think that these classes are used. Victor From victor.stinner at haypocalc.com Tue May 24 10:58:54 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 24 May 2011 10:58:54 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <4DDB664A.7050705@egenix.com> References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com> Message-ID: <1306227534.2619.34.camel@marge> Le mardi 24 mai 2011 à 10:03 +0200, M.-A. Lemburg a écrit : > Please read PEP 100 regarding StreamReader and StreamWriter. > Those codecs parts were explicitly designed to be stateful, > unlike the stateless encoder/decoder methods.
Yes, it is possible to implement stateful StreamReader and StreamWriter classes and we have such codecs (I gave the example of UTF-16), but the state is not exposed (getstate / setstate), and so it's not possible to write generic code to handle the codec state in the base StreamReader and StreamWriter classes. io.TextIOWrapper requires encoder.setstate(0) for example. > Each codec can, however, implement variants which are optimized > for the specific encoding or intercept certain stream methods > to add functionality or improve the encoding/decoding > performance. Can you give me some examples? > TextIOWrapper and StreamReaderWriter are merely wrappers > around streams that make use of the codecs. They don't > provide any codec logic themselves. That's the conceptual > difference. > ... > StreamReader and StreamWriters ... work efficiently and > directly on streams rather than buffers. StreamReader, StreamWriter, TextIOWrapper and StreamReaderWriter all have a file-like API: tell(), seek(), read(), readline(), write(), etc. The implementation is maybe different, but the API is just the same, and so the usecases are just the same. I don't see in which case I should use StreamReader or StreamWriter instead TextIOWrapper. I thought that TextIOWrapper is specific to files on disk, but TextIOWrapper is already used for other usages like sockets. > Here's my reply from the ticket regarding using incremental > encoders/decoders for the StreamReader/Writer parts of the > codec set of APIs: > > """ > The point about having them use incremental codecs for encoding and > decoding is a good one and would > need to be investigated. If possible, we could use incremental > encoders/decoders for the standard > StreamReader/Writer base classes or add new > IncrementalStreamReader/Writer classes which then use > the IncrementalEncode/Decoder per default. Why do you want to write a duplicate feature? TextIOWrapper is already here, it's working and widely used. 
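As an illustration of the TextIOWrapper behaviour discussed here (a sketch, not code from the thread), the following wraps in-memory byte streams rather than files on disk. It assumes the UTF-16 codec and uses io.BytesIO as a stand-in for any binary stream; the variable names are arbitrary.

```python
import io

# Writing: two separate write() calls, but the incremental encoder
# behind TextIOWrapper emits the BOM only once, at the start.
raw_out = io.BytesIO()
writer = io.TextIOWrapper(raw_out, encoding="utf-16", newline="")
writer.write("spam\r\n")
writer.write("eggs\n")
writer.flush()
encoded = raw_out.getvalue()

# Reading: seek(0) resets the decoder state, so a second pass over
# the stream consumes the BOM again instead of returning it as text.
raw_in = io.BytesIO(encoded)
reader = io.TextIOWrapper(raw_in, encoding="utf-16", newline="")
first_pass = reader.read()
reader.seek(0)
second_pass = reader.read()
```

The BOM handling lives in the wrapper and the incremental codec, so nothing in the calling code needs to know that UTF-16 is a stateful encoding.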
I am working on codec issues (like CJK encodings, see #12100, #12057, #12016) and I would like to remove StreamReader and StreamWriter to have *less* code to maintain. If you want to add more code, will you be available to maintain it? It looks like you are busy, some people (not me ;-)) are still waiting for .transform()/.untransform()! Victor From solipsis at pitrou.net Tue May 24 11:56:55 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 May 2011 11:56:55 +0200 Subject: [Python-Dev] Stable buildbots update In-Reply-To: <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> Message-ID: <20110524115655.65030e15@pitrou.net> On Mon, 23 May 2011 19:16:36 +0200 Tarek Ziadé <ziade.tarek at gmail.com> wrote: > > I have now completed the cleanup and we're back on green-land for the > stable bots. > > The red slaves should get green when they catch up with the latest rev > (they are slow). If they're not and they are failing in packaging or > sysconfig let me know. > > Sorry again if it has taken so long. Setting up Solaris and BSD VMs > took some time ;) Thank you very much! What a beautiful sight this is: http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable (until a sporadic failure comes up, that is) Regards Antoine.
From artur.siekielski at gmail.com Tue May 24 11:55:32 2011 From: artur.siekielski at gmail.com (Artur Siekielski) Date: Tue, 24 May 2011 11:55:32 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <4DDAE0C7.9040501@molden.no> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> <4DDAE0C7.9040501@molden.no> Message-ID: <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com> 2011/5/24 Sturla Molden <sturla at molden.no>: >> Oh, and using explicit shared memory or mmap is much harder, because >> you have to map the whole object graph into bytes. > > It sounds like you need PYRO, POSH or multiprocessing's proxy objects. PYRO/multiprocessing proxies aren't a comparable solution because of ORDERS OF MAGNITUDE worse performance. You are comparing direct memory access with serialization/message passing through sockets/pipes. POSH might be good, but the project has been dead for 8 years. And this copy-on-write is nice because you don't need changes/restrictions to your code, or a special garbage collector. Artur From solipsis at pitrou.net Tue May 24 12:06:01 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 May 2011 12:06:01 +0200 Subject: [Python-Dev] "streams" vs "buffers" References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com> Message-ID: <20110524120601.32de673a@pitrou.net> On Tue, 24 May 2011 10:03:22 +0200 "M.-A. Lemburg" <mal at egenix.com> wrote: > > StreamReader and StreamWriters are implemented by the codecs, > they are part of the API that each codec has to provide in order > to register in the Python codecs system. Their purpose is > to provide a stateful interface and work efficiently and > directly on streams rather than buffers.
I think you are trying to make a conceptual distinction which doesn't exist in practice. Your OS uses buffers to represent "streams" to you. Also, how come StreamReader has internal members named "bytebuffer", "charbuffer" and "linebuffer"? There certainly seems to be some (non-trivial) amount of buffering going on there, and probably quite slow and inefficient since it's pure Python (TextIOWrapper is written in C). Regards Antoine. From mal at egenix.com Tue May 24 12:14:10 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 24 May 2011 12:14:10 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306227534.2619.34.camel@marge> References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com> <1306227534.2619.34.camel@marge> Message-ID: <4DDB84F2.40106@egenix.com> Victor Stinner wrote: > Le mardi 24 mai 2011 à 10:03 +0200, M.-A. Lemburg a écrit : >> Please read PEP 100 regarding StreamReader and StreamWriter. >> Those codecs parts were explicitly designed to be stateful, >> unlike the stateless encoder/decoder methods. > > Yes, it is possible to implement stateful StreamReader and StreamWriter > classes and we have such codecs (I gave the example of UTF-16), but the > state is not exposed (getstate / setstate), and so it's not possible to > write generic code to handle the codec state in the base StreamReader > and StreamWriter classes. io.TextIOWrapper requires encoder.setstate(0) > for example. So instead of always suggesting to deprecate everything, how about you come up with a proposal to add meaningful new methods to those base classes? >> Each codec can, however, implement variants which are optimized >> for the specific encoding or intercept certain stream methods >> to add functionality or improve the encoding/decoding >> performance. > > Can you give me some examples? See the UTF-16 codec in the stdlib for example.
This uses some of the available possibilities to interpret the BOM mark and then switches the encoder/decoder methods accordingly. A lot more could be done for other variable length encoding codecs, e.g. UTF-8, since these often have problems near the end of a read due to missing bytes. The base class implementation provides a general purpose implementation to cover the case, but it's not efficient, since it doesn't know anything about the encoding characteristics. Such an implementation would have to be done per codec and that's why we have per codec StreamReader/Writer APIs. >> TextIOWrapper and StreamReaderWriter are merely wrappers >> around streams that make use of the codecs. They don't >> provide any codec logic themselves. That's the conceptual >> difference. >> ... >> StreamReader and StreamWriters ... work efficiently and >> directly on streams rather than buffers. > > StreamReader, StreamWriter, TextIOWrapper and StreamReaderWriter all > have a file-like API: tell(), seek(), read(), readline(), write(), etc. > The implementation is maybe different, but the API is just the same, and > so the usecases are just the same. > > I don't see in which case I should use StreamReader or StreamWriter > instead TextIOWrapper. I thought that TextIOWrapper is specific to files > on disk, but TextIOWrapper is already used for other usages like > sockets. I have no idea why TextIOWrapper was added to the stdlib instead of making StreamReaderWriter more capable, since StreamReaderWriter had already been available in Python since Python 1.6 (and this is being used by codecs.open()). Perhaps we should deprecate TextIOWrapper instead and replace it with codecs.StreamReaderWriter ? ;-) Seriously, I don't see use of TextIOWrapper as an argument for removing StreamReader/Writer parts of the codecs API. 
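The "missing bytes near the end of a read" case mentioned above for UTF-8 can be illustrated with an incremental decoder. This is only a sketch of the general problem, not of the per-codec StreamReader implementation being argued for; the sample string and the split point are arbitrary choices.

```python
import codecs

data = "naïve €".encode("utf-8")
# Split in the middle of the 3-byte euro sign to mimic a short read.
chunk1, chunk2 = data[:-1], data[-1:]

decoder = codecs.getincrementaldecoder("utf-8")()
text1 = decoder.decode(chunk1)              # the incomplete tail is held back
pending, _flags = decoder.getstate()        # undecoded bytes kept as state
text2 = decoder.decode(chunk2, final=True)  # the last byte completes the character
```

The decoder keeps the two leading bytes of the euro sign as pending state until the final byte arrives, so no partial character is ever returned to the caller.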
>> Here's my reply from the ticket regarding using incremental >> encoders/decoders for the StreamReader/Writer parts of the >> codec set of APIs: >> >> """ >> The point about having them use incremental codecs for encoding and >> decoding is a good one and would >> need to be investigated. If possible, we could use incremental >> encoders/decoders for the standard >> StreamReader/Writer base classes or add new >> IncrementalStreamReader/Writer classes which then use >> the IncrementalEncode/Decoder per default. > > Why do you want to write a duplicate feature? TextIOWrapper is already > here, it's working and widely used. See above and please also try to understand why we have per-codec implementations for streams. I'm tired of repeating myself. I would much prefer to see the codec-specific functionality in TextIOWrapper added back to the codecs where it belongs. > I am working on codec issues (like CJK encodings, see #12100, #12057, > #12016) and I would like to remove StreamReader and StreamWriter to have > *less* code to maintain. > > If you want to add more code, will be available to maintain it? It looks > like you are busy, some people (not me ;-)) are still > waiting .transform()/.untransform()! I dropped the ball on the idea after the strong wave of comments against those methods. People will simply have to use codecs.encode() and codecs.decode(). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 24 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2011-06-20: EuroPython 2011, Florence, Italy 27 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Tue May 24 12:25:11 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 May 2011 20:25:11 +1000 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306227534.2619.34.camel@marge> References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com> <1306227534.2619.34.camel@marge> Message-ID: <BANLkTi=pqLU9kXmr6Kj7o36x7LuUO=Y3Cg@mail.gmail.com> On Tue, May 24, 2011 at 6:58 PM, Victor Stinner <victor.stinner at haypocalc.com> wrote: > StreamReader, StreamWriter, TextIOWrapper and StreamReaderWriter all > have a file-like API: tell(), seek(), read(), readline(), write(), etc. > The implementation is maybe different, but the API is just the same, and > so the usecases are just the same. > > I don't see in which case I should use StreamReader or StreamWriter > instead TextIOWrapper. I thought that TextIOWrapper is specific to files > on disk, but TextIOWrapper is already used for other usages like > sockets. Back up a step here. It's important to remember that the codecs module *long* predates the existence of the Python 3 I/O model and the io module in particular. Just as PEP 302 defines how module importers should be written, PEP 100 defines how text codecs should be written (i.e. in terms of StreamReader and StreamWriter). PEP 3116 then defines how such codecs can be used as part of the overall I/O stack as redesigned for Python 3. Now, there may be an opportunity here to rationalise things a bit and re-use the *new* io module interfaces as the basis for an updated codec API PEP, but we shouldn't be hasty in deprecating an old API that is about "how to write codecs" just because it is similar to a shiny new one that is about "how to process I/O data". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Tue May 24 12:27:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 May 2011 20:27:35 +1000 Subject: [Python-Dev] Stable buildbots update In-Reply-To: <20110524115655.65030e15@pitrou.net> References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> <20110524115655.65030e15@pitrou.net> Message-ID: <BANLkTi=Vb7VkMAbDr-cfJyU5Vh56J6O+6A@mail.gmail.com> On Tue, May 24, 2011 at 7:56 PM, Antoine Pitrou <solipsis at pitrou.net> wrote: > Thank you very much! What a beautiful sight this is: > http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable > > (until a sporadic failure comes up, that is) I could turn test_crashers back on if you like ;) Great work to all involved in tidying things up post-merge! Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From walter at livinglogic.de Tue May 24 12:16:49 2011 From: walter at livinglogic.de (Walter Dörwald) Date: Tue, 24 May 2011 12:16:49 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306195729.605.27.camel@marge> References: <1306195729.605.27.camel@marge> Message-ID: <4DDB8591.2060308@livinglogic.de> On 24.05.11 02:08, Victor Stinner wrote: > [...] > codecs.open() and StreamReader, StreamWriter and StreamReaderWriter > classes of the codecs module don't support universal newlines, still > have some issues with stateful codecs (like UTF-16/32 BOMs), and each > codec has to implement a StreamReader and a StreamWriter class. > > StreamReader and StreamWriter are stateless codecs (no reset() or > setstate() method), They *are* stateful, they just don't expose their state to the public. > and so it's not possible to write a generic fix for > all child classes in the codecs module.
Each stateful codec has to > handle special cases like seek() problems. Yes, which in theory makes it possible to implement shortcuts for certain codecs (e.g. the UTF-32-BE/LE codecs could simply multiply the character position by 4 to get the byte position). However AFAICR none of the readers/writers does that. > For example, the UTF-16 codec > duplicates some IncrementalEncoder/IncrementalDecoder code into its > StreamWriter/StreamReader class. Actually it's the other way round: When I implemented the incremental codecs, I copied code from the StreamReader/StreamWriter classes. > The io module is well tested, supports non-seekable streams, handles > corner cases correctly (like UTF-16/32 BOMs) and supports any kind of > newlines including a "universal newline" mode. TextIOWrapper reuses > incremental encoders and decoders, so BOM issues were fixed only once, > in TextIOWrapper. > > It's trivial to replace a call to codecs.open() by a call to open(), > because the two APIs are very close. The main difference is that > codecs.open() doesn't support universal newlines, so you have to use > open(..., newline='') to keep the same behaviour (keep newlines > unchanged). This task can be done by 2to3. But I suppose that most > people will be happy with the universal newline mode. > > I don't see which use case is not covered by TextIOWrapper. But I know > some cases which are not supported by StreamReader/StreamWriter. This could be partially fixed by implementing generic StreamReader/StreamWriter classes that reuse the incremental codecs, but I don't think that's worth it. > [...]
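The UTF-32 shortcut mentioned above (multiply the character position by 4 to get the byte position) can be checked with a few lines. This is a sketch under stated assumptions: Python 3, a BOM-less UTF-32-BE byte string, and a hypothetical helper named utf32_byte_offset that does not exist in the codecs module.

```python
def utf32_byte_offset(char_index):
    """Byte offset of a character in BOM-less UTF-32 data.

    Hypothetical helper: every code point occupies exactly 4 bytes.
    """
    return 4 * char_index


# ASCII, Latin-1, BMP and non-BMP characters all take 4 bytes each.
text = "a\u00e9\u20ac\U0001F600z"
data = text.encode("utf-32-be")

# Slice each character back out of the bytes using only arithmetic.
slices = [
    data[utf32_byte_offset(i):utf32_byte_offset(i + 1)].decode("utf-32-be")
    for i in range(len(text))
]
```

The same arithmetic would not hold for the plain "utf-32" codec, which prepends a BOM, nor for UTF-8 or UTF-16, where the width of a character varies.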
Servus, Walter From solipsis at pitrou.net Tue May 24 12:39:29 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 May 2011 12:39:29 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com> <1306227534.2619.34.camel@marge> <BANLkTi=pqLU9kXmr6Kj7o36x7LuUO=Y3Cg@mail.gmail.com> Message-ID: <20110524123929.42dd91ef@pitrou.net> On Tue, 24 May 2011 20:25:11 +1000 Nick Coghlan <ncoghlan at gmail.com> wrote: > > Just as PEP 302 defines how module importers should be written, PEP > 100 defines how text codecs should be written (i.e. in terms of > StreamReader and StreamWriter). > > PEP 3116 then defines how such codecs can be used as part of the > overall I/O stack as redesigned for Python 3. The I/O stack doesn't use StreamReader and StreamWriter. That's the whole point. Stream* have been made useless by the new I/O stack. > Now, there may be an opportunity here to rationalise things a bit and > re-use the *new* io module interfaces as the basis for an updated > codec API PEP, but we shouldn't be hasty in deprecating an old API > that is about "how to write codecs" just because it is similar to a > shiny new one that is about "how to process I/O data". Ok, can you explain the difference to us, concretely? Thanks Antoine. From lukasz at langa.pl Tue May 24 12:42:44 2011 From: lukasz at langa.pl (Łukasz Langa) Date: Tue, 24 May 2011 12:42:44 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <4DDB8591.2060308@livinglogic.de> References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> Message-ID: <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> Message written by Walter Dörwald on 2011-05-24, at 12:16: >> I don't see which use case is not covered by TextIOWrapper. But I know >> some cases which are not supported by StreamReader/StreamWriter.
> > This could be partially fixed by implementing generic > StreamReader/StreamWriter classes that reuse the incremental codecs, but > I don't think that's worth it. Why not? -- Best regards, Łukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o. From victor.stinner at haypocalc.com Tue May 24 12:50:28 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 24 May 2011 12:50:28 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <loom.20110524T095820-780@post.gmane.org> References: <1306195729.605.27.camel@marge> <loom.20110524T095820-780@post.gmane.org> Message-ID: <1306234228.2619.44.camel@marge> On Tuesday 24 May 2011 at 08:16 +0000, Vinay Sajip wrote: > > I opened an issue for this idea. Brett and Marc-Andre Lemburg don't > > want to deprecate codecs.open() & friends because they want to be able > > to write code working on Python 2 and on Python 3 without any change. I > > don't think it's realistic: nontrivial programs require at least the six > > module, and most likely the 2to3 program. The six module can have its > > "codecs.open" function if codecs.open is removed from Python 3.4. > > What's "non-trivial"? Both pip and virtualenv (widely used programs) were ported > to Python 3 using a single codebase for 2.x and 3.x, because it seemed to > involve the least ongoing maintenance burden. Though these particular programs > don't use codecs.open, I don't see much value in making it harder to write > programs which can run under both 2.x and 3.x; that's not going to speed > adoption of 3.x. pip has a pip.backwardcompat module which is very similar to six. If codecs.open() is deprecated or removed, it will be trivial to add a wrapper for codecs.open() or open() to six and pip.backwardcompat. virtualenv.py also starts with a thin compatibility layer. But yes, each program using a compatibility layer/module will have to be updated.
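For illustration, the kind of thin wrapper a compatibility layer might carry could look like the sketch below. It is an assumption, not code from six, pip.backwardcompat or virtualenv: the name open_text is hypothetical, and it relies only on io.open(), which exists in Python 2.6+ and in Python 3. Passing newline="" keeps newlines untranslated, which matches the codecs.open() behaviour described earlier in the thread.

```python
import io
import os
import tempfile


def open_text(path, mode="r", encoding="utf-8", errors="strict"):
    """Hypothetical stand-in for codecs.open() in a compat module.

    newline="" disables universal-newline translation, so "\r\n"
    round-trips unchanged, as it would with codecs.open().
    """
    if "b" in mode:
        raise ValueError("open_text() handles text modes only")
    return io.open(path, mode, encoding=encoding, errors=errors, newline="")


# Round-trip a string with a Windows line ending through UTF-16.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with open_text(path, "w", encoding="utf-16") as f:
        f.write(u"caf\u00e9\r\n")
    with open_text(path, "r", encoding="utf-16") as f:
        round_trip = f.read()
finally:
    os.remove(path)
```

Callers that prefer universal newlines would simply drop the newline argument, which is the trade-off Victor describes.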
Victor From solipsis at pitrou.net Tue May 24 12:54:53 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 May 2011 12:54:53 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> Message-ID: <20110524125453.4b20107b@pitrou.net> On Tue, 24 May 2011 12:16:49 +0200 Walter Dörwald <walter at livinglogic.de> wrote: > > > and so it's not possible to write a generic fix for > > all child classes in the codecs module. Each stateful codec has to > > handle special cases like seek() problems. > > Yes, which in theory makes it possible to implement shortcuts for > certain codecs (e.g. the UTF-32-BE/LE codecs could simply multiply the > character position by 4 to get the byte position). However AFAICR none > of the readers/writers does that. And in practice, TextIOWrapper.tell() does a similar optimization in a generic way. I'm linking to the Python implementation for readability: http://hg.python.org/cpython/file/5c716437a83a/Lib/_pyio.py#l1741 TextIOWrapper.seek() is straightforward due to the structure of the integer "cookie" returned by TextIOWrapper.tell(). In practice, TextIOWrapper gets much more love than Stream{Reader,Writer} because it's an essential part of the new I/O stack. As Victor said, problems which Stream* have had for years are solved neatly in TextIOWrapper. Therefore, leaving Stream{Reader,Writer} in is not a matter of "choice" and "freedom given to users". It's giving people the misleading possibility of using non-optimized, poorly debugged, less featureful implementations of the same basic idea (a Unicode stream abstraction). Regards Antoine.
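The tell()/seek() behaviour described in the message above can be tried out with an in-memory stream. This is a minimal sketch assuming Python 3; the sample text is made up, and the cookie is treated as an opaque integer, since its internal layout is an implementation detail of TextIOWrapper.

```python
import io

# UTF-16 with a BOM: a stateful, multi-byte encoding.
raw = io.BytesIO("\u00e9t\u00e9 ok\n".encode("utf-16"))
text = io.TextIOWrapper(raw, encoding="utf-16", newline="")

first = text.read(3)   # consume three characters
cookie = text.tell()   # opaque position: byte offset plus decoder state
rest = text.read()     # read to the end of the stream

text.seek(cookie)      # restore both the position and the decoder state
again = text.read()    # should reproduce `rest`
```

The only supported uses of the cookie are passing it back to seek() on the same stream and comparing it with other cookies from that stream.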
From victor.stinner at haypocalc.com Tue May 24 12:58:01 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 24 May 2011 12:58:01 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> Message-ID: <1306234681.2619.45.camel@marge> On Tuesday 24 May 2011 at 12:42 +0200, Łukasz Langa wrote: > Message written by Walter Dörwald on 2011-05-24, at 12:16: > > >> I don't see which use case is not covered by TextIOWrapper. But I know > >> some cases which are not supported by StreamReader/StreamWriter. > > > > This could be partially fixed by implementing generic > > StreamReader/StreamWriter classes that reuse the incremental codecs, but > > I don't think that's worth it. > > Why not? We already have an implementation of this idea, it is called io.TextIOWrapper. Victor From fijall at gmail.com Tue May 24 13:31:38 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 24 May 2011 13:31:38 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> Message-ID: <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> On Sun, May 22, 2011 at 1:57 AM, Artur Siekielski <artur.siekielski at gmail.com> wrote: > Hi. > The problem with reference counters is that they are very often > incremented/decremented, even for read-only algorithms (like traversal > of a list). It has two drawbacks: > 1.
CPU cache lines (64 bytes on X86) containing the beginning of a > PyObject are very often invalidated, resulting in losing many chances > to use the CPU caches Not sure what scenario exactly you are discussing here, but storing reference counts outside of objects has (at least on a single processor) worse cache locality than inside objects. > > However the drawback is that such a design introduces a new level of > indirection which is a pointer inside a PyObject instead of a direct > value. Also it seems that the "block" with refcounts would have to be > a non-trivial data structure. That would almost certainly be slower for most use cases, except for the copy-on-write fork. I guess the Recycler papers might be an interesting read: http://www.research.ibm.com/people/d/dfb/recycler.html This is the best reference-counting GC I'm aware of. > > I'm not a compiler/profiling expert so the main question is if such a > design can work, and maybe someone was thinking about something > similar? And if CPython was profiled for CPU cache usage? CPython was not designed for CPU cache usage as far as I'm aware. From my (heavily biased) point of view, PyPy is a way better platform to perform such experiments (and PyPy has been profiled for CPU cache usage). The main advantage is that you can code your GC without the need to modify the interpreter. On the other hand you obviously don't get benefits on CPython, but maybe it's worth experimenting.
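The quoted premise, that reference counters are written even by code that only reads an object, can be observed directly. This is a small sketch that assumes CPython (sys.getrefcount() is implementation-specific and the absolute numbers vary between versions, so only the difference is looked at); the function name is made up for illustration.

```python
import sys


def refcount_delta_when_stored():
    """Return how much an object's refcount changes when it is merely
    stored in a container, without being modified (CPython only)."""
    obj = object()
    before = sys.getrefcount(obj)
    holders = [obj]  # no mutation of obj, yet its header is written to
    after = sys.getrefcount(obj)
    del holders
    return after - before


delta = refcount_delta_when_stored()
```

Because the counter lives in the object header, this write lands on the same memory page as the object itself, which is what triggers page copies after a copy-on-write fork().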
Cheers, fijal From walter at livinglogic.de Tue May 24 14:01:43 2011 From: walter at livinglogic.de (Walter Dörwald) Date: Tue, 24 May 2011 14:01:43 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306234681.2619.45.camel@marge> References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> <1306234681.2619.45.camel@marge> Message-ID: <4DDB9E27.7040605@livinglogic.de> On 24.05.11 12:58, Victor Stinner wrote: > On Tuesday 24 May 2011 at 12:42 +0200, Łukasz Langa wrote: >> Message written by Walter Dörwald on 2011-05-24, at 12:16: >> >>>> I don't see which use case is not covered by TextIOWrapper. But I know >>>> some cases which are not supported by StreamReader/StreamWriter. >>> >>> This could be partially fixed by implementing generic >>> StreamReader/StreamWriter classes that reuse the incremental codecs, but >>> I don't think that's worth it. >> >> Why not? > > We already have an implementation of this idea, it is called > io.TextIOWrapper. Exactly. From another post by Victor: > As I wrote, codecs.open() is useful in Python 2. But I don't know any > program or library directly using StreamReader or StreamWriter. So: implementing this is a lot of work, duplicates existing functionality and is mostly unused. Servus, Walter From stefan_ml at behnel.de Tue May 24 14:05:26 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 24 May 2011 14:05:26 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> Message-ID: <irg6u7$cu$1@dough.gmane.org> Maciej Fijalkowski, 24.05.2011 13:31: > CPython was not designed for CPU cache usage as far as I'm aware.
That's a pretty bold statement to make on this list. Even if it wasn't originally "designed" for (efficient?) CPU cache usage, it's certainly been around for long enough to have received numerous performance tweaks in that regard. I doubt that efficient CPU cache usage was a major design goal of PyPy right from the start. IMHO, the project has changed its objectives way too many times to claim something like that, especially at the low level where the CPU cache becomes relevant. I remember that not so long ago, PyPy was hugely memory hungry compared to CPython. Although, one could certainly call *that* "designed for CPU cache usage"... ;) Stefan From sturla at molden.no Tue May 24 14:08:14 2011 From: sturla at molden.no (Sturla Molden) Date: Tue, 24 May 2011 14:08:14 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> <4DDAE0C7.9040501@molden.no> <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com> Message-ID: <4DDB9FAE.5060205@molden.no> Den 24.05.2011 11:55, skrev Artur Siekielski: > > PYRO/multiprocessing proxies isn't a comparable solution because of > ORDERS OF MAGNITUDE worser performance. You compare here direct memory > access vs serialization/message passing through sockets/pipes. The bottleneck is likely the serialization, but only if you serialize large objects. IPC is always very fast, at least on localhost . Just out of curiosity, have you considered using a database? Sqlite and BSD DB can even be put in shared memory if you want. It sounds like you are trying to solve a database problem using os.fork, something which is more or less doomed to fail (i.e. you have to replicate all effort put into scaling up databases). 
If a database is too slow, I am rather sure you need something else than Python as well. Sturla From sturla at molden.no Tue May 24 14:25:59 2011 From: sturla at molden.no (Sturla Molden) Date: Tue, 24 May 2011 14:25:59 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> Message-ID: <4DDBA3D7.9060807@molden.no> Den 24.05.2011 13:31, skrev Maciej Fijalkowski: > > Not sure what scenario exactly are you discussing here, but storing > reference counts outside of objects has (at least on a single > processor) worse cache locality than inside objects. > Artur Siekielski is not talking about cache locality, but copy-on-write fork on Linux et al. When reference counts are updated after forking, memory pages marked copy-on-write are copied if they store reference counts. And then he quickly runs out of memory. He wants to put reference counts and PyObjects in different pages, so only the pages with reference counts get copied. I don't think he cares about cache locality at all, but the rest of us do :-) Sturla From sturla at molden.no Tue May 24 14:31:47 2011 From: sturla at molden.no (Sturla Molden) Date: Tue, 24 May 2011 14:31:47 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> <4DDAE0C7.9040501@molden.no> <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com> Message-ID: <4DDBA533.3070800@molden.no> Den 24.05.2011 11:55, skrev Artur Siekielski: > > POSH might be good, but the project is dead for 8 years. 
And this > copy-on-write is nice because you don't need changes/restrictions to > your code, or a special garbage collector. Then I have a solution for you, one that is cheaper than anything else you are trying to do (taking work hours into account): BUY MORE RAM! RAM is damn cheap. You just need more of it. And 64-bit Python :-) Sturla From solipsis at pitrou.net Tue May 24 14:32:42 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 May 2011 14:32:42 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> <irg6u7$cu$1@dough.gmane.org> Message-ID: <20110524143242.0774326c@pitrou.net> On Tue, 24 May 2011 14:05:26 +0200 Stefan Behnel <stefan_ml at behnel.de> wrote: > > I doubt that efficient CPU cache usage was a major design goal of PyPy > right from the start. IMHO, the project has changed its objectives way too > many times to claim something like that, especially at the low level where > the CPU cache becomes relevant. I remember that not so long ago, PyPy was > hugely memory hungry compared to CPython. Although, one could certainly > call *that* "designed for CPU cache usage"... ;) Well, to be honest, "hugely memory hungry" doesn't necessarily mean cache-averse. It depends on the locality of memory access patterns. Regards Antoine. 
From stefan_ml at behnel.de Tue May 24 15:01:49 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 24 May 2011 15:01:49 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <20110524143242.0774326c@pitrou.net> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> <irg6u7$cu$1@dough.gmane.org> <20110524143242.0774326c@pitrou.net> Message-ID: <irga7u$n0o$1@dough.gmane.org> Antoine Pitrou, 24.05.2011 14:32: > On Tue, 24 May 2011 14:05:26 +0200 Stefan Behnel wrote: >> >> I doubt that efficient CPU cache usage was a major design goal of PyPy >> right from the start. IMHO, the project has changed its objectives way too >> many times to claim something like that, especially at the low level where >> the CPU cache becomes relevant. I remember that not so long ago, PyPy was >> hugely memory hungry compared to CPython. Although, one could certainly >> call *that* "designed for CPU cache usage"... ;) > > Well, to be honest, "hugely memory hungry" doesn't necessarily mean > cache-averse. It depends on the locality of memory access patterns. Sure. AFAIR (and Maciej is certainly the right person to prove me wrong), the problem at the time was that the overall memory footprint of objects was too high. That, at least, speaks against efficient cache usage and makes it more likely to result in cache thrashing. In any case, we're talking about a historical problem they already fixed.
Stefan From ncoghlan at gmail.com Tue May 24 16:33:07 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 May 2011 00:33:07 +1000 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <irg6u7$cu$1@dough.gmane.org> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> <irg6u7$cu$1@dough.gmane.org> Message-ID: <BANLkTi=VH=7k5NX67ekZsaEfACgV3x96Ew@mail.gmail.com> On Tue, May 24, 2011 at 10:05 PM, Stefan Behnel <stefan_ml at behnel.de> wrote: > Maciej Fijalkowski, 24.05.2011 13:31: >> >> CPython was not designed for CPU cache usage as far as I'm aware. > > That's a pretty bold statement to make on this list. Even if it wasn't > originally "designed" for (efficient?) CPU cache usage, it's certainly been > around for long enough to have received numerous performance tweaks in that > regard. As a statement of Guido's original intent, I'd side with Maciej (Guido has made it pretty clear that he subscribes to the "first, make it work, and only worry about making it faster if that first approach isn't good enough" school of thought). Various *parts* of CPython, on the other hand, have indeed been optimised over the years to be quite aware of potential low level CPU and RAM effects (e.g. dicts, sorting, the small object allocator). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From artur.siekielski at gmail.com Tue May 24 17:39:06 2011 From: artur.siekielski at gmail.com (Artur Siekielski) Date: Tue, 24 May 2011 17:39:06 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <4DDB9FAE.5060205@molden.no> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> <4DDAE0C7.9040501@molden.no> <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com> <4DDB9FAE.5060205@molden.no> Message-ID: <BANLkTimog4wHqXQ-O+tRKcKoxZDvj=n52A@mail.gmail.com> 2011/5/24 Sturla Molden <sturla at molden.no>: > On 24.05.2011 11:55, Artur Siekielski wrote: >> >> PYRO/multiprocessing proxies aren't a comparable solution because of >> ORDERS OF MAGNITUDE worse performance. You compare here direct memory >> access vs serialization/message passing through sockets/pipes. > The bottleneck is likely the serialization, but only if you serialize large > objects. IPC is always very fast, at least on localhost. It cannot be "fast" compared to direct memory access. Here is a benchmark: summing numbers in a small list in a child process using multiprocessing "manager": http://dpaste.org/QzKr/ , and using implicit copy of the structure after fork(): http://dpaste.org/q3eh/. The first is 200 TIMES SLOWER. It means if the work finishes in 20 seconds using fork(), the same work will require more than one hour using multiprocessing manager. > If a database is too slow, I am rather sure you need > something else than Python as well. Disk access is about 1000x slower than memory access in C, and Python in a worst case is 50x slower than C, so there is still a huge win (not to mention that in a common case Python is only a few times slower).
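The linked pastes are not reproduced in this archive, so the following is only an assumed sketch of the fork() variant described above, not the original benchmark. It relies on POSIX os.fork() (so it will not run on Windows), and the function name sum_in_child is made up. The child reads the parent's list through copy-on-write memory without any serialization and sends back only the result. As a sanity check on the quoted figures: 20 s x 200 = 4000 s, which is about 67 minutes.

```python
import os


def sum_in_child(numbers):
    """Sum `numbers` in a forked child (POSIX only, illustrative).

    The list itself is never pickled: the child sees the parent's
    memory via copy-on-write. Only the result crosses the pipe.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: compute, report, and exit without returning.
        os.close(read_fd)
        try:
            os.write(write_fd, str(sum(numbers)).encode("ascii"))
        finally:
            os._exit(0)
    # Parent process: read the result and reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        payload = pipe.read()
    os.waitpid(pid, 0)
    return int(payload)


total = sum_in_child(list(range(1000)))
```

Even this read-only traversal touches the reference counts of the list and its items in the child, which is the page-copying effect the thread is about.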
Artur From tjreedy at udel.edu Tue May 24 17:44:39 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 May 2011 11:44:39 -0400 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <4DDBA3D7.9060807@molden.no> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> <4DDBA3D7.9060807@molden.no> Message-ID: <irgjp5$tv5$1@dough.gmane.org> On 5/24/2011 8:25 AM, Sturla Molden wrote: > Artur Siekielski is not talking about cache locality, but copy-on-write > fork on Linux et al. > > When reference counts are updated after forking, memory pages marked > copy-on-write are copied if they store reference counts. And then he > quickly runs out of memory. He wants to put reference counts and > PyObjects in different pages, so only the pages with reference counts > get copied. > > I don't think he cares about cache locality at all, but the rest of us > do :-) It seems clear that separating reference counts from objects satisfies a specialized need and should be done in a special, patched version of CPython rather than the general distribution. -- Terry Jan Reedy From victor.stinner at haypocalc.com Tue May 24 18:06:15 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 24 May 2011 18:06:15 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl In-Reply-To: <4DDBCE7C.6090200@udel.edu> References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> Message-ID: <1306253175.13660.18.camel@marge> On Tuesday 24 May 2011 at 11:27 -0400, Terry Reedy wrote: > > > > +.. function:: RAND_bytes(num) > > + > > + Returns *num* cryptographically strong pseudo-random bytes. > > + > > + .. versionadded:: 3.3 > > + > > +..
function:: RAND_pseudo_bytes(num) > > + > > + Returns (bytes, is_cryptographic): bytes are *num* pseudo-random bytes, > > + is_cryptographic is True if the bytes generated are cryptographically > > + strong. > > + > > + .. versionadded:: 3.3 > > I am curious what 'cryptographically strong' means, what the real > difference is between the above two functions, and how these do not > duplicate what is in random.random. An important feature of a CPRNG (cryptographic pseudo-random number generator) is that even if you know all of its output, you cannot rebuild its internal state to guess the next (or maybe the previous) number. The CPRNG can for example hash its output using SHA-1: you will have to "break" the SHA-1 hash (maybe using "salt"). Another important feature is that even if you know the internal state, you will not be able to guess all previous and next numbers, because the internal state is regularly updated using an external source of entropy. Use RAND_add() to do that explicitly. We may add a link to Wikipedia: http://en.wikipedia.org/wiki/CPRNG Read the "Requirements" section, it's maybe more correct than my explanation: http://en.wikipedia.org/wiki/CPRNG#Requirements About the random module, it must not be used to generate passwords or certificates, because it is easy to rebuild the internal state of a Mersenne Twister generator if you know the previous 624 numbers. Once you know the state, it's also easy to generate all next numbers. Seeding a Mersenne Twister PRNG doesn't help. See my Hasard project if you would like to learn more about PRNG ;-) We may also add a link from random to ssl.RAND_bytes() and ssl.RAND_pseudo_bytes().
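The difference can be made concrete with a short sketch. It assumes Python 3.3 or later built with ssl support, and it demonstrates only the second half of the argument: once the Mersenne Twister state is known, every later output of the random module is determined. Recovering that state from 624 observed outputs, as described above, is not implemented here; the state is simply copied with getstate() to stand in for a successful recovery.

```python
import random
import ssl

# A generator whose future output we want to predict.
victim = random.Random(12345)
state = victim.getstate()  # stands in for a state rebuilt from outputs
expected = [victim.getrandbits(32) for _ in range(5)]

# Anyone holding the state can replay the generator exactly.
clone = random.Random()
clone.setstate(state)
predicted = [clone.getrandbits(32) for _ in range(5)]

# For keys, passwords or tokens, ask OpenSSL's generator instead.
token = ssl.RAND_bytes(16)
```

ssl.RAND_bytes() offers no equivalent of getstate() or setstate(), which fits the requirement that its state should not be recoverable by a caller.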
https://bitbucket.org/haypo/hasard/ Victor From tjreedy at udel.edu Tue May 24 18:35:16 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 May 2011 12:35:16 -0400 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <4DDB84F2.40106@egenix.com> References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com> <1306227534.2619.34.camel@marge> <4DDB84F2.40106@egenix.com> Message-ID: <irgmo0$hrr$1@dough.gmane.org> On 5/24/2011 6:14 AM, M.-A. Lemburg wrote: > I have no idea why TextIOWrapper was added to the stdlib > instead of making StreamReaderWriter more capable, > since StreamReaderWriter had already been available in Python > since Python 1.6 (and this is being used by codecs.open()). As I understand it, you (and others) wrote codecs long ago and recently other people wrote the new i/o stack, which sometimes uses codecs, and when they needed to add a few details, they 'naturally' added them to the module they were working on and understood (and planned to rewrite in C) rather than to the older module that they maybe did not completely understand and which is only in Python. Then Victor comes along to do maintenance on some of the Asian codecs and discovers that he needs to make changes in two (or more?) places rather than one, which he naturally finds unsatisfactory. > Perhaps we should deprecate TextIOWrapper instead and > replace it with codecs.StreamReaderWriter ? ;-) I think we should separate two issues: removing internal implementation duplication and removing external API duplication. I should think that the former should not be too controversial. The latter, I know, is more contentious. One problem is that stdlib changes that perhaps 'should' have been made in 3.0/1 could not be discovered until the moratorium and greater focus on the stdlib. 
-- Terry Jan Reedy From tjreedy at udel.edu Tue May 24 18:39:20 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 May 2011 12:39:20 -0400 Subject: [Python-Dev] Stable buildbots update In-Reply-To: <BANLkTi=Vb7VkMAbDr-cfJyU5Vh56J6O+6A@mail.gmail.com> References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> <20110524115655.65030e15@pitrou.net> <BANLkTi=Vb7VkMAbDr-cfJyU5Vh56J6O+6A@mail.gmail.com> Message-ID: <irgmvj$hrr$2@dough.gmane.org> On 5/24/2011 6:27 AM, Nick Coghlan wrote: > On Tue, May 24, 2011 at 7:56 PM, Antoine Pitrou<solipsis at pitrou.net> wrote: >> Thank you very much! What a beautiful sight this is: >> http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable >> >> (until a sporadic failure comes up, that is) > > I could turn test_crashers back on if you like ;) No need. One xp (but not the other) and win7 turned red again. -- Terry Jan Reedy From debatem1 at gmail.com Tue May 24 19:09:07 2011 From: debatem1 at gmail.com (geremy condra) Date: Tue, 24 May 2011 10:09:07 -0700 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <irgjp5$tv5$1@dough.gmane.org> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> <4DDBA3D7.9060807@molden.no> <irgjp5$tv5$1@dough.gmane.org> Message-ID: <BANLkTikYC_eCkq7hMrJjjo=+KojTEAOvTg@mail.gmail.com> On Tue, May 24, 2011 at 8:44 AM, Terry Reedy <tjreedy at udel.edu> wrote: > On 5/24/2011 8:25 AM, Sturla Molden wrote: > >> Artur Siekielski is not talking about cache locality, but copy-on-write >> fork on Linux et al. >> >> When reference counts are updated after forking, memory pages marked >> copy-on-write are copied if they store reference counts. And then he >> quickly runs out of memory. 
He wants to put reference counts and >> PyObjects in different pages, so only the pages with reference counts >> get copied. >> >> I don't think he cares about cache locality at all, but the rest of us >> do :-) > > It seems clear that separating reference counts from objects satisfies a > specialized need and should be done in a spedial, patched version of CPython > rather than the general distribution. I'm not sure I agree, especially given that the classical answer to GIL woes has been to tell people to fork() themselves. There has to be a lot of code out there that would benefit from this. Geremy Condra From g.brandl at gmx.net Tue May 24 19:28:33 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 24 May 2011 19:28:33 +0200 Subject: [Python-Dev] cpython: move specialized dir implementations into __dir__ methods (closes #12166) In-Reply-To: <E1QOuAE-0002zH-7a@dinsdale.python.org> References: <E1QOuAE-0002zH-7a@dinsdale.python.org> Message-ID: <irgprk$8sc$1@dough.gmane.org> On 24.05.2011 18:08, benjamin.peterson wrote: > http://hg.python.org/cpython/rev/8f403199f999 > changeset: 70331:8f403199f999 > user: Benjamin Peterson <benjamin at python.org> > date: Tue May 24 11:09:06 2011 -0500 > summary: > move specialized dir implementations into __dir__ methods (closes #12166) > +static PyMethodDef module_methods[] = { > + {"__dir__", module_dir, METH_NOARGS, > + PyDoc_STR("__dir__() -> specialized dir() implementation")}, > + {0} > +}; > static PyMethodDef type_methods[] = { > {"mro", (PyCFunction)mro_external, METH_NOARGS, > PyDoc_STR("mro() -> list\nreturn a type's method resolution order")}, > @@ -2585,6 +2661,8 @@ > PyDoc_STR("__instancecheck__() -> check if an object is an instance")}, > {"__subclasscheck__", type___subclasscheck__, METH_O, > PyDoc_STR("__subclasscheck__() -> check if a class is a subclass")}, > + {"__dir__", type_dir, METH_NOARGS, > + PyDoc_STR("__dir__() -> specialized __dir__ implementation for types")}, > static PyMethodDef 
object_methods[] = { > {"__reduce_ex__", object_reduce_ex, METH_VARARGS, > PyDoc_STR("helper for pickle")}, > @@ -3449,6 +3574,8 @@ > PyDoc_STR("default object formatter")}, > {"__sizeof__", object_sizeof, METH_NOARGS, > PyDoc_STR("__sizeof__() -> size of object in memory, in bytes")}, > + {"__dir__", object_dir, METH_NOARGS, > + PyDoc_STR("__dir__() -> default dir() implementation")}, This is interesting: I though we use "->" to specify the return value (or its type). __instancecheck__ and __subclasscheck__ set a different precedent, while __sizeof__ follows. I didn't look at the files to check for other examples. Georg From benjamin at python.org Tue May 24 19:39:57 2011 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 24 May 2011 12:39:57 -0500 Subject: [Python-Dev] cpython: move specialized dir implementations into __dir__ methods (closes #12166) In-Reply-To: <irgprk$8sc$1@dough.gmane.org> References: <E1QOuAE-0002zH-7a@dinsdale.python.org> <irgprk$8sc$1@dough.gmane.org> Message-ID: <BANLkTikjCq4R9PpLxzjiWx_QBE5ga+1Uuw@mail.gmail.com> 2011/5/24 Georg Brandl <g.brandl at gmx.net>: > On 24.05.2011 18:08, benjamin.peterson wrote: >> http://hg.python.org/cpython/rev/8f403199f999 >> changeset: ? 70331:8f403199f999 >> user: ? ? ? ?Benjamin Peterson <benjamin at python.org> >> date: ? ? ? ?Tue May 24 11:09:06 2011 -0500 >> summary: >> ? move specialized dir implementations into __dir__ methods (closes #12166) > >> +static PyMethodDef module_methods[] = { >> + ? ?{"__dir__", module_dir, METH_NOARGS, >> + ? ? PyDoc_STR("__dir__() -> specialized dir() implementation")}, >> + ? ?{0} >> +}; > >> ?static PyMethodDef type_methods[] = { >> ? ? ?{"mro", (PyCFunction)mro_external, METH_NOARGS, >> ? ? ? PyDoc_STR("mro() -> list\nreturn a type's method resolution order")}, >> @@ -2585,6 +2661,8 @@ >> ? ? ? PyDoc_STR("__instancecheck__() -> check if an object is an instance")}, >> ? ? ?{"__subclasscheck__", type___subclasscheck__, METH_O, >> ? ? ? 
PyDoc_STR("__subclasscheck__() -> check if a class is a subclass")}, >> + ? ?{"__dir__", type_dir, METH_NOARGS, >> + ? ? PyDoc_STR("__dir__() -> specialized __dir__ implementation for types")}, > >> ?static PyMethodDef object_methods[] = { >> ? ? ?{"__reduce_ex__", object_reduce_ex, METH_VARARGS, >> ? ? ? PyDoc_STR("helper for pickle")}, >> @@ -3449,6 +3574,8 @@ >> ? ? ? PyDoc_STR("default object formatter")}, >> ? ? ?{"__sizeof__", object_sizeof, METH_NOARGS, >> ? ? ? PyDoc_STR("__sizeof__() -> size of object in memory, in bytes")}, >> + ? ?{"__dir__", object_dir, METH_NOARGS, >> + ? ? PyDoc_STR("__dir__() -> default dir() implementation")}, > > This is interesting: I though we use "->" to specify the return value (or > its type). ?__instancecheck__ and __subclasscheck__ set a different > precedent, while __sizeof__ follows. Yes, I was wondering about that, so I just picked one. :) "->" seems to be better for return values, though, given the resemblance to annotations. -- Regards, Benjamin From cesare.di.mauro at gmail.com Tue May 24 19:40:47 2011 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Tue, 24 May 2011 19:40:47 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <irg6u7$cu$1@dough.gmane.org> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com> <irg6u7$cu$1@dough.gmane.org> Message-ID: <BANLkTimwQPCgDaOdNV0An4k59OrtQUFeuQ@mail.gmail.com> 2011/5/24 Stefan Behnel <stefan_ml at behnel.de> > Maciej Fijalkowski, 24.05.2011 13:31: > > CPython was not designed for CPU cache usage as far as I'm aware. >> > > That's a pretty bold statement to make on this list. Even if it wasn't > originally "designed" for (efficient?) CPU cache usage, it's certainly been > around for long enough to have received numerous performance tweaks in that > regard. > > Stefan Maybe a change on memory allocation granularity can help here. 
Raising it to 16 and 32 bytes for 32-bit and 64-bit systems respectively guarantees that an access to ob_refcnt and/or ob_type will also bring into the cache line some other information for the same object, which is usually needed anyway (except for very simple objects, such as Py_None, Py_Ellipsis, etc.). Think about a long, a tuple, a list, a dictionary, etc.: all of them have some critical data after these fields that will most likely be accessed after a Py_INCREF or a type check. Regards, Cesare From janssen at parc.com Tue May 24 19:43:06 2011 From: janssen at parc.com (Bill Janssen) Date: Tue, 24 May 2011 10:43:06 PDT Subject: [Python-Dev] Stable buildbots update In-Reply-To: <nad-0412B2.22031323052011@news.gmane.org> References: <20110521163714.68c5384f@pitrou.net> <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com> <58834.1306112451@parc.com> <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com> <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com> <87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp> <nad-0412B2.22031323052011@news.gmane.org> Message-ID: <87174.1306258986@parc.com> Ned Deily <nad at acm.org> wrote: > In article <87zkmcalt8.fsf at uwakimon.sk.tsukuba.ac.jp>, > "Stephen J. Turnbull" <stephen at xemacs.org> wrote: > > Are you saying you expect Mac OS X 10.4 "Tiger" to go green once the > > bots update? If so, I'm impressed, and "thank you!" to all involved. > > Apple and MacPorts have long since washed their hands of that release. > > OS X 10.4 does have its quirks that make it challenging to get all of > the tests to run without a few corner-case failures but, besides the > buildbots, I still test regularly with 10.4 and occasionally build > there, too. And, FWIW, while top-of-trunk MacPorts may not officially > support 10.4, many ports work there just fine including python2.6, 2.7, > and 3.1. 
(3.2 has a build issue that may get fixed in 3.2.1). Perhaps more importantly, parc-leopard-1 and parc-tiger-1 are two of the very few usually-connected buildbots we have running on big-endian architectures, along with loewis-sun (I *think* Solaris-10 on SPARC is still big-endian). Bill From tjreedy at udel.edu Tue May 24 19:52:59 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 May 2011 13:52:59 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl In-Reply-To: <1306253175.13660.18.camel@marge> References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> Message-ID: <irgr9o$k1l$1@dough.gmane.org> On 5/24/2011 12:06 PM, Victor Stinner wrote: > Le mardi 24 mai 2011 ? 11:27 -0400, Terry Reedy a ?crit : >>> >>> +.. function:: RAND_bytes(num) >>> + >>> + Returns *num* cryptographically strong pseudo-random bytes. >>> + >>> + .. versionadded:: 3.3 >>> + >>> +.. function:: RAND_pseudo_bytes(num) >>> + >>> + Returns (bytes, is_cryptographic): bytes are *num* pseudo-random bytes, >>> + is_cryptographic is True if the bytes generated are cryptographically >>> + strong. >>> + >>> + .. versionadded:: 3.3 >> >> I am curious what 'cryptographically strong' means, what the real >> difference is between the above two functions, and how these do not >> duplicate what is in random.random. > > An important feature of a CPRNG (cryptographic pseudo-random number > generator) is that even if you know all of its output, you cannot > rebuild its internal state to guess next (or maybe previous number). The > CPRNG can for example hash its output using SHA-1: you will have to > "break" the SHA-1 hash (maybe using "salt"). So it is presumably slower. I still do not get RAND_pseudo_bytes, which somehow decides internally what to do. 
> Another important feature is that even if you know the internal state, > you will not be able to guess all previous and next numbers, because the > internal state is regulary updated using an external source of entropy. > Use RAND_add() to do that explicitly. > > We may add a link to Wikipedia: > http://en.wikipedia.org/wiki/CPRNG That would be helpful > > Read the "Requirements" section, it's maybe more correct than my > explanation: > http://en.wikipedia.org/wiki/CPRNG#Requirements > > About the random module, it must not be used to generate passwords or > certificates, because it is easy to rebuild the internal state of a > Mersenne Twister generator if you know the previous 624 numbers. Since > you know the state, it's also easy to generate all next numbers. Seed a > Mersenne Twister PRNG doesn't help. See my Hasard project if you would > like to learn more about PRNG ;-) > > We may also add a link from random to SSL.RAND_bytes() and > SSL.RAND_pseudo_bytes(). -- Terry Jan Reedy From sturla at molden.no Tue May 24 20:31:27 2011 From: sturla at molden.no (Sturla Molden) Date: Tue, 24 May 2011 20:31:27 +0200 Subject: [Python-Dev] CPython optimization: storing reference counters outside of objects In-Reply-To: <BANLkTimog4wHqXQ-O+tRKcKoxZDvj=n52A@mail.gmail.com> References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com> <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com> <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com> <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com> <4DDAE0C7.9040501@molden.no> <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com> <4DDB9FAE.5060205@molden.no> <BANLkTimog4wHqXQ-O+tRKcKoxZDvj=n52A@mail.gmail.com> Message-ID: <4DDBF97F.8010005@molden.no> Den 24.05.2011 17:39, skrev Artur Siekielski: > > Disk access is about 1000x slower than memory access in C, and Python > in a worst case is 50x slower than C, so there is still a huge win > (not to mention that in a common case Python is only a few times > slower). 
You can put databases in shared memory (e.g. Sqlite and BSDDB have options for this). On linux you can also mount /dev/shm as ramdisk. Also, why do you distrust the database developers of Oracle et al. not to do the suffient optimizations? Sturla From gzlist at googlemail.com Tue May 24 21:39:32 2011 From: gzlist at googlemail.com (Martin (gzlist)) Date: Tue, 24 May 2011 20:39:32 +0100 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306195729.605.27.camel@marge> References: <1306195729.605.27.camel@marge> Message-ID: <BANLkTine6bRSUemG_PBFy=_UNwmkd3C5bw@mail.gmail.com> On 24/05/2011, Victor Stinner <victor.stinner at haypocalc.com> wrote: > > In Python 2, codecs.open() is the best way to read and/or write files > using Unicode. But in Python 3, open() is preferred with its fast io > module. I would like to deprecate codecs.open() because it can be > replaced by open() and io.TextIOWrapper. I would like your opinion and > that's why I'm writing this email. There are some modules that try to stay compatible with Python 2 and 3 without a source translation step. Removing the codecs classes would mean they'd have to add a few more compatibility hacks, but could be done. As an aside, I'm still not sure how the io module should be used. Example, a simple task I've used StreamWriter classes for is to wrap stdout. If the stdout.encoding can't represent a character, using "replace" means you can write any unicode string without throwing a UnicodeEncodeError. With the io module, it seems you need to construct a new TextIOWrapper object, passing the attributes of the old one as parameters, and as soon as someone passes something that's not a TextIOWrapper (say, a StringIO object) your code breaks. Is the intention that code dealing with streams needs to be covered in isinstance checks in Python 3? 
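For illustration, the stdout case described above could be handled roughly as follows. This is a minimal sketch, not an official recipe: the helper name `replacing_writer` is hypothetical, and it assumes the stream exposes its byte stream as `.buffer` (as `sys.stdout` normally does in Python 3). Streams without a `.buffer`, such as a StringIO, are returned unchanged, which avoids an isinstance check.

```python
import io


def replacing_writer(stream):
    """Return a text stream that writes '?' for unencodable characters.

    Hypothetical helper: it duck-types on the ``buffer`` attribute instead
    of testing isinstance(stream, io.TextIOWrapper).
    """
    buffer = getattr(stream, "buffer", None)
    if buffer is None:
        # No underlying byte stream (e.g. io.StringIO): nothing is encoded,
        # so any str can already be written without UnicodeEncodeError.
        return stream
    # Push out text still queued in the old wrapper so output is not reordered.
    stream.flush()
    return io.TextIOWrapper(
        buffer,
        encoding=stream.encoding or "ascii",
        errors="replace",
        line_buffering=True,
    )
```

Typical use would be `out = replacing_writer(sys.stdout)` followed by `out.write(...)`. Both wrappers then share one buffer, so closing (or garbage-collecting) either of them closes it for both; keeping the new wrapper alive for the life of the program sidesteps that.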
Martin From srini605 at gmail.com Tue May 24 23:09:47 2011 From: srini605 at gmail.com (srinivasan munisamy) Date: Wed, 25 May 2011 02:39:47 +0530 Subject: [Python-Dev] [pyodbc] Setting values to SQL_* constants while creating a connection object Message-ID: <BANLkTik_CifwV_jqACqDaZHmE2Td07f+aA@mail.gmail.com> Hi, I would like to know how to set values for SQL_* constants while creating a db connection through the pyodbc module. For example, I am getting a connection object like below: In [27]: dbh1 = pyodbc.connect("DSN=<dsn>;UID=<uid>;PWD=<pwd>;DATABASE=<database>;APP=<app_name>") In [28]: dbh1.getinfo(pyodbc.SQL_DESCRIBE_PARAMETER) Out[28]: True I want to set this SQL_DESCRIBE_PARAMETER to false for this connection object. How could I do that? Please help me in figuring it out. Thanks, Srini From tjreedy at udel.edu Wed May 25 00:06:00 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 May 2011 18:06:00 -0400 Subject: [Python-Dev] [pyodbc] Setting values to SQL_* constants while creating a connection object In-Reply-To: <BANLkTik_CifwV_jqACqDaZHmE2Td07f+aA@mail.gmail.com> References: <BANLkTik_CifwV_jqACqDaZHmE2Td07f+aA@mail.gmail.com> Message-ID: <irha44$nd8$2@dough.gmane.org> On 5/24/2011 5:09 PM, srinivasan munisamy wrote: > Hi, > I would like to know how to set values for SQL_* constants Please direct Python use questions to python-list or other user discussion forums. Py-dev is for discussion of development of the next versions of Python. 
-- Terry Jan Reedy From ncoghlan at gmail.com Wed May 25 07:09:40 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 May 2011 15:09:40 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl In-Reply-To: <irgr9o$k1l$1@dough.gmane.org> References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> Message-ID: <BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com> On Wed, May 25, 2011 at 3:52 AM, Terry Reedy <tjreedy at udel.edu> wrote: > On 5/24/2011 12:06 PM, Victor Stinner wrote: >> An important feature of a CPRNG (cryptographic pseudo-random number >> generator) is that even if you know all of its output, you cannot >> rebuild its internal state to guess next (or maybe previous number). The >> CPRNG can for example hash its output using SHA-1: you will have to >> "break" the SHA-1 hash (maybe using "salt"). > > So it is presumably slower. I still do not get RAND_pseudo_bytes, which > somehow decides internally what to do. The more important feature here is that it is exposing *OpenSSL's* random number generation, rather than our own. A CPRNG isn't *necessarily* slower than a non-crypto one (particularly on systems with dedicated crypto hardware), but they can definitely fail to return data if there isn't enough entropy available in the pool (and the system has to have a usable entropy source in the first place). The RAND_bytes() documentation should probably make it clearer that unlike the random module and RAND_pseudo_bytes(), RAND_bytes() can *fail* (by raising SSLError) if it isn't in a position to provide the requested random data. The pseudo_bytes version just encapsulates a fallback technique that may be suitable in some circumstances: if crypto quality random data is not available, fall back on PRNG data instead of failing. 
It is most suitable for tasks like prototyping an algorithm in Python for later conversion to C, or similar tasks where it is desirable to use the OpenSSL PRNG over the one in the random module. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Wed May 25 07:13:44 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 May 2011 15:13:44 +1000 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (234021dcad93): sum=61 In-Reply-To: <E1QP4Tu-0002qe-D5@ap.vmr.nerim.net> References: <E1QP4Tu-0002qe-D5@ap.vmr.nerim.net> Message-ID: <BANLkTikqWv62fc-t1yfJJkcOY1vDXnWksQ@mail.gmail.com> On Wed, May 25, 2011 at 1:09 PM, <solipsis at pitrou.net> wrote: > results for 234021dcad93 on branch "default" > -------------------------------------------- > > test_packaging leaked [128, 128, 128] references, sum=384 Is there a new cache in packaging that regrtest needs to know about and either ignore or clear when checking reference counts? Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From petri at digip.org Wed May 25 07:59:26 2011 From: petri at digip.org (Petri Lehtinen) Date: Wed, 25 May 2011 08:59:26 +0300 Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl In-Reply-To: <irgr9o$k1l$1@dough.gmane.org> References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> Message-ID: <20110525055926.GA21500@colossus> Terry Reedy wrote: > On 5/24/2011 12:06 PM, Victor Stinner wrote: > >Le mardi 24 mai 2011 ? 11:27 -0400, Terry Reedy a ?crit : > >>> > >>>+.. function:: RAND_bytes(num) > >>>+ > >>>+ Returns *num* cryptographically strong pseudo-random bytes. > >>>+ > >>>+ .. versionadded:: 3.3 > >>>+ > >>>+.. 
function:: RAND_pseudo_bytes(num) > >>>+ > >>>+ Returns (bytes, is_cryptographic): bytes are *num* pseudo-random bytes, > >>>+ is_cryptographic is True if the bytes generated are cryptographically > >>>+ strong. > >>>+ > >>>+ .. versionadded:: 3.3 > >> > >>I am curious what 'cryptographically strong' means, what the real > >>difference is between the above two functions, and how these do not > >>duplicate what is in random.random. > > > >An important feature of a CPRNG (cryptographic pseudo-random number > >generator) is that even if you know all of its output, you cannot > >rebuild its internal state to guess next (or maybe previous number). The > >CPRNG can for example hash its output using SHA-1: you will have to > >"break" the SHA-1 hash (maybe using "salt"). > > So it is presumably slower. I still do not get RAND_pseudo_bytes, > which somehow decides internally what to do. According to the RAND_bytes manual page from OpenSSL: RAND_bytes() puts num cryptographically strong pseudo-random bytes into buf. An error occurs if the PRNG has not been seeded with enough randomness to ensure an unpredictable byte sequence. RAND_pseudo_bytes() puts num pseudo-random bytes into buf. Pseudo-random byte sequences generated by RAND_pseudo_bytes() will be unique if they are of sufficient length, but are not necessarily unpredictable. They can be used for non-cryptographic purposes and for certain purposes in cryptographic protocols, but usually not for key generation etc. And: RAND_bytes() returns 1 on success, 0 otherwise. The error code can be obtained by ERR_get_error(3). RAND_pseudo_bytes() returns 1 if the bytes generated are cryptographically strong, 0 otherwise. Both functions return -1 if they are not supported by the current RAND method. So it seems to me that RAND_bytes() either returns cryptographically strong data or fails (is it possible to detect the failure with the Python function? Should this be documented?). 
RAND_pseudo_bytes() always succeeds but does not necessarily generate cryptographically strong data. > > > Another important feature is that even if you know the internal state, > >you will not be able to guess all previous and next numbers, because the > >internal state is regulary updated using an external source of entropy. > >Use RAND_add() to do that explicitly. > > > >We may add a link to Wikipedia: > >http://en.wikipedia.org/wiki/CPRNG > > That would be helpful > > > >Read the "Requirements" section, it's maybe more correct than my > >explanation: > >http://en.wikipedia.org/wiki/CPRNG#Requirements > > > >About the random module, it must not be used to generate passwords or > >certificates, because it is easy to rebuild the internal state of a > >Mersenne Twister generator if you know the previous 624 numbers. Since > >you know the state, it's also easy to generate all next numbers. Seed a > >Mersenne Twister PRNG doesn't help. See my Hasard project if you would > >like to learn more about PRNG ;-) > > > >We may also add a link from random to SSL.RAND_bytes() and > >SSL.RAND_pseudo_bytes(). Obviously, the user needs to be familiar with the concept of "cryptographically strong randomness" to use these functions. Petri Lehtinen From sandro.tosi at gmail.com Wed May 25 10:24:23 2011 From: sandro.tosi at gmail.com (Sandro Tosi) Date: Wed, 25 May 2011 10:24:23 +0200 Subject: [Python-Dev] Extending os.chown() to accept user/group names Message-ID: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> Hi all, before opening an issue to track the request, I'd like to ask advice here about this: extend os.chown() to accept even user/group names instead of just uid and gid. On a Unix system, you can call chown command passing either id or names, so it seems (to me at least) natural to expect os.chown() to behave similarly; but that's not the case. 
I can see os module wants to be a thin wrapper around OS syscalls and chown(2) accepts only uid/gid as input, so what would be best: extend os.chown() or provide a chown() function in shutil module for this purpose? Thanks in advance, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From victor.stinner at haypocalc.com Wed May 25 11:10:54 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 25 May 2011 11:10:54 +0200 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (234021dcad93): sum=61 In-Reply-To: <BANLkTikqWv62fc-t1yfJJkcOY1vDXnWksQ@mail.gmail.com> References: <E1QP4Tu-0002qe-D5@ap.vmr.nerim.net> <BANLkTikqWv62fc-t1yfJJkcOY1vDXnWksQ@mail.gmail.com> Message-ID: <1306314654.6407.1.camel@marge> Le mercredi 25 mai 2011 ? 15:13 +1000, Nick Coghlan a ?crit : > On Wed, May 25, 2011 at 1:09 PM, <solipsis at pitrou.net> wrote: > > results for 234021dcad93 on branch "default" > > -------------------------------------------- > > > > test_packaging leaked [128, 128, 128] references, sum=384 > > Is there a new cache in packaging that regrtest needs to know about > and either ignore or clear when checking reference counts? See the issue http://bugs.python.org/issue12167 : Antoine listed tests leaking references, and I already fixed some of them. 
Victor From victor.stinner at haypocalc.com Wed May 25 11:29:17 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 25 May 2011 11:29:17 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl In-Reply-To: <BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com> References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> <BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com> Message-ID: <1306315757.6407.5.camel@marge> Le mercredi 25 mai 2011 ? 15:09 +1000, Nick Coghlan a ?crit : > The RAND_bytes() documentation should probably make it clearer that > unlike the random module and RAND_pseudo_bytes(), RAND_bytes() can > *fail* (by raising SSLError) if it isn't in a position to provide the > requested random data. According to the doc, both functions can fail, but it is more likely than RAND_bytes() fail. I disabled temporary Linux random devices to test RAND_bytes() error code: mv /dev/random /dev/random.xxx mv /dev/urandom /dev/urandom.xxx In this case, RAND_pseudo_bytes() generates non-cryptographic random numbers: it returns (random_bytes, False). I don't know how to test RAND_pseudo_bytes() error code. -- I patched test_ssl to test that RAND_bytes() raises an SSLError if there is not enough entropy, and I also improved the documentation to detail the error cases. Victor From mal at egenix.com Wed May 25 11:38:10 2011 From: mal at egenix.com (M.-A. 
Lemburg) Date: Wed, 25 May 2011 11:38:10 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <4DDB9E27.7040605@livinglogic.de> References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> <1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de> Message-ID: <4DDCCE02.7060105@egenix.com> Walter Dörwald wrote: > On 24.05.11 12:58, Victor Stinner wrote: >> On Tuesday 24 May 2011 at 12:42 +0200, Łukasz Langa wrote: >>> Message written by Walter Dörwald on 2011-05-24 at 12:16: >>> >>>>> I don't see which use case is not covered by TextIOWrapper. But I know >>>>> some cases which are not supported by StreamReader/StreamWriter. >>>> >>>> This could be partially fixed by implementing generic >>>> StreamReader/StreamWriter classes that reuse the incremental codecs, but >>>> I don't think that's worth it. >>> >>> Why not? >> >> We already have an implementation of this idea, it is called >> io.TextIOWrapper. > > Exactly. > > From another post by Victor: > >> As I wrote, codecs.open() is useful in Python 2. But I don't know any >> program or library using directly StreamReader or StreamWriter. > > So: implementing this is a lot of work, duplicates existing > functionality and is mostly unused. You are missing the point: we have StreamReader and StreamWriter APIs on codecs to allow each codec to implement more efficient ways of encoding and decoding streams. Examples of such optimizations are reading the stream in chunks that can be decoded in one piece, or writing to the stream in a way that doesn't generate encoding state problems on the receiving end by ending transmission half-way through a shift block. Of course, you won't find many direct uses of these APIs, since most of the time, applications will simply use codecs.open() to automatically benefit from these optimizations. 
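The stateful decoding of chunked input mentioned above can be sketched as follows. This is only an illustration: `decode_chunks` is a hypothetical helper, and it uses the generic incremental codec API (`codecs.getincrementaldecoder()`) rather than a codec-specific StreamReader, so it shows the state handling but none of the per-codec optimizations argued for here.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Yield decoded text for an iterable of byte chunks.

    The decoder object keeps incomplete multi-byte sequences between calls,
    so a character split across two chunks is decoded correctly.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True makes the decoder raise if bytes are still pending,
    # i.e. the input ended in the middle of a character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

For example, the two UTF-8 bytes of "é" may arrive in separate chunks; the first chunk then yields only the complete characters before it, and the second chunk completes the character.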
OTOH, TextIOWrapper doesn't know anything about specific encodings and thus does not allow for such optimizations to be implemented by codecs. We don't have many such specialized implementations in the stdlib, but this doesn't mean that there's no use for them. It just means that developers and users are simply unaware of the possibilities opened by these stateful stream APIs. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 25 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2011-05-23: Released eGenix mx Base 3.2.0 http://python.egenix.com/ 2011-05-25: Released mxODBC 3.1.1 http://python.egenix.com/ 2011-06-20: EuroPython 2011, Florence, Italy 26 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From victor.stinner at haypocalc.com Wed May 25 11:39:52 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 25 May 2011 11:39:52 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl In-Reply-To: <20110525055926.GA21500@colossus> References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> <20110525055926.GA21500@colossus> Message-ID: <1306316392.6407.14.camel@marge> Le mercredi 25 mai 2011 ? 
08:59 +0300, Petri Lehtinen a écrit :
> So it seems to me that RAND_bytes() either returns cryptographically
> strong data or fails (is it possible to detect the failure with the
> Python function? Should this be documented?).

RAND_bytes() raises an SSLError on error. You can check whether there is enough entropy before calling RAND_bytes() by using RAND_status(). I documented both of these points.

> RAND_pseudo_bytes() always succeeds...

No, it can fail if the RAND method was changed and the current RAND method doesn't support this operation.

Example:
----
>>> import ctypes
>>> from ctypes import c_void_p
>>> libssl=ctypes.cdll.LoadLibrary('libssl.so')
>>> RAND_set_rand_method=libssl.RAND_set_rand_method
>>> class rand_meth_st(ctypes.Structure):
...     _fields_ = (('seed', c_void_p),
...                 ('bytes', c_void_p),
...                 ('cleanup', c_void_p),
...                 ('add', c_void_p),
...                 ('pseudorand', c_void_p),
...                 ('status', c_void_p))
...
>>> not_supported = rand_meth_st()
>>> RAND_set_rand_method(ctypes.byref(not_supported))
>>> import ssl
>>> ssl.RAND_bytes(1)
...
ssl.SSLError: [Errno 0] None
>>> ssl.RAND_pseudo_bytes(1)
...
ssl.SSLError: [Errno 0] None
------

Cool, ssl.RAND_pseudo_bytes() also raises an error, as expected :-)

> ... but does not necessarily generate cryptographically
> strong data.

Yes, if the PRNG was not seeded with enough data, the RAND_pseudo_bytes() Python function returns (random_bytes, False).

> > >We may also add a link from random to SSL.RAND_bytes() and
> > >SSL.RAND_pseudo_bytes().
>
> Obviously, the user needs to be familiar with the concept of
> "cryptographically strong randomness" to use these functions.

I already patched the doc of the random module to add a security warning. Well, you don't really need to know how a CSPRNG is implemented, just that random cannot be used for security and that ssl.RAND_bytes() raises an error if it was not seeded with enough data.

Tell me if my warning is not clear:

.. warning::

   The generators of the :mod:`random` module should not be used for
   security purposes, they are not cryptographic. Use ssl.RAND_bytes()
   if you require a cryptographically secure pseudorandom number
   generator.

Victor

From petri at digip.org Wed May 25 12:20:12 2011
From: petri at digip.org (Petri Lehtinen)
Date: Wed, 25 May 2011 13:20:12 +0300
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <1306316392.6407.14.camel@marge>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> <20110525055926.GA21500@colossus> <1306316392.6407.14.camel@marge>
Message-ID: <20110525102012.GD10448@colossus>

Victor Stinner wrote:
> I already patched the doc of the random module to add a security
> warning. Well, you don't really need to know how a CSPRNG is
> implemented, just that random cannot be used for security and that
> ssl.RAND_bytes() raises an error if it was not seeded with enough data.
>
> Tell me if my warning is not clear:
>
> .. warning::
>
>    The generators of the :mod:`random` module should not be used for
>    security purposes, they are not cryptographic. Use ssl.RAND_bytes()
>    if you require a cryptographically secure pseudorandom number
>    generator.

Looks good to me. Regarding style, you should probably make a link, like :func:`ssl.RAND_bytes()`.
Petri

From eric at trueblade.com Wed May 25 12:54:22 2011
From: eric at trueblade.com (Eric Smith)
Date: Wed, 25 May 2011 06:54:22 -0400 (EDT)
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <20110525102012.GD10448@colossus>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> <20110525055926.GA21500@colossus> <1306316392.6407.14.camel@marge> <20110525102012.GD10448@colossus>
Message-ID: <6cb8bc01c5c8812a57662243fd39af1b.squirrel@mail.trueblade.com>

> Victor Stinner wrote:
>> I already patched the doc of the random module to add a security
>> warning. Well, you don't really need to know how a CSPRNG is
>> implemented, just that random cannot be used for security and that
>> ssl.RAND_bytes() raises an error if it was not seeded with enough data.
>>
>> Tell me if my warning is not clear:
>>
>> .. warning::
>>
>>    The generators of the :mod:`random` module should not be used for
>>    security purposes, they are not cryptographic. Use ssl.RAND_bytes()
>>    if you require a cryptographically secure pseudorandom number
>>    generator.
>
> Looks good to me. Regarding style, you should probably make a link,
> like :func:`ssl.RAND_bytes()`.

Does "are not cryptographic" have any meaning? (I'm not an expert, just not sure). Should it not be "cryptographically secure", to match the next sentence?

Eric.

From petri at digip.org Wed May 25 12:58:52 2011
From: petri at digip.org (Petri Lehtinen)
Date: Wed, 25 May 2011 13:58:52 +0300
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <6cb8bc01c5c8812a57662243fd39af1b.squirrel@mail.trueblade.com>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> <20110525055926.GA21500@colossus> <1306316392.6407.14.camel@marge> <20110525102012.GD10448@colossus> <6cb8bc01c5c8812a57662243fd39af1b.squirrel@mail.trueblade.com>
Message-ID: <20110525105852.GE10448@colossus>

Eric Smith wrote:
> > Victor Stinner wrote:
> >> I already patched the doc of the random module to add a security
> >> warning. Well, you don't really need to know how a CSPRNG is
> >> implemented, just that random cannot be used for security and that
> >> ssl.RAND_bytes() raises an error if it was not seeded with enough data.
> >>
> >> Tell me if my warning is not clear:
> >>
> >> .. warning::
> >>
> >>    The generators of the :mod:`random` module should not be used for
> >>    security purposes, they are not cryptographic. Use ssl.RAND_bytes()
> >>    if you require a cryptographically secure pseudorandom number
> >>    generator.
> >
> > Looks good to me. Regarding style, you should probably make a link,
> > like :func:`ssl.RAND_bytes()`.
>
> Does "are not cryptographic" have any meaning? (I'm not an expert, just
> not sure). Should it not be "cryptographically secure", to match the next
> sentence?

Or just remove ", they are not cryptographic" altogether?
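The check-then-generate pattern discussed in this thread can be sketched roughly as follows. This is only an illustration, assuming a Python build whose ssl module provides RAND_status() and RAND_bytes(); secure_token is a hypothetical helper name and is not part of the ssl module.

```python
import ssl


def secure_token(nbytes=16):
    """Return nbytes of cryptographically strong random data.

    Hypothetical helper for illustration; not part of the ssl module.
    """
    # RAND_status() reports whether OpenSSL's PRNG has been seeded
    # with enough data.
    if not ssl.RAND_status():
        raise RuntimeError("OpenSSL PRNG is not seeded with enough data")
    # RAND_bytes() raises ssl.SSLError if the operation fails.
    return ssl.RAND_bytes(nbytes)


token = secure_token(16)
print(len(token))
```

The explicit RAND_status() call is optional, since RAND_bytes() reports failure through ssl.SSLError anyway; it is shown here only because the thread mentions it as a way to test for entropy beforehand.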
From victor.stinner at haypocalc.com Wed May 25 13:10:51 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 13:10:51 +0200
Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader
In-Reply-To: <4DDCCE02.7060105@egenix.com>
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> <1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de> <4DDCCE02.7060105@egenix.com>
Message-ID: <1306321851.6407.49.camel@marge>

On Wednesday 25 May 2011 at 11:38 +0200, M.-A. Lemburg wrote:
> You are missing the point: we have StreamReader and StreamWriter APIs
> on codecs to allow each codec to implement more efficient ways of
> encoding and decoding streams.
>
> Examples of such optimizations are reading the stream in
> chunks that can be decoded in one piece, or writing to the stream
> in a way that doesn't generate encoding state problems on the
> receiving end by ending transmission half-way through a
> shift block.
>
> ...
>
> We don't have many such specialized implementations in the stdlib,
> but this doesn't mean that there's no use for them. It
> just means that developers and users are simply unaware of the
> possibilities opened by these stateful stream APIs.

Does at least one codec implement such an optimization in its StreamReader or StreamWriter class? And can't we implement such optimizations in incremental encoders and decoders (or in TextIOWrapper)?

I checked all multibyte codecs (UTF and CJK codecs) and I don't see any such optimization. UTF codecs handle the BOM, but don't have anything looking like an optimization. CJK codecs use multibytecodec, MultibyteStreamReader and MultibyteStreamWriter, which don't look to be optimized. But maybe I missed something?

TextIOWrapper has an advanced buffer algorithm to prefetch (readahead) some bytes at each read to speed up small reads. Implementing such an algorithm is difficult, but it's done and it works.

--

Ok, let's stop talking about theoretical optimizations, and let's do a benchmark to compare the codecs and io modules on reading files!

I tested Python 3.3 (70370:178d367c9733) compiled in release mode (gcc -O3) on a Pentium4 @ 3 GHz with 2 GB of memory. I tuned the number of loops manually to ensure that the faster test takes at least one second. I only ran my benchmark once. See the attached bench.py file.

(1) Decode Objects/unicodeobject.c (317336 characters) from utf-8

test_io.readline(): 89.6 ms
test_codecs.readline(): 1272.8 ms
-> codecs 1320% slower than io

test_io.read(1): 1728.9 ms
test_codecs.read(1): 36395.0 ms
-> codecs 2005% slower than io

test_io.read(100): 460.7 ms
test_codecs.read(100): 3897.0 ms
-> codecs 746% slower than io

test_io.read(-1): 1911.7 ms
test_codecs.read(-1): 1740.7 ms
-> codecs 10% FASTER than io

(2) Decode README (6613 characters) from ascii

test_io.readline(): 109.9 ms
test_codecs.readline(): 1023.8 ms
-> codecs 832% slower than io

test_io.read(1): 1560.4 ms
test_codecs.read(1): 29402.6 ms
-> codecs 1784% slower than io

test_io.read(100): 866.9 ms
test_codecs.read(100): 3699.5 ms
-> codecs 327% slower than io

test_io.read(-1): 5140.2 ms
test_codecs.read(-1): 4817.9 ms
-> codecs 7% FASTER than io

(3) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from gb18030

test_io.readline(): 1193.7 ms
test_codecs.readline(): 1474.3 ms
-> codecs 24% slower than io

test_io.read(1): 3847.7 ms
test_codecs.read(1): 27103.9 ms
-> codecs 604% slower than io

test_io.read(100): 12839.5 ms
test_codecs.read(100): 13444.2 ms
-> codecs 5% slower than io

test_io.read(-1): 2183.3 ms
test_codecs.read(-1): 1906.1 ms
-> codecs 15% FASTER than io

The readahead code really does help read(1): io is between 6 and 20 times faster than codecs. It also helps a more common use case, readline(): io is between 1.2 and 13 times faster than codecs.
codecs is always faster (between 1.07 and 1.15 times faster than io) to read the whole content of a file using read(-1). Something should maybe be optimized in TextIOWrapper.read() ;-) But the gain is minor if you compare it to the gain on read(1) and readline()!

Please check my bench.py script and redo the benchmark on your own computer!

Victor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench.py
Type: text/x-python
Size: 1867 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110525/dadd9dd4/attachment.py>

From ncoghlan at gmail.com Wed May 25 14:44:15 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 25 May 2011 22:44:15 +1000
Subject: [Python-Dev] [Python-checkins] Daily reference leaks (234021dcad93): sum=61
In-Reply-To: <1306314654.6407.1.camel@marge>
References: <E1QP4Tu-0002qe-D5@ap.vmr.nerim.net> <BANLkTikqWv62fc-t1yfJJkcOY1vDXnWksQ@mail.gmail.com> <1306314654.6407.1.camel@marge>
Message-ID: <BANLkTintnRBGEyRErQ91-LYC3LOqdfvKBQ@mail.gmail.com>

On Wed, May 25, 2011 at 7:10 PM, Victor Stinner <victor.stinner at haypocalc.com> wrote:
> On Wednesday 25 May 2011 at 15:13 +1000, Nick Coghlan wrote:
>> On Wed, May 25, 2011 at 1:09 PM, <solipsis at pitrou.net> wrote:
>> > results for 234021dcad93 on branch "default"
>> > --------------------------------------------
>> >
>> > test_packaging leaked [128, 128, 128] references, sum=384
>>
>> Is there a new cache in packaging that regrtest needs to know about
>> and either ignore or clear when checking reference counts?
>
> See the issue http://bugs.python.org/issue12167 : Antoine listed tests
> leaking references, and I already fixed some of them.

Thanks for the issue link. I'd seen a few of these reports go by, so it's good to know that dealing with it is in progress.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From eric at trueblade.com Wed May 25 15:08:30 2011
From: eric at trueblade.com (Eric Smith)
Date: Wed, 25 May 2011 09:08:30 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <20110525105852.GE10448@colossus>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> <20110525055926.GA21500@colossus> <1306316392.6407.14.camel@marge> <20110525102012.GD10448@colossus> <6cb8bc01c5c8812a57662243fd39af1b.squirrel@mail.trueblade.com> <20110525105852.GE10448@colossus>
Message-ID: <4DDCFF4E.6030809@trueblade.com>

On 05/25/2011 06:58 AM, Petri Lehtinen wrote:
> Eric Smith wrote:
>>> Victor Stinner wrote:
>>>> I already patched the doc of the random module to add a security
>>>> warning. Well, you don't really need to know how a CSPRNG is
>>>> implemented, just that random cannot be used for security and that
>>>> ssl.RAND_bytes() raises an error if it was not seeded with enough data.
>>>>
>>>> Tell me if my warning is not clear:
>>>>
>>>> .. warning::
>>>>
>>>>    The generators of the :mod:`random` module should not be used for
>>>>    security purposes, they are not cryptographic. Use ssl.RAND_bytes()
>>>>    if you require a cryptographically secure pseudorandom number
>>>>    generator.
>>>
>>> Looks good to me. Regarding style, you should probably make a link,
>>> like :func:`ssl.RAND_bytes()`.
>>
>> Does "are not cryptographic" have any meaning? (I'm not an expert, just
>> not sure). Should it not be "cryptographically secure", to match the next
>> sentence?
>
> Or just remove ", they are not cryptographic" altogether?

Good call. That's a better change.

Eric.
From barry at python.org Wed May 25 15:41:46 2011 From: barry at python.org (Barry Warsaw) Date: Wed, 25 May 2011 09:41:46 -0400 Subject: [Python-Dev] Extending os.chown() to accept user/group names In-Reply-To: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> Message-ID: <20110525094146.4941b681@neurotica.wooz.org> On May 25, 2011, at 10:24 AM, Sandro Tosi wrote: >before opening an issue to track the request, I'd like to ask advice >here about this: extend os.chown() to accept even user/group names >instead of just uid and gid. > >On a Unix system, you can call chown command passing either id or >names, so it seems (to me at least) natural to expect os.chown() to >behave similarly; but that's not the case. > >I can see os module wants to be a thin wrapper around OS syscalls and >chown(2) accepts only uid/gid as input, so what would be best: extend >os.chown() or provide a chown() function in shutil module for this >purpose? I think it would be a nice feature, and I can see the conflict. OT1H you want to keep os.chown() a thin wrapper, but OTOH you'd rather not have to add a new, arguably more difficult to discover, function. Given those two choices, I still think I'd come down on adding a new function and shutil.chown() seems an appropriate place for it. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: <http://mail.python.org/pipermail/python-dev/attachments/20110525/fe45f151/attachment.pgp> From mal at egenix.com Wed May 25 15:43:55 2011 From: mal at egenix.com (M.-A. 
Lemburg) Date: Wed, 25 May 2011 15:43:55 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306321851.6407.49.camel@marge> References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> <1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de> <4DDCCE02.7060105@egenix.com> <1306321851.6407.49.camel@marge> Message-ID: <4DDD079B.7090906@egenix.com> Victor Stinner wrote: > Le mercredi 25 mai 2011 ? 11:38 +0200, M.-A. Lemburg a ?crit : >> You are missing the point: we have StreamReader and StreamWriter APIs >> on codecs to allow each codecs to implement more efficient ways of >> encoding and decoding streams. >> >> Examples of such optimizations are reading the stream in >> chunks that can be decoded in one piece, or writing to the stream >> in a way that doesn't generate encoding state problems on the >> receiving end by ending transmission half-way through a >> shift block. >> >> ... >> >> We don't have many such specialized implementations in the stdlib, >> but this doesn't mean that there's no use for them. It >> just means that developers and users are simply unaware of the >> possibilities opened by these stateful stream APIs. > > Does at least one codec implement such implementation in its > StreamReader or StreamWriter class? And can't we implement such > optimization in incremental encoders and decoders (or in TextIOWrapper)? I don't see how, since you need control over the file API methods in order to implement such optimizations. OTOH, adding lots of special cases to TextIOWrapper isn't a good either, since these optimizations would then only trigger for a small number of codecs and completely leave out 3rd party codecs. > I checked all multibyte codecs (UTF and CJK codecs) and I don't see any > of such optimization. UTF codecs handle the BOM, but don't have anything > looking like an optimization. 
CJK codecs use multibytecodec, > MultibyteStreamReader and MultibyteStreamWriter, which don't look to be > optimized. But I missed maybe something? No, you haven't missed such per-codec optimizations. The base classes implement general purpose support for reading from streams in chunks, but the support isn't optimized per codec. For UTF-16 it would e.g. make sense to always read data in blocks with even sizes, removing the trial-and-error decoding and extra buffering currently done by the base classes. For UTF-32, the blocks should have size % 4 == 0. For UTF-8 (and other variable length encodings) it would make sense looking at the end of the (bytes) data read from the stream to see whether a complete code point was read or not, rather than simply running the decoder on the complete data set, only to find that a few bytes at the end are missing. For single character encodings, it would make sense to prefetch data in big chunks and skip all the trial and error decoding implemented by the base classes to address the above problem with variable length encodings. Finally, all this could be implemented in C, reducing the Python call overhead dramatically. > TextIOWrapper has an advanced buffer algorithm to prefetch (readahead) > some bytes at each read to speed up small read. It is difficult to > implement such algorithm, but it's done and it works. > > -- > > Ok, let's stop to speak about theorical optimizations, and let's do a > benchmark to compare codecs and the io modules on reading files! That's somewhat unfair: TextIOWrapper is implemented in C, whereas the StreamReader/Writer subclasses used by the codecs are written in Python. A fair comparison would use the Python implementation of TextIOWrapper. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 25 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2011-05-23: Released eGenix mx Base 3.2.0 http://python.egenix.com/ 2011-05-25: Released mxODBC 3.1.1 http://python.egenix.com/ 2011-06-20: EuroPython 2011, Florence, Italy 26 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Wed May 25 15:58:57 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 25 May 2011 15:58:57 +0200 Subject: [Python-Dev] Extending os.chown() to accept user/group names References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <20110525094146.4941b681@neurotica.wooz.org> Message-ID: <20110525155857.4c4e87b7@pitrou.net> On Wed, 25 May 2011 09:41:46 -0400 Barry Warsaw <barry at python.org> wrote: > On May 25, 2011, at 10:24 AM, Sandro Tosi wrote: > > >before opening an issue to track the request, I'd like to ask advice > >here about this: extend os.chown() to accept even user/group names > >instead of just uid and gid. > > > >On a Unix system, you can call chown command passing either id or > >names, so it seems (to me at least) natural to expect os.chown() to > >behave similarly; but that's not the case. > > > >I can see os module wants to be a thin wrapper around OS syscalls and > >chown(2) accepts only uid/gid as input, so what would be best: extend > >os.chown() or provide a chown() function in shutil module for this > >purpose? > > I think it would be a nice feature, and I can see the conflict. OT1H you want > to keep os.chown() a thin wrapper, but OTOH you'd rather not have to add a > new, arguably more difficult to discover, function. 
Given those two choices,
> I still think I'd come down on adding a new function and shutil.chown() seems
> an appropriate place for it.

+1 for shutil.chown().

Regards

Antoine.

From dirkjan at ochtman.nl Wed May 25 16:15:32 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Wed, 25 May 2011 16:15:32 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <20110525094146.4941b681@neurotica.wooz.org>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <20110525094146.4941b681@neurotica.wooz.org>
Message-ID: <BANLkTinC6k=LH1Zs6b4dmYZZbt13XO--gw@mail.gmail.com>

On Wed, May 25, 2011 at 15:41, Barry Warsaw <barry at python.org> wrote:
> I think it would be a nice feature, and I can see the conflict. OT1H you want
> to keep os.chown() a thin wrapper, but OTOH you'd rather not have to add a
> new, arguably more difficult to discover, function. Given those two choices,
> I still think I'd come down on adding a new function and shutil.chown() seems
> an appropriate place for it.

Right. Please add a mention of shutil.chown() to the os.chown() docs, though.

Cheers,

Dirkjan

From barry at python.org Wed May 25 16:18:39 2011
From: barry at python.org (Barry Warsaw)
Date: Wed, 25 May 2011 10:18:39 -0400
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTinC6k=LH1Zs6b4dmYZZbt13XO--gw@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <20110525094146.4941b681@neurotica.wooz.org> <BANLkTinC6k=LH1Zs6b4dmYZZbt13XO--gw@mail.gmail.com>
Message-ID: <20110525101839.6dd65d9c@neurotica.wooz.org>

On May 25, 2011, at 04:15 PM, Dirkjan Ochtman wrote:

>Right. Please add a mention of shutil.chown() to the os.chown() docs, though.

Brilliant!

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110525/a2d91ff3/attachment.pgp>

From victor.stinner at haypocalc.com Wed May 25 17:48:11 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 17:48:11 +0200
Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader
In-Reply-To: <4DDD079B.7090906@egenix.com>
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> <1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de> <4DDCCE02.7060105@egenix.com> <1306321851.6407.49.camel@marge> <4DDD079B.7090906@egenix.com>
Message-ID: <1306338491.6407.74.camel@marge>

On Wednesday 25 May 2011 at 15:43 +0200, M.-A. Lemburg wrote:
> For UTF-16 it would e.g. make sense to always read data in blocks
> with even sizes, removing the trial-and-error decoding and extra
> buffering currently done by the base classes. For UTF-32, the
> blocks should have size % 4 == 0.
>
> For UTF-8 (and other variable length encodings) it would make
> sense looking at the end of the (bytes) data read from the
> stream to see whether a complete code point was read or not,
> rather than simply running the decoder on the complete data
> set, only to find that a few bytes at the end are missing.

I think that the readahead algorithm is much faster than trying to avoid partial input, and it's not a problem to have partial input if you use an incremental decoder.

> For single character encodings, it would make sense to prefetch
> data in big chunks and skip all the trial and error decoding
> implemented by the base classes to address the above problem
> with variable length encodings.

TextIOWrapper implements this optimization using its readahead algorithm.
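The point about partial input can be illustrated with a small sketch. It assumes nothing beyond the standard codecs module; decode_in_chunks and the sample string are names made up for this illustration. The chunk boundaries deliberately fall in the middle of multi-byte UTF-8 sequences, and the incremental decoder simply keeps the incomplete bytes until the next call.

```python
import codecs

# Sample text with 2-byte and 3-byte UTF-8 sequences (illustrative data).
sample = "h\u00e9llo w\u00f6rld \u20ac"
raw = sample.encode("utf-8")


def decode_in_chunks(data, encoding, chunk_size):
    """Decode data piece by piece with an incremental decoder.

    Hypothetical helper for illustration only.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Bytes of an incomplete character are buffered inside the
        # decoder and emitted once the rest of the sequence arrives.
        pieces.append(decoder.decode(chunk))
    # final=True makes the decoder raise if bytes are still pending.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


text = decode_in_chunks(raw, "utf-8", 3)
print(text == sample)
```

No trial-and-error decoding is needed on the caller's side here: the caller can hand over chunks of any size without checking where a code point ends.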
> That's somewhat unfair: TextIOWrapper is implemented in C,
> whereas the StreamReader/Writer subclasses used by the
> codecs are written in Python.
>
> A fair comparison would use the Python implementation of
> TextIOWrapper.

Do you mean that you would like to reimplement codecs in C? It is not relevant to compare codecs and _pyio, because codecs reuses BufferedReader (of the io module, not of the _pyio module), and io is the main I/O module of Python 3.

But well, as you want, here is a benchmark comparing:
   _pyio.TextIOWrapper(io.open(filename, 'rb'), encoding)
and
   codecs.open(filename, encoding)

The only change from my previous bench.py script is the test_io() function:

def test_io(test_func, chunk_size):
    with open(FILENAME, 'rb') as buffered:
        f = _pyio.TextIOWrapper(buffered, ENCODING)
        test_file(f, test_func, chunk_size)
        f.close()

(1) Decode Objects/unicodeobject.c (317336 characters) from utf-8

test_io.readline(): 1193.4 ms
test_codecs.readline(): 1267.9 ms
-> codecs 6% slower than io

test_io.read(1): 21696.4 ms
test_codecs.read(1): 36027.2 ms
-> codecs 66% slower than io

test_io.read(100): 3080.7 ms
test_codecs.read(100): 3901.7 ms
-> codecs 27% slower than io

test_io.read(): 3991.0 ms
test_codecs.read(): 1736.9 ms
-> codecs 130% FASTER than io

(2) Decode README (6613 characters) from ascii

test_io.readline(): 678.1 ms
test_codecs.readline(): 760.5 ms
-> codecs 12% slower than io

test_io.read(1): 13533.2 ms
test_codecs.read(1): 21900.0 ms
-> codecs 62% slower than io

test_io.read(100): 2663.1 ms
test_codecs.read(100): 3270.1 ms
-> codecs 23% slower than io

test_io.read(): 6769.1 ms
test_codecs.read(): 3919.6 ms
-> codecs 73% FASTER than io

(3) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from gb18030

test_io.readline(): 38.9 ms
test_codecs.readline(): 15.1 ms
-> codecs 157% FASTER than io

test_io.read(1): 369.8 ms
test_codecs.read(1): 302.2 ms
-> codecs 22% FASTER than io

test_io.read(100): 258.2 ms
test_codecs.read(100): 155.1 ms
-> codecs 67% FASTER than io

test_io.read(): 1803.2 ms
test_codecs.read(): 1002.9 ms
-> codecs 80% FASTER than io

_pyio.TextIOWrapper is faster than codecs.StreamReader for readline(), read(1) and read(100), with ASCII and UTF-8. It is slower for gb18030.

As in the io vs codecs benchmark, codecs.StreamReader is always faster than _pyio.TextIOWrapper for read().

Victor

From victor.stinner at haypocalc.com Wed May 25 18:04:17 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 18:04:17 +0200
Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader
In-Reply-To: <1306321851.6407.49.camel@marge>
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> <1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de> <4DDCCE02.7060105@egenix.com> <1306321851.6407.49.camel@marge>
Message-ID: <1306339457.20017.1.camel@marge>

On Wednesday 25 May 2011 at 13:10 +0200, Victor Stinner wrote:
> codecs is always faster (between 1.07 and 1.15 times faster than io) to
> read the whole content of a file using read(-1). Something should maybe be
> optimized in TextIOWrapper.read() ;-)

Oh, I understand: it may be that the universal newline mode of TextIOWrapper was enabled. If you disable it using open(..., newline='\n'), io and codecs run at the same speed to read the whole content of the file (f.read()).
Victor

From tjreedy at udel.edu Wed May 25 18:42:19 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 25 May 2011 12:42:19 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org> <BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com>
Message-ID: <irjbhe$gov$1@dough.gmane.org>

On 5/25/2011 1:09 AM, Nick Coghlan wrote:

> The more important feature here is that it is exposing *OpenSSL's*
> random number generation, rather than our own.

I agree, though from a different stance, I think. The issue is whether we should 'automatically' expose everything in a wrapped library, even if it duplicates existing functions. I think not. But in this case, at least one of the two functions is sufficiently different, and the newest doc patches clarify the situation.

--
Terry Jan Reedy

From neologix at free.fr Wed May 25 18:46:02 2011
From: neologix at free.fr (Charles-François Natali)
Date: Wed, 25 May 2011 18:46:02 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
Message-ID: <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>

While we're at it, adding a "recursive" argument to this shutil.chown could also be useful.
From solipsis at pitrou.net Wed May 25 18:55:13 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 25 May 2011 18:55:13 +0200
Subject: [Python-Dev] cpython: Fix closes issue #11109 - socketserver.ForkingMixIn leaves zombies, also fails
References: <E1QPGv0-0002dG-9Y@dinsdale.python.org>
Message-ID: <20110525185513.1cf2e252@pitrou.net>

On Wed, 25 May 2011 18:26:46 +0200 senthil.kumaran <python-checkins at python.org> wrote:
>
> A new method called service_action is made available in BaseServer, called by
> serve_forever loop. This useful in cases where Mixins can use it for cleanup
> action. ForkingMixin class uses service_action to collect the zombie child
> processes. Initial Patch by Justin Wark.

Is it reasonable, performance-wise, to do this at every iteration of the loop (that is, at every incoming connection)?

Regards

Antoine.

From victor.stinner at haypocalc.com Wed May 25 19:17:41 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 19:17:41 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
Message-ID: <1306343861.20117.4.camel@marge>

On Wednesday 25 May 2011 at 18:46 +0200, Charles-François Natali wrote:
> While we're at it, adding a "recursive" argument to this shutil.chown
> could also be useful.

I don't like the idea of a recursive flag. I would prefer a "map-like" function to "apply" a function on all files of a directory. Something like shutil.apply_recursive(shutil.chown)...

... maybe with options to choose between depth-first search and breadth-first search, filter (filenames, file size, files only, directories only, other attributes?), directory before files (may be needed for chmod(0o000)), etc.
Victor From eric at trueblade.com Wed May 25 19:37:26 2011 From: eric at trueblade.com (Eric Smith) Date: Wed, 25 May 2011 13:37:26 -0400 Subject: [Python-Dev] Extending os.chown() to accept user/group names In-Reply-To: <1306343861.20117.4.camel@marge> References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com> <1306343861.20117.4.camel@marge> Message-ID: <4DDD3E56.3050605@trueblade.com> On 5/25/2011 1:17 PM, Victor Stinner wrote: > On Wednesday 25 May 2011 at 18:46 +0200, Charles-François Natali wrote: >> While we're at it, adding a "recursive" argument to this shutil.chown >> could also be useful. > > I don't like the idea of a recursive flag. I would prefer a "map-like" > function to "apply" a function on all files of a directory. Something > like shutil.apply_recursive(shutil.chown)... > > ... maybe with options to choose between depth-first search and > breadth-first search, filter (filenames, file size, files only, > directories only, other attributes?), directory before files (may be > needed for chmod(0o000)), etc. You can do all of this with an appropriate application of os.walk(). Eric. From petri at digip.org Wed May 25 20:03:39 2011 From: petri at digip.org (Petri Lehtinen) Date: Wed, 25 May 2011 21:03:39 +0300 Subject: [Python-Dev] Extending os.chown() to accept user/group names In-Reply-To: <1306343861.20117.4.camel@marge> References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com> <1306343861.20117.4.camel@marge> Message-ID: <20110525180338.GA1718@ihaa> Victor Stinner wrote: > On Wednesday 25 May 2011 at 18:46 +0200, Charles-François Natali wrote: > > While we're at it, adding a "recursive" argument to this shutil.chown > > could also be useful. > > I don't like the idea of a recursive flag. I would prefer a "map-like" > function to "apply" a function on all files of a directory. Something > like shutil.apply_recursive(shutil.chown)...
FWIW, the chown program (in GNU coreutils at least) has a -R flag for recursive operation, and I've found it *extremely* useful in many situations. Petri From fuzzyman at voidspace.org.uk Wed May 25 20:41:26 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 25 May 2011 19:41:26 +0100 Subject: [Python-Dev] Python 3.3 release schedule posted In-Reply-To: <AANLkTinbgpQrPBU64OY_vD7QzmV5HGznr_uXNEEhMkcY@mail.gmail.com> References: <imdj8n$dq0$1@dough.gmane.org> <AANLkTi=9bedAp40CPHQG-fkPTHwqrkzJ6q9Dr6X7p_f7@mail.gmail.com> <AANLkTikt+PUE41h576o5oo+foT5o61RW=p=oEBF5qkzC@mail.gmail.com> <AANLkTinbgpQrPBU64OY_vD7QzmV5HGznr_uXNEEhMkcY@mail.gmail.com> Message-ID: <4DDD4D56.9020301@voidspace.org.uk> On 26/03/2011 00:33, Laurens Van Houtven wrote: > On Thu, Mar 24, 2011 at 12:18 AM, Thomas Wouters <thomas at python.org > <mailto:thomas at python.org>> wrote: > > It ended up that Jim Fulton is actually writing the PEP (with > input from Twisted people and others.) > > -- > Thomas Wouters <thomas at python.org <mailto:thomas at python.org>> > > Hi! I'm a .signature virus! copy me into your .signature file to > help me spread! > > > Well, if help is still needed I'll gladly chip in. It's not that I'm > not interested in doing it -- it's just that I don't know who's > supposed to or who's working on it :) > Hey lvh, It's worth following this up. If Jim Fulton hasn't had time to move this forward and you have the bandwidth to work on it then it would be great to see some action. All the best, Michael Foord > -- > cheers > lvh -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html From neologix at free.fr Wed May 25 20:45:24 2011 From: neologix at free.fr (Charles-François Natali) Date: Wed, 25 May 2011 20:45:24 +0200 Subject: [Python-Dev] Extending os.chown() to accept user/group names In-Reply-To: <1306343861.20117.4.camel@marge> References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com> <1306343861.20117.4.camel@marge> Message-ID: <BANLkTikHWFa+0K7kbRbk0suXDM=cELke-A@mail.gmail.com> >> While we're at it, adding a "recursive" argument to this shutil.chown >> could also be useful. > > I don't like the idea of a recursive flag. I would prefer a "map-like" > function to "apply" a function on all files of a directory. Something > like shutil.apply_recursive(shutil.chown)... > I was also thinking about this possibility. The advantage is that we could factor out the recursive walk logic to make it available for other functions (chown, chmod...). It doesn't map well to the Unix command, though. > You can do all of this with an appropriate application of os.walk(). Then, I wonder why shutil.copytree and shutil.rmtree are provided. Recursive rm/copy/chown/chmod are extremely useful in system administration scripts. Furthermore, it's not as simple as it seems because of symlinks, see for example http://bugs.python.org/issue4489 .
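[Editorial illustration, not part of the original thread.] To make the os.walk() suggestion and the symlink caveat above concrete, here is a minimal sketch of a recursive chown. The name chown_tree and its parameters are hypothetical and are not an existing shutil API. It assumes a POSIX platform where os.lchown is available, uses os.lchown so that a symlink itself is changed rather than its target, and relies on os.walk's default followlinks=False so that symlinked directories are never descended into.

```python
import os

def chown_tree(root, uid=-1, gid=-1, chown=os.lchown):
    """Hypothetical helper: apply chown to root and everything below it.

    A uid or gid of -1 leaves that id unchanged (same convention as
    os.chown). os.lchown acts on a symlink itself, never on its target.
    Returns the list of paths that were visited.
    """
    changed = []
    chown(root, uid, gid)
    changed.append(root)
    # followlinks defaults to False: symlinks to directories show up in
    # dirnames, but os.walk does not recurse into them.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            chown(path, uid, gid)
            changed.append(path)
    return changed
```

The chown parameter is only there so the walk logic can be reused (for example with a chmod-like callable) or exercised without privileges; real code would also need a policy for errors raised part-way through the tree.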
From neologix at free.fr Wed May 25 23:00:51 2011 From: neologix at free.fr (Charles-François Natali) Date: Wed, 25 May 2011 23:00:51 +0200 Subject: [Python-Dev] cpython: Fix closes issue #11109 - socketserver.ForkingMixIn leaves zombies, also fails In-Reply-To: <20110525185513.1cf2e252@pitrou.net> References: <E1QPGv0-0002dG-9Y@dinsdale.python.org> <20110525185513.1cf2e252@pitrou.net> Message-ID: <BANLkTimOV+gPXkoyxGUSSH=wW_CzrJgXDw@mail.gmail.com> >> A new method called service_action is made available in BaseServer, called by >> serve_forever loop. This useful in cases where Mixins can use it for cleanup >> action. ForkingMixin class uses service_action to collect the zombie child >> processes. Initial Patch by Justin Wark. > > Is it reasonable, performance-wise, to do this at every iteration of > the loop (that is, at every incoming connection)? > I haven't measured it, but it's O(N) where N is the number of children. It should be possible to optimize this by putting all the children in a process group (the other advantage is that we wouldn't wait() children not spawned by socketserver). cf From lac at openend.se Wed May 25 23:41:53 2011 From: lac at openend.se (Laura Creighton) Date: Wed, 25 May 2011 23:41:53 +0200 Subject: [Python-Dev] multibytecodex Message-ID: <201105252141.p4PLfr6T025372@theraft.openend.se> This just in from pypy-dev. I am reposting it here because I am fairly certain that nobody on the pypy-dev mailing list uses the multibytecodec, but there has got to be at least one person here who does. Please reply to the pypy-dev article, not here, or mail to pypy-dev at python.org if you are not on the pypy-dev mailing list (but have delivery turned off as many of you do.)
Thank you, Laura ------- Forwarded Message From: Armin Rigo <arigo at tunes.org> Date: Wed, 25 May 2011 21:39:35 +0200 To: pypy-dev at python.org Subject: [pypy-dev] multibytecodec: missing features Hi all, Here are the missing features in multibytecodec: * support for ``errors != "strict"''. * classes MultibyteIncrementalEncoder, MultibyteIncrementalDecoder, MultibyteStreamReader and MultibyteStreamWriter. One reason I didn't implement the classes yet is that I couldn't understand two points in how they are supposed to work. But it seems that there are really two bugs, as I've been pointed to: http://bugs.python.org/issue12100 and http://bugs.python.org/issue12171 . So the question is if we should be bug-compatible with Python 2.7 or if we should instead implement some fixed version. I suppose I'm rather for the fixed version, but I'd like to hear some feedback from people that actually use multibytecodecs. Also, I wouldn't mind if someone would pick up the work and just do it, either the classes or ``errors != "strict"'' :-) See you soon, Armin. ------- End of Forwarded Message From victor.stinner at haypocalc.com Thu May 26 00:13:42 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 26 May 2011 00:13:42 +0200 Subject: [Python-Dev] multibytecodex In-Reply-To: <201105252141.p4PLfr6T025372@theraft.openend.se> References: <201105252141.p4PLfr6T025372@theraft.openend.se> Message-ID: <1306361622.24449.14.camel@marge> On Wednesday 25 May 2011 at 23:41 +0200, Laura Creighton wrote: > One reason I didn't implement the classes yet is that I couldn't > understand two points in how they are supposed to work. But it seems > that there are really two bugs, as I've been pointed to: > http://bugs.python.org/issue12100 and > http://bugs.python.org/issue12171 .
So the question is if we should > be bug-compatible with Python 2.7 or if we should instead implement > some fixed version. I fixed #12100 in Python 2.7, 3.1, 3.2, 3.3 yesterday. I also plan to fix #12171 in these four versions; it should be done in the next few days. > I suppose I'm rather for the fixed version, but I'd like to hear some > feedback from people that actually use multibytecodecs. Both bugs are related to encoders. I don't think that anyone is using Python CJK codecs to encode text (because nobody noticed these bugs before), but more likely to decode text. Anyway, you should implement a codec without these *bugs*. For your information, I added more tests to the CJK codecs (e.g. see #12057), and I plan to add more tests in the coming weeks. I also plan to fix issue #12016, yet another CJK codec bug. You may want to wait until all of these bugs are fixed before working on your own implementation, or directly implement a version without all of these bugs, and then upgrade the test suite. > Also, I wouldn't mind if someone would pick up the work and just do it, > either the classes or ``errors != "strict"'' :-) The support of error handlers other than strict is far from perfect. Issue #12016 is the main problem, but there are other minor issues. In some cases, invalid byte sequences are ignored even with the replace error handler (whereas I expected U+FFFD characters). CJK codecs are special because they use escape sequences (especially the ISO 2022 family): what should be done if a byte sequence looks like an escape sequence, but is not valid? Replace each byte by U+FFFD, or ignore these bytes? I'm trying to write tests "describing" the current behaviour, and then I will maybe try to improve how invalid byte sequences are handled.
Victor From timothy.c.delaney at gmail.com Thu May 26 00:10:29 2011 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Thu, 26 May 2011 08:10:29 +1000 Subject: [Python-Dev] Extending os.chown() to accept user/group names In-Reply-To: <1306343861.20117.4.camel@marge> References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com> <1306343861.20117.4.camel@marge> Message-ID: <BANLkTin_3C69T5OHP5+Ngti-NAJ3E85WjA@mail.gmail.com> 2011/5/26 Victor Stinner <victor.stinner at haypocalc.com> > On Wednesday 25 May 2011 at 18:46 +0200, Charles-François Natali wrote: > > While we're at it, adding a "recursive" argument to this shutil.chown > > could also be useful. > > I don't like the idea of a recursive flag. I would prefer a "map-like" > function to "apply" a function on all files of a directory. Something > like shutil.apply_recursive(shutil.chown)... > > ... maybe with options to choose between depth-first search and > breadth-first search, filter (filenames, file size, files only, > directories only, other attributes?), directory before files (may be > needed for chmod(0o000)), etc. Pass an iterable to shutil.chown()? Then you could call it like: shutil.chown(os.walk(path)) Then of course you have the difficulty of wanting to pass either an iterator or a single path - probably prefer two functions e.g.: shutil.chown(path) shutil.chown_many(iter) Tim Delaney
From senthil at uthcode.com Thu May 26 07:10:55 2011 From: senthil at uthcode.com (Senthil Kumaran) Date: Thu, 26 May 2011 13:10:55 +0800 Subject: [Python-Dev] cpython: Fix closes issue #11109 - socketserver.ForkingMixIn leaves zombies, also fails In-Reply-To: <BANLkTimOV+gPXkoyxGUSSH=wW_CzrJgXDw@mail.gmail.com> References: <E1QPGv0-0002dG-9Y@dinsdale.python.org> <20110525185513.1cf2e252@pitrou.net> <BANLkTimOV+gPXkoyxGUSSH=wW_CzrJgXDw@mail.gmail.com> Message-ID: <20110526051055.GB2736@kevin> Antoine Pitrou wrote: > >> A new method called service_action is made available in BaseServer, called by > >> serve_forever loop. This useful in cases where Mixins can use it for cleanup > >> action. ForkingMixin class uses service_action to collect the zombie child > >> processes. Initial Patch by Justin Wark. > > > > Is it reasonable, performance-wise, to do this at every iteration of > > the loop (that is, at every incoming connection)? If not here, the call was being done at the process_request level when creating a new child process and the wait would have been there. I am not sure how much performance difference (lag) this aggressive collection can bring. Charles-François Natali wrote: > I haven't measured it, but it's O(N) where N is the number of children. > It should be possible to optimize this by putting all the children in > a process group (the other advantage is that we wouldn't wait() > children not spawned by socketserver). +1. This is definitely a good idea. The change needs to be done in the collect_children routine which tries to wait for all children to finish instead of just the ones forked by the socketserver. I shall raise a ticket for this. -- Senthil Although the moon is smaller than the earth, it is farther away.
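[Editorial illustration, not part of the original thread.] The point about waiting only for children that the server itself forked, without blocking the serve loop, can be sketched as below. The function name reap_children and the active_children set are assumptions made for this example; this is not the socketserver implementation, and it does not use the process-group optimisation proposed above. It is still O(N) in the number of tracked children, as noted in the thread, and assumes a POSIX platform and Python 3.3 or later (for ChildProcessError).

```python
import os

def reap_children(active_children):
    """Hypothetical sketch: collect finished children without blocking.

    active_children is a set of PIDs forked by this server. Only those
    PIDs are waited for, so unrelated child processes are left alone.
    The set is updated in place and also returned.
    """
    for pid in list(active_children):
        try:
            # With WNOHANG, waitpid returns (0, 0) if pid is still running.
            done_pid, _status = os.waitpid(pid, os.WNOHANG)
        except ChildProcessError:
            # Already reaped elsewhere, or not our child: stop tracking it.
            done_pid = pid
        if done_pid:
            active_children.discard(done_pid)
    return active_children
```

A server loop would call this once per iteration; children that are still running simply stay in the set until a later call.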
From ncoghlan at gmail.com Thu May 26 07:58:16 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 May 2011 15:58:16 +1000 Subject: [Python-Dev] Extending os.chown() to accept user/group names In-Reply-To: <BANLkTikHWFa+0K7kbRbk0suXDM=cELke-A@mail.gmail.com> References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com> <1306343861.20117.4.camel@marge> <BANLkTikHWFa+0K7kbRbk0suXDM=cELke-A@mail.gmail.com> Message-ID: <BANLkTimTMVTgW6f1LnSgFq3a1ugqZywJhA@mail.gmail.com> 2011/5/26 Charles-François Natali <neologix at free.fr>: > Then, I wonder why shutil.copytree and shutil.rmtree are provided. > Recursive rm/copy/chown/chmod are extremely useful in system > administration scripts. Furthermore, it's not as simple as it seems > because of symlinks, see for example http://bugs.python.org/issue4489 Rather than a fixed binary flag, I would suggest following the precedent of copytree and rmtree, and provide recursive functionality as a separate shutil function (i.e. shutil.chmodtree, shutil.chowntree). As noted, while these *can* be written manually, it is convenient to have the logic for handling symlinks dealt with for you, as well as not having to look up the particular incantation for correctly linking os.walk and the relevant operations. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From petri at digip.org Thu May 26 08:09:04 2011 From: petri at digip.org (Petri Lehtinen) Date: Thu, 26 May 2011 09:09:04 +0300 Subject: [Python-Dev] Extending os.chown() to accept user/group names In-Reply-To: <BANLkTimTMVTgW6f1LnSgFq3a1ugqZywJhA@mail.gmail.com> References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com> <1306343861.20117.4.camel@marge> <BANLkTikHWFa+0K7kbRbk0suXDM=cELke-A@mail.gmail.com> <BANLkTimTMVTgW6f1LnSgFq3a1ugqZywJhA@mail.gmail.com> Message-ID: <20110526060903.GB7580@colossus> Nick Coghlan wrote: > 2011/5/26 Charles-François Natali <neologix at free.fr>: > > Then, I wonder why shutil.copytree and shutil.rmtree are provided. > > Recursive rm/copy/chown/chmod are extremely useful in system > > administration scripts. Furthermore, it's not as simple as it seems > > because of symlinks, see for example http://bugs.python.org/issue4489 > > Rather than a fixed binary flag, I would suggest following the > precedent of copytree and rmtree, and provide recursive functionality > as a separate shutil function (i.e. shutil.chmodtree, > shutil.chowntree). +1 > As noted, while these *can* be written manually, it is convenient to > have the logic for handling symlinks dealt with for you, as well as > not having to look up the particular incantation for correctly linking > os.walk and the relevant operations. This is exactly what I meant when saying that the -R option to chown and chmod shell commands is useful. I *could* do it without them, but writing the same logic every time with error handling would be cumbersome.
Petri From victor.stinner at haypocalc.com Thu May 26 14:32:51 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 26 May 2011 14:32:51 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <4DDE43DC.6020103@trueblade.com> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> Message-ID: <1306413171.14987.3.camel@marge> On Thursday 26 May 2011 at 08:13 -0400, Eric Smith wrote: > If you're ever going to add code at the end of these functions, it's > unlikely you'll remember that you need to add these increments back in. You don't have to remember. Test the result of the function, it will not give the expected output. I don't think that you need fuzzing or a complex tool to detect that the new code doesn't behave correctly. > It's a bug waiting to happen What? It's not a bug. Adding new non-tested code is a bug :-) > I don't see any harm leaving them in. > Maybe we should add a comment about why they're done. It makes Python faster (!) and make silent the Clang Static Analyzer :-) Victor From eric at trueblade.com Thu May 26 16:10:26 2011 From: eric at trueblade.com (Eric Smith) Date: Thu, 26 May 2011 10:10:26 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <1306413171.14987.3.camel@marge> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> Message-ID: <4DDE5F52.7030303@trueblade.com> On 5/26/2011 8:32 AM, Victor Stinner wrote: > On Thursday 26 May 2011 at 08:13 -0400, Eric Smith wrote: >> If you're ever going to add code at the end of these functions, it's >> unlikely you'll remember that you need to add these increments back in. > > You don't have to remember. Test the result of the function, it will not > give the expected output.
I don't think that you need fuzzing or a > complex tool to detect that the new code doesn't behave correctly. > >> It's a bug waiting to happen > > What? It's not a bug. Adding new non-tested code is a bug :-) True. But assuming all code additions will have 100% branch coverage in the C code is foolish. >> I don't see any harm leaving them in. >> Maybe we should add a comment about why they're done. > > It makes Python faster (!) I doubt that. > and make silent the Clang Static Analyzer :-) I care less about that than maintainability and future-proofing. From benjamin at python.org Thu May 26 16:50:03 2011 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 26 May 2011 09:50:03 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <1306413171.14987.3.camel@marge> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> Message-ID: <BANLkTikOB+oE8+F=EhUrYqd+PpWAyW9yXw@mail.gmail.com> 2011/5/26 Victor Stinner <victor.stinner at haypocalc.com>: > On Thursday 26 May 2011 at 08:13 -0400, Eric Smith wrote: >> If you're ever going to add code at the end of these functions, it's >> unlikely you'll remember that you need to add these increments back in. > > You don't have to remember. Test the result of the function, it will not > give the expected output. I don't think that you need fuzzing or a > complex tool to detect that the new code doesn't behave correctly. > >> It's a bug waiting to happen > > What? It's not a bug. Adding new non-tested code is a bug :-) > >> I don't see any harm leaving them in. >> Maybe we should add a comment about why they're done. > > It makes Python faster (!) and make silent the Clang Static Analyzer :-) Surely, GCC can optimize that out.
-- Regards, Benjamin From eric at trueblade.com Thu May 26 17:26:20 2011 From: eric at trueblade.com (Eric Smith) Date: Thu, 26 May 2011 11:26:20 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> <4DDE5F52.7030303@trueblade.com> <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> Message-ID: <4DDE711C.1040209@trueblade.com> On 5/26/2011 10:34 AM, Ronald Oussoren wrote: > > On 26 May, 2011, at 16:10, Eric Smith wrote: >> >> >>> and make silent the Clang Static Analyzer :-) >> >> I care less about that than maintainability and future-proofing. > > Have you looked at the patch? The patch and resulting code look sane to me, and if anything most of the updated segments look cleaner after the patch. I have looked at it. I think the code was better before the patch. If I were looking at this, I'd have to wonder why the pointer was incremented everywhere else, but not here. This is especially true when the changed code isn't particularly near the end of the function. From ronaldoussoren at mac.com Thu May 26 16:34:46 2011 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 26 May 2011 16:34:46 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <4DDE5F52.7030303@trueblade.com> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> <4DDE5F52.7030303@trueblade.com> Message-ID: <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> On 26 May, 2011, at 16:10, Eric Smith wrote: > > >> and make silent the Clang Static Analyzer :-) > > I care less about that than maintainability and future-proofing. Have you looked at the patch?
The patch and resulting code look sane to me, and if anything most of the updated segments look cleaner after the patch. Ronald From tjreedy at udel.edu Thu May 26 19:59:51 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 26 May 2011 13:59:51 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> <4DDE5F52.7030303@trueblade.com> <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> Message-ID: <irm4eo$7id$1@dough.gmane.org> On 5/26/2011 10:34 AM, Ronald Oussoren wrote: > > On 26 May, 2011, at 16:10, Eric Smith wrote: >> >> >>> and make silent the Clang Static Analyzer :-) >> >> I care less about that than maintainability and future-proofing. > Have you looked at the patch? The patch and resulting code look sane to me, and if anything most of the updated segments look cleaner after the patch. Let's assume that the function currently does what it is supposed to do, as verified by tests. Then adding an unneeded increment in case the function is redefined in the future so that it needs more code strikes me as YAGNI. Certainly, reading it today with an unused increment suggests to me that something is missing that would use the incremented value. This strikes me as different from adding a comma at the end of a Python sequence display.
-- Terry Jan Reedy From guido at python.org Thu May 26 20:08:06 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 26 May 2011 11:08:06 -0700 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <irm4eo$7id$1@dough.gmane.org> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> <4DDE5F52.7030303@trueblade.com> <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> <irm4eo$7id$1@dough.gmane.org> Message-ID: <BANLkTim8h0kL2zvMfa_AXQ6=ZTc61tmn9A@mail.gmail.com> On Thu, May 26, 2011 at 10:59 AM, Terry Reedy <tjreedy at udel.edu> wrote: > On 5/26/2011 10:34 AM, Ronald Oussoren wrote: >> >> On 26 May, 2011, at 16:10, Eric Smith wrote: >>> >>> >>>> and make silent the Clang Static Analyzer :-) >>> >>> I care less about that than maintainability and future-proofing. > > >> Have you looked at the patch? The patch and resulting code look sane to me, >> and if anything most of the updated segments look cleaner after the >> patch. > > Let's assume that the function currently does what it is supposed to do, as > verified by tests. Then adding an unneeded increment in case the function is > redefined in the future so that it needs more code strikes me as YAGNI. > Certainly, reading it today with an unused increment suggests to me that > something is missing that would use the incremented value. This strikes me as > different from adding a comma at the end of a Python sequence display. Sorry to butt in here, but I agree with Eric that it was better before. There is a common idiom, *pointer++ = <something>, and whenever you see that you know that you are appending something to an output buffer. Perhaps the most important idea here is that this maintains the *invariant* "pointer points just after the last thing in the buffer". Always maintaining the invariant is better than trying to micro-optimize things so as to avoid updating dead values.
The compiler is better at that. -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Thu May 26 20:14:42 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 26 May 2011 14:14:42 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <4DDE711C.1040209@trueblade.com> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> <4DDE5F52.7030303@trueblade.com> <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> <4DDE711C.1040209@trueblade.com> Message-ID: <BANLkTin7yLeSeyPxVOv+hBsXtsyT=JNH_w@mail.gmail.com> On Thu, May 26, 2011 at 11:26 AM, Eric Smith <eric at trueblade.com> wrote: .. >> Have you looked at the patch? The patch and resulting code look sane to me, and >> if anything most of the updated segments look cleaner after the patch. > > I have looked at it. I think the code was better before the patch. If I > were looking at this, I'd have to wonder why the pointer was incremented > everywhere else, but not here. This is especially true when the changed > code isn't particularly near the end of the function. +1 To me, *p++ = c is an idiomatic way to fill the buffer. I prefer to think of p as the state of the stream for which adding a character is impossible without advancing the state. Seeing *p = c will definitely make me pause and think whether or not it is a bug.
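[Editorial illustration, not part of the original thread.] The code under discussion is C, but the invariant argument can be shown with a loose Python analogy: a preallocated bytearray stands in for the output buffer and an integer index stands in for the pointer. The function name fill_buffer is hypothetical. Every write advances the position, including the last one, so the position always points just past the last byte written and later additions need no special case.

```python
def fill_buffer(buf, text):
    """Hypothetical analogy for the C idiom *p++ = c.

    Invariant: pos is always the index just past the last byte written.
    Returns the final position, i.e. the number of bytes written.
    """
    pos = 0
    for byte in text.encode("ascii"):
        buf[pos] = byte
        pos += 1      # analogue of *p++ = c: write, then advance
    buf[pos] = 0      # terminator; the final write advances pos too,
    pos += 1          # so code appended later can rely on the invariant
    return pos
```

For example, filling an 8-byte bytearray with "abc" writes four bytes (the three characters and the terminator) and returns 4.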
From tjreedy at udel.edu Thu May 26 20:22:11 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 26 May 2011 14:22:11 -0400 Subject: [Python-Dev] cpython: Avoid useless "++" at the end of functions In-Reply-To: <BANLkTim8h0kL2zvMfa_AXQ6=ZTc61tmn9A@mail.gmail.com> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> <4DDE5F52.7030303@trueblade.com> <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> <irm4eo$7id$1@dough.gmane.org> <BANLkTim8h0kL2zvMfa_AXQ6=ZTc61tmn9A@mail.gmail.com> Message-ID: <4DDE9A53.9060203@udel.edu> On 5/26/2011 2:08 PM, Guido van Rossum wrote: > Sorry to butt in here, but I agree with Eric that it was better > before. There is a common idiom, *pointer++ =<something>, and > whenever you see that you know that you are appending something to an > output buffer. Perhaps the most important idea here is that this > maintains the *invariant* "pointer points just after the last thing in > the buffer". Always maintaining the invariant is better than trying to > micro-optimize things so as to avoid updating dead values. The > compiler is better at that. This explanation makes sense (more than Eric's version of perhaps the same thing ;-). http://bugs.python.org/issue12188 "A condensed version of the above added to PEP 7 would help new developers see the usage as local idiom rather than style bug." Terry J. 
Reedy From benjamin at python.org Thu May 26 20:26:43 2011 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 26 May 2011 13:26:43 -0500 Subject: [Python-Dev] cpython: Avoid useless "++" at the end of functions In-Reply-To: <4DDE9A53.9060203@udel.edu> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> <4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge> <4DDE5F52.7030303@trueblade.com> <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com> <irm4eo$7id$1@dough.gmane.org> <BANLkTim8h0kL2zvMfa_AXQ6=ZTc61tmn9A@mail.gmail.com> <4DDE9A53.9060203@udel.edu> Message-ID: <BANLkTimB1u6BLhw-3Bb7kMxEVz+th96jig@mail.gmail.com> 2011/5/26 Terry Reedy <tjreedy at udel.edu>: > On 5/26/2011 2:08 PM, Guido van Rossum wrote: > >> Sorry to butt in here, but I agree with Eric that it was better >> before. There is a common idiom, *pointer++ =<something>, and >> whenever you see that you know that you are appending something to an >> output buffer. Perhaps the most important idea here is that this >> maintains the *invariant* "pointer points just after the last thing in >> the buffer". Always maintaining the invariant is better than trying to >> micro-optimize things so as to avoid updating dead values. The >> compiler is better at that. > > This explanation makes sense (more than Eric's version of perhaps the same > thing ;-). > > http://bugs.python.org/issue12188 > "A condensed version of the above added to PEP 7 would help new developers > see the usage as local idiom rather than style bug." I think a more general formulation would be: "Idiomatic code is more important than making static analyzers happy." 
-- Regards, Benjamin From sandro.tosi at gmail.com Thu May 26 22:15:57 2011 From: sandro.tosi at gmail.com (Sandro Tosi) Date: Thu, 26 May 2011 22:15:57 +0200 Subject: [Python-Dev] Extending os.chown() to accept user/group names In-Reply-To: <20110525155857.4c4e87b7@pitrou.net> References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com> <20110525094146.4941b681@neurotica.wooz.org> <20110525155857.4c4e87b7@pitrou.net> Message-ID: <BANLkTi=PPYp4H7Zf2dnh_mj4TFxo_XscbQ@mail.gmail.com> On Wed, May 25, 2011 at 15:58, Antoine Pitrou <solipsis at pitrou.net> wrote: > On Wed, 25 May 2011 09:41:46 -0400 > Barry Warsaw <barry at python.org> wrote: >> I think it would be a nice feature, and I can see the conflict. OT1H you want >> to keep os.chown() a thin wrapper, but OTOH you'd rather not have to add a >> new, arguably more difficult to discover, function. Given those two choices, >> I still think I'd come down on adding a new function and shutil.chown() seems >> an appropriate place for it. > > +1 for shutil.chown(). and so shutil.chown() be it: http://bugs.python.org/issue12191 Currently, only the function for a single file is implemented, let's look later what to do for a recursive one. Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From mal at egenix.com Fri May 27 10:17:29 2011 From: mal at egenix.com (M.-A.
Lemburg) Date: Fri, 27 May 2011 10:17:29 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <1306338491.6407.74.camel@marge> References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de> <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl> <1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de> <4DDCCE02.7060105@egenix.com> <1306321851.6407.49.camel@marge> <4DDD079B.7090906@egenix.com> <1306338491.6407.74.camel@marge> Message-ID: <4DDF5E19.3080701@egenix.com> Victor Stinner wrote: > Le mercredi 25 mai 2011 à 15:43 +0200, M.-A. Lemburg a écrit : >> For UTF-16 it would e.g. make sense to always read data in blocks >> with even sizes, removing the trial-and-error decoding and extra >> buffering currently done by the base classes. For UTF-32, the >> blocks should have size % 4 == 0. >> >> For UTF-8 (and other variable length encodings) it would make >> sense looking at the end of the (bytes) data read from the >> stream to see whether a complete code point was read or not, >> rather than simply running the decoder on the complete data >> set, only to find that a few bytes at the end are missing. > > I think that the readahead algorithm is much faster than trying to > avoid partial input, and it's not a problem to have partial input if you > use an incremental decoder. Depends on where you're coming from. For non-seekable streams such as sockets or pipes, readahead is not going to work. For seekable streams, I agree that readahead is the better strategy. And of course, it also makes sense to use incremental decoders for these encodings. >> For single character encodings, it would make sense to prefetch >> data in big chunks and skip all the trial and error decoding >> implemented by the base classes to address the above problem >> with variable length encodings. > > TextIOWrapper implements this optimization using its readahead > algorithm.
It does yes, but the above was an optimization specific to single character encodings, not all encodings and TextIOWrapper doesn't know anything about specific characteristics of the underlying encodings (except perhaps a few special cases). >> That's somewhat unfair: TextIOWrapper is implemented in C, >> whereas the StreamReader/Writer subclasses used by the >> codecs are written in Python. >> >> A fair comparison would use the Python implementation of >> TextIOWrapper. > > Do you mean that you would like to reimplement codecs in C? As use of Unicode codecs increases in Python applications, this would certainly be an approach to consider, yes. Looking at the current situation, it is better to use TextIOWrapper as it provides better performance, but since TextIOWrapper cannot (by design) provide per-codec optimizations, this is likely to change with a codec rewrite in C of codecs that benefit a lot from such specific optimizations. > It is not > relevant to compare codecs and _pyio, because codecs reuses > BufferedReader (of the io module, not of the _pyio module), and io is > the main I/O module of Python 3. They both use whatever stream you pass in as parameter, so your TextIOWrapper benchmark will also use the BufferedReader of the io module. The point here is to compare Python to Python, not Python to C. > But well, as you want, here is a benchmark comparing: > _pyio.TextIOWrapper(io.open(filename, 'rb'), encoding) > and > codecs.open(filename, encoding) > > The only change with my previous bench.py script is the test_io() > function : > > def test_io(test_func, chunk_size): > with open(FILENAME, 'rb') as buffered: > f = _pyio.TextIOWrapper(buffered, ENCODING) > test_file(f, test_func, chunk_size) > f.close() Thanks for running those tests.
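A minimal sketch of the incremental-decoder point made earlier in this message, not taken from the thread: it assumes input arrives in arbitrary byte chunks (as from a socket or pipe) and shows that an incremental decoder buffers a trailing partial character instead of failing. The helper name decode_chunks and the sample strings are hypothetical.

```python
import codecs

def decode_chunks(chunks, encoding="utf-8"):
    """Decode an iterable of byte chunks that may split characters."""
    # The incremental decoder keeps any incomplete trailing bytes
    # internally and prepends them to the next chunk.
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True flushes the decoder and raises if bytes are left over.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)

sample = "h\u00e9llo w\u00f6rld"

# Cut the UTF-8 bytes every 3 bytes, which splits multi-byte characters.
utf8_bytes = sample.encode("utf-8")
utf8_chunks = [utf8_bytes[i:i + 3] for i in range(0, len(utf8_bytes), 3)]
utf8_text = decode_chunks(utf8_chunks, "utf-8")

# Same idea for UTF-16, where odd-sized chunks split 2-byte code units.
utf16_bytes = sample.encode("utf-16")
utf16_chunks = [utf16_bytes[i:i + 3] for i in range(0, len(utf16_bytes), 3)]
utf16_text = decode_chunks(utf16_chunks, "utf-16")
```

This only illustrates correctness with partial input; it says nothing about which strategy is faster, which is the open question in this thread.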
> (1) Decode Objects/unicodeobject.c (317336 characters) from utf-8 > > test_io.readline(): 1193.4 ms > test_codecs.readline(): 1267.9 ms > -> codecs 6% slower than io > > test_io.read(1): 21696.4 ms > test_codecs.read(1): 36027.2 ms > -> codecs 66% slower than io > > test_io.read(100): 3080.7 ms > test_codecs.read(100): 3901.7 ms > -> codecs 27% slower than io This shows that StreamReader/Writer could benefit quite a bit from using incremental encoders/decoders. > test_io.read(): 3991.0 ms > test_codecs.read(): 1736.9 ms > -> codecs 130% FASTER than io No surprise here. It's also a very common use case to read the whole file in one go and the bigger the file, the more impact this has. > (2) Decode README (6613 characters) from ascii > > test_io.readline(): 678.1 ms > test_codecs.readline(): 760.5 ms > -> codecs 12% slower than io > > test_io.read(1): 13533.2 ms > test_codecs.read(1): 21900.0 ms > -> codecs 62% slower than io > > test_io.read(100): 2663.1 ms > test_codecs.read(100): 3270.1 ms > -> codecs 23% slower than io > > test_io.read(): 6769.1 ms > test_codecs.read(): 3919.6 ms > -> codecs 73% FASTER than io See above. > (3) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from > gb18030 > > test_io.readline(): 38.9 ms > test_codecs.readline(): 15.1 ms > -> codecs 157% FASTER than io > > test_io.read(1): 369.8 ms > test_codecs.read(1): 302.2 ms > -> codecs 22% FASTER than io > > test_io.read(100): 258.2 ms > test_codecs.read(100): 155.1 ms > -> codecs 67% FASTER than io > > test_io.read(): 1803.2 ms > test_codecs.read(): 1002.9 ms > -> codecs 80% FASTER than io These results are interesting since gb18030 is a shift encoding which keeps state in the encoded data stream, so the strategy chosen by TextIOWrapper doesn't work out that well. It hints to what I mentioned above: per codec optimizations are going to be relevant once these codecs get a lot of use. 
> _pyio.TextIOWrapper is faster than codecs.StreamReader for readline(), > read(1) and read(100), with ASCII and UTF-8. It is slower for gb18030. > > As in the io vs codecs benchmark, codecs.StreamReader is always faster > than _pyio.TextIOWrapper for read(). Just to repeat it here what I already mentioned on the ticket: I am still -1 on deprecating the StreamReader/Writer parts of the codec APIs. I've given numerous reasons on why these are useful, what their intention is, why they were added to Python 1.6. Since such a deprecation would change an important documented API, please write a PEP outlining your reasoning, including my comments, use cases and possibilities for optimizations. Please back out your checkin: """ http://hg.python.org/cpython/rev/3555cf6f9c98 changeset: 70430:3555cf6f9c98 user: Victor Stinner <victor.stinner at haypocalc.com> date: Fri May 27 01:51:18 2011 +0200 summary: Issue #8796: codecs.open() calls the builtin open() function instead of using StreamReaderWriter. Deprecate StreamReader, StreamWriter, StreamReaderWriter, StreamRecoder and EncodedFile() of the codec module. Use the builtin open() function or io.TextIOWrapper instead. files: Doc/library/codecs.rst | 25 ++++ Lib/codecs.py | 25 ++-- Lib/test/test_codecs.py | 152 +++++++++++++++++++-------- Misc/NEWS | 5 + 4 files changed, 148 insertions(+), 59 deletions(-) """ I wasn't very happy to see that checkin on the checkins list... We can discuss changing codec.open() to use TextIOWrapper, but your quest for deprecating APIs in Python has gone too far on this one. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 27 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ 2011-05-23: Released eGenix mx Base 3.2.0 http://python.egenix.com/ 2011-05-25: Released mxODBC 3.1.1 http://python.egenix.com/ 2011-06-20: EuroPython 2011, Florence, Italy 24 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From eric at trueblade.com Fri May 27 14:14:55 2011 From: eric at trueblade.com (Eric Smith) Date: Fri, 27 May 2011 08:14:55 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at the end of functions In-Reply-To: <E1QPZM1-0001Ll-1A@dinsdale.python.org> References: <E1QPZM1-0001Ll-1A@dinsdale.python.org> Message-ID: <4DDF95BF.3070800@trueblade.com> So, given the discussions about this change, can you please revert it, Victor? Eric. On 05/26/2011 08:07 AM, victor.stinner wrote: > http://hg.python.org/cpython/rev/7ba176c2f558 > changeset: 70397:7ba176c2f558 > user: Victor Stinner <victor.stinner at haypocalc.com> > date: Thu May 26 13:53:47 2011 +0200 > summary: > Avoid useless "++" at the end of functions > > Warnings found by the Clang Static Analyzer. 
> > files: > Objects/setobject.c | 4 ++-- > Objects/unicodeobject.c | 2 +- > Python/compile.c | 6 +++--- > 3 files changed, 6 insertions(+), 6 deletions(-) > > > diff --git a/Objects/setobject.c b/Objects/setobject.c > --- a/Objects/setobject.c > +++ b/Objects/setobject.c > @@ -612,9 +612,9 @@ > *u++ = '{'; > /* Omit the brackets from the listrepr */ > Py_UNICODE_COPY(u, PyUnicode_AS_UNICODE(listrepr)+1, > - PyUnicode_GET_SIZE(listrepr)-2); > + newsize-2); > u += newsize-2; > - *u++ = '}'; > + *u = '}'; > } > Py_DECREF(listrepr); > if (Py_TYPE(so) != &PySet_Type) { > diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c > --- a/Objects/unicodeobject.c > +++ b/Objects/unicodeobject.c > @@ -6474,7 +6474,7 @@ > } > } > /* 0-terminate the output string */ > - *output++ = '\0'; > + *output = '\0'; > Py_XDECREF(exc); > Py_XDECREF(errorHandler); > return 0; > diff --git a/Python/compile.c b/Python/compile.c > --- a/Python/compile.c > +++ b/Python/compile.c > @@ -3747,11 +3747,11 @@ > a->a_lnotab_off += 2; > if (d_bytecode) { > *lnotab++ = d_bytecode; > - *lnotab++ = d_lineno; > + *lnotab = d_lineno; > } > else { /* First line of a block; def stmt, etc. 
*/ > *lnotab++ = 0; > - *lnotab++ = d_lineno; > + *lnotab = d_lineno; > } > a->a_lineno = i->i_lineno; > a->a_lineno_off = a->a_offset; > @@ -3796,7 +3796,7 @@ > if (i->i_hasarg) { > assert(size == 3 || size == 6); > *code++ = arg & 0xff; > - *code++ = arg >> 8; > + *code = arg >> 8; > } > return 1; > } > > > > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins From victor.stinner at haypocalc.com Fri May 27 15:29:15 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 27 May 2011 15:29:15 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <4DDF5E19.3080701@egenix.com> References: <1306195729.605.27.camel@marge> <1306338491.6407.74.camel@marge> <4DDF5E19.3080701@egenix.com> Message-ID: <201105271529.15421.victor.stinner@haypocalc.com> Le vendredi 27 mai 2011 10:17:29, M.-A. Lemburg a écrit : > > I think that the readahead algorithm is much faster than trying to > > avoid partial input, and it's not a problem to have partial input if you > > use an incremental decoder. > > Depends on where you're coming from. For non-seekable streams > such as sockets or pipes, readahead is not going to work. I don't see how StreamReader/StreamWriter can do a better job than TextIOWrapper for non-seekable streams. > > TextIOWrapper implements this optimization using its readahead > > algorithm. > > It does yes, but the above was an optimization specific > to single character encodings, not all encodings and > TextIOWrapper doesn't know anything about specific characteristics > of the underlying encodings (except perhaps a few special > cases). Please give me numbers: how fast are your suggested optimizations? Are they faster than readahead? All of your argumentation is based on theoretical considerations. > > Do you mean that you would like to reimplement codecs in C?
> > As use of Unicode codecs increases in Python applications, > this would certainly be an approach to consider, yes. I am not sure that StreamReader is/can be faster than TextIOWrapper if it is reimplemented in C (see the updated benchmark below, codecs vs _pyio). > > test_io.read(): 3991.0 ms > > test_codecs.read(): 1736.9 ms > > -> codecs 130% FASTER than io > > No surprise here. It's also a very common use case > to read the whole file in one go and the bigger > the file, the more impact this has. Oh, I understood why codecs is always faster than _pyio (or even io): it's because of IncrementalNewlineDecoder. To be fair, the read(-1) should be tested without IncrementalNewlineDecoder: e.g. with newline='\n'. newline='' cannot be used for the read(-1) test, because even if newline='' indicates that we don't want to translate newlines, read(-1) uses the IncrementalNewlineDecoder (which is slower than not calling it at all). We may optimize this specific case in TextIOWrapper. > > (3) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from > > gb18030 > > > > test_io.readline(): 38.9 ms > > test_codecs.readline(): 15.1 ms > > -> codecs 157% FASTER than io > > > > test_io.read(1): 369.8 ms > > test_codecs.read(1): 302.2 ms > > -> codecs 22% FASTER than io > > > > test_io.read(100): 258.2 ms > > test_codecs.read(100): 155.1 ms > > -> codecs 67% FASTER than io > > > > test_io.read(): 1803.2 ms > > test_codecs.read(): 1002.9 ms > > -> codecs 80% FASTER than io > > These results are interesting since gb18030 is a shift > encoding which keeps state in the encoded data stream, so > the strategy chosen by TextIOWrapper doesn't work out that > well. In the 4 tests, TextIOWrapper only calls the decoder *once*, on the whole content of the file. The file size if 864 bytes, which is smaller than the TextIOWrapper chunk size (2048 bytes). StreamReader of the gb18030 codec is implemented in C, not in Python (using multibytecodec.c). 
So to be fair, the test on this encoding should be done using io, not _pyio. Moreover, the multibytecodec module doesn't support universal newlines! It only supports '\n' newlines. So to be more fair, the test should use '\n' as the newline. It's one more reason to use TextIOWrapper instead of StreamReader: it has the same behaviour (universal newlines) for all encodings. Or is it yet another bug in StreamReader? > I am still -1 on deprecating the StreamReader/Writer parts of > the codec APIs. I've given numerous reasons on why these are > useful, what their intention is, why they were added to Python 1.6. codecs.open() now uses TextIOWrapper, so there is no good reason to keep StreamReader or StreamWriter. You did not give me any use case where StreamReader or StreamWriter should be used instead of TextIOWrapper. You only listed theoretical optimizations. You have until the release of Python 3.3 to prove that StreamReader and/or StreamWriter can be faster than TextIOWrapper. If you can prove it using a patch and a benchmark, I will be ok to revert my commit. > Since such a deprecation would change an important documented API, > please write a PEP outlining your reasoning, including my comments, > use cases and possibilities for optimizations. Ok, I will write a PEP explaining why StreamReader and StreamWriter are deprecated. ----------- I wrote a new benchmarking script which tries to compare codecs more closely to io/_pyio (change the newline value and use io for gb18030). It should be a little bit more reliable because each test now runs 5 times (taking the smallest time), but it's not really reliable... The script is attached to this mail.
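The attached bench.py is not preserved in this archive. As a rough sketch of the approach described above (run each test 5 times and keep the smallest time, with newline='\n' on the io side), something like the following could be used. The helper names, the sample data and the loop counts are assumptions for illustration, not the original script.

```python
import codecs
import io
import os
import tempfile
import timeit

def best_time(func, loops=10, repeat=5):
    """Run func `loops` times per round and keep the fastest of `repeat` rounds."""
    return min(timeit.repeat(func, number=loops, repeat=repeat))

def read_with_io(path, encoding):
    # newline='\n' disables newline translation in TextIOWrapper.
    with io.open(path, "r", encoding=encoding, newline="\n") as f:
        return f.read()

def read_with_codecs(path, encoding):
    # Wrap a binary file in the codec's StreamReader class.
    with open(path, "rb") as raw:
        reader = codecs.getreader(encoding)(raw)
        return reader.read()

ENCODING = "utf-8"
LINE = "line with caf\u00e9\n"

# Hypothetical sample file; the thread used real files from the source tree.
fd, path = tempfile.mkstemp(suffix=".txt")
try:
    with os.fdopen(fd, "wb") as f:
        f.write((LINE * 2000).encode(ENCODING))
    text_io = read_with_io(path, ENCODING)
    text_codecs = read_with_codecs(path, ENCODING)
    t_io = best_time(lambda: read_with_io(path, ENCODING))
    t_codecs = best_time(lambda: read_with_codecs(path, ENCODING))
finally:
    os.remove(path)

print("io.read():     %.1f ms" % (t_io * 1000))
print("codecs.read(): %.1f ms" % (t_codecs * 1000))
```

Taking the minimum over several rounds reduces noise from other processes, but as noted above the numbers are still only indicative.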
(1) Decode Objects/unicodeobject.c (317334 characters) from utf-8 _pyio.readline(): 1078.4 ms (8 loops, newline: '') codecs.readline(): 983.0 ms (8 loops, newline: '') -> codecs 10% FASTER than _pyio _pyio.read(1): 3503.5 ms (2 loops, newline: '') codecs.read(1): 6626.7 ms (2 loops, newline: '') -> codecs 89% slower than _pyio _pyio.read(100): 2076.2 ms (80 loops, newline: '') codecs.read(100): 2870.8 ms (80 loops, newline: '') -> codecs 38% slower than _pyio _pyio.read(): 1698.0 ms (800 loops, newline: '\n') codecs.read(): 1686.4 ms (800 loops, newline: '\n') -> codecs 1% FASTER than _pyio (2) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from gb18030 io.readline(): 5.1 ms (80 loops, newline: '\n') codecs.readline(): 6.8 ms (80 loops, newline: '\n') -> codecs 34% slower than io io.read(1): 5.6 ms (20 loops, newline: '\n') codecs.read(1): 45.5 ms (20 loops, newline: '\n') -> codecs 705% slower than io io.read(100): 54.2 ms (800 loops, newline: '\n') codecs.read(100): 56.7 ms (800 loops, newline: '\n') -> codecs 5% slower than io io.read(): 395.8 ms (8000 loops, newline: '\n') codecs.read(): 309.2 ms (8000 loops, newline: '\n') -> codecs 28% FASTER than io (3) Decode README (6613 characters) from ascii _pyio.readline(): 385.9 ms (160 loops, newline: '') codecs.readline(): 384.5 ms (160 loops, newline: '') -> codecs 0% FASTER than _pyio _pyio.read(1): 1473.6 ms (40 loops, newline: '') codecs.read(1): 1913.9 ms (40 loops, newline: '') -> codecs 30% slower than _pyio _pyio.read(100): 1081.0 ms (1600 loops, newline: '') codecs.read(100): 1325.6 ms (1600 loops, newline: '') -> codecs 23% slower than _pyio _pyio.read(): 1570.9 ms (16000 loops, newline: '\n') codecs.read(): 1518.8 ms (16000 loops, newline: '\n') -> codecs 3% FASTER than _pyio codecs is still faster in 4 cases: * ascii, read(): 3% faster than _pyio * utf-8, readline(): 10% faster than _pyio * utf-8, read(): 1% faster than _pyio * gb18030, read(): 28% faster than io (!) 
The last one is interesting and should be analyzed. ---- Even if it's not fair, benchmark using io for ASCII and UTF-8 (GB18030 already used io for the reasons explained before): (1) Decode Objects/unicodeobject.c (317334 characters) from utf-8 io.readline(): 52.0 ms (8 loops, newline: '') codecs.readline(): 1001.0 ms (8 loops, newline: '') -> codecs 1825% slower than io io.read(1): 265.7 ms (2 loops, newline: '') codecs.read(1): 6734.5 ms (2 loops, newline: '') -> codecs 2434% slower than io io.read(100): 269.4 ms (80 loops, newline: '') codecs.read(100): 2881.6 ms (80 loops, newline: '') -> codecs 970% slower than io io.read(): 1628.9 ms (800 loops, newline: '\n') codecs.read(): 1692.8 ms (800 loops, newline: '\n') -> codecs 4% slower than io (3) Decode README (6613 characters) from ascii io.readline(): 25.7 ms (160 loops, newline: '') codecs.readline(): 415.5 ms (160 loops, newline: '') -> codecs 1516% slower than io io.read(1): 153.3 ms (40 loops, newline: '') codecs.read(1): 2243.6 ms (40 loops, newline: '') -> codecs 1363% slower than io io.read(100): 210.2 ms (1600 loops, newline: '') codecs.read(100): 1521.9 ms (1600 loops, newline: '') -> codecs 624% slower than io io.read(): 1100.1 ms (16000 loops, newline: '\n') codecs.read(): 1501.1 ms (16000 loops, newline: '\n') -> codecs 36% slower than io So if you compare codecs to io (and not _pyio), codecs is only faster (26%) in one case: read the whole content of the file for multibytecodecs. Note that the codecs module is 2434% slower than io to read a file in UTF-8 character by character (which is stupid, don't do that! :-)), and 1825% slower to read line by line. Victor -------------- next part -------------- A non-text attachment was scrubbed... 
Name: bench.py Type: text/x-python Size: 3326 bytes Desc: not available URL: <http://mail.python.org/pipermail/python-dev/attachments/20110527/8b9511f7/attachment.py> From benjamin at python.org Fri May 27 15:33:07 2011 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 27 May 2011 08:33:07 -0500 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <201105271529.15421.victor.stinner@haypocalc.com> References: <1306195729.605.27.camel@marge> <1306338491.6407.74.camel@marge> <4DDF5E19.3080701@egenix.com> <201105271529.15421.victor.stinner@haypocalc.com> Message-ID: <BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com> 2011/5/27 Victor Stinner <victor.stinner at haypocalc.com>: > You have until the release of Python 3.3 to prove that StreamReader and/or > StreamWriter can be faster than TextIOWrapper. If you can prove it using a > patch and a benchmark, I will be ok to revert my commit. Please don't hold commits over someone's head. -- Regards, Benjamin From mal at egenix.com Fri May 27 15:42:10 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 27 May 2011 15:42:10 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <201105271529.15421.victor.stinner@haypocalc.com> References: <1306195729.605.27.camel@marge> <1306338491.6407.74.camel@marge> <4DDF5E19.3080701@egenix.com> <201105271529.15421.victor.stinner@haypocalc.com> Message-ID: <4DDFAA32.5030209@egenix.com> Victor Stinner wrote: > Le vendredi 27 mai 2011 10:17:29, M.-A. Lemburg a écrit : >> I am still -1 on deprecating the StreamReader/Writer parts of >> the codec APIs. I've given numerous reasons on why these are >> useful, what their intention is, why they were added to Python 1.6. > > codecs.open() now uses TextIOWrapper, so there is no good reason to keep > StreamReader or StreamWriter. You did not give me any use case where > StreamReader or StreamWriter should be used instead of TextIOWrapper.
You only > listed theorical optimizations. > > You have until the release of Python 3.3 to prove that StreamReader and/or > StreamWriter can be faster than TextIOWrapper. If you can prove it using a > patch and a benchmark, I will be ok to revert my commit. Victor, please revert the change. It has *not* been approved ! If we'd go by your reasoning for deprecating and eventually removing parts of the stdlib or Python's subsystems, we'll end up with a barebone version of Python. That's not what we want and it's not what our users want. I have tried to explain the design decisions and reasons for those codec APIs at great length. You've pretty much used up my patience. If you are not going to revert the patch, I will. >> Since such a deprecation would change an important documented API, >> please write a PEP outlining your reasoning, including my comments, >> use cases and possibilities for optimizations. > > Ok, I will write on a PEP explaining why StreamReader and StreamWriter are > deprecated. Wrong order: first write a PEP, then discuss, then get approval, then patch. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 27 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2011-05-23: Released eGenix mx Base 3.2.0 http://python.egenix.com/ 2011-05-25: Released mxODBC 3.1.1 http://python.egenix.com/ 2011-06-20: EuroPython 2011, Florence, Italy 24 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Fri May 27 16:01:14 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 May 2011 00:01:14 +1000 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <4DDFAA32.5030209@egenix.com> References: <1306195729.605.27.camel@marge> <1306338491.6407.74.camel@marge> <4DDF5E19.3080701@egenix.com> <201105271529.15421.victor.stinner@haypocalc.com> <4DDFAA32.5030209@egenix.com> Message-ID: <BANLkTimM_BmsX_uncd9foc6OaKGnA4-L5w@mail.gmail.com> On Fri, May 27, 2011 at 11:42 PM, M.-A. Lemburg <mal at egenix.com> wrote: > > Wrong order: first write a PEP, then discuss, then get approval, > then patch. Indeed. If another committer says "please revert and better justify this change" then we revert it. We don't get into commit wars. Something does need to be done to resolve the duplication of functionality between the io and codecs modules, but it is *far* from clear that deprecating chunks of the longer standing API is the right way to go about it. This is especially true given Guido's explicit direction following the issues with the PyCObject removal in 3.2 that we be *very* conservative about introducing additional incompatibilities between Python 2 and Python 3. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From victor.stinner at haypocalc.com Fri May 27 17:08:06 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 27 May 2011 17:08:06 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com> References: <1306195729.605.27.camel@marge> <201105271529.15421.victor.stinner@haypocalc.com> <BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com> Message-ID: <201105271708.06211.victor.stinner@haypocalc.com> Le vendredi 27 mai 2011 15:33:07, Benjamin Peterson a écrit : > 2011/5/27 Victor Stinner <victor.stinner at haypocalc.com>: > > You have until the release of Python 3.3 to prove that StreamReader > > and/or StreamWriter can be faster than TextIOWrapper. If you can prove > > it using a patch and a benchmark, I will be ok to revert my commit. > > Please don't hold commits over someone's head. Tell me if I am wrong, but only Marc-Andre is against deprecating StreamReader and StreamWriter. Walter and Antoine are in favor of using TextIOWrapper instead of StreamReader/StreamWriter. Different people would like to be able to call codecs.open() in Python 2 and 3, so I kept the function with its API unchanged, and I documented that open() should be preferred (but I did not deprecate codecs.open).
Victor From benjamin at python.org Fri May 27 17:34:28 2011 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 27 May 2011 10:34:28 -0500 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <201105271708.06211.victor.stinner@haypocalc.com> References: <1306195729.605.27.camel@marge> <201105271529.15421.victor.stinner@haypocalc.com> <BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com> <201105271708.06211.victor.stinner@haypocalc.com> Message-ID: <BANLkTik=gUsHdJdBivDNMKA0SOod1JVnOw@mail.gmail.com> 2011/5/27 Victor Stinner <victor.stinner at haypocalc.com>: > Le vendredi 27 mai 2011 15:33:07, Benjamin Peterson a écrit : >> 2011/5/27 Victor Stinner <victor.stinner at haypocalc.com>: >> > You have until the release of Python 3.3 to prove that StreamReader >> > and/or StreamWriter can be faster than TextIOWrapper. If you can prove >> > it using a patch and a benchmark, I will be ok to revert my commit. >> >> Please don't hold commits over someone's head. > > Tell me if I am wrong, but only Marc-Andre is against deprecating StreamReader > and StreamWriter. Walter and Antoine are in favor of using TextIOWrapper > instead of StreamReader/StreamWriter. I am too. There does, however, seem to be significant disagreement, and it shouldn't be a race to see who can commit first. > > Different people would like to be able to call codecs.open() in Python 2 and 3, > so I kept the function with its API unchanged, and I documented that open() > should be preferred (but I did not deprecate codecs.open).
-- Regards, Benjamin From victor.stinner at haypocalc.com Fri May 27 17:35:31 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 27 May 2011 17:35:31 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <BANLkTimM_BmsX_uncd9foc6OaKGnA4-L5w@mail.gmail.com> References: <1306195729.605.27.camel@marge> <4DDFAA32.5030209@egenix.com> <BANLkTimM_BmsX_uncd9foc6OaKGnA4-L5w@mail.gmail.com> Message-ID: <201105271735.31859.victor.stinner@haypocalc.com> Le vendredi 27 mai 2011 16:01:14, Nick Coghlan a écrit : > On Fri, May 27, 2011 at 11:42 PM, M.-A. Lemburg <mal at egenix.com> wrote: > > Wrong order: first write a PEP, then discuss, then get approval, > > then patch. > > Indeed. > > If another committer says "please revert and better justify this > change" then we revert it. We don't get into commit wars. I reverted my controversial commit. > Something does need to be done to resolve the duplication of > functionality between the io and codecs modules, but it is *far* from > clear that deprecating chunks of the longer standing API is the right > way to go about it. Yes, StreamReader & friends have been present in Python since Python 2.0. > This is especially true given Guido's explicit > direction following the issues with the PyCObject removal in 3.2 that > we be *very* conservative about introducing additional > incompatibilities between Python 2 and Python 3. I did search for usage of these classes on the Internet, and except for projects implementing their own codecs (and so implementing their own StreamReader/StreamWriter classes, even if they don't use them), I only found one project using StreamReader directly: Pygments (*). I searched quickly, so don't trust these results :-) StreamReader & friends are used indirectly through codecs.open(). My patch changes codecs.open() to make it reuse open (io.TextIOWrapper), so the deprecation of StreamReader would not be noticed by most users.
I think that there are many more users of PyCObject than users using the StreamReader API directly (not through codecs.open()). (*) I also found Sphinx, but I was wrong: it doesn't use StreamReader, it just has a full copy of the UTF-8-SIG codec which has a StreamReader class. I don't think that the class is used. Victor From victor.stinner at haypocalc.com Fri May 27 17:44:06 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 27 May 2011 17:44:06 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <4DDFAA32.5030209@egenix.com> References: <1306195729.605.27.camel@marge> <201105271529.15421.victor.stinner@haypocalc.com> <4DDFAA32.5030209@egenix.com> Message-ID: <201105271744.06307.victor.stinner@haypocalc.com> Le vendredi 27 mai 2011 15:42:10, M.-A. Lemburg a écrit : > If we'd go by your reasoning for deprecating and eventually > removing parts of the stdlib or Python's subsystems, we'll end > up with a barebone version of Python. That's not what we want > and it's not what our users want. I don't want to deprecate the whole stdlib, just duplicated old APIs, to follow the "import this" mantra: "There should be one-- and preferably only one --obvious way to do it." It's difficult for a user to choose between open() and codecs.open(). Victor From status at bugs.python.org Fri May 27 18:07:23 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 27 May 2011 18:07:23 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20110527160723.74C681D1DB@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-05-20 - 2011-05-27) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message.
Issues counts and deltas: open 2813 (+19) closed 21165 (+50) total 23978 (+69) Open issues with patches: 1216 Issues opened (47) ================== #12128: Allow an abc.abstractproperty to be overridden by an instance http://bugs.python.org/issue12128 opened by cool-RR #12129: Document Object Model API - validation http://bugs.python.org/issue12129 opened by Kyle.Keating #12133: ResourceWarning in urllib.request http://bugs.python.org/issue12133 opened by ezio.melotti #12134: json.dump much slower than dumps http://bugs.python.org/issue12134 opened by poq #12135: The spawn function should return stderr. http://bugs.python.org/issue12135 opened by pitrou #12137: EBADF in test_urllibnet http://bugs.python.org/issue12137 opened by pitrou #12139: Add CCC command support to ftplib http://bugs.python.org/issue12139 opened by giampaolo.rodola #12141: sysconfig.get_config_vars('srcdir') fails in specific cases http://bugs.python.org/issue12141 opened by tarek #12142: Reference cycle when importing ctypes http://bugs.python.org/issue12142 opened by poq #12144: cookielib.CookieJar.make_cookies fails for cookies with 'expir http://bugs.python.org/issue12144 opened by Scott.Wimer #12145: distutils2 should support README.rst http://bugs.python.org/issue12145 opened by daniellindsley #12147: smtplib.send_message does not implement corectly rfc 2822 http://bugs.python.org/issue12147 opened by Nicolas.Estibals #12148: Clarify "or-ing together" doctest option flags http://bugs.python.org/issue12148 opened by ekorn #12149: Segfault in _PyObject_GenericGetAttrWithDict http://bugs.python.org/issue12149 opened by ezio.melotti #12151: test_logging fails sometimes http://bugs.python.org/issue12151 opened by haypo #12154: PyDoc Partial Functions http://bugs.python.org/issue12154 opened by JJeffries #12155: queue example doesn't stop worker threads http://bugs.python.org/issue12155 opened by haypo #12156: test_multiprocessing.test_notify_all() timeout (1 hour) on Fre 
http://bugs.python.org/issue12156 opened by haypo #12157: join method of multiprocessing Pool object hangs if iterable a http://bugs.python.org/issue12157 opened by Gökçen.Eraslan #12160: codecs doc: what is StreamCodec? http://bugs.python.org/issue12160 opened by haypo #12162: Documentation about re \number http://bugs.python.org/issue12162 opened by Seth.Troisi #12163: str.count http://bugs.python.org/issue12163 opened by py.user #12164: str.translate docstring doesn't mention that 'table' can be No http://bugs.python.org/issue12164 opened by mark.dickinson #12165: Nonlocal does not include global; clarify doc http://bugs.python.org/issue12165 opened by Lukas.Petru #12167: test_packaging reference leak http://bugs.python.org/issue12167 opened by pitrou #12168: SysLogHandler incorrectly appends \000 to messages http://bugs.python.org/issue12168 opened by Carl.Crowder #12169: Factor out common code for d2 commands register, upload and up http://bugs.python.org/issue12169 opened by eric.araujo #12170: Bytes.index() and bytes.count() should accept byte ints http://bugs.python.org/issue12170 opened by max-alleged #12171: Reset method of the incremental encoders of CJK codecs calls t http://bugs.python.org/issue12171 opened by haypo #12172: IDLE crashes when I use F5 to run http://bugs.python.org/issue12172 opened by Kevin Ness #12174: Multiprocessing logging levels unclear http://bugs.python.org/issue12174 opened by JJeffries #12175: FileIO.readall() read the file position and size at each read http://bugs.python.org/issue12175 opened by haypo #12177: re.match raises MemoryError http://bugs.python.org/issue12177 opened by EungJun.Yi #12178: csv writer doesn't escape escapechar http://bugs.python.org/issue12178 opened by ebreck #12179: Race condition using PyGILState_Ensure on a new thread http://bugs.python.org/issue12179 opened by syeberman #12181: SIGBUS error on OpenBSD (sparc64) http://bugs.python.org/issue12181 opened by rpointel #12183: Document behaviour of 
shutil.copy2 and copystat with symlinks http://bugs.python.org/issue12183 opened by mmarkk #12184: socketserver.ForkingMixin collect_children routine needs to co http://bugs.python.org/issue12184 opened by orsenthil #12186: readline.replace_history_item still leaks memory http://bugs.python.org/issue12186 opened by stefanholek #12187: subprocess.wait() with a timeout uses polling on POSIX http://bugs.python.org/issue12187 opened by haypo #12188: PEP 7, C style: add ++ policy and explanation http://bugs.python.org/issue12188 opened by terry.reedy #12190: intern filenames in bytecode http://bugs.python.org/issue12190 opened by Mike.Solomon #12191: Shutil - add chown() in order to allow to use user and group n http://bugs.python.org/issue12191 opened by sandro.tosi #12192: Doc that collection mutation methods return item or None http://bugs.python.org/issue12192 opened by terry.reedy #12195: Little documentation of annotations http://bugs.python.org/issue12195 opened by JJeffries #12196: add pipe2() to the os module http://bugs.python.org/issue12196 opened by charles-francois.natali #12185: Decimal documentation lists "first" and "second" arguments, sh http://bugs.python.org/issue12185 opened by eric.smith Most recent 15 issues with no replies (15) ========================================== #12188: PEP 7, C style: add ++ policy and explanation http://bugs.python.org/issue12188 #12186: readline.replace_history_item still leaks memory http://bugs.python.org/issue12186 #12185: Decimal documentation lists "first" and "second" arguments, sh http://bugs.python.org/issue12185 #12179: Race condition using PyGILState_Ensure on a new thread http://bugs.python.org/issue12179 #12164: str.translate docstring doesn't mention that 'table' can be No http://bugs.python.org/issue12164 #12157: join method of multiprocessing Pool object hangs if iterable a http://bugs.python.org/issue12157 #12156: test_multiprocessing.test_notify_all() timeout (1 hour) on Fre 
http://bugs.python.org/issue12156 #12142: Reference cycle when importing ctypes http://bugs.python.org/issue12142 #12137: EBADF in test_urllibnet http://bugs.python.org/issue12137 #12129: Document Object Model API - validation http://bugs.python.org/issue12129 #12091: multiprocessing: simplify ApplyResult and MapResult with threa http://bugs.python.org/issue12091 #12066: Empty ('') xmlns attribute is not properly handled by xml.dom. http://bugs.python.org/issue12066 #12053: Add prefetch() for Buffered IO (experiment) http://bugs.python.org/issue12053 #12037: test_email failures under Windows with the eol extension activ http://bugs.python.org/issue12037 #11992: sys.settrace doesn't disable tracing if a local trace function http://bugs.python.org/issue11992 Most recent 15 issues waiting for review (15) ============================================= #12196: add pipe2() to the os module http://bugs.python.org/issue12196 #12191: Shutil - add chown() in order to allow to use user and group n http://bugs.python.org/issue12191 #12190: intern filenames in bytecode http://bugs.python.org/issue12190 #12184: socketserver.ForkingMixin collect_children routine needs to co http://bugs.python.org/issue12184 #12175: FileIO.readall() read the file position and size at each read http://bugs.python.org/issue12175 #12174: Multiprocessing logging levels unclear http://bugs.python.org/issue12174 #12171: Reset method of the incremental encoders of CJK codecs calls t http://bugs.python.org/issue12171 #12165: Nonlocal does not include global; clarify doc http://bugs.python.org/issue12165 #12164: str.translate docstring doesn't mention that 'table' can be No http://bugs.python.org/issue12164 #12160: codecs doc: what is StreamCodec? 
http://bugs.python.org/issue12160 #12154: PyDoc Partial Functions http://bugs.python.org/issue12154 #12149: Segfault in _PyObject_GenericGetAttrWithDict http://bugs.python.org/issue12149 #12147: smtplib.send_message does not implement corectly rfc 2822 http://bugs.python.org/issue12147 #12144: cookielib.CookieJar.make_cookies fails for cookies with 'expir http://bugs.python.org/issue12144 #12139: Add CCC command support to ftplib http://bugs.python.org/issue12139 Top 10 most discussed issues (10) ================================= #8898: The email package should defer to the codecs module for all al http://bugs.python.org/issue8898 30 msgs #12006: strptime should implement %V or %u directive from libc http://bugs.python.org/issue12006 23 msgs #5715: listen socket close in SocketServer.ForkingMixIn.process_reque http://bugs.python.org/issue5715 18 msgs #12175: FileIO.readall() read the file position and size at each read http://bugs.python.org/issue12175 16 msgs #12181: SIGBUS error on OpenBSD (sparc64) http://bugs.python.org/issue12181 14 msgs #12085: subprocess.Popen.__del__ raises AttributeError if __init__ was http://bugs.python.org/issue12085 11 msgs #12168: SysLogHandler incorrectly appends \000 to messages http://bugs.python.org/issue12168 10 msgs #12042: What's New multiprocessing example error http://bugs.python.org/issue12042 9 msgs #12057: HZ codec has no test http://bugs.python.org/issue12057 9 msgs #12167: test_packaging reference leak http://bugs.python.org/issue12167 9 msgs Issues closed (44) ================== #1625: bz2.BZ2File doesn't support multiple streams http://bugs.python.org/issue1625 closed by nadeem.vawda #9435: test_distutils fails without zlib http://bugs.python.org/issue9435 closed by eric.araujo #9942: Allow memory sections to be OS MERGEABLE http://bugs.python.org/issue9942 closed by loewis #10818: pydoc: Remove old server and tk panel http://bugs.python.org/issue10818 closed by haypo #10832: Add support of bytes objects in 
PyBytes_FromFormatV() http://bugs.python.org/issue10832 closed by haypo #11998: test_signal cannot test blocked signals if _tkinter is loaded; http://bugs.python.org/issue11998 closed by haypo #12003: documentation: alternate version of xrange seems to fail. http://bugs.python.org/issue12003 closed by eli.bendersky #12024: 2.6 svn and hg branches are out of sync http://bugs.python.org/issue12024 closed by barry #12045: external shell command executed twice in ctypes.util._get_sona http://bugs.python.org/issue12045 closed by pitrou #12049: expose RAND_bytes() function of OpenSSL http://bugs.python.org/issue12049 closed by haypo #12070: Unlimited loop in sysconfig._parse_makefile() http://bugs.python.org/issue12070 closed by haypo #12071: test_concurrent_futures.test_context_manager_shutdown() hangs http://bugs.python.org/issue12071 closed by haypo #12074: regrtest: display the current number of failures http://bugs.python.org/issue12074 closed by haypo #12079: decimal.py: TypeError precedence in fma() http://bugs.python.org/issue12079 closed by mark.dickinson #12100: Incremental encoders of CJK codecs reset the codec at each cal http://bugs.python.org/issue12100 closed by haypo #12105: open() does not able to set flags, such as O_CLOEXEC http://bugs.python.org/issue12105 closed by charles-francois.natali #12113: test_packaging fails when run twice http://bugs.python.org/issue12113 closed by haypo #12114: packaging.util._find_exe_version(): potential deadlock http://bugs.python.org/issue12114 closed by python-dev #12121: test_packaging failure when ssl is not available http://bugs.python.org/issue12121 closed by haypo #12124: python -m test test_packaging test_zipimport failure http://bugs.python.org/issue12124 closed by haypo #12126: incorrect select documentation http://bugs.python.org/issue12126 closed by eli.bendersky #12130: regex 0.1.20110514 findall overlapped not working with 'start http://bugs.python.org/issue12130 closed by brian.curtin #12131: python built 
with --prefix fails in site.py with no section 'p http://bugs.python.org/issue12131 closed by ned.deily #12132: test_packaging failures when run with -j http://bugs.python.org/issue12132 closed by tarek #12136: test_logging fails when no ssl available http://bugs.python.org/issue12136 closed by vinay.sajip #12138: buggy use of transient_internet() in test_urllibnet http://bugs.python.org/issue12138 closed by pitrou #12140: Crash upon start up http://bugs.python.org/issue12140 closed by amaury.forgeotdarc #12143: packaging extension gcc linking fails on Ubuntu Shared http://bugs.python.org/issue12143 closed by eric.araujo #12146: Possible bug in 're' documentation example http://bugs.python.org/issue12146 closed by eli.bendersky #12150: test_sysconfig fails on solaris http://bugs.python.org/issue12150 closed by haypo #12152: Parser/asdl_c.py relies on mercurial repository revision http://bugs.python.org/issue12152 closed by doko #12153: Modules/faulthandler.c exports `stack_overflow' symbol http://bugs.python.org/issue12153 closed by python-dev #12158: platform: add linux_version() http://bugs.python.org/issue12158 closed by lemburg #12159: Integer Overflow in __len__ http://bugs.python.org/issue12159 closed by benjamin.peterson #12161: StringIO AttributeError instead of ValueError after close.. 
http://bugs.python.org/issue12161 closed by python-dev #12166: object.__dir__ http://bugs.python.org/issue12166 closed by python-dev #12173: PyImport_ImportModuleLevel doesn't have 'const' on its argumen http://bugs.python.org/issue12173 closed by python-dev #12176: Compiling Python 2.7.1 on Ubuntu 11.04 (Natty Narwhale) http://bugs.python.org/issue12176 closed by skrah #12180: test_packaging: failures --without-threads http://bugs.python.org/issue12180 closed by tarek #12182: pydoc.py integer division problem http://bugs.python.org/issue12182 closed by python-dev #12189: Python 2.6.6 fails to compile a source whereas pycompile 1.0 a http://bugs.python.org/issue12189 closed by r.david.murray #12193: Argparse does not work together with gettext and non-ASCII hel http://bugs.python.org/issue12193 closed by thorsten #12194: Fix LDFLAGS on Ubuntu 11.04+ http://bugs.python.org/issue12194 closed by barry #1441530: socket read() can cause MemoryError in Windows http://bugs.python.org/issue1441530 closed by charles-francois.natali From mal at egenix.com Fri May 27 20:26:45 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 27 May 2011 20:26:45 +0200 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <201105271744.06307.victor.stinner@haypocalc.com> References: <1306195729.605.27.camel@marge> <201105271529.15421.victor.stinner@haypocalc.com> <4DDFAA32.5030209@egenix.com> <201105271744.06307.victor.stinner@haypocalc.com> Message-ID: <4DDFECE5.50100@egenix.com> Victor Stinner wrote: > On Friday 27 May 2011 15:42:10, M.-A. Lemburg wrote: >> If we'd go by your reasoning for deprecating and eventually >> removing parts of the stdlib or Python's subsystems, we'll end >> up with a barebone version of Python. That's not what we want >> and it's not what our users want. 
> > I don't want to deprecate the whole stdlib, just duplicate old API, to follow > "import this" mantra: > > "There should be one-- and preferably only one --obvious way to do it." What people tend to miss in this mantra is the last part: "obvious". It doesn't say: there should only be one way to do it. There can be many ways, but there should preferably be only one *obvious* way. Using codecs.open() is not obvious in Python3, since the standard open() already provides a way to access an encoded stream. Using a builtin is the obvious way to go. It is obvious in Python2 where the standard open() doesn't provide a way to define an encoding, so the user has to explicitly look for this kind of API and then find it in the "obvious" (to some extent) codecs module, since that's where encodings happen in Python2. Having multiple ways to do things is the most natural thing on earth and it's good that way. Python does not and should not force people into doing things in one dictated "right" way. It should, however, provide natural choices and obvious hints to find a good solution. And that's what the Zen mantra is all about. > It's difficult for a user to choose between open() and codecs.open(). As I mentioned on the ticket and in my replies: I'm not against changing codecs.open() to use a variant that is based on TextIOWrapper, provided there are no user noticeable compatibility issues. Thanks for reverting the patch. Have a nice weekend, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 27 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ 2011-05-23: Released eGenix mx Base 3.2.0 http://python.egenix.com/ 2011-05-25: Released mxODBC 3.1.1 http://python.egenix.com/ 2011-06-20: EuroPython 2011, Florence, Italy 24 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From martin at v.loewis.de Fri May 27 20:37:47 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 27 May 2011 20:37:47 +0200 Subject: [Python-Dev] [ANN] Python 2.5.6 released Message-ID: <4DDFEF7B.5020803@v.loewis.de> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.5.6. There were no changes since the release candidate. This is a source-only release that only includes security fixes. The last full bug-fix release of Python 2.5 was Python 2.5.4. Users are encouraged to upgrade to the latest release of Python 2.7 (which is 2.7.1 at this point). This release is most likely the final release of Python 2.5; under the current release policy, no security issues in Python 2.5 will be fixed after October, 2011. This release fixes issues with the urllib, urllib2, SimpleHTTPServer, and audioop modules. See the release notes at the website (also available as Misc/NEWS in the source distribution) for details of bugs fixed. For more information on Python 2.5.6, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.5.6 Highlights of the previous major Python releases are available from the Python 2.5 page, at http://www.python.org/2.5/highlights.html Enjoy this release, Martin Martin v. 
Loewis martin at v.loewis.de Python Release Manager (on behalf of the entire python-dev team) From tjreedy at udel.edu Fri May 27 22:30:31 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 27 May 2011 16:30:31 -0400 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <201105271708.06211.victor.stinner@haypocalc.com> References: <1306195729.605.27.camel@marge> <201105271529.15421.victor.stinner@haypocalc.com> <BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com> <201105271708.06211.victor.stinner@haypocalc.com> Message-ID: <irp1l8$9mr$1@dough.gmane.org> On 5/27/2011 11:08 AM, Victor Stinner wrote: > Tell me if I am wrong, but only Marc-Andre is against deprecating StreamReader While I am, in general, in favor of removing some duplication, I was and am against doing this change precipitously. So I was for the reversion (noted), at least temporarily. Given the disagreement, I think there should be a PEP with pro and con arguments. -- Terry Jan Reedy From ncoghlan at gmail.com Sat May 28 03:21:57 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 May 2011 11:21:57 +1000 Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader In-Reply-To: <irp1l8$9mr$1@dough.gmane.org> References: <1306195729.605.27.camel@marge> <201105271529.15421.victor.stinner@haypocalc.com> <BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com> <201105271708.06211.victor.stinner@haypocalc.com> <irp1l8$9mr$1@dough.gmane.org> Message-ID: <BANLkTimP3c33CuzxF9eazFGwkn_C8BjG4w@mail.gmail.com> On Sat, May 28, 2011 at 6:30 AM, Terry Reedy <tjreedy at udel.edu> wrote: > On 5/27/2011 11:08 AM, Victor Stinner wrote: > >> Tell me if I am wrong, but only Marc-Andre is against deprecating >> StreamReader > > While I am, in general, in favor of removing some duplication, I was and am > against doing this change precipitously. So I was for the reversion (noted), > at least temporarily. 
Given the disagreement, I think there should be a PEP > with pro and con arguments. Indeed. I'm also against any deprecation in this area, since that just means needless work for anyone who *does* use these APIs (even if those people are few and far between). If we can refactor to remove the duplication of functionality, that's a *much* better solution. If we can carry optparse style argument parsing and 2.x style string formatting, we can carry a couple of legacy codec interface definitions. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vinay_sajip at yahoo.co.uk Sat May 28 16:57:15 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sat, 28 May 2011 14:57:15 +0000 (UTC) Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader References: <1306195729.605.27.camel@marge> <201105271529.15421.victor.stinner@haypocalc.com> <4DDFAA32.5030209@egenix.com> <201105271744.06307.victor.stinner@haypocalc.com> Message-ID: <loom.20110528T164222-362@post.gmane.org> Victor Stinner <victor.stinner <at> haypocalc.com> writes: > It's difficult for a user to choose between open() and > codecs.open(). Is it? How about the following decision process? If writing code for Python 3.x only, use open(). If writing code which has to work under both Python 2.x and 3.x, use codecs.open(). BTW I have written code using StreamReader and StreamWriter in the past, though it may not have been published on the Internet. Python is used a lot by companies for internal systems. Such code is seldom published on the Internet, so it seems that there's no real way of knowing how much StreamReader/StreamWriter are used. When looking at porting projects to Python 3.x, I've always adopted a single code-base approach for 2.x and 3.x, as I feel it's the path of least ongoing maintenance and hence (in my experience) the path of least resistance to providing 3.x support. 
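As a rough illustration of the decision process described above (a sketch only, not code posted in this thread; the helper names read_shared, write_shared, read_py3_only and demo are made up for the example), code shared between 2.x and 3.x can call codecs.open() with an encoding, while 3.x-only code can pass the encoding to the built-in open(). Under 3.x both are expected to read back the same text:

```python
import codecs
import os
import tempfile


def write_shared(path, text, encoding="utf-8"):
    # codecs.open() accepts an encoding under both Python 2.x and 3.x.
    with codecs.open(path, "w", encoding=encoding) as f:
        f.write(text)


def read_shared(path, encoding="utf-8"):
    # Shared 2.x/3.x code path: decode via the codecs module.
    with codecs.open(path, "r", encoding=encoding) as f:
        return f.read()


def read_py3_only(path, encoding="utf-8"):
    # Python 3.x only: the built-in open() takes an encoding argument.
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def demo(text=u"caf\xe9"):
    # Write a small sample file, read it back both ways, then clean up.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_shared(path, text)
        return read_shared(path), read_py3_only(path)
    finally:
        os.remove(path)
```

Note that read_py3_only() only works under 3.x, since the 2.x built-in open() has no encoding parameter; that difference is the reason the decision process points 2.x/3.x code at codecs.open().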
Though of course I've no objection to implementing their functionality in the most efficient way possible (which may well be TextIOWrapper), IMO deprecating StreamReader/StreamWriter will make 2.x/3.x portability harder to achieve, and so seems a step too far. Regards, Vinay Sajip From greg at krypto.org Sun May 29 11:29:15 2011 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 29 May 2011 02:29:15 -0700 Subject: [Python-Dev] The socket HOWTO In-Reply-To: <BANLkTin6WNTTkfQoJd=fthmEgf25fheHPA@mail.gmail.com> References: <20110521170725.51eab5f9@pitrou.net> <ir8m70$e7c$1@dough.gmane.org> <20110521160118.GA22904@kevin> <ir8tb1$k0o$1@dough.gmane.org> <BANLkTin6WNTTkfQoJd=fthmEgf25fheHPA@mail.gmail.com> Message-ID: <BANLkTimcz+w+Op3bctonbV1dGaAiiV+SBA@mail.gmail.com> On Sun, May 22, 2011 at 11:22 PM, Nick Coghlan <ncoghlan at gmail.com> wrote: > On Sun, May 22, 2011 at 3:38 AM, Georg Brandl <g.brandl at gmx.net> wrote: > > On 05/21/11 18:01, Senthil Kumaran wrote: > >> So a rewrite with good pointers would be more appropriate. > > > > Even then, it's better off in the Wiki until the rewrite is complete. > > Perhaps replacing it with a placeholder page that refers to the Wiki > would be appropriate? A simple summary saying that the HOWTO had not > aged well, and hence had been removed from the official documentation > until it had been updated on the Wiki would allow people looking for > it to better understand the situation, and also how to help improve > it. > +1 on removal. +0.8 on the pointer with a disclaimer (please also add the disclaimer at the top of the socket HOWTO). There's a lot of editorial misinformation in that page even if some parts of it are useful for the socket unaware... -gps 
From ncoghlan at gmail.com Sun May 29 15:08:13 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 May 2011 23:08:13 +1000 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Fix ProcessTestCasePOSIXPurePython to test the module from import when In-Reply-To: <E1QQM9G-0005fv-JI@dinsdale.python.org> References: <E1QQM9G-0005fv-JI@dinsdale.python.org> Message-ID: <BANLkTi=7sbegV1wD50tDpeo2oESpCt=ABg@mail.gmail.com> On Sun, May 29, 2011 at 2:13 AM, gregory.p.smith <python-checkins at python.org> wrote: > Ironically: I don't think any platform should ever actually _use_ the > pure Python subprocess code on POSIX platforms anymore. This at least > tests it properly in this stable branch. The pure python code for > this is likely to be removed in 3.3. Don't do that - keeping the pure Python equivalents around can help reduce the level of effort for other implementations (especially PyPy). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun May 29 15:09:07 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 May 2011 23:09:07 +1000 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Fix ProcessTestCasePOSIXPurePython to test the module from import when In-Reply-To: <BANLkTi=7sbegV1wD50tDpeo2oESpCt=ABg@mail.gmail.com> References: <E1QQM9G-0005fv-JI@dinsdale.python.org> <BANLkTi=7sbegV1wD50tDpeo2oESpCt=ABg@mail.gmail.com> Message-ID: <BANLkTindxfmLfPyoiVdMbZm-gcTiMA0kmw@mail.gmail.com> On Sun, May 29, 2011 at 11:08 PM, Nick Coghlan <ncoghlan at gmail.com> wrote: > On Sun, May 29, 2011 at 2:13 AM, gregory.p.smith > <python-checkins at python.org> wrote: >> Ironically: I don't think any platform should ever actually _use_ the >> pure Python subprocess code on POSIX platforms anymore. This at least >> tests it properly in this stable branch. 
The pure python code for >> this is likely to be removed in 3.3. > > Don't do that - keeping the pure Python equivalents around can help > reduce the level of effort for other implementations (especially > PyPy). Never mind, you addressed that in a later checkin. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Sun May 29 17:20:29 2011 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 29 May 2011 17:20:29 +0200 Subject: [Python-Dev] The socket HOWTO In-Reply-To: <20110521170725.51eab5f9@pitrou.net> References: <20110521170725.51eab5f9@pitrou.net> Message-ID: <4DE2643D.3070906@v.loewis.de> > I would like to suggest that we remove the socket HOWTO (currently at > http://docs.python.org/dev/howto/sockets.html) -1. I think there should be a Python-oriented introduction to sockets. You may have complaints about the specific wording of the text, but please understand that these are probably irrelevant to most first-time readers of this text. My observation is that people actually don't read the text that much, but instead try to imitate the examples. So if the examples are good (and I think they are, mostly), it's of minor relevance whether the text makes all sense the first time. > - people who know sockets won't learn anything from it True. People who know sockets just need to read the module documentation. It is a beauty of the Python library design that it exposes the API mostly as-is, so if you know Berkeley sockets, you will be immediately familiar with Python sockets (unlike, say, Java or .NET, where they decided to regroup the API into classes). > - but people who don't know sockets will probably find it clear as mud See above - it doesn't really matter. > (for example, what's an "INET" or "STREAM" socket? You are probably referring to the sentence "I'm only going to talk about INET sockets, but they account for at least 99% of the sockets in use. 
And I'll only talk about STREAM sockets" here. It's not important to first-time readers to actually understand that, and the wording explicitly tells them that they don't need to understand. It says "there is more stuff, and you won't need it, and the stuff you need is called INET and STREAM". It's easy to fix, though, and I fixed it in f70e26452621 (explaining that this is all about TCPv4). > what's "select"?) It's well explained in the section Non-blocking Sockets, isn't it? > I have other issues, such as the style/tone it's written in. I'm sure > the author had fun writing it but it doesn't fit well with the rest of > the documentation. Also, the author gives a lot of "advice" without > explaining or justifying it It's a HOWTO - of course it has advice without justification. It's not reference documentation, which only tells you what it does, but not what the best way of putting it together is. > ("if somewhere in those input lists of > sockets is one which has died a nasty death, the select will fail" -> > is that really true? I think it is: py> import select py> select.select([100],[],[],0) Traceback (most recent call last): File "<stdin>", line 1, in <module> select.error: (9, 'Bad file descriptor') Of course, rather than "has died a nasty death", it could also say "has been closed". > what is a "nasty death" and how is that supposed to > happen? couldn't the author have put a 3-line example to demonstrate > this supposed drawback and how it manifests?). It may well be that the author didn't fully understand the problem when writing the text, so I wouldn't mind removing this specific paragraph. > And, finally, many statements seem arbitrary ("There's no question that > the fastest sockets code uses non-blocking sockets and select to > multiplex them") or plain wrong ("threading support in Unixes varies > both in API and quality. So the normal Unix solution is to fork a > subprocess to deal with each connection"). 
I'd evaluate these two statements exactly the other way around. The first one (non-blocking sockets are faster) is plain wrong, and the second one ("threading support in Unix varies") is arbitrary, but factually correct :-) I'd drop the entire "Performance" section - there is much more to be said about socket performance than a few paragraphs of text, and for the target audience, performance is probably no concern. > Oh and I think it's obsolete too, because the "class mysocket" > concatenates the output of recv() with a str rather than a bytes > object. That's easy to fix, too - c65e1a422bc3 > Not to mention that features of the "class mysocket" can be had > using a buffered socket.makefile() instead of writing custom code. I find it actually appropriate in the context. It illustrates a number of important points about sockets, namely that you cannot rely on send() and recv() to match in block size. Ultimately, people that use the socket API *really* need to understand TCP, so it's good to explain to them that there are issues to consider right in the first tutorial. Regards, Martin From tiagoboldt at gmail.com Sun May 29 15:41:52 2011 From: tiagoboldt at gmail.com (Tiago Boldt Sousa) Date: Sun, 29 May 2011 14:41:52 +0100 Subject: [Python-Dev] PhD ideas Message-ID: <BANLkTim_yE-kM45jWA8XZjXLL5kG86FGsw@mail.gmail.com> Hi, I'm currently finishing my MSc and am thinking about enrolling in a PhD program. I was wondering if any of you would like to suggest a research topic that could benefit the scientific community and that might also result in a potential improvement for Python. I love everything that's web related (Django here) and software engineering but I don't yet have any idea for a research topic that would be relevant for a PhD so I'm completely open to suggestions. Please contact me directly. 
Best regards -- Tiago Boldt Sousa From benjamin at python.org Mon May 30 00:44:58 2011 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 29 May 2011 17:44:58 -0500 Subject: [Python-Dev] [RELEASE] 3.1.4 release candidate 1 Message-ID: <BANLkTimsJhBjBH1CiqkSaw9pKmAE=_=+aw@mail.gmail.com> On behalf of the Python development team, I'm happy as a swallow to announce a release candidate for the fourth bugfix release for the Python 3.1 series, Python 3.1.4. 3.1.4 will be the last bug fix release in the 3.1 series. After 3.1.4, 3.1 will be in security-only fix mode. The Python 3.1 version series focuses on the stabilization and optimization of the features and changes that Python 3.0 introduced. For example, the new I/O system has been rewritten in C for speed. File system APIs that use unicode strings now handle paths with undecodable bytes in them. Other features include an ordered dictionary implementation, a condensed syntax for nested with statements, and support for ttk Tile in Tkinter. For a more extensive list of changes in 3.1, see http://doc.python.org/3.1/whatsnew/3.1.html or Misc/NEWS in the Python distribution. This is a testing release. To download Python 3.1.4rc1 visit: http://www.python.org/download/releases/3.1.4/ A list of changes in 3.1.4 can be found here: http://hg.python.org/cpython/file/35419f276c60/Misc/NEWS The 3.1 documentation can be found at: http://docs.python.org/3.1 Bugs can always be reported to: http://bugs.python.org Enjoy! 
-- Benjamin Peterson Release Manager benjamin at python.org (on behalf of the entire python-dev team and 3.1.4's contributors) From benjamin at python.org Mon May 30 00:47:42 2011 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 29 May 2011 17:47:42 -0500 Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1 Message-ID: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com> On behalf of the Python development team, I'm happy to announce the immediate availability of Python 2.7.2 release candidate 1. 2.7.2 is the second bugfix release for the Python 2.7 series. 2.7 is the last major version of the 2.x line and will be receiving bug fixes while new feature development focuses on 3.x. 2.7 includes many features that were first released in Python 3.1. The faster io module, the new nested with statement syntax, improved float repr, set literals, dictionary views, and the memoryview object have been backported from 3.1. Other features include an ordered dictionary implementation, unittest improvements, a new sysconfig module, auto-numbering of fields in the str/unicode format method, and support for ttk Tile in Tkinter. For a more extensive list of changes in 2.7, see http://doc.python.org/dev/whatsnew/2.7.html or Misc/NEWS in the Python distribution. To download Python 2.7.2rc1 visit: http://www.python.org/download/releases/2.7.1/ The 2.7.2 changelog is at: http://hg.python.org/cpython/file/439396b06416/Misc/NEWS 2.7 documentation can be found at: http://docs.python.org/2.7/ This is a preview release. Assuming no major problems, 2.7.2 will be released in two weeks. Please report any bugs you find to http://bugs.python.org/ Enjoy! 
--
Benjamin Peterson
Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 2.7.2's contributors)

From jackdied at gmail.com Mon May 30 01:11:02 2011
From: jackdied at gmail.com (Jack Diederich)
Date: Sun, 29 May 2011 19:11:02 -0400
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
Message-ID: <BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>

On Sun, May 29, 2011 at 6:47 PM, Benjamin Peterson <benjamin at python.org> wrote:
> 2.7.2 is the second bugfix release for the Python 2.7 series. 2.7 is the last
> major version of the 2.x line and will be receiving bug fixes while new feature
> development focuses on 3.x.
>
> 2.7 includes many features that were first released in Python 3.1.

It might not be clear to a casual reader that the features were released in 2.7.0 and not 2.7.2. We don't, but many projects do release new features with bugfix version numbers - I'm looking at you, Django.

-Jack

From benjamin at python.org Mon May 30 01:13:03 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 29 May 2011 18:13:03 -0500
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
 <BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>
Message-ID: <BANLkTinPnsFWqCB2X1dN4cOOMsAPpGZr2w@mail.gmail.com>

2011/5/29 Jack Diederich <jackdied at gmail.com>:
> On Sun, May 29, 2011 at 6:47 PM, Benjamin Peterson <benjamin at python.org> wrote:
>> 2.7.2 is the second bugfix release for the Python 2.7 series. 2.7 is the last
>> major version of the 2.x line and will be receiving bug fixes while new feature
>> development focuses on 3.x.
>>
>> 2.7 includes many features that were first released in Python 3.1.
>
> It might not be clear to a casual reader that the features were
> released in 2.7.0 and not 2.7.2.  We don't, but many projects do
> release new features with bugfix version numbers - I'm looking at you,
> Django.

Okay. I suppose I can say "The 2.7 series" next time.

--
Regards,
Benjamin

From carl at oddbird.net Mon May 30 02:49:55 2011
From: carl at oddbird.net (Carl Meyer)
Date: Sun, 29 May 2011 19:49:55 -0500
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
 <BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>
Message-ID: <4DE2E9B3.4050402@oddbird.net>

On 05/29/2011 06:11 PM, Jack Diederich wrote:
> We don't, but many projects do
> release new features with bugfix version numbers - I'm looking at you,
> Django.

Really? Do you have an example of a new Django feature that was released in a bugfix version number? Just curious, since that's certainly not the documented release policy. [1]

Carl

[1] https://docs.djangoproject.com/en/dev/internals/release-process/

From ralf at brainbot.com Mon May 30 06:47:40 2011
From: ralf at brainbot.com (Ralf Schmitt)
Date: Mon, 30 May 2011 06:47:40 +0200
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com> (Benjamin Peterson's message of "Sun, 29 May 2011 17:47:42 -0500")
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
Message-ID: <878vtodcb7.fsf@muni.brainbot.com>

Benjamin Peterson <benjamin at python.org> writes:

> The 2.7.2 changelog is at:
>
> http://hg.python.org/cpython/file/439396b06416/Misc/NEWS
>

The news file mentions that issue 1195 ("Problems on Linux with Ctrl-D and Ctrl-C during raw_input") is fixed. That's not true, see:

    http://bugs.python.org/msg135671

Does one need special roundup rights to reopen issues?
Cheers,
- Ralf

From victor.stinner at haypocalc.com Mon May 30 10:26:44 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 30 May 2011 10:26:44 +0200
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <878vtodcb7.fsf@muni.brainbot.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
 <878vtodcb7.fsf@muni.brainbot.com>
Message-ID: <201105301026.44557.victor.stinner@haypocalc.com>

Hi,

On Monday 30 May 2011 06:47:40, Ralf Schmitt wrote:
> Benjamin Peterson <benjamin at python.org> writes:
> > The 2.7.2 changelog is at:
> > http://hg.python.org/cpython/file/439396b06416/Misc/NEWS
>
> The news file mentions that issue 1195 ("Problems on Linux with Ctrl-D
> and Ctrl-C during raw_input") is fixed. That's not true, see:
> http://bugs.python.org/msg135671
>
> Does one need special roundup rights to reopen issues?

Oh, I forgot that one. Please reopen the issue, I will apply your fix instead of mine.

Victor

From ralf at brainbot.com Mon May 30 10:46:39 2011
From: ralf at brainbot.com (Ralf Schmitt)
Date: Mon, 30 May 2011 10:46:39 +0200
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <201105301026.44557.victor.stinner@haypocalc.com> (Victor Stinner's message of "Mon, 30 May 2011 10:26:44 +0200")
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
 <878vtodcb7.fsf@muni.brainbot.com>
 <201105301026.44557.victor.stinner@haypocalc.com>
Message-ID: <877h98mv80.fsf@muni.brainbot.com>

Victor Stinner <victor.stinner at haypocalc.com> writes:

>> Does one need special roundup rights to reopen issues?
>
> Oh, I forgot that one. Please reopen the issue, I will apply your fix instead
> of mine.
I would love to do that, but as my above question implies I'm either too stupid to do that or I'm missing the rights to do it :)

Cheers,
- Ralf

From victor.stinner at haypocalc.com Mon May 30 10:55:38 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 30 May 2011 10:55:38 +0200
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <877h98mv80.fsf@muni.brainbot.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
 <201105301026.44557.victor.stinner@haypocalc.com>
 <877h98mv80.fsf@muni.brainbot.com>
Message-ID: <201105301055.39036.victor.stinner@haypocalc.com>

On Monday 30 May 2011 10:46:39, Ralf Schmitt wrote:
> Victor Stinner <victor.stinner at haypocalc.com> writes:
> >> Does one need special roundup rights to reopen issues?
> >
> > Oh, I forgot that one. Please reopen the issue, I will apply your fix
> > instead of mine.
>
> I would love to do that, but as my above question implies I'm either too
> stupid to do that or I'm missing the rights to do it :)

Oops, I am stupid. I reopened the issue.

Victor

From ziade.tarek at gmail.com Mon May 30 18:44:43 2011
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Mon, 30 May 2011 18:44:43 +0200
Subject: [Python-Dev] pysetup as a top script
Message-ID: <BANLkTimiPUBBPEzd04RpfDWAgTLL-HZnTQ@mail.gmail.com>

Hello

If no one objects, I'll promote Tools/scripts/pysetup3 to a top level script that gets installed in scripts/ like 2to3, pydoc etc.

That way, people will be able to use it directly when installing or removing projects, or studying what's installed.

Cheers
Tarek

--
Tarek Ziadé
| http://ziade.org

From merwok at netwok.org Mon May 30 18:52:16 2011
From: merwok at netwok.org (Éric Araujo)
Date: Mon, 30 May 2011 18:52:16 +0200
Subject: [Python-Dev] Docs for the packaging module
Message-ID: <b35dbbc50253fd96327a62e906b8259f@netwok.org>

Hi,

The docs were not added alongside the code when packaging was merged back into CPython because they were not in a shape conforming with the rest of the docs. I'd like your input on layout so that I can fix this ASAP and merge the docs. (They would still require a lot of additions, fixes and improvements after that, but at least they'd be in the repo.)

The easiest part is the library documentation, i.e. the docs for developers of packaging-related tools that want to use for example packaging.version or packaging.metadata to do their own stuff. These documents should go into Doc/library/packaging.*; I think this is a no-brainer. (Distutils has only a stub here; its API docs are mixed with its usage docs.)

There is a guide for end-users, which contains an outdated copy of the old "Installing Python Modules" and a few documents about the new pysetup script (the successor to setup.py scripts), which are not integrated with the first document. I think those should supersede the existing distutils-based Doc/install tree. We want to advertise pysetup and packaging as the way of getting modules in 3.3. A question remains: is it worthwhile to keep the old document somewhere?

Last but not least, the doc for authors wanting to package and distribute their project ("Distributing Python Modules", Doc/distutils). I think we should not overwrite this directory, because the directory name is tied to distutils and because there will be users needing that documentation (distutils is not removed). So, is it okay to create a new Doc/packaging directory and change the link on the docs front page from distutils to packaging?
(Technical question: I think I'll get a complaint from Sphinx that distutils is not included in any toctree; I'll try adding a toctree from library/distutils to distutils/index and see if that works.)

Thanks for reading

From g.brandl at gmx.net Mon May 30 19:04:51 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 30 May 2011 19:04:51 +0200
Subject: [Python-Dev] cpython: removed spurious output
In-Reply-To: <4DE3BDA6.1040100@udel.edu>
References: <E1QQzfO-0003da-B2@dinsdale.python.org>
 <4DE3BDA6.1040100@udel.edu>
Message-ID: <is0ind$m68$1@dough.gmane.org>

On 30.05.2011 17:54, Terry Reedy wrote:
>
>
> On 5/30/2011 6:25 AM, tarek.ziade wrote:
>
> Should not old_out be sys.stderr, since that is what you over-write and
> 'restore'?
>
>> +        old_out = sys.stdout
>> +        sys.stderr = StringIO()
>> +        try:
>> +            dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
>> +        finally:
>> +            sys.sterr = old_out

And even more importantly, shouldn't this be "sys.stderr" instead of "sys.sterr"?

Really, what happened to testing before you push?

Georg

From ziade.tarek at gmail.com Mon May 30 19:13:43 2011
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Mon, 30 May 2011 19:13:43 +0200
Subject: [Python-Dev] cpython: removed spurious output
In-Reply-To: <is0ind$m68$1@dough.gmane.org>
References: <E1QQzfO-0003da-B2@dinsdale.python.org>
 <4DE3BDA6.1040100@udel.edu>
 <is0ind$m68$1@dough.gmane.org>
Message-ID: <BANLkTimVh-fh_tJZtfoT1hX_hg7pL=uNJA@mail.gmail.com>

On Mon, May 30, 2011 at 7:04 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 30.05.2011 17:54, Terry Reedy wrote:
>>
>>
>> On 5/30/2011 6:25 AM, tarek.ziade wrote:
>>
>> Should not old_out be sys.stderr, since that is what you over-write and
>> 'restore'?
>>
>>> +        old_out = sys.stdout
>>> +        sys.stderr = StringIO()
>>> +        try:
>>> +            dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
>>> +        finally:
>>> +
sys.sterr = old_out
>
> And even more importantly, shouldn't this be "sys.stderr" instead of "sys.sterr"?

Yes,

>
> Really, what happened to testing before you push?

I did test it, before and after my push, sir.

This was not to fix a test bug, but to avoid a spurious output in the tests.

> Georg

--
Tarek Ziadé | http://ziade.org

From g.brandl at gmx.net Mon May 30 19:31:43 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 30 May 2011 19:31:43 +0200
Subject: [Python-Dev] cpython: removed spurious output
In-Reply-To: <BANLkTimVh-fh_tJZtfoT1hX_hg7pL=uNJA@mail.gmail.com>
References: <E1QQzfO-0003da-B2@dinsdale.python.org>
 <4DE3BDA6.1040100@udel.edu>
 <is0ind$m68$1@dough.gmane.org>
 <BANLkTimVh-fh_tJZtfoT1hX_hg7pL=uNJA@mail.gmail.com>
Message-ID: <is0k9o$2s8$1@dough.gmane.org>

On 30.05.2011 19:13, Tarek Ziadé wrote:
> On Mon, May 30, 2011 at 7:04 PM, Georg Brandl <g.brandl at gmx.net> wrote:
>> On 30.05.2011 17:54, Terry Reedy wrote:
>>>
>>>
>>> On 5/30/2011 6:25 AM, tarek.ziade wrote:
>>>
>>> Should not old_out be sys.stderr, since that is what you over-write and
>>> 'restore'?
>>>
>>>> +        old_out = sys.stdout
>>>> +        sys.stderr = StringIO()
>>>> +        try:
>>>> +            dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
>>>> +        finally:
>>>> +            sys.sterr = old_out
>>
>> And even more importantly, shouldn't this be "sys.stderr" instead of "sys.sterr"?
>
> Yes,
>
>>
>> Really, what happened to testing before you push?
>
> I did test it, before and after my push, sir.
>
> This was not to fix a test bug, but to avoid a spurious output in the tests.

Well, I assumed changing sys.stderr would be noticed as changing the execution environment.
But as I've now found out, the test class itself cleans up sys.stderr, so you couldn't have noticed the bug. I apologize.

Georg

From ncoghlan at gmail.com Tue May 31 07:13:06 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 May 2011 15:13:06 +1000
Subject: [Python-Dev] pysetup as a top script
In-Reply-To: <BANLkTimiPUBBPEzd04RpfDWAgTLL-HZnTQ@mail.gmail.com>
References: <BANLkTimiPUBBPEzd04RpfDWAgTLL-HZnTQ@mail.gmail.com>
Message-ID: <BANLkTinzDKL9Fh5gq5mqu_fR5jVGqzYxsw@mail.gmail.com>

On Tue, May 31, 2011 at 2:44 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> Hello
>
> If no one objects, I'll promote Tools/scripts/pysetup3 to a top level
> script that gets installed in scripts/ like 2to3, pydoc etc..
>
> That way, people will be able to use it directly when installing,
> removing projects, or studying what's installed

Cool.

Now I'm trying to remember if it was a list discussion or the language summit where you got the initial consensus on that approach...

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Tue May 31 07:18:28 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 May 2011 15:18:28 +1000
Subject: [Python-Dev] [Python-checkins] cpython: removed spurious output
In-Reply-To: <E1QQzfO-0003da-B2@dinsdale.python.org>
References: <E1QQzfO-0003da-B2@dinsdale.python.org>
Message-ID: <BANLkTikUyWRTq0xjucXJpXn105XAuoiqww@mail.gmail.com>

On Mon, May 30, 2011 at 8:25 PM, tarek.ziade <python-checkins at python.org> wrote:
> +        old_out = sys.stdout
> +        sys.stderr = StringIO()
> +        try:
> +            dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
> +        finally:
> +            sys.sterr = old_out

There's actually a helper for this in test.support:

    with support.captured_stderr():
        dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia

From ncoghlan at gmail.com Tue May 31 07:44:17 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 May 2011 15:44:17 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Close #12028: Make threading._get_ident() public, rename it to
In-Reply-To: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
References: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
Message-ID: <BANLkTingX9zqUH+TWWbw_e+n=hdPhXxaBA@mail.gmail.com>

On Tue, May 31, 2011 at 7:04 AM, victor.stinner <python-checkins at python.org> wrote:
> +.. function:: get_ident()
> +
> +   Return the 'thread identifier' of the current thread.  This is a nonzero
> +   integer.  Its value has no direct meaning; it is intended as a magic cookie
> +   to be used e.g. to index a dictionary of thread-specific data.  Thread
> +   identifiers may be recycled when a thread exits and another thread is
> +   created.

That's not quite true - the Thread id isn't relinquished until the Thread object itself is destroyed, rather than when the underlying thread finishes execution (i.e. the lifecycle of a_thread.ident is the same as that of id(a_thread)).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ziade.tarek at gmail.com Tue May 31 08:45:01 2011
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Tue, 31 May 2011 08:45:01 +0200
Subject: [Python-Dev] pysetup as a top script
In-Reply-To: <BANLkTinzDKL9Fh5gq5mqu_fR5jVGqzYxsw@mail.gmail.com>
References: <BANLkTimiPUBBPEzd04RpfDWAgTLL-HZnTQ@mail.gmail.com>
 <BANLkTinzDKL9Fh5gq5mqu_fR5jVGqzYxsw@mail.gmail.com>
Message-ID: <BANLkTinHSg=YWQvDmY+QGxp5-K6rXJQK6A@mail.gmail.com>

On Tue, May 31, 2011 at 7:13 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, May 31, 2011 at 2:44 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>> Hello
>>
>> If no one objects, I'll promote Tools/scripts/pysetup3 to a top level
>> script that gets installed in scripts/ like 2to3, pydoc etc..
>>
>> That way, people will be able to use it directly when installing,
>> removing projects, or studying what's installed
>
> Cool.
>
> Now I'm trying to remember if it was a list discussion or the language
> summit where you got the initial consensus on that approach...

The thread starts here:

    http://mail.python.org/pipermail/python-dev/2010-October/104535.html

The pysetup top-level script was mentioned here:

    http://mail.python.org/pipermail/python-dev/2010-October/104581.html

Cheers
Tarek

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>

--
Tarek Ziadé | http://ziade.org

From neologix at free.fr Tue May 31 09:17:23 2011
From: neologix at free.fr (Charles-François Natali)
Date: Tue, 31 May 2011 09:17:23 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Close #12028: Make threading._get_ident() public, rename it to
In-Reply-To: <BANLkTingX9zqUH+TWWbw_e+n=hdPhXxaBA@mail.gmail.com>
References: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
 <BANLkTingX9zqUH+TWWbw_e+n=hdPhXxaBA@mail.gmail.com>
Message-ID: <BANLkTi=M0-CP922qZOEXWfCte-sTOkC55A@mail.gmail.com>

>> +.. function:: get_ident()
>> +
>> +   Return the 'thread identifier' of the current thread.  This is a nonzero
>> +   integer.  Its value has no direct meaning; it is intended as a magic cookie
>> +   to be used e.g. to index a dictionary of thread-specific data.  Thread
>> +   identifiers may be recycled when a thread exits and another thread is
>> +   created.
>
> That's not quite true - the Thread id isn't relinquished until the
> Thread object itself is destroyed, rather than when the underlying
> thread finishes execution (i.e. the lifecycle of a_thread.ident is the
> same as that of id(a_thread)).
>

I'm not sure I understand, Nick. Since threads are started detached, their thread ID (e.g. returned by pthread_self() on pthreads) can be reused as soon as the underlying OS thread exits (i.e.
returns from Modules/_threadmodule.c:t_bootstrap):

On a Linux kernel with NPTL:

$ cat /tmp/test.py
import threading

def print_ident():
    print(threading._get_ident())

t1 = threading.Thread(target=print_ident)
t2 = threading.Thread(target=print_ident)
t1.start()
t1.join()
t2.start()
t2.join()
print(id(t1), id(t2))

$ ./python /tmp/test.py
-1211954272
-1211954272
(3085561228L, 3083093028L)

I'm just curious, maybe I missed something?

Thanks,

cf

From ncoghlan at gmail.com Tue May 31 10:37:15 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 May 2011 18:37:15 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Close #12028: Make threading._get_ident() public, rename it to
In-Reply-To: <BANLkTi=M0-CP922qZOEXWfCte-sTOkC55A@mail.gmail.com>
References: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
 <BANLkTingX9zqUH+TWWbw_e+n=hdPhXxaBA@mail.gmail.com>
 <BANLkTi=M0-CP922qZOEXWfCte-sTOkC55A@mail.gmail.com>
Message-ID: <BANLkTikX12cjFVdBVOB19SXMivZxhpFxuQ@mail.gmail.com>

2011/5/31 Charles-François Natali <neologix at free.fr>:
>>> +.. function:: get_ident()
>>> +
>>> +   Return the 'thread identifier' of the current thread.  This is a nonzero
>>> +   integer.  Its value has no direct meaning; it is intended as a magic cookie
>>> +   to be used e.g. to index a dictionary of thread-specific data.  Thread
>>> +   identifiers may be recycled when a thread exits and another thread is
>>> +   created.
>>
>> That's not quite true - the Thread id isn't relinquished until the
>> Thread object itself is destroyed, rather than when the underlying
>> thread finishes execution (i.e. the lifecycle of a_thread.ident is the
>> same as that of id(a_thread)).
>>
>
> I'm not sure I understand, Nick.

I was just wrong, but the wording is still confusing since it has been copied from _thread.ident. "Thread" means something other than "threading.Thread" in that module, while in the threading docs, it typically refers to the actual objects.
With the change of module, there needs to be something to make it clearer that this is information related to OS-level threads.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at haypocalc.com Tue May 31 10:51:45 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 31 May 2011 10:51:45 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Close #12028: Make threading._get_ident() public, rename it to
In-Reply-To: <BANLkTikX12cjFVdBVOB19SXMivZxhpFxuQ@mail.gmail.com>
References: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
 <BANLkTi=M0-CP922qZOEXWfCte-sTOkC55A@mail.gmail.com>
 <BANLkTikX12cjFVdBVOB19SXMivZxhpFxuQ@mail.gmail.com>
Message-ID: <201105311051.46234.victor.stinner@haypocalc.com>

On Tuesday 31 May 2011 10:37:15, Nick Coghlan wrote:
> I was just wrong, but the wording is still confusing since it has been
> copied from _thread.ident. "Thread" means something other than
> "threading.Thread" in that module, while in the threading docs, it
> typically refers to the actual objects. With the change of module,
> there needs to be something to make it clearer that this is
> information related to os level threads.

Yes, I copy-pasted the doc from Python 2.7, from thread.get_ident(). Feel free to edit the doc directly.

Victor

From fuzzyman at voidspace.org.uk Tue May 31 15:19:27 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 31 May 2011 14:19:27 +0100
Subject: [Python-Dev] Release pages malformed on python.org
Message-ID: <4DE4EADF.5080709@voidspace.org.uk>

Hello all,

I believe that the release manager is aware of this, but just in case... The web pages on python.org for the recent releases are malformatted:

    http://www.python.org/download/releases/3.1.4/
    http://www.python.org/download/releases/2.7.2/

These are the pages linked to from the news on the front page.
All the best,

Michael Foord

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From benjamin at python.org Tue May 31 16:01:15 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 31 May 2011 09:01:15 -0500
Subject: [Python-Dev] Release pages malformed on python.org
In-Reply-To: <4DE4EADF.5080709@voidspace.org.uk>
References: <4DE4EADF.5080709@voidspace.org.uk>
Message-ID: <BANLkTikqjfZvcPS_fyaiK-Ygdse9-iLATg@mail.gmail.com>

2011/5/31 Michael Foord <fuzzyman at voidspace.org.uk>:
> Hello all,
>
> I believe that the release manager is aware of this, but just in case... The
> web pages on python.org for the recent releases are malformatted:
>
>    http://www.python.org/download/releases/3.1.4/
>    http://www.python.org/download/releases/2.7.2/

Wohaa. Martin, I think this is from your checkin?

--
Regards,
Benjamin

From techtonik at gmail.com Tue May 31 19:04:10 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 31 May 2011 20:04:10 +0300
Subject: [Python-Dev] Sniffing passwords from PyPI using insecure connection
Message-ID: <BANLkTikfTXNrvBQpJ-_kXHePon7ynAmOGw@mail.gmail.com>

Hi,

I'd like to escalate http://bugs.python.org/issue12226 : 'use secured channel for uploading packages to pypi' to be shipped with next Python 2.6+
This will prevent pydotorg password sniffing when submitting packages through public networks (such as hotels).

--
anatoly t.
From martin at v.loewis.de Tue May 31 20:11:39 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 31 May 2011 20:11:39 +0200
Subject: [Python-Dev] Release pages malformed on python.org
In-Reply-To: <BANLkTikqjfZvcPS_fyaiK-Ygdse9-iLATg@mail.gmail.com>
References: <4DE4EADF.5080709@voidspace.org.uk>
 <BANLkTikqjfZvcPS_fyaiK-Ygdse9-iLATg@mail.gmail.com>
Message-ID: <4DE52F5B.2050002@v.loewis.de>

On 31.05.2011 16:01, Benjamin Peterson wrote:
> 2011/5/31 Michael Foord <fuzzyman at voidspace.org.uk>:
>> Hello all,
>>
>> I believe that the release manager is aware of this, but just in case... The
>> web pages on python.org for the recent releases are malformatted:
>>
>> http://www.python.org/download/releases/3.1.4/
>> http://www.python.org/download/releases/2.7.2/
>
> Wohaa. Martin, I think this is from your checkin?

Indeed... I have now fixed it.

Regards,
Martin

From tjreedy at udel.edu Tue May 31 21:05:29 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 31 May 2011 15:05:29 -0400
Subject: [Python-Dev] Sniffing passwords from PyPI using insecure connection
In-Reply-To: <BANLkTikfTXNrvBQpJ-_kXHePon7ynAmOGw@mail.gmail.com>
References: <BANLkTikfTXNrvBQpJ-_kXHePon7ynAmOGw@mail.gmail.com>
Message-ID: <is3e5o$qou$1@dough.gmane.org>

On 5/31/2011 1:04 PM, anatoly techtonik wrote:
> Hi,
>
> I'd like to escalate http://bugs.python.org/issue12226 : 'use secured
> channel for uploading packages to pypi' to be shipped with next Python
> 2.6+
> This will prevent pydotorg password sniffing when submitting packages
> through public networks (such as hotels).

The requested one-character change is

- DEFAULT_REPOSITORY = 'http://pypi.python.org/pypi'
+ DEFAULT_REPOSITORY = 'https://pypi.python.org/pypi'

If Tarek (or perhaps Eric) agrees that it is appropriate and otherwise innocuous, then Martin and Barry can decide whether to include it in 2.5/2.6.

Terry Jan Reedy
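
To illustrate why the scheme matters, here is a minimal sketch, not the actual upload code. It assumes the upload authenticates with HTTP Basic credentials, which are only base64-encoded and so readable by anyone who can observe a plain-http connection. The helper names (is_encrypted, basic_auth_header, recover_credentials) and the sample credentials are hypothetical; only DEFAULT_REPOSITORY and its proposed value come from the issue.

```python
import base64
from urllib.parse import urlsplit

# Default proposed in the issue; the previous value used the "http" scheme.
DEFAULT_REPOSITORY = 'https://pypi.python.org/pypi'


def is_encrypted(url):
    """Return True when the URL scheme is https (hypothetical helper)."""
    return urlsplit(url).scheme == 'https'


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization value (assumed auth scheme)."""
    raw = '{}:{}'.format(username, password).encode('utf-8')
    return 'Basic ' + base64.b64encode(raw).decode('ascii')


def recover_credentials(header_value):
    """Decode a Basic header the way an eavesdropper on plain http could."""
    encoded = header_value.split(' ', 1)[1]
    return base64.b64decode(encoded).decode('utf-8')
```

With the old http default, the header built by basic_auth_header would cross the network unencrypted, and recover_credentials shows that reading it back takes a single decoding step; with https the same header travels inside the TLS session.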